Hey HN! We've built a system that lets you run any application on spot instances without worrying about preemption. It runs VMs on top of VMs - without requiring nested virtualization or hardware acceleration support - by integrating three of our open-source projects: Drafter (https://github.com/loopholelabs/drafter - handles VM live migration), PVM (https://github.com/loopholelabs/linux-pvm-ci - enables nested virtualization without hardware support), and Silo (https://github.com/loopholelabs/silo - provides efficient live storage migration over the public internet). The cool part is that we can migrate workloads between spot instances faster than they get preempted, with no dropped connections - even across different cloud providers and regions.
While there are other solutions that try to handle spot instance preemption through checkpointing, we take a fundamentally different approach by making preemption irrelevant through continuous state capture and seamless migration. We showed this off at KubeCon NA 2024 by migrating a Redis pod between AWS, GCP, and Azure while maintaining active client connections.
All core components are open source, including our Firecracker patches (https://github.com/loopholelabs/firecracker/tree/main-live-m...). We're currently launching GitHub Actions runners at https://architect.run/ that can safely run on spot instances (which are 75%+ cheaper!) without risk of interruption, even for long-running builds and stateful workloads.
More info in the linked blog post! Would love to hear your thoughts and feedback on the technical implementation and potential use cases.