Kubernetes can bring great value to companies, but it's easy to fall into the trap of over-complicating its usage. For a streamlined approach, focus on core Kubernetes features like Deployments, Services, and CronJobs. Use Kubernetes to run multiple redundant processes with load balancing and to configure them as code. Avoid Kubernetes for use cases where a human is waiting on a pod to start or for directly storing critical data, as there are better-suited solutions for these scenarios.
Thursday, March 7, 2024This author switched a side project to a Kubernetes-based infrastructure, only to find it overly complex, expensive, and difficult to manage. Despite the promise of high availability, the system suffered from slow performance, difficult debugging, and downtime during node failures. While Kubernetes can be powerful, it's important to choose the right tools for the job and not get caught up in complexity for its own sake if it's not necessary.
This author successfully replaced OpenAI's API with an open-source alternative to reduce the cost of running large-scale AI applications. They tried using Ollama on a local machine to generate text summaries, but limitations with concurrent processing led to them using vLLM (a fast inference runner). To handle large volumes of requests, the author used a Kubernetes cluster to deploy and load balance vLLM.
This post breaks down the setup for a one-person SaaS tech startup, from load balancing to cron job monitoring to payments and subscriptions. The author uses a Kubernetes cluster on AWS, Cloudflare for load balancing and caching, ingress-nginx for routing traffic within the cluster, Flux for automatic deployments, and RDS for database management.
Figma successfully migrated its core services to Kubernetes within 12 months. Its team scoped the migration to focus on core system changes while maintaining abstraction for users. They implemented load testing, incremental rollout, and real-world workload testing, which led to better reliability with a three-cluster setup.