
Kubernetes

Container orchestration at scale — deploy, scale, and manage containerized applications automatically

Think of it as an Operating System for your containers — it runs, heals, and scales them so you don't have to.

The City Planner Analogy

Imagine you're the mayor of a city. You don't personally construct buildings, lay pipes, or direct traffic. Instead, you set policies — "we need 5 hospitals, 10 schools, and roads connecting them." City departments handle the rest. If a hospital burns down, they rebuild it automatically. If population grows, they build more schools. That's Kubernetes.

🏛️

City Hall

Control Plane

🏘️

Neighborhoods

Worker Nodes

🏢

Buildings

Containers

🧱

City Blocks

Pods

The Problem It Solves

Manual Container Management
# 3am. PagerDuty wakes you up.
$ ssh prod-server-12
$ docker ps | grep api
# Container crashed. Again.
$ docker run -d --restart=always \
    -p 8080:8080 myapp:v2.3.1
# Wait, was it v2.3.1 or v2.3.2?
# Which servers have the new version?
$ for server in prod-{1..20}; do
    ssh $server "docker ps"
  done
# 4am. Still debugging. 😩
  • Manual restarts when containers crash
  • No easy way to scale up or down
  • Version mismatches across servers
  • Load balancing is your problem
  • Deployments = fear and downtime
Kubernetes Orchestration
# deployment.yaml — the whole story
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
      - name: api
        image: myapp:v2.3.2

# kubectl apply -f deployment.yaml
# Container crashes? K8s restarts it.
# Need more? Change replicas: 20
# Go back to sleep. 😴
  • Self-healing: crashed containers restart automatically
  • Declarative scaling: change a number, done
  • Rolling updates with zero downtime
  • Built-in service discovery and load balancing
  • Same config = same infrastructure, anywhere

Core Concepts

🫛

Pods

The smallest deployable unit. A pod wraps one or more containers that share networking and storage. Think of it as a "wrapper" around your container(s).
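A minimal Pod manifest looks like this — an illustrative sketch (the name and image are placeholders; in practice you rarely create bare Pods, you let a Deployment manage them):

```yaml
# pod.yaml — a bare Pod, for illustration only
apiVersion: v1
kind: Pod
metadata:
  name: hello-pod
  labels:
    app: hello
spec:
  containers:
  - name: web
    image: nginx:1.25
    ports:
    - containerPort: 80
```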

🖥️

Nodes

Physical or virtual machines that run your pods. Each node has a kubelet agent that communicates with the control plane.

🌐

Clusters

A set of nodes managed together. One cluster = one control plane + multiple worker nodes. Your entire Kubernetes environment.

🔗

Services

A stable networking endpoint for accessing pods. Pods come and go, but Services provide a fixed address and load balance traffic across them.

🚀

Deployments

Manages your pods — how many replicas to run, how to roll out updates, and how to roll back if things go wrong.

📁

Namespaces

Virtual clusters within a cluster. Isolate teams, environments (dev/staging/prod), or applications from each other.
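Working with namespaces from the CLI — a quick sketch (the `staging` namespace name is illustrative):

```shell
# Create a namespace and work inside it
$ kubectl create namespace staging

# Target a namespace per command...
$ kubectl get pods --namespace staging

# ...or switch your default namespace for the current context
$ kubectl config set-context --current --namespace=staging
```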

🔐

ConfigMaps & Secrets

Decouple configuration from code. ConfigMaps hold non-sensitive data; Secrets hold passwords, tokens, and keys (base64-encoded — not encrypted at rest by default).
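A minimal sketch with illustrative names — a ConfigMap, a Secret, and a Pod that injects both as environment variables:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
  CACHE_TTL: "300"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secrets
type: Opaque
stringData:              # write plain text here; stored base64-encoded
  DB_PASSWORD: changeme
---
apiVersion: v1
kind: Pod
metadata:
  name: api
spec:
  containers:
  - name: api
    image: myapp:v1.0.0
    envFrom:             # every key becomes an env var in the container
    - configMapRef:
        name: app-config
    - secretRef:
        name: app-secrets
```

Because config lives outside the image, you can change `LOG_LEVEL` or rotate `DB_PASSWORD` without rebuilding anything.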

💾

Volumes

Persistent storage that outlives containers. When a pod restarts, the data survives. Supports cloud disks, NFS, and more.

Architecture

[Architecture diagram] Control plane: API Server (gateway to the cluster), etcd (key-value store), Scheduler (assigns pods to nodes), Controllers (reconciliation loops) — the brain of K8s: the API Server is the single entry point, etcd stores all state, the Scheduler decides WHERE, Controllers ensure WHAT. Worker nodes: kubelet (node agent), kube-proxy (network rules), pods, and a container runtime (containerd / CRI-O).

API Server

Front door for all operations

etcd

Cluster's source of truth

Scheduler

Picks the best node for pods

Controllers

Keep desired = actual state

How It Works

When you deploy an application, here's the chain of events inside the cluster:

1

You write a manifest

A YAML file describing your desired state — "I want 3 replicas of my API running."

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
    spec:
      containers:
      - name: api
        image: myapp:v1.0.0
        ports:
        - containerPort: 8080
2

kubectl sends it to the API Server

The API Server authenticates you, validates the manifest, and stores the desired state in etcd.

$ kubectl apply -f deployment.yaml
deployment.apps/my-api created

# Behind the scenes:
# 1. kubectl → API Server (HTTPS)
# 2. API Server validates YAML
# 3. API Server → etcd (stores desired state)
# 4. API Server confirms back to you
3

Scheduler assigns pods to nodes

The Scheduler watches for unassigned pods and picks the best node based on available resources, constraints, and affinity rules. It considers CPU, memory, disk, and even custom rules you define.
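You can steer the Scheduler's decision from the pod template. A sketch, assuming a hypothetical `disktype=ssd` node label (the anti-affinity rule spreads replicas across nodes):

```yaml
# Fragment of a pod template spec — labels are illustrative
spec:
  nodeSelector:
    disktype: ssd        # only schedule onto nodes labeled disktype=ssd
  affinity:
    podAntiAffinity:     # prefer not to co-locate replicas of the same app
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: my-api
          topologyKey: kubernetes.io/hostname
```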

4

kubelet pulls the image and starts containers

On each assigned node, the kubelet pulls the container image and tells the container runtime (containerd) to start the containers. The pod is now Running.

5

Controllers ensure desired = actual state

The Deployment controller continuously watches. If a pod crashes, it creates a new one. If you change replicas from 3 to 5, it spins up 2 more. This is the reconciliation loop — the heart of Kubernetes.

$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
my-api-7d9b4c5f6-abc12   1/1     Running   0          2m
my-api-7d9b4c5f6-def34   1/1     Running   0          2m
my-api-7d9b4c5f6-ghi56   1/1     Running   0          2m

# Kill a pod — watch K8s bring it back
$ kubectl delete pod my-api-7d9b4c5f6-abc12
pod "my-api-7d9b4c5f6-abc12" deleted

$ kubectl get pods
NAME                      READY   STATUS    RESTARTS   AGE
my-api-7d9b4c5f6-def34   1/1     Running   0          3m
my-api-7d9b4c5f6-ghi56   1/1     Running   0          3m
my-api-7d9b4c5f6-xyz99   1/1     Running   0          5s  ← New pod!

Code Examples

1

Deploy an Application

Write YAML manifests and apply them directly. The most explicit approach — you see exactly what's being created.

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app
  labels:
    app: web-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web
        image: nginx:1.25
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
---
# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: web-app
spec:
  selector:
    app: web-app
  ports:
  - port: 80
    targetPort: 80
  type: ClusterIP
$ kubectl apply -f deployment.yaml -f service.yaml
deployment.apps/web-app created
service/web-app created
2

Scale Your App

Scale imperatively with a command, or declaratively by editing the YAML.

# Imperative — quick and direct
$ kubectl scale deployment web-app --replicas=10
deployment.apps/web-app scaled

# Or auto-scale based on CPU
$ kubectl autoscale deployment web-app \
    --min=3 --max=20 --cpu-percent=70
horizontalpodautoscaler.autoscaling/web-app autoscaled
# hpa.yaml — Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
3

Rolling Updates

Update the image and watch Kubernetes gradually replace old pods with new ones — zero downtime.

# Update the image
$ kubectl set image deployment/web-app web=nginx:1.26
deployment.apps/web-app image updated

# Watch the rollout
$ kubectl rollout status deployment/web-app
Waiting for deployment "web-app" rollout to finish:
  2 out of 3 new replicas have been updated...
  3 of 3 updated replicas are available.
deployment "web-app" successfully rolled out

# Something broke? Roll back instantly
$ kubectl rollout undo deployment/web-app
deployment.apps/web-app rolled back

# Check rollout history
$ kubectl rollout history deployment/web-app
# Control rollout strategy in YAML
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1        # Max extra pods during update
      maxUnavailable: 0  # Always maintain full capacity
4

Expose to the Internet

Use an Ingress resource to route external traffic to your service with TLS termination.

# ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-app
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - app.example.com
    secretName: web-app-tls
  rules:
  - host: app.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-app
            port:
              number: 80

kubectl vs Helm vs Kustomize

kubectl

Direct communication with the Kubernetes API. Write plain YAML, apply it. No abstraction layer.

Learning & debugging Simple deployments Full control

Helm

The "package manager" for Kubernetes. Charts bundle templates + values for reusable, versionable deployments. Huge ecosystem of pre-made charts.

Complex apps Third-party software Release management
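A typical Helm session — the release name `monitoring` is illustrative; the chart repo is the community Prometheus repo:

```shell
# Add a chart repository, then install a release from it
$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm install monitoring prometheus-community/kube-prometheus-stack

# Releases are versioned — list them and roll back if needed
$ helm list
$ helm rollback monitoring 1
```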
🔧

Kustomize

Overlay-based configuration. Keep plain YAML (no templates), layer environment-specific patches on top. Built into kubectl since v1.14.

Multi-environment No templating GitOps friendly

When to use what?

Just learning?

Start with kubectl. Understand the raw YAML before adding abstractions.

Installing software?

Use Helm. Install Prometheus, Grafana, nginx-ingress in one command.

Multi-env configs?

Use Kustomize. Same base, different overlays for dev/staging/prod.
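A minimal Kustomize sketch with illustrative paths: `base/` holds the plain manifests plus a `kustomization.yaml`, and `overlays/prod/` patches the base for production:

```yaml
# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base             # inherit everything from the base
patches:
- target:
    kind: Deployment
    name: web-app
  patch: |-              # JSON6902 patch: bump replicas for prod only
    - op: replace
      path: /spec/replicas
      value: 10
```

Apply an overlay with `kubectl apply -k overlays/prod` — kubectl renders the base plus patches and applies the result.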

Networking

[Networking diagram] Traffic from the internet flows through an Ingress Controller to Services: LoadBalancer (external IP from cloud), NodePort (opens a port on all nodes), ClusterIP (internal only, the default) — each ultimately routing to pods.

ClusterIP (default)

Internal-only. Pods can reach each other via service name. Not accessible from outside the cluster.

Use for: internal APIs, databases, caches

NodePort

Opens a static port (30000-32767) on every node. External traffic hits NodeIP:Port.

Use for: development, bare-metal clusters

LoadBalancer

Provisions a cloud load balancer (e.g. AWS ELB/NLB, GCP Load Balancer). Gets an external IP automatically.

Use for: exposing a single service on cloud

Ingress

HTTP/HTTPS routing rules. One load balancer → multiple services via host/path rules. TLS termination.

Use for: production — the standard approach

Storage

Containers are ephemeral — when they restart, data is gone. Kubernetes Volumes solve this with a layered abstraction: admins provision storage, developers request it.

💽

PersistentVolume (PV)

A piece of storage provisioned by an admin or dynamically. Cluster-level resource.

📋

PersistentVolumeClaim (PVC)

A request for storage by a pod. "I need 10Gi of fast SSD storage."

⚙️

StorageClass

Defines how storage is dynamically provisioned. "fast" = SSD, "standard" = HDD.

# pvc.yaml — Request storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: db-storage
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: fast-ssd
  resources:
    requests:
      storage: 10Gi
---
# Use it in a pod
spec:
  containers:
  - name: postgres
    image: postgres:16
    volumeMounts:
    - name: data
      mountPath: /var/lib/postgresql/data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: db-storage

Production Readiness

Getting containers running is step one. Running them reliably in production requires health checks, resource limits, auto-scaling, and access control.

💓

Health Probes

How Kubernetes knows if your app is alive, ready for traffic, and started successfully.

containers:
- name: api
  image: myapp:v1.0.0
  
  # Is the container alive? (restart if not)
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    initialDelaySeconds: 10
    periodSeconds: 15
  
  # Is it ready for traffic? (remove from LB if not)
  readinessProbe:
    httpGet:
      path: /ready
      port: 8080
    initialDelaySeconds: 5
    periodSeconds: 10
  
  # Has it started? (for slow-starting apps)
  startupProbe:
    httpGet:
      path: /healthz
      port: 8080
    failureThreshold: 30
    periodSeconds: 10

livenessProbe

Container stuck? Restart it.

readinessProbe

Not ready? Stop sending traffic.

startupProbe

Still booting? Don't kill it yet.

📊

Resource Requests & Limits

Guarantee minimum resources and cap maximum usage. Prevents noisy neighbors from starving a node and keeps memory failures contained to the offending container.

resources:
  # Guaranteed minimum — scheduler uses this to place pods
  requests:
    cpu: 250m      # 0.25 CPU cores
    memory: 256Mi  # 256 MB RAM
  # Hard ceiling — container is killed if it exceeds memory limit
  limits:
    cpu: 500m      # 0.5 CPU cores (throttled, not killed)
    memory: 512Mi  # 512 MB RAM (OOMKilled if exceeded)

✅ Best Practice

  • Always set requests (scheduling depends on it)
  • Set memory limits (prevents OOM cascades)
  • CPU limits are optional (throttling vs. killing)

❌ Anti-Pattern

  • No limits = one pod can starve a whole node
  • Limits too low = constant OOMKills and restarts
  • Requests too high = wasted cluster capacity
🛡️

RBAC (Role-Based Access Control)

Control who can do what in your cluster. Essential for multi-team environments.

# role.yaml — Define permissions
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods", "pods/log"]
  verbs: ["get", "list", "watch"]
---
# rolebinding.yaml — Assign to user
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: read-pods
subjects:
- kind: User
  name: jane@example.com
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io

Why Use Kubernetes?

🔄

Self-Healing

Crashed containers restart automatically. Failed nodes get their pods rescheduled. No 3am pages.

📈

Auto-Scaling

Scale pods based on CPU, memory, or custom metrics. Scale nodes based on demand. Handle traffic spikes automatically.

🚀

Zero-Downtime Deployments

Rolling updates replace pods gradually. If a new version fails health checks, the rollout stalls instead of taking down healthy pods — and rolling back is a single command.

🌐

Service Discovery

Pods find each other by name. No hardcoded IPs. Built-in DNS resolves service names to pod endpoints.
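From inside any pod, cluster DNS resolves Service names — a sketch reusing the `web-app` Service from earlier (the `prod` namespace is illustrative):

```shell
$ curl http://web-app                          # Service in the same namespace
$ curl http://web-app.prod                     # Service "web-app" in namespace "prod"
$ curl http://web-app.prod.svc.cluster.local   # fully qualified form
```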

🔐

Secrets Management

Store and inject credentials without baking them into images. Rotate secrets without redeploying code.
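Creating a Secret from the CLI — names and values are illustrative:

```shell
$ kubectl create secret generic db-creds \
    --from-literal=username=admin \
    --from-literal=password=s3cret
secret/db-creds created

# Values are stored base64-encoded, not encrypted by default
$ kubectl get secret db-creds -o yaml
```

Pods reference the Secret as env vars or mounted files, so rotating it means updating one object — no image rebuild, no redeploy of code.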

📦

Run Anywhere

AWS EKS, GCP GKE, Azure AKS, bare metal, or your laptop. Same YAML works everywhere.

When to Use It

Use when:

  • Running microservices that need to scale independently
  • You need auto-scaling for variable traffic patterns
  • Multi-cloud or hybrid deployments are required
  • Team needs standardized deployment workflows
  • Zero-downtime deployments are a hard requirement
  • You're running 10+ services in production
  • You need strong isolation between teams/environments

⚠️ Skip if:

  • You have a simple app that runs on one server
  • Your team is small and doesn't have K8s experience
  • The app is a monolith that doesn't need scaling
  • You're prototyping or building an MVP
  • A PaaS (Heroku, Railway, Fly.io) would suffice
  • You can't dedicate time to learn and maintain it
  • Your workload is serverless (Lambda/Cloud Functions)

Trade-offs

Pros

  • Industry standard — massive community and ecosystem
  • Cloud-agnostic — same YAML runs on AWS, GCP, Azure
  • Self-healing and auto-scaling out of the box
  • Declarative — describe what you want, not how to get there
  • Extensible — Custom Resource Definitions (CRDs) for anything
  • Battle-tested — runs at Google, Spotify, Airbnb scale

Cons

  • Steep learning curve — lots of concepts to internalize
  • Operational overhead — clusters need maintenance, upgrades
  • Resource hungry — control plane alone needs 2+ CPU, 4GB+ RAM
  • YAML fatigue — verbose configs even for simple things
  • Debugging is hard — distributed systems are inherently complex
  • Overkill for simple apps — sometimes docker compose is enough

Key Takeaways

1

Kubernetes is a container orchestrator

It doesn't run containers — it manages them. It decides where they run, restarts them when they fail, and scales them when needed.

2

Declarative, not imperative

You describe the desired state ("I want 5 replicas"). Kubernetes figures out how to get there and keeps it that way.

3

Pods are the smallest unit

You don't deploy containers directly — you deploy Pods (which wrap containers). But you usually don't create Pods directly either — you use Deployments.

4

Services provide stable networking

Pods are ephemeral — they get new IPs every time. Services give you a stable endpoint that load-balances across healthy pods.

5

Pick the right tool for the job

kubectl for learning & debugging, Helm for packaging complex apps, Kustomize for environment-specific overlays. They're not mutually exclusive.

6

Production needs more than just deploying

Health probes, resource limits, RBAC, and monitoring are not optional. A cluster without these is a ticking time bomb.

7

Start with managed Kubernetes

Don't run your own control plane. Use EKS, GKE, or AKS — they handle upgrades, etcd backups, and high availability. Focus on your apps.

8

It's complex, but worth it at scale

Kubernetes has a real learning curve. But once you're running 10+ services that need to scale, heal, and update independently — nothing else comes close.