2. Architecture & Control Plane
The Two Halves of Kubernetes
Every Kubernetes cluster splits into two layers: the control plane and the worker nodes.
The control plane is the brain. It stores desired state, makes scheduling decisions, and runs reconciliation loops. It never runs your application containers directly.
Worker nodes are the muscle. They run your actual workloads — your pods, your containers. Each node has an agent (kubelet) that takes orders from the control plane and reports back on status.
In managed services like GKE, EKS, or AKS, the cloud provider runs the control plane for you. You only manage the worker nodes. In self-managed clusters (kubeadm, k3s), you manage both.
# See this split in action:
kubectl get nodes
# NAME       STATUS   ROLES           AGE   VERSION
# master-1   Ready    control-plane   30d   v1.29.2
# worker-1   Ready    <none>          30d   v1.29.2
# worker-2   Ready    <none>          30d   v1.29.2

Let's walk through every component, starting with the control plane.
API Server: The Front Door
The API server (kube-apiserver) is the only component that talks to everything. Every interaction — from kubectl commands to internal component communication — goes through it.
It does three things:
1. Authenticates and authorizes every request (who are you, and are you allowed to do this?)
2. Validates and admits the objects in those requests (schema validation, admission control)
3. Persists the accepted state to etcd — it is the only component that writes there
# The API server runs as a pod in kube-system
kubectl get pods -n kube-system | grep apiserver
# kube-apiserver-master-1 1/1 Running 0 30d
# Check its config
kubectl describe pod kube-apiserver-master-1 \
  -n kube-system | grep -A5 "Command:"

Think of the API server as a bouncer + receptionist + filing clerk. Nothing happens in the cluster without it knowing.
etcd: The Source of Truth
etcd is a distributed key-value store that holds all cluster state. Every Deployment, Service, ConfigMap, Secret, and Pod definition lives here.
Only the API server reads from and writes to etcd directly. No other component touches it. This is by design — it centralizes access control and consistency.
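To make this concrete, objects live in etcd as flat keys under a /registry/... prefix, and queries are served as prefix scans. The sketch below is a plain dict standing in for etcd; the key layout mirrors the real convention, but the values are simplified:

```python
# etcd as a flat key-value store, keyed by /registry/<type>/<namespace>/<name>.
# Only the API server holds a connection to it.
etcd = {
    "/registry/deployments/default/api": {"replicas": 3},
    "/registry/services/default/api":    {"clusterIP": "10.96.0.12"},
    "/registry/pods/default/api-7f9c":   {"node": "worker-1"},
}

def list_by_prefix(store, prefix):
    """Prefix scan -- roughly how a 'list pods' request is served."""
    return {k: v for k, v in store.items() if k.startswith(prefix)}

print(list_by_prefix(etcd, "/registry/pods/"))
```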
# etcd runs as a pod (self-managed clusters)
kubectl get pods -n kube-system | grep etcd
# etcd-master-1 1/1 Running 0 30d
# Check etcd health
kubectl exec -it etcd-master-1 -n kube-system -- \
etcdctl endpoint health \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key

Production rule: Always back up etcd on a schedule. In managed K8s (GKE, EKS), the provider does this for you. In self-managed clusters, this is your responsibility. Test your restores.
Scheduler: Pod Placement
The scheduler (kube-scheduler) watches for newly created pods that don't have a node assigned yet, then picks the best node for each one.
It runs a two-phase algorithm:
1. Filtering: eliminate nodes that can't run the pod (insufficient resources, taints, node selectors)
2. Scoring: rank the surviving nodes and bind the pod to the highest-scoring one
# The scheduler uses resource requests for placement
spec:
  containers:
  - name: api
    resources:
      requests:        # ← scheduler looks at these
        memory: "512Mi"
        cpu: "250m"
      limits:          # ← kubelet enforces these
        memory: "1Gi"
        cpu: "500m"
# If no node has 512Mi free, the pod stays Pending

The scheduler doesn't move running pods. It only makes placement decisions for new ones. If a node dies, the controller manager creates new pods, and the scheduler places those.
Controller Manager: The Reconciliation Engine
The controller manager (kube-controller-manager) runs dozens of controllers in a single process. Each controller is a loop that watches the current state, compares it to the desired state, and takes action to close the gap.
ReplicaSet Controller
Desired: 5 replicas, actual: 3 running
Action: create 2 more pods
Node Controller
Watches node heartbeats (every 10s)
Node stops reporting → mark NotReady (40s)
Still gone → evict pods (5min default)
Deployment Controller
Manages ReplicaSets for rolling updates
Old RS: scale down; new RS: scale up
Job Controller
Runs pods to completion
Retries on failure up to backoffLimit
Endpoint Controller
Updates Service → Pod IP mappings
When pods come/go, endpoints update automatically

This is the magic of Kubernetes: you describe the desired state, and controllers continuously work to make it real. It's not a one-time action — it's a loop that never stops.
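A single reconciliation pass, ReplicaSet-style, is just "compare desired vs actual, close the gap." The sketch below shows one pass; real controllers run it forever, triggered by watch events (pod names here are illustrative):

```python
# One ReplicaSet-style reconciliation pass: create or delete pods
# until the actual count matches the desired count.
def reconcile(desired, actual_pods):
    diff = desired - len(actual_pods)
    if diff > 0:    # too few: create the missing pods
        actual_pods = actual_pods + [
            f"pod-{i}" for i in range(len(actual_pods), desired)
        ]
    elif diff < 0:  # too many: delete the surplus
        actual_pods = actual_pods[:desired]
    return actual_pods

pods = ["pod-0", "pod-1", "pod-2"]  # actual: 3 running
pods = reconcile(5, pods)           # desired: 5 -> controller creates 2 more
print(pods)
```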
Kubelet: The Node Agent
The kubelet runs on every worker node. It's the bridge between the control plane and the actual containers. It watches the API server for pods assigned to its node, then makes them happen.
Kubelet responsibilities:
1. Watches API server for pod assignments
"Is there a new pod for my node?"
2. Pulls container images via container runtime
"containerd, pull myregistry/api:v2"
3. Creates and starts containers
"Start this container with these env vars,
mounts, and resource limits"
4. Runs health checks (liveness, readiness, startup)
"GET /health every 10s, restart if 3 failures"
5. Reports pod status back to API server
"Pod nginx is Running, container ready"
6. Reports node status (capacity, allocatable)
"I have 4 CPU, 16Gi memory, 110 pods max"Key distinction: the kubelet is the only component that runs directly on the host, not as a pod. It's a systemd service. This makes sense — you need the kubelet running before any pods can exist on that node.
Kube-proxy: Service Networking
kube-proxy runs on every node and implements Kubernetes Service networking. When you create a Service, kube-proxy sets up rules so that traffic to the Service IP gets routed to the right pods.
# You create a Service:
apiVersion: v1
kind: Service
metadata:
  name: api
spec:
  selector:
    app: api             # ← matches pods with this label
  ports:
  - port: 80             # ← Service listens on :80
    targetPort: 3000     # ← forwards to pod :3000

# kube-proxy creates iptables/IPVS rules:
# Traffic to api:80 (ClusterIP) →
#   round-robin to pod IPs on port 3000
#
# Other pods just call http://api:80
# DNS resolves "api" → ClusterIP
# kube-proxy handles the rest

In newer clusters, you may see Cilium or Calico replacing kube-proxy entirely with eBPF-based networking. Same job, better performance.
Container Runtime: Where Containers Actually Run
The container runtime is what actually pulls images and starts containers. The kubelet talks to it via the Container Runtime Interface (CRI) — a standardized API so Kubernetes doesn't care which runtime you use.
containerd (most common)
- Default in GKE, EKS, AKS, k3s
- Lightweight, purpose-built for K8s
- Spun out of Docker in 2017
- Uses: containerd → runc → Linux namespaces
CRI-O
- Built specifically for Kubernetes
- Default in OpenShift (Red Hat)
- Minimal, follows OCI standards exactly
Docker (dockershim removed in K8s 1.24)
- dockerd → containerd → runc
- Extra layer of indirection; deprecated in 1.20, removed in 1.24
- Your Docker images still work everywhere
- Only the runtime shim was removed

Common misconception: "Kubernetes dropped Docker support." Not quite. K8s dropped the dockershim (the CRI adapter for Docker's runtime). Docker-built images are OCI-compliant and work on any runtime. You build with Docker, run with containerd.
How a Pod Gets Scheduled: Step by Step
Let's trace what happens when you run kubectl apply -f pod.yaml, from keystroke to running container.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    app: web
spec:
  containers:
  - name: nginx
    image: nginx:1.25
    ports:
    - containerPort: 80
    resources:
      requests:
        cpu: "100m"
        memory: "128Mi"

1. kubectl sends the Pod manifest to the API server, which authenticates, authorizes, validates, and admits it
2. The API server writes the Pod object to etcd; the pod now exists, but has no node assigned
3. The scheduler spots the unassigned pod, runs filtering and scoring, and binds it to a node
4. The kubelet on that node sees the assignment and tells the container runtime to pull nginx:1.25 and start the container
5. The kubelet reports the pod's status (Running) back to the API server, which stores it in etcd

The entire flow takes seconds. No human intervention. Every component did exactly one job: API server validated, etcd stored, scheduler placed, kubelet executed. This separation of concerns is what makes Kubernetes reliable at scale.
Inspect Every Component Live
Theory is good. Running commands on a real cluster is better. Here are the commands that let you see each architectural component in action.
# Cluster info and health
kubectl cluster-info
# Kubernetes control plane is running at https://...
# CoreDNS is running at https://...
# All nodes and their roles
kubectl get nodes -o wide
# NAME       STATUS   ROLES           VERSION   INTERNAL-IP   OS-IMAGE
# master-1   Ready    control-plane   v1.29.2   10.0.0.1      Ubuntu 22.04
# worker-1   Ready    <none>          v1.29.2   10.0.0.2      Ubuntu 22.04
# Control plane components (running as pods)
kubectl get pods -n kube-system
# kube-apiserver-master-1            1/1   Running
# kube-controller-manager-master-1   1/1   Running
# kube-scheduler-master-1            1/1   Running
# etcd-master-1                      1/1   Running
# kube-proxy-xxxxx                   1/1   Running   (per node)
# coredns-xxxxx                      1/1   Running