1. VPC Networking
What Is a VPC?
A VPC (Virtual Private Cloud) is your own isolated network inside a cloud provider like AWS, GCP, or Azure.
Think of it like renting a floor in an office building. The building (cloud provider) has shared infrastructure: electricity, elevators, security guards. But your floor is completely private. You decide who gets in, how rooms are connected, and what goes where.
Every resource you launch (Compute Engine VMs, Cloud SQL databases, Cloud Functions, or their AWS equivalents: EC2, RDS, Lambda) lives inside a VPC. Understanding VPC networking is foundational to cloud architecture.
Subnets: Public vs Private
A VPC is divided into subnets, logical partitions of your network. In AWS, each subnet lives in a single Availability Zone; in GCP, subnets are regional and span all zones in that region.
Think of subnets as different rooms on your office floor. The lobby is public: anyone can walk in. The back office requires a badge, employees only. The vault has no outside access at all, only internal staff with special clearance.
Public subnet: Has a route to the internet. Hosts resources that need to be reachable from outside: load balancers, bastion hosts, web servers.
Private subnet: No direct internet access. Hosts your application servers and internal services. Nothing here should be publicly reachable.
Isolated subnet: No internet access, not even outbound. Databases (Cloud SQL / RDS, Memorystore / ElastiCache) live here. Maximum isolation.
This is the core mental model: public-facing resources in public subnets, everything else in private or isolated subnets.
Internet Gateway & NAT Gateway
Two gateways control how your VPC talks to the internet.
The IGW is like the front door of your building: visitors can come in and employees can go out. The NAT is like a mail room: employees in the back office can send letters out, but no one outside can walk in through the mail room.
Internet Gateway (IGW): Attaches to your VPC and allows bidirectional internet access for public subnets. Inbound traffic from users reaches your ALB through the IGW.
NAT Gateway: Allows private subnet resources to make outbound requests (pull Docker images, call external APIs) without being exposed to inbound internet traffic. Called Cloud NAT in GCP, NAT Gateway in AWS.
In the diagram, the App Server in the private subnet routes outbound traffic through NAT → IGW → Internet. But no one on the internet can initiate a connection back to it.
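The asymmetry works because the NAT gateway keeps a translation table: an outbound connection creates an entry, and only return traffic matching an existing entry is forwarded back in. A minimal sketch of that mechanism (class and addresses are illustrative, not a real gateway API):

```python
class NatGateway:
    """Toy model of NAT connection tracking: outbound creates a mapping,
    inbound is forwarded only if a mapping already exists."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.table = {}        # public_port -> (private_ip, private_port)
        self.next_port = 1024

    def outbound(self, private_ip, private_port):
        # App server initiates a connection: allocate a public port mapping
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return (self.public_ip, public_port)

    def inbound(self, public_port):
        # Return traffic matches an entry and is forwarded; anything else
        # (an unsolicited inbound connection) gets None, i.e. dropped
        return self.table.get(public_port)

nat = NatGateway("203.0.113.7")
src = nat.outbound("10.0.2.15", 43512)  # app server calls an external API
print(nat.inbound(src[1]))              # response maps back to the app server
print(nat.inbound(9999))                # unsolicited inbound: None (dropped)
```

This is also why "private" does not mean "offline": the app server can still patch itself and pull images, it just can never be the target of a connection from outside.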
NAT costs money. Cloud NAT charges per VM + data; AWS NAT Gateway runs ~$32/mo plus data processing. For dev environments, some teams use NAT instances or Private Service Connect / VPC endpoints instead.
Route Tables
Every subnet has a route table, a set of rules that decide where traffic goes.
Like signs in the hallway: “Floor 2? Take the stairs. Outside? Go through the front door. Restricted area? No exit.” Route tables tell every packet where to go next.
# Public subnet route table
10.0.0.0/16 → local
0.0.0.0/0 → igw-abc123 ← internet
# Private subnet route table
10.0.0.0/16 → local
0.0.0.0/0 → nat-xyz789 ← outbound only
The 10.0.0.0/16 → local rule means all traffic within the VPC stays internal. The 0.0.0.0/0 rule is the default route, which is where traffic goes when there's no more specific match.
This is what makes a subnet “public” or “private.” It's not a property of the subnet itself; it's the route table. Point the default route at an IGW = public. Point it at a NAT = private. No default route = isolated.
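Route selection is longest-prefix match: the most specific matching prefix wins, and 0.0.0.0/0 (prefix length 0) matches everything, so it only catches traffic nothing else claimed. A small sketch of the public subnet's table above, using Python's stdlib ipaddress module (the igw target name is illustrative):

```python
import ipaddress

# The public subnet route table from the example above
routes = [
    (ipaddress.ip_network("10.0.0.0/16"), "local"),
    (ipaddress.ip_network("0.0.0.0/0"), "igw-abc123"),
]

def next_hop(dst):
    """Longest-prefix match: among matching routes, the most specific wins."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, target) for net, target in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(next_hop("10.0.1.42"))      # local: stays inside the VPC
print(next_hop("93.184.216.34"))  # igw-abc123: falls through to the default route
```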
Firewall Rules
Firewall rules control traffic at the network level. The implementation differs by provider.
Think of security guards at different checkpoints. One guard sits at each room's door (instance-level) and remembers who they let in: if you entered, you can leave freely. Another guard sits at the hallway entrance (subnet-level) and checks both directions independently, badge in, badge out, every single time.
GCP: Firewall rules are stateful, applied at the VPC level, and targeted by network tags or service accounts. Each rule specifies direction (ingress/egress), priority, source/destination, protocol, and port. Deny-all-ingress and allow-all-egress are implied defaults.
AWS: Two layers: Security Groups (stateful, per-instance: if you allow inbound 443, the response is auto-allowed) and NACLs (stateless, per-subnet: must explicitly allow both directions). Most teams rely on Security Groups and leave NACLs at defaults.
In the diagram, the red bars are NACLs at the subnet boundary, and the orange rings are Security Groups around each instance.
# Example Security Group rules
ALB: Allow TCP 443 from 0.0.0.0/0
App: Allow TCP 8080 from ALB-SG only
DB: Allow TCP 5432 from App-SG only
In practice, reach for NACLs when you need to explicitly block specific IPs or ranges at the subnet boundary. Whichever provider you use, every rule comes down to the same fields: source/destination, protocol, port, and allow/deny, applied to ingress (incoming) or egress (outgoing) traffic.
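The chaining in the Security Group example above is the key pattern: each tier's ingress rule references the previous tier's security group rather than an IP range. A toy evaluator for that rule set (group names are illustrative; real SG matching also handles CIDR containment, which this sketch skips):

```python
# Ingress rules keyed by destination security group; each entry is
# (allowed source, allowed port). Sources are either a CIDR or another SG.
rules = {
    "alb-sg": [("0.0.0.0/0", 443)],  # anyone on the internet, HTTPS only
    "app-sg": [("alb-sg", 8080)],    # only traffic from the load balancer's SG
    "db-sg":  [("app-sg", 5432)],    # only traffic from the app tier's SG
}

def allowed(dest_sg, source, port):
    """source is a CIDR string or the calling instance's SG name."""
    return any(src == source and p == port for src, p in rules[dest_sg])

print(allowed("alb-sg", "0.0.0.0/0", 443))  # users can reach the LB
print(allowed("db-sg", "app-sg", 5432))     # app tier can query the DB
print(allowed("db-sg", "0.0.0.0/0", 5432))  # the internet cannot touch the DB
```

Referencing groups instead of IPs means the rules keep working as instances scale up, down, or get replaced.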
VPC Peering & Transit Gateway
What if your services span multiple VPCs? Maybe you have separate VPCs for staging/production, or different teams own different VPCs.
VPC Peering is like building a private skybridge between two office buildings: direct and fast, but you need a separate bridge for every pair. A Transit Gateway is like a central bus terminal: every building connects once to the hub, and the hub routes between all of them.
VPC Peering: A direct 1:1 connection between two VPCs. Traffic stays on the cloud provider's backbone and never touches the public internet. Simple but doesn't scale: N VPCs need N(N-1)/2 peering connections.
Transit Gateway: A central hub that connects multiple VPCs (and on-prem networks) through a single attachment point. Much cleaner at scale: add a new VPC with one connection instead of peering with every other VPC. Called Network Connectivity Center (NCC) in GCP, Transit Gateway in AWS.
Key limitation: VPC Peering is non-transitive. If A peers with B, and B peers with C, A cannot reach C through B. Transit Gateway solves this.
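Both points, non-transitivity and the quadratic mesh, can be sketched in a few lines (VPC names are illustrative):

```python
from itertools import combinations

# Peering connections as undirected edges. A packet can only cross a
# direct edge: there is no transit through an intermediate VPC.
peerings = {frozenset({"A", "B"}), frozenset({"B", "C"})}

def can_reach(src, dst):
    return frozenset({src, dst}) in peerings

print(can_reach("A", "B"))  # True: direct peering exists
print(can_reach("A", "C"))  # False: A-B and B-C do not imply A-C

# A full mesh grows quadratically: N VPCs need N*(N-1)/2 connections,
# while a transit hub needs just one attachment per VPC.
vpcs = ["A", "B", "C", "D", "E"]
print(len(list(combinations(vpcs, 2))))  # 10 peerings for 5 VPCs; a hub needs 5
```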
Shared VPC: A GCP-native concept where a networking team owns the VPC (host project) centrally, while application teams deploy resources into shared subnets (service projects). AWS achieves similar with RAM (Resource Access Manager). Common in enterprise environments for centralized network control with decentralized deployment.
VPCs can connect to on-premises networks via Cloud VPN / Site-to-Site VPN (encrypted tunnel over internet) or Dedicated Interconnect / Direct Connect (private physical link). This enables hybrid cloud architectures and gradual cloud migration.
CIDR Blocks & IP Planning
Every VPC needs a CIDR block, which defines the range of IP addresses available inside it.
Think of it like postal codes. Your building gets a zip code range (the VPC CIDR), and each floor gets its own sub-range (subnet CIDRs). Two buildings can't share the same zip codes or mail gets lost. Same goes for IP addresses and peering.
10.0.0.0/16 → 65,536 IPs (VPC)
10.0.1.0/24 → 256 IPs (public subnet)
10.0.2.0/24 → 256 IPs (private subnet)
10.0.3.0/24 → 256 IPs (isolated subnet)
The /16 means the first 16 bits are the network prefix; the remaining 16 bits are for hosts. Smaller number = more IPs.
Critical rule: VPCs that need to peer cannot have overlapping CIDR blocks. If VPC A is 10.0.0.0/16 and VPC B is also 10.0.0.0/16, they can never be peered. Plan your IP space upfront.
Cloud providers reserve IPs in every subnet: GCP reserves 4 (network, gateway, second-to-last, and broadcast addresses), AWS reserves 5 (network, router, DNS, one reserved for future use, and broadcast). A /24 gives you 251 (AWS) or 252 (GCP) usable IPs, not 256.
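Python's stdlib ipaddress module handles all of this arithmetic, which makes it handy for sanity-checking an IP plan before you commit to it:

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
print(vpc.num_addresses)            # 65536 total addresses in the VPC

# Carve /24 subnets out of the /16; the first few match the plan above
subnets = list(vpc.subnets(new_prefix=24))
print(subnets[1])                   # 10.0.1.0/24, the public subnet
print(subnets[1].num_addresses)     # 256 addresses per /24

# Usable IPs after provider-reserved addresses (5 on AWS, 4 on GCP)
print(subnets[1].num_addresses - 5) # 251 usable on AWS

# The peering rule: overlapping CIDRs can never be peered
other = ipaddress.ip_network("10.0.0.0/16")
print(vpc.overlaps(other))          # True, so these two VPCs cannot peer
```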
Putting It All Together
A typical production VPC looks like this:
- Public subnet: ALB receives traffic from the Internet Gateway
- Private subnet: App servers process requests, use NAT for outbound
- Isolated subnet: Cloud SQL/Memorystore (or RDS/ElastiCache) with no internet access at all
- Firewall rules: least-privilege per resource (LB allows 443, app allows LB only, DB allows app only)
This is the classic 3-tier architecture: web tier (public) → app tier (private) → data tier (isolated).
Request flow:
Internet → IGW → LB (public) → App (private) → DB (isolated)
Full production stack:
VPC + NAT + Firewall rules + Peering + VPN for hybrid
A well-designed VPC is defense in depth. Multiple layers of network isolation mean that even if one layer is compromised, the blast radius stays contained.