dmai/blog

Database

3. Performance Techniques

Mar 20, 2026·14 min read

You know how to read an EXPLAIN ANALYZE plan. Now what? This article is a collection of practical techniques to make your PostgreSQL queries faster. Not theoretical advice, but things you can apply to your schema and queries today.

Each technique includes the why, the how, and a realistic example showing the before and after.

Use Integer Enums, Not Strings

A common pattern is storing status values as strings: status TEXT DEFAULT 'pending'. It is readable, but it wastes space and slows down comparisons. Every string comparison is a byte-by-byte check. An integer comparison is a single CPU instruction.

The storage difference:

- TEXT 'pending' - 7 bytes + 1 byte length header = 8 bytes per row
- TEXT 'delivered' - 9 bytes + 1 byte length header = 10 bytes per row
- SMALLINT (1, 2, 3) - 2 bytes per row, always. No matter how many statuses you have.

On a table with 10 million rows, that is 80MB vs 20MB just for one column. Smaller rows mean more rows fit per 8KB page, which means fewer pages to read, fewer cache misses, and faster sequential scans.
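The arithmetic is easy to sanity-check. A back-of-envelope sketch in Python, ignoring per-tuple headers and alignment padding (which affect both layouts):

```python
ROWS = 10_000_000

# Bytes per value: TEXT 'pending' = 7 chars + 1-byte varlena header; SMALLINT = 2
for name, width in [("TEXT 'pending'", 8), ("SMALLINT", 2)]:
    mb = ROWS * width / 1_000_000
    print(f"{name:>14}: {mb:.0f} MB for the column alone")
```

At 10 million rows the string column costs 80 MB against 20 MB for the SMALLINT, before counting the knock-on savings in indexes and cache.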

sql
-- Instead of this:
CREATE TABLE orders (
    id         BIGINT PRIMARY KEY,
    status     TEXT NOT NULL DEFAULT 'pending'
);

-- Do this:
CREATE TABLE orders (
    id         BIGINT PRIMARY KEY,
    status     SMALLINT NOT NULL DEFAULT 0
    -- 0 = pending, 1 = processing, 2 = shipped, 3 = delivered, 4 = cancelled
);

-- Keep a reference table for readability:
CREATE TABLE order_status (
    id    SMALLINT PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE
);

INSERT INTO order_status VALUES
    (0, 'pending'),
    (1, 'processing'),
    (2, 'shipped'),
    (3, 'delivered'),
    (4, 'cancelled');

Index lookups are faster too. B-tree index nodes hold more integer keys per page than string keys, so the tree is shallower and requires fewer I/O operations to traverse.

What about PostgreSQL's native ENUM type?

PostgreSQL has CREATE TYPE status AS ENUM (...). Internally it stores enum values as 4-byte OIDs, which is better than strings but worse than SMALLINT (2 bytes). The real problem is evolution: ALTER TYPE ... ADD VALUE can add new values, but removing a value is not supported at all, and before PostgreSQL 12 ADD VALUE could not even run inside a transaction block. Restructuring the enum means recreating the type and rewriting every column that uses it, which is painful on a busy production database. A SMALLINT with a reference table is safer to evolve.

UUIDv7 Over UUIDv4 for Primary Keys

UUIDv4 is fully random. That randomness is great for uniqueness but terrible for B-tree indexes. Every new insert lands on a random leaf page, causing constant page splits and cache thrashing.

UUIDv4 (random)

550e8400-e29b-41d4-a716-446655440000

f47ac10b-58cc-4372-a567-0e02b2c3d479

7c9e6679-7425-40de-944b-e07fc1f90ae7

No ordering. Each insert goes to a different part of the index. The B-tree leaf pages are constantly split and rearranged.

UUIDv7 (timestamp-prefixed)

018f6b60-c000-7000-8000-000000000001 (t0)

018f6b60-c100-7000-8000-000000000002 (t0 + 256 ms)

018f6b60-c200-7000-8000-000000000003 (t0 + 512 ms)

First 48 bits are a millisecond-precision Unix timestamp. IDs are naturally sorted by creation time. New inserts always go to the rightmost leaf page, just like a BIGSERIAL.
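You can see the timestamp prefix directly. A small Python sketch that decodes the leading 48 bits of any UUIDv7 (the sample value is the UUIDv7 test vector from RFC 9562):

```python
import uuid
from datetime import datetime, timezone

def uuidv7_time(u: uuid.UUID) -> datetime:
    """Decode the 48-bit millisecond Unix timestamp that prefixes a UUIDv7."""
    ms = u.int >> 80  # keep the top 48 of 128 bits
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)

u = uuid.UUID("017f22e2-79b0-7cc3-98c4-dc0c0c07398f")
print(uuidv7_time(u).isoformat())  # → 2022-02-22T19:22:22+00:00
```

Because the prefix sorts chronologically, comparing two UUIDv7 values compares their creation times, which is exactly why the B-tree stays append-only.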

The performance difference is dramatic on write-heavy tables:

sql
-- UUIDv4: random inserts scatter across the index
-- Each insert may need to load a different leaf page from disk

CREATE TABLE events_v4 (
    id   UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    data JSONB
);

-- UUIDv7: sequential inserts append to the rightmost leaf
-- The hot pages stay in shared_buffers

CREATE TABLE events_v7 (
    id   UUID PRIMARY KEY DEFAULT uuidv7(),  -- pg 18+ or use an extension
    data JSONB
);

Why this matters at scale:

- Write amplification - UUIDv4 causes ~2-3x more page splits than UUIDv7 on a 100M-row table. Each split writes two pages instead of one.
- Cache efficiency - With UUIDv4, your shared_buffers hold random leaf pages from all over the index. With UUIDv7, the hot pages (rightmost) stay cached.
- Range queries are free - WHERE id > '018f6b60-c000-...' is effectively a time range query. You get time-based filtering without a separate created_at index.
- WAL volume - Fewer page splits means less WAL generated, which means replication lag stays lower.
sql
-- PostgreSQL 18+ has a built-in uuidv7() function
-- For older versions, use the pg_uuidv7 extension:
CREATE EXTENSION IF NOT EXISTS pg_uuidv7;

-- Or generate in your application layer:
-- Python: import uuid6; uuid6.uuid7()   (uuid6 package on PyPI)
-- Node:   import { v7 } from 'uuid';    (uuid package v10+)

When to still use BIGSERIAL:

If you do not need distributed ID generation (single database, no client-side IDs), BIGSERIAL is still the most storage-efficient option at 8 bytes vs 16 bytes for UUID. Use UUIDv7 when you need IDs generated across multiple services or want to avoid exposing auto-increment ordering to clients.

Covering Indexes (Index-Only Scans)

A regular index helps PostgreSQL find rows, but it still has to go to the heap (table) to fetch the actual column values. A covering index includes all the columns your query needs, so the heap is never touched.

sql
-- Your query:
SELECT customer, amount FROM orders WHERE status = 2;

-- Regular index: finds rows, then fetches customer + amount from heap
CREATE INDEX idx_orders_status ON orders(status);

-- Covering index: includes the columns you SELECT
CREATE INDEX idx_orders_status_covering ON orders(status)
    INCLUDE (customer, amount);
text
-- With regular index:
Index Scan using idx_orders_status on orders
    (actual time=0.03..15.21 rows=33289 loops=1)
  Index Cond: (status = 2)
  Buffers: shared hit=33380

-- With covering index:
Index Only Scan using idx_orders_status_covering on orders
    (actual time=0.02..4.12 rows=33289 loops=1)
  Index Cond: (status = 2)
  Heap Fetches: 0
  Buffers: shared hit=185

The covering index read 185 pages vs 33,380 pages for the regular index. That is a 180x reduction in I/O. The Heap Fetches: 0 confirms the heap was never touched.

INCLUDE vs compound index:

CREATE INDEX ON orders(status, customer, amount) also enables index-only scans, but it indexes all three columns. The INCLUDE clause stores customer and amount in the index leaf pages without indexing them. This keeps the B-tree smaller (faster lookups on status) while still avoiding heap fetches.

The visibility map requirement:

Index-only scans work best on tables with low update frequency. PostgreSQL must check the visibility map to confirm rows are visible to the current transaction. If a page has been recently updated, PostgreSQL falls back to fetching from the heap for those rows. Run VACUUM to update the visibility map after bulk updates.

Partial Indexes

If you only query a subset of rows, why index the entire table? A partial index only includes rows that match a WHERE clause, making it smaller and faster.

sql
-- You always query pending orders, never delivered ones.
-- Full index: indexes all 10M rows (including 8M delivered)
CREATE INDEX idx_orders_status ON orders(status);

-- Partial index: only indexes the ~200K pending rows
CREATE INDEX idx_orders_pending ON orders(created_at)
    WHERE status = 0;  -- 0 = pending

-- Your query matches the partial index exactly:
SELECT * FROM orders WHERE status = 0 ORDER BY created_at DESC LIMIT 20;
text
-- Full index size:    214 MB (indexes every row)
-- Partial index size:  4.3 MB (only pending rows)

Index Scan Backward using idx_orders_pending on orders
    (actual time=0.02..0.03 rows=20 loops=1)
  Buffers: shared hit=4

The partial index is 50x smaller and fits entirely in memory. It also stays small as delivered orders accumulate, because they are never added to this index.

Common partial index patterns:

- Active records only - WHERE deleted_at IS NULL or WHERE is_active = true
- Unprocessed queue items - WHERE processed_at IS NULL
- Recent data - WHERE created_at > '2024-01-01' (re-create periodically)
- Unique constraint on subset - CREATE UNIQUE INDEX ... WHERE deleted_at IS NULL (allows soft deletes with unique emails)

Stop Using SELECT *

SELECT * forces PostgreSQL to read every column from the heap, even if you only need two fields. This has three concrete costs:

- Wider rows = more I/O - If your table has a JSONB column averaging 2KB, SELECT * reads it even when you only need the id and status. That is 2KB per row of wasted I/O.
- Prevents index-only scans - SELECT * always requires a heap fetch because no index covers all columns. SELECT id, status could use a covering index and skip the heap entirely.
- More memory for sorts and hashes - Wider rows mean work_mem fills up faster. A sort on 10 columns spills to disk sooner than a sort on 2 columns.
sql
-- Table has 15 columns including a large JSONB payload

-- Bad: reads all 15 columns from heap
SELECT * FROM events WHERE user_id = 42 ORDER BY created_at DESC LIMIT 10;
-- Buffers: shared hit=4250, width=1843

-- Good: reads only what you need
SELECT id, event_type, created_at
FROM events WHERE user_id = 42 ORDER BY created_at DESC LIMIT 10;
-- Buffers: shared hit=12, width=36 (with covering index)

Column Order Matters for Storage

PostgreSQL aligns columns in memory based on their type. Poor column ordering wastes bytes to padding. This is called alignment padding, and it adds up.

sql
-- Bad ordering: wastes space due to alignment padding
CREATE TABLE bad_layout (
    a SMALLINT,    -- 2 bytes + 6 bytes padding (next column needs 8-byte alignment)
    b BIGINT,      -- 8 bytes
    c SMALLINT,    -- 2 bytes + 6 bytes padding
    d BIGINT,      -- 8 bytes
    e BOOLEAN      -- 1 byte + 7 bytes padding
);
-- Per row: 2+6+8+2+6+8+1+7 = 40 bytes

-- Good ordering: largest to smallest
CREATE TABLE good_layout (
    b BIGINT,      -- 8 bytes
    d BIGINT,      -- 8 bytes
    a SMALLINT,    -- 2 bytes (no padding, next is also 2-byte aligned)
    c SMALLINT,    -- 2 bytes
    e BOOLEAN      -- 1 byte + 3 bytes padding to the 8-byte MAXALIGN tuple boundary
);
-- Per row: 8+8+2+2+1+3 = 24 bytes

That is 40% less storage per row, just from reordering columns. On a table with 100 million rows, this saves over 1.5 GB of disk and shared_buffers.

Rule of thumb:

Order columns from largest fixed-size to smallest: BIGINT (8), INTEGER/FLOAT (4), SMALLINT (2), BOOLEAN/CHAR(1) (1). Put variable-length columns (TEXT, JSONB, BYTEA) at the end. They have a 1-4 byte length header and do not cause alignment issues.
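The padding rule is mechanical enough to compute. A Python sketch of the layout algorithm (each fixed-size column aligns to its own size, and the tuple data pads out to the 8-byte MAXALIGN boundary):

```python
def tuple_data_bytes(cols):
    """cols: list of (size, alignment) per column, in declaration order."""
    off = 0
    for size, align in cols:
        off = -(-off // align) * align  # round offset up to the column's alignment
        off += size
    return -(-off // 8) * 8             # pad the whole tuple to MAXALIGN (8 bytes)

SMALLINT, BIGINT, BOOL = (2, 2), (8, 8), (1, 1)
print(tuple_data_bytes([SMALLINT, BIGINT, SMALLINT, BIGINT, BOOL]))  # → 40
print(tuple_data_bytes([BIGINT, BIGINT, SMALLINT, SMALLINT, BOOL]))  # → 24
```

The two results reproduce the bad_layout and good_layout sizes above.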

You can check actual tuple size with:

sql
SELECT pg_column_size(ROW(
    1::bigint, 2::bigint, 3::smallint, 4::smallint, true
));  -- Returns 45 (24-byte row header + 21 bytes of column data)

SELECT pg_column_size(ROW(
    1::smallint, 2::bigint, 3::smallint, 4::bigint, true
));  -- Returns 57 (24-byte row header + 33 bytes of data and padding)

-- (Heap tuples additionally pad to an 8-byte boundary, which is where
--  the 24 and 40 bytes per row above come from.)

EXISTS Over IN for Subqueries

When checking whether related rows exist in another table, EXISTS is usually at least as fast as IN, and often much faster with an index, because it can stop at the first match per outer row. The planner frequently turns both forms into a semi join, but EXISTS expresses that intent directly.

sql
-- Slower here: the entire subquery result is hashed before any customer is checked
SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders WHERE amount > 1000);

-- Faster: EXISTS stops at the first match per customer
SELECT * FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o
    WHERE o.customer_id = c.id AND o.amount > 1000
);
text
-- IN plan: builds a hash of ALL matching customer_ids, then probes
Hash Semi Join  (cost=2891.00..3012.50 rows=500 width=52)
  ->  Seq Scan on customers c
  ->  Hash
        ->  Seq Scan on orders o
              Filter: (amount > 1000)
              Rows Removed by Filter: 95000

-- EXISTS plan: for each customer, probe the index and stop at first hit
Nested Loop Semi Join  (cost=0.42..1650.00 rows=500 width=52)
  ->  Seq Scan on customers c
  ->  Index Scan using idx_orders_customer on orders o
        Index Cond: (customer_id = c.id)
        Filter: (amount > 1000)

The Semi Join in both plans means PostgreSQL only needs one match per outer row. But with EXISTS and an index, it finds that match instantly via the index instead of scanning all orders first.

When IN is fine:

For small, static lists like WHERE status IN (1, 2, 3), there is no difference. PostgreSQL optimizes literal IN lists into = ANY(ARRAY[...]) which is efficient. The problem is with subqueries that return large result sets.

Connection Pooling

Every PostgreSQL connection is a separate OS process. Creating one takes ~100ms and allocates ~10MB of memory. If your application opens a new connection per request, you are paying this cost on every single query.

The real cost of too many connections:

- Memory - 500 connections = ~5GB of memory just for process overhead, before any shared_buffers.
- Context switching - The OS scheduler thrashes between hundreds of processes. CPU cache lines are constantly evicted.
- Lock contention - More connections means more concurrent transactions competing for the same row locks, leading to longer wait times.
- Snapshot overhead - PostgreSQL tracks active transactions for MVCC. More connections = more snapshots = slower snapshot creation.
sql
-- Check your current connections:
SELECT count(*) AS total_connections,
       count(*) FILTER (WHERE state = 'active') AS active,
       count(*) FILTER (WHERE state = 'idle') AS idle,
       count(*) FILTER (WHERE state = 'idle in transaction') AS idle_in_tx
FROM pg_stat_activity
WHERE backend_type = 'client backend';

Use a connection pooler like PgBouncer or the built-in pooling in your application framework. The typical setup is 20-50 database connections serving hundreds of application threads.

How many connections do you actually need?

A good starting point is connections = (CPU cores * 2) + effective_spindle_count. For a machine with 8 cores and SSDs, that is around 20 connections. More connections do not make queries faster; they make them slower through contention. If you need more concurrency, add a pooler, not more connections.
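A connection pool is conceptually just a bounded queue of long-lived connections. A minimal Python sketch to make the mechanism concrete (the factory here is a stand-in; real code would pass something like psycopg.connect(...) and, in production, use a battle-tested pooler rather than this):

```python
import queue

class SimplePool:
    """Bounded pool: N long-lived connections shared by many callers."""
    def __init__(self, factory, size: int):
        self._q = queue.Queue(maxsize=size)
        for _ in range(size):
            self._q.put(factory())          # open all connections up front

    def acquire(self, timeout=None):
        return self._q.get(timeout=timeout)  # blocks while all are checked out

    def release(self, conn):
        self._q.put(conn)                    # hand the connection back for reuse

# Demo with dummy "connections":
pool = SimplePool(factory=object, size=3)
conns = [pool.acquire() for _ in range(3)]   # pool is now exhausted; a 4th caller waits
for c in conns:
    pool.release(c)
```

The point is that callers wait briefly for a warm connection instead of paying the fork-and-authenticate cost of a fresh one, and the database never sees more than `size` backends.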

JSONB: Use It Wisely

JSONB is powerful for flexible schemas, but it has performance traps. The biggest one: you cannot use a regular B-tree index on deeply nested fields.

sql
-- This query cannot use a B-tree index efficiently:
SELECT * FROM events WHERE payload->>'user_id' = '42';

-- It does a full table scan and extracts user_id from every JSONB document.
-- Even with a GIN index, the lookup is slower than a native column.

-- Better: extract frequently queried fields into columns
ALTER TABLE events ADD COLUMN user_id BIGINT
    GENERATED ALWAYS AS ((payload->>'user_id')::bigint) STORED;

CREATE INDEX idx_events_user_id ON events(user_id);

-- Now this is a fast index scan on a native BIGINT column:
SELECT * FROM events WHERE user_id = 42;

JSONB indexing options:

- GIN index (default) - CREATE INDEX ON events USING gin(payload). Supports @>, ?, ?&, ?| operators. Good for 'does this key exist' or 'does this value contain'. Slower to update than B-tree.
- GIN with jsonb_path_ops - CREATE INDEX ON events USING gin(payload jsonb_path_ops). Smaller and faster than default GIN, but only supports the @> (contains) operator.
- Expression index - CREATE INDEX ON events ((payload->>'user_id')). B-tree index on a specific extracted value. Fastest for point lookups on a known field.
- Generated column - Best option if you query the field often. Extracts into a native column with proper type. Full B-tree performance with no runtime JSON parsing.

The rule: if you query a JSONB field in WHERE clauses more than occasionally, extract it into a column or at minimum create an expression index.

Keep Tables Lean with VACUUM

PostgreSQL uses MVCC (Multi-Version Concurrency Control). When you update a row, the old version is not deleted. It is marked as dead and a new version is written. Over time, dead rows (called dead tuples) accumulate and bloat the table.

What bloat does to performance:

- Larger table on disk - Seq scans read dead rows along with live ones. A table with 50% bloat takes twice as long to scan.
- Index bloat - Indexes also accumulate pointers to dead rows. Larger indexes mean slower lookups and more memory usage.
- Wasted cache - Dead rows occupy space in shared_buffers, pushing live data out of cache.
sql
-- Check bloat on your tables:
SELECT relname,
       n_live_tup,
       n_dead_tup,
       round(n_dead_tup::numeric / greatest(n_live_tup, 1) * 100, 1) AS dead_pct,
       last_vacuum,
       last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 10;

-- If dead_pct is high (>20%), autovacuum may be falling behind.
-- Tune autovacuum for hot tables:
ALTER TABLE orders SET (
    autovacuum_vacuum_scale_factor = 0.01,    -- vacuum after 1% dead rows (default 20%)
    autovacuum_analyze_scale_factor = 0.005   -- analyze after 0.5% changes (default 10%)
);

VACUUM vs VACUUM FULL:

Regular VACUUM marks dead space as reusable but does not shrink the table file. It runs concurrently and does not block reads or writes. VACUUM FULL rewrites the entire table to reclaim disk space, but takes an ACCESS EXCLUSIVE lock that blocks everything. On production, avoid VACUUM FULL; use the pg_repack extension instead, which can reclaim space without holding long blocking locks.

Batch Your Writes

Every single INSERT statement has overhead: parse, plan, execute, write WAL, flush. Doing 10,000 individual inserts is catastrophically slower than one batched insert.

sql
-- Terrible: 10,000 round trips, 10,000 WAL flushes
INSERT INTO events (user_id, type) VALUES (1, 'click');
INSERT INTO events (user_id, type) VALUES (2, 'view');
INSERT INTO events (user_id, type) VALUES (3, 'click');
-- ... 9,997 more

-- Good: single statement, single WAL flush
INSERT INTO events (user_id, type) VALUES
    (1, 'click'),
    (2, 'view'),
    (3, 'click'),
    -- ... 9,997 more
;

-- Even better for bulk loads: COPY
COPY events (user_id, type) FROM STDIN WITH (FORMAT csv);
1,click
2,view
3,click
\.

Rough performance comparison (100K rows):

- Individual INSERTs - ~45 seconds. Each statement is parsed, planned, and committed separately.
- Batched INSERT (multi-row VALUES) - ~2 seconds. One parse, one plan, one commit.
- COPY - ~0.5 seconds. Binary protocol, minimal parsing, optimized for bulk loading.
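On the client side, batching usually means a chunking loop: accumulate rows, then issue one multi-row INSERT (or COPY) per chunk. A Python sketch — the chunk size and table name are illustrative, and the actual execute call depends on your driver:

```python
from itertools import islice

def chunked(rows, size):
    """Yield lists of at most `size` rows — one multi-row INSERT per chunk."""
    it = iter(rows)
    while batch := list(islice(it, size)):
        yield batch

rows = [(i, "click") for i in range(10_000)]
for batch in chunked(rows, 1_000):
    placeholders = ", ".join(["(%s, %s)"] * len(batch))
    sql = f"INSERT INTO events (user_id, type) VALUES {placeholders}"
    # cursor.execute(sql, [v for row in batch for v in row])
    # → one round trip and one commit per 1,000 rows instead of per row
```

Chunking also caps statement size, which matters because a single VALUES list with millions of rows can exhaust memory on both client and server.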

For updates, use UPDATE ... FROM with a values list instead of looping:

sql
-- Instead of 1000 individual UPDATEs:
UPDATE orders SET status = 3
FROM (VALUES (101), (102), (103), /* ... */) AS t(id)
WHERE orders.id = t.id;

Keyset Pagination Over OFFSET

OFFSET 100000 LIMIT 20 looks simple, but PostgreSQL has to scan and discard 100,000 rows to get to your 20. The deeper the page, the slower the query.

sql
-- Slow: gets worse linearly with page number
SELECT * FROM orders ORDER BY created_at DESC LIMIT 20 OFFSET 100000;
-- Scans 100,020 rows, returns 20. Page 5000 scans 100,000 rows.

-- Fast: constant time regardless of page depth
SELECT * FROM orders
WHERE created_at < '2024-06-15 10:30:00'  -- cursor from last row of previous page
ORDER BY created_at DESC
LIMIT 20;
-- Scans exactly 20 rows using the index.
text
-- OFFSET plan:
Limit (actual time=89.12..89.15 rows=20)
  ->  Sort (actual time=45.20..89.10 rows=100020)
        Sort Key: created_at DESC
        ->  Seq Scan on orders (actual time=0.01..28.50 rows=100000)

-- Keyset plan:
Limit (actual time=0.03..0.05 rows=20)
  ->  Index Scan Backward using idx_orders_created_at on orders
        Index Cond: (created_at < '2024-06-15 10:30:00')
        (actual time=0.02..0.04 rows=20)

Keyset pagination requires a unique, sortable column as the cursor. If created_at is not unique, add a tiebreaker:

sql
-- Compound cursor for non-unique columns:
SELECT * FROM orders
WHERE (created_at, id) < ('2024-06-15 10:30:00', 50432)
ORDER BY created_at DESC, id DESC
LIMIT 20;

-- Needs a matching compound index:
CREATE INDEX idx_orders_pagination ON orders(created_at DESC, id DESC);

When OFFSET is acceptable:

For small datasets (under 10K rows) or admin dashboards where users rarely go past page 5, OFFSET is fine and simpler to implement. The performance cliff only appears with deep pagination on large tables.
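Keyset cursors are usually shipped to the client as an opaque page token built from the last row's sort keys. A small Python sketch — the field names mirror the compound cursor above, and base64-wrapped JSON is just one reasonable encoding choice:

```python
import base64
import json

def encode_cursor(created_at: str, row_id: int) -> str:
    """Pack the last row's (created_at, id) into an opaque page token."""
    raw = json.dumps([created_at, row_id]).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(token: str) -> tuple:
    """Unpack a page token back into the (created_at, id) cursor values."""
    created_at, row_id = json.loads(base64.urlsafe_b64decode(token))
    return created_at, row_id

token = encode_cursor("2024-06-15 10:30:00", 50432)
print(decode_cursor(token))  # → ('2024-06-15 10:30:00', 50432)
```

The decoded pair plugs straight into the `WHERE (created_at, id) < (%s, %s)` predicate, and the token stays meaningless to the client, which discourages URL tampering with raw IDs.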

Quick Reference

1. Integer enums over strings - Smaller rows, faster comparisons, denser indexes
2. UUIDv7 over UUIDv4 - Sequential inserts, no page splits, range queries for free
3. Covering indexes (INCLUDE) - Skip the heap entirely with index-only scans
4. Partial indexes - Index only the rows you actually query
5. Explicit column selection - Enable index-only scans, reduce I/O and memory
6. Column ordering - Largest types first to minimize alignment padding
7. EXISTS over IN - Short-circuits at first match with index support
8. Connection pooling - 20-50 connections beats 500 connections
9. JSONB generated columns - Extract hot fields into native indexed columns
10. Tune autovacuum - Prevent bloat before it slows your queries
11. Batch writes - COPY > multi-row INSERT > individual INSERTs
12. Keyset pagination - Constant time vs linear degradation with OFFSET

Every optimization here follows the same principle: make PostgreSQL read less data. Smaller rows, smaller indexes, fewer pages, fewer heap fetches. The fastest I/O is the I/O you never do.

© 2026 dmai/blog Engineer Notes. All rights reserved.