PostgreSQL Partitioning – Real-World Lessons from Hussein's Database Lectures

Introduction

These are my personal notes from Hussein Nasser’s lecture series on database fundamentals. This lecture covered horizontal partitioning in PostgreSQL, how to use it, and when it helps.

Core Concepts / Overview

What is Partitioning?

Partitioning splits one big table into smaller sub-tables (partitions). The database decides which partition to use at query time, depending on the WHERE clause.

Horizontal partitioning = split by rows
Vertical partitioning = split by columns
Focus was mostly on horizontal partitioning

Key Characteristics

Partition Types

Range: rows fall into partitions based on a numeric or date range
List: rows split by fixed values (e.g., states, zip codes)
Hash: rows assigned by hash result
Vertical: columns are split; useful for large blob/text fields

PostgreSQL Facts

Parent table is empty. It’s just metadata.
Indexes created on the parent propagate to all partitions (Postgres 11+)
Partition pruning: planner ignores irrelevant partitions (must be enabled)

Practical Implementation

Setup Postgres in Docker

docker run --name pgdemo -e POSTGRES_PASSWORD=secret -d postgres
docker exec -it pgdemo bash
psql -U postgres

Create Original Table & Insert Data

CREATE TABLE grades_original (
  id SERIAL NOT NULL,
  g INT NOT NULL
);

INSERT INTO grades_original(g)
SELECT floor(random()*100)::int
FROM generate_series(1, 10000000);

CREATE INDEX idx_grades_original_g ON grades_original(g);

Create Partitioned Table

CREATE TABLE grades_parts (
  id SERIAL NOT NULL,
  g INT NOT NULL
) PARTITION BY RANGE (g);

Create & Attach Partitions

CREATE TABLE g_00_35  (LIKE grades_parts INCLUDING ALL);
CREATE TABLE g_36_60  (LIKE grades_parts INCLUDING ALL);
CREATE TABLE g_61_80  (LIKE grades_parts INCLUDING ALL);
CREATE TABLE g_81_100 (LIKE grades_parts INCLUDING ALL);

ALTER TABLE grades_parts ATTACH PARTITION g_00_35  FOR VALUES FROM (0)  TO (35);
ALTER TABLE grades_parts ATTACH PARTITION g_36_60  FOR VALUES FROM (35) TO (60);
ALTER TABLE grades_parts ATTACH PARTITION g_61_80  FOR VALUES FROM (60) TO (80);
ALTER TABLE grades_parts ATTACH PARTITION g_81_100 FOR VALUES FROM (80) TO (100);

Insert Data Into Partitions

INSERT INTO grades_parts SELECT * FROM grades_original;

PostgreSQL decides which partition each row goes to based on the value of g.

Indexing Partitions

CREATE INDEX idx_grades_parts_g ON grades_parts(g);

The above command creates indexes on all child partitions.

Query & Benchmark

EXPLAIN ANALYZE
SELECT COUNT(*) FROM grades_parts WHERE g = 30;

Hits only 1 partition (g_00_35) if partition pruning is enabled.
Partition pruning reduces query time from seconds to milliseconds for large tables.

Automating Partition Creation with Node.js

Hussein used Node.js to generate 100 partitions programmatically:

import pkg from "pg";
const { Client } = pkg;

const TOTAL = 1_000_000_000;
const STEP = 10_000_000;

async function main() {
  const client = new Client({
    connectionString: "postgres://postgres:secret@localhost:5432",
  });
  await client.connect();
  await client.query("CREATE DATABASE customers;");
  await client.end();

  const db = new Client({
    connectionString: "postgres://postgres:secret@localhost:5432/customers",
  });
  await db.connect();
  await db.query(`
    CREATE TABLE customers (
      id INT NOT NULL,
      name TEXT
    ) PARTITION BY RANGE (id);
  `);

  for (let i = 0; i < TOTAL / STEP; i++) {
    const from = i * STEP;
    const to = from + STEP;
    const name = `customers_${from}_${to}`;
    await db.query(`CREATE TABLE ${name} (LIKE customers INCLUDING ALL);`);
    await db.query(
      `ALTER TABLE customers ATTACH PARTITION ${name} FOR VALUES FROM (${from}) TO (${to});`
    );
  }

  await db.end();
}

main();

Performance Notes

Partition vs Non-partition Index Size

Full table index (10M rows): ~69 MB
The Smallest partition index: ~24 MB
Smaller indexes are faster to query time on full table: ~2–3 seconds
Query time on partition: ~1 second or less

Partition Pruning

SHOW enable_partition_pruning; -- Should be ON

If off, the planner hits all partitions, which defeats the purpose.

Advantages & Disadvantages

Pros

Query speed increases
Smaller, faster indexes
Easy to bulk load into individual partitions
Move old partitions to slower disks (archival)

Cons

Row updates that change partition key = delete + insert (slow)
Bad queries (e.g., WHERE id > 1) = scans all partitions
Schema changes may need more care depending on DBMS

Conclusion

Partitioning is one of the best techniques to handle large datasets, and Hussein’s examples made it super clear. You can see actual performance boosts only when the table is large enough and memory is limited (I/O-bound scenarios). For smaller tables or well-tuned memory, it may not show immediate speed gains.

Biggest lesson:

“The fastest way to query a table with a billion rows is to avoid querying a table with a billion rows.” — Hussein

Last updated Apr 19, 2025

This work is licensed under a Attribution-NonCommercial 4.0 International license.

Venmo	Paypal

PREVIOUSB+ Trees vs B-Trees: How Database Indexing Really Works

NEXTDatabase Sharding and Consistent Hashing Explained