Why SQL Pagination With Offset is Slow and How to Optimize It
Introduction
Pain Point: Your web app becomes sluggish when users request page 1000 of results. The database groans under the weight of OFFSET 10000
, delivering results slower than a dial-up connection.
This lecture explains why OFFSET
-based pagination is inefficient for large datasets, demonstrates its performance pitfalls, and introduces keyset pagination as a faster alternative.
Key Revelation: OFFSET
forces databases to process all skipped rows, making it fundamentally unscalable. There’s a better way.
Core Concepts
How OFFSET
Works
- Definition:
OFFSET X
instructs the database to fetch and discard the firstX
rows before returning results. - Example:
SELECT * FROM news ORDER BY id DESC OFFSET 1000 LIMIT 10
forces the database to:- Read 1010 rows from disk/index.
- Discard the first 1000.
- Return the remaining 10.
Why OFFSET
Fails at Scale
- I/O Overhead: Fetching 100,010 rows to return 10 wastes disk I/O and CPU cycles.
- Inconsistent Results: Newly inserted rows shift offsets, causing duplicates (e.g., page 11 shows rows from page 10).
- Lock Contention: Large offsets may trigger full table scans, blocking writes.
Key Characteristics of Efficient Pagination
Keyset (Cursor-Based) Pagination
- Principle: Use a unique, indexed column (e.g.,
id
) to “remember” the last seen record. -
Implementation:
-- First page SELECT title, id FROM news ORDER BY id DESC LIMIT 10; -- Next page (using last seen id=12345) SELECT title, id FROM news WHERE id < 12345 ORDER BY id DESC LIMIT 10;
- Benefits:
- Index Scan Only: Uses the index to skip rows instantly, avoiding full data scans.
- Consistency: Immune to mid-pagination data changes.
- Speed: Processes only
LIMIT
rows, notOFFSET + LIMIT
.
Practical Implementation
Example: News Article Pagination
Problem: Retrieve page 1000 of news articles sorted by recency.
Bad Approach (Offset):
SELECT title FROM news ORDER BY id DESC OFFSET 10000 LIMIT 10;
- Execution Time: 620ms (for 1M rows).
- Reason: Reads 10,010 rows but returns 10.
Optimized Approach (Keyset):
SELECT title, id FROM news
WHERE id < (last_seen_id)
ORDER BY id DESC LIMIT 10;
- Execution Time: 0.2ms.
- Reason: Uses
id
index to jump directly to the desired rows.
Conclusion
- Avoid
OFFSET
for Large Datasets: It’s a performance time bomb. - Use Keyset Pagination: Leverage indexed columns for instant navigation.
- Trade-off: Requires client-side tracking of the last seen record (e.g., via API tokens).
Lecture Highlight:
“OFFSET
is like reading a 1000-page book to find page 10. Keyset pagination is flipping directly to it.”
Further Reading: Explore Use The Index, Luke for advanced database optimization techniques.