Inspired by the brilliant 3Blue1Brown video on generating functions, this post explores an elegant Olympiad-level counting problem.
The Problem
Given two positive integers $n$ and $k$ (where $2 \le k \le n$), answer this:
How many subsets of $\lbrace 1, 2, 3, \ldots, n \rbrace$ have a sum divisible by $k$?
Let’s say $n = 2000$ and $k = 5$. You need to count how many subsets have sums divisible by 5.
Your first instinct: iterate through all subsets, sum each one, check divisibility.
The problem: There are $2^{2000}$ subsets.
To put this in perspective:
- The number of atoms in the observable universe: $\approx 10^{80}$
- The value of $2^{2000}$: $\approx 10^{602}$
Even if you could check one trillion subsets per second, you’d need more time than the age of the universe—multiplied by itself, many times over.
Brute force is not just slow. It’s impossible.
The Total Population
Before we can count a subset, we need to understand the full landscape.
For a set with $n$ elements, each element has exactly 2 choices:
- It’s in the subset
- It’s out of the subset
Think of $n$ light switches, each with 2 positions:
\[\text{Total subsets} = \underbrace{2 \times 2 \times \cdots \times 2}_{n \text{ times}} = 2^n\]For $n = 2000$, we have $2^{2000}$ total subsets. This is our starting point.
The Remainder Buckets
When we divide any subset’s sum by $k$, we get a remainder from $\lbrace 0, 1, 2, \ldots, k-1 \rbrace$.
We want subsets in the “remainder 0” bucket—those divisible by $k$.
Naive guess: Each of the $k$ buckets gets an equal share, so:
\[\text{Count} \stackrel{?}{=} \frac{2^n}{k}\]But this is wrong. The distribution isn’t perfectly uniform.
Why? Because the numbers $1, 2, 3, \ldots, n$ have structure. They cycle through remainders in a predictable pattern. This creates subtle imbalances.
The Correction Factor
The exact formula accounts for this imbalance:
\[\boxed{\text{Count} = \frac{2^n + (k-1) \cdot 2^{n/k}}{k}}\]Let’s decode each component:
| Symbol | Meaning | Role |
|---|---|---|
| $2^n$ | Total subsets | The full population |
| $2^{n/k}$ | Correction factor | Captures periodicity |
| $(k-1)$ | Symmetry multiplier | Balances non-zero remainders |
| $\div k$ | Bucket selector | Extracts one remainder class |
The key insight: $2^{n/k}$ appears because the remainders cycle every $k$ numbers. In the range $1$ to $n$, we have $\frac{n}{k}$ complete cycles. Each cycle contributes a multiplicative factor to the correction term.
Imagine a clock with $k$ hours. As we add numbers $1, 2, 3, \ldots, n$, their remainders (mod $k$) tick around this clock.
- Every $k$ consecutive numbers complete one full rotation
- In the range $1$ to $n$, we have $\frac{n}{k}$ complete rotations
- Each rotation contributes a factor of 2 to the correction term
- Thus: $2^{n/k}$
Example: For $n = 15$ and $k = 3$:
- Numbers $1, 2, 3$ → remainders $1, 2, 0$ (one cycle)
- Numbers $4, 5, 6$ → remainders $1, 2, 0$ (two cycles)
- Numbers $7, 8, 9$ → remainders $1, 2, 0$ (three cycles)
- And so on…
- Total cycles: $\frac{15}{3} = 5$
- Correction factor: $2^5 = 32$
See the Math in Motion
Before we write any code, let’s validate the formula interactively. Adjust $n$ and $k$ below and watch how the correction factor and target count respond in real-time.
Subset Divisibility Calculator
Live Metrics:
Total Subsets (2n): 1024
Correction Factor (2n/k): 8
Target Subsets: 342
Formula: (2n + (k-1)·2n/k) / k
Remainder Distribution
Growth Visualization (n vs Subsets)
Formula Breakdown:
What You’re Seeing
Left graph: The distribution of subsets across remainder buckets. Notice that remainder 0 (red bar) is slightly larger than the others—this is the correction factor at work.
Right graph: Exponential growth on a logarithmic scale. The green line shows total subsets ($2^n$), and the red line shows divisible subsets. They grow in parallel, but the gap between them is determined by $k$.
Key observation: As you increase $k$, the correction factor shrinks (because $n/k$ decreases), and the distribution becomes more uniform.
Approach A: Mathematical Formula (Fast, Large $n$)
When $n$ is huge (like $n = 2000$), we use the closed-form formula with modular arithmetic.
Time complexity: $O(\log n)$
Space complexity: $O(1)$
Best for: Competitive programming, large $n$, single queries
#include <iostream>
using namespace std;
// Binary exponentiation with modular arithmetic
// Computes (base^exp) % mod in O(log exp) time
long long power(long long base, long long exp, long long mod) {
long long result = 1;
base %= mod;
while (exp > 0) {
if (exp & 1) {
result = (__int128)result * base % mod;
}
base = (__int128)base * base % mod;
exp >>= 1;
}
return result;
}
// Modular inverse using Fermat's Little Theorem
// For prime mod: a^(-1) ≡ a^(mod-2) (mod mod)
long long modInverse(long long n, long long mod) {
return power(n, mod - 2, mod);
}
int main() {
long long n = 2000; // Size of set
long long k = 5; // Divisor
long long mod = 1e9 + 7; // Result modulo this prime
// Formula: (2^n + (k-1) * 2^(n/k)) / k
long long total_subsets = power(2, n, mod);
long long correction = (k - 1) * power(2, n / k, mod) % mod;
long long numerator = (total_subsets + correction) % mod;
long long result = (numerator * modInverse(k, mod)) % mod;
cout << "Subsets divisible by " << k << ": " << result << endl;
return 0;
}
Approach B: Dynamic Programming (Flexible, Small $n$)
When $n$ is moderate (say, $n \le 10^4$) and you need to track the process or handle variations, use DP.
Time complexity: $O(n \cdot k)$
Space complexity: $O(n \cdot k)$ (can be optimized to $O(k)$)
Best for: Educational purposes, debugging, extensions (e.g., “count subsets with sum exactly $m$”)
#include <iostream>
#include <vector>
using namespace std;
int main() {
int n = 20; // Size of set
int k = 5; // Divisor
long long mod = 1e9 + 7;
// dp[i][r] = number of subsets using first i elements with sum ≡ r (mod k)
vector<vector<long long>> dp(n + 1, vector<long long>(k, 0));
dp[0][0] = 1; // Empty subset has sum 0
for (int i = 1; i <= n; i++) {
for (int r = 0; r < k; r++) {
// Option 1: Don't include element i
dp[i][r] = dp[i-1][r];
// Option 2: Include element i (value = i)
int prev_remainder = (r - i % k + k) % k;
dp[i][r] = (dp[i][r] + dp[i-1][prev_remainder]) % mod;
}
}
cout << "Subsets divisible by " << k << ": " << dp[n][0] << endl;
// Bonus: Print distribution across all remainders
cout << "\nDistribution by remainder:\n";
for (int r = 0; r < k; r++) {
cout << "Remainder " << r << ": " << dp[n][r] << endl;
}
return 0;
}
Space optimization: Since dp[i] only depends on dp[i-1], we can use two 1D arrays instead of a 2D array, reducing space to $O(k)$.
Where This Fits in the Real World
The mathematics we’ve explored—generating functions, roots of unity, modular arithmetic—powers technologies you use every day.
1. Digital Signal Processing (DSP)
The “roots of unity filter” in our formula is the foundation of the Discrete Fourier Transform (DFT).
When your phone processes audio or decodes a Wi-Fi signal, it breaks complex waves into discrete frequencies using these mathematical filters—just like we filtered subsets by their remainder.
Connection: Both problems involve partitioning a large space (subsets or signals) into equivalence classes (remainders or frequencies) using periodic structure.
2. Cryptography & Error Correction
Reed-Solomon codes (used in QR codes, CDs, and satellite communication) rely on polynomial mathematics and modular arithmetic.
If data gets corrupted during transmission, the system checks if certain sums match expected divisibility properties. A mismatch reveals which bits were flipped.
Connection: The DP approach is similar to how error-correcting codes track “syndrome” values—remainders that indicate corruption.
3. Load Balancing in Distributed Systems
When distributing tasks across $k$ servers, engineers use modular arithmetic to ensure even distribution.
The same combinatorial formulas help predict whether one server (or “remainder bucket”) will be overloaded compared to others.
Connection: Our formula quantifies imbalance. If the correction factor is large, the distribution is uneven; if small, it’s nearly uniform.
4. Combinatorial Chemistry
Scientists use generating functions to count molecular structures (isomers).
If they need molecules with specific symmetry or bonding patterns (like a sum divisible by a certain value), these formulas predict how many such structures exist before lab work begins.
Connection: The DP table is a generating function in disguise. Each entry encodes a polynomial coefficient representing “ways to achieve this state.”
Resources & Further Reading
- 3Blue1Brown: Generating Functions - The video that inspired this post
- Codeforces: Subset Sum Problems - Competitive programming perspective
- Generating Functions on Wikipedia - Mathematical background
- Modular Arithmetic Tutorial - Khan Academy
- Dynamic Programming Patterns - LeetCode guide