Why PostgreSQL Ignores Your Index Sometimes

What if the index you’ve been bragging about in standup is actually making things worse—or at least, pointless?

That’s the gut punch plenty of devs feel when PostgreSQL ignores your index, opting instead for a full sequential scan that feels like betrayal. But here’s the thing: it’s not ignoring you out of spite. It’s doing math you didn’t even know was happening under the hood.

Why Does PostgreSQL Ignore My Index?

Look, you fire up EXPLAIN ANALYZE on that SELECT * FROM orders WHERE user_id = 42, expecting to see your shiny idx_orders_user_id light up the plan. Instead? Seq Scan. The whole table slurped up like a milkshake. Why?

Postgres isn’t some dogmatic rule-follower chanting “index if exists.” No. It’s a ruthless cost accountant, tallying every I/O operation, every page fetch, every random jump across your disk.

Few rows match? Index wins—quick hops to the index leaves, then pinpoint table lookups. Boom, sub-millisecond glory.

But crank the selectivity loose—say, 80% of your orders are ‘active’—and suddenly those hops turn into a drunken stagger. Random I/O murders performance on spinning rust (or even SSDs, if you’re not careful). Sequential scan? One smooth glide through the heap. Cheaper. Always.

PostgreSQL doesn’t care about your index. It cares about something else entirely.

That’s the raw truth from the trenches. Postgres estimates rows via statistics (those ANALYZE updates you keep forgetting), plugs ‘em into a cost model: seq_page_cost for linear reads (usually 1.0), random_page_cost for jumps (default 4.0 on HDDs, tweakable to 1.1-1.5 on NVMe bliss). Add cpu_tuple_cost, and it spits out the winning plan.

Short para for emphasis: Indexes aren’t magic. They’re trades.

The Warehouse Heist That Explains It All

Picture this: massive warehouse, orders stacked floor to ceiling. You need five specific boxes for user 42. Do you ping-pong via forklift to exact coordinates (index scan)? Hell yes—faster than combing every aisle.

Now, snag every ‘active’ box, and 80% qualify. Forklift frenzy? You’re burning fuel idling, repositioning, braking. Nah—just walk the grid, steady pace, grab as you go. That’s Postgres: minimum total work.

But dig deeper—why this mental model sticks. Back in the ’90s, when Postgres forked from Ingres, disks were slower than molasses. Random access killed. Sequential? King. Fast-forward (sorry, can’t say that), SSDs flip the script somewhat, but Postgres’ planner lags a tad—random_page_cost stays high by default because heaps aren’t always contiguous, fragmentation bites, and WAL writes scatter pages anyway.

Here’s my unique twist, one the original post misses: this isn’t just Postgres quirkiness. It’s a ghost of mechanical HDD architecture haunting the NVMe era. Oracle, MySQL—they all play the same game, but Postgres’ transparency (that EXPLAIN gold) lets you audit the books. Bold call: as ZFS and bcachefs mature, expect community patches slashing random costs further. Postgres row-storage could dominate analytical workloads again, if columnar upstarts like ClickHouse don’t eat its lunch first.

And yeah, corporate hype alert—those “index everything” tutorials from cloud vendors? Pure spin. They want you storage-bloating for their upsell.

Cracking the Cost Black Box

So, how’s it calculate? Planner starts with row estimates from pg_statistic histograms—run ANALYZE religiously, or you’re flying blind.

Selectivity: fraction of rows matching your WHERE. Tight (0.01%)? Index. Loose (50%)? Seq.

Costs stack: index scans hit index pages (seq-ish), then random table fetches per qualifying tuple. Bitmap indexes sweeten it for multiples, but singles? Pure random pain.

Tweakable knobs—seq_page_cost=1.0, random_page_cost=1.2 (SSD), effective_cache_size=your RAM ballpark. But don’t force it with /+ IndexScan /—that’s cheating the optimizer, scales poorly.

Real talk: at small scale (10k rows), everything flies. Hit millions? Wrong plan cascades—cache misses, lock contention, your app crawls.

When Indexes Actually Pay Off — And When to Ditch

Multi-column? Composite indexes, but only leading columns matter (unless funky functions). Partial indexes for status=’active’ if rare? Gold.

Covering indexes (INCLUDE cols) slash table hops. BRIN for time-series bloat. Gin for JSON tsquery.

But test. Always EXPLAIN ANALYZE in prod-like load. pgBadger your logs.

Postgres thinks like a lazy genius: cheapest path, period. You should too.

🧬 Related Insights

Read more: 3-Hour Webinar Vows Full Restaurant App: Demo or Disaster?
Read more: Why Canvas API Still Wins for Snappy Image Rotation Tools

Frequently Asked Questions

Why does PostgreSQL choose sequential scan over index?

Postgres picks seq scan when most rows match (low selectivity)—random I/O from index jumps costs more than one linear pass.

How to make PostgreSQL use my index?

Improve selectivity with better WHERE, run ANALYZE for stats, tune random_page_cost for SSDs. Avoid forcing unless desperate.

What is index selectivity in PostgreSQL?

Fraction of rows your condition hits. Under ~5-10%? Index shines. Above? Seq scan often cheaper.

Why PostgreSQL Ignores Your Index Sometimes

Key Takeaways

Why Does PostgreSQL Ignore My Index?

The Warehouse Heist That Explains It All

Cracking the Cost Black Box

When Indexes Actually Pay Off — And When to Ditch

🧬 Related Insights

Frequently asked questions

Worth sharing?

⚡ Key Takeaways

Why Does PostgreSQL Ignore My Index?

The Warehouse Heist That Explains It All

Cracking the Cost Black Box

When Indexes Actually Pay Off — And When to Ditch

🧬 Related Insights

Frequently asked questions

Share this article

Worth sharing?

Related Stories

PostgreSQL: Rethinking Query Optimization

pgAudit Attribution Gap Leaves GDPR Compliance in Doubt [Analysis]

Database Collapse: Cache Expiry's 8,000 Req/Sec Catastrophe

MySQL to AWS RDS Live: GoldenGate 19c Migration Unveiled [Data Flow]

Stay in the loop

Key Takeaways