Database Indexing
Indexes are one of the most important optimizations in databases.
They act like a lookup guide (similar to a book’s index) that helps you find rows quickly without scanning the entire table.
In this article, we’ll explore how indexes work, different types, their trade-offs, and common pitfalls.
1. Why Indexing Matters
Without an index, the database must perform a full table scan:
- Every row is checked to find the required data.
- Time complexity ≈ O(n).
With an index:
- DB navigates directly to relevant rows (e.g., via B-tree).
- Time complexity ≈ O(log n).
👉 This is why proper indexing can reduce query time from seconds to milliseconds.
2. How Indexes Work (Analogy)
- Imagine searching for “Physics” in a 500-page textbook.
- Without an index: you flip through every page (full scan).
- With an index: you go to the index section, find “Physics → Page 312,” and jump directly.
That’s what a database index does for your queries.
3. Types of Indexes
3.1 B-Tree Index
- The most common indexing method.
- Balanced tree structure, keeps data sorted.
- Efficient for equality (
=
) and range queries (BETWEEN
,<
,>
). - Example: PostgreSQL default index.
Good for:
- Numeric ranges, sorted queries, primary key lookups.
3.2 Hash Index
- Stores a hash of the indexed column.
- O(1) lookups for equality queries (
=
). - Not good for range queries (no ordering).
- Example: Memory-optimized tables in PostgreSQL, Redis uses hash structures internally.
Good for:
- Exact matches, e.g.,
user_id = 123
.
3.3 Bitmap Index
- Uses bit arrays to represent values.
- Very efficient for columns with low cardinality (few distinct values).
- Example: Gender (M/F), Boolean flags.
Good for:
- Data warehouses, OLAP queries.
3.4 Full-Text Index
- Optimized for text search.
- Breaks text into tokens and indexes them.
- Supports ranking, relevance, partial matches.
- Examples: MySQL
FULLTEXT
, PostgreSQLGIN/GIST
, Elasticsearch.
Good for:
- Search bars, product catalogs, document search.
3.5 Spatial Index
- Specialized indexes for geometric data (maps, locations).
- Example: R-tree.
- Used in GIS systems, Uber’s geospatial queries.
4. Primary vs Secondary Indexes
Primary Index:
- Built on the table’s primary key.
- Data is stored in sorted order of the primary key.
Secondary Index:
- Built on non-primary key columns.
- Points back to the row (via primary key).
Example (Users Table):
- Primary Key:
user_id
- Secondary Index:
email
Query: SELECT * FROM users WHERE email='x@example.com';
→ Uses secondary index to jump directly to the right row.
5. Trade-Offs of Indexing
Indexes speed up reads, but at a cost:
Slower writes
EveryINSERT
,UPDATE
,DELETE
must also update indexes.Extra storage
Indexes consume disk space (can be large for many columns).Maintenance overhead
Fragmentation may require reindexing.
👉 Rule of Thumb: Index columns that are frequently queried, not everything.
6. Common Pitfalls
Too Many Indexes
- Slows down writes.
- Use indexes strategically, not blindly.
Unused Indexes
- Wastes disk + memory.
- Regularly monitor with tools like
pg_stat_user_indexes
in PostgreSQL.
Stale Indexes
- Over time, deleted/updated rows can bloat indexes.
- Requires reindexing or VACUUM (Postgres).
Wrong Index Type
- Hash index on range queries = bad performance.
- B-tree on text search = suboptimal.
Composite Index Misuse
- If you create
(col1, col2)
index, queries must usecol1
for it to be effective.
- If you create
7. Interview Tip
When asked “How do you optimize slow queries?”:
- Check indexing first.
- Mention trade-offs (faster reads vs slower writes).
- Differentiate index types.
- Bonus: Talk about stale indexes and maintenance.
👉 Example Answer:
“I’d start by adding a B-Tree index on the column used in WHERE clause. If queries are equality-based on a small column (like status), a bitmap index might work better. I’d also check for unused or stale indexes, since too many indexes can slow down writes.”
8. Recap
- Indexes = shortcuts to avoid full scans.
- Types: B-Tree, Hash, Bitmap, Full-Text, Spatial.
- Primary vs Secondary indexes.
- Trade-offs: Faster reads, slower writes, extra storage.
- Pitfalls: Too many, unused, stale, or wrong indexes.
Next Steps
👉 Continue with Sharding vs Replication to learn scaling strategies beyond indexing.