Question 6
Normalized vs denormalized
In simple terms, what does normalized mean for a database? What does denormalized mean? Give a couple of real-world examples.
Answer outline
Normalized means you store each fact in one logical place and link related rows with IDs. You avoid copying the same data onto many rows, so when something changes you usually update it once.
Denormalized means you duplicate or pre-aggregate data on purpose, often to make reads faster or queries simpler, knowing you must keep copies in sync or accept stale cached values.
Principles
- Normalized example:
Userhasid+displayName;PosthasauthorIdpointing to that user. The name lives on one user row, not repeated on every post. - Denormalized example: a
Postrow storescommentCountso the feed can show counts without runningCOUNT(*)on comments for every row—faster, but you must increment/decrement that field when comments change (or recompute periodically). - Denormalized example: caching
authorNameon eachPostfor a read-heavy feed avoids a join per row, but if the user renames themselves you must update many posts or tolerate out-of-date names until refresh.
Sketch — normalized vs duplicated
// Normalized: one source for the name
User(id: "u1", name: "Alex")
Post(id: "p1", authorId: "u1", text: "Hello")
// Denormalized for read speed: name copied onto post
Post(id: "p1", authorId: "u1", authorName: "Alex", text: "Hello")
Follow-up angles
- Teams often start normalized and denormalize only where profiling shows the need.