DDIA Reading Notes - Chapter 3

Storage and Retrieval


Chapter Overview

This chapter covers the fundamental operations of a database: how it stores data and how it retrieves it. It divides storage engines into two main groups:

I. Online Transaction Processing (OLTP) Databases

  • Optimized for user-facing applications with high volume of requests

  • Typical access patterns involve reading/writing small numbers of records using indexes

  • Disk seek time is often the performance bottleneck

II. Online Analytical Processing (OLAP) Databases

  • Handle lower query volumes but each query is very demanding

  • Access patterns involve sequentially scanning millions of records

  • Disk bandwidth, not seek time, is the bottleneck
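
The two access patterns above can be contrasted with a minimal sketch. It is a toy in-memory model, not how any real storage engine is implemented, and the table `orders`, the index `index_by_id`, and both functions are hypothetical names chosen for illustration.

```python
# Hypothetical in-memory "table" of order records (illustrative data only).
orders = [
    {"order_id": i, "customer": f"c{i % 100}", "amount": float(i % 50)}
    for i in range(10_000)
]

# OLTP-style access: build an index once, then fetch single records by key.
index_by_id = {row["order_id"]: pos for pos, row in enumerate(orders)}


def get_order(order_id):
    """Point lookup: one index probe, then one record read."""
    return orders[index_by_id[order_id]]


def total_revenue():
    """OLAP-style access: scan every record and aggregate one field."""
    return sum(row["amount"] for row in orders)


if __name__ == "__main__":
    print(get_order(42))
    print(total_revenue())
```

The lookup touches a single record no matter how large the table is, while the aggregate has to read all of them. On disk, that is roughly why the first workload is sensitive to seek time and the second to bandwidth.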

The chapter aims to provide application developers with a solid understanding of storage engine internals to guide database selection, tuning, and comprehension of database documentation.

Visual

OLAP vs OLTP

| Aspect         | OLAP                                  | OLTP                          |
|----------------|---------------------------------------|-------------------------------|
| Purpose        | Complex analytics on historical data  | Transaction processing        |
| Workload       | Fewer complex queries                 | High volume of simple queries |
| Access Pattern | Sequential scans                      | Random access via indexes     |
| Data Volume    | Very large (TB/PB)                    | Smaller (GB/TB)               |
| Data Model     | Denormalized (star/snowflake)         | Normalized (relational)       |
| Storage        | Column-oriented                       | Row-oriented                  |
| Indexing       | Less relevant                         | Crucial for performance       |
| Bottleneck     | Disk bandwidth                        | Disk seek time                |
| Updates        | Periodic batch loads                  | Real-time updates             |
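
The Storage row of the table can be illustrated with a small sketch, assuming a made-up sales table. The field names and the helper `to_columns` are hypothetical. Real column stores also add compression and on-disk layouts that are not modeled here.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"date": "2024-01-01", "product": "apple", "quantity": 3, "price": 0.5},
    {"date": "2024-01-01", "product": "pear", "quantity": 1, "price": 0.8},
    {"date": "2024-01-02", "product": "apple", "quantity": 5, "price": 0.5},
]


def to_columns(records):
    """Pivot row-oriented records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


columns = to_columns(rows)

# Row-oriented aggregate: every whole record is visited to read one field.
total_from_rows = sum(record["quantity"] for record in rows)

# Column-oriented aggregate: only the "quantity" column is read.
total_from_columns = sum(columns["quantity"])

if __name__ == "__main__":
    print(total_from_rows, total_from_columns)
```

Both totals are the same. The difference is how much data has to be read to compute them, which matters when an analytic query uses only a few columns of a very wide table.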

Summary

In short, Chapter 3 explains how storage engines store and retrieve data and how OLTP and OLAP workloads differ, so that developers can match a database to their application's access patterns.