Log Structured Merge Tree

Log-Structured Merge Trees

Ankit Verma
Latest posts by Ankit Verma (see all)

What is a database? Anything that does two things. When you give it some data it should store the data, and when you ask it again, it should give the data back to you. Now let’s talk about the anatomy of databases or data structures that powers the database. Internally every database uses any kind of file system or data structure.

Concept of LSM and SST Tables 

Lets first see how the writes are handled in LSM and SST tables 

WAL (Write ahead Log)

          WAL is used for state management so that if any durability issue arises then the previous state can be obtained using WAL logs 

See also  All About C# Immutable Classes

Memtable :

        This will contain an SSTable record of data. The only thing to remember in this is that SSTable records are immutable.

SST Tables 

       They will contain appended stream of record which are appended based on the timestamp. These streams of records are sorted based on the primary key.

Compaction 

Compact the records based on timestamp and latest values

Now lets us see how read had been handled in LSM

Bloom filter

      Bloom filter is a concept that gives the predictive cache if read value exists in SSTables or not 

Examples of Database that uses this kind of implementation are MongoDB and Cassandra

References 

  1. https://www.youtube.com/watch?v=oUNjDHYFES8
  2. https://www.youtube.com/watch?v=gBygn3cVP80
  3. https://www.amazon.com/Designing-Data-Intensive-Applications-Reliable-Maintainable