LSM Tree
Mutable and Immutable Data Structures
Immutable data structures have a significant advantage: compressibility, and updates do not require reserving additional space (no in-place updates, as updating larger data in place would trigger reallocation requiring defragmentation).
Using sequential writes with append-only operations is the key to maintaining data structure immutability.
Different databases have different implementation approaches for reading and writing immutable data structures. Some databases mark old data as expired or set deletion flags, but triggering cleanup may significantly reduce read and write performance.
When reading and writing immutable files, there’s no need to add segment locks for protection, greatly simplifying concurrent access locks. As sequentially written file sizes continue to grow, file groups need to be merged and rewritten, with dedicated processes cleaning up data to prevent performance degradation.
LSM is a mutable structure, while common B-trees and their variants are immutable structures.
Log Structure Merge Tree
Optimized for sequential disk reads, nodes can be fully occupied by data. Allows immutable, mergeable files. Just a concept, not a specific implementation.
Sorted String Table
SSTable is an ordered immutable dictionary specifically optimized for sequential read and write operations. In-place modifications would cause file rewrites.