BerkeleyDB started as out as a replacement for ndbm/dbm & hsearch.
It is highly modular and its components can be reused in isolation by other applications.
Access methods provide interface to data via
B-Tree is right choice in most case because it gives Locality of Reference for keys where Hash does not. However if memory size is limited and index cannot be loaded into memory, Hash might be better strategy.
Hash algorithm is based on Litwin's Extensible Liner Hashing research.
Queue provides Record level locking at the expense of fixed-length values.
Recno is variable length objects but cna only give page-level locking.
By using B+ Trees & on-disk format as in-memory format:
Modeling a system based on Cache allows us to think in terms of "pinning pages". This also helps with having same format for in-memory & on-disk format. [:see ~2]
To enable multi-version concurrency ie., Concurrent access to data/database, distinguishing between read & write data access is important.
API provides two types of access methods
getto ensure page exists in cache and to pin it.
putto unpin & release it for eviction.
Transaction Mechanism is used to make recovery after failure possible. One such example is Write-Ahead Logging.
Lock Manager needs to provide three functions
Simple way to define a lock object if working at page level is to create an object with fields
Some examples of "Type"
Database is a file. Each file is given a unique id at creation time. This id is stored as part of DB metadata within the file.
This ensure that file rename or moving around will not cause issues with identifying the database. This is useful for ACID related stuff.
Database has many pages. A page has an associated LSN and set of records.
In case of a page, recovering from partial edit (add, delete & modify) involves writing complete page with all the records.
In case of records, recovering a page would involve calculating state across multiple records with different edit phases.
Log is a record of changes that are sequentially applied to DB over time. The sequence is determined by Log Sequence Number (LSN).
Log Manager takes care of writing and retrieving logs from logfile.
If we use append-only logs then LSN can be as simple as
(<file id>, <offset within file>)
Log Manager will treat each edit as an opaque byte string. Then it is upto the supervising system to interpret the data within the byte string.
This makes it easier for different applications to reuse the Log Manager.
Every log record has Log Record Header prepended to it. It contains:
We can use a Log Manager-Log File to implement a failure recovery system.
If we work at page level and want to maintain logs for transaction updates, our log record would contain complete page data.
It's metadata/header will include