On Tue, Jul 02, 2013 at 09:44:10AM +0200, Ondrej Sury wrote:
> Florian Weimer has correctly pointed out that Oracle has decided to change
> the BDB 6.0 license to AGPLv3 (
> we (as the Debian project) need to take a decision.
(because the AGPLv3 is incompatible with GPLv2-only, and other licenses,
and there is code under these licenses which depends on BDB. There is
also code which depends on BDB that is compatible with the AGPLv3
legally, but whose deployment would then be restricted in ways the users
would not be expecting.)
> As far as I am aware the most prominent users of Berkeley DB are
> moving away from the library anyway ...
There are actually very few true alternatives to BDB. While there are
many KVP (key value pair) stores, not many are transactional, allow
multi-version concurrency control (MVCC), and support multi-threaded and
multi-process access. BDB is all of the above, and in addition the BDB
API has become very widely used over nearly 30 years. And of course the
BSD license allowed BDB to be embedded in a huge amount of software -
like the BSD networking stack, it turns up just about everywhere.
So there are three things to think about in a replacement:
1. Is the licensing as suitable as the BSD license has been, and is
the primary maintainer likely to do what Oracle just did to BDB?
2. Are the features at least as good as BDB, and is the API close enough
to make replacement reasonably easy?
3. BDB is very old code. When replacing it can we also adopt modern
approaches more suited to modern hardware and use cases?
I've looked at all of the KVP options I am aware of and consulted people
who specialise in the space, and I can see only one that fits all three
criteria well: MDB from the OpenLDAP project, http://symas.com/mdb/ .
As to point 1, the rights holders of MDB need to make a public
statement, but I have asked them in private and in any case the OpenLDAP
history speaks for itself.
As to point 2, MDB provides a superset of the KVP-specific features of
BDB, and the API is similar to BDB but somewhat simpler.
As to point 3, MDB is a from-scratch implementation in the modern world.
The MDB object code is a tiny fraction of the size of BDB's, and by
adopting a memory-mapped architecture and dispensing with caching and
locking it
relies on modern operating system features rather than trying to
implement them internally. This means greatly increased performance and
very much smaller windows for corruption to occur. MDB has very clean
support for concurrency compared to BDB, which makes it much more
suitable for modern applications.
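
To make the memory-mapped idea concrete, here is a minimal sketch in
Python using only the standard library. It is not MDB's on-disk format
or API; the fixed-width record layout and the names RECORD,
write_records, lookup and demo are hypothetical, chosen only for
illustration. The point it shows is that the reader keeps no cache of
its own: it maps the file read-only and lets the operating system's
page cache do the work.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-width layout: 16-byte key, 16-byte value, NUL padded.
# This is an illustration only, not the format used by MDB or BDB.
RECORD = struct.Struct("<16s16s")


def write_records(path, pairs):
    """Write (key, value) string pairs to path as fixed-width records."""
    with open(path, "wb") as f:
        for key, value in pairs:
            f.write(RECORD.pack(key.encode(), value.encode()))


def lookup(path, key):
    """Linear scan over a read-only mapping; return the value or None."""
    with open(path, "rb") as f:
        # The mapping exposes file pages directly; there is no
        # application-level buffer pool between us and the page cache.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            for offset in range(0, len(view), RECORD.size):
                k, v = RECORD.unpack_from(view, offset)
                if k.rstrip(b"\0").decode() == key:
                    return v.rstrip(b"\0").decode()
    return None


def demo():
    """Build a two-record file in a temp directory and query it."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "store.bin")
        write_records(path, [("postfix", "mail"), ("openldap", "directory")])
        return lookup(path, "openldap"), lookup(path, "missing")
```

A real store would use an index rather than a linear scan; the sketch
only illustrates where the caching responsibility sits.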
There is a technical discussion of MDB here:
http://symas.com/mdb/20120829-LinuxCon-MDB-txt.pdf . Some performance
data has been published. One simple test with minimal scope for bias is
to compare the SQLite3 API that comes with Oracle BDB with SQLite3
ported to MDB. The other obvious speed test (for which reproducible
data has been published) is with OpenLDAP, which has pluggable back
ends including both BDB and MDB.
I'll be delighted if someone can suggest something that is even more
suitable than MDB, but so far I haven't seen it.
Looking at the Debian archive, packages with BDB dependencies are as
follows (MDB integration has been marked where it exists, currently in
about 10% of the packages):
cfengine3     LMDB support published
cyrus-sasl2   LMDB support published
heimdal       LMDB supported
memcachedb    LMDB support published
opendkim      LMDB supported
openldap      LMDB supported
postfix       LMDB support published
python2.7     LMDB supported
python3.2     LMDB supported
python3.3     LMDB supported
ruby-bdb      LMDB supported
KVP: A key value pair store is an unordered list of paired items, with
an index. See http://en.wikipedia.org/wiki/Attribute-value_pair . On top
of this view you can layer tables (e.g. SQL RDBMS), special-purpose
trees (e.g. LDAP, DNS) or other models.
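
As a rough illustration of layering a table on top of a key value
store, the sketch below uses a plain Python dict as a stand-in for the
store. The key scheme "table/row_id/column" and the helper names
row_key, put_row and get_row are assumptions made up for this example;
they are not how any particular SQL engine or LDAP server maps its data
onto BDB or MDB.

```python
def row_key(table, row_id, column):
    """Compose a flat key from table name, row id and column name."""
    return f"{table}/{row_id}/{column}"


def put_row(store, table, row_id, row):
    """Store each column of a row as its own key value pair."""
    for column, value in row.items():
        store[row_key(table, row_id, column)] = value


def get_row(store, table, row_id):
    """Rebuild a row by collecting every key that shares its prefix."""
    prefix = f"{table}/{row_id}/"
    return {
        key[len(prefix):]: value
        for key, value in store.items()
        if key.startswith(prefix)
    }


# Usage: a dict plays the role of the key value store.
store = {}
put_row(store, "packages", 1, {"name": "postfix", "backend": "bdb"})
put_row(store, "packages", 10, {"name": "openldap", "backend": "mdb"})
```

With an ordered store the prefix lookup would become a range scan
instead of a full pass over the keys.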
MVCC (multi-version concurrency control) is about avoiding locks by
giving readers a consistent view of the data store at a given point in
time. BDB implemented MVCC using
locks, although that partly defeats the purpose. See
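
The "consistent view at a point in time" idea in the MVCC note can be
sketched in a few lines of Python. This is a toy, single-writer,
in-memory model that copies the whole data set on every commit; the
class name SnapshotStore and its methods are hypothetical, and it is
not a description of how BDB or MDB implement MVCC. It only shows that
a reader holding an older version is unaffected by later commits and
takes no lock to get there.

```python
from types import MappingProxyType


class SnapshotStore:
    """Toy multi-version store: each commit creates a new version."""

    def __init__(self):
        # Committed versions, oldest first. Version 0 is empty.
        self._versions = [{}]

    def begin_read(self):
        """Return a read-only view of the latest committed version."""
        return MappingProxyType(self._versions[-1])

    def commit(self, changes):
        """Copy the latest version, apply changes, publish the copy."""
        new_version = dict(self._versions[-1])
        new_version.update(changes)
        self._versions.append(new_version)


# Usage: a reader's snapshot does not move when a writer commits.
db = SnapshotStore()
db.commit({"owner": "sleepycat"})
old_view = db.begin_read()
db.commit({"owner": "oracle"})
new_view = db.begin_read()
```

Here old_view still reports the first value while new_view reports the
second; a production store would share unchanged data between versions
rather than copying everything.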
----- End forwarded message -----