libmdbx  0.11.6.39 (2022-04-13T11:05:50+03:00)
One of the fastest compact embeddable key-value ACID databases without WAL.
Opening & Closing

Typedefs

typedef enum MDBX_env_flags_t MDBX_env_flags_t
 

Enumerations

enum  MDBX_env_flags_t {
  MDBX_ENV_DEFAULTS = 0, MDBX_NOSUBDIR = UINT32_C(0x4000), MDBX_RDONLY = UINT32_C(0x20000), MDBX_EXCLUSIVE = UINT32_C(0x400000),
  MDBX_ACCEDE = UINT32_C(0x40000000), MDBX_WRITEMAP = UINT32_C(0x80000), MDBX_NOTLS = UINT32_C(0x200000), MDBX_NORDAHEAD = UINT32_C(0x800000),
  MDBX_NOMEMINIT = UINT32_C(0x1000000), MDBX_COALESCE = UINT32_C(0x2000000), MDBX_LIFORECLAIM = UINT32_C(0x4000000), MDBX_PAGEPERTURB = UINT32_C(0x8000000),
  MDBX_SYNC_DURABLE = 0, MDBX_NOMETASYNC = UINT32_C(0x40000), MDBX_SAFE_NOSYNC = UINT32_C(0x10000), MDBX_MAPASYNC = MDBX_SAFE_NOSYNC,
  MDBX_UTTERLY_NOSYNC = MDBX_SAFE_NOSYNC | UINT32_C(0x100000)
}
 Environment flags.
 

Functions

LIBMDBX_API int mdbx_env_create (MDBX_env **penv)
 Create an MDBX environment instance.
 
LIBMDBX_API int mdbx_env_open (MDBX_env *env, const char *pathname, MDBX_env_flags_t flags, mdbx_mode_t mode)
 Open an environment instance.
 
LIBMDBX_API int mdbx_env_close_ex (MDBX_env *env, bool dont_sync)
 Close the environment and release the memory map.
 
int mdbx_env_close (MDBX_env *env)
 The shortcut to calling mdbx_env_close_ex() with the dont_sync=false argument.
 

Detailed Description

Typedef Documentation

◆ MDBX_env_flags_t

Enumeration Type Documentation

◆ MDBX_env_flags_t

Environment flags.

See also
mdbx_env_open()
mdbx_env_set_flags()
Enumerator
MDBX_ENV_DEFAULTS 
MDBX_NOSUBDIR 

No environment directory.

By default, MDBX creates its environment in a directory whose pathname is given in path, and creates its data and lock files under that directory. With this option, path is used as-is for the database main data file. The database lock file is the path with "-lck" appended.

  • with MDBX_NOSUBDIR = the filesystem holds a pair of MDBX files whose names are derived from the given pathname by appending predefined suffixes.
  • without MDBX_NOSUBDIR = the filesystem holds an MDBX directory with the given pathname, and within it a pair of MDBX files with predefined names.

This flag takes effect only when mdbx_env_open() creates a new environment; when opening an existing environment, libmdbx chooses the layout automatically.
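
The two layouts can be sketched in C as below. This is only an illustration of the naming rule described above, not code from libmdbx: derive_env_paths() and env_paths are hypothetical names, the "/" separator assumes a POSIX system, and the in-directory file names "mdbx.dat" and "mdbx.lck" are an assumption, since the text above only calls them "predefined names".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper type, not part of libmdbx. */
typedef struct {
  char data[512]; /* main data file */
  char lock[512]; /* lock file */
} env_paths;

/* Derives both file names for a pathname.
 * Returns 0 on success, -1 if a resulting path would not fit. */
static int derive_env_paths(const char *pathname, bool nosubdir,
                            env_paths *out) {
  int n1, n2;
  assert(pathname != NULL && out != NULL);
  if (nosubdir) {
    /* The pathname is used as-is for the data file;
     * the lock file is the same path with "-lck" appended. */
    n1 = snprintf(out->data, sizeof out->data, "%s", pathname);
    n2 = snprintf(out->lock, sizeof out->lock, "%s-lck", pathname);
  } else {
    /* The pathname is a directory holding two files with predefined
     * names. ASSUMPTION: "mdbx.dat" and "mdbx.lck". */
    n1 = snprintf(out->data, sizeof out->data, "%s/mdbx.dat", pathname);
    n2 = snprintf(out->lock, sizeof out->lock, "%s/mdbx.lck", pathname);
  }
  if (n1 < 0 || n2 < 0 || (size_t)n1 >= sizeof out->data ||
      (size_t)n2 >= sizeof out->lock)
    return -1; /* truncated or encoding error */
  return 0;
}
```

Calling derive_env_paths("/tmp/db", true, &p) fills p.data with "/tmp/db" and p.lock with "/tmp/db-lck".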

MDBX_RDONLY 

Read only mode.

Open the environment in read-only mode. No write operations will be allowed. MDBX will still modify the lock file - except on read-only filesystems, where MDBX does not use locks.

  • with MDBX_RDONLY = open the environment in read-only mode. MDBX supports pure read-only mode (i.e. without opening the LCK-file) only when the environment directory and/or both files are not writable (and the LCK-file may be missing). In that case the file(s) may be placed on a read-only network share.
  • without MDBX_RDONLY = open the environment in read-write mode.

This flag takes effect only when the environment is opened and can't be changed afterwards.

MDBX_EXCLUSIVE 

Open environment in exclusive/monopolistic mode.

The MDBX_EXCLUSIVE flag can be used as a replacement for MDB_NOLOCK, which is not supported by MDBX. This way you get minimal overhead while keeping correct multi-process and multi-thread locking.

  • with MDBX_EXCLUSIVE = open the environment in exclusive/monopolistic mode, or return MDBX_BUSY if the environment is already in use by another process. The main feature of the exclusive mode is the ability to open an environment placed on a network share.
  • without MDBX_EXCLUSIVE = open the environment in cooperative mode, i.e. for multi-process access/interaction/cooperation. The main requirements of the cooperative mode are:
    1. data files MUST be placed in the LOCAL file system, but NOT on a network share.
    2. environment MUST be opened only by LOCAL processes, but NOT over a network.
    3. the OS kernel (i.e. file system and memory mapping implementation) and all processes that open the given environment MUST be running in the physically single RAM with cache-coherency. The only exception to the cache-coherency requirement is Linux on the MIPS architecture, but this case has not been tested for a long time.

This flag takes effect only when the environment is opened and can't be changed afterwards.

MDBX_ACCEDE 

Use a database/environment which is already opened by other process(es).

The MDBX_ACCEDE flag is useful to avoid the MDBX_INCOMPATIBLE error while opening a database/environment which is already used by other process(es) with an unknown mode/flags. In such cases, if the specified flags (MDBX_NOMETASYNC, MDBX_SAFE_NOSYNC, MDBX_UTTERLY_NOSYNC, MDBX_LIFORECLAIM, MDBX_COALESCE and MDBX_NORDAHEAD) differ, the database will be opened in a mode compatible with the one already in use instead of returning an error.

MDBX_ACCEDE has no effect if the current process is the only one using the DB, if it opens the DB in read-only mode, or if the other process(es) use the DB in read-only mode.

MDBX_WRITEMAP 

Map data into memory with write permission.

Use a writeable memory map unless MDBX_RDONLY is set. This uses fewer mallocs and requires much less work for tracking database pages, but loses protection from application bugs like wild pointer writes and other bad updates into the database. This may be slightly faster for DBs that fit entirely in RAM, but is slower for DBs larger than RAM. It also adds the possibility for stray application writes through pointers to silently corrupt the database.

  • with MDBX_WRITEMAP = all data will be mapped into memory in read-write mode. This offers a significant performance benefit, since the data will be modified directly in mapped memory and then flushed to disk by a single system call, without any memory management or copying.
  • without MDBX_WRITEMAP = data will be mapped into memory in read-only mode. This requires keeping all modified database pages in memory and then writing them to disk through file operations.
Warning
On the other hand, MDBX_WRITEMAP adds the possibility for stray application writes through pointers to silently corrupt the database.
Note
The MDBX_WRITEMAP mode is incompatible with nested transactions, since supporting them would be unreasonable: nested transactions require malloc-based allocation of database pages and more work to track them, which negates the performance boost provided by the MDBX_WRITEMAP mode.

This flag takes effect only when the environment is opened and can't be changed afterwards.

MDBX_NOTLS 

Tie reader locktable slots to read-only transactions instead of to threads.

Don't use Thread-Local Storage; tie reader locktable slots to MDBX_txn objects instead of to threads. So, mdbx_txn_reset() keeps the slot reserved for the MDBX_txn object. A thread may use parallel read-only transactions, and a read-only transaction may span threads if you synchronize its use.

Applications that multiplex many user threads over individual OS threads need this option. Such an application must also serialize the write transactions in an OS thread, since MDBX's write locking is unaware of the user threads.

Note
Regardless of the MDBX_NOTLS flag, a write transaction should always be used entirely in one thread from start to finish. MDBX checks this in a reasonable manner and returns the MDBX_THREAD_MISMATCH error if the rule is violated.

This flag takes effect only when the environment is opened and can't be changed afterwards.

MDBX_NORDAHEAD 

Don't do readahead.

Turn off readahead. Most operating systems perform readahead on read requests by default. This option turns it off if the OS supports it. Turning it off may help random read performance when the DB is larger than RAM and system RAM is full.

By default libmdbx dynamically enables/disables readahead depending on the actual database size and the currently available memory. However, this automation has a limitation: it can be performed only when the DB size changes, and it can't track and react to changes in free RAM availability, since that changes independently and asynchronously.

Note
The mdbx_is_readahead_reasonable() function allows you to quickly find out whether readahead is worth using, based on the size of the data and the amount of available memory.

This flag takes effect only when the environment is opened and can't be changed afterwards.

MDBX_NOMEMINIT 

Don't initialize malloc'ed memory before writing to datafile.

Don't initialize malloc'ed memory before writing to unused spaces in the data file. By default, memory for pages written to the data file is obtained using malloc. While these pages may be reused in subsequent transactions, freshly malloc'ed pages will be initialized to zeroes before use. This avoids persisting leftover data from other code (that used the heap and subsequently freed the memory) into the data file.

Note that many other system libraries may allocate and free memory from the heap for arbitrary uses. E.g., stdio may use the heap for file I/O buffers. This initialization step has a modest performance cost so some applications may want to disable it using this flag. This option can be a problem for applications which handle sensitive data like passwords, and it makes memory checkers like Valgrind noisy. This flag is not needed with MDBX_WRITEMAP, which writes directly to the mmap instead of using malloc for pages. The initialization is also skipped if MDBX_RESERVE is used; the caller is expected to overwrite all of the memory that was reserved in that case.

This flag may be changed at any time using mdbx_env_set_flags().

MDBX_COALESCE 

Aims to coalesce Garbage Collection items.

With the MDBX_COALESCE flag, MDBX will aim to coalesce items while recycling the Garbage Collection. Technically, short lists of pages will be combined into longer ones when possible, as long as the result still fits on one database page. As a result, there will be fewer items in the Garbage Collection and the page lists will be longer, which slightly increases the likelihood of returning pages to Unallocated space and shrinking the database file.

This flag may be changed at any time using mdbx_env_set_flags().

MDBX_LIFORECLAIM 

LIFO policy for recycling Garbage Collection items.

The MDBX_LIFORECLAIM flag turns on the LIFO policy for recycling Garbage Collection items, instead of the default FIFO. On systems with a disk write-back cache, this can significantly increase write performance, up to several times in a best case scenario.

The LIFO recycling policy means that the pages taken for reuse are those which became unused most recently (i.e. just now). Therefore the loop of database page circulation becomes as short as possible. In other words, the number of pages that are overwritten in memory and on disk during a series of write transactions will be as small as possible. This creates ideal conditions for the efficient operation of the disk write-back cache.

MDBX_LIFORECLAIM is compatible with all no-sync flags, but has NO noticeable impact in combination with MDBX_SAFE_NOSYNC or MDBX_UTTERLY_NOSYNC, because MDBX will reuse pages only before the last "steady" MVCC-snapshot, i.e. the loop length of database page circulation will be mostly defined by the frequency of calling mdbx_env_sync() rather than by the difference between LIFO and FIFO.

This flag may be changed at any time using mdbx_env_set_flags().

MDBX_PAGEPERTURB 

Debugging option, fill/perturb released pages.

MDBX_SYNC_DURABLE 

Default robust and durable sync mode.

Metadata is written and flushed to disk after the data is written and flushed, which guarantees the integrity of the database in the event of a crash at any time.

Attention
Please do not use other modes until you have studied all the details and are sure. Otherwise, you may lose your users' data, as happens in Miranda NG messenger.
MDBX_NOMETASYNC 

Don't sync the meta-page after commit.

Flush system buffers to disk only once per transaction commit, omitting the metadata flush. That flush is deferred until the system flushes files to disk, or until the next non-MDBX_RDONLY commit or mdbx_env_sync(). Depending on the platform and hardware, with MDBX_NOMETASYNC you may get a doubling of write performance.

This trade-off maintains database integrity, but a system crash may undo the last committed transaction. I.e. it preserves the ACI (atomicity, consistency, isolation) but not the D (durability) database property.

The MDBX_NOMETASYNC flag may be changed at any time using mdbx_env_set_flags() or by passing it to mdbx_txn_begin() for a particular write transaction.

See also
SYNC MODES
MDBX_SAFE_NOSYNC 

Don't sync anything but keep previous steady commits.

Like MDBX_UTTERLY_NOSYNC, the MDBX_SAFE_NOSYNC flag disables flushing system buffers to disk when committing a transaction. But there is a huge difference in how the MVCC snapshots corresponding to previous "steady" transactions are recycled (see below).

With MDBX_WRITEMAP, MDBX_SAFE_NOSYNC instructs MDBX to use asynchronous mmap-flushes to disk. Asynchronous mmap-flushes mean that all writes will actually be scheduled and performed by the operating system in its own manner, i.e. unordered. MDBX itself just notifies the operating system that it would be nice to write the data to disk, but no more.

Depending on the platform and hardware, with MDBX_SAFE_NOSYNC you may get a manifold increase in write performance, even 10 times or more.

In contrast to the MDBX_UTTERLY_NOSYNC mode, with the MDBX_SAFE_NOSYNC flag MDBX will keep untouched the pages within the B-tree of the last "steady" transaction, i.e. the last one that was synced to disk completely. This has big implications for both data durability and (unfortunately) performance:

  • a system crash can't corrupt the database, but you will lose the last transactions, because MDBX will roll back to the last steady commit, which is kept explicitly.
  • the last steady transaction has an effect similar to a "long-lived" read transaction (see above in the Restrictions & Caveats section), since it prevents reuse of pages freed by newer write transactions; thus any data changes will be placed in newly allocated pages.
  • to avoid rapid database growth, the system will sync data and issue a steady commit-point to resume page reuse each time there is insufficient space, before increasing the size of the file on disk.

In other words, with the MDBX_SAFE_NOSYNC flag MDBX protects you from corruption of the whole database, at the cost of an increased database size and/or number of disk IOPs. So, the MDBX_SAFE_NOSYNC flag could be used with mdbx_env_sync() as an alternative to batch committing or nested transactions (in some cases). The auto-sync feature exposed by the mdbx_env_set_syncbytes() and mdbx_env_set_syncperiod() functions could also be very useful with the MDBX_SAFE_NOSYNC flag.

The number and volume of disk IOPs with the MDBX_SAFE_NOSYNC flag will be exactly the same as without any no-sync flags. However, you should expect a larger working set for the process and significantly worse locality of reference, due to the more intensive allocation of previously unused pages and the increased size of the database.

The MDBX_SAFE_NOSYNC flag may be changed at any time using mdbx_env_set_flags() or by passing it to mdbx_txn_begin() for a particular write transaction.

MDBX_MAPASYNC 
Deprecated:
Please use MDBX_SAFE_NOSYNC instead of MDBX_MAPASYNC.

Since version 0.9.x MDBX_MAPASYNC is deprecated and has the same effect as MDBX_SAFE_NOSYNC with MDBX_WRITEMAP. This is just an API simplification for convenience and clarity.

MDBX_UTTERLY_NOSYNC 

Don't sync anything and wipe previous steady commits.

Don't flush system buffers to disk when committing a transaction. This optimization means a system crash can corrupt the database if buffers are not yet flushed to disk. Depending on the platform and hardware, with MDBX_UTTERLY_NOSYNC you may get a manifold increase in write performance, even 100 times or more.

If the filesystem preserves write order (which is rare and never provided unless explicitly noted) and the MDBX_WRITEMAP and MDBX_LIFORECLAIM flags are not used, then a system crash can't corrupt the database, but you can lose the last transactions, if at least one buffer is not yet flushed to disk. The risk is governed by how often the system flushes dirty buffers to disk and how often mdbx_env_sync() is called. So, transactions exhibit ACI (atomicity, consistency, isolation) properties and only lose D (durability). I.e. database integrity is maintained, but a system crash may undo the final transactions.

Otherwise, if the filesystem does not preserve write order (which is typical) or the MDBX_WRITEMAP or MDBX_LIFORECLAIM flags are used, you should expect a corrupted database after a system crash.

So, the most important things about MDBX_UTTERLY_NOSYNC:

  • a system crash immediately after committing a write transaction is highly likely to lead to database corruption.
  • successful completion of mdbx_env_sync(force = true) after one or more committed transactions guarantees consistency and durability.
  • BUT by committing two or more transactions you put the database back into a weak state, in which a system crash may lead to database corruption! In the case of a single transaction after mdbx_env_sync(), you may lose the transaction itself, but not the whole database.

Nevertheless, MDBX_UTTERLY_NOSYNC provides "weak" durability in case of an application crash (but no durability on system failure), and therefore may be very useful in scenarios where data durability is not required over a system failure (e.g. for short-lived data), or if you can take such a risk.

The MDBX_UTTERLY_NOSYNC flag may be changed at any time using mdbx_env_set_flags(), but it has no effect if passed to mdbx_txn_begin() for a particular write transaction.

See also
SYNC MODES
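
Since MDBX_UTTERLY_NOSYNC is defined as MDBX_SAFE_NOSYNC plus one extra bit, a plain non-zero test of `flags & MDBX_UTTERLY_NOSYNC` would also match MDBX_SAFE_NOSYNC. The sketch below shows one way to classify a flag set by its weakest sync mode. It is only an illustration: sync_mode_name() and the SK_ constants are hypothetical names, and the values are copied from the enumeration above so that the snippet compiles without mdbx.h.

```c
#include <stdint.h>

/* Values copied from the MDBX_env_flags_t enumeration above. */
#define SK_NOMETASYNC UINT32_C(0x40000)
#define SK_SAFE_NOSYNC UINT32_C(0x10000)
#define SK_UTTERLY_NOSYNC (SK_SAFE_NOSYNC | UINT32_C(0x100000))

/* Hypothetical helper: names the weakest sync mode present in flags. */
static const char *sync_mode_name(uint32_t flags) {
  /* Compare against the full mask first, because UTTERLY_NOSYNC
   * contains the SAFE_NOSYNC bit. */
  if ((flags & SK_UTTERLY_NOSYNC) == SK_UTTERLY_NOSYNC)
    return "MDBX_UTTERLY_NOSYNC";
  if (flags & SK_SAFE_NOSYNC)
    return "MDBX_SAFE_NOSYNC";
  if (flags & SK_NOMETASYNC)
    return "MDBX_NOMETASYNC";
  /* MDBX_SYNC_DURABLE is 0, i.e. the absence of any no-sync bit. */
  return "MDBX_SYNC_DURABLE";
}
```

Bits unrelated to sync modes (for example MDBX_WRITEMAP) are ignored by this helper.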

Function Documentation

◆ mdbx_env_close()

int mdbx_env_close ( MDBX_env * env)
inline

The shortcut to calling mdbx_env_close_ex() with the dont_sync=false argument.

◆ mdbx_env_close_ex()

LIBMDBX_API int mdbx_env_close_ex ( MDBX_env * env,
bool  dont_sync 
)

Close the environment and release the memory map.

Only a single thread may call this function. All transactions, databases, and cursors must already be closed before calling this function. Attempts to use any such handles after calling this function will cause a SIGSEGV. The environment handle will be freed and must not be used again after this call.

Parameters
[in] env  An environment handle returned by mdbx_env_create().
[in] dont_sync  A don't-sync flag. If non-zero, the last checkpoint will be kept "as is" and may still be "weak" in the MDBX_SAFE_NOSYNC or MDBX_UTTERLY_NOSYNC modes. Such a "weak" checkpoint will be ignored the next time the environment is opened, and transactions since the last non-weak checkpoint (meta-page update) will be rolled back to guarantee consistency.
Returns
A non-zero error value on failure and 0 on success. Some possible errors are:
Return values
MDBX_BUSY  The write transaction is being run by another thread; in this case the MDBX_env instance has NOT been destroyed or released!
Note
If any OTHER error code is returned, the given MDBX_env instance has been destroyed and released.
Return values
MDBX_EBADSIGN  The environment handle is already closed or not valid, i.e. mdbx_env_close() was already called for the env or it was not created by mdbx_env_create().
MDBX_PANIC  mdbx_env_close_ex() was called in the child process after fork(). In this case MDBX_PANIC is expected, i.e. the MDBX_env instance was freed in a proper manner.
MDBX_EIO  An error occurred during synchronization.

◆ mdbx_env_create()

LIBMDBX_API int mdbx_env_create ( MDBX_env **  penv)

Create an MDBX environment instance.

This function allocates memory for a MDBX_env structure. To release the allocated memory and discard the handle, call mdbx_env_close(). Before the handle may be used, it must be opened using mdbx_env_open().

Various other options may also need to be set before opening the handle, e.g. mdbx_env_set_geometry(), mdbx_env_set_maxreaders(), mdbx_env_set_maxdbs(), depending on usage requirements.

Parameters
[out] penv  The address where the new handle will be stored.
Returns
a non-zero error value on failure and 0 on success.

◆ mdbx_env_open()

LIBMDBX_API int mdbx_env_open ( MDBX_env * env,
const char *  pathname,
MDBX_env_flags_t  flags,
mdbx_mode_t  mode 
)

Open an environment instance.

Regardless of whether this function fails or not, mdbx_env_close() must be called later to discard the MDBX_env handle and release the associated resources.

Parameters
[in] env  An environment handle returned by mdbx_env_create().
[in] pathname  The pathname for the database or the directory in which the database files reside. In the case of a directory, it must already exist and be writable.
[in] flags  Specifies options for this environment. This parameter must be set by bitwise OR'ing together any of the constants described above in the env_flags and SYNC MODES sections.

Flags set by mdbx_env_set_flags() are also used:

Note
The MDB_NOLOCK flag is not supported by MDBX; try using MDBX_EXCLUSIVE as a replacement.
MDBX doesn't allow mixing processes with different MDBX_SAFE_NOSYNC flags on the same environment. In such a case MDBX_INCOMPATIBLE will be returned.

If the database already exists and the parameters specified earlier by mdbx_env_set_geometry() are incompatible (for instance, a different page size), then mdbx_env_open() will return the MDBX_INCOMPATIBLE error.

Parameters
[in] mode  The UNIX permissions to set on created files. A zero value means to open an existing environment, but not to create one.
Returns
A non-zero error value on failure and 0 on success, some possible errors are:
Return values
MDBX_VERSION_MISMATCH  The version of the MDBX library doesn't match the version that created the database environment.
MDBX_INVALID  The environment file headers are corrupted.
MDBX_ENOENT  The directory specified by the pathname parameter doesn't exist.
MDBX_EACCES  The user didn't have permission to access the environment files.
MDBX_EAGAIN  The environment was locked by another process.
MDBX_BUSY  The MDBX_EXCLUSIVE flag was specified and the environment is in use by another process, or the current process tries to open the environment more than once.
MDBX_INCOMPATIBLE  The environment is already opened by another process, but with a different set of MDBX_SAFE_NOSYNC, MDBX_UTTERLY_NOSYNC flags. Or the database already exists and the parameters specified earlier by mdbx_env_set_geometry() are incompatible (i.e. different pagesize, etc).
MDBX_WANNA_RECOVERY  The MDBX_RDONLY flag was specified but read-write access is required to roll back an inconsistent state after a system crash.
MDBX_TOO_LARGE  The database is too large for this process, i.e. a 32-bit process tries to open a >4Gb database.