Datalad uses the same configuration mechanism and syntax as Git itself. Consequently, datalad can be configured using the git config command. Both a global user configuration (typically at ~/.gitconfig) and a local repository-specific configuration (.git/config) are inspected.

In addition, datalad supports a persistent dataset-specific configuration. This configuration is stored at .datalad/config in any dataset. As it is part of a dataset, settings stored there will also be in effect for any consumer of such a dataset. Both global and local settings on a particular machine always override configuration shipped with a dataset.
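The precedence among the three configuration files can be pictured as a layered lookup. The following is a minimal sketch, not datalad's actual implementation: the function name resolve, the dictionaries, and the setting name datalad.example.setting are hypothetical, and placing local above global follows Git's usual behavior rather than anything stated here.

```python
def resolve(name, dataset_cfg, global_cfg, local_cfg, default=None):
    """Return the effective value of `name` across three config layers.

    Layers are checked from highest to lowest precedence. Local above
    global mirrors Git's behavior (an assumption here); both override
    the configuration shipped in the dataset's .datalad/config.
    """
    for layer in (local_cfg, global_cfg, dataset_cfg):
        if name in layer:
            return layer[name]
    return default


# Hypothetical setting name and values, used only for illustration.
dataset_cfg = {"datalad.example.setting": "from-dataset"}
global_cfg = {"datalad.example.setting": "from-global"}
local_cfg = {}

effective = resolve("datalad.example.setting", dataset_cfg, global_cfg, local_cfg)
print(effective)  # from-global
```

A consumer who sets nothing on their machine would see the dataset's value; any machine-level setting replaces it.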

All datalad-specific configuration variables are prefixed with datalad..

It is possible to override or amend the configuration using environment variables. Any variable whose name starts with DATALAD_ is available as the corresponding datalad. configuration variable: every _ in the name is replaced with a dot, and all letters are converted to lower case. Values from environment variables take precedence over configuration file settings.
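The name translation and the override can be sketched in a few lines. This is an illustration of the rule as described, not datalad's own code: the helper names env_to_config_name, overrides_from_env, and effective_config are hypothetical, and the example values are made up.

```python
import os

PREFIX = "DATALAD_"


def env_to_config_name(env_name):
    """Translate e.g. DATALAD_CMD_PROTOCOL into datalad.cmd.protocol."""
    if not env_name.startswith(PREFIX):
        raise ValueError(f"{env_name!r} does not start with {PREFIX!r}")
    # Every underscore becomes a dot, and all letters become lower case.
    return env_name.replace("_", ".").lower()


def overrides_from_env(environ=None):
    """Collect all DATALAD_* variables as configuration overrides."""
    environ = os.environ if environ is None else environ
    return {
        env_to_config_name(key): value
        for key, value in environ.items()
        if key.startswith(PREFIX)
    }


def effective_config(file_cfg, environ=None):
    """Merge file-based settings with environment overrides (env wins)."""
    merged = dict(file_cfg)
    merged.update(overrides_from_env(environ))
    return merged


# Made-up values; a plain dict stands in for the real environment.
file_cfg = {"datalad.cmd.protocol": "null"}
fake_env = {"DATALAD_CMD_PROTOCOL": "time", "HOME": "/home/user"}
print(effective_config(file_cfg, fake_env))  # {'datalad.cmd.protocol': 'time'}
```

Passing a dictionary instead of reading os.environ directly keeps the sketch easy to try without modifying the real environment.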

The following sections provide a (non-exhaustive) list of settings honored by datalad. They are categorized according to the scope they are typically associated with.

Global user configuration

Default annex repository mode: Should datasets be initialized in direct mode?

Crawler pipeline housekeeping: Should the crawler tidy up datasets (git gc, repack, clean)?

[value must be convertible to type bool]

NDA database server: Hostname of the database server
Cache directory: Where should datalad cache files? Default: ~/.cache/datalad

Local repository configuration


Crawler download caching: Should the crawler cache downloaded files?



Crawler dry-run: Should the crawler ... I AM NOT QUITE SURE WHAT?

[value must be convertible to type bool]

Sticky dataset configuration

Default annex backend: Content hashing method to be used by git-annex

Miscellaneous configuration

Specifies the protocol used by the Runner to record the timing of shell commands or Python function calls; it also allows for dry runs. Accepted values are “externals-time” for ExecutionTimeExternalsProtocol, “time” for ExecutionTimeProtocol, and “null” for NullProtocol. Any new DATALAD_CMD_PROTOCOL has to implement
Sets a prefix to add before the command call times noted by DATALAD_CMD_PROTOCOL.
This flag is used by the datalad extract_tb function, which extracts and formats stack traces. It caps the number of pre-processed traceback entries at DATALAD_EXC_STR_TBLIMIT.
Controls the verbosity of logs printed to stdout while running datalad commands or debugging.
Include the name of the log target in each log line.
Comma-separated list of names to print log lines for.
Regular expression matching the names to print log lines for.
Controls whether both stdout and stderr of external command executions are logged in detail (at DEBUG level).

Adds a timestamp to datalad logs. Default: False

[value must be convertible to type bool]

Runs the TraceBack function with collide set to True if this flag is set to “collide”. This replaces any prefix shared by the current traceback log and the previous invocation with “...”.

Direct Mode for git-annex repositories: Set this flag to create annex repositories in direct mode by default

[value must be convertible to type bool]


git-annex repository version: Specifies the repository version for git-annex to be used by default

[value must be convertible to type int]


Binary flag specifying whether each annex test repository should get the datalad special remote.

[value must be convertible to type bool]


Skips network tests completely if this flag is set. Examples include tests for s3, git_repositories, openfmri, etc.

[value must be convertible to type bool]

Specifies network interfaces to bring down/up for testing. Currently used by Travis.

If this flag is set, teardown_package, which cleans up temporary files and directories created by tests, is not executed.

[value must be convertible to type bool]


Binary flag specifying whether to test protocol interactions of custom remotes with annex.

[value must be convertible to type bool]


Binary flag specifying whether shell testing using shunit2 is to be carried out.

[value must be convertible to type bool]


Skips SSH tests if this flag is not set.

[value must be convertible to type bool]

Creates a temporary directory at the location specified by this flag. Tests use it to create a temporary git directory while testing git-annex archives, etc.
Specifies the temporary file system to use as a loop device for testing DATALAD_TESTS_TEMP_DIR creation.
Specifies the size of the temporary file system to use as a loop device for testing DATALAD_TESTS_TEMP_DIR creation.

If this flag is set, the rmtemp function will not remove temporary files or directories created for testing.

[value must be convertible to type bool]

Tests UI backend: Which UI backend to use? Default: tests-noninteractive
Specifies the location of the file in which the VCR module records network transactions. Currently used when testing custom special remotes.
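
Many of the settings above require a value convertible to type bool. The following is a minimal sketch of such a conversion, assuming the boolean literals that Git itself accepts (true/yes/on/1 and false/no/off/0, case-insensitive). The exact set of values datalad accepts may differ, and the helper name to_bool is hypothetical.

```python
# Boolean literals as accepted by Git; whether datalad accepts exactly
# this set is an assumption made for this sketch.
TRUE_VALUES = {"true", "yes", "on", "1"}
FALSE_VALUES = {"false", "no", "off", "0"}


def to_bool(value):
    """Convert a configuration value to bool, or raise ValueError."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in TRUE_VALUES:
        return True
    if text in FALSE_VALUES:
        return False
    raise ValueError(f"cannot convert {value!r} to bool")


print(to_bool("Yes"))  # True
print(to_bool("0"))    # False
```

Rejecting unknown strings, rather than treating any non-empty string as true, avoids silently enabling a flag through a typo.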