datalad.config

class datalad.config.ConfigManager(dataset=None, overrides=None, source='any')[source]

Bases: object

Thin wrapper around git-config with support for a dataset configuration.

The general idea is to have an object that is primarily used to read/query configuration options. Upon creation, the current configuration is read via one (or at most two, if a dataset-specific configuration is present) calls to git config. If this class is initialized with a Dataset instance, it also supports reading and writing configuration from .datalad/config inside a dataset. This file is committed to Git and is hence useful for shipping certain configuration items with a dataset.

The API aims to provide the most significant read-access API of a dictionary, the Python ConfigParser, and GitPython’s config parser implementations.

This class is presently not capable of efficiently writing multiple configuration items at once. Instead, each modification results in a dedicated call to git config. This author thinks this is OK, as he cannot think of a situation where a large number of items need to be written during normal operation.

Each instance carries a public overrides attribute. This dictionary contains variables that override any setting read from a file. The overrides are persistent across reloads.

Any DATALAD_* environment variable is also presented as a configuration item. Settings read from environment variables are not stored in any of the configuration files, but are read dynamically from the environment at each reload() call. Their values take precedence over any specification in configuration files, and even overrides.

Parameters:
  • dataset (Dataset, optional) – If provided, all git config calls are executed in this dataset’s directory. Moreover, any modifications are, by default, directed to this dataset’s configuration file (which will be created on demand).

  • overrides (dict, optional) – Variable overrides, see general class documentation for details.

  • source ({'any', 'local', 'branch', 'branch-local'}, optional) – Which sources of configuration settings to consider. If ‘branch’, configuration items are only read from a dataset’s persistent configuration file in the current branch, if any is present (the one in .datalad/config, not .git/config); if ‘local’, any non-committed source is considered (local and global configuration in Git config’s terminology); if ‘branch-local’, persistent configuration in the current dataset branch and local, but not global or system, configuration are considered; if ‘any’, all possible sources of configuration are considered. Note: ‘dataset’ and ‘dataset-local’ are deprecated in favor of ‘branch’ and ‘branch-local’.
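
The precedence described in the class documentation (configuration files, then overrides, then DATALAD_* environment variables) can be sketched with plain dictionaries. This is a toy model and not DataLad's implementation: the function name resolve_config and the mapping from environment variable names to configuration names (lowercase, underscores replaced by dots) are assumptions made for illustration only.

```python
def resolve_config(file_cfg, overrides, environ):
    """Merge three layers; later layers win (toy model, not DataLad code)."""
    merged = dict(file_cfg)
    # instance-level overrides beat anything read from a file
    merged.update(overrides)
    # DATALAD_* environment variables beat even the overrides
    for name, value in environ.items():
        if name.startswith("DATALAD_"):
            # assumed name mapping: DATALAD_LOG_LEVEL -> datalad.log.level
            merged[name.lower().replace("_", ".")] = value
    return merged


cfg = resolve_config(
    file_cfg={"datalad.log.level": "info", "user.name": "Jane"},
    overrides={"datalad.log.level": "debug"},
    environ={"DATALAD_LOG_LEVEL": "error", "HOME": "/home/jane"},
)
print(cfg["datalad.log.level"])
```

In this sketch the environment value wins over both the override and the file value, while unrelated environment variables are ignored.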

add(var, value, scope='branch', reload=True)[source]

Add a configuration variable and value

Parameters:
  • var (str) – Variable name including any section like git config expects them, e.g. ‘core.editor’

  • value (str) – Variable value

  • scope ({'branch', 'local', 'global', 'override'}, optional) – Indicator which configuration file to modify. ‘branch’ indicates the persistent configuration in .datalad/config of a dataset; ‘local’ the configuration of a dataset’s Git repository in .git/config; ‘global’ refers to the general configuration that is not specific to a single repository (usually in $HOME/.gitconfig); ‘override’ limits the modification to the ConfigManager instance, and the assigned value overrides any setting from any other source. Note: ‘dataset’ is being DEPRECATED in favor of ‘branch’.

  • where ({'branch', 'local', 'global', 'override'}, optional) – DEPRECATED, use ‘scope’.

  • reload (bool) – Flag whether to reload the configuration from file(s) after modification. This can be disabled to make multiple sequential modifications slightly more efficient.

get(k[, d])[source]

Return D[k] if k in D, else d. d defaults to None.

Parameters:
  • default (optional) – Value to return when key is not present. None by default.

  • get_all (bool, optional) – If True, return all values of multiple identical configuration keys. By default only the last specified value is returned.

get_from_source(source, key, default=None)[source]

Like get(), but a source can be specified.

If source is ‘branch’, only the committed configuration is queried, overrides are applied. In the case of ‘local’, the committed configuration is ignored, but overrides and configuration from environment variables are applied as usual.

get_value(section, option, default=None)[source]

Like get(), but with an optional default value

If the default is not None, the given default value will be returned in case the option did not exist. This behavior imitates GitPython’s config parser.

getbool(section, option, default=None)[source]

A convenience method which coerces the option value to a bool

Values “on”, “yes”, “true” and any int!=0 are considered True. Values which evaluate to bool False, as well as “off”, “no”, “false”, are considered False. TypeError is raised for other values.
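
The coercion rules above can be illustrated with a small stand-alone function. This is a sketch under stated assumptions, not DataLad's own code: the name coerce_bool is hypothetical, and treating numeric strings like integers and matching case-insensitively are choices made for this example.

```python
def coerce_bool(val):
    """Toy version of the coercion rules listed above (not DataLad code)."""
    if isinstance(val, str):
        text = val.strip().lower()
        if text in ("on", "yes", "true"):
            return True
        if text in ("off", "no", "false", ""):
            return False
        try:
            # numeric strings: any non-zero integer counts as True
            return int(text) != 0
        except ValueError:
            raise TypeError(f"cannot interpret {val!r} as a boolean")
    if isinstance(val, int):  # also covers bool
        return val != 0
    if not val:
        # other values that evaluate to False, e.g. None
        return False
    raise TypeError(f"cannot interpret {val!r} as a boolean")


print(coerce_bool("Yes"), coerce_bool("off"), coerce_bool(2))
```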

getfloat(section, option)[source]

A convenience method which coerces the option value to a float

getint(section, option)[source]

A convenience method which coerces the option value to an integer

has_option(section, option)[source]

Indicates whether the given section exists and contains the given option

has_section(section)[source]

Indicates whether a section is present in the configuration

items(section=None)[source]

Return a list of (name, value) pairs for each option

Optionally limited to a given section.

keys()[source]

Returns list of configuration item names

obtain(var, default=None, dialog_type=None, valtype=None, store=False, scope=None, reload=True, **kwargs)[source]

Convenience method to obtain settings interactively, if needed

A UI will be used to ask for user input in interactive sessions. Questions to ask, and additional explanations can be passed directly as arguments, or retrieved from a list of pre-configured items.

Additionally, this method allows for type conversion and storage of obtained settings. Both aspects can also be pre-configured.

Parameters:
  • var (str) – Variable name including any section like git config expects them, e.g. ‘core.editor’

  • default (any type) – In interactive sessions and if store is True, this default value will be presented to the user for confirmation (or modification). In all other cases, this value will be silently assigned unless there is an existing configuration setting.

  • dialog_type ({'question', 'yesno', None}) – Which dialog type to use in interactive sessions. If None, pre-configured UI options are used.

  • store (bool) – Whether to store the obtained value (or default)

  • scope ({'branch', 'local', 'global', 'override'}, optional) – Indicator which configuration file to modify. ‘branch’ indicates the persistent configuration in .datalad/config of a dataset; ‘local’ the configuration of a dataset’s Git repository in .git/config; ‘global’ refers to the general configuration that is not specific to a single repository (usually in $HOME/.gitconfig); ‘override’ limits the modification to the ConfigManager instance, and the assigned value overrides any setting from any other source. Note: ‘dataset’ is being DEPRECATED in favor of ‘branch’.

  • where ({'branch', 'local', 'global', 'override'}, optional) – DEPRECATED, use ‘scope’.

  • reload (bool) – Flag whether to reload the configuration from file(s) after modification. This can be disabled to make multiple sequential modifications slightly more efficient.

  • **kwargs – Additional arguments for the UI function call, such as a question text.

options(section)[source]

Returns a list of options available in the specified section.

reload(force=False)[source]

Reload all configuration items from the configured sources

If force is False, all files that configuration was previously read from are checked for changes in their modification times. If no change is found for any file, no reload is performed. This mechanism will not detect newly created global configuration files; use force in this case.
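
The modification-time check described above can be sketched with the standard library alone. This is a minimal illustration, not DataLad's implementation: the helper name needs_reload and the dictionary of recorded times are hypothetical.

```python
import os
import tempfile


def needs_reload(recorded_mtimes):
    """Return True if any tracked file's modification time has changed."""
    for path, old_mtime in recorded_mtimes.items():
        if os.stat(path).st_mtime_ns != old_mtime:
            return True
    return False


with tempfile.TemporaryDirectory() as tmp:
    cfg_file = os.path.join(tmp, "config")
    with open(cfg_file, "w") as f:
        f.write("[user]\n\tname = Jane\n")
    # remember the modification time at "load" time
    seen = {cfg_file: os.stat(cfg_file).st_mtime_ns}
    unchanged = needs_reload(seen)
    # simulate a later edit by bumping the modification time by one second
    st = os.stat(cfg_file)
    os.utime(cfg_file, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    changed = needs_reload(seen)

print(unchanged, changed)
```

Because only previously recorded files are inspected, a file created after the initial read never appears in the check, which is why a forced reload is needed in that case.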

remove_section(sec, scope='branch', reload=True)[source]

Remove a configuration section

Parameters:
  • sec (str) – Name of the section to remove.

  • scope ({'branch', 'local', 'global', 'override'}, optional) – Indicator which configuration file to modify. ‘branch’ indicates the persistent configuration in .datalad/config of a dataset; ‘local’ the configuration of a dataset’s Git repository in .git/config; ‘global’ refers to the general configuration that is not specific to a single repository (usually in $HOME/.gitconfig); ‘override’ limits the modification to the ConfigManager instance, and the assigned value overrides any setting from any other source. Note: ‘dataset’ is being DEPRECATED in favor of ‘branch’.

  • where ({'branch', 'local', 'global', 'override'}, optional) – DEPRECATED, use ‘scope’.

  • reload (bool) – Flag whether to reload the configuration from file(s) after modification. This can be disabled to make multiple sequential modifications slightly more efficient.

rename_section(old, new, scope='branch', reload=True)[source]

Rename a configuration section

Parameters:
  • old (str) – Name of the section to rename.

  • new (str) – Name of the section to rename to.

  • scope ({'branch', 'local', 'global', 'override'}, optional) – Indicator which configuration file to modify. ‘branch’ indicates the persistent configuration in .datalad/config of a dataset; ‘local’ the configuration of a dataset’s Git repository in .git/config; ‘global’ refers to the general configuration that is not specific to a single repository (usually in $HOME/.gitconfig); ‘override’ limits the modification to the ConfigManager instance, and the assigned value overrides any setting from any other source. Note: ‘dataset’ is being DEPRECATED in favor of ‘branch’.

  • where ({'branch', 'local', 'global', 'override'}, optional) – DEPRECATED, use ‘scope’.

  • reload (bool) – Flag whether to reload the configuration from file(s) after modification. This can be disabled to make multiple sequential modifications slightly more efficient.

rewrite_url(url)

Any matching ‘url.<base>.insteadOf’ configuration is applied

Any URL that starts with such a configuration will be rewritten to start, instead, with <base>. When more than one insteadOf strings match a given URL, the longest match is used.

Parameters:
  • url (str) – URL to be rewritten, if matching configuration is found.

Returns:

Rewritten or unmodified URL.

Return type:

str

sections()[source]

Returns a list of the sections available

set(var, value, scope='branch', reload=True, force=False)[source]

Set a variable to a value.

In contrast to add, this replaces the value of var if there is one already.

Parameters:
  • var (str) – Variable name including any section like git config expects them, e.g. ‘core.editor’

  • value (str) – Variable value

  • force (bool) – If set, replaces all occurrences of var with a single one holding the given value. Otherwise, an exception is raised if multiple entries for var already exist.

  • scope ({'branch', 'local', 'global', 'override'}, optional) – Indicator which configuration file to modify. ‘branch’ indicates the persistent configuration in .datalad/config of a dataset; ‘local’ the configuration of a dataset’s Git repository in .git/config; ‘global’ refers to the general configuration that is not specific to a single repository (usually in $HOME/.gitconfig); ‘override’ limits the modification to the ConfigManager instance, and the assigned value overrides any setting from any other source. Note: ‘dataset’ is being DEPRECATED in favor of ‘branch’.

  • where ({'branch', 'local', 'global', 'override'}, optional) – DEPRECATED, use ‘scope’.

  • reload (bool) – Flag whether to reload the configuration from file(s) after modification. This can be disabled to make multiple sequential modifications slightly more efficient.
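
The difference between add() and set(), together with the get_all behavior of get(), can be modeled with a toy in-memory store. This is only a sketch of the documented semantics: the class MultiValueStore is hypothetical, it does not call git config, and the use of ValueError and of a tuple return value for get_all are assumptions of this example.

```python
class MultiValueStore:
    """Toy key/value store in which a key may hold several values."""

    def __init__(self):
        self._data = {}

    def add(self, var, value):
        # add() appends, so a key can end up with multiple values
        self._data.setdefault(var, []).append(value)

    def set(self, var, value, force=False):
        # set() replaces; refuse when ambiguous unless force is given
        if len(self._data.get(var, [])) > 1 and not force:
            raise ValueError(f"multiple values for {var!r}; use force=True")
        self._data[var] = [value]

    def get(self, var, default=None, get_all=False):
        values = self._data.get(var)
        if not values:
            return default
        # by default only the last specified value is reported
        return tuple(values) if get_all else values[-1]


store = MultiValueStore()
store.add("remote.origin.fetch", "a")
store.add("remote.origin.fetch", "b")
last = store.get("remote.origin.fetch")
everything = store.get("remote.origin.fetch", get_all=True)
store.set("remote.origin.fetch", "c", force=True)
after_set = store.get("remote.origin.fetch", get_all=True)
```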

unset(var, scope='branch', reload=True)[source]

Remove all occurrences of a variable

Parameters:
  • var (str) – Name of the variable to remove

  • scope ({'branch', 'local', 'global', 'override'}, optional) – Indicator which configuration file to modify. ‘branch’ indicates the persistent configuration in .datalad/config of a dataset; ‘local’ the configuration of a dataset’s Git repository in .git/config; ‘global’ refers to the general configuration that is not specific to a single repository (usually in $HOME/.gitconfig); ‘override’ limits the modification to the ConfigManager instance, and the assigned value overrides any setting from any other source. Note: ‘dataset’ is being DEPRECATED in favor of ‘branch’.

  • where ({'branch', 'local', 'global', 'override'}, optional) – DEPRECATED, use ‘scope’.

  • reload (bool) – Flag whether to reload the configuration from file(s) after modification. This can be disabled to make multiple sequential modifications slightly more efficient.

datalad.config.anything2bool(val)[source]

datalad.config.get_git_version(runner=None)[source]

Return version of available git

datalad.config.parse_gitconfig_dump(dump, cwd=None, multi_value=True)[source]

Parse a dump-string from git config -z --list

This parser has limited support for discarding unrelated output that may contaminate the given dump. It does so by performing a relatively strict matching of configuration key syntax, and discarding lines in the output that are not valid git-config keys.

There is also built-in support for parsing outputs generated with --show-origin (see return value).

Parameters:
  • dump (str) – Null-byte separated output

  • cwd (path-like, optional) – Use this absolute path to convert relative paths for origin reports into absolute paths. By default, the process working directory PWD is used.

  • multi_value (bool, optional) – If True, report values from multiple specifications of the same key as a tuple of values assigned to this key. Otherwise, the last configuration is reported.

Returns:

Configuration items are returned as key/value pairs in a dictionary. The second tuple-item will be a set of identifiers comprising all source files/blobs, if origin information was included in the dump (--show-origin). An empty set is returned otherwise. For actual files a Path object is included in the set, for a git-blob a Git blob ID prefixed with ‘blob:’ is reported.

Return type:

dict, set
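
The basic record format of such a dump can be illustrated with a short stand-alone parser. This is a simplified sketch and not the function documented here: the name parse_dump_sketch is hypothetical, origin information is not handled, no key-syntax filtering is performed, and records without a newline are treated as keys without a value.

```python
def parse_dump_sketch(dump, multi_value=True):
    """Parse NUL-separated records of the form key, newline, value."""
    items = {}
    for record in dump.split("\x00"):
        if not record:
            continue  # a trailing NUL leaves an empty final chunk
        key, sep, value = record.partition("\n")
        if not sep:
            value = None  # key given without any value
        if multi_value and key in items:
            # collect repeated keys into a tuple of values
            previous = items[key]
            if not isinstance(previous, tuple):
                previous = (previous,)
            items[key] = previous + (value,)
        else:
            # first occurrence, or last-one-wins when multi_value is False
            items[key] = value
    return items


dump = (
    "user.name\nJane\x00"
    "remote.origin.fetch\na\x00"
    "remote.origin.fetch\nb\x00"
    "core.bare\x00"
)
parsed = parse_dump_sketch(dump)
parsed_last = parse_dump_sketch(dump, multi_value=False)
```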

datalad.config.quote_config(v)[source]

Helper to perform minimal quoting of config keys/value parts

Parameters:

v (str) – To-be-quoted string

datalad.config.rewrite_url(cfg, url)[source]

Any matching ‘url.<base>.insteadOf’ configuration is applied

Any URL that starts with such a configuration will be rewritten to start, instead, with <base>. When more than one insteadOf strings match a given URL, the longest match is used.

Parameters:
  • cfg (ConfigManager or dict) – dict-like with configuration variable name/value-pairs.

  • url (str) – URL to be rewritten, if matching configuration is found.

Returns:

Rewritten or unmodified URL.

Return type:

str
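
The longest-match rule can be sketched against a plain dictionary, which the cfg parameter is documented to accept. This is an illustrative reimplementation, not DataLad's code: the name rewrite_url_sketch, the example URLs, and the handling of tuple values for repeated keys are assumptions of this example.

```python
def rewrite_url_sketch(cfg, url):
    """Apply the longest matching 'url.<base>.insteadOf' prefix rewrite."""
    best_base, best_prefix = None, ""
    for key, value in cfg.items():
        if not (key.startswith("url.") and key.lower().endswith(".insteadof")):
            continue
        base = key[len("url."):-len(".insteadof")]
        # a key may carry one value or a tuple of values
        prefixes = value if isinstance(value, (tuple, list)) else (value,)
        for prefix in prefixes:
            if url.startswith(prefix) and len(prefix) > len(best_prefix):
                best_base, best_prefix = base, prefix
    if best_base is None:
        return url  # nothing matched, leave the URL untouched
    return best_base + url[len(best_prefix):]


cfg = {
    "url.https://example.org/.insteadOf": "ex:",
    "url.https://example.org/data/.insteadOf": "ex:data/",
}
rewritten = rewrite_url_sketch(cfg, "ex:data/file")
```

Here both configured prefixes match the input, and the longer one (‘ex:data/’) determines the result.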

datalad.config.warn_on_undefined_git_identity(cfg)[source]

Check whether a Git identity is defined, and warn if not

Parameters:

cfg (ConfigManager) –

datalad.config.write_config_section(fobj, suite, name, props)[source]

Write a config section with (multiple) settings.

Parameters:
  • fobj (File) – Opened target file

  • suite (str) – First item of the section name, e.g. ‘submodule’, or ‘datalad’

  • name (str) – Remainder of the section name

  • props (dict) – Keys are configuration setting names within the section context (i.e. not duplicating suite and/or name); values are configuration setting values.
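
How the four arguments map onto a section in git-config file syntax can be sketched as follows. This is an assumption-laden illustration rather than the documented function: the name write_section_sketch is hypothetical, and the exact quoting and whitespace produced by DataLad may differ.

```python
import io


def write_section_sketch(fobj, suite, name, props):
    """Write one section in git-config file syntax (illustrative only)."""
    # escape backslashes and double quotes in the subsection name
    quoted = name.replace("\\", "\\\\").replace('"', '\\"')
    fobj.write(f'[{suite} "{quoted}"]\n')
    for key, value in props.items():
        # one tab-indented "key = value" line per setting
        fobj.write(f"\t{key} = {value}\n")


buf = io.StringIO()
write_section_sketch(
    buf, "submodule", "sub/one", {"path": "sub/one", "url": "./sub/one"}
)
text = buf.getvalue()
print(text)
```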