Miscellaneous utilities to assist with testing

class datalad.tests.utils.HTTPPath(path)[source]

Bases: object

Serve the content of a path via an HTTP URL.

This class can be used as a context manager, in which case it returns the URL.

Alternatively, the start and stop methods can be called directly.

Parameters: path (str) – Directory with content to serve.

Start serving path via HTTP.


Stop serving path.
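
The start/stop and context-manager behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the HTTPPath implementation: the class name ServeDirectory, the choice of a random free port on 127.0.0.1, and the url attribute are all assumptions made for the example.

```python
import functools
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


class ServeDirectory:
    """Hypothetical stand-in: serve `path` over HTTP on a free local port."""

    def __init__(self, path):
        self.path = path
        self.url = None
        self._server = None
        self._thread = None

    def start(self):
        # Bind the handler to the directory that should be served.
        handler = functools.partial(SimpleHTTPRequestHandler, directory=self.path)
        # Port 0 asks the operating system for any free port (an assumption of
        # this sketch, not necessarily what HTTPPath does).
        self._server = HTTPServer(("127.0.0.1", 0), handler)
        self._thread = threading.Thread(target=self._server.serve_forever, daemon=True)
        self._thread.start()
        self.url = "http://127.0.0.1:{}/".format(self._server.server_address[1])
        return self.url

    def stop(self):
        self._server.shutdown()      # ends serve_forever()
        self._server.server_close()  # releases the socket
        self._thread.join()

    def __enter__(self):
        # As a context manager, hand back the URL.
        return self.start()

    def __exit__(self, *exc_info):
        self.stop()
```

Used as `with ServeDirectory(some_dir) as url: ...`, the server runs only for the duration of the block; calling start() and stop() by hand gives the same effect across a longer test fixture.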

class datalad.tests.utils.SilentHTTPHandler(*args, **kwargs)[source]

Bases: http.server.SimpleHTTPRequestHandler

A little adapter to silence the handler

log_message(format, *args)[source]

Log an arbitrary message.

This is used by all other logging functions. Override it if you have specific logging wishes.

The first argument, FORMAT, is a format string for the message to be logged. If the format string contains any % escapes requiring parameters, they should be specified as subsequent arguments (it’s just like printf!).

The client ip and current date/time are prefixed to every message.
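
As a hedged illustration of overriding log_message, the sketch below shows two subclasses of http.server.SimpleHTTPRequestHandler: one that drops every message, and one that expands the printf-style template and stores the result. Both class names are hypothetical and chosen for this example only.

```python
from http.server import SimpleHTTPRequestHandler


class QuietHandler(SimpleHTTPRequestHandler):
    """Hypothetical handler that discards all log output."""

    def log_message(self, format, *args):
        # Intentionally do nothing: this silences request and error logging,
        # because the other logging methods funnel through log_message.
        pass


class RecordingHandler(SimpleHTTPRequestHandler):
    """Hypothetical handler that collects log lines instead of printing them."""

    messages = []

    def log_message(self, format, *args):
        # Expand the printf-style template with its arguments, then keep it.
        self.messages.append(format % args)
```

A recording variant can be handy in tests that want to assert on what was requested without writing to stderr.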

datalad.tests.utils.assert_dict_equal(d1, d2)[source]
datalad.tests.utils.assert_in_results(results, **kwargs)[source]

Verify that the particular combination of keys and values is found in one of the results

datalad.tests.utils.assert_message(message, results)[source]

Verify that each status dict in the results has a message

This only tests the message template string, and not a formatted message with args expanded.

datalad.tests.utils.assert_no_errors_logged(func, skip_re=None)[source]

Decorator around a function to assert that no errors are logged during its execution

datalad.tests.utils.assert_not_in_results(results, **kwargs)[source]

Verify that the particular combination of keys and values is not in any of the results
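
The matching idea behind assert_in_results and assert_not_in_results can be sketched as follows. This is an assumption-laden illustration, not the library code: the helper names matches and in_results are hypothetical, and plain equality is assumed for comparing values.

```python
def matches(result, **criteria):
    """True if `result` has every key in `criteria` with an equal value."""
    return all(key in result and result[key] == value
               for key, value in criteria.items())


def in_results(results, **criteria):
    """True if at least one result dict satisfies all criteria at once."""
    return any(matches(result, **criteria) for result in results)


results = [
    {"action": "get", "status": "ok", "path": "a.txt"},
    {"action": "get", "status": "error", "path": "b.txt"},
]

# The combination must be found within a single result, not spread over several.
found = in_results(results, status="ok", path="a.txt")
not_found = in_results(results, status="ok", path="b.txt")
```

Here `found` is true, while `not_found` is false because no single result has both status "ok" and path "b.txt".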

datalad.tests.utils.assert_re_in(regex, c, flags=0, match=True, msg=None)[source]

Assert that a container (list, str, etc.) contains an entry matching the regex
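
A rough sketch of what such a check might look like, assuming that match=True anchors the pattern at the start of each entry (re.match) and match=False searches anywhere in it (re.search), and that a bare string is treated as a single entry. The name check_re_in is hypothetical.

```python
import re


def check_re_in(regex, c, flags=0, match=True):
    """Raise AssertionError unless some entry of `c` matches `regex`."""
    # Assumption: a non-list, non-tuple value is treated as one entry.
    entries = c if isinstance(c, (list, tuple)) else [c]
    finder = re.match if match else re.search
    if not any(finder(regex, entry, flags=flags) for entry in entries):
        raise AssertionError("{!r} not found in {!r}".format(regex, entries))
```

With these assumptions, `check_re_in("ab", ["xx", "abc"])` passes, whereas `check_re_in("bc", "abc")` fails unless match=False is given.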

datalad.tests.utils.assert_repo_status(path, annex=None, untracked_mode='normal', **kwargs)[source]

Compare a repo status against (optional) exceptions.

Any file/directory that is not explicitly indicated must have state ‘clean’, i.e. no modifications and recorded in Git.

  • path (str or Repo) – If a str, the path to the repository’s base directory. Note that passing a Repo instance prevents annex detection. This can be useful for a non-initialized annex that a GitRepo points to.
  • annex (bool or None) – Explicitly set to True or False to indicate that an annex is (not) expected; set to None to autodetect whether there is an annex. Default: None.
  • untracked_mode ({'no', 'normal', 'all'}) – If and how untracked content is reported. The specification of untracked files that are OK to be found must match this mode. See Repo.status()
  • **kwargs – Files/directories that are OK to not be in ‘clean’ state. Each argument must be one of ‘added’, ‘untracked’, ‘deleted’, ‘modified’ and each value must be a list of filenames (relative to the root of the repository, in POSIX convention).
datalad.tests.utils.assert_result_count(results, n, **kwargs)[source]

Verify specific number of results (matching criteria, if any)

datalad.tests.utils.assert_result_values_cond(results, prop, cond)[source]

Verify that the values of all results for a given key in the status dicts fulfill condition cond.

  • results
  • prop (str) –
  • cond (callable) –
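
A minimal sketch of the condition check, assuming that every result dict contains the key prop and that cond returns a truthy value for acceptable values. The helper name values_fulfil and the example results are hypothetical.

```python
def values_fulfil(results, prop, cond):
    """Raise AssertionError if cond(result[prop]) is falsy for any result."""
    for result in results:
        if not cond(result[prop]):
            raise AssertionError(
                "{}={!r} does not fulfil the condition".format(prop, result[prop]))


results = [{"path": "a.txt"}, {"path": "b.txt"}]
# Passes: every path ends with ".txt".
values_fulfil(results, "path", lambda p: p.endswith(".txt"))
```
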
datalad.tests.utils.assert_result_values_equal(results, prop, values)[source]

Verify that the values of all results for a given key in the status dicts match the given sequence

datalad.tests.utils.assert_status(label, results)[source]

Verify that each status dict in the results has a given status label

label can be a sequence, in which case status must be one of the items in this sequence.
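
The single-label and sequence forms can be sketched like this. It is an illustration under assumptions, not the library code: the helper name check_status is hypothetical, and a str is taken to mean exactly one label.

```python
def check_status(label, results):
    """Raise AssertionError unless every result has an accepted status."""
    # A plain string names one label; any other sequence lists the alternatives.
    allowed = [label] if isinstance(label, str) else list(label)
    for result in results:
        if result.get("status") not in allowed:
            raise AssertionError(
                "unexpected status {!r} in {!r}".format(result.get("status"), result))


results = [{"status": "ok"}, {"status": "notneeded"}]
check_status(("ok", "notneeded"), results)  # passes: both labels are allowed
```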

datalad.tests.utils.assert_str_equal(s1, s2)[source]

Helper to compare two lines


Internal helper to verify that we are not decorating generator tests

datalad.tests.utils.get_convoluted_situation(path, repocls=<class ''>)[source]

Delayed parsing so it could be monkey patched etc


Here is what this does (assuming UNIX, locked):

.
├── directory_untracked
│   └── link2dir -> ../subdir
├── OBSCURE_FILENAME_file_modified
├── link2dir -> subdir
├── link2subdsdir -> subds_modified/subdir
├── link2subdsroot -> subds_modified
├── subdir
│   ├── annexed_file.txt -> ../.git/annex/objects/…
│   ├── file_modified
│   ├── git_file.txt
│   └── link2annex_files.txt -> annexed_file.txt
└── subds_modified
    ├── link2superdsdir -> ../subdir
    ├── subdir
    │   └── annexed_file.txt -> ../.git/annex/objects/…
    └── subds_lvl1_modified
        └── OBSCURE_FILENAME_directory_untracked
            └── untracked_file

When a system has no symlink support, the link2… components are not included.

datalad.tests.utils.get_most_obscure_supported_name(tdir, return_candidates=False)[source]

Return the most obscure filename that the filesystem would support under TEMPDIR

  • return_candidates (bool, optional) – if True, return a tuple of (good, candidates), where candidates are “partially” sorted starting from the trickiest considered

TODO: we might want to use it as a function where we would provide tdir.

Return digests (md5) and mtimes for all the files under target_path
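
One way to collect such information with the standard library is sketched below. The function name digests_and_mtimes and the return shape (two dicts keyed by file path) are assumptions for this example.

```python
import hashlib
import os


def digests_and_mtimes(target_path):
    """Return ({path: md5 hex digest}, {path: mtime}) for files under target_path."""
    digests, mtimes = {}, {}
    for root, _dirs, files in os.walk(target_path):
        for name in files:
            fpath = os.path.join(root, name)
            with open(fpath, "rb") as f:
                # Reads the whole file at once; fine for small test fixtures.
                digests[fpath] = hashlib.md5(f.read()).hexdigest()
            mtimes[fpath] = os.stat(fpath).st_mtime
    return digests, mtimes
```

Comparing the two dicts before and after an operation shows whether any file content or modification time changed.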


Get port of host in ssh_config.

Our tests depend on the host being defined in ssh_config, including its port. This method can be used by tests that want to check handling of an explicitly specified port.

Note that if host does not match a host in ssh_config, the default value of 22 is returned.

Parameters: host (str) –
Return type: port (int)
Raises: SkipTest if port cannot be found.

DEPRECATED and will be removed soon. Does nothing!

Originally intended as a decorator workaround for nose’s behaviour of redirecting sys.stdout, but we now monkey patch nose, so tests should no longer be skipped.

See issue reported here:


Mark test as an “integration” test, which generally does not need to be run

Such tests generally tend to be slower. Should be used in combination with @slow and @turtle if that is the case.


Test decorator marking a test as known to fail

This combines probe_known_failure and skip_known_failure giving the skipping precedence over the probing.


DEPRECATED. Stop using. Does nothing

Test decorator marking a test as known to fail in a direct mode test run

If is set to True behaves like known_failure. Otherwise the original (undecorated) function is returned.


Test decorator for a known test failure on Github’s macOS CI


Test decorator for a known test failure on Github’s Windows CI


Test decorator for a known test failure on macOS


Test decorator marking a test as known to fail on windows

On Windows behaves like known_failure. Otherwise the original (undecorated) function is returned.


Put repo into an adjusted branch if it is not already.

datalad.tests.utils.nok_startswith(s, prefix)[source]
datalad.tests.utils.ok_annex_get(ar, files, network=True)[source]

Helper to run .get decorated checking for correct operation

get passes through stderr from the ar to the user, which pollutes the screen while running tests

Note: This is currently no longer true, since usage of --json disables progress bars

datalad.tests.utils.ok_archives_caches(repopath, n=1, persistent=None)[source]

Given a path to a repository, verify the number of archives

  • repopath (str) – Path to the repository
  • n (int, optional) – Number of archive directories to expect
  • persistent (bool or None, optional) – If None, both persistent and non-persistent ones count.
datalad.tests.utils.ok_clean_git(path, annex=None, index_modified=[], untracked=[])[source]

Obsolete test helper. Use assert_repo_status() instead.

Still maps a few common cases to the new helper, to ease transition in extensions.

datalad.tests.utils.ok_endswith(s, suffix)[source]
datalad.tests.utils.ok_file_has_content(path, content, strip=False, re_=False, decompress=False, **kwargs)[source]

Verify that file exists and has expected content

datalad.tests.utils.ok_file_under_git(path, filename=None, annexed=False)[source]

Test if file is present and under git/annex control

If relative path provided, then test from current directory


Helper to verify that nothing has rewritten the config file

datalad.tests.utils.ok_startswith(s, prefix)[source]

Checks whether path is either a working or broken symlink


Patch our config with custom settings. Returns mock.patch cm

Only the merged configuration from all sources (global, local, dataset) will be patched. Source-constrained patches (e.g. only committed dataset configuration) are not supported.
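
The idea of returning a mock.patch context manager can be illustrated with unittest.mock.patch.dict on a plain dictionary. In this sketch the settings dict merely stands in for a merged configuration, and patch_settings is a hypothetical name; the real helper patches the library's own configuration object.

```python
from unittest.mock import patch

# Stand-in for a merged configuration (assumption for this example).
settings = {"datalad.tests.knownfailures.probe": False}


def patch_settings(overrides):
    """Return a context manager that applies `overrides` temporarily."""
    return patch.dict(settings, overrides)


with patch_settings({"datalad.tests.knownfailures.probe": True}):
    inside = settings["datalad.tests.knownfailures.probe"]
# On exit, patch.dict restores the original contents.
after = settings["datalad.tests.knownfailures.probe"]
```

Inside the block the override is visible; afterwards the original value is back.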


Test decorator allowing the test to pass when it fails and vice versa

Setting the config datalad.tests.knownfailures.probe to True tests whether the test is still failing. If it is not, an AssertionError is raised to indicate that the reason for the failure seems to be gone.
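
The inversion described above can be sketched as a small decorator. This is an illustration only: the name probe_failure is hypothetical, and the probe keyword argument stands in for the configuration lookup.

```python
import functools


def probe_failure(func, probe=True):
    """Pass when `func` fails; raise AssertionError when it succeeds."""
    if not probe:
        # Probing disabled: return the undecorated test.
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            func(*args, **kwargs)
        except Exception:
            # Still failing, as expected for a known failure.
            return None
        # Raised outside the try block so it is not swallowed above.
        raise AssertionError(
            "{} passed; the known failure seems to be gone".format(func.__name__))

    return wrapper


@probe_failure
def still_broken():
    raise ValueError("bug not fixed yet")


@probe_failure
def fixed_meanwhile():
    return None
```

Calling `still_broken()` returns quietly, while `fixed_meanwhile()` raises AssertionError, signalling that the known-failure marker can be removed.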

datalad.tests.utils.put_file_under_git(path, filename=None, content=None, annexed=False)[source]

Place file under git/annex and return used Repo


Override the git-annex version.

This temporarily masks the git-annex version present in external_versions and makes AnnexRepo forget its cached version information.


Temporarily override environment variables for git/git-annex dates.

Parameters: timestamp (int) – Unix timestamp.
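
A temporary override of this kind might be sketched as follows. Which variables the real helper sets is not stated here; this example assumes Git's GIT_AUTHOR_DATE and GIT_COMMITTER_DATE in Git's internal "<timestamp> <offset>" format, and the name override_git_dates is hypothetical.

```python
import os
from contextlib import contextmanager
from unittest.mock import patch


@contextmanager
def override_git_dates(timestamp):
    """Temporarily pin Git's author and committer dates to `timestamp`."""
    stamp = "{} +0000".format(timestamp)  # Unix time, UTC offset
    overrides = {"GIT_AUTHOR_DATE": stamp, "GIT_COMMITTER_DATE": stamp}
    # patch.dict restores the previous environment on exit.
    with patch.dict(os.environ, overrides):
        yield
```

Commits made by subprocesses started inside `with override_git_dates(1000000000): ...` would inherit the pinned dates through the environment.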

As discovered, an httpretty bug causes side effects on other tests on some Python versions, so we skip the test if such a problematic combination is detected



Skip test if adjusted branch is used by default on TMPDIR file system.


Skip test completely in NONETWORK settings

If used as a plain function rather than as a decorator, it can be called at the module level


Skip test completely under Windows


Skip test if uid == 0.

Note that on Windows (or anywhere else os.geteuid is not available) the test is _not_ skipped.
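
A sketch of this behaviour, assuming unittest.SkipTest is the skip signal; the decorator name skip_when_root is hypothetical.

```python
import functools
import os
import unittest


def skip_when_root(func):
    """Skip the test when the effective uid is 0, if that can be determined."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # os.geteuid does not exist on Windows; in that case run the test.
        geteuid = getattr(os, "geteuid", None)
        if geteuid is not None and geteuid() == 0:
            raise unittest.SkipTest("test must not run as root")
        return func(*args, **kwargs)

    return wrapper
```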


A little helper to skip some tests which require recent scrapy

datalad.tests.utils.skip_if_url_is_not_available(url, regex=None)[source]

Skips SSH tests if default connection/manager does not support multiplexing

e.g. currently on windows or if set via datalad.ssh.multiplex-connections config variable


Skips SSH tests if on Windows or if the environment variable DATALAD_TESTS_SSH is not set

Skip test when environment does not support symlinks

Perform a behavioral test instead of top-down logic, as on Windows this could be on or off on a case-by-case basis.
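
A behavioral probe of this kind can be sketched by actually trying to create a symlink in a temporary directory. The function name has_symlink_capability is an assumption for this example, and the real check may probe a different location.

```python
import os
import tempfile


def has_symlink_capability():
    """Return True if a symlink can be created in a fresh temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target")
        link = os.path.join(tmp, "link")
        with open(target, "w") as f:
            f.write("content")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError, AttributeError):
            # No privilege or no filesystem support for symlinks.
            return False
        return os.path.islink(link)
```

Probing the behaviour directly avoids guessing from the platform name, which matters where symlink support depends on privileges or filesystem settings.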


Mark test as slow, although not necessarily an integration or usecase test

Rule of thumb cut-off to mark as slow is 10 sec


Mark test as very slow, meaning to not run it on Travis due to its time limit

Rule of thumb cut-off to mark as turtle is 2 minutes


Mark test as a usecase that a user ran into and which (typically) caused a bug report to be filed or troubleshot

Should be used in combination with @slow and @turtle if slow.