Miscellaneous utilities to assist with testing

class datalad.tests.utils_pytest.HTTPPath(path, use_ssl=False, auth=None)[source]

Bases: object

Serve the content of a path via an HTTP URL.

This class can be used as a context manager, in which case it returns the URL.

Alternatively, the start and stop methods can be called directly.

  • path (str) – Directory with content to serve.

  • use_ssl (bool) –

  • auth (tuple) – Username, password


Start serving path via HTTP.


Stop serving path.
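The idea of serving a directory for the duration of a test can be sketched with the standard library alone. This is a minimal illustration of the technique and not how HTTPPath itself is implemented; the names serve_directory and QuietHandler are hypothetical, and SSL and authentication are left out.

```python
import contextlib
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class QuietHandler(SimpleHTTPRequestHandler):
    """Request handler that does not write access logs to stderr."""

    def log_message(self, format, *args):
        pass


@contextlib.contextmanager
def serve_directory(path):
    """Serve the directory `path` on a free local port and yield its URL."""
    handler = functools.partial(QuietHandler, directory=str(path))
    # Port 0 asks the operating system for any free port.
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield "http://127.0.0.1:{}/".format(server.server_address[1])
    finally:
        server.shutdown()      # stop serve_forever()
        server.server_close()  # release the listening socket
        thread.join()
```

Used as `with serve_directory(some_dir) as url:`, the server only lives inside the `with` block, which mirrors the context-manager usage described above.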

class datalad.tests.utils_pytest.SilentHTTPHandler(*args, **kwargs)[source]

Bases: SimpleHTTPRequestHandler

A little adapter to silence the handler

log_message(format, *args)[source]

Log an arbitrary message.

This is used by all other logging functions. Override it if you have specific logging wishes.

The first argument, FORMAT, is a format string for the message to be logged. If the format string contains any % escapes requiring parameters, they should be specified as subsequent arguments (it’s just like printf!).

The client ip and current date/time are prefixed to every message.

Unicode control characters are replaced with escaped hex before writing the output to stderr.

datalad.tests.utils_pytest.assert_cwd_unchanged(func, ok_to_chdir=False)[source]

Decorator to test whether the current working directory remains unchanged


ok_to_chdir (bool, optional) – If True, changing the directory is allowed: the decorator then does not raise an exception if the function chdir’ed, but only returns to the original directory.
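A decorator of this kind could look roughly like the following. This is a standard-library sketch under the assumption that the check is a simple before/after comparison of os.getcwd(); the name check_cwd_unchanged is hypothetical and the real helper may differ in detail.

```python
import functools
import os


def check_cwd_unchanged(func=None, *, ok_to_chdir=False):
    """Hypothetical decorator: detect (and undo) a changed working directory."""
    if func is None:
        # Called with arguments, e.g. @check_cwd_unchanged(ok_to_chdir=True)
        return functools.partial(check_cwd_unchanged, ok_to_chdir=ok_to_chdir)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        cwd_before = os.getcwd()
        try:
            return func(*args, **kwargs)
        finally:
            changed = os.getcwd() != cwd_before
            os.chdir(cwd_before)  # always return to the original directory
            if changed and not ok_to_chdir:
                raise AssertionError(
                    "%s changed the working directory" % func.__name__)

    return wrapper
```

With ok_to_chdir=True the wrapper silently restores the directory instead of failing the test.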

datalad.tests.utils_pytest.assert_dict_equal(d1, d2)[source]
datalad.tests.utils_pytest.assert_equal(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_false(expr, msg=None)[source]
datalad.tests.utils_pytest.assert_greater(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_greater_equal(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_in(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_in_results(results, **kwargs)[source]

Verify that the particular combination of keys and values is found in one of the results
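The matching rule can be illustrated on plain dictionaries. The sketch below assumes that a result matches when every given key is present with an equal value; the helper names are hypothetical and this is not the library's code.

```python
def matches(result, **criteria):
    """True if `result` contains every key of `criteria` with an equal value."""
    return all(key in result and result[key] == value
               for key, value in criteria.items())


def check_in_results(results, **criteria):
    """Fail unless at least one result matches all criteria."""
    assert any(matches(r, **criteria) for r in results), \
        "no result matches %r" % (criteria,)


def check_not_in_results(results, **criteria):
    """Fail if any result matches all criteria."""
    assert not any(matches(r, **criteria) for r in results), \
        "a result matches %r" % (criteria,)
```

For example, `check_in_results(results, status="ok", path="a")` passes only if a single result carries both values, not if they are spread over two results.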

datalad.tests.utils_pytest.assert_is(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_is_instance(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_is_none(expr, msg=None)[source]
datalad.tests.utils_pytest.assert_is_not(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_is_not_none(expr, msg=None)[source]
datalad.tests.utils_pytest.assert_message(message, results)[source]

Verify that each status dict in the results has a message

This only tests the message template string, and not a formatted message with args expanded.

datalad.tests.utils_pytest.assert_no_errors_logged(func, skip_re=None)[source]

Decorator around function to assert that no errors logged during its execution

datalad.tests.utils_pytest.assert_not_equal(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_not_in(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_not_in_results(results, **kwargs)[source]

Verify that the particular combination of keys and values is not in any of the results

datalad.tests.utils_pytest.assert_not_is_instance(first, second, msg=None)[source]
datalad.tests.utils_pytest.assert_re_in(regex, c, flags=0, match=True, msg=None)[source]

Assert that container (list, str, etc) contains entry matching the regex
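As a rough illustration, such a check can be built on the re module. The sketch assumes that match=True anchors the pattern at the start of each entry (re.match) while match=False searches anywhere (re.search); check_re_in is a hypothetical name.

```python
import re


def check_re_in(regex, c, flags=0, match=True):
    """Fail unless some entry of `c` (or `c` itself, if a str) matches `regex`."""
    entries = c if isinstance(c, (list, tuple)) else [c]
    finder = re.match if match else re.search
    if not any(finder(regex, entry, flags) for entry in entries):
        raise AssertionError("no entry matching %r in %r" % (regex, entries))
```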

datalad.tests.utils_pytest.assert_repo_status(path, annex=None, untracked_mode='normal', **kwargs)[source]

Compare a repo status against (optional) exceptions.

Any file or directory that is not explicitly indicated must be in the ‘clean’ state, i.e. have no modifications and be recorded in Git.

  • path (str or Repo) – If a str, the path to the repository’s base directory. Note that passing a Repo instance prevents annex detection. This might be useful in the case of a non-initialized annex that a GitRepo is pointing to.

  • annex (bool or None) – Explicitly set to True or False to indicate that an annex is (or is not) expected; set to None to autodetect whether there is an annex. Default: None.

  • untracked_mode ({'no', 'normal', 'all'}) – If and how untracked content is reported. The specification of untracked files that are OK to be found must match this mode. See Repo.status()

  • **kwargs – Files/directories that are OK to not be in ‘clean’ state. Each argument must be one of ‘added’, ‘untracked’, ‘deleted’, ‘modified’ and each value must be a list of filenames (relative to the root of the repository, in POSIX convention).

datalad.tests.utils_pytest.assert_result_count(results, n, **kwargs)[source]

Verify specific number of results (matching criteria, if any)

datalad.tests.utils_pytest.assert_result_values_cond(results, prop, cond)[source]

Verify that the values of all results for a given key in the status dicts fulfill condition cond.

  • results

  • prop (str) –

  • cond (callable) –

datalad.tests.utils_pytest.assert_result_values_equal(results, prop, values)[source]

Verify that the values of all results for a given key in the status dicts match the given sequence

datalad.tests.utils_pytest.assert_set_equal(first, second, msg=None)
datalad.tests.utils_pytest.assert_status(label, results)[source]

Verify that each status dict in the results has a given status label

label can be a sequence, in which case status must be one of the items in this sequence.
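The label-or-sequence behaviour can be sketched as follows, assuming each result is a dictionary with a 'status' key; check_status is a hypothetical stand-in, not the library function.

```python
def check_status(label, results):
    """Fail unless every result has a status equal to (or contained in) `label`."""
    allowed = [label] if isinstance(label, str) else list(label)
    for result in results:
        assert result.get("status") in allowed, \
            "unexpected status in %r (allowed: %r)" % (result, allowed)
```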

datalad.tests.utils_pytest.assert_str_equal(s1, s2)[source]

Helper to compare two lines

datalad.tests.utils_pytest.assert_true(expr, msg=None)[source]

Internal helper to verify that we are not decorating generator tests

datalad.tests.utils_pytest.eq_(first, second, msg=None)
datalad.tests.utils_pytest.get_annexstatus(ds, paths=None)[source]

Report a status for annexed contents. Assembles states for git content info, amended with annex info on ‘HEAD’ (to get the last committed stage and with it possibly vanished content), and lastly annex info with respect to the present worktree, to also get info on added/staged content. This fuses the info reported from:

  • git ls-files

  • git annex findref HEAD

  • git annex find --include '*'

datalad.tests.utils_pytest.get_convoluted_situation(path, repocls=<class ''>)[source]

Delayed parsing so it could be monkey patched etc


Here is what this does (assuming UNIX, locked):

.
├── directory_untracked
│   └── link2dir -> ../subdir
├── OBSCURE_FILENAME_file_modified
├── link2dir -> subdir
├── link2subdsdir -> subds_modified/subdir
├── link2subdsroot -> subds_modified
├── subdir
│   ├── annexed_file.txt -> ../.git/annex/objects/…
│   ├── file_modified
│   ├── git_file.txt
│   └── link2annex_files.txt -> annexed_file.txt
└── subds_modified
    ├── link2superdsdir -> ../subdir
    ├── subdir
    │   └── annexed_file.txt -> ../.git/annex/objects/…
    └── subds_lvl1_modified
        └── OBSCURE_FILENAME_directory_untracked
            └── untracked_file

When a system has no symlink support, the link2… components are not included.

datalad.tests.utils_pytest.get_most_obscure_supported_name(tdir, return_candidates=False)[source]

Return the most obscure filename that the filesystem would support under TEMPDIR

  • return_candidates (bool, optional) – if True, return a tuple of (good, candidates) where candidates are “partially” sorted from trickiest considered

TODO: we might want to use it as a function where we would provide tdir.


Return digests (md5) and mtimes for all the files under target_path


Get port of host in ssh_config.

Our tests depend on the host being defined in ssh_config, including its port. This method can be used by tests that want to check handling of an explicitly specified port.

Note that if host does not match a host in ssh_config, the default value of 22 is returned.

Skips test if port cannot be found.


host (str) –

Return type:

port (int)


DEPRECATED and will be removed soon. Does nothing!

Originally was intended as a decorator workaround for nose’s behaviour with redirecting sys.stdout, but we now monkey patch nose, so no test should be skipped any longer.

See issue reported here:

datalad.tests.utils_pytest.in_(first, second, msg=None)

Mark test as an “integration” test, which generally does not need to be run

Such tests generally tend to be slower. Should be used in combination with @slow and @turtle if that is the case.


Test decorator marking a test as known to fail

This combines probe_known_failure and skip_known_failure giving the skipping precedence over the probing.


DEPRECATED. Stop using. Does nothing

Test decorator marking a test as known to fail in a direct mode test run

If is set to True behaves like known_failure. Otherwise the original (undecorated) function is returned.


Test decorator for a known test failure on Github’s macOS CI


Test decorator for a known test failure on Github’s Windows CI


Test decorator for a known test failure on macOS


Test decorator marking a test as known to fail on windows

On Windows behaves like known_failure. Otherwise the original (undecorated) function is returned.


Put repo into an adjusted branch if it is not already.

datalad.tests.utils_pytest.neq_(first, second, msg=None)
datalad.tests.utils_pytest.nok_(expr, msg=None)
datalad.tests.utils_pytest.nok_startswith(s, prefix)[source]
datalad.tests.utils_pytest.ok_(expr, msg=None)
datalad.tests.utils_pytest.ok_annex_get(ar, files, network=True)[source]

Helper to run .get decorated checking for correct operation

get passes through stderr from the ar to the user, which pollutes the screen while running tests

Note: This is currently no longer true, since the use of --json disables progress bars

datalad.tests.utils_pytest.ok_archives_caches(repopath, n=1, persistent=None)[source]

Given a path to a repository, verify the number of archives

  • repopath (str) – Path to the repository

  • n (int, optional) – Number of archive directories to expect

  • persistent (bool or None, optional) – If None, both persistent and non-persistent ones count.

datalad.tests.utils_pytest.ok_clean_git(path, annex=None, index_modified=[], untracked=[])[source]

Obsolete test helper. Use assert_repo_status() instead.

Still maps a few common cases to the new helper, to ease transition in extensions.

datalad.tests.utils_pytest.ok_endswith(s, suffix)[source]
datalad.tests.utils_pytest.ok_file_has_content(path, content, strip=False, re_=False, decompress=False, **kwargs)[source]

Verify that file exists and has expected content

datalad.tests.utils_pytest.ok_file_under_git(path, filename=None, annexed=False)[source]

Test if file is present and under git/annex control

If relative path provided, then test from current directory


Helper to verify that nothing has rewritten the config file

datalad.tests.utils_pytest.ok_startswith(s, prefix)[source]

Checks whether path is either a working or broken symlink


Patch our config with custom settings. Returns mock.patch cm

Only the merged configuration from all sources (global, local, dataset) will be patched. Source-constrained patches (e.g. only committed dataset configuration) are not supported.
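The pattern of returning a mock.patch context manager can be illustrated with unittest.mock.patch.dict. In this sketch a plain dictionary named settings stands in for the merged configuration, and patch_settings and the key names are hypothetical.

```python
from unittest import mock

# Stand-in for a merged configuration; a plain dict for illustration only.
settings = {"example.option": "original"}


def patch_settings(overrides):
    """Return a context manager that temporarily applies `overrides`."""
    return mock.patch.dict(settings, overrides)
```

Inside `with patch_settings({"example.option": "patched"}):` the dictionary carries the new value; on exit the previous content is restored, including removal of keys that were added.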


Test decorator allowing the test to pass when it fails and vice versa

Setting the config datalad.tests.knownfailures.probe to True tests whether the test is still failing. If it is not, an AssertionError is raised to indicate that the reason for the failure seems to be gone.
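The inversion of pass and fail can be sketched as below. This shows only the probing branch and ignores the configuration switch; probe_failure is a hypothetical name and the real decorator may treat exception types differently.

```python
import functools


def probe_failure(func):
    """Hypothetical decorator: pass while `func` fails, fail once it passes."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            func(*args, **kwargs)
        except Exception:
            return None  # still failing, which is the expected outcome
        # Raised outside the try block, so it is not swallowed above.
        raise AssertionError(
            "%s passed although it is marked as a known failure"
            % func.__name__)

    return wrapper
```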

datalad.tests.utils_pytest.put_file_under_git(path, filename=None, content=None, annexed=False)[source]

Place file under git/annex and return used Repo

datalad.tests.utils_pytest.run_under_dir(func, newdir='.')[source]

Decorator to run tests under another directory

It is somewhat ugly since we can’t really chdir back to a directory which had a symlink in its path. So using this decorator has the potential side effect of moving the entire testing run under the dereferenced directory name.

The only alternative would be to instruct the testing framework (i.e. nose in our case ATM) to run a test by creating a new process with a new cwd

datalad.tests.utils_pytest.serve_path_via_http(tfunc, *targs, use_ssl=False, auth=None)[source]

Decorator which serves content of a directory via http url

  • path (str) – Directory with content to serve.

  • use_ssl (bool) – Flag whether to set up SSL encryption and return an HTTPS URL. This requires a valid certificate setup (which is tested for proper function) or it will cause a SkipTest to be raised.

  • auth (tuple or None) – If a (username, password) tuple is given, the server access will be protected via HTTP basic auth.


Override the git-annex version.

This temporarily masks the git-annex version present in external_versions and makes AnnexRepo forget its cached version information.


Temporarily override environment variables for git/git-annex dates.


timestamp (int) – Unix timestamp.
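A temporary date override of this kind can be sketched with mock.patch.dict on os.environ. Which variables the helper actually sets is not stated here, so the use of Git's GIT_AUTHOR_DATE and GIT_COMMITTER_DATE, the "@<timestamp> +0000" value format, and the name fixed_git_dates are all assumptions made for illustration.

```python
import os
from unittest import mock


def fixed_git_dates(timestamp):
    """Return a context manager pinning Git's author/committer dates."""
    # Assumed format: raw Unix timestamp with an explicit UTC offset.
    date = "@{} +0000".format(timestamp)
    return mock.patch.dict(
        os.environ,
        {"GIT_AUTHOR_DATE": date, "GIT_COMMITTER_DATE": date},
    )
```

The previous environment is restored when the `with` block ends, so other tests are not affected.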

datalad.tests.utils_pytest.skip_if(func, cond=True, msg=None, method='raise')[source]

Skip test for specific condition

  • cond (bool) – condition on which to skip

  • msg (str) – message to print if skipping

  • method (str) – either ‘raise’ or ‘pass’. Whether to skip by raising SkipTest or by just proceeding and simply not calling the decorated function. The latter is particularly meant for decorating single assertions in a test with method='pass', in order to skip just that assertion rather than the entire test.
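The difference between the two methods can be illustrated with a small stand-in. The sketch uses unittest.SkipTest as the skip exception, which is an assumption; skip_when is a hypothetical name and always takes its options as arguments.

```python
import functools
import unittest


def skip_when(cond=True, msg=None, method="raise"):
    """Hypothetical decorator factory mimicking the two skip methods."""
    if method not in ("raise", "pass"):
        raise ValueError("method must be 'raise' or 'pass'")

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if cond:
                if method == "raise":
                    # Report the whole test as skipped.
                    raise unittest.SkipTest(msg or "skip condition is true")
                # method == "pass": do not call the function, carry on.
                return None
            return func(*args, **kwargs)

        return wrapper

    return decorator
```

With method="pass", a decorated inner function holding a single assertion is simply not executed, while the surrounding test continues.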


Skip test if adjusted branch is used by default on TMPDIR file system.


Skip test completely in NONETWORK settings

If not used as a decorator, and just a function, could be used at the module level


Skip test completely under Windows


Skip test if uid == 0.

Note that on Windows (or anywhere else os.geteuid is not available) the test is not skipped.
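The platform caveat follows from how such a check is typically written. A minimal sketch, with running_as_root as a hypothetical helper name:

```python
import os


def running_as_root():
    """True only if os.geteuid exists on this platform and returns 0."""
    geteuid = getattr(os, "geteuid", None)
    return geteuid is not None and geteuid() == 0
```

Where os.geteuid is missing, the function returns False, so a skip condition based on it never triggers.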


A little helper to skip some tests which require recent scrapy

datalad.tests.utils_pytest.skip_if_url_is_not_available(url, regex=None)[source]
datalad.tests.utils_pytest.skip_known_failure(func, method='raise')[source]

Test decorator allowing to skip a test that is known to fail

Setting config datalad.tests.knownfailures.skip to a bool enables/disables skipping.


Skips SSH tests if default connection/manager does not support multiplexing

e.g. currently on windows or if set via datalad.ssh.multiplex-connections config variable


Skips SSH tests if on windows or if environment variable DATALAD_TESTS_SSH was not set

Skip test when environment does not support symlinks

Perform a behavioral test instead of top-down logic, as on windows this could be on or off on a case-by-case basis.


Mark test as a slow, although not necessarily integration or usecase test

Rule of thumb cut-off to mark as slow is 10 sec


Mark test as very slow, meaning to not run it on Travis due to its time limit

Rule of thumb cut-off to mark as turtle is 2 minutes


Mark test as a use case which a user ran into and which (typically) caused a bug report to be filed/troubleshot

Should be used in combination with @slow and @turtle if slow.

datalad.tests.utils_pytest.with_fake_cookies_db(func, cookies={})[source]

Mock the original cookies db with a fake one for the duration of the test


Decorator to use non-persistent MemoryKeyring instance

datalad.tests.utils_pytest.with_sameas_remote(func, autoenabled=False)[source]

Provide a repository with a git-annex sameas remote configured.

The repository will have two special remotes: r_dir (type=directory) and r_rsync (type=rsync). The rsync remote will be configured with --sameas=r_dir, and autoenabled if autoenabled is true.

datalad.tests.utils_pytest.with_tempfile(t, **tkwargs)[source]

Decorator function to provide a temporary file name and remove it at the end

  • set (To change the used directory without providing keyword argument 'dir') –


  • mkdir (bool, optional (default: False)) – If True, temporary directory created using tempfile.mkdtemp()

  • content (str or bytes, optional) – Content to be stored in the file created

  • wrapped (function, optional) – If set, function name used to prefix temporary file name

  • **tkwargs – All other arguments are passed into the call to tempfile.mk{,d}temp(), and the resultant temporary filename is passed as the first argument into the function t. If no ‘prefix’ argument is provided, it will be constructed using module and function names (‘.’ replaced with ‘_’).


@with_tempfile()
def test_write(tfile=None):
    with open(tfile, 'w') as f:
        f.write('silly test')

datalad.tests.utils_pytest.with_testsui(t, responses=None, interactive=True)[source]

Switch main UI to be ‘tests’ UI and possibly provide answers to be used

datalad.tests.utils_pytest.with_tree(t, tree=None, archives_leading_dir=True, delete=True, **tkwargs)[source]

Decorator to remove http*_proxy env variables for the duration of the test