datalad.support.annexrepo

Interface to git-annex by Joey Hess.

For further information on git-annex see https://git-annex.branchable.com/.

class datalad.support.annexrepo.AnnexInitOutput(done_future=None, encoding=None)[source]

Bases: WitlessProtocol, AssemblingDecoderMixIn

fd_infos: dict[int, tuple[str, Optional[bytearray]]]
pipe_data_received(fd, byts)[source]
proc_err = True
proc_out = True
process: Optional[subprocess.Popen]
class datalad.support.annexrepo.AnnexJsonProtocol(done_future=None, total_nbytes=None)[source]

Bases: WitlessProtocol

Subprocess communication protocol for annex … --json commands

Importantly, parsed JSON content is returned as a result, not string output.

This protocol also handles git-annex’s JSON-style progress reporting.

add_to_output(json_object)[source]
connection_made(transport)[source]
fd_infos: dict[int, tuple[str, Optional[bytearray]]]
pipe_data_received(fd, data)[source]
proc_err = True
proc_out = True
process: Optional[subprocess.Popen]
process_exited()[source]
class datalad.support.annexrepo.AnnexRepo(*args, **kwargs)[source]

Bases: GitRepo, RepoInterface

Representation of a git-annex repository.

Relative paths given to any of the class methods are interpreted relative to PWD if PWD is currently beneath the AnnexRepo’s base dir (self.path). If PWD is outside of the repository, relative paths are interpreted relative to self.path. Absolute paths are accepted either way.
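The path rule above can be illustrated with a small standalone sketch. It is not DataLad's implementation: the helper name resolve_repo_path is hypothetical, and treating PWD equal to the base dir as "beneath" it is an assumption.

```python
from pathlib import PurePosixPath


def resolve_repo_path(path, pwd, repo_path):
    """Return the absolute path that `path` refers to under the rule above.

    Hypothetical helper for illustration only.
    """
    path = PurePosixPath(path)
    pwd = PurePosixPath(pwd)
    repo_path = PurePosixPath(repo_path)
    if path.is_absolute():
        # Absolute paths are accepted as they are.
        return path
    # Assumption: PWD counts as "beneath" the repository if it is the
    # base dir itself or any directory below it.
    inside = pwd == repo_path or repo_path in pwd.parents
    base = pwd if inside else repo_path
    return base / path


print(resolve_repo_path("data/a.dat", "/repo/sub", "/repo"))
print(resolve_repo_path("data/a.dat", "/elsewhere", "/repo"))
print(resolve_repo_path("/abs/a.dat", "/elsewhere", "/repo"))
```

The first call resolves against PWD, the second against the repository base, and the third returns the absolute path unchanged.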

GIT_ANNEX_MIN_VERSION = '8.20200309'
WEB_UUID = '00000000-0000-0000-0000-000000000001'
add(files, git=None, backend=None, options=None, jobs=None, git_options=None, annex_options=None, update=False)[source]

Add file(s) to the repository.

Parameters:
  • files (list of str) – list of paths to add to the annex

  • git (bool) – if True, add to git instead of annex.

  • backend

  • options

  • update (bool) –

    --update option for git-add. From git’s manpage:

    Update the index just where it already has an entry matching <pathspec>. This removes as well as modifies index entries to match the working tree, but adds no new files.

    If no <pathspec> is given when --update option is used, all tracked files in the entire working tree are updated (old versions of Git used to limit the update to the current directory and its subdirectories).

    Note: Used only if a call to git-add instead of git-annex-add is performed

Return type:

list of dict or dict

add_(files, git=None, backend=None, options=None, jobs=None, git_options=None, annex_options=None, update=False)[source]

Like add, but returns a generator

add_url_to_file(file_, url, options=None, backend=None, batch=False, git_options=None, annex_options=None, unlink_existing=False)[source]

Add file from url to the annex.

Downloads the file from url and adds it to the annex. If annex already knows the file, records that it can be downloaded from url.

Note: Consider using the higher-level download_url instead.

Parameters:
  • file_ (str) –

  • url (str) –

  • options (list) – options to the annex command

  • batch (bool, optional) – initiate or continue with a batched run of annex addurl, instead of just calling a single git annex addurl command

  • unlink_existing (bool, optional) – By default, the call fails if the file already exists and is under git. If True, the existing file is removed first.

Returns:

Currently, only batch mode returns a dict representation of the JSON output returned by annex

Return type:

dict

add_urls(urls, options=None, backend=None, cwd=None, jobs=None, git_options=None, annex_options=None)[source]

Downloads each url to its own file, which is added to the annex.

Deprecated since version 0.17: Use add_url_to_file() or call_annex() instead.

Parameters:
  • urls (list of str) –

  • options (list, optional) – options to the annex command

  • cwd (string, optional) – working directory from within which to invoke git-annex

adjust(options=None)[source]

Enter an adjusted branch

This command is only available in a v6+ git-annex repository.

Parameters:

options (list of str) – currently requires ‘--unlock’ or ‘--fix’; default: --unlock

annexstatus(paths=None, untracked='all')[source]

Deprecated since version 0.16: Use get_content_annexinfo() or the test helper datalad.tests.utils_pytest.get_annexstatus() instead.

call_annex(args, files=None)[source]

Call annex and return standard output.

Parameters:
  • args (list of str) – Arguments to pass to annex.

  • files (list of str, optional) – File arguments to pass to annex. The advantage of passing these here rather than as part of args is that the call will be split into multiple calls to avoid exceeding the maximum command line length.

Return type:

standard output (str)

Raises:

See _call_annex() for information on Exceptions.
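The splitting of a long file list into several calls, as described for the files parameter, can be sketched as follows. This is only an illustration of the idea; chunk_files, base_cmd_len and max_len are hypothetical names, and the real limit and accounting used by DataLad are not shown in this document.

```python
def chunk_files(files, base_cmd_len, max_len):
    """Yield lists of files so that each command line stays within max_len.

    Hypothetical helper: base_cmd_len is the length of the command without
    file arguments, max_len an assumed command line limit.
    """
    chunk, used = [], base_cmd_len
    for f in files:
        cost = len(f) + 1  # +1 for the separating space
        if chunk and used + cost > max_len:
            # The next file would exceed the limit: emit the current chunk.
            yield chunk
            chunk, used = [], base_cmd_len
        chunk.append(f)
        used += cost
    if chunk:
        yield chunk


chunks = list(chunk_files(["a.txt", "b.txt", "c.txt"], base_cmd_len=10, max_len=22))
print(chunks)
```

Each yielded chunk would correspond to one separate invocation of the command with the same args.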

call_annex_items_(args, files=None, sep=None)[source]

Call git-annex, splitting output on sep.

Parameters:
  • args (list of str) – Arguments to pass to git-annex.

  • files (list of str, optional) – File arguments to pass to annex. The advantage of passing these here rather than as part of args is that the call will be split into multiple calls to avoid exceeding the maximum command line length.

  • sep (str, optional) – Split the output by str.split(sep) rather than str.splitlines.

Return type:

Generator that yields output items.

Raises:

See _call_annex() for information on Exceptions.
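The effect of the sep parameter can be shown on plain strings. The sketch below only mirrors the documented choice between str.splitlines and str.split(sep); iter_items is a hypothetical stand-in that works on captured output rather than on a running git-annex process.

```python
def iter_items(output, sep=None):
    """Yield output items, split as described for the `sep` parameter.

    Hypothetical stand-in operating on an already captured string.
    """
    if sep is None:
        # Default: one item per line.
        yield from output.splitlines()
    else:
        # Explicit separator, e.g. a NUL byte for names with newlines.
        yield from output.split(sep)


print(list(iter_items("a.dat\nb.dat\n")))
print(list(iter_items("one file.txt\0other.txt", sep="\0")))
```

A NUL separator is a common choice when file names may contain whitespace or newlines.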

call_annex_oneline(args, files=None)[source]

Call annex for a single line of output.

Before selecting the output line, this method filters out git-annex status output that is triggered by command execution but is not related to the particular command. This includes lines like:

(merging … into git-annex) (recording state …)

Parameters:
  • args (list of str) – Arguments to pass to annex.

  • files (list of str, optional) – File arguments to pass to annex. The advantage of passing these here rather than as part of args is that the call will be split into multiple calls to avoid exceeding the maximum command line length.

Returns:

Either a single output line, or an empty string if there was no output.

Return type:

str

Raises:
  • AssertionError if there is more than one line of output.

  • See _call_annex() for information on Exceptions.
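A minimal sketch of the filtering and single-line selection described above, working on captured text. The regular expression is an assumption about what the status lines look like, and select_oneline is a hypothetical name, not the method's actual code.

```python
import re

# Assumed shape of the unrelated status lines mentioned above.
_STATUS_LINE = re.compile(
    r"^\((?:merging .+ into git-annex|recording state ).*\)$"
)


def select_oneline(output):
    """Return the single remaining line, or '' if there is none."""
    lines = [
        ln for ln in output.splitlines()
        if ln.strip() and not _STATUS_LINE.match(ln.strip())
    ]
    # More than one remaining line is treated as an error.
    assert len(lines) <= 1, f"expected at most one line, got {len(lines)}"
    return lines[0] if lines else ""


sample = (
    "(merging origin/git-annex into git-annex...)\n"
    "SHA256E-s3--abc\n"
    "(recording state in git...)\n"
)
print(select_oneline(sample))
```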

call_annex_records(args, files=None)[source]

Call annex with --json* to request structured result records

This method behaves like call_annex(), but returns parsed result records.

Parameters:
  • args (list of str) – Arguments to pass to annex.

  • files (list of str, optional) – File arguments to pass to annex. The advantage of passing these here rather than as part of args is that the call will be split into multiple calls to avoid exceeding the maximum command line length.

Returns:

List of parsed result records.

Return type:

list(dict)

Raises:
  • CommandError if the call exits with a non-zero status. All result records captured until the non-zero exit are available in the exception's kwargs-dict attribute under key 'stdout_json'.

  • See _call_annex() for more information on Exceptions.
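To make "parsed result records" concrete, the sketch below turns line-delimited JSON text into a list of dicts, one per line. The sample lines and their keys are made up for illustration, and parse_records is a hypothetical helper rather than part of this module.

```python
import json


def parse_records(stdout):
    """Parse one JSON object per non-empty line into a list of dicts."""
    records = []
    for line in stdout.splitlines():
        line = line.strip()
        if not line:
            continue
        records.append(json.loads(line))
    return records


# Illustrative sample output; the keys shown are assumptions.
sample = (
    '{"command": "add", "file": "a.dat", "success": true}\n'
    '{"command": "add", "file": "b.dat", "success": false}\n'
)
records = parse_records(sample)
failed = [r["file"] for r in records if not r.get("success", True)]
print(failed)
```

Working with dicts instead of raw strings makes it straightforward to filter on fields such as a per-file success flag.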

call_annex_success(args, files=None)[source]

Call git-annex and return True if the call exits with code 0.

All parameters match those described for call_annex.

Return type:

bool

classmethod check_direct_mode_support()[source]

Does git-annex version support direct mode?

The result is cached at cls.supports_direct_mode.

Return type:

bool

classmethod check_repository_versions()[source]

Get information on supported and upgradable repository versions.

The result is cached at cls.repository_versions.

Returns:

supported -> list of supported versions (int)

upgradable -> list of upgradable versions (int)

Return type:

dict

copy_to(files, remote, options=None, jobs=None)[source]

Copy the actual content of files to remote

Parameters:
  • files (str or list of str) – path(s) to copy

  • remote (str) – name of remote to copy files to

Returns:

files successfully copied

Return type:

list of str

property default_backends
drop(files, options=None, key=False, jobs=None)[source]

Drops the content of annexed files from this repository.

Drops only if possible with respect to required minimal number of available copies.

Parameters:
  • files (list of str) – paths to drop

  • options (list of str, optional) – commandline options for the git annex drop command

  • jobs (int, optional) – how many jobs to run in parallel (passed to git-annex call)

Returns:

‘success’ item in each object indicates failure/success per file path.

Return type:

list(JSON objects)

drop_key(keys, options=None, batch=False)[source]

Drops the content of annexed files from this repository referenced by keys

Dangerous: it drops without checking for required minimal number of available copies.

Parameters:
  • keys (list of str, str) –

  • batch (bool, optional) – initiate or continue with a batched run of annex dropkey, instead of just calling a single git annex dropkey command

enable_remote(name, options=None, env=None)[source]

Enables use of an existing special remote

Parameters:
  • name (str) – name, the special remote was created with

  • options (list, optional) –

file_has_content(files, allow_quick=False, batch=False)[source]

Check whether files have their content present under annex.

Parameters:
  • files (list of str) – file(s) to check for being actually present.

  • allow_quick (bool, optional) – This is no longer supported.

Returns:

For each input file states whether file has content locally

Return type:

list of bool

find(files, batch=False)[source]

Run git annex find on file(s).

Parameters:
  • files (list of str) – files to find under annex

  • batch (bool, optional) – initiate or continue with a batched run of annex find, instead of just calling a single git annex find command. If any items in files are directories, this value is treated as False.

Returns:

A dictionary that maps each item in files to its git annex find result. Items without a successful result will be an empty string, and multi-item results (which can occur if files includes a directory) will be returned as a list.

fsck(paths=None, remote=None, fast=False, annex_options=None, git_options=None)[source]

Front-end for git-annex fsck

Parameters:
  • paths (list) – Limit operation to specific paths.

  • remote (str) – If given, the identified remote will be fsck’ed instead of the local repository.

  • fast (bool) – If True, typically means that no actual content is being verified, but tests are limited to the presence of files.

get(files, remote=None, options=None, jobs=None, key=False)[source]

Get the actual content of files

Parameters:
  • files (list of str) – paths to get

  • remote (str, optional) – from which remote to fetch content

  • options (list of str, optional) – commandline options for the git annex get command

  • jobs (int or None, optional) – how many jobs to run in parallel (passed to git-annex call). If not specified (None), then

  • key (bool, optional) – If provided file value is actually a key

Returns:

files

Return type:

list of dict

get_annexed_files(with_content_only=False, patterns=None)[source]

Get a list of files in annex

Parameters:
  • with_content_only (bool, optional) – Only list files whose content is present.

  • patterns (list, optional) – Globs to pass to annex’s --include=. Files that match any of these will be returned (i.e., they’ll be separated by --or).

Return type:

A list of POSIX file names

get_content_annexinfo(paths=None, init='git', ref=None, eval_availability=False, key_prefix='', **kwargs)[source]
Parameters:
  • paths (list or None) – Specific paths to query info for. If None, info is reported for all content.

  • init ('git' or dict-like or None) – If set to ‘git’ annex content info will amend the output of GitRepo.get_content_info(), otherwise the dict-like object supplied will receive this information and the present keys will limit the report of annex properties. Alternatively, if None is given, no initialization is done, and no limit is in effect.

  • ref (gitref or None) – If not None, annex content info for this Git reference will be produced, otherwise for the content of the present worktree.

  • eval_availability (bool) – If this flag is given, evaluate whether the content of any annex’ed file is present in the local annex.

  • **kwargs – Additional arguments for GitRepo.get_content_info(), if init is set to ‘git’.

Returns:

The keys/values match those reported by GitRepo.get_content_info(). In addition, the following properties are added to each value dictionary:

type

Can be ‘file’, ‘symlink’, ‘dataset’, ‘directory’, where ‘file’ is also used for annex’ed files (corrects a ‘symlink’ report made by get_content_info()).

key

Annex key of a file (if an annex’ed file)

bytesize

Size of an annexed file in bytes.

has_content

Bool whether a content object for this key exists in the local annex (with eval_availability)

objloc

pathlib.Path of the content object in the local annex, if one is available (with eval_availability)

Return type:

dict
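As an illustration of how such a result could be consumed, the sketch below sums the size of annexed files whose content is not present locally. The dictionary is hand-written sample data, not real output; using path objects as keys and integer-convertible bytesize values are assumptions.

```python
from pathlib import PurePosixPath

# Hand-written sample shaped like the description above (assumed layout).
info = {
    PurePosixPath("/repo/a.dat"): {
        "type": "file", "key": "MD5E-s4--aaaa.dat",
        "bytesize": 4, "has_content": True,
    },
    PurePosixPath("/repo/b.dat"): {
        "type": "file", "key": "MD5E-s10--bbbb.dat",
        "bytesize": 10, "has_content": False,
    },
    # A file tracked directly in Git carries no annex properties.
    PurePosixPath("/repo/README"): {"type": "file"},
}


def missing_bytes(info):
    """Sum 'bytesize' over annexed entries without local content."""
    return sum(
        int(props["bytesize"])
        for props in info.values()
        if "key" in props and not props.get("has_content", False)
    )


print(missing_bytes(info))
```

Note that has_content is only reported when eval_availability is enabled, so the sample assumes such a query.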

get_contentlocation(key, batch=False)[source]

Get location of the key content

Normally under .git/annex objects in indirect mode and within file tree in direct mode.

Unfortunately there is no (easy) way to discriminate situations when given key is simply incorrect (not known to annex) or its content not currently present – in both cases annex just silently exits with -1

Parameters:
  • key (str) – key

  • batch (bool, optional) – initiate or continue with a batched run of annex contentlocation

Returns:

path relative to the top directory of the repository. If no content is present, empty string is returned

Return type:

str

get_corresponding_branch(branch=None)[source]

Get the name of a potential corresponding branch.

Parameters:

branch (str, optional) – Name of the branch to report a corresponding branch for; defaults to active branch

Returns:

Name of the corresponding branch, or None if there is no corresponding branch.

Return type:

str or None

get_description(uuid=None)[source]

Get annex repository description

Parameters:

uuid (str, optional) – Remote (identified by uuid) to report the description for

Returns:

None returned if not found

Return type:

str or None

get_file_annexinfo(path, ref=None, eval_availability=False, key_prefix='')[source]

Query annex properties for a single file

This is the companion to get_content_annexinfo() and offers simplified usage for single-file queries (the result lookup based on a path is not necessary).

All keyword arguments have identical names and semantics as their get_content_annexinfo() counterparts. See their documentation for more information.

Parameters:

path (Path or str) – A single path to a file in the repository.

Returns:

Keys and values match the values returned by get_content_annexinfo(). If a file has no annex properties (i.e., a file that is directly checked into Git and is not annexed), the returned dictionary is empty.

Return type:

dict

Raises:
  • ValueError – When a given path is not matching a single file, but resolves to multiple files (e.g. a directory path)

  • NoSuchPathError – When the given path does not match any file in a repository

get_file_backend(files)[source]

Get the backend currently used for file(s).

Parameters:

files (list of str) –

Returns:

For each file in input list indicates the used backend by a str like “SHA256E” or “MD5”.

Return type:

list of str

get_file_key(files, batch=None)[source]

DEPRECATED. Use get_content_annexinfo()

See the method body for how to use get_content_annexinfo() to replace get_file_key().

For single-file queries it is recommended to consider get_file_annexinfo()

get_file_size(path)[source]
get_groupwanted(name)[source]

Get groupwanted expression for a group name

Parameters:

name (str) – Name of the groupwanted group

classmethod get_key_backend(key)[source]

Get the backend from a given key

get_metadata(files, timestamps=False, batch=False)[source]

Query git-annex file metadata

Parameters:
  • files (str or iterable(str)) – One or more paths for which metadata is to be queried. If one or more paths could be directories, batch=False must be given to prevent git-annex from giving an error. Due to technical limitations, such an error will lead to a hanging process.

  • timestamps (bool, optional) – If True, the output contains a ‘<metadatakey>-lastchanged’ key for every metadata item, reflecting the modification time, as well as a ‘lastchanged’ key with the most recent modification time of any metadata item.

  • batch (bool, optional) – If True, a metadata --batch process will be used, and only confirmed annex’ed files can be queried (else query will hang indefinitely). If False, invokes without --batch, and gives all files as arguments (this can be problematic with a large number of files).

Returns:

One tuple per file (could be more items than input arguments when directories are given). First tuple item is the filename, second item is a dictionary with metadata key/value pairs. Note that annex metadata tags are stored under the key ‘tag’, which is a regular metadata item that can be manipulated like any other.

Return type:

generator

get_preferred_content(property, remote=None)[source]

Get preferred content configuration of a repository or remote

Parameters:
  • property ({'wanted', 'required', 'group'}) – Type of property to query

  • remote (str, optional) – If not specified (None), returns the property for the local repository.

Returns:

The setting, or None if there is none.

Return type:

str

Raises:
  • ValueError – If an unknown property label is given.

  • CommandError – If the annex call errors.

get_remotes(with_urls_only=False, exclude_special_remotes=False)[source]

Get known (special-) remotes of the repository

Parameters:
  • exclude_special_remotes (bool, optional) – if True, don’t return annex special remotes

  • with_urls_only (bool, optional) – return only remotes which have urls

Returns:

remotes – List of names of the remotes

Return type:

list of str

static get_size_from_key(key)[source]

A little helper to obtain size encoded in a key

Returns:

size of the file or None if either no size is encoded in the key or key was None itself

Return type:

int or None

Raises:

ValueError – if key is considered invalid (at least its size-related part)
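The idea of reading a size out of a key can be sketched as below. This is a simplified parser written for illustration, not the method's code: it assumes keys of the form BACKEND-sSIZE--NAME, where the size field is optional, and size_from_key is a hypothetical name.

```python
def size_from_key(key):
    """Return the size encoded in an annex key, or None if there is none.

    Simplified sketch assuming the form BACKEND[-sSIZE][-...]--NAME.
    """
    if key is None:
        return None
    head, sep, _name = key.partition("--")
    if not sep:
        raise ValueError(f"invalid key (no '--' separator): {key!r}")
    # Fields after the backend name; the size field starts with 's'.
    for field in head.split("-")[1:]:
        if field.startswith("s"):
            if not field[1:].isdigit():
                raise ValueError(f"invalid size field in key: {key!r}")
            return int(field[1:])
    return None


print(size_from_key("SHA256E-s31390--f50d7ac4.txt"))
print(size_from_key("URL--http://example.com/file"))
```

The first call yields the encoded size, the second None because the key carries no size field.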

get_special_remotes(include_dead=False)[source]

Get info about all known (not just enabled) special remotes.

The present implementation is not able to report on special remotes that have only been configured in a private annex repo (annex.private=true).

Parameters:

include_dead (bool, optional) – Whether to include remotes announced dead.

Returns:

Keys are special remote UUIDs. Each value is a dictionary with configuration information git-annex has for the remote. This should include the ‘type’ and ‘name’ as well as any initremote parameters that git-annex stores.

Note: This is a faithful translation of git-annex:remote.log with one exception. For a special remote initialized with the --sameas flag, git-annex stores the special remote name under the “sameas-name” key; we copy this value under the “name” key so that callers don’t have to check two places for the name. If you need to detect whether you’re working with a sameas remote, the presence of either “sameas-name” or “sameas-uuid” is a reliable indicator.

Return type:

dict
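The handling of sameas remotes described in the note can be sketched on plain dictionaries. Both helper names are hypothetical and the sample record is made up; only the "sameas-name", "sameas-uuid" and "name" keys come from the text above.

```python
def normalize_special_remote(record):
    """Copy 'sameas-name' under 'name', as described in the note."""
    rec = dict(record)  # do not modify the caller's dict
    if "sameas-name" in rec:
        rec["name"] = rec["sameas-name"]
    return rec


def is_sameas(record):
    """Detect a sameas remote by the presence of either sameas key."""
    return "sameas-name" in record or "sameas-uuid" in record


# Made-up record for illustration.
raw = {"type": "directory", "sameas-name": "mirror", "sameas-uuid": "1234"}
rec = normalize_special_remote(raw)
print(rec["name"], is_sameas(rec))
```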

get_tracking_branch(branch=None, remote_only=False, corresponding=True)[source]

Get the tracking branch for branch if there is any.

By default returns the tracking branch of the corresponding branch if branch is a managed branch.

Parameters:
  • branch (str) – local branch to look up. If none is given, active branch is used.

  • remote_only (bool) – Don’t return a value if the upstream remote is set to “.” (meaning this repository).

  • corresponding (bool) – If True actually look up the corresponding branch of branch (also if branch isn’t explicitly given)

Returns:

(remote or None, refspec or None) of the tracking branch

Return type:

tuple

get_urls(file_, key=False, batch=False)[source]

Get URLs for a file/key

Parameters:
  • file_ (str) –

  • key (bool, optional) – Whether provided files are actually annex keys

Return type:

A list of URLs

git_annex_version = None
info(files, batch=False, fast=False)[source]

Provide annex info for file(s).

Parameters:

files (list of str) – files to look for

Returns:

Info for each file

Return type:

dict

init_remote(name, options)[source]

Creates a new special remote

Parameters:

name (str) – name of the special remote

is_available(file_, remote=None, key=False, batch=False)[source]

Check if file or key is available (from a remote)

If key or remote is misspecified, this does not fail but just keeps returning False, although possibly also complaining out loud ;)

Parameters:
  • file_ (str) – Filename or a key

  • remote (str, optional) – Remote which to check. If None, possibly multiple remotes are checked before positive result is reported

  • key (bool, optional) – Whether provided files are actually annex keys

  • batch (bool, optional) – Initiate or continue with a batched run of annex checkpresentkey

Returns:

with True indicating that file/key is available from (the) remote

Return type:

bool

is_crippled_fs()[source]

Return True if git-annex considers current filesystem ‘crippled’.

Return type:

True if on crippled filesystem, False otherwise

is_direct_mode()[source]

Return True if annex is in direct mode

Return type:

True if in direct mode, False otherwise.

is_initialized()[source]

Quick check whether this appears to be an annex-init’ed repo

is_managed_branch(branch=None)[source]

Whether branch is managed by git-annex.

ATM this returns True if on an adjusted branch of annex v6+ repository: either ‘adjusted/my_branch(unlocked)’ or ‘adjusted/my_branch(fixed)’

Note: The term ‘managed branch’ is used to make clear it’s meant to be more general than the v6+ ‘adjusted branch’.

Parameters:

branch (str) – name of the branch; default: active branch

Returns:

True if on a managed branch, False otherwise

Return type:

bool
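The two branch name forms mentioned above can be recognized with a simple pattern, which also yields the name of the corresponding branch. This sketch covers only those two forms and works on names alone; it is not how the method itself decides, and both function names are hypothetical.

```python
import re

# Matches only the two forms named above.
_ADJUSTED = re.compile(r"^adjusted/(?P<name>.+)\((?P<kind>unlocked|fixed)\)$")


def looks_managed(branch):
    """Return True if the branch name has one of the adjusted forms."""
    return _ADJUSTED.match(branch) is not None


def corresponding_name(branch):
    """Return the name embedded in an adjusted branch name, else None."""
    m = _ADJUSTED.match(branch)
    return m.group("name") if m else None


print(looks_managed("adjusted/my_branch(unlocked)"))
print(corresponding_name("adjusted/my_branch(fixed)"))
print(corresponding_name("main"))
```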

is_remote_annex_ignored(remote)[source]

Return True if remote is explicitly ignored

is_special_annex_remote(remote, check_if_known=True)[source]

Return whether remote is a special annex remote

Decides based on the presence of an annex- option and lack of a configured URL for the remote.
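The decision rule stated above can be sketched against a flat mapping of configuration keys. The mapping layout ("remote.NAME.OPTION" keys) and the helper name are assumptions made for this illustration; the method itself reads the repository configuration.

```python
def looks_like_special_remote(config, remote):
    """Apply the rule: has an annex- option and no configured URL."""
    prefix = f"remote.{remote}."
    options = {k[len(prefix):] for k in config if k.startswith(prefix)}
    has_annex_option = any(opt.startswith("annex-") for opt in options)
    return has_annex_option and "url" not in options


# Made-up configuration for illustration.
config = {
    "remote.origin.url": "https://example.com/repo.git",
    "remote.origin.annex-uuid": "aaaa",
    "remote.store.annex-uuid": "bbbb",
}
print(looks_like_special_remote(config, "origin"))
print(looks_like_special_remote(config, "store"))
```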

is_under_annex(files, allow_quick=False, batch=False)[source]

Check whether files are under annex control

Parameters:
  • files (list of str) – file(s) to check for being under annex

  • allow_quick (bool, optional) – This is no longer supported.

Returns:

For each input file states whether file is under annex

Return type:

list of bool

is_valid_annex(allow_noninitialized=False, check_git=True)[source]

Returns whether the underlying repository appears to be still valid

Note that this is almost identical to the classmethod is_valid_repo(). However, if we are testing an existing instance, we can save Path object creations. Since this testing is done a lot, this is relevant. Creation of the Path objects in is_valid_repo() takes nearly half the time of the entire function.

Also note that this method is bound to an instance but still class-dependent, meaning that a subclass cannot simply overwrite it. This is particularly important for the call from within __init__(), which in turn is called by the subclasses’ __init__. Using an overwrite would lead to the wrong thing being called.

classmethod is_valid_repo(path, allow_noninitialized=False)[source]

Return True if given path points to an annex repository

localsync(remote=None, managed_only=False)[source]

Consolidate the local git-annex branch and/or managed branches.

This method calls git annex sync to perform purely local operations that:

  1. Update the corresponding branch of any managed branch.

  2. Synchronize the local ‘git-annex’ branch with respect to particular or all remotes (as currently reflected in the local state of their remote ‘git-annex’ branches).

If a repository has git-annex’s ‘synced/…’ branches these will be updated. Otherwise, such branches that are created by git annex sync are removed again after the sync is complete.

Parameters:
  • remote (str or list, optional) – If given, specifies the name of one or more remotes to sync against. If not given, all remotes are considered.

  • managed_only (bool, optional) – Only perform a sync if a managed branch with a corresponding branch is detected. By default, a sync is always performed.

merge_annex(remote=None)[source]
migrate_backend(files, backend=None)[source]

Changes the backend used for file(s).

This changes the backend used for the key-value of files. Only files whose content is currently present are migrated. Note: There will be no notification if migrating fails due to the absence of a file’s content!

Parameters:
  • files (list) – files to migrate.

  • backend (str) – specify the backend to migrate to. If none is given, the default backend of this instance will be used.

precommit()[source]

Perform pre-commit maintenance tasks, such as closing all batched annexes since they might still need to flush their changes into index

repo_info(fast=False, merge_annex_branches=True)[source]

Provide annex info for the entire repository.

Parameters:
  • fast (bool, optional) – Pass --fast to git annex info.

  • merge_annex_branches (bool, optional) – Whether to allow git-annex, if needed, to merge annex branches, e.g. to make sure descriptions for git annex remotes are up to date

Returns:

Info for the repository, with keys matching the ones returned by annex

Return type:

dict

repository_versions = None
rm_url(file_, url)[source]

Record that the file is no longer available at the url.

Parameters:
  • file_ (str) –

  • url (str) –

set_default_backend(backend, persistent=True, commit=True)[source]

Set default backend

Parameters:
  • backend (str) –

  • persistent (bool, optional) – If persistent, would add/commit to .gitattributes. If not – would set within .git/config

set_groupwanted(name, expr)[source]

Set expr for the name groupwanted

set_metadata(files, reset=None, add=None, init=None, remove=None, purge=None, recursive=False)[source]

Manipulate git-annex file-metadata

Parameters:
  • files (str or list(str)) – One or more paths for which metadata is to be manipulated. The changes applied to each file item are uniform. However, the result may not be uniform across files, depending on the actual operation.

  • reset (dict, optional) – Metadata items matching keys in the given dict are (re)set to the respective values.

  • add (dict, optional) – The values of matching keys in the given dict appended to any possibly existing values. The metadata keys need not necessarily exist before.

  • init (dict, optional) – Metadata items for the keys in the given dict are set to the respective values, if the key is not yet present in a file’s metadata.

  • remove (dict, optional) – Values in the given dict are removed from the metadata items matching the respective key, if they exist in a file’s metadata. Non-existing values, or keys do not lead to failure.

  • purge (list, optional) – Any metadata item with a key matching an entry in the given list is removed from the metadata.

  • recursive (bool, optional) – If False, fail (with CommandError) when directory paths are given as files.

Returns:

JSON obj per modified file

Return type:

list
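The five operations can be modelled on a plain dictionary to make their differences visible. This is a toy model, not the method's behaviour in detail: the order in which the operations are applied, the use of lists of strings as values, and the helper name are all assumptions.

```python
def apply_metadata_ops(current, reset=None, add=None, init=None,
                       remove=None, purge=None):
    """Toy model of the operations; application order is an assumption."""
    md = {k: list(v) for k, v in current.items()}
    for key in purge or []:
        md.pop(key, None)              # drop the whole item
    for key, values in (reset or {}).items():
        md[key] = list(values)         # (re)set to the given values
    for key, values in (init or {}).items():
        if key not in md:              # only if not yet present
            md[key] = list(values)
    for key, values in (add or {}).items():
        bucket = md.setdefault(key, [])
        for v in values:               # append, key need not exist
            if v not in bucket:
                bucket.append(v)
    for key, values in (remove or {}).items():
        if key in md:                  # missing keys or values are fine
            md[key] = [v for v in md[key] if v not in values]
            if not md[key]:
                del md[key]
    return md


before = {"tag": ["raw"], "author": ["me"]}
after = apply_metadata_ops(
    before,
    add={"tag": ["checked"]},
    init={"author": ["you"], "year": ["2024"]},
    remove={"tag": ["raw"]},
)
print(after)
```

In this model, init leaves the existing author untouched but creates year, while add and remove together replace the tag value.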

set_metadata_(files, reset=None, add=None, init=None, remove=None, purge=None, recursive=False)[source]

Like set_metadata() but returns a generator

set_preferred_content(property, expr, remote=None)[source]

Set preferred content configuration of a repository or remote

Parameters:
  • property ({'wanted', 'required', 'group'}) – Type of property to set

  • expr (str) – Any expression or label supported by git-annex for the given property.

  • remote (str, optional) – If not specified (None), sets the property for the local repository.

Returns:

Raw git-annex output in response to the set command.

Return type:

str

Raises:
  • ValueError – If an unknown property label is given.

  • CommandError – If the annex call errors.

set_remote_dead(name)[source]

Announce to annex that remote is “dead”

set_remote_url(name, url, push=False)[source]

Set the URL a remote is pointing to

Sets the URL of the remote name. Requires the remote to already exist.

Parameters:
  • name (str) – name of the remote

  • url (str) –

  • push (bool) – if True, set the push URL, otherwise the fetch URL; if True, additionally set annexurl to url, to make sure annex uses it to talk to the remote, since access via fetch URL might be restricted.

supports_direct_mode = None
property supports_unlocked_pointers

Return True if repository version supports unlocked pointers.

sync(remotes=None, push=True, pull=True, commit=True, content=False, all=False, fast=False)[source]

This method is deprecated, use call_annex([‘sync’, …]) instead.

Synchronize local repository with remotes

Use this command when you want to synchronize the local repository with one or more of its remotes. You can specify the remotes (or remote groups) to sync with by name; the default if none are specified is to sync with all remotes.

Parameters:
  • remotes (str, list(str), optional) – Name of one or more remotes to be sync’ed.

  • push (bool) – By default, git pushes to remotes.

  • pull (bool) – By default, git pulls from remotes

  • commit (bool) – A commit is done by default. Disable to avoid committing local changes.

  • content (bool) – Normally, syncing does not transfer the contents of annexed files. This option causes the content of files in the work tree to also be uploaded and downloaded as necessary.

  • all (bool) – This option, when combined with content, makes all available versions of all files be synced, when preferred content settings allow

  • fast (bool) – Only sync with the remotes with the lowest annex-cost value configured

unannex(files, options=None)[source]

undo accidental add command

Use this to undo an accidental git annex add command. Note that for safety, the content of the file remains in the annex, until you use git annex unused and git annex dropunused.

Parameters:
  • files (list of str) –

  • options (list of str) –

Returns:

successfully unannexed files

Return type:

list of str

unlock(files)[source]

unlock files for modification

Note: This method is silent about errors in unlocking a file (e.g., the file has no content). Use the higher-level interface.unlock to get more informative reporting.

Parameters:

files (list of str) –

Returns:

successfully unlocked files

Return type:

list of str

property uuid

Annex UUID

Returns:

Returns the annex UUID, if there is any, or None otherwise.

Return type:

str

whereis(files, output='uuids', key=False, options=None, batch=False)[source]

Lists repositories that have actual content of file(s).

Parameters:
  • files (list of str) – files to look for

  • output ({'descriptions', 'uuids', 'full'}, optional) – If ‘descriptions’, a list of remote descriptions is returned for each file. If ‘full’, a dictionary of all fields, as returned by annex, is returned for each file.

  • key (bool, optional) – Whether provided files are actually annex keys

  • options (list, optional) – Options to pass into git-annex call

Returns:

if output == ‘descriptions’, contains a list of descriptions of remotes for each input file, one for each remote that was found by git-annex whereis, like:

u'me@mycomputer:~/where/my/repo/is [origin]' or
u'web' or
u'me@mycomputer:~/some/other/clone'

if output == ‘uuids’, returns a list of uuids.

if output == ‘full’, returns a dictionary with filenames as keys and a detailed record as values, e.g.:

{'00000000-0000-0000-0000-000000000001': {
  'description': 'web',
  'here': False,
  'urls': ['http://127.0.0.1:43442/about.txt', 'http://example.com/someurl']
}}

Return type:

list of list of unicode or dict
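A result in the ‘full’ form can be post-processed with ordinary dictionary code. The sketch below collects all URLs per file from hand-written sample data that reuses the record shown above; nesting the per-UUID record under a filename key is my reading of the description, and urls_by_file is a hypothetical helper.

```python
# Sample data reusing the record above, nested under an assumed filename.
full = {
    "about.txt": {
        "00000000-0000-0000-0000-000000000001": {
            "description": "web",
            "here": False,
            "urls": [
                "http://127.0.0.1:43442/about.txt",
                "http://example.com/someurl",
            ],
        }
    }
}


def urls_by_file(full):
    """Map each filename to the list of URLs found in its records."""
    return {
        fname: [url for rec in remotes.values() for url in rec.get("urls", [])]
        for fname, remotes in full.items()
    }


print(urls_by_file(full))
```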

class datalad.support.annexrepo.BatchedAnnex(annex_cmd, git_options=None, annex_options=None, path=None, json=False, output_proc=None, batch_opt='--batch')[source]

Bases: BatchedCommand

Container for an annex process which would allow for persistent communication

class datalad.support.annexrepo.BatchedAnnexes(batch_size=0, git_options=None)[source]

Bases: SafeDelCloseMixin, dict

Class to contain the registry of active batch’ed instances of annex for a repository

clear()[source]

Override just to make sure we don’t rely on __del__ to close all the pipes

close()[source]

Close communication to all the batched annexes

It does not remove them from the dictionary though

get(codename, annex_cmd=None, **kwargs)[source]

Return the value for key if key is in the dictionary, else default.

Return type:

BatchedAnnex

class datalad.support.annexrepo.GeneratorAnnexJsonNoStderrProtocol(done_future=None, total_nbytes=None)[source]

Bases: GeneratorAnnexJsonProtocol

fd_infos: dict[int, tuple[str, Optional[bytearray]]]
pipe_data_received(fd, data)[source]
process: Optional[subprocess.Popen]
process_exited()[source]
class datalad.support.annexrepo.GeneratorAnnexJsonProtocol(done_future=None, total_nbytes=None)[source]

Bases: GeneratorMixIn, AnnexJsonProtocol

add_to_output(json_object)[source]
fd_infos: dict[int, tuple[str, Optional[bytearray]]]
process: Optional[subprocess.Popen]
datalad.support.annexrepo.readline_json(stdout)[source]
datalad.support.annexrepo.readlines_until_ok_or_failed(stdout, maxlines=100)[source]

Read stdout until line ends with ok or failed
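The reading loop described above can be sketched with a text stream. This is an illustration under assumptions, not the function's code: the return value (a list of lines), the handling of end of input, and the name read_until_ok_or_failed are all hypothetical.

```python
import io


def read_until_ok_or_failed(stream, maxlines=100):
    """Collect lines until one ends with 'ok' or 'failed'.

    Sketch only: stops early at end of input or after maxlines lines.
    """
    lines = []
    for _ in range(maxlines):
        line = stream.readline()
        if not line:
            break  # end of input
        line = line.rstrip("\n")
        lines.append(line)
        if line.endswith(("ok", "failed")):
            break
    return lines


stream = io.StringIO("addurl a.dat\nprogress\nok\nextra\n")
print(read_until_ok_or_failed(stream))
```

The line after the terminating one stays unread in the stream, which matters when the same stream serves several batched requests.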