datalad_revolution.gitrepo

Amendment of the DataLad GitRepo base class

class datalad_revolution.gitrepo.RevolutionGitRepo(*args, **kwargs)

Bases: datalad.support.gitrepo.GitRepo

diff(fr, to, paths=None, untracked='all', ignore_submodules='no')

Like status(), but reports changes between two arbitrary revisions

Parameters:
  • fr (str) – Revision specification (anything that Git understands). Passing None considers anything in the target state as new.
  • to (str or None) – Revision specification (anything that Git understands), or None to compare to the state of the work tree.
  • paths (list or None) – If given, limits the query to the specified paths. To query all paths specify None, not an empty list.
  • untracked ({'no', 'normal', 'all'}) – If and how untracked content is reported when no ref was given: ‘no’: no untracked files are reported; ‘normal’: untracked files and entire untracked directories are reported as such; ‘all’: report individual files even in fully untracked directories.
  • ignore_submodules ({'no', 'other', 'all'}) –
Returns:

Each content item has an entry under its relative path within the repository. Each value is a dictionary with properties:

type

Can be ‘file’, ‘symlink’, ‘dataset’, ‘directory’

state

Can be ‘added’, ‘untracked’, ‘clean’, ‘deleted’, ‘modified’.

Return type:

dict
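As a rough illustration of how such a report might be consumed, the sketch below filters a dictionary shaped like the documented return value. The sample data, the use of plain strings as keys, and the helper name paths_in_state are all assumptions made for this example; nothing here calls diff() itself.

```python
# Hypothetical sample shaped like the documented return value of diff();
# the paths and property values are made up for illustration.
sample_diff = {
    "README.md": {"type": "file", "state": "modified"},
    "data/raw.csv": {"type": "file", "state": "added"},
    "sub": {"type": "dataset", "state": "modified"},
    "old.txt": {"type": "file", "state": "deleted"},
}


def paths_in_state(report, state, type_=None):
    """Return sorted paths whose 'state' (and optionally 'type') match."""
    return sorted(
        path
        for path, props in report.items()
        if props.get("state") == state
        and (type_ is None or props.get("type") == type_)
    )


# Only plain files that were modified, ignoring the modified subdataset.
modified_files = paths_in_state(sample_diff, "modified", type_="file")
```

The same helper (hypothetical, not part of the class) should work on any report that follows the type/state layout described above.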

diffstatus(fr, to, paths=None, untracked='all', ignore_submodules='no', _cache=None)

Like diff(), but reports the status of ‘clean’ content too

dirty
get_content_info(paths=None, ref=None, untracked='all')

Get identifier and type information from repository content.

This is a simplified front-end for git ls-files/ls-tree.

Both commands differ in their behavior when queried about subdataset paths. ls-files will not report anything, ls-tree will report on the subdataset record. This function uniformly follows the behavior of ls-tree (report on the respective subdataset mount).

Parameters:
  • paths (list(pathlib.PurePath)) – Specific paths, relative to the (resolved) repository root, to query info for. Paths must be normed to match the reporting done by Git, i.e. no parent dir components (à la “some/../this”). If none are given, info is reported for all content.
  • ref (gitref or None) – If given, content information is retrieved for this Git reference (via ls-tree), otherwise content information is produced for the present work tree (via ls-files).
  • untracked ({'no', 'normal', 'all'}) – If and how untracked content is reported when no ref was given: ‘no’: no untracked files are reported; ‘normal’: untracked files and entire untracked directories are reported as such; ‘all’: report individual files even in fully untracked directories.
Returns:

Each content item has an entry under its relative path within the repository. Each value is a dictionary with properties:

type

Can be ‘file’, ‘symlink’, ‘dataset’, ‘directory’

Note that the reported type will not always match the type of content committed to Git; rather, it reflects the nature of the content minus platform/mode specifics. For example, a symlink to a locked annexed file on Unix will be reported with type ‘file’, while a symlink to a file in Git or to a directory will be of type ‘symlink’.

gitshasum

SHASUM of the item as tracked by Git, or None, if not tracked. This could be different from the SHASUM of the file in the worktree, if it was modified.

Return type:

dict

Raises:

ValueError – In case of an invalid Git reference (e.g. ‘HEAD’ in an empty repository)
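The requirement that query paths be normed (relative, with no parent dir components) can be checked before calling this method. The following is a minimal sketch using only the standard library; the helper name is_normed_relative is an assumption for this example and not part of the class.

```python
from pathlib import PurePosixPath


def is_normed_relative(path):
    """Return True if path is relative and contains no '..' components.

    pathlib keeps '..' components as written (it does not collapse them),
    so they remain visible in PurePosixPath.parts.
    """
    p = PurePosixPath(path)
    return not p.is_absolute() and ".." not in p.parts


# Hypothetical candidate paths; only the normed ones are kept for the query.
candidates = ["some/this", "some/../this", "/abs/path"]
query_paths = [PurePosixPath(c) for c in candidates if is_normed_relative(c)]
```

Whether to reject or resolve non-normed paths is a design choice for the caller; this sketch simply filters them out.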

get_staged_paths()

Returns a list of any staged repository path(s)

This is a rather fast call, as it will not depend on what is going on in the worktree.

is_dirty(**kwargs)
save(message=None, paths=None, _status=None, **kwargs)

Save dataset content.

Parameters:
  • message (str or None) – A message to accompany the changeset in the log. If None, a default message is used.
  • paths (list or None) – Any content with path matching any of the paths given in this list will be saved. Matching will be performed against the dataset status (GitRepo.status()), or a custom status provided via _status. If no paths are provided, ALL non-clean paths present in the repo status or _status will be saved.
  • ignore_submodules ({'no', 'all'}) – If _status is not given, will be passed as an argument to Repo.status(). With ‘all’ no submodule state will be saved in the dataset. Note that submodule content will never be saved in their respective datasets, as this function’s scope is limited to a single dataset.
  • _status (dict or None) – If None, Repo.status() will be queried for the given ds. If a dict is given, its content will be used as a constraint. For example, to save only modified content, but no untracked content, set paths to None and provide a _status that has no entries for untracked content.
  • **kwargs

    Additional arguments that are passed to underlying Repo methods. Supported:

    • git : bool (passed to Repo.add())
    • ignore_submodules : {‘no’, ‘other’, ‘all’} (passed to Repo.status())
    • untracked : {‘no’, ‘normal’, ‘all’} (passed to Repo.status())
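The _status constraint described above (save modified content but no untracked content) could be prepared along these lines. This is a sketch under assumptions: the sample status dictionary and the helper drop_untracked are invented for illustration, and the final save() call is left as a comment because it needs a real repository instance.

```python
# Hypothetical status report in the documented type/state layout.
sample_status = {
    "analysis.py": {"type": "file", "state": "modified"},
    "notes.txt": {"type": "file", "state": "untracked"},
    "sub": {"type": "dataset", "state": "modified"},
    "LICENSE": {"type": "file", "state": "clean"},
}


def drop_untracked(status):
    """Return a copy of the report without entries in state 'untracked'."""
    return {
        path: props
        for path, props in status.items()
        if props.get("state") != "untracked"
    }


constrained = drop_untracked(sample_status)

# With a real repository object (here assumed to be named `repo`), the
# constrained report would be passed like this, per the signature above:
# repo.save(message="Save modifications only", paths=None, _status=constrained)
```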
save_(message=None, paths=None, _status=None, **kwargs)

Like save() but working as a generator.

status(paths=None, untracked='all', ignore_submodules='no')

Simplified git status equivalent.

Parameters:
  • paths (list or None) – If given, limits the query to the specified paths. To query all paths specify None, not an empty list. If a query path points into a subdataset, a report is made on the subdataset record within the queried dataset only (no recursion).
  • untracked ({'no', 'normal', 'all'}) – If and how untracked content is reported when no ref was given: ‘no’: no untracked files are reported; ‘normal’: untracked files and entire untracked directories are reported as such; ‘all’: report individual files even in fully untracked directories.
  • ignore_submodules ({'no', 'other', 'all'}) –
Returns:

Each content item has an entry under its relative path within the repository. Each value is a dictionary with properties:

type

Can be ‘file’, ‘symlink’, ‘dataset’, ‘directory’

state

Can be ‘added’, ‘untracked’, ‘clean’, ‘deleted’, ‘modified’.

Return type:

dict
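A report in this layout is easy to summarize. The sketch below groups paths by their state; the sample data and the helper name group_by_state are assumptions for this example, and status() itself is not called.

```python
from collections import defaultdict

# Hypothetical report shaped like the documented return value of status().
sample_status = {
    "a.txt": {"type": "file", "state": "clean"},
    "b.txt": {"type": "file", "state": "modified"},
    "c": {"type": "directory", "state": "untracked"},
    "d.txt": {"type": "file", "state": "modified"},
}


def group_by_state(report):
    """Map each state to the sorted list of paths reported in that state."""
    groups = defaultdict(list)
    for path, props in sorted(report.items()):
        groups[props["state"]].append(path)
    return dict(groups)


summary = group_by_state(sample_status)
```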