datalad_next.iter_collections.iter_gitdiff
- datalad_next.iter_collections.iter_gitdiff(path: Path, from_treeish: str | None, to_treeish: str | None, *, recursive: str = 'repository', find_renames: int | None = None, find_copies: int | None = None, yield_tree_items: str | None = None, eval_submodule_state: str = 'full', pathspecs: list[str] | GitPathSpecs | None = None) -> Generator[GitDiffItem, None, None]
Report differences between Git tree-ishes or tracked worktree content
This function is a wrapper around the Git commands diff-tree and diff-index. Therefore, most of their semantics also apply here.
The main differences with respect to the Git commands are: 1) uniform support for non-recursive, single tree reporting (no subtrees); and 2) support for submodule recursion.
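A minimal usage sketch, assuming datalad-next and Git are installed. The helper names `count_by_status` and `changes_since` are hypothetical and only serve the illustration; the sketch also assumes that the `status` attribute of a reported item is hashable.

```python
from collections import Counter
from pathlib import Path


def count_by_status(items):
    """Tally diff items by their ``status`` attribute (hypothetical helper)."""
    return Counter(item.status for item in items)


def changes_since(repo_dir: Path, from_treeish: str = "HEAD"):
    """Yield diff items that compare ``from_treeish`` against the worktree."""
    # Imported here so that count_by_status() can be tried without
    # datalad-next being installed.
    from datalad_next.iter_collections import iter_gitdiff

    # to_treeish=None requests a comparison against the worktree
    yield from iter_gitdiff(repo_dir, from_treeish, None)
```

With a repository in the current directory, `count_by_status(changes_since(Path.cwd()))` would give a per-status tally of uncommitted changes relative to `HEAD`.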
Notes on 'no' recursion mode
When comparing to the worktree, git diff-index always reports on subdirectories. For homogeneity with the report on a committed tree, a non-recursive mode emulation is implemented. It compresses all reports from a direct subdirectory into a single report on that subdirectory. The gitsha of that directory item will always be None. Moreover, no type or typechange inspection, or further filesystem queries are performed. Therefore, prev_gittype will always be None, and any change other than the addition of the directory will be labeled as a GitDiffStatus.modification.
- Parameters:
path (Path) -- Path of a directory in a Git repository to report on. This directory need not be the root directory of the repository, but must be part of the repository. If the directory is not the root directory of a non-bare repository, the iterator is constrained to items underneath that directory.
from_treeish (str or None) -- Git "tree-ish" that defines the comparison reference. If None, to_treeish must not be None (see its documentation for details).
to_treeish (str or None) -- Git "tree-ish" that defines the comparison target. If None, from_treeish must not be None, and that tree-ish will be compared against the worktree. If from_treeish is None, the given tree-ish is compared to its immediate parents (see git diff-tree documentation for details).
recursive ({'repository', 'submodules', 'no'}, optional) -- Behavior for recursion into subtrees. By default ('repository'), all trees within the repository underneath path are reported, but no tree within submodules. With 'submodules', recursion includes any submodule that is present. If 'no', only direct children are reported on.
find_renames (int, optional) -- If given, this defines the similarity threshold for detecting renames (see git diff-{index,tree} --find-renames). By default, no rename detection is done and reported items never have the rename status. Instead, a rename would be reported as a deletion and an addition.
find_copies (int, optional) -- If given, this defines the similarity threshold for detecting copies (see git diff-{index,tree} --find-copies). By default, no copy detection is done and reported items never have the copy status. Instead, a copy would be reported as an addition. This option always implies the use of the --find-copies-harder Git option that enables reporting of copy sources, even when they have not been modified in the same change. This is a very expensive operation for large projects, so use it with caution.
yield_tree_items ({'submodules', 'directories', 'all', None}, optional) -- Whether to yield an item for a type of subtree that will also be recursed into. For example, a submodule item, when submodule recursion is enabled. When disabled, subtree items (directories, submodules) will still be reported whenever there is no recursion into them. For example, submodule items are reported when recursive='repository', even when yield_tree_items=None.
eval_submodule_state ({"no", "commit", "full"}, optional) -- If 'full' (default), the state of a submodule is evaluated by considering all modifications ('--ignore-submodules=none'). If 'commit', the modification check is restricted to comparing the submodule's "HEAD" commit to the one recorded in the superdataset ('--ignore-submodules=dirty'). If 'no', the state of the subdataset is not evaluated ('--ignore-submodules=all').
pathspecs (optional) -- Patterns used to limit results to particular paths. Any pathspecs supported by Git can be used and are passed to the underlying git ls-files queries. Pathspecs are also supported for recursive reporting on submodules. In such a case, the results match those of individual queries with analogous pathspecs on the respective submodules (Git itself does not support pathspecs for submodule-recursive operations). For example, a submodule recursion with a pathspec *.jpg will yield reports on all JPG files in all submodules, even though a submodule path itself does not match *.jpg. On the other hand, a pathspec submoddir/*.jpg will only report on JPG files in the submodule at submoddir/, but on all JPG files in that submodule. As of version 1.5, the pathspec support for submodule recursion is preliminary and results should be carefully investigated.
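Several of the parameters above can be combined in one call, as in the following sketch. The helper names `describe` and `jpg_changes` are hypothetical, and the threshold value 50 is an arbitrary choice made on the assumption that the threshold is a percentage, as with Git's own --find-renames option.

```python
from pathlib import Path


def describe(item) -> str:
    """Render a diff item as one line; show both paths when they differ."""
    current = item.name or item.prev_name
    if item.prev_name and item.prev_name != current:
        return f"{item.prev_name} -> {current}"
    return str(current)


def jpg_changes(repo_dir: Path, from_treeish: str, to_treeish: str):
    """Yield one-line descriptions of changed JPG files, submodules included."""
    # Imported here so that describe() can be tried without datalad-next.
    from datalad_next.iter_collections import iter_gitdiff

    for item in iter_gitdiff(
        repo_dir,
        from_treeish,
        to_treeish,
        recursive="submodules",  # also descend into present submodules
        find_renames=50,         # assumed to be a similarity percentage
        pathspecs=["*.jpg"],     # matched within each submodule as well
    ):
        yield describe(item)
```

Because pathspec support for submodule recursion is described as preliminary, results of a call like this are worth checking against per-submodule queries.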
- Yields:
GitDiffItem -- The name and prev_name attributes of an item are each a str with the corresponding (relative) path, as reported by Git (in POSIX conventions).
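The compression of subdirectory reports described in the notes on 'no' recursion mode can be pictured with the following sketch. It is an illustration of the idea only, not the library's implementation: the function name `collapse_to_direct_children` and the `(path, status)` tuple format are hypothetical, and the special case of a newly added directory is left out, so every collapsed directory is labeled as a modification.

```python
from pathlib import PurePosixPath


def collapse_to_direct_children(reports):
    """Collapse (path, status) reports to one entry per direct child.

    Paths are relative and use POSIX conventions. Items directly inside
    the reported directory pass through unchanged. All reports from within
    one subdirectory become a single report on that subdirectory.
    """
    seen_dirs = set()
    for path, status in reports:
        parts = PurePosixPath(path).parts
        if len(parts) == 1:
            # a direct child: report as-is
            yield path, status
            continue
        top = parts[0]
        if top not in seen_dirs:
            seen_dirs.add(top)
            # simplification: no check for an added directory is made here
            yield top, "modification"
```

For instance, reports on `sub/x` and `sub/deep/y` would both be folded into a single report on `sub`.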