Python module reference
This module reference extends the manual with a comprehensive overview of the available functionality built into datalad. Each module in the package is documented by a general summary of its purpose and the list of classes and functions it provides.
High-level user interface
Dataset operations
api.Dataset(path) | Representation of a DataLad dataset/repository
api.create([path, initopts, force, …]) | Create a new dataset from scratch.
api.create_sibling(sshurl[, name, …]) | Create a dataset sibling on a UNIX-like Shell (local or SSH)-accessible machine
api.create_sibling_github(reponame[, …]) | Create dataset sibling on GitHub.
api.create_sibling_gitlab([path, site, …]) | Create dataset sibling at a GitLab site
api.drop([path, dataset, recursive, …]) | Drop file content from datasets
api.get([path, source, dataset, recursive, …]) | Get any dataset content (files/directories/subdatasets).
api.install([path, source, dataset, …]) | Install a dataset from a (remote) source.
api.publish([path, dataset, to, since, …]) | Publish a dataset to a known sibling.
api.remove([path, dataset, recursive, …]) | Remove components from datasets
api.save([path, message, dataset, …]) | Save the current state of a dataset
api.update([path, sibling, merge, follow, …]) | Update a dataset from a sibling.
api.uninstall([path, dataset, recursive, …]) | Uninstall subdatasets
api.unlock([path, dataset, recursive, …]) | Unlock file(s) of a dataset
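As an illustration of how the dataset operations listed above are reached through `datalad.api`, the following minimal sketch creates a dataset, adds a file, and saves it. The helper name `create_and_save_demo`, the directory `demo-dataset`, and the file `notes.txt` are hypothetical and chosen only for this example. The sketch assumes that DataLad, Git, and git-annex are installed; it returns `None` when the `datalad` package cannot be found.

```python
import importlib.util
from pathlib import Path


def create_and_save_demo(parent_dir):
    """Create a dataset below parent_dir, add one file, and save it.

    Returns the dataset path, or None if the datalad package is not installed.
    """
    if importlib.util.find_spec("datalad") is None:
        return None  # nothing to demonstrate without datalad
    import datalad.api as dl

    ds_path = Path(parent_dir) / "demo-dataset"  # hypothetical location
    ds = dl.create(path=str(ds_path))            # api.create yields a Dataset
    (ds_path / "notes.txt").write_text("first note\n")
    ds.save(message="Add notes file")            # bound form of api.save
    return ds_path
```

Calling `create_and_save_demo("/tmp")` would leave a dataset at `/tmp/demo-dataset` containing one saved file. Commands are also available as methods of a `Dataset` instance, which is why the sketch calls `ds.save(...)` rather than passing `dataset=` explicitly.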
Metadata handling
api.search([query, dataset, force_reindex, …]) | Search dataset metadata
api.metadata([path, dataset, …]) | Metadata reporting for files and entire datasets
api.aggregate_metadata([path, dataset, …]) | Aggregate metadata of one or more datasets for later query.
api.extract_metadata(types[, files, dataset]) | Run one or more of DataLad’s metadata extractors on a dataset or file.
Reproducible execution
api.run([cmd, dataset, inputs, outputs, …]) | Run an arbitrary shell command and record its impact on a dataset.
api.rerun([revision, since, dataset, …]) | Re-execute previous datalad run commands.
api.run_procedure([spec, dataset, discover, …]) | Run prepared procedures (DataLad scripts) on a dataset
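To make the record-keeping role of `api.run` more concrete, here is a hedged sketch that wraps a single call. The helper name `run_and_record` and the commit message are hypothetical; the keyword arguments `cmd`, `dataset`, `outputs`, and `message` are assumed to behave as in the signature listed above. The sketch returns `None` when the `datalad` package is not installed.

```python
import importlib.util


def run_and_record(dataset_path, command, outputs):
    """Run a shell command inside a dataset and record it with its outputs.

    Returns whatever datalad.api.run returns, or None if datalad is missing.
    """
    if importlib.util.find_spec("datalad") is None:
        return None  # nothing to demonstrate without datalad
    import datalad.api as dl

    return dl.run(
        cmd=command,              # shell command to execute
        dataset=dataset_path,     # dataset in which the change is recorded
        outputs=outputs,          # files the command is expected to modify
        message="Record command", # hypothetical commit message
    )
```

A call such as `run_and_record("demo-dataset", "echo hello > out.txt", ["out.txt"])` would record the command alongside the change it made, so that a later `api.rerun` on the same dataset could re-execute it.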
Plumbing commands
api.annotate_paths([path, dataset, …]) | Analyze and act upon input paths
api.clean([dataset, what, recursive, …]) | Clean up after DataLad (possible temporary files etc.)
api.clone(source[, path, dataset, …]) | Obtain a dataset (copy) from a URL or local directory
api.copy_file([path, dataset, recursive, …]) | Copy files and their availability metadata from one dataset to another.
api.create_test_dataset([path, spec, seed]) | Create test (meta-)dataset.
api.diff([path, fr, to, dataset, annex, …]) | Report differences between two states of a dataset (hierarchy)
api.download_url(urls[, dataset, path, …]) | Download content
api.ls(loc[, recursive, fast, all_, long_, …]) | List summary information about URLs and dataset(s)
api.push([path, dataset, to, since, data, …]) | Push a dataset to a known sibling.
api.sshrun(login, cmd[, port, ipv4, ipv6, …]) | Run command on remote machines via SSH.
api.siblings([action, dataset, name, url, …]) | Manage sibling configuration
api.subdatasets([path, dataset, fulfilled, …]) | Report subdatasets and their properties.
Miscellaneous commands
api.add_archive_content(archive[, annex, …]) | Add content of an archive under git annex control.
api.test([module, verbose, nocapture, pdb, stop]) | Run internal DataLad (unit)tests.
Plugins
DataLad can be customized by plugins. The following plugins are shipped with DataLad.
add_readme | add a README file to a dataset
addurls | Create and update a dataset from a list of URLs.
check_dates | Extension for checking dates within repositories.
export_archive | export a dataset as a compressed TAR/ZIP archive
export_to_figshare | export a dataset as a TAR/ZIP archive to figshare
no_annex | configure which dataset parts to never put in the annex
wtf | provide information about this DataLad installation
Support functionality
auto | Proxy basic file operations (e.g.
cmd | Wrapper for command and function calls, allowing for dry runs and output handling
consts | constants for datalad
log |
utils |
version | Defines version to be imported in the module and obtained from setup.py
support.gitrepo | Internal low-level interface to Git repositories
support.annexrepo | Interface to git-annex by Joey Hess.
support.archives | Various handlers/functionality for different types of files (e.g.
support.configparserinc |
customremotes.main |
customremotes.base | Base classes to custom git-annex remotes (e.g.
customremotes.archives | Custom remote to support getting the load from archives present under annex
Test infrastructure
tests.utils | Miscellaneous utilities to assist with testing
tests.utils_testrepos |
tests.heavyoutput | Helper to provide heavy load on stdout and stderr
Command line interface infrastructure
cmdline.main |
|
cmdline.helpers |
|
cmdline.common_args |