datalad clone

Synopsis

datalad clone [-h] [-d DATASET] [-D DESCRIPTION] [--reckless [{auto}]] SOURCE [PATH]

Description

Obtain a dataset (copy) from a URL or local directory

The purpose of this command is to obtain a new clone (copy) of a dataset and place it into a not-yet-existing or empty directory. As such, CLONE provides a strict subset of the functionality offered by INSTALL. Only a single dataset can be obtained, and immediate recursive installation of subdatasets is not supported. However, once a (super)dataset is installed via CLONE, any content, including subdatasets, can be obtained by a subsequent GET command.
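The clone-then-get workflow described above could be sketched as follows. The helper name clone_then_get is hypothetical and not part of DataLad. It only chains the two commands, and it assumes that at least one path inside the new clone is passed for retrieval.

```shell
# Hypothetical helper: clone a dataset, then fetch selected content.
# Usage: clone_then_get SOURCE DEST PATH [PATH ...]
clone_then_get() {
    src=$1
    dest=$2
    shift 2
    # Step 1: obtain the dataset itself (no file content yet).
    datalad clone "$src" "$dest" || return 1
    # Step 2: retrieve the requested paths from inside the new clone.
    # The subshell keeps the caller's working directory unchanged.
    (cd "$dest" && datalad get "$@")
}
```

The paths given after DEST are interpreted relative to the clone, because the get call runs from inside it.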

Primary differences over a direct git clone call are 1) the automatic initialization of a dataset annex (pure Git repositories are equally supported); 2) automatic registration of the newly obtained dataset as a subdataset (submodule), if a parent dataset is specified; 3) support for additional resource identifiers (DataLad resource identifiers as used on datasets.datalad.org, and RIA store URLs as used for store.datalad.org; see examples); and 4) automatic, configurable generation of alternative access URLs for common cases (such as appending ‘.git’ to the URL in case accessing the base URL failed).
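To illustrate the idea behind point 4, the sketch below lists a base URL followed by a ‘.git’ variant as a fallback candidate. The function name candidate_urls is hypothetical, and this is not DataLad's actual candidate generation, which is configurable and may produce other variants.

```shell
# Hypothetical sketch: print access URL candidates, one per line,
# in the order they might be tried.
candidate_urls() {
    url=$1
    # The URL as given is always the first candidate.
    printf '%s\n' "$url"
    case $url in
        *.git) ;;  # already ends in .git, nothing to add
        *) printf '%s\n' "${url%/}.git" ;;  # drop one trailing slash, append .git
    esac
}
```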

See also

http://handbook.datalad.org/en/latest/usecases/datastorage_for_institutions.html
More information on Remote Indexed Archive (RIA) stores

Examples

Install a dataset from GitHub into the current directory:

% datalad clone https://github.com/datalad-datasets/longnow-podcasts.git

Install a dataset into a specific directory:

% datalad clone https://github.com/datalad-datasets/longnow-podcasts.git myfavpodcasts

Install a dataset as a subdataset into the current dataset:

% datalad clone -d . https://github.com/datalad-datasets/longnow-podcasts.git

Install the main superdataset from datasets.datalad.org:

% datalad clone ///

Install a dataset identified by its ID from store.datalad.org:

% datalad clone ria+http://store.datalad.org#76b6ca66-36b1-11ea-a2e6-f0d5bf7b5561
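The RIA store URL in the last example combines a store location and a dataset ID, separated by ‘#’. The sketch below splits such a URL into these two parts. The function name split_ria_url is hypothetical, and it handles only the form shown in the example above; any other variants of the URL scheme are out of scope.

```shell
# Hypothetical sketch: split "ria+<store-url>#<dataset-id>" into
# the store URL (first output line) and the dataset ID (second line).
split_ria_url() {
    url=$1
    case $url in
        ria+*#*) ;;       # expected shape, continue
        *) return 1 ;;    # anything else is rejected
    esac
    rest=${url#ria+}      # drop the "ria+" prefix
    printf '%s\n%s\n' "${rest%%#*}" "${rest#*#}"
}
```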

Options

SOURCE

URL, DataLad resource identifier, local path or instance of dataset to be cloned. Constraints: value must be a string

PATH

path to clone into. If no PATH is provided, a destination path will be derived from the source URL, similar to git clone.
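As a rough illustration of how a destination name can be derived from a source, the sketch below takes the last path component and drops a trailing ‘.git’. The function name derive_dest is hypothetical. It approximates the common case only and is not the exact rule used by git clone or DataLad.

```shell
# Hypothetical sketch: derive a destination directory name from a source.
derive_dest() {
    src=${1%/}           # ignore a single trailing slash
    name=${src##*/}      # keep only the last path component
    printf '%s\n' "${name%.git}"   # drop a trailing ".git", if any
}
```

With the GitHub URL from the examples, this yields longnow-podcasts, which matches the directory name one would expect from the first example.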

-h, --help, --help-np

show this help message. --help-np forcefully disables the use of a pager for displaying the help message

-d DATASET, --dataset DATASET

(parent) dataset to clone into. If given, the newly cloned dataset is registered as a subdataset of the parent. Also, if given, relative paths are interpreted as being relative to the parent dataset, and not relative to the working directory. Constraints: Value must be a Dataset or a valid identifier of a Dataset (e.g. a path)

-D DESCRIPTION, --description DESCRIPTION

short description to use for a dataset location. Its primary purpose is to help humans to identify a dataset copy (e.g., “mike’s dataset on lab server”). Note that when a dataset is published, this information becomes available on the remote side. Constraints: value must be a string

--reckless [{auto}]

Set up the dataset to be able to obtain content in the cheapest/fastest possible way, even if this poses a potential risk to data integrity (e.g. hardlink files from a local clone of the dataset). Use with care, and limit to “read-only” use cases. With this flag the installed dataset will be marked as untrusted. The reckless mode is stored in a dataset’s local configuration under ‘datalad.clone.reckless’, and will be inherited by any of its subdatasets. Constraints: value must be one of (None, True, False, ‘auto’)
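Since the reckless mode is recorded in the dataset's local configuration, it can be inspected after cloning. The sketch below assumes that the local configuration is the repository's own Git configuration (.git/config) and reads the key with plain git. The function name reckless_mode is hypothetical.

```shell
# Hypothetical sketch: print the reckless mode recorded for a dataset.
# Assumes the value lives in the repository-local Git configuration.
# Usage: reckless_mode /path/to/dataset
reckless_mode() {
    # Prints the stored value; exits non-zero if the key is not set.
    git -C "$1" config --local --get datalad.clone.reckless
}
```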

Authors

datalad is developed by The DataLad Team and Contributors <team@datalad.org>.