datalad.api.create_sibling_github(reponame, dataset=None, recursive=False, recursion_limit=None, name='github', existing='error', github_login=None, github_passwd=None, github_organization=None, access_protocol='https', publish_depends=None, dryrun=False)

Create a dataset sibling on GitHub.

An existing GitHub project, or a project created via the GitHub website, can be configured as a sibling with the siblings command. Alternatively, this command can create a repository under a user’s GitHub account, or under any organization the user is a member of (given appropriate permissions). This is particularly helpful for recursive sibling creation for subdatasets. In such a case, a dataset hierarchy is represented as a flat list of GitHub repositories.

GitHub cannot host dataset content. However, in combination with other data sources (and siblings), publishing a dataset to GitHub can facilitate distribution and exchange, while still allowing any dataset consumer to obtain actual data content from alternative sources.

For GitHub authentication, user credentials can be given as arguments. Alternatively, they are obtained interactively or queried from the system’s credential store. Lastly, an OAuth token stored in the Git configuration under the variable hub.oauthtoken will be used automatically. Such a token can be obtained, for example, using the command-line GitHub interface by running: git hub setup (if no 2FA is used).
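As a rough illustration of the last option, the sketch below checks whether a token is present under hub.oauthtoken by calling `git config --get`. The helper name `read_hub_oauth_token` is hypothetical and not part of DataLad; how DataLad itself reads this variable is not shown here.

```python
import subprocess


def read_hub_oauth_token():
    """Return the value of hub.oauthtoken from the Git configuration, or None.

    Hypothetical helper for illustration only; it is not a DataLad function.
    """
    try:
        result = subprocess.run(
            ["git", "config", "--get", "hub.oauthtoken"],
            capture_output=True,
            text=True,
            check=False,  # a missing key makes git exit non-zero; that is not an error here
        )
    except FileNotFoundError:
        # git is not installed or not on PATH
        return None
    token = result.stdout.strip()
    return token if result.returncode == 0 and token else None


if __name__ == "__main__":
    if read_hub_oauth_token() is None:
        print("No hub.oauthtoken configured; credentials would have to be supplied another way.")
    else:
        print("A token is configured under hub.oauthtoken.")
```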

  • reponame (str) – GitHub repository name. When operating recursively, a suffix will be appended to this name for each subdataset.
  • dataset (Dataset or None, optional) – specify the dataset to create the publication target for. If no dataset is given, an attempt is made to identify the dataset based on the current working directory. [Default: None]
  • recursive (bool, optional) – if set, recurse into potential subdatasets. [Default: False]
  • recursion_limit (int or None, optional) – limit recursion into subdatasets to the given number of levels. [Default: None]
  • name (str, optional) – name to represent the GitHub repository in the local dataset installation. [Default: ‘github’]
  • existing ({'skip', 'error', 'reconfigure'}, optional) – desired behavior when already existing or configured siblings are discovered. ‘skip’: ignore; ‘error’: fail immediately; ‘reconfigure’: use the existing repository and reconfigure the local dataset to use it as a sibling. [Default: ‘error’]
  • github_login (str or None, optional) – GitHub user name or access token. [Default: None]
  • github_passwd (str or None, optional) – GitHub user password. [Default: None]
  • github_organization (str or None, optional) – If provided, the repository will be created under this GitHub organization. The respective GitHub user needs appropriate permissions. [Default: None]
  • access_protocol ({'https', 'ssh'}, optional) – Which access protocol/URL to configure for the sibling. [Default: ‘https’]
  • publish_depends (list of str or None, optional) – add a dependency such that the given existing sibling is always published prior to the new sibling. This equals setting a configuration item ‘remote.SIBLINGNAME.datalad-publish-depends’. Multiple dependencies can be given as a list of sibling names. [Default: None]
  • dryrun (bool, optional) – If this flag is set, no communication with GitHub is performed, and no repositories will be created. Instead, would-be repository names are reported for all relevant datasets. [Default: False]
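The parameters above can be combined as in the following minimal sketch, which previews the repository names for a dataset hierarchy without contacting GitHub. The helper names `github_sibling_kwargs` and `preview_github_siblings`, the example repository name, and the chosen option values are assumptions for illustration; only `create_sibling_github` and its keyword arguments come from the signature documented here. The return value of the call is not described in this section, so the sketch simply passes it through.

```python
def github_sibling_kwargs(reponame, organization=None, recursive=True, dryrun=True):
    """Collect keyword arguments for create_sibling_github.

    Hypothetical helper; the defaults chosen here are illustrative, not DataLad's.
    """
    kwargs = {
        "reponame": reponame,
        "recursive": recursive,      # also cover subdatasets
        "existing": "skip",          # ignore siblings that already exist
        "access_protocol": "ssh",    # configure SSH URLs instead of the 'https' default
        "dryrun": dryrun,            # only report would-be repository names
    }
    if organization is not None:
        kwargs["github_organization"] = organization
    return kwargs


def preview_github_siblings(dataset_path, reponame, **options):
    """Call create_sibling_github for the dataset at dataset_path (requires DataLad)."""
    # Imported lazily so that github_sibling_kwargs stays usable without DataLad installed.
    from datalad.api import Dataset, create_sibling_github

    return create_sibling_github(
        dataset=Dataset(dataset_path),
        **github_sibling_kwargs(reponame, **options),
    )


if __name__ == "__main__":
    # 'myproject' is a placeholder repository name.
    print(github_sibling_kwargs("myproject"))
```

Keeping dryrun=True as the helper's default means a first run only reports names; an actual creation has to be requested explicitly with dryrun=False.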