datalad.api.create_sibling_github(reponame, dataset=None, recursive=False, recursion_limit=None, name='github', existing='error', github_login=None, github_organization=None, access_protocol='https', publish_depends=None, private=False, dryrun=False, dry_run=False)

Create dataset sibling on GitHub.

An existing GitHub project, or a project created via the GitHub website, can be configured as a sibling with the siblings command. Alternatively, this command can create a repository under a user’s GitHub account, or under any organization the user is a member of (given appropriate permissions). This is particularly helpful for recursive sibling creation for subdatasets. In such a case, a dataset hierarchy is represented as a flat list of GitHub repositories.
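The first route, registering a repository that already exists on GitHub, might be sketched as follows. This is an illustration only: it assumes DataLad is installed, the helper names are hypothetical, and the dataset path and URL passed in are placeholders to be supplied by the caller. It relies on the siblings command accepting action='add' together with a name and url.

```python
def existing_repo_sibling_args(dataset_path, url, name="github"):
    """Collect keyword arguments for datalad.api.siblings() (hypothetical helper)."""
    return {
        "action": "add",          # register a new sibling
        "dataset": dataset_path,  # local dataset to configure
        "name": name,             # local name of the sibling
        "url": url,               # URL of the existing GitHub repository
    }


def register_existing_repo(dataset_path, url, name="github"):
    """Configure an already existing GitHub repository as a sibling."""
    # Imported lazily so the helper above can be used without DataLad installed.
    import datalad.api as dl
    return dl.siblings(**existing_repo_sibling_args(dataset_path, url, name))
```

No repository is created in this case; only the local sibling configuration changes.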

GitHub cannot host dataset content (but an LFS special remote could be used). However, in combination with other data sources (and siblings), publishing a dataset to GitHub can facilitate distribution and exchange, while still allowing any dataset consumer to obtain actual data content from alternative sources.

For GitHub authentication, a personal access token is needed. Such a token can be generated by navigating the GitHub Web UI through: Settings -> Developer settings -> Personal access tokens. We will first consult the Git configuration item hub.oauthtoken for tokens possibly available there, and then the system credential store.

If you provide github_login, we will consider only tokens associated with that GitHub login from hub.oauthtoken, and store/check the token in credential store as associated with that specific login name.
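One way to make a token available under hub.oauthtoken is to write it to the Git configuration. The sketch below is an assumption about how a user might do this with plain `git config`, not a description of what DataLad does internally; the helper names are hypothetical. Note that a token stored this way sits in a plain-text configuration file, so the system credential store may be preferable.

```python
import subprocess


def git_config_token_command(token, scope="--global"):
    """Build the `git config` command that stores a token under hub.oauthtoken."""
    return ["git", "config", scope, "hub.oauthtoken", token]


def store_token(token, scope="--global"):
    """Run the command; raises CalledProcessError if git reports a failure."""
    subprocess.run(git_config_token_command(token, scope), check=True)
```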

  • reponame (str) – GitHub repository name. When operating recursively, a suffix will be appended to this name for each subdataset.
  • dataset (Dataset or None, optional) – specify the dataset to create the publication target for. If no dataset is given, an attempt is made to identify the dataset based on the current working directory. [Default: None]
  • recursive (bool, optional) – if set, recurse into potential subdatasets. [Default: False]
  • recursion_limit (int or None, optional) – limit recursion into subdatasets to the given number of levels. [Default: None]
  • name (str, optional) – name to represent the GitHub repository in the local dataset installation. [Default: ‘github’]
  • existing ({'skip', 'error', 'reconfigure', 'replace'}, optional) – desired behavior when already existing or configured siblings are discovered. In this case, a dataset can be skipped (‘skip’), the sibling configuration can be updated (‘reconfigure’), or processing can be interrupted with an error (‘error’). DANGER ZONE: If ‘replace’ is used, an existing GitHub repository will be irreversibly removed, re-initialized, and the sibling (re-)configured (thus implies ‘reconfigure’). ‘replace’ could lead to data loss, so use with care. To minimize the possibility of data loss, DataLad will ask for confirmation in interactive mode, but will raise an exception in non-interactive mode. [Default: ‘error’]
  • github_login (str or None, optional) – GitHub user name or access token. [Default: None]
  • github_organization (str or None, optional) – If provided, the repository will be created under this GitHub organization. The respective GitHub user needs appropriate permissions. [Default: None]
  • access_protocol ({'https', 'ssh'}, optional) – Which access protocol/URL to configure for the sibling. [Default: ‘https’]
  • publish_depends (list of str or None, optional) – add a dependency such that the given existing sibling is always published prior to the new sibling. This equals setting a configuration item ‘remote.SIBLINGNAME.datalad-publish-depends’. Multiple dependencies can be given as a list of sibling names. [Default: None]
  • private (bool, optional) – If this flag is set, the repository created on GitHub will be marked as private and will be visible only to those granted access directly or through membership of a team, organization, etc. [Default: False]
  • dryrun (bool, optional) – Deprecated. Use the renamed dry_run parameter. [Default: False]
  • dry_run (bool, optional) – If this flag is set, no repositories will be created. Instead, tests for name collisions with existing projects will be performed, and would-be repository names are reported for all relevant datasets. [Default: False]
  • on_failure ({'ignore', 'continue', 'stop'}, optional) – behavior to perform on failure: ‘ignore’: any failure is reported, but does not cause an exception; ‘continue’: if any failure occurs, an exception will be raised at the end, but processing of other actions will continue for as long as possible; ‘stop’: processing will stop on the first failure and an exception is raised. A failure is any result with status ‘impossible’ or ‘error’. The raised exception is an IncompleteResultsError that carries the result dictionaries of the failures in its failed attribute. [Default: ‘continue’]
  • result_filter (callable or None, optional) – if given, each to-be-returned status dictionary is passed to this callable, and is only returned if the callable’s return value does not evaluate to False or a ValueError exception is raised. If the given callable supports **kwargs it will additionally be passed the keyword arguments of the original API call. [Default: None]
  • result_renderer ({'default', 'json', 'json_pp', 'tailored'} or None, optional) – format of return value rendering on stdout. [Default: None]
  • result_xfm ({'datasets', 'successdatasets-or-none', 'paths', 'relpaths', 'metadata'} or callable or None, optional) – if given, each to-be-returned result status dictionary is passed to this callable, and its return value becomes the result instead. This is different from result_filter, as it can perform arbitrary transformation of the result value. This is mostly useful for top-level command invocations that need to provide the results in a particular format. Instead of a callable, a label for a pre-crafted result transformation can be given. [Default: None]
  • return_type ({'generator', 'list', 'item-or-list'}, optional) – return value behavior switch. If ‘item-or-list’, a single value is returned instead of a one-item return value list, or a list in case of multiple return values. None is returned in case of an empty list. [Default: ‘list’]
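Several of the parameters above can be combined into a recursive dry run that reports would-be repository names without creating anything. The following is a minimal sketch, assuming DataLad is installed and a token is available; the helper names are hypothetical, and the repository name, dataset path, and organization are placeholders supplied by the caller. With on_failure='ignore', failures are reported in the returned results instead of raising an exception, so they can be inspected afterwards.

```python
def dry_run_args(reponame, dataset_path, organization=None):
    """Keyword arguments for a recursive dry run that creates no repositories."""
    args = {
        "reponame": reponame,
        "dataset": dataset_path,
        "recursive": True,       # also cover subdatasets
        "dry_run": True,         # only check names, create nothing
        "on_failure": "ignore",  # report failures in results, do not raise
        "return_type": "list",
    }
    if organization is not None:
        args["github_organization"] = organization
    return args


def failed_results(results):
    """Keep results whose status counts as a failure ('impossible' or 'error')."""
    return [r for r in results if r.get("status") in ("impossible", "error")]


def preview_github_siblings(reponame, dataset_path, organization=None):
    """Run the dry run and return all results plus the failed subset."""
    # Imported lazily so the helpers above can be used without DataLad installed.
    import datalad.api as dl
    results = dl.create_sibling_github(
        **dry_run_args(reponame, dataset_path, organization)
    )
    return results, failed_results(results)
```

If the dry run reports no failures, the same call with dry_run=False would create the repositories.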