datalad.api.meta_add

datalad.api.meta_add(metadata: str | JSONType, additionalvalues: str | JSONType | None = None, dataset: Dataset | str | None = None, allow_override: bool = False, allow_unknown: bool = False, allow_id_mismatch: bool = False, json_lines: bool = False, batch_mode: bool = False)

(JSONType abbreviates the fully expanded recursive union that appears in the generated signature, i.e. any JSON-serializable value: Union[str, int, float, bool, None, Dict[str, JSONType], List[JSONType]]. Dataset is datalad.distribution.dataset.Dataset.)

Add metadata to a dataset.

This command reads metadata from a source and adds it to a dataset. The source can be either a file or standard input. The metadata itself is a JSON-serialized dictionary, a JSON-serialized list of dictionaries, or individual dictionaries in JSON Lines format.

When called via the Python API, metadata can also be provided as a Python dictionary or a list of dictionaries.

If metadata is read from a source, additional parameters can overwrite or amend the information provided by the source.

The ADDITIONAL_VALUES arguments can be prefixed with ‘@’, in which case the prefixed argument is interpreted as a file name and the argument value is read from that file.

The metadata key “dataset-id” must be identical to the ID of the dataset that receives the metadata, unless -i or --allow-id-mismatch is given.

Examples
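A minimal sketch of constructing a single metadata record for the Python API. The dataset ID, version, agent details, and extractor values below are made-up placeholders; the required keys are those listed in the parameter description:

```python
import json

# A metadata record with the keys that meta-add requires
# (all values are hypothetical placeholders).
record = {
    "type": "dataset",
    "extractor_name": "metalad_core",
    "extractor_version": "1.0",
    "extraction_parameter": {},
    "extraction_time": 1700000000.0,
    "agent_name": "Jane Doe",
    "agent_email": "jane@example.com",
    "dataset_id": "2e2a8a70-3eaa-4dd9-a95b-27b95c9a8c67",
    "dataset_version": "0123456789abcdef0123456789abcdef01234567",
    "extracted_metadata": {"name": "example"},
}

# Serialized to a single line, the record is in the shape expected
# when reading from a file or stdin with --json-lines:
line = json.dumps(record)

# With datalad-metalad installed, the record could then be added, e.g.:
# import datalad.api as dl
# dl.meta_add(metadata=record, dataset="/path/to/dataset")
```

The same record, written one per line to a file, can be fed to the command-line form via a file path or “-” for standard input.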

Parameters:
  • metadata (str) – path of the file that contains the metadata that should be added to the metadata store (metadata records must be provided as a JSON-serialized metadata dictionary). The file may contain a single metadata record or a JSON array with multiple metadata records. If --json-lines is given, the file may also contain one dictionary per line. If the path is “-”, the metadata is read from standard input.
    Each metadata dictionary must contain the following keys: ‘type’, ‘extractor_name’, ‘extractor_version’, ‘extraction_parameter’, ‘extraction_time’, ‘agent_name’, ‘agent_email’, ‘dataset_id’, ‘dataset_version’, ‘extracted_metadata’.
    If the metadata is associated with a file, the key ‘path’ indicates the file path. If the metadata should refer to a sub-dataset element, that is, if an “aggregated” record should be stored (see meta-aggregate for more info), the key ‘dataset_path’ indicates the path of the sub-dataset from the root of the “containing dataset”. The containing dataset, a.k.a. the “root dataset”, is the dataset version, specified by dataset ID and version, that contains the given sub-dataset version at the path ‘dataset_path’. If the containing dataset is known, it can be specified with the keys ‘root_dataset_id’ and ‘root_dataset_version’.
    If the version of the root dataset that contains the given sub-dataset at the given path is not known, the “root_dataset_*” keys can be omitted. Such a situation might arise if the sub-dataset metadata was extracted in an old version of the sub-dataset and the relation of this old version to the root dataset is not known (we assume that the root dataset is the dataset to which the metadata is added). In this case the metadata is added with an “anonymous” root dataset, but with the given sub-dataset path. (This makes sense if the sub-dataset at the given path contains a version that is defined in the metadata, i.e. in dataset_version. The metadata will then be added to all versions that meet this condition.)
  • additionalvalues (str or None, optional) – A string that contains a JSON-serialized dictionary of key-value pairs. These key-value pairs are used in addition to the key-value pairs in the metadata dictionary to describe the metadata that should be added. If an additional key is already present in the metadata, an error is raised, unless -o, --allow-override is provided. In that case, the additional values override the values in metadata and a warning is issued. Note: if multiple records are provided in METADATA, the additional values are applied to all of them. [Default: None]
  • dataset (Dataset or None, optional) – dataset to which metadata should be added. If not provided, the dataset is assumed to be given by the current directory. [Default: None]
  • allow_override (bool, optional) – Allow the additional values to override values given in metadata. [Default: False]
  • allow_unknown (bool, optional) – Allow unknown keys. By default, unknown keys generate an error. If this switch is True, unknown keys are only reported and are ignored during processing. [Default: False]
  • allow_id_mismatch (bool, optional) – Allow insertion of metadata, even if the “dataset-id” in the metadata source does not match the ID of the target dataset. [Default: False]
  • json_lines (bool, optional) – Interpret metadata input as JSON Lines, i.e. expect one metadata record per line. This is the format used by commands like “datalad meta-dump”. [Default: False]
  • batch_mode (bool, optional) – Enable batch mode. In batch mode, metadata records are read from stdin, one record per line, and a result is written to stdout, one result per line. Batch mode is exited by sending an empty line consisting only of a newline; meta-add confirms the exit request by returning an empty line of its own. When this flag is given, the metadata file name should be set to “-” (minus). [Default: False]
  • on_failure ({'ignore', 'continue', 'stop'}, optional) – behavior to perform on failure: ‘ignore’ any failure is reported, but does not cause an exception; ‘continue’ if any failure occurs an exception will be raised at the end, but processing other actions will continue for as long as possible; ‘stop’: processing will stop on first failure and an exception is raised. A failure is any result with status ‘impossible’ or ‘error’. Raised exception is an IncompleteResultsError that carries the result dictionaries of the failures in its failed attribute. [Default: ‘continue’]
  • result_filter (callable or None, optional) – if given, each to-be-returned status dictionary is passed to this callable, and is only returned if the callable’s return value does not evaluate to False or a ValueError exception is raised. If the given callable supports **kwargs it will additionally be passed the keyword arguments of the original API call. [Default: None]
  • result_renderer – select rendering mode for command results. ‘tailored’ enables a command-specific rendering style that is typically tailored to human consumption, if there is one for a specific command, or otherwise falls back on the ‘generic’ result renderer; ‘generic’ renders each result in one line with key info like action, status, path, and an optional message; ‘json’ a complete JSON line serialization of the full result record; ‘json_pp’ like ‘json’, but pretty-printed spanning multiple lines; ‘disabled’ turns off result rendering entirely; ‘<template>’ reports any value(s) of any result properties in any format indicated by the template (e.g. ‘{path}’, compare with JSON output for all key-value choices). The template syntax follows the Python “format() language”. It is possible to report individual dictionary values, e.g. ‘{metadata[name]}’. If a 2nd-level key contains a colon, e.g. ‘music:Genre’, ‘:’ must be substituted by ‘#’ in the template, like so: ‘{metadata[music#Genre]}’. [Default: ‘generic’]
  • result_xfm ({'datasets', 'successdatasets-or-none', 'paths', 'relpaths', 'metadata'} or callable or None, optional) – if given, each to-be-returned result status dictionary is passed to this callable, and its return value becomes the result instead. This is different from result_filter, as it can perform arbitrary transformation of the result value. This is mostly useful for top-level command invocations that need to provide the results in a particular format. Instead of a callable, a label for a pre-crafted result transformation can be given. [Default: None]
  • return_type ({'generator', 'list', 'item-or-list'}, optional) – return value behavior switch. If ‘item-or-list’ a single value is returned instead of a one-item return value list, or a list in case of multiple return values. None is returned in case of an empty list. [Default: ‘list’]
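The merge behavior of additionalvalues described above (error on a duplicate key unless allow_override is set, in which case the additional value wins with a warning) can be sketched in plain Python. This is a simplified model of the documented semantics, not datalad's actual implementation:

```python
import warnings

def merge_additional_values(metadata: dict, additional: dict,
                            allow_override: bool = False) -> dict:
    """Combine a metadata record with additional key-value pairs.

    Mirrors the documented rule: a key already present in the metadata
    raises an error unless allow_override is True, in which case the
    additional value replaces it and a warning is issued.
    """
    result = dict(metadata)
    for key, value in additional.items():
        if key in result:
            if not allow_override:
                raise ValueError(
                    f"key {key!r} already present in metadata; "
                    "use allow_override to replace it")
            warnings.warn(f"overriding metadata key {key!r}")
        result[key] = value
    return result

record = {"dataset_id": "0000", "agent_name": "Jane Doe"}
# Adding a new key succeeds; re-adding "agent_name" would raise
# unless allow_override=True is passed.
merged = merge_additional_values(record, {"agent_email": "jane@example.com"})
```

As the parameter description notes, when METADATA contains multiple records, this merge would be applied to every record.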