datalad catalog
Synopsis
datalad catalog [-h] [--version]
Description
Generate a user-friendly web-based data catalog from structured metadata.
datalad catalog can be used to -create a new catalog,
-add and -remove metadata entries to/from an existing catalog,
start a local http server to -serve an existing catalog locally.
It can also -validate a metadata entry (validation is also
performed implicitly when adding), -set dataset properties
such as the home page to be shown by default, and -get
dataset properties such as the config, specific metadata,
or the home page.
Metadata can be provided to DataLad Catalog from any number of arbitrary metadata sources, as an aggregated set or as individual metadata items. DataLad Catalog has a dedicated schema (using the JSON Schema vocabulary) against which incoming metadata items are validated. This schema allows for standard metadata fields as one would expect for datasets of any kind (such as name, doi, url, description, license, authors, and more), as well as fields that support identification, versioning, dataset context and linkage, and file tree specification.
The output is a set of structured metadata files, as well as a Vue.js-based browser interface that understands how to render this metadata in the browser. These can be hosted on a platform of choice as a static webpage.
Note: in the catalog website, each dataset entry is displayed
under <main page>/#/dataset/<dataset_id>/<dataset_version>.
By default, the main page of the catalog will display a 404 error,
unless the default dataset is configured with datalad catalog-set
home.
Examples
CREATE a new catalog from scratch:
% datalad catalog-create -c /tmp/my-cat
ADD metadata to an existing catalog:
% datalad catalog-add -c /tmp/my-cat -m path/to/metadata.jsonl
SET a property of an existing catalog, such as the home page of an existing catalog - i.e. the first dataset displayed when navigating to the root URL of the catalog:
% datalad catalog-set -c /tmp/my-cat -i abcd -v 1234 home
SERVE the content of the catalog via a local HTTP server at http://localhost:8001:
% datalad catalog-serve -c /tmp/my-cat -p 8001
VALIDATE metadata against a catalog schema without adding it to the catalog:
% datalad catalog-validate -c /tmp/my-cat/-m path/to/metadata.jsonl'
GET a property of an existing catalog, such as the catalog configuration:
% datalad catalog-get -c /tmp/my-cat/ config
REMOVE a specific metadata record from an existing catalog:
% datalad catalog-remove -c /tmp/my-cat -i efgh -v 5678
TRANSLATE a metalad-extracted metadata item from a particular source structure into the catalog schema. A dedicated translator should be provided and exposed as an entry point (e.g. via a DataLad extension) as part of the 'datalad.metadata.translators' group.:
% datalad catalog-translate -c /tmp/my-cat -m path/to/metadata.jsonl
RUN A WORKFLOW for recursive metadata extraction (using datalad- metalad), translating metadata to the catalog schema, and adding the translated metadata to a new catalog:
% datalad catalog-workflow -t new -c /tmp/my-cat -d path/to/superdataset -e metalad_core
RUN A WORKFLOW for updating a catalog after registering a subdataset to the superdataset which the catalog represents. This workflow includes extraction (using datalad-metalad), translating metadata to the catalog schema, and adding the translated metadata to the existing catalog.:
% datalad catalog-workflow -t new -c /tmp/my-cat -d path/to/superdataset -s path/to/subdataset -e metalad_core
Options
-h, --help, --help-np
show this help message. --help-np forcefully disables the use of a pager for displaying the help message
--version
show the module and its version which provides the command