datalad_next.url_operations.HttpUrlOperations

class datalad_next.url_operations.HttpUrlOperations(cfg=None, headers: Dict | None = None)

Bases: UrlOperations

Handler for operations on http(s):// URLs

This handler is built on the requests package. For authentication, it employs datalad_next.utils.requests_auth.DataladAuth, an adaptor that consults the DataLad credential system in order to fulfill HTTP authentication challenges.

download(from_url: str, to_path: Path | None, *, credential: str | None = None, hash: list[str] | None = None, timeout: float | None = None) -> Dict

Download via HTTP GET request

See datalad_next.url_operations.UrlOperations.download() for parameter documentation and exception behavior.

Raises:

UrlOperationsResourceUnknown -- For download targets found absent.
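
The error handling described above might be wrapped as in the following minimal sketch. The helper name download_or_none is hypothetical and not part of datalad-next; it accepts any object offering the download() signature shown above, plus the exception class that signals an absent target. That hash names follow hashlib conventions (e.g. 'sha256') is an assumption here.

```python
from pathlib import Path
from typing import Optional, Type


def download_or_none(ops, url: str, dest: Path,
                     missing_exc: Type[Exception]) -> Optional[dict]:
    """Download `url` to `dest`; return None if the target is absent.

    Hypothetical helper: `ops` is any object with the download() signature
    documented above, `missing_exc` the exception signalling an absent target.
    """
    try:
        # Assumption: hash names follow hashlib conventions, e.g. 'sha256'
        return ops.download(url, dest, hash=['sha256'], timeout=30.0)
    except missing_exc:
        return None


# Usage, assuming datalad-next is installed and the host is reachable:
#   from datalad_next.url_operations import (
#       HttpUrlOperations, UrlOperationsResourceUnknown)
#   props = download_or_none(HttpUrlOperations(),
#                            "https://example.com/file.dat",
#                            Path("file.dat"),
#                            UrlOperationsResourceUnknown)
```

Passing the exception class as an argument keeps the helper importable without datalad-next; in real use one would likely catch UrlOperationsResourceUnknown directly.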

get_headers(headers: Dict | None = None) -> Dict

probe_url(url, timeout=10.0, headers=None)

Probe an HTTP(S) URL for redirects and authentication needs

This function performs a HEAD request against the given URL, waiting at most the given timeout duration for a server response.

Parameters:
  • url (str) -- URL to probe

  • timeout (float, optional) -- Maximum time to wait for a server response to the probe

  • headers (dict, optional) -- Any custom headers to use for the probe request. If none are provided, or the provided headers contain no 'user-agent' field, the default DataLad user agent is added automatically.

Returns:

The first value is the URL against which the final request was performed, after following any redirects and applying normalizations.

The second value is a mapping with a particular set of properties inferred from probing the webserver. The following key-value pairs are supported:

  • 'is_redirect' (bool), True if any redirection occurred. This boolean property is a more accurate test than comparing the input and output URLs.

  • 'status_code' (int), HTTP response code (of the final request in case of redirection).

  • 'auth' (dict), present if the final server response contained any 'www-authenticate' headers, typically the case for 401 responses. The dict contains a mapping of server-reported authentication scheme names (e.g., 'basic', 'bearer') to their respective properties (dict). These can be of any nature and number, depending on the respective authentication scheme. Most notably, they may contain a 'realm' property that can be used to determine suitable credentials for authentication.

Return type:

str or None, dict

Raises:

requests.RequestException -- May raise any exception of the requests package, most notably ConnectionError, Timeout, TooManyRedirects, etc.
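
A minimal sketch of how the returned pair might be interpreted. The helper describe_probe is hypothetical and not part of datalad-next; it relies only on the keys documented above ('is_redirect', 'status_code', 'auth'). The commented call at the end assumes a working datalad-next installation and network access.

```python
from typing import Optional


def describe_probe(final_url: Optional[str], props: dict) -> str:
    """Summarize a probe_url() result (hypothetical helper)."""
    parts = [f"{final_url} -> HTTP {props.get('status_code')}"]
    if props.get('is_redirect'):
        parts.append("redirected")
    # 'auth' is only present if the server sent 'www-authenticate' headers
    for scheme, scheme_props in props.get('auth', {}).items():
        realm = scheme_props.get('realm')
        suffix = f" (realm {realm!r})" if realm else ""
        parts.append(f"auth scheme {scheme!r}{suffix}")
    return "; ".join(parts)


# Usage, assuming datalad-next is installed and the host is reachable:
#   from datalad_next.url_operations import HttpUrlOperations
#   final_url, props = HttpUrlOperations().probe_url(
#       "https://example.com", timeout=5.0)
#   print(describe_probe(final_url, props))
```

Since probe_url() may raise any requests exception, a caller would typically wrap the call in a try/except for requests.RequestException.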

stat(url: str, *, credential: str | None = None, timeout: float | None = None) -> Dict

Gather information on a URL target, without downloading it

See datalad_next.url_operations.UrlOperations.stat() for parameter documentation and exception behavior.

Raises:

UrlOperationsResourceUnknown -- For access targets found absent.
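
As a rough illustration of working with the mapping returned by stat(): the sketch below assumes the mapping carries a 'content-length' entry when the server reports a size. That key is an assumption and is not stated in this section; content_length is a hypothetical helper, not part of datalad-next.

```python
from typing import Optional


def content_length(props: dict) -> Optional[int]:
    """Return the reported size in bytes, or None if none was reported.

    Assumption: the size, if any, is stored under 'content-length'.
    """
    value = props.get('content-length')
    return int(value) if value is not None else None


# Usage, assuming datalad-next is installed and the host is reachable:
#   from datalad_next.url_operations import (
#       HttpUrlOperations, UrlOperationsResourceUnknown)
#   try:
#       props = HttpUrlOperations().stat("https://example.com/file.dat")
#       print(content_length(props))
#   except UrlOperationsResourceUnknown:
#       print("target not found")
```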