datalad_next.url_operations.UrlOperations
- class datalad_next.url_operations.UrlOperations(*, cfg: ConfigManager | None = None)
Bases: object
Abstraction for operations on URLs
Support for specific URL schemes can be implemented via sub-classes. Such classes must comply with the following conditions:
Any configuration look-up must be performed with the self.cfg property, which is guaranteed to be a ConfigManager instance.
When downloads are to be supported, implement the download() method and comply with the behavior described in its documentation.
This class provides a range of helper methods to aid computation of hashes and progress reporting.
- property cfg: ConfigManager
- delete(url: str, *, credential: str | None = None, timeout: float | None = None) -> Dict
Delete a resource identified by a URL
- Parameters:
url (str) -- Valid URL with any scheme supported by a particular implementation.
credential (str, optional) -- The name of a dedicated credential to be used for authentication in order to perform the deletion. Particular implementations may or may not require or support authentication. They also may or may not support automatic credential lookup.
timeout (float, optional) -- If given, specifies a timeout in seconds. If the operation is not completed within this time, a TimeoutError is raised. If timeout is None, the operation will never time out.
- Returns:
A mapping of property names to values for the deletion.
- Return type:
dict
- Raises:
UrlOperationsRemoteError -- Raised on any deletion-related error on the remote side, with a summary of the underlying issues as its message. It may carry a status code (e.g. HTTP status code) as its status_code property. Any underlying exception must be linked via the __cause__ property (e.g. raise UrlOperationsRemoteError(...) from ...).
UrlOperationsResourceUnknown -- Implementations that can distinguish several remote error types may raise a more specific exception instead of a general UrlOperationsRemoteError: UrlOperationsInteractionError for general issues in communicating with the remote side; UrlOperationsAuthenticationError for errors related to (failed) authentication at the remote; UrlOperationsAuthorizationError for lack of authorization to access a particular resource or perform a particular operation; UrlOperationsResourceUnknown if the target of an operation does not exist.
TimeoutError -- If timeout is given and the operation does not complete within the number of seconds specified by timeout.
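The error contract above can be sketched as follows. This is a minimal, self-contained illustration and not datalad_next code: the two exception classes are local stand-ins that only borrow the documented names, their constructor signature and subclass relationship are assumptions, and delete_local is a hypothetical helper that treats a local path as the "remote".

```python
import os
import tempfile
from typing import Optional


# Local stand-ins mirroring the documented exception names.
# Constructor signature and inheritance are assumptions of this sketch.
class UrlOperationsRemoteError(Exception):
    def __init__(self, url: str, message: str,
                 status_code: Optional[int] = None):
        super().__init__(message)
        self.url = url
        self.status_code = status_code


class UrlOperationsResourceUnknown(UrlOperationsRemoteError):
    pass


def delete_local(path: str) -> dict:
    """Hypothetical delete operation on a local path."""
    try:
        os.unlink(path)
    except FileNotFoundError as e:
        # The target does not exist: raise the more specific error type
        # and link the underlying exception via `from`.
        raise UrlOperationsResourceUnknown(path, "target not found") from e
    except OSError as e:
        # Anything else is reported as a general remote error.
        raise UrlOperationsRemoteError(path, str(e)) from e
    return {}


missing = os.path.join(tempfile.mkdtemp(), "does-not-exist")
caught = None
try:
    delete_local(missing)
except UrlOperationsRemoteError as e:
    # Catching the general type also catches the specific one;
    # the original error stays reachable through __cause__.
    caught = e
```

Because the specific error is modeled as a subclass here, a caller that only handles the general type still sees every failure, while the linked cause preserves the low-level detail.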
- download(from_url: str, to_path: Path | None, *, credential: str | None = None, hash: list[str] | None = None, timeout: float | None = None) -> Dict
Download from a URL to a local file or stream to stdout
- Parameters:
from_url (str) -- Valid URL with any scheme supported by a particular implementation.
to_path (Path or None) -- A local platform-native path or None. If None, the downloaded data is written to stdout; otherwise it is written to a file at the given path. The path is assumed not to exist. Any existing file will be overwritten.
credential (str, optional) -- The name of a dedicated credential to be used for authentication in order to perform the download. Particular implementations may or may not require or support authentication. They also may or may not support automatic credential lookup.
hash (list(algorithm_names), optional) -- If given, must be a list of hash algorithm names supported by the hashlib module. A corresponding hash will be computed during the download (without reading the data twice), and included in the return value.
timeout (float, optional) -- If given, specifies a timeout in seconds. If the operation is not completed within this time, a TimeoutError is raised. If timeout is None, the operation will never time out.
- Returns:
A mapping of property names to values for the completed download. If hash algorithm names are provided, a corresponding key for each algorithm is included in this mapping, with the hexdigest of the corresponding checksum as the value.
- Return type:
dict
- Raises:
UrlOperationsRemoteError -- Raised on any download-related error on the remote side, with a summary of the underlying issues as its message. It may carry a status code (e.g. HTTP status code) as its status_code property. Any underlying exception must be linked via the __cause__ property (e.g. raise UrlOperationsRemoteError(...) from ...).
UrlOperationsResourceUnknown -- Implementations that can distinguish several remote error types may raise a more specific exception instead of a general UrlOperationsRemoteError: UrlOperationsInteractionError for general issues in communicating with the remote side; UrlOperationsAuthenticationError for errors related to (failed) authentication at the remote; UrlOperationsAuthorizationError for lack of authorization to access a particular resource or perform a particular operation; UrlOperationsResourceUnknown if the target of an operation does not exist.
TimeoutError -- If timeout is given and the operation does not complete within the number of seconds specified by timeout.
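The hash parameter described above implies a single pass over the data in which every chunk is both written and hashed. The following is a minimal sketch of that pattern using only the standard library; copy_with_hashes and chunk_size are hypothetical names, and in-memory streams stand in for the remote source and the local file. It is not the helper that UrlOperations itself provides.

```python
import hashlib
import io
from typing import BinaryIO, Dict, List, Optional


def copy_with_hashes(
    src: BinaryIO,
    dst: BinaryIO,
    hash: Optional[List[str]] = None,
    chunk_size: int = 65536,
) -> Dict[str, str]:
    """Copy src to dst, feeding each chunk to all requested hashers."""
    # One hasher per requested algorithm name, as understood by hashlib.
    hashers = {name: hashlib.new(name) for name in (hash or [])}
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        # The same chunk updates every hasher, so data is read only once.
        for h in hashers.values():
            h.update(chunk)
    # One key per algorithm, with the hexdigest as its value.
    return {name: h.hexdigest() for name, h in hashers.items()}


payload = b"hello world"
target = io.BytesIO()
props = copy_with_hashes(io.BytesIO(payload), target,
                         hash=["sha256", "sha512"])
```

The returned mapping has the shape described under Returns: one entry per algorithm name, and no hash entries when no algorithms are requested.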
- stat(url: str, *, credential: str | None = None, timeout: float | None = None) -> Dict
Gather information on a URL target, without downloading it
- Returns:
A mapping of property names to values of the URL target. The particular composition of properties depends on the specific URL. A standard property is 'content-length', indicating the size of a download.
- Return type:
dict
- Raises:
UrlOperationsRemoteError -- Raised on any access-related error on the remote side, with a summary of the underlying issues as its message. It may carry a status code (e.g. HTTP status code) as its status_code property. Any underlying exception must be linked via the __cause__ property (e.g. raise UrlOperationsRemoteError(...) from ...).
UrlOperationsResourceUnknown -- Implementations that can distinguish several remote error types may raise a more specific exception instead of a general UrlOperationsRemoteError: UrlOperationsInteractionError for general issues in communicating with the remote side; UrlOperationsAuthenticationError for errors related to (failed) authentication at the remote; UrlOperationsAuthorizationError for lack of authorization to access a particular resource or perform a particular operation; UrlOperationsResourceUnknown if the target of an operation does not exist.
TimeoutError -- If timeout is given and the operation does not complete within the number of seconds specified by timeout.
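As an illustration of the return value of stat(), the sketch below reports 'content-length' for a file:// URL without reading the file content. The function stat_file_url is hypothetical and covers only local files; it is a rough sketch under those assumptions and does not reproduce any datalad_next handler.

```python
import tempfile
from pathlib import Path
from typing import Dict
from urllib.parse import urlparse
from urllib.request import url2pathname


def stat_file_url(url: str) -> Dict:
    """Hypothetical stat for file:// URLs; inspects metadata only."""
    # Convert the URL path component back into a platform-native path.
    path = Path(url2pathname(urlparse(url).path))
    # Path.stat() queries file metadata and does not read the content.
    return {"content-length": path.stat().st_size}


with tempfile.TemporaryDirectory() as tmp:
    f = Path(tmp) / "data.bin"
    f.write_bytes(b"12345")
    props = stat_file_url(f.as_uri())
```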
- upload(from_path: Path | None, to_url: str, *, credential: str | None = None, hash: list[str] | None = None, timeout: float | None = None) -> Dict
Upload from a local file or stream to a URL
Whenever possible, uploads are performed atomically. This means that the destination will never see a partially uploaded file. It will either see the previous content (or nothing) or the newly uploaded content. Note: this is not supported by all implementations of URL operations.
- Parameters:
from_path (Path or None) -- A local platform-native path or None. If None, the upload data is read from stdin; otherwise it is read from a file at the given path.
to_url (str) -- Valid URL with any scheme supported by a particular implementation. The target is assumed to not conflict with existing content, and may be overwritten.
credential (str, optional) -- The name of a dedicated credential to be used for authentication in order to perform the upload. Particular implementations may or may not require or support authentication. They also may or may not support automatic credential lookup.
hash (list(algorithm_names), optional) -- If given, must be a list of hash algorithm names supported by the hashlib module. A corresponding hash will be computed during the upload (without reading the data twice), and included in the return value.
timeout (float, optional) -- If given, specifies a timeout in seconds. If the operation is not completed within this time, a TimeoutError is raised. If timeout is None, the operation will never time out.
- Returns:
A mapping of property names to values for the completed upload. If hash algorithm names are provided, a corresponding key for each algorithm is included in this mapping, with the hexdigest of the corresponding checksum as the value.
- Return type:
dict
- Raises:
FileNotFoundError -- If the source file cannot be found.
UrlOperationsRemoteError -- Raised on any upload-related error on the remote side, with a summary of the underlying issues as its message. It may carry a status code (e.g. HTTP status code) as its status_code property. Any underlying exception must be linked via the __cause__ property (e.g. raise UrlOperationsRemoteError(...) from ...).
UrlOperationsResourceUnknown -- Implementations that can distinguish several remote error types may raise a more specific exception instead of a general UrlOperationsRemoteError: UrlOperationsInteractionError for general issues in communicating with the remote side; UrlOperationsAuthenticationError for errors related to (failed) authentication at the remote; UrlOperationsAuthorizationError for lack of authorization to access a particular resource or perform a particular operation; UrlOperationsResourceUnknown if the target of an operation does not exist.
TimeoutError -- If timeout is given and the operation does not complete within the number of seconds specified by timeout.
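The atomicity described for upload() (the destination shows either the previous content or the complete new content) is commonly achieved by writing to a temporary location and renaming it into place. The sketch below shows that general technique on a local filesystem, which stands in for the remote; atomic_write is a hypothetical helper, and whether a particular UrlOperations implementation works this way is an assumption.

```python
import io
import os
import tempfile
from pathlib import Path
from typing import BinaryIO


def atomic_write(src: BinaryIO, dest: Path, chunk_size: int = 65536) -> None:
    """Write src to dest so that dest is never seen partially written."""
    # Temp file in the same directory, so the final rename stays on
    # one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=dest.parent, prefix=dest.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            while True:
                chunk = src.read(chunk_size)
                if not chunk:
                    break
                tmp.write(chunk)
        # os.replace() swaps the complete file into place in one step.
        os.replace(tmp_name, dest)
    except BaseException:
        # On failure, remove the partial temp file and leave dest untouched.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


with tempfile.TemporaryDirectory() as d:
    dest = Path(d) / "target.txt"
    dest.write_bytes(b"old")
    atomic_write(io.BytesIO(b"new content"), dest)
    result = dest.read_bytes()
    leftovers = sorted(p.name for p in Path(d).iterdir())
```

Until os.replace() runs, a reader of the destination still sees the previous content, which matches the "previous content (or nothing) or the newly uploaded content" guarantee.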