datalad.cmd

Wrapper for command and function calls, allowing for dry runs and output handling

class datalad.cmd.BatchedCommand(cmd, path=None, output_proc=None)[source]

Bases: datalad.cmd.SafeDelCloseMixin

Container for a process that allows persistent communication

close(return_stderr=False)[source]

Close communication and wait for process to terminate

Returns: stderr output if return_stderr is True and a stderr file was present; None otherwise
Return type: str or None
proc1(arg)[source]

Same as __call__, but only takes a single command argument and returns a single result.

yield_(cmds)[source]

Same as __call__, but requires cmds to be an iterable and yields results for each item.
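The idea of a persistent, batched process can be illustrated with the standard library alone. The following is a minimal sketch, not datalad's implementation: the class name LineBatchedProcess, the one-line-in, one-line-out exchange, and the upper-casing child program are all assumptions made for illustration. Only the method names proc1, yield_ and close mirror the ones documented above.

```python
import subprocess
import sys

# Illustrative child program (an assumption): echoes each input line upper-cased.
CHILD = "import sys\nfor line in sys.stdin:\n    print(line.strip().upper(), flush=True)\n"


class LineBatchedProcess:
    """Hypothetical stand-in for a batched command; not datalad's BatchedCommand."""

    def __init__(self, cmd):
        self.cmd = cmd
        self._proc = None

    def _ensure_started(self):
        # Start the child lazily and keep it alive between requests.
        if self._proc is None:
            self._proc = subprocess.Popen(
                self.cmd,
                stdin=subprocess.PIPE,
                stdout=subprocess.PIPE,
                text=True,
                bufsize=1,  # line-buffered text pipes
            )

    def proc1(self, arg):
        # Send one request line and read exactly one response line.
        self._ensure_started()
        self._proc.stdin.write(arg + "\n")
        self._proc.stdin.flush()
        return self._proc.stdout.readline().rstrip("\n")

    def yield_(self, args):
        # Yield one result per item, reusing the same process.
        for arg in args:
            yield self.proc1(arg)

    def close(self):
        # Closing stdin signals end of input; then wait for termination.
        if self._proc is not None:
            self._proc.stdin.close()
            self._proc.wait()
            self._proc.stdout.close()
            self._proc = None


batch = LineBatchedProcess([sys.executable, "-c", CHILD])
first = batch.proc1("hello")
rest = list(batch.yield_(["a", "b"]))
batch.close()
```

The process is started once and reused for every request, which is the point of batching: the start-up cost is paid a single time rather than per call.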

class datalad.cmd.GitRunner(*args, **kwargs)[source]

Bases: datalad.cmd.Runner, datalad.cmd.GitRunnerBase

A Runner for git and git-annex commands.

See GitRunnerBase, which it mixes in, for more details.

run(cmd, env=None, *args, **kwargs)[source]

Runs the command cmd, in a shell if requested.

In dry mode, cmd is only added to commands; otherwise it is executed. stdout and stderr can each be logged or streamed to the system’s stdout or stderr, respectively.

Note: Passing cmd as a string with shell=True allows piping, multiple commands, etc., but it means split_cmdline() is not used. This is a security hazard, so be careful with input.
Parameters:
  • cmd (str, list) – String (or list) defining the command call. No shell is used if cmd is specified as a list
  • log_stdout (bool, optional) – If True, stdout is logged. Goes to sys.stdout otherwise.
  • log_stderr (bool, optional) – If True, stderr is logged. Goes to sys.stderr otherwise.
  • log_online (bool, optional) – Whether to log as output comes in. Setting to True is preferable for running user-invoked actions to provide timely output
  • expect_stderr (bool, optional) – Normally, stderr output signals a problem and is therefore logged at level 11. Some utilities, e.g. wget, use stderr for their progress output. When such output is expected, set this to True: in non-online mode, the output is then logged at level 5 unless the exit status is non-zero; in online mode, it is always logged at level 5.
  • expect_fail (bool, optional) – Normally, a non-zero exit status is considered an error and is logged at level 11 (above DEBUG). If the call is intended as a check, such messages are usually not needed, so it is logged at level 5 instead.
  • cwd (string, optional) – Directory in which to run the command (passed to Popen)
  • env (dict, optional) – Custom environment to pass
  • shell (bool, optional) – Run command in a shell. If not specified, then it runs in a shell only if command is specified as a string (not a list)
  • stdin (file descriptor) – input stream to connect to stdin of the process.
Returns:

(stdout, stderr), both as bytes

Return type:

tuple

Raises:

CommandError – if command’s exitcode wasn’t 0 or None. exitcode is passed to CommandError’s code-field. Command’s stdout and stderr are stored in CommandError’s stdout and stderr fields respectively.

class datalad.cmd.GitRunnerBase[source]

Bases: object

Mix-in class for Runners to be used to run git and git annex commands

Overloads the runner class to check the GIT_DIR and GIT_WORK_TREE environment variables and, if either is defined as a relative path, to update it to the corresponding absolute path.

static get_git_environ_adjusted(env=None)[source]

Replaces GIT_DIR and GIT_WORK_TREE with absolute paths if they are defined and relative.
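The adjustment described here can be sketched in a few lines. This is an illustration under assumptions, not the library's code: the function name adjust_git_environ is hypothetical, and returning a modified copy (rather than changing the input in place) is a design choice made for the example.

```python
import os


def adjust_git_environ(env=None):
    """Hypothetical sketch: make relative GIT_DIR / GIT_WORK_TREE absolute."""
    # Work on a copy so the caller's mapping is left untouched (an assumption).
    env = dict(os.environ if env is None else env)
    for var in ("GIT_DIR", "GIT_WORK_TREE"):
        value = env.get(var)
        # Only touch variables that are defined and hold a relative path.
        if value and not os.path.isabs(value):
            env[var] = os.path.abspath(value)
    return env


absolute_tree = os.path.abspath("tree")
adjusted = adjust_git_environ({"GIT_DIR": ".git", "GIT_WORK_TREE": absolute_tree})
untouched = adjust_git_environ({"OTHER": "value"})
```

Resolving the paths early matters because a relative GIT_DIR would silently point somewhere else as soon as a command runs with a different working directory.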

class datalad.cmd.GitWitlessRunner(*args, **kwargs)[source]

Bases: datalad.cmd.WitlessRunner, datalad.cmd.GitRunnerBase

A WitlessRunner for git and git-annex commands.

See GitRunnerBase, which it mixes in, for more details.

class datalad.cmd.KillOutput(done_future)[source]

Bases: datalad.cmd.WitlessProtocol

WitlessProtocol that swallows stdout/stderr of a subprocess

pipe_data_received(fd, data)[source]

Called when the subprocess writes data into stdout/stderr pipe.

fd is int file descriptor. data is bytes object.

proc_err = True
proc_out = True
class datalad.cmd.NoCapture(done_future)[source]

Bases: datalad.cmd.WitlessProtocol

WitlessProtocol that captures no subprocess output

As this is identical with the behavior of the WitlessProtocol base class, this class is merely a more readable convenience alias.

class datalad.cmd.Runner(cwd=None, env=None, protocol=None, log_outputs=None)[source]

Bases: object

Provides a wrapper for calling functions and commands.

An object of this class provides methods that call shell commands or Python functions, allowing the calls to be recorded in a protocol and their output to be handled.

Outputs (stdout and stderr) can be either logged or streamed to the system’s stdout/stderr during execution. This can be enabled or disabled for each of them independently. Additionally, a protocol object can be used with the Runner. Such a protocol has to implement datalad.support.protocol.ProtocolInterface; it records calls and allows for dry runs.

call(f, *args, **kwargs)[source]

Helper to unify the collection and logging of all “dry” actions.

Calls f if the Runner object is not in dry mode. Otherwise, adds f along with its arguments to commands.

Parameters:f (callable) –
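The dry-mode behaviour of call can be pictured with a small stand-alone class. This is a sketch only: DryRunRecorder is a hypothetical name, and the exact form in which the Runner stores recorded calls in commands is not stated here, so the (f, args, kwargs) tuple is an assumption.

```python
class DryRunRecorder:
    """Hypothetical sketch of the dry-mode idea; not datalad's Runner."""

    def __init__(self, dry=False):
        self.dry = dry
        self.commands = []  # recorded calls when in dry mode

    def call(self, f, *args, **kwargs):
        if self.dry:
            # Dry mode: record the call instead of executing it.
            self.commands.append((f, args, kwargs))
            return None
        # Normal mode: execute and pass the result through.
        return f(*args, **kwargs)


live = DryRunRecorder(dry=False)
dry = DryRunRecorder(dry=True)
live_result = live.call(max, 3, 7)
dry_result = dry.call(max, 3, 7)
```

Routing every action through one helper means dry and real executions share a single code path, so the list of recorded calls reflects what a real run would have done.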
commands
cwd
dry
env
log(msg, *args, **kwargs)[source]

log helper

Logs at level 5 by default and adds “Protocol:”-prefix in order to log the used protocol.

log_cwd
log_env
log_outputs
log_stdin
protocol
run(cmd, log_stdout=True, log_stderr=True, log_online=False, expect_stderr=False, expect_fail=False, cwd=None, env=None, shell=None, stdin=None)[source]

Runs the command cmd, in a shell if requested.

In dry mode, cmd is only added to commands; otherwise it is executed. stdout and stderr can each be logged or streamed to the system’s stdout or stderr, respectively.

Note: Passing cmd as a string with shell=True allows piping, multiple commands, etc., but it means split_cmdline() is not used. This is a security hazard, so be careful with input.
Parameters:
  • cmd (str, list) – String (or list) defining the command call. No shell is used if cmd is specified as a list
  • log_stdout (bool, optional) – If True, stdout is logged. Goes to sys.stdout otherwise.
  • log_stderr (bool, optional) – If True, stderr is logged. Goes to sys.stderr otherwise.
  • log_online (bool, optional) – Whether to log as output comes in. Setting to True is preferable for running user-invoked actions to provide timely output
  • expect_stderr (bool, optional) – Normally, stderr output signals a problem and is therefore logged at level 11. Some utilities, e.g. wget, use stderr for their progress output. When such output is expected, set this to True: in non-online mode, the output is then logged at level 5 unless the exit status is non-zero; in online mode, it is always logged at level 5.
  • expect_fail (bool, optional) – Normally, a non-zero exit status is considered an error and is logged at level 11 (above DEBUG). If the call is intended as a check, such messages are usually not needed, so it is logged at level 5 instead.
  • cwd (string, optional) – Directory in which to run the command (passed to Popen)
  • env (dict, optional) – Custom environment to pass
  • shell (bool, optional) – Run command in a shell. If not specified, then it runs in a shell only if command is specified as a string (not a list)
  • stdin (file descriptor) – input stream to connect to stdin of the process.
Returns:

(stdout, stderr), both as bytes

Return type:

tuple

Raises:

CommandError – if command’s exitcode wasn’t 0 or None. exitcode is passed to CommandError’s code-field. Command’s stdout and stderr are stored in CommandError’s stdout and stderr fields respectively.
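The shell default and the security note above can be demonstrated with the standard subprocess module. This is a sketch, not the Runner's code: run_sketch is a hypothetical helper that only mirrors the documented rule (use a shell only when cmd is a string) and the (stdout, stderr) bytes return value. It assumes a POSIX shell, and shlex.join is used here as one way to quote input for the string form.

```python
import shlex
import subprocess
import sys


def run_sketch(cmd, shell=None):
    """Hypothetical helper mirroring the documented default for shell."""
    if shell is None:
        # Documented rule: a shell is used only for string commands.
        shell = isinstance(cmd, str)
    proc = subprocess.run(
        cmd, shell=shell, stdout=subprocess.PIPE, stderr=subprocess.PIPE
    )
    return proc.stdout, proc.stderr  # both are bytes


untrusted = "name; echo injected"
show_arg = "import sys; print(sys.argv[1])"

# List form: no shell is involved, so the value arrives as one literal argument.
out_list, _ = run_sketch([sys.executable, "-c", show_arg, untrusted])

# String form: a shell parses the line, so every part must be quoted first.
line = shlex.join([sys.executable, "-c", show_arg, untrusted])
out_shell, _ = run_sketch(line)
```

Without the quoting step, the shell would treat the semicolon in the untrusted value as a command separator and run the second command, which is the hazard the note warns about.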

class datalad.cmd.SafeDelCloseMixin[source]

Bases: object

A helper class for cases where __del__ calls .close(), which might fail if invoked “too late in the GC game”

class datalad.cmd.StdErrCapture(done_future)[source]

Bases: datalad.cmd.WitlessProtocol

WitlessProtocol that only captures and returns stderr of a subprocess

proc_err = True
class datalad.cmd.StdOutCapture(done_future)[source]

Bases: datalad.cmd.WitlessProtocol

WitlessProtocol that only captures and returns stdout of a subprocess

proc_out = True
class datalad.cmd.StdOutErrCapture(done_future)[source]

Bases: datalad.cmd.WitlessProtocol

WitlessProtocol that captures and returns stdout/stderr of a subprocess

proc_err = True
proc_out = True
class datalad.cmd.WitlessProtocol(done_future)[source]

Bases: asyncio.protocols.SubprocessProtocol

Subprocess communication protocol base class for run_async_cmd

This class implements basic subprocess output handling. Derived classes like StdOutCapture should be used for subprocess communication that needs to capture and return output. In particular, the pipe_data_received() method can be overridden to implement “online” processing of process output.

This class defines a default return value setup that causes run_async_cmd() to return a 2-tuple with the subprocess’s exit code and a list with bytestrings of all captured output streams.

FD_NAMES = ['stdin', 'stdout', 'stderr']
connection_made(transport)[source]

Called when a connection is made.

The argument is the transport representing the pipe connection. To receive data, wait for data_received() calls. When the connection is closed, connection_lost() is called.

pipe_data_received(fd, data)[source]

Called when the subprocess writes data into stdout/stderr pipe.

fd is int file descriptor. data is bytes object.

proc_err = None
proc_out = None
process_exited()[source]

Called when subprocess has exited.
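The callbacks listed above come from asyncio.SubprocessProtocol, and their interplay can be shown with a small stand-alone protocol. This is a sketch under assumptions: CaptureStdout and run_and_capture are hypothetical names, the return value (exit code plus one bytes object) is simplified compared with the default described above, and waiting for both the pipe to close and the process to exit follows the pattern in the asyncio documentation.

```python
import asyncio
import sys


class CaptureStdout(asyncio.SubprocessProtocol):
    """Hypothetical protocol that collects stdout; not a datalad class."""

    def __init__(self, done_future):
        self.done = done_future
        self.output = bytearray()
        self.pipe_closed = False
        self.exited = False

    def pipe_data_received(self, fd, data):
        # fd is an int (1 for stdout), data is a bytes object.
        if fd == 1:
            self.output.extend(data)

    def pipe_connection_lost(self, fd, exc):
        self.pipe_closed = True
        self._maybe_finish()

    def process_exited(self):
        # May be called before the pipe is closed, so wait for both events.
        self.exited = True
        self._maybe_finish()

    def _maybe_finish(self):
        if self.pipe_closed and self.exited and not self.done.done():
            self.done.set_result(True)


async def run_and_capture(cmd):
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    # Only stdout is piped; stdin and stderr are inherited from the parent.
    transport, protocol = await loop.subprocess_exec(
        lambda: CaptureStdout(done), *cmd, stdin=None, stderr=None
    )
    try:
        await done
        return transport.get_returncode(), bytes(protocol.output)
    finally:
        transport.close()


exit_code, captured = asyncio.run(
    run_and_capture([sys.executable, "-c", "print('hi')"])
)
```

Overriding pipe_data_received is where "online" processing would go: each chunk is seen as it arrives, before the process has finished.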

class datalad.cmd.WitlessRunner(cwd=None, env=None)[source]

Bases: object

Minimal Runner with support for online command output processing

It aims to be as simple as possible, providing only essential functionality.

cwd
env
run(cmd, protocol=None, stdin=None, cwd=None, env=None, **kwargs)[source]

Execute a command and communicate with it.

Parameters:
  • cmd (list) – Sequence of program arguments. A single string is taken to be the name of the program only; complex shell commands are not supported.
  • protocol (WitlessProtocol, optional) – Protocol class handling interaction with the running process (e.g. output capture). A number of pre-crafted classes are provided (e.g., KillOutput, NoCapture, GitProgress).
  • stdin (byte stream, optional) – File-descriptor-like object used as stdin for the process. Passed verbatim to subprocess.Popen().
  • cwd (path-like, optional) – If given, commands are executed with this path as PWD; otherwise the PWD of the parent process is used. Overrides any cwd given to the constructor.
  • env (dict, optional) – Environment to be used for command execution. If cwd was given, ‘PWD’ in the environment is set to its value. This must be a complete environment definition; no values from the current environment will be inherited. Overrides any env given to the constructor.
  • kwargs – Passed to the Protocol class constructor.
Returns:

At minimum there will be keys ‘stdout’, ‘stderr’ with unicode strings of the cumulative standard output and error of the process as values.

Return type:

dict

Raises:
  • CommandError – On execution failure (non-zero exit code) this exception is raised which provides the command (cmd), stdout, stderr, exit code (status), and a message identifying the failed command, as properties.
  • FileNotFoundError – When a given executable does not exist.
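The cwd and env semantics described above (a complete environment, with PWD set to cwd) can be tried out with subprocess directly. This is a sketch, not WitlessRunner.run: run_with_env is a hypothetical helper, the variable SKETCH_PARENT_ONLY exists only for the demonstration, and subprocess.CalledProcessError stands in for the CommandError raised by the real method.

```python
import os
import subprocess
import sys
import tempfile


def run_with_env(cmd, cwd=None, env=None):
    """Hypothetical helper illustrating the documented cwd/env handling."""
    if env is not None and cwd is not None:
        # Mirror the documented behaviour: PWD follows cwd.
        env = dict(env, PWD=str(cwd))
    proc = subprocess.run(
        cmd,
        cwd=cwd,
        env=env,  # used as the complete environment, nothing is merged in
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        check=True,  # non-zero exit raises subprocess.CalledProcessError
    )
    return {"stdout": proc.stdout.decode(), "stderr": proc.stderr.decode()}


# A variable set in the parent but left out of the environment we pass on.
os.environ["SKETCH_PARENT_ONLY"] = "1"
child_env = {k: v for k, v in os.environ.items() if k != "SKETCH_PARENT_ONLY"}

report = (
    "import os; print(os.environ.get('PWD')); "
    "print('SKETCH_PARENT_ONLY' in os.environ)"
)
with tempfile.TemporaryDirectory() as workdir:
    result = run_with_env([sys.executable, "-c", report], cwd=workdir, env=child_env)
lines = result["stdout"].splitlines()
```

The child does not see SKETCH_PARENT_ONLY even though the parent has it set, which is what "no values from the current environment will be inherited" means in practice.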
datalad.cmd.readline_rstripped(stdout)[source]
datalad.cmd.run_async_cmd(loop, cmd, protocol, stdin, protocol_kwargs=None, **kwargs)[source]

Run a command in a subprocess managed by asyncio

This implementation has been inspired by https://pymotw.com/3/asyncio/subprocesses.html

Parameters:
  • loop (asyncio.AbstractEventLoop) – asyncio event loop instance. Must support subprocesses on the target platform.
  • cmd (list) – Command to be executed, passed to subprocess_exec.
  • protocol (WitlessProtocol) – Protocol class to be instantiated for managing communication with the subprocess.
  • stdin (file-like or None) – Passed to the subprocess as its standard input.
  • protocol_kwargs (dict, optional) – Passed to the Protocol class constructor.
  • kwargs – Passed to subprocess_exec; these will typically be parameters supported by subprocess.Popen.
Returns:

The nature of the return value is determined by the given protocol class.

Return type:

undefined

datalad.cmd.run_gitcommand_on_file_list_chunks(func, cmd, files, *args, **kwargs)[source]

Run a git command multiple times if files is too long

Parameters:
  • func (callable) – Typically a Runner.run variant. Assumed to return a 2-tuple with stdout and stderr as strings.
  • cmd (list) – Base Git command argument list, to be amended with ‘--’, followed by a file list chunk.
  • files (list) – List of files.
  • args, kwargs – Passed to func
Returns:

Concatenated stdout and stderr.

Return type:

str, str
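The chunking strategy can be sketched as follows. This is an illustration under assumptions, not the library's implementation: run_on_file_chunks and its max_chunk_chars parameter are hypothetical, the real size limit depends on the platform's command-line length restrictions, and the behaviour for an empty file list (no call at all in this sketch) may differ.

```python
def run_on_file_chunks(func, cmd, files, max_chunk_chars=1000, **kwargs):
    """Hypothetical sketch: call func once per chunk and join the outputs."""
    chunks, current, size = [], [], 0
    for name in files:
        cost = len(name) + 1  # +1 for the separating space
        # Start a new chunk when adding this file would exceed the limit.
        if current and size + cost > max_chunk_chars:
            chunks.append(current)
            current, size = [], 0
        current.append(name)
        size += cost
    if current:
        chunks.append(current)

    out_parts, err_parts = [], []
    for chunk in chunks:
        # '--' separates the base command's options from the file names.
        out, err = func(cmd + ["--"] + chunk, **kwargs)
        out_parts.append(out)
        err_parts.append(err)
    return "".join(out_parts), "".join(err_parts)


calls = []


def fake_run(full_cmd):
    # Stand-in for a Runner.run variant; echoes the files it was given.
    calls.append(full_cmd)
    files_part = full_cmd[full_cmd.index("--") + 1:]
    return " ".join(files_part) + "\n", ""


stdout, stderr = run_on_file_chunks(
    fake_run, ["git", "add"], ["a.txt", "b.txt", "c.txt"], max_chunk_chars=12
)
```

With the limit of 12 characters, the first two files fit into one call and the third goes into a second call; the caller still receives a single concatenated stdout and stderr.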