datalad_next.itertools.route_out

datalad_next.itertools.route_out(iterable: Iterable, data_store: list, splitter: Callable[[Any], tuple[Any, Any]]) -> Generator

Route data around the consumer of this iterable

route_out() allows its user to:

  1. store data that is received from an iterable,

  2. determine whether this data should be yielded to a consumer of route_out, by calling splitter().

To determine which data is to be yielded to the consumer and which data should only be stored but not yielded, route_out() calls splitter(). splitter() is called for each item of the input iterable, with the item as sole argument. The function should return a tuple of two elements. The first element is the data that is to be yielded to the consumer. The second element is the data that is to be stored in the list data_store. If the first element of the tuple is datalad_next.itertools.StoreOnly, no data is yielded to the consumer.
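The splitter contract described above can be illustrated with a small pure-Python sketch. It is not the implementation in datalad_next: `route_out_sketch` and `STORE_ONLY` are hypothetical stand-ins for route_out() and StoreOnly, and the sketch stores only the second tuple element, exactly as the text describes, whereas the library may keep additional bookkeeping.

```python
# Hypothetical stand-in for datalad_next.itertools.StoreOnly.
STORE_ONLY = object()


def route_out_sketch(iterable, data_store, splitter):
    """Sketch only: call splitter() per item, store always, yield selectively."""
    for item in iterable:
        to_yield, to_store = splitter(item)
        # The second tuple element is stored for every input item.
        data_store.append(to_store)
        # The first tuple element is yielded unless it is the marker.
        if to_yield is not STORE_ONLY:
            yield to_yield


def splitter(word):
    # Words containing a digit are stored but not passed to the consumer.
    has_digit = any(c.isdigit() for c in word)
    return (STORE_ONLY if has_digit else word.upper(), word)


store = []
seen = list(route_out_sketch(["ab", "c3", "de"], store, splitter))
print(seen)   # ['AB', 'DE']
print(store)  # ['ab', 'c3', 'de']
```

Note that the store receives one entry per input item, regardless of how many items reach the consumer.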

route_in() can be used to combine data that was previously stored by route_out() with the data that is yielded by route_out() and with the data that was not processed, i.e. not yielded by route_out().

The items yielded by route_in() will be in the same order in which they were passed into route_out(), including the items that were not yielded by route_out() because splitter() returned StoreOnly in the first element of the result-tuple.

The combination of the two functions route_out() and route_in() can be used to "carry" additional data along with data that is processed by iterators. It can also be used to route data around iterators that cannot process certain data.
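Both uses, carrying extra data and routing items around a processing step, can be sketched in pure Python without the library. Everything here is an assumption made for illustration: `tracking_route_out`, `tracking_route_in`, and `STORE_ONLY` are hypothetical stand-ins, and storing a "was yielded" flag next to each stored value is this sketch's own bookkeeping, not necessarily what datalad_next does internally.

```python
# Hypothetical stand-in for datalad_next.itertools.StoreOnly.
STORE_ONLY = object()


def tracking_route_out(iterable, data_store, splitter):
    """Sketch only: yield the first splitter element unless it is STORE_ONLY."""
    for item in iterable:
        to_yield, to_store = splitter(item)
        was_yielded = to_yield is not STORE_ONLY
        # Sketch-specific bookkeeping: remember whether the item was yielded.
        data_store.append((was_yielded, to_store))
        if was_yielded:
            yield to_yield


def tracking_route_in(iterable, data_store, joiner):
    """Sketch only: re-join processed and stored data in input order."""
    position = 0
    for processed in iterable:
        # First emit items that were held back before this processed item.
        while not data_store[position][0]:
            yield joiner(STORE_ONLY, data_store[position][1])
            position += 1
        yield joiner(processed, data_store[position][1])
        position += 1
    # Finally emit held-back items that follow the last processed item.
    while position < len(data_store):
        yield joiner(STORE_ONLY, data_store[position][1])
        position += 1


# Each record carries a label that the doubling step cannot handle,
# and one record has no value to process at all.
records = [("a", 1), ("b", None), ("c", 3)]


def splitter(record):
    label, value = record
    return (STORE_ONLY if value is None else value, label)


def joiner(processed, label):
    return (label, None if processed is STORE_ONLY else processed)


store = []
doubled = map(lambda v: v * 2, tracking_route_out(records, store, splitter))
result = list(tracking_route_in(doubled, store, joiner))
print(result)  # [('a', 2), ('b', None), ('c', 6)]
```

The doubling step only ever sees the numbers 1 and 3, yet the result contains all three records, in input order, with their labels reattached.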

For example, a user has an iterator that divides the number 2 by all numbers in a list. The user wants the iterator to process all numbers in a divisor list except zeros. In this case, route_out() and route_in() can be used as follows:

from math import nan
from datalad_next.itertools import route_out, route_in, StoreOnly

def splitter(divisor):
    # if divisor == 0, return `StoreOnly` in the first element of the
    # result tuple to indicate that route_out should not yield this
    # element to its consumer
    return (StoreOnly, divisor) if divisor == 0 else (divisor, divisor)

def joiner(processed_data, stored_data):
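    # items that were routed around the division arrive here as
    # `StoreOnly`; map them to nan and pass all other results through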
    return nan if processed_data is StoreOnly else processed_data

divisors = [0, 1, 0, 2, 0, 3, 0, 4]
store = list()
r = route_in(
    map(
        lambda x: 2.0 / x,
        route_out(
            divisors,
            store,
            splitter
        )
    ),
    store,
    joiner
)
print(list(r))

The example above will print [nan, 2.0, nan, 1.0, nan, 0.6666666666666666, nan, 0.5].

Parameters:
  • iterable (Iterable) -- The iterable that yields the input data

  • data_store (list) -- The list that is used to store the data that is routed out

  • splitter (Callable[[Any], tuple[Any, Any | None]]) -- The function that is used to determine which part of the input data, if any, is to be yielded to the consumer and which data is to be stored in the list data_store. The function is called for each item of the input iterable with the item as sole argument. It should return a tuple of two elements. If the first element is not datalad_next.itertools.StoreOnly, it is yielded to the consumer. If the first element is datalad_next.itertools.StoreOnly, nothing is yielded to the consumer. The second element is stored in the list data_store. The cardinality of data_store will be the same as the cardinality of the input iterable.