impuls.errors

exception impuls.errors.DataError

Bases: ValueError

DataError represents any error related to incorrect input data.

The main point of DataError is that it may be caught, allowing the surrounding process to continue. Consequently, any process that raises DataError must not leave the pipeline in an undefined state.
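As a rough illustration of that contract, the sketch below validates each record before anything is stored, so a caught DataError leaves nothing half-done and the loop can move on. The DataError class here is a local stand-in so the snippet runs without impuls installed, and parse_stop_code and parse_all are hypothetical helpers, not part of the library.

```python
class DataError(ValueError):
    """Local stand-in for impuls.errors.DataError (assumption, for a standalone sketch)."""


def parse_stop_code(raw: str) -> int:
    # Hypothetical validator: raises before any shared state is modified.
    if not raw.isdigit():
        raise DataError(f"invalid stop code: {raw!r}")
    return int(raw)


def parse_all(raw_codes: list[str]) -> tuple[list[int], list[DataError]]:
    parsed: list[int] = []
    skipped: list[DataError] = []
    for raw in raw_codes:
        try:
            parsed.append(parse_stop_code(raw))
        except DataError as e:
            # The bad record is noted and processing continues with the next one.
            skipped.append(e)
    return parsed, skipped


parsed, skipped = parse_all(["12", "x7", "30"])
print(parsed)        # [12, 30]
print(len(skipped))  # 1
```

Because DataError derives from ValueError, an existing `except ValueError` handler would also catch it.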

exception impuls.errors.InputNotModified

Bases: Exception

InputNotModified is raised by the Pipeline when no resources have changed, preventing pointless processing of the same data.
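A caller will typically want to treat this exception as "nothing to do" rather than as a failure. The sketch below shows that pattern only. InputNotModified is a local stand-in, and run and main are hypothetical functions that compare two fingerprint mappings in place of whatever change detection the Pipeline actually performs.

```python
class InputNotModified(Exception):
    """Local stand-in for impuls.errors.InputNotModified (assumption)."""


def run(previous: dict[str, str], current: dict[str, str]) -> None:
    # Hypothetical runner: the fingerprints stand in for real change detection.
    if previous == current:
        raise InputNotModified()
    # ... actual processing would happen here ...


def main(previous: dict[str, str], current: dict[str, str]) -> str:
    try:
        run(previous, current)
    except InputNotModified:
        # Not an error: the inputs are the same as last time.
        return "skipped"
    return "processed"


unchanged = main({"routes": "abc"}, {"routes": "abc"})
changed = main({"routes": "abc"}, {"routes": "def"})
```

Note that InputNotModified derives from Exception rather than DataError, so a handler for DataError will not catch it.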

exception impuls.errors.MultipleDataErrors(when: str, errors: list[DataError])

Bases: DataError

MultipleDataErrors is raised when a process encounters one or more DataErrors.

For most use cases the catch_all helper can be used to catch any DataErrors that might be encountered.

classmethod catch_all(context: str, may_raise_data_error: Iterable[_T]) list[_T]

catch_all takes an iterable that may raise DataError when its next item is retrieved, collects every DataError raised, and raises a single MultipleDataErrors once the iterable is exhausted.

If no DataErrors were thrown, returns all non-None items from the generator.

Any other Exception is passed through to the caller.

Note that may_raise_data_error must not be a generator expression; it should usually be a map object. A generator expression is finished as soon as an exception propagates out of it, so only the first DataError would ever be collected.

>>> def some_function(x: int) -> int:
...    if x % 5 == 0:
...        raise DataError(f"Oh no, got {x}")
...    return x
>>> MultipleDataErrors.catch_all("foo", map(some_function, range(1, 5)))
[1, 2, 3, 4]
>>> MultipleDataErrors.catch_all("foo", map(some_function, range(1, 6)))
Traceback (most recent call last):
...
impuls.errors.MultipleDataErrors: 1 error(s) encountered during foo:
    Oh no, got 5
>>> MultipleDataErrors.catch_all("foo", map(some_function, range(1, 11)))
Traceback (most recent call last):
...
impuls.errors.MultipleDataErrors: 2 error(s) encountered during foo:
    Oh no, got 5
    Oh no, got 10

errors: list[DataError]

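The difference between a map object and a generator expression described for catch_all can be checked in isolation. The sketch below does not reproduce the implementation of catch_all. DataError is a local stand-in, and drain is a hypothetical helper that keeps calling next() after each DataError, which is the behavior the note relies on.

```python
from collections.abc import Iterable


class DataError(ValueError):
    """Local stand-in for impuls.errors.DataError (assumption)."""


def check(x: int) -> int:
    if x % 5 == 0:
        raise DataError(f"Oh no, got {x}")
    return x


def drain(iterable: Iterable[int]) -> tuple[list[int], list[DataError]]:
    """Hypothetical helper: pull every item, recording DataErrors instead of stopping."""
    items: list[int] = []
    errors: list[DataError] = []
    iterator = iter(iterable)
    while True:
        try:
            items.append(next(iterator))
        except StopIteration:
            break
        except DataError as e:
            errors.append(e)
    return items, errors


# A map object keeps yielding after an exception was raised by the mapped function.
map_items, map_errors = drain(map(check, range(1, 11)))
# A generator expression is finished after the first exception leaves it.
gen_items, gen_errors = drain(check(x) for x in range(1, 11))

print(map_items, len(map_errors))  # [1, 2, 3, 4, 6, 7, 8, 9] 2
print(gen_items, len(gen_errors))  # [1, 2, 3, 4] 1
```

With the generator expression, the values 6 to 10 are never checked, so the error for 10 goes unreported.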
exception impuls.errors.ResourceNotCached(resource_name: str)

Bases: DataError

ResourceNotCached is raised when the Pipeline is run with the from_cache option enabled and a Resource is not available locally.

resource_name: str
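Since ResourceNotCached derives from DataError, a handler that needs resource_name has to come before any broader DataError handler. The sketch below illustrates the ordering with local stand-in classes. The message text passed to the base class, load_from_cache, and describe are all assumptions made for the example, not the library's API.

```python
class DataError(ValueError):
    """Local stand-in for impuls.errors.DataError (assumption)."""


class ResourceNotCached(DataError):
    """Local stand-in mirroring the documented signature and attribute."""

    def __init__(self, resource_name: str) -> None:
        # The message wording is an assumption; only resource_name is documented.
        super().__init__(f"resource not cached: {resource_name}")
        self.resource_name = resource_name


def load_from_cache(name: str, cache: dict[str, bytes]) -> bytes:
    # Hypothetical lookup standing in for a Pipeline running from cache.
    if name not in cache:
        raise ResourceNotCached(name)
    return cache[name]


def describe(name: str, cache: dict[str, bytes]) -> str:
    try:
        load_from_cache(name, cache)
    except ResourceNotCached as e:
        # The more specific handler must come first to reach resource_name.
        return f"missing: {e.resource_name}"
    except DataError:
        return "bad data"
    return "ok"


cache = {"stops.txt": b"stop_id,stop_name\n"}
found = describe("stops.txt", cache)
missing = describe("routes.txt", cache)
```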