impuls.tasks.modify_from_csv¶
- class impuls.tasks.modify_from_csv.ModifyStopsFromCSV¶
- class impuls.tasks.modify_from_csv.ModifyRoutesFromCSV¶
- class impuls.tasks.modify_from_csv.CSVFieldData(entity_field: str, converter: Callable[[str], Any] | None = None)¶
Bases: object
CSVFieldData describes metadata about a CSV column that holds new data to be applied.
- converter: Callable[[str], Any] | None = None¶
- entity_field: str¶
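As a rough illustration of how these two attributes could be used together, the sketch below applies a column mapping to a single CSV row. The CSVFieldData class here is a local stand-in that only mirrors the documented signature, apply_mapping is a hypothetical helper, and the column names are made up for the example; none of this is taken from the impuls implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional


@dataclass
class CSVFieldData:
    """Stand-in mirroring the documented signature; not the impuls class itself."""

    entity_field: str
    converter: Optional[Callable[[str], Any]] = None


def apply_mapping(
    row: Mapping[str, str],
    mapping: Mapping[str, CSVFieldData],
) -> dict[str, Any]:
    """Hypothetical helper: translate CSV columns into entity field updates."""
    updates: dict[str, Any] = {}
    for column, field in mapping.items():
        if column not in row:
            continue  # column absent from this CSV: leave the field untouched
        raw = row[column]
        # Without a converter the raw string is used as-is.
        updates[field.entity_field] = field.converter(raw) if field.converter else raw
    return updates


# Illustrative column names, assumed for this example only.
mapping = {
    "stop_name": CSVFieldData("name"),
    "stop_lat": CSVFieldData("lat", float),
}
row = {"stop_id": "S1", "stop_name": "Central", "stop_lat": "52.25"}
updates = apply_mapping(row, mapping)
```

Here stop_id is not part of the mapping, so it produces no update; it would play the role of the primary key column instead.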
- class impuls.tasks.modify_from_csv.ModifyFromCSV(resource: str, must_curate_all: bool = False, silent: bool = False)¶
Bases: Task
ModifyFromCSV is a base class for modifying entities from a given table with data from a CSV file.
See ModifyXXXFromCSV for table-specific options.
- Parameters:
resource (str) – name of the resource with data (in CSV).
must_curate_all (bool) – if True, then this task will fail if some entities weren’t curated. Defaults to False.
silent (bool) – if True, suppresses the warning issued each time an entity from the CSV isn’t found in the DB. Defaults to False.
- check_if_all_entities_were_curated(db: DBConnection) None¶
- clear_state() None¶
- abstract static csv_column_mapping() Mapping[str, CSVFieldData]¶
csv_column_mapping returns the mapping from a CSV column name to metadata about the column: the corresponding entity field and a converter from string to a value of an appropriate type.
- csv_rows(resource: ManagedResource) Iterator[tuple[int, Mapping[str, str]]]¶
csv_rows generates all rows from the provided resource
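The return type suggests that each row is paired with an integer, which try_curate receives as line_no. A minimal sketch of such a generator, assuming the integer is the physical line number in the file and reading from a plain string instead of a ManagedResource, might look like this. The function below is an illustration and not the impuls implementation.

```python
import csv
import io
from typing import Iterator, Mapping


def csv_rows_sketch(text: str) -> Iterator[tuple[int, Mapping[str, str]]]:
    """Yield (line number, row) pairs; the header occupies line 1 (assumed numbering)."""
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        # line_num counts the physical lines read so far, header included.
        yield reader.line_num, row


rows = list(csv_rows_sketch("stop_id,stop_name\nS1,Central\nS2,North\n"))
```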
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times with different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
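The reset-on-entry pattern described above can be sketched with a small stand-in class. CountingTask and its simplified execute signature are hypothetical and only model the seen_ids and missing_ids bookkeeping; the real execute takes a TaskRuntime.

```python
class CountingTask:
    """Hypothetical stand-in that keeps per-execute state, like seen_ids and missing_ids."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()
        self.missing_ids: set[str] = set()

    def clear_state(self) -> None:
        self.seen_ids.clear()
        self.missing_ids.clear()

    def execute(self, known_ids: set[str], csv_ids: list[str]) -> None:
        # Reset first, so results of a previous call never leak into this one.
        self.clear_state()
        for id_ in csv_ids:
            if id_ in known_ids:
                self.seen_ids.add(id_)
            else:
                self.missing_ids.add(id_)


task = CountingTask()
task.execute({"A", "B"}, ["A", "X"])
first_run = (set(task.seen_ids), set(task.missing_ids))
task.execute({"C"}, ["C"])
second_run = (set(task.seen_ids), set(task.missing_ids))
```

After the second call, the IDs recorded by the first call are gone, which is the behavior the paragraph above asks implementations to preserve.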
- abstract static model_class() Type[Entity]¶
model_class returns the type from impuls.model whose entities are going to be modified
- abstract static primary_key_csv_column() str¶
primary_key_csv_column returns the name of the CSV column which contains the primary key
- abstract static query_for_all_ids() str¶
query_for_all_ids returns an SQL query string which returns all the known IDs of all entities of given type.
- try_curate(db: DBConnection, line_no: int, row: Mapping[str, str]) None¶
try_curate tries to curate a single entity corresponding to a CSV row
- missing_ids: set[str]¶
missing_ids contains IDs of all entities from the provided resource which were missing in the database. Cleared on every entry to execute().
- must_curate_all: bool¶
- resource: str¶
- seen_ids: set[str]¶
seen_ids contains IDs of all curated entities. Cleared on every entry to execute().
- silent: bool¶
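One plausible reading of how must_curate_all, seen_ids and query_for_all_ids fit together is a set difference between all known IDs and the curated ones. The sketch below follows that reading. The function name, the NotAllCurated exception and the warning issued in the non-strict case are assumptions made for illustration, not documented impuls behavior.

```python
import logging
from typing import Iterable

logger = logging.getLogger("modify_from_csv_sketch")


class NotAllCurated(Exception):
    """Hypothetical error type; the exception impuls raises is not documented here."""


def check_all_curated(
    all_ids: Iterable[str],
    seen_ids: Iterable[str],
    must_curate_all: bool = False,
) -> set[str]:
    """Return the IDs that were never curated, failing if must_curate_all is set."""
    uncurated = set(all_ids) - set(seen_ids)
    if uncurated and must_curate_all:
        raise NotAllCurated(f"{len(uncurated)} entities were not curated")
    if uncurated:
        # Assumed behavior: only report the gap when strict mode is off.
        logger.warning("Entities without curation: %s", sorted(uncurated))
    return uncurated


left_over = check_all_curated(["A", "B"], {"A"})

try:
    check_all_curated(["A", "B"], {"A"}, must_curate_all=True)
    strict_failed = False
except NotAllCurated:
    strict_failed = True
```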