impuls.tasks¶
- class impuls.tasks.AddEntity(entity: Entity, task_name: str = 'AddEntity')¶
Bases: Task
AddEntity is a simple task that adds the provided entity to the DB.
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
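The state-reset rule described above can be illustrated with a minimal sketch. CountRows is a hypothetical stand-in written for this example; it does not subclass the real Task and takes a plain list instead of a TaskRuntime.

```python
class CountRows:
    """Hypothetical task-like class; not part of impuls."""

    def __init__(self) -> None:
        self.seen = 0  # execute-related state kept on the instance

    def execute(self, rows: list[str]) -> None:
        # Reset the state on entry, so a repeated call starts from scratch
        # instead of accumulating results from an earlier run.
        self.seen = 0
        for _ in rows:
            self.seen += 1


task = CountRows()
task.execute(["a", "b", "c"])
task.execute(["d"])  # second call, as if made with a different runtime
print(task.seen)
```

Without the reset at the top of execute, the second call would report four rows rather than one.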
- class impuls.tasks.ExecuteSQL(task_name: str, statement: str)¶
Bases: Task
ExecuteSQL task simply executes the provided statement.
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- statement: str¶
- class impuls.tasks.GenerateTripHeadsign(name: str | None = None)¶
Bases: Task
GenerateTripHeadsign is a task which fills the trip_headsign field for all Trips which don’t already have a headsign.
The generated headsign is the name of the last stop of the trip. This step will break if there are trips without any stops.
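The headsign rule can be sketched with plain sqlite3. The three-table schema below is a simplified assumption made up for this example and is not the actual impuls schema; the query only illustrates "name of the last stop", not how the task is implemented.

```python
import sqlite3

# Simplified, hypothetical schema; impuls' real tables differ.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE stops (stop_id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE trips (trip_id TEXT PRIMARY KEY, headsign TEXT);
    CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT, stop_sequence INTEGER);
    INSERT INTO stops VALUES ('A', 'Alpha'), ('B', 'Beta');
    INSERT INTO trips VALUES ('T1', NULL), ('T2', 'Custom');
    INSERT INTO stop_times VALUES
        ('T1', 'A', 0), ('T1', 'B', 1),
        ('T2', 'B', 0), ('T2', 'A', 1);
    """
)

# Fill only the missing headsigns with the name of the stop
# that has the highest stop_sequence within the trip.
db.execute(
    """
    UPDATE trips SET headsign = (
        SELECT s.name
        FROM stop_times AS st
        JOIN stops AS s ON s.stop_id = st.stop_id
        WHERE st.trip_id = trips.trip_id
        ORDER BY st.stop_sequence DESC
        LIMIT 1
    )
    WHERE headsign IS NULL OR headsign = ''
    """
)

headsigns = dict(db.execute("SELECT trip_id, headsign FROM trips"))
print(headsigns)
```

In this sketch a trip without any stops would simply keep a NULL headsign, whereas the task itself is documented to break in that case.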
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- class impuls.tasks.LoadBusManMDB(resource: str, agency_id: str, ignore_route_id: bool = False, ignore_stop_id: bool = False)¶
Bases: Task
LoadBusManMDB loads data into the database from a BusMan MDB database.
Only the following entities are loaded: Route, Stop, Calendar, Trip and StopTime. Agency has to be manually curated beforehand (e.g. with the AddEntity task).
The imported Calendar entities will be empty after the import. Providing dates for calendars must be done manually afterwards.
Most MDB databases seen in the wild have no stop positions. This step will set the latitude and longitude to 0. Further curation is usually necessary.
Parameters:
resource: name of the resource with MDB file
agency_id: ID of the manually curated Agency
ignore_route_id: use route_short_name as the ID, instead of the BusMan internal ID
ignore_stop_id: use stop_code as the ID, instead of the BusMan internal ID
This task additionally requires mdbtools to be installed. This package is available in most package managers.
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- load_calendars(mdb_path: Path, db: DBConnection) None¶
- load_routes(mdb_path: Path, db: DBConnection) None¶
- load_stop_times(mdb_path: Path, db: DBConnection) None¶
- load_stops(mdb_path: Path, db: DBConnection) None¶
- load_trips(mdb_path: Path, db: DBConnection) None¶
- agency_id: str¶
- ignore_route_id: bool¶
- ignore_stop_id: bool¶
- resource: str¶
- class impuls.tasks.LoadGTFS(resource: str)¶
Bases: Task
LoadGTFS attempts to load GTFS data from a ZIP archive.
The loader only supports a subset of the GTFS schema. Due to implementation details, some invalid values may be accepted and some valid values may be rejected. In particular:
stops.txt location_types 3, 4, and 5 will cause an error,
parent_station may only refer to stop_ids defined in earlier lines,
agency_id in fare_attributes.txt is required if it’s present in agency.txt, even if there’s only one agency defined in the dataset.
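Since parent_station may only refer to stop_ids defined in earlier lines, it can help to check stops.txt before loading. The helper below is a hypothetical pre-check written for this example with the standard csv module; it is not part of impuls.

```python
import csv
import io


def find_forward_parent_refs(stops_txt: str) -> list[tuple[str, str]]:
    """Return (stop_id, parent_station) pairs whose parent was not defined earlier."""
    seen: set[str] = set()
    problems: list[tuple[str, str]] = []
    for row in csv.DictReader(io.StringIO(stops_txt)):
        parent = row.get("parent_station") or ""
        if parent and parent not in seen:
            problems.append((row["stop_id"], parent))
        seen.add(row["stop_id"])
    return problems


sample = (
    "stop_id,stop_name,location_type,parent_station\n"
    "P1,Platform 1,0,S\n"  # refers to S before S is defined
    "S,Station,1,\n"
    "P2,Platform 2,0,S\n"  # fine: S was defined on the previous line
)
problems = find_forward_parent_refs(sample)
print(problems)
```

The check also flags references to stop_ids that are never defined at all. Sorting stations ahead of their child stops in stops.txt avoids the forward references.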
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- extract_gtfs_to(zip_path: str | PathLike[str], dir_path: str | PathLike[str]) None¶
- resource: str¶
- class impuls.tasks.ModifyRoutesFromCSV(resource: str, must_curate_all: bool = False, silent: bool = False)¶
Bases: ModifyFromCSV
ModifyRoutesFromCSV implements the ModifyFromCSV step for Routes.
The CSV file pointed to by the provided resource must have a header row and must have a route_id field.
The following fields may be present, and will be used to update the metadata of the matching Route:
route_short_name
route_long_name
route_type
route_color
route_text_color
route_sort_order
- Parameters:
resource (str) – name of the resource with data (in CSV).
must_curate_all (bool) – if True, then this task will fail if some entities weren’t curated. Defaults to False.
silent (bool) – if True, doesn’t warn every time an entity from CSV isn’t found in the DB.
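The shape of a curation file and the meaning of the two flags can be sketched with the standard library. apply_curation and the in-memory routes dictionary are hypothetical stand-ins for this example; the real task works on the database and its internals may differ.

```python
import csv
import io

# Example curation file: a header row, route_id, and any subset of the optional fields.
curation_csv = """route_id,route_short_name,route_color
1,A,FF0000
9,Z,00FF00
"""

# Hypothetical stand-in for the routes stored in the database.
routes = {
    "1": {"route_short_name": "1", "route_color": ""},
    "2": {"route_short_name": "2", "route_color": ""},
}


def apply_curation(csv_text: str, routes: dict[str, dict[str, str]]) -> tuple[list[str], list[str]]:
    """Update matching routes in place.

    Returns (IDs from the CSV not found in routes, IDs in routes left uncurated).
    """
    curated: set[str] = set()
    missing: list[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        route_id = row.pop("route_id")
        if route_id not in routes:
            missing.append(route_id)  # what `silent` would stop warning about
            continue
        routes[route_id].update(row)
        curated.add(route_id)
    uncurated = sorted(routes.keys() - curated)  # non-empty would fail with must_curate_all
    return missing, uncurated


missing, uncurated = apply_curation(curation_csv, routes)
print(missing, uncurated, routes["1"])
```

Here route 9 exists only in the CSV and route 2 exists only in the data, which are the two situations that silent and must_curate_all respectively address.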
- static csv_column_mapping() Mapping[str, CSVFieldData]¶
csv_column_mapping returns the mapping from a CSV column name to metadata about the column: the corresponding entity field and a converter from string to a value of an appropriate type.
- static model_class() Type[Entity]¶
model_class returns the type from impuls.model whose entities are going to be modified
- static primary_key_csv_column() str¶
primary_key_csv_column returns the CSV column name which contains the primary key.
- static query_for_all_ids() str¶
query_for_all_ids returns an SQL query string which returns all the known IDs of all entities of given type.
- class impuls.tasks.ModifyStopsFromCSV(resource: str, must_curate_all: bool = False, silent: bool = False)¶
Bases: ModifyFromCSV
ModifyStopsFromCSV implements the ModifyFromCSV step for Stops.
The CSV file pointed to by the provided resource must have a header row and must have a stop_id field.
The following fields may be present, and will be used to update the metadata of the matching Stop:
stop_name
stop_code
stop_lat
stop_lon
zone_id
wheelchair_boarding
platform_code
- Parameters:
resource (str) – name of the resource with data (in CSV).
must_curate_all (bool) – if True, then this task will fail if some entities weren’t curated. Defaults to False.
silent (bool) – if True, doesn’t warn every time an entity from CSV isn’t found in the DB.
- static csv_column_mapping() Mapping[str, CSVFieldData]¶
csv_column_mapping returns the mapping from a CSV column name to metadata about the column: the corresponding entity field and a converter from string to a value of an appropriate type.
- static model_class() Type[Entity]¶
model_class returns the type from impuls.model whose entities are going to be modified
- static primary_key_csv_column() str¶
primary_key_csv_column returns the CSV column name which contains the primary key.
- static query_for_all_ids() str¶
query_for_all_ids returns an SQL query string which returns all the known IDs of all entities of given type.
- class impuls.tasks.RemoveUnusedEntities¶
Bases: Task
RemoveUnusedEntities removes entities from the database which serve no purpose:
Calendars without any active dates,
Stops (with LocationType.STOP) with no StopTimes,
Stations (with LocationType.STATION) with no child Stops,
Routes with no Trips,
- drop_agencies_without_routes(db: DBConnection) None¶
- drop_calendars_without_dates(db: DBConnection) None¶
- drop_calendars_without_trips(db: DBConnection) None¶
- drop_routes_without_trips(db: DBConnection) None¶
- drop_stations_without_stops(db: DBConnection) None¶
- drop_stops_without_stop_times(db: DBConnection) None¶
- drop_trips_with_at_most_one_stop(db: DBConnection) None¶
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
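Two of the removal rules for this class (stops without stop times, stations without child stops) can be sketched in plain SQL through sqlite3. The schema is a simplified assumption for this example, with location_type 0 standing for a stop and 1 for a station; the actual queries used by the task may differ.

```python
import sqlite3

# Simplified, hypothetical schema; impuls' real tables differ.
db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE stops (stop_id TEXT PRIMARY KEY, location_type INTEGER, parent_station TEXT);
    CREATE TABLE stop_times (trip_id TEXT, stop_id TEXT);
    INSERT INTO stops VALUES
        ('S1', 1, NULL), ('P1', 0, 'S1'),
        ('S2', 1, NULL), ('P2', 0, 'S2');
    INSERT INTO stop_times VALUES ('T1', 'P1');
    """
)

# Stops (location_type 0) which no stop_time refers to.
db.execute(
    """
    DELETE FROM stops
    WHERE location_type = 0
      AND NOT EXISTS (SELECT 1 FROM stop_times WHERE stop_times.stop_id = stops.stop_id)
    """
)

# Stations (location_type 1) left without any child stop.
db.execute(
    """
    DELETE FROM stops
    WHERE location_type = 1
      AND NOT EXISTS (SELECT 1 FROM stops AS child WHERE child.parent_station = stops.stop_id)
    """
)

remaining = sorted(row[0] for row in db.execute("SELECT stop_id FROM stops"))
print(remaining)
```

The order matters in this sketch: station S2 only becomes childless once its unused stop P2 has been removed.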
- class impuls.tasks.SaveDB(to: str | PathLike[str])¶
Bases: Task
SaveDB saves the contained data as-is to a database at a provided path.
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- to: Path¶
- class impuls.tasks.SaveGTFS(headers: Mapping[str, Sequence[str]], target: str | PathLike[str], emit_empty_calendars: bool = False)¶
Bases: Task
SaveGTFS exports the contained data as a GTFS zip file at the provided path.
headers is a mapping from a GTFS table (excluding the .txt extension) to a sequence of column names. SaveGTFS doesn’t validate the provided mapping, so the caller must ensure all required columns and files are provided.
When emit_empty_calendars is set to True (default is False), empty calendars will still be generated in the calendar.txt file.
- create_zip(dir: str | PathLike[str]) None¶
- dump_tables(gtfs_dir: str | PathLike[str], db: DBConnection) None¶
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- emit_empty_calendars: bool¶
- headers: Mapping[str, Sequence[str]]¶
- target: Path¶
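The role of the headers mapping can be sketched with the standard csv and zipfile modules. write_gtfs and the in-memory tables dictionary are hypothetical and exist only for this example; they are not how SaveGTFS is implemented.

```python
import csv
import io
import zipfile

# Table name (without .txt) -> columns to write, in order.
headers = {
    "agency": ["agency_id", "agency_name"],
    "routes": ["route_id", "agency_id", "route_short_name"],
}

# Hypothetical in-memory rows standing in for the database content.
tables = {
    "agency": [{"agency_id": "0", "agency_name": "Example", "agency_url": "https://example.com"}],
    "routes": [{"route_id": "R1", "agency_id": "0", "route_short_name": "1", "route_type": "3"}],
}


def write_gtfs(target, headers, tables) -> None:
    """Write one <table>.txt per entry of headers, keeping only the listed columns."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for table, columns in headers.items():
            buffer = io.StringIO(newline="")
            # extrasaction="ignore" drops fields that are not listed in headers.
            writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
            writer.writeheader()
            writer.writerows(tables.get(table, []))
            archive.writestr(f"{table}.txt", buffer.getvalue())


out = io.BytesIO()  # a file path would work here as well
write_gtfs(out, headers, tables)
print(zipfile.ZipFile(out).namelist())
```

In this sketch agency_url and route_type are silently left out because the mapping does not list them. This mirrors why the caller has to make sure that every required column and file is named in headers.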
- class impuls.tasks.TruncateCalendars(target: EmptyDateRange | InfiniteDateRange | LeftUnboundedDateRange | RightUnboundedDateRange | BoundedDateRange, fail_on_empty: bool = True)¶
Bases: Task
TruncateCalendars removes any services beyond the provided range.
For simplicity, all Calendars are converted to exception-based (all active dates represented by CalendarException).
- apply_changes(db: DBConnection) None¶
- check_if_empty() None¶
- clear_state() None¶
- compute_changes(db: DBConnection) None¶
- compute_truncated_days_of(calendar: Calendar, db: DBConnection) set[Date]¶
- drop_calendars(db: DBConnection) None¶
- execute(r: TaskRuntime) None¶
execute processes the data in the runtime environment.
As of now, Tasks are guaranteed to run in a single thread with a single runtime, but execute may be called multiple times in different runtimes. Thus, it is safe for Task implementations to hold some execute-related state, but that state should be reset on entry to execute.
- make_all_calendars_use_exceptions(db: DBConnection) None¶
- set_exceptions_on_calendars(db: DBConnection) None¶
- fail_on_empty: bool¶
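The idea behind exception-based truncation can be sketched with the standard datetime module: expand a calendar into its set of active dates, then keep only the dates inside the target range. Both functions are hypothetical helpers for this example, cover only a bounded range, and do not use the impuls date range classes.

```python
from datetime import date, timedelta


def active_dates(start: date, end: date, weekdays: set[int],
                 added: set[date], removed: set[date]) -> set[date]:
    """Expand a weekday-based calendar plus its exceptions into explicit dates."""
    days: set[date] = set()
    current = start
    while current <= end:
        if current.weekday() in weekdays:  # Monday is 0
            days.add(current)
        current += timedelta(days=1)
    return (days | added) - removed


def truncate(days: set[date], range_start: date, range_end: date) -> set[date]:
    """Keep only the dates that fall within the inclusive range."""
    return {d for d in days if range_start <= d <= range_end}


# A Mondays-only calendar with one added and one removed date.
days = active_dates(
    date(2024, 1, 1), date(2024, 1, 14), {0},
    added={date(2024, 1, 3)}, removed={date(2024, 1, 8)},
)
truncated = truncate(days, date(2024, 1, 2), date(2024, 1, 10))
print(sorted(truncated))
```

A calendar whose truncated set comes out empty has no service left in the range, which is the situation the fail_on_empty parameter appears to concern.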