SimTabLegs

class passengersim.summaries.legs.SimTabLegs(data: dict[str, pd.DataFrame] = None, *, config: Config | None = None, cnx: Database | None = None, sim: Simulation | None = None, n_total_samples: int = 0, items: Collection[str] = (), callback_data: CallbackData | None = None)[source]

Bases: GenericSimulationTables

Container for summary tables and figures extracted from a Simulation.

This class is a subclass of GenericSimulationTables, which is defined in the generic module. It lists the items that are available in the SimulationTables class, and provides type hints and (optionally, but ideally) documentation for the data that is stored in each item.

Methods

__init__([data, config, cnx, sim, ...])

aggregate(summaries)

Aggregate multiple summary tables.

extract(sim[, items])

Extract summary data from a Simulation.

fig_leg_bid_price_detail_rake(*, leg_id[, ...])

fig_leg_bid_price_history(carrier, *, measure)

fig_leg_booking_detail_rake(*, leg_id[, ...])

fig_leg_load_factor_distribution([...])

Figure showing the distribution of leg load factors.

fig_leg_load_v_distance(*[, orig, dest, ...])

fig_leg_load_v_local(*[, orig, dest, place, ...])

Figure showing the relationship between leg load factor and local share.

fig_leg_local_share_distribution([...])

Figure showing the distribution of leg local shares.

file_info()

Return information about the file store.

from_file(filename[, read_latest, lazy])

Load the object from a file.

from_pickle(filename[, read_latest])

Load the object from a pickle file.

metadata([key])

Return a metadata value.

remove_data(keys)

Remove data from the summary tables.

run_queries([cnx, items, scenario, burn_samples])

Query summary data from a Database.

save(filename, *[, timestamp, make_dirs, ...])

Save the object to a set of files.

subclasses()

Return a list of all concrete subclasses.

to_file(filename[, add_timestamp_ext, ...])

Write simulation tables to a file.

to_html(filename, *[, cfg, make_dirs, ...])

Write simulation tables report summary to HTML.

to_pickle(filename[, add_timestamp_ext, ...])

Save to a pickle file.

to_xlsx(filename)

Write simulation tables to Excel.

Attributes

callback_data

config

leg_bid_price_detail

leg_booking_detail

leg_defs

A DataFrame containing the definitions of the legs in the simulation.

leg_detail

Sample- and DCP-level detail for legs. This table can be very large.

legs

Leg-level summary data.

legs_

A DataFrame containing the leg summary data, merged with the leg definitions.

local_fraction_by_place

The local share of passengers by carrier and place.

cnx

Database connection for the Simulation run.

sim

Simulation object for the Simulation run.

n_total_samples

Total number of sample departures simulated to create these summaries.

meta_summaries

Summaries that were aggregated to create this summary.

legs : pd.DataFrame

Leg-level summary data.

property leg_defs

A DataFrame containing the definitions of the legs in the simulation.

This DataFrame is constructed from the leg definitions defined in the simulation config, and does not depend on the simulation results.

Returns:

pd.DataFrame

property legs_

A DataFrame containing the leg summary data, merged with the leg definitions.

This DataFrame is constructed by merging the legs DataFrame with the leg_defs DataFrame, so it includes all the summary data for each leg, as well as all the attributes of each leg defined in the config.

Returns:

pd.DataFrame

property local_fraction_by_place : DataFrame

The local share of passengers by carrier and place.

The index of this DataFrame contains all possible places, and the columns contain the carriers.

For each carrier and place, this is the percentage of passengers on legs arriving at or departing from that place who are local passengers (i.e. not connecting passengers). Passengers are considered connecting whether the connection is at this place or at another place.

If a carrier does not operate any legs to or from a place, or if legs are operated but no passengers are booked (which probably indicates a config error), the local share is NaN.

Returns:

pd.DataFrame
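
As a rough illustration of the definition above, the following sketch computes a local share per carrier and place from a few hypothetical leg records using plain Python. The record fields (carrier, orig, dest, sold, local), the function name, and the 0 to 100 scale are assumptions made for this example; they are not the package's actual schema or implementation.

```python
import math
from collections import defaultdict

# Hypothetical leg records: total passengers sold and how many of them are local.
legs = [
    {"carrier": "AL1", "orig": "BOS", "dest": "ORD", "sold": 100, "local": 60},
    {"carrier": "AL1", "orig": "ORD", "dest": "LAX", "sold": 200, "local": 50},
    {"carrier": "AL2", "orig": "BOS", "dest": "LAX", "sold": 0, "local": 0},
]


def local_share_by_place(legs):
    """Return {(carrier, place): percent local}, with NaN where nothing was sold."""
    sold = defaultdict(int)
    local = defaultdict(int)
    for leg in legs:
        # A leg counts toward both its origin and its destination.
        for place in (leg["orig"], leg["dest"]):
            sold[leg["carrier"], place] += leg["sold"]
            local[leg["carrier"], place] += leg["local"]
    return {
        key: (100.0 * local[key] / total if total else math.nan)
        for key, total in sold.items()
    }


shares = local_share_by_place(legs)
```

In this sketch, carrier and place pairs with no legs at all are simply absent from the result; in the DataFrame described above they would appear as NaN.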

fig_leg_load_factor_distribution(by_carrier: bool | str = True, breakpoints: Collection[int] = None, normalize: bool = False, *, raw_df: bool = False, also_df: bool = False) alt.Chart | pd.DataFrame | tuple[alt.Chart, pd.DataFrame][source]

Figure showing the distribution of leg load factors.

Parameters:
by_carrier : bool or str, default True

If True, show the distribution by carrier. If a string, show the distribution for that carrier. If False, show the distribution aggregated over all carriers.

breakpoints : Collection[int, ...], default (25, 30, 35, 40, ..., 90, 95, 100)

The breakpoints for the load factor ranges, which represent the lowest load factor value in each bin. The first and last breakpoints are always bounded to 0 and 101, respectively; these bounds can be included explicitly or omitted to be included implicitly. Setting the top value to 101 ensures that the highest load factor value (100) is included in the last bin.

normalize : bool, default False

If True, normalize the frequency by the total number of legs for each carrier, so that the sum of the frequencies for each carrier is 1.

raw_df : bool, default False

If True, return the raw data for this figure as a pandas DataFrame, instead of generating the figure itself.

also_df : bool, default False

If True, return the raw data for this figure as a pandas DataFrame, in addition to the figure itself.

Returns:

alt.Chart or pd.DataFrame or tuple[alt.Chart, pd.DataFrame]
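
The breakpoint rule described for this method can be sketched with the standard library. This is a minimal sketch, assuming each bin is half-open (lower bound included, upper bound excluded) and that all values lie between 0 and 100; the function names are hypothetical and do not come from the package.

```python
from bisect import bisect_right
from collections import Counter


def normalize_breakpoints(breakpoints):
    """Sort the breakpoints and add the implicit bounds 0 and 101 if missing."""
    return sorted(set(breakpoints) | {0, 101})


def bin_load_factors(load_factors, breakpoints):
    """Count values per bin, labelling each bin by its lowest value."""
    edges = normalize_breakpoints(breakpoints)
    counts = Counter({lower: 0 for lower in edges[:-1]})
    for lf in load_factors:
        # bisect_right finds the first edge strictly greater than lf,
        # so the bin's lower edge sits one position to the left.
        counts[edges[bisect_right(edges, lf) - 1]] += 1
    return dict(counts)


counts = bin_load_factors([12, 50, 74.9, 75, 100], breakpoints=(50, 75))
```

Because the top edge is 101 rather than 100, a load factor of exactly 100 falls into the last bin instead of being dropped.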

fig_leg_local_share_distribution(by_carrier: bool | str = True, breakpoints: Collection[int] = None, normalize: bool = False, *, raw_df=False, also_df: bool = False) alt.Chart | pd.DataFrame | tuple[alt.Chart, pd.DataFrame][source]

Figure showing the distribution of leg local shares.

The local share is the percentage of passengers on a leg that are local to the leg’s origin and destination (i.e. not connecting).

Parameters:
by_carrier : bool or str, default True

If True, show the distribution by carrier. If a string, show the distribution for that carrier. If False, show the distribution aggregated over all carriers.

breakpoints : Collection[int, ...], default (0, 10, 20, ..., 90, 100)

The breakpoints for the local share ranges, which represent the lowest local share value in each bin. The first and last breakpoints are always bounded to 0 and 101, respectively; these bounds can be included explicitly or omitted to be included implicitly. Setting the top value to 101 ensures that the highest local share value (100) is included in the last bin.

normalize : bool, default False

If True, normalize the frequency by the total number of legs for each carrier, so that the sum of the frequencies for each carrier is 1.

raw_df : bool, default False

If True, return the raw data for this figure as a pandas DataFrame, instead of generating the figure itself.

also_df : bool, default False

If True, return the raw data for this figure as a pandas DataFrame, in addition to the figure itself.

Returns:

alt.Chart or pd.DataFrame or tuple[alt.Chart, pd.DataFrame]
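
The effect of the normalize option can be illustrated without the package. The sketch below assumes each leg has already been assigned to a bin; the sample data and the function name are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (carrier, bin lower bound) observations, one per leg.
leg_bins = [("AL1", 0), ("AL1", 50), ("AL1", 50), ("AL1", 50), ("AL2", 50), ("AL2", 90)]


def frequencies(leg_bins, normalize=False):
    """Count legs per (carrier, bin); optionally scale each carrier to sum to 1."""
    counts = defaultdict(Counter)
    for carrier, bin_lower in leg_bins:
        counts[carrier][bin_lower] += 1
    if not normalize:
        return {c: dict(b) for c, b in counts.items()}
    return {
        c: {bin_lower: n / sum(b.values()) for bin_lower, n in b.items()}
        for c, b in counts.items()
    }


freq = frequencies(leg_bins, normalize=True)
```

Normalizing per carrier makes carriers with very different numbers of legs directly comparable on the same chart.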

fig_leg_load_v_local(*, orig: str | None = None, dest: str | None = None, place: str | None = None, carrier: str | None = None, raw_df: bool = False, also_df: bool = False, facet_columns: int | None = 2, select_leg: bool = False) alt.Chart | pd.DataFrame[source]

Figure showing the relationship between leg load factor and local share.

Parameters:
orig : str or None, default None

Filter the data to only include legs with this origin.

dest : str or None, default None

Filter the data to only include legs with this destination.

place : str or None, default None

Filter the data to only include legs with this origin or destination.

carrier : str or None, default None

Filter the data to only include legs operated by this carrier.

raw_df : bool, default False

If True, return the raw data for this figure as a pandas DataFrame, instead of generating the figure itself.

also_df : bool, default False

If True, return the raw data for this figure as a pandas DataFrame, in addition to the figure itself.

facet_columns : int or None, default 2

The number of columns to use for faceting the plot by carrier. If None, all facets will appear on one row.

select_leg : bool, default False

If True, return an interactive widget that allows the user to select specific legs and view their path_legs. This feature is experimental and may change without notice.

Returns:

alt.Chart or pd.DataFrame

fig_leg_load_v_distance(*, orig: str | None = None, dest: str | None = None, place: str | None = None, carrier: str | None = None, raw_df: bool = False, also_df: bool = False, facet_columns: int | None = 2, beeswarm: int | tuple[int, float] = 0)[source]
leg_detail : pd.DataFrame

Sample- and DCP-level detail for legs. This table can be very large.

classmethod aggregate(summaries: Collection[GenericSimulationTables]) Self

Aggregate multiple summary tables.

property callback_data
property config
classmethod extract(sim: Simulation, items: Collection[str] = ()) Self

Extract summary data from a Simulation.

fig_leg_bid_price_detail_rake(*, leg_id: int, raw_df: bool = False, color: str = '#6a3d9a', mean_color: str | None = '#ff7f00')
fig_leg_bid_price_history(carrier: str, *, measure: 'mean' | 'q10' | 'q25' | 'q50' | 'q75' | 'q90' | 'median', haul_category_labels: tuple[str, ...] | None = ('a. Short: ', 'b. Medium: ', 'c. Long: ', 'd. Longest: '), opacity: float = 0.25, max_rows: int = 5000) alt.Chart
fig_leg_booking_detail_rake(*, leg_id: int, raw_df: bool = False, color: str = 'red')
file_info()

Return information about the file store.

classmethod from_file(filename: str | Path, read_latest: bool = True, lazy: bool = True)

Load the object from a file.

Parameters:
filename : str or Path-like

The filename to load the object from.

read_latest : bool, default True

If True, read the latest file matching the pattern.

lazy : bool, default True

If True, load the data lazily (as needed). Otherwise, load the data immediately.
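
The general idea behind lazy loading can be sketched as follows. This is a generic illustration of loading on first access and caching the result, not the package's storage mechanism; the class and loader names are hypothetical.

```python
class LazyTables:
    """Load each named table on first access, then cache it."""

    def __init__(self, loaders):
        self._loaders = loaders  # name -> zero-argument function
        self._cache = {}

    def __getitem__(self, name):
        if name not in self._cache:
            self._cache[name] = self._loaders[name]()
        return self._cache[name]


calls = []


def load_legs():
    calls.append("legs")  # record each real load
    return [{"leg_id": 101}]


tables = LazyTables({"legs": load_legs})
first = tables["legs"]
second = tables["legs"]
```

Nothing is read until an item is requested, and a second request for the same item reuses the cached object.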

classmethod from_pickle(filename: str | Path, read_latest: bool = True)

Load the object from a pickle file.

Parameters:
filename : str or Path-like

The filename to load the object from.

read_latest : bool, default True

If True, read the latest file matching the pattern.
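
One way to picture "the latest file matching the pattern" is shown below. The sketch assumes that timestamp extensions sort lexicographically in chronological order; the file names, the timestamp format, and the helper find_latest are all hypothetical and may differ from what the package does.

```python
import tempfile
from pathlib import Path


def find_latest(filename):
    """Return the lexicographically last file whose name starts with `filename`."""
    path = Path(filename)
    candidates = sorted(path.parent.glob(path.name + "*"))
    if not candidates:
        raise FileNotFoundError(filename)
    return candidates[-1]


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical timestamped outputs from two separate runs.
    for stamp in ("20240101-120000", "20240301-083000"):
        Path(tmp, f"summary.{stamp}.pkl").touch()
    latest_name = find_latest(Path(tmp, "summary")).name
```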

property leg_bid_price_detail
property leg_booking_detail
metadata(key: str = '')

Return a metadata value.

remove_data(keys: Collection[str] | str) Self

Remove data from the summary tables.

This can be used to reduce the size of the summary tables when saving to a file, or to remove sensitive data before sharing the summary tables.

Parameters:
keys : Collection[str] or str

The key(s) of the data to remove.

Returns:

Self – The summary tables object, with the specified data removed.

run_queries(cnx: Database = None, items: Collection[str] | None = None, *, scenario: str = None, burn_samples: int | None = None) Self

Query summary data from a Database.

The requested items will be queried from the database and stored in this summary object. If a requested item is not available, an exception will be raised.

Parameters:
cnx : Database, optional

Database connection to use for querying.

items : Collection[str], optional

The items to query. If None, or if only “*” is given, then all available items will be queried.

scenario : str, optional

The scenario to use for querying.

burn_samples : int, optional

The number of burn samples to use for querying. If explicitly None, the burn_samples value from the configuration will be used if available, otherwise the default value of 100 will be used.
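
The fallback order described for burn_samples can be written out as a small function. This is a sketch only: the helper name and the attribute path config.simulation_controls.burn_samples are assumptions made for illustration, and the config object here is a simple stand-in.

```python
from types import SimpleNamespace

DEFAULT_BURN_SAMPLES = 100  # fallback value stated in the documentation above


def resolve_burn_samples(burn_samples=None, config=None):
    """Pick the explicit value, then the config value, then the default."""
    if burn_samples is not None:
        return burn_samples
    # The attribute path `config.simulation_controls.burn_samples` is an assumption.
    controls = getattr(config, "simulation_controls", None)
    from_config = getattr(controls, "burn_samples", None)
    return DEFAULT_BURN_SAMPLES if from_config is None else from_config


# Stand-in config object with a burn_samples setting of 50.
cfg = SimpleNamespace(simulation_controls=SimpleNamespace(burn_samples=50))
resolved = (
    resolve_burn_samples(20, cfg),
    resolve_burn_samples(None, cfg),
    resolve_burn_samples(),
)
```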

save(filename: str | Path, *, timestamp: float | struct_time | datetime | None = None, make_dirs: True | False | 'git' = True, cfg: Config | None = None, extra_html: tuple = ()) dict[str, Path]

Save the object to a set of files.

This method will write both an HTML report on this simulation tables object and a “.pxsim” file allowing the content to be restored.

Parameters:
filename : Path-like

The file stem to use for writing files.

timestamp : float or time.struct_time or datetime, optional

The timestamp to use for the filenames. If not provided, the current time will be used.

make_dirs : bool or "git", default True

If True, create the parent directory for the files if it does not already exist. If the directory is created, it will be created with a .gitignore file to prevent accidental inclusion of output in Git repositories, unless the value is “git”, in which case no .gitignore file is created and the results will be eligible for inclusion in Git.

cfg : Config, optional

The configuration to use for the HTML report. If None, the configuration from the simulation object will be used if available.

extra_html : tuple, optional

Additional data to include in the HTML report. This argument is passed to to_html, see that function for more details.

Returns:

dict – A dictionary of filenames written, including the timestamp added.

classmethod subclasses() list[type[GenericSimulationTables]]

Return a list of all concrete subclasses.

User defined subclasses (those not in the passengersim package) are at the front of the list, so they come first in MRO and thus can override native subclasses.
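
The ordering rule can be illustrated with a stable sort on each class's module name. This is a sketch under the assumption that "native" means the module name starts with passengersim; the helper and the stand-in classes are hypothetical.

```python
def order_subclasses(classes, native_package="passengersim"):
    """Stable sort that moves classes defined outside `native_package` to the front."""

    def is_native(cls):
        module = cls.__module__
        return module == native_package or module.startswith(native_package + ".")

    # sorted() is stable and False sorts before True, so user classes come first.
    return sorted(classes, key=is_native)


# Stand-in classes; the module names are set by hand purely for illustration.
Native = type("Native", (), {"__module__": "passengersim.summaries.legs"})
UserDefined = type("UserDefined", (), {"__module__": "my_project.tables"})

ordered = order_subclasses([Native, UserDefined])
```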

to_file(filename: str | Path, add_timestamp_ext: bool = True, *, preserve_config: bool = True, make_dirs: True | False | 'git' = True) Path

Write simulation tables to a file.

Parameters:
filename : Path-like

The file to write.

add_timestamp_ext : bool, default True

Add a timestamp extension to the filename.

preserve_config : bool, default True

Preserve the config attribute in the saved object. This includes the entire network, and can potentially be a lot of data.

make_dirs : bool or "git", default True

If True, create the parent directory for the file if it does not already exist. If the directory is created, it will be created with a .gitignore file to prevent accidental inclusion of output in Git repositories, unless the value is “git”, in which case no .gitignore file is created and the results will be eligible for inclusion in Git.

Returns:

Path-like – The resolved filename for the saved outputs.
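
The make_dirs behavior can be sketched with pathlib. This is an illustration only: the helper name and the content written to the .gitignore file are assumptions, not taken from the package.

```python
import tempfile
from pathlib import Path


def prepare_parent_dir(filename, make_dirs=True):
    """Create the parent directory; add a .gitignore unless make_dirs == "git"."""
    parent = Path(filename).parent
    if not make_dirs or parent.exists():
        return parent
    parent.mkdir(parents=True)
    if make_dirs != "git":
        # Assumed content: ignore everything in the newly created directory.
        (parent / ".gitignore").write_text("*\n")
    return parent


with tempfile.TemporaryDirectory() as tmp:
    ignored = prepare_parent_dir(Path(tmp, "out-a", "tables.pxsim"), make_dirs=True)
    tracked = prepare_parent_dir(Path(tmp, "out-b", "tables.pxsim"), make_dirs="git")
    has_ignore = (ignored / ".gitignore").exists(), (tracked / ".gitignore").exists()
```

Note that the .gitignore file is written only when the directory is newly created; an existing directory is left untouched.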

to_html(filename: str | Path, *, cfg: Config | None = None, make_dirs: bool = True, extra: tuple = (), add_timestamp: bool = True) Path

Write simulation tables report summary to HTML.

Parameters:
filename : Path-like

The html file to write.

cfg : Config, optional

The configuration to use for the report. If None, the configuration from the simulation object will be used.

make_dirs : bool, default True

If True, create any necessary directories.

extra : tuple, optional

Additional data to include in the report. Each item in the tuple should be either a section or subsection title, a tuple of (title, func), or just a function. If a function is provided, it will be called with the summary as its only argument and should return a figure (altair.Chart or xmle.Elem) or a table (pandas.DataFrame). To use a function that requires other arguments, use functools.partial to provide them.

add_timestamp : bool, default True

If True, append a timestamp to the filename. This ensures that each report is unique and does not overwrite previous reports. If False, the filename will be used as-is. Set this to False if you want to overwrite previous reports with the same filename, or if you are already setting the timestamp yourself.

Returns:

Path-like – The resolved filename for the saved outputs.
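
The shape of the extra argument, including the use of functools.partial, can be sketched as below. The report functions and the dict used as a stand-in for the summary object are hypothetical; the loop only mimics how each callable would receive the summary as its single argument.

```python
from functools import partial


def top_rows(summary, n):
    """Hypothetical report function that needs an extra argument `n`."""
    return summary["legs"][:n]


def leg_count(summary):
    """Hypothetical report function that takes only the summary."""
    return len(summary["legs"])


# Shape described above: titles, (title, func) pairs, or bare functions.
extra = (
    "Additional Leg Reports",
    ("First Two Legs", partial(top_rows, n=2)),
    leg_count,
)

# Stand-in for a summary object, used only to show how the callables are invoked.
summary = {"legs": [101, 102, 103]}
results = []
for item in extra:
    func = item[1] if isinstance(item, tuple) else item
    if callable(func):
        results.append(func(summary))
```

Binding n with partial leaves a callable that accepts the summary alone, which is the signature the report expects.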

to_pickle(filename: str | Path, add_timestamp_ext: bool = True, *, preserve_meta_summaries: bool = False, preserve_config: bool = True, make_dirs: True | False | 'git' = True) Path

Save to a pickle file.

This method uses lz4 compression if the lz4.frame module is available.

Parameters:
filename : str or Path-like

The filename to save the object to. An extension may be added or modified, to optionally add a timestamp and/or compression flag.

add_timestamp_ext : bool, default True

Add a timestamp extension to the filename.

preserve_meta_summaries : bool, default False

Preserve the meta_summaries attribute in the saved object.

preserve_config : bool, default True

Preserve the config attribute in the saved object. This includes the entire network, and can potentially be a lot of data.

make_dirs : bool or "git", default True

If True, create the parent directory for the pickle file if it does not already exist. If the directory is created, it will be created with a .gitignore file to prevent accidental inclusion of pickled output in Git repositories, unless the value is “git”, in which case no .gitignore file is created and the results will be eligible for inclusion in Git.

Returns:

Path-like – The resolved filename for the saved outputs.
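
The "compress if lz4 is available" pattern can be sketched as follows. This shows the general technique of an optional import with a fallback, working on bytes in memory; it is not the package's own code, and the helper names are hypothetical.

```python
import pickle

try:
    import lz4.frame as lz4_frame  # optional dependency
except ImportError:
    lz4_frame = None


def dumps(obj):
    """Pickle `obj`, compressing with lz4 when the module is installed."""
    payload = pickle.dumps(obj)
    return lz4_frame.compress(payload) if lz4_frame else payload


def loads(blob):
    """Inverse of `dumps` under the same availability of lz4."""
    payload = lz4_frame.decompress(blob) if lz4_frame else blob
    return pickle.loads(payload)


restored = loads(dumps({"n_total_samples": 300}))
```

A file written with compression can only be read where lz4 is also installed, which is one reason a compression flag in the file extension is useful.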

to_xlsx(filename: str | Path) None

Write simulation tables to Excel.

Parameters:
filename : Path-like

The Excel file to write.

cnx

Database connection for the Simulation run.

sim

Simulation object for the Simulation run.

n_total_samples

Total number of sample departures simulated to create these summaries.

This excludes any burn samples.

meta_summaries

Summaries that were aggregated to create this summary.