Counting Simulations
When we run a PassengerSim simulation, we are simulating the booking and operations of one or more airlines over a number of departure days. We abstract away many of the nuances of the real world, such as seasonality and specific calendar dates, instead just simulating the same schedule repeatedly.
A simulation run consists of a number of independent trials, and each trial is made up of a sequence of dependent samples – earlier samples in a trial are used to develop forecasts and train optimization algorithms used by carriers in later samples of the same trial.
The number of trials is set by the num_trials configuration value, and the number of samples in each
trial is set by num_samples. Both values can be found in the simulation_controls configuration
inputs.
We can think of a sample as a “typical” departure day. When generating results, the first few
samples from each trial are discarded, because they fall within a “burn period” during which the
simulation is getting started and building up enough history to support forecasts and other steps.
The number of samples in the burn period is set by the burn_samples configuration value.
As a general rule, all PassengerSim outputs reflect results collected only from the samples after the burn period. For example, if each trial has 300 samples and a burn period of 50 samples, then all outputs will be based on the 250 samples per trial that come after the burn period. This is important to keep in mind when interpreting results, as the burn period is essentially a “warm-up” period for the simulation and does not reflect the steady-state behavior of the system.
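The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting, not PassengerSim code: the count_samples helper and its return keys are hypothetical names, the choice of 4 trials is an assumed value, and the range check on burn_samples is an assumption of this sketch rather than a documented rule.

```python
def count_samples(num_trials: int, num_samples: int, burn_samples: int) -> dict:
    """Count simulated and retained samples for a run (hypothetical helper)."""
    # Assumption of this sketch: the burn period must leave at least one sample.
    if not 0 <= burn_samples < num_samples:
        raise ValueError("burn_samples must be in the range [0, num_samples)")
    kept_per_trial = num_samples - burn_samples
    return {
        "simulated_total": num_trials * num_samples,  # every sample that is run
        "kept_per_trial": kept_per_trial,             # post-burn samples in one trial
        "kept_total": num_trials * kept_per_trial,    # samples that feed the outputs
    }


# The example from the text (300 samples, 50 burned), with an assumed 4 trials.
counts = count_samples(num_trials=4, num_samples=300, burn_samples=50)
print(counts)
```

Note that every sample, burned or not, still costs runtime; the burn period reduces only the number of samples that contribute to the reported results.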
It is still possible to access data from the burn period samples through the use of custom callbacks and firehose outputs. This is generally not recommended for analysis, as the burn period is not representative of the steady-state behavior of the system, but it can be useful for debugging or other purposes. If you do choose to access burn period data, keep in mind that it may not reflect the same dynamics as the post-burn period samples.
When running a simulation, within each trial everything is simulated serially (i.e., one thing at a
time), because each sample is dependent on the previous samples in the same trial. However, since
the trials themselves are independent, they can be run in parallel. This means that if you have the
computational resources to run multiple trials at the same time, you can significantly reduce the
overall runtime of your simulations by doing so. There are two closely related ways to run
PassengerSim simulations: using the Simulation driver to run everything sequentially in a single
process, or using the MultiSimulation driver to run multiple trials in parallel across multiple
processes. The former is generally easier to set up and debug, while the latter can provide
significant speedups if you have the computational resources to run multiple trials at the same
time. Because PassengerSim uses reproducible random seeds attached to the trial and sample numbers,
the choice of driver affects only the runtime, not the results.
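To illustrate why seeding by trial and sample number makes results independent of how trials are scheduled, here is a toy sketch in plain Python. It does not use PassengerSim or its drivers: run_trial, the seed formula, and the demand model are all hypothetical, and a thread pool stands in for the multiple processes described above only to keep the example self-contained.

```python
import random
from concurrent.futures import ThreadPoolExecutor

NUM_TRIALS = 4      # assumed values, chosen only for illustration
NUM_SAMPLES = 20
BURN_SAMPLES = 5


def run_trial(trial: int) -> float:
    """Run one toy trial serially; each sample depends on the samples before it."""
    forecast = 100.0  # hypothetical starting demand forecast
    kept = []
    for sample in range(NUM_SAMPLES):
        # The seed depends only on (trial, sample), never on timing or worker identity.
        rng = random.Random(trial * 1_000_003 + sample)
        demand = rng.gauss(forecast, 10.0)
        # Later samples in the same trial build on what earlier samples observed.
        forecast = 0.8 * forecast + 0.2 * demand
        if sample >= BURN_SAMPLES:
            kept.append(demand)  # only post-burn samples feed the result
    return sum(kept) / len(kept)


def run_serial() -> list[float]:
    """Run all trials one after another."""
    return [run_trial(t) for t in range(NUM_TRIALS)]


def run_parallel(workers: int = 4) -> list[float]:
    """Run trials concurrently; map() returns results in trial order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_trial, range(NUM_TRIALS)))


serial_results = run_serial()
parallel_results = run_parallel()
print(serial_results == parallel_results)
```

Within a trial the loop must stay sequential because the forecast carries state from sample to sample, while separate trials share no state and can be scheduled in any order.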