Getting Started¶
This page assumes that you have already installed the
epifx
package, and shows how to generate forecasts using an SEIR model.
Simulation parameters¶
Default values for the particle filter parameters are provided:
-
epifx.
default_params
(p_exp, px_scale, model=epifx.SEIR)¶ The default simulation parameters.
Parameters: - p_exp – The daily probability of seeding an exposure event in a naive population.
- px_scale – The ratio of particles to seeded exposures.
- model – The infection model.
Note that epifx
uses an independent pseudo-random number generator (PRNG)
for sampling from model priors and adding stochastic noise to model parameters
and flows (params['epifx']['rnd']
) as distinct from the pypfilt
PRNG
instance (params['rnd']
).
This PNRG can be initialised with a custom seed:
-
epifx.
init_prng
(params, seed)¶ Initialise the
epifx
PRNG instance (params['epifx']['rnd']
).Parameters: - params – The simulation parameters.
- seed – The PRNG seed; see the
numpy.random.RandomState
documentation for details.
The behaviour of epifx 0.1.x
(one common PRNG instance) can be recovered
by setting the following parameter:
params['epifx']['independent_prng'] = False
Temporal forcing¶
The force of infection can be subject to temporal forcing (e.g., by climatic
factors such as absolute humidity, or social factors such as media coverage)
as mediated by the model parameter \(\sigma\) (see Infection models).
This requires the simulation parameters to include a function that maps
datetime.datetime
instances to scalar values:
-
epifx.
daily_forcing
(filename, date_fmt=u'%Y-%m-%d')¶ Return a temporal forcing look-up function, which should be stored in
params['epifx']['forcing']
in order to enable temporal forcing.Parameters: - filename – A file that contains two columns separated by whitespace, the column first being the date and the second being the force of the temporal modulation. Note that the first line of this file is assumed to contain column headings and will be ignored.
- date_fmt – The format in which dates are stored.
Infection models¶
-
class
epifx.
SEIR
¶ An SEIR compartment model for a single circulating influenza strain, under the assumption that recovered individuals are completely protected against reinfection.
\[\begin{split}\frac{dS}{dt} &= - \alpha \cdot S^\eta I \\[0.5em] \frac{dE}{dt} &= \alpha \cdot S^\eta I - \beta \cdot E \\[0.5em] \frac{dI}{dt} &= \beta \cdot E - \gamma I \\[0.5em] \frac{dR}{dt} &= \gamma I\end{split}\]Parameter Meaning \(\alpha\) Force of infection \(\beta\) Incubation period (day -1) \(\gamma\) Infectious period (day -1) \(\eta\) Inhomogeneous social mixing \(\sigma\) Temporal forcing coefficient The force of infection can be subject to temporal forcing \(F(t)\), as mediated by \(\sigma\):
\[\alpha'(t) = \alpha \cdot (1 + \sigma \cdot F(t))\]Note that this requires the forcing function \(F(t)\) to be defined in the simulation parameters (see the Temporal Forcing documentation).
-
static
priors
(params)¶ Return a dictionary of model parameter priors.
Parameters: params – Simulation parameters.
-
static
Observation models¶
Observation models comprise a log-likelihood function (log_llhd
) that
relates observations to particles, and an expectation function (expect
)
that relates particles to observations:
def log_llhd(params, op, obs, pr_indiv, curr, hist):
"""Calculate the log-likelihood of obtaining one or more observations
from each particle.
:param params: The simulation parameters.
:param op: The observation model parameters.
:param obs: The list of observations for the current time-step.
:param pr_indiv: A 2 x N matrix (for N particles) of the probability
of an individual being uninfected and infected, respectively.
:param curr: The particle state vectors.
:param hist: The particle state histories, indexed by observation period.
"""
return ...
def expect(params, op, period, pr_inf, prev, curr):
"""Determine the expected value for a given infection probability.
:param params: The observation model parameters.
:param op: The observation model parameters.
:param period: The duration of the observation period (in days).
:param pr_inf: The probability of an individual becoming infected
during the observation period.
:param prev: The state vectors at the start of the observation period.
:param curr: The state vectors at the end of the observation period.
"""
return ...
Observation models are identified by unit
, and must be stored in the
parameters dictionary as shown below:
params = epifx.default_params(...)
obs_unit = 'Some identifier'
params['obs'][obs_unit] = {
'log_llhd_fn': log_llhd,
'expect_fn': expect,
# Add observation model parameters here.
}
The epifx.obs
module provides generic
observation models for both count data and
fractional data, as well as functions for reading observations from disk.
Forecast summaries¶
-
epifx.summary.
make
(params, all_obs, extra_tbls=None, pkgs=None, **kwargs)¶ A convenience function that collects all of the summary statistics defined in the
pypfilt.summary
andepifx.summary
modules.Parameters: - params – The simulation parameters.
- all_obs – A list of all observations.
- extra_tbls – A list of extra summary statistic tables to include.
- pkgs – A dictionary of python modules whose versions should be
recorded in the simulation metadata. By default, all of the modules
recorded by
pypfilt.summary.metadata
are included, as is theepifx
package itself. - **kwargs – Extra arguments to pass to
pypfilt.summary.HDF5
.
-
class
epifx.summary.
PrOutbreak
(name='pr_epi')¶ Record the daily outbreak probability, defined as the sum of the weights of all particles in which an outbreak has been seeded.
Parameters: name – the name of the table in the output file.
-
class
epifx.summary.
PeakMonitor
¶ Record epidemic peak forecasts, for use with other statistics.
-
peak_size
= None¶ A dictionary that maps observation systems to the size of each particle’s peak with respect to that system:
peak_size[(unit, period)]
.Note that this is only valid for tables to inspect in the
finished()
method, and not in theadd_rows()
method.
-
peak_date
= None¶ A dictionary that maps observation systems to the date of each particle’s peak with respect to that system:
peak_date[(unit, period)]
.Note that this is only valid for tables to inspect in the
finished()
method, and not in theadd_rows()
method.
-
peak_time
= None¶ A dictionary that maps observation systems to the time of each particle’s peak with respect to that system, measured in (fractional) days from the start of the forecasting period:
peak_time[(unit, period)]
.Note that this is only valid for tables to inspect in the
finished()
method, and not in theadd_rows()
method.
-
peak_weight
= None¶ A dictionary that maps observation systems to the weight of each particle at the time that its peak occurs:
peak_weight[(unit, period)]
.Note that this is only valid for tables to inspect in the
finished()
method, and not in theadd_rows()
method.
-
expected_obs
= None¶ The expected observation for each particle for the duration of the current simulation window.
Note that this is only valid for tables to inspect in each call to
add_rows()
, and not in a call tofinished()
.
-
days_to
(date)¶ Convert a date to the (fractional) number of days from the start of the forecasting period.
-
-
class
epifx.summary.
PeakForecastEnsembles
(peak_monitor, name='peak_ensemble')¶ Record the weighted ensemble of peak size and time predictions for each forecasting simulation.
Parameters: - peak_monitor – an instance of
PeakMonitor
. - name – the name of the table in the output file.
- peak_monitor – an instance of
-
class
epifx.summary.
PeakForecastCIs
(peak_monitor, probs=None, name='peak_cints')¶ Record fixed-probability central credible intervals for the peak size and time predictions.
Parameters: - peak_monitor – an instance of
PeakMonitor
. - probs – an array of probabilities that define the size of each
central credible interval.
The default value is
numpy.uint8([0, 50, 90, 95, 99, 100])
. - name – the name of the table in the output file.
- peak_monitor – an instance of
-
class
epifx.summary.
PeakSizeAccuracy
(peak_monitor, name='peak_size_acc', toln=None)¶ Record the accuracy of the peak size predictions against multiple accuracy tolerances.
Parameters: - peak_monitor – an instance of
PeakMonitor
. - name – the name of the table in the output file.
- toln – The accuracy thresholds for peak size predictions, expressed
as percentages of the true size.
The default is
np.array([10, 20, 25, 33])
.
- peak_monitor – an instance of
-
class
epifx.summary.
PeakTimeAccuracy
(peak_monitor, name='peak_time_acc', toln=None)¶ Record the accuracy of the peak time predictions against multiple accuracy tolerances.
Parameters: - peak_monitor – an instance of
PeakMonitor
. - name – the name of the table in the output file.
- toln – The accuracy thresholds for peak time predictions, expressed
as numbers of days. The default is
np.array([7, 10, 14])
.
- peak_monitor – an instance of
-
class
epifx.summary.
ExpectedObs
(peak_monitor, probs=None, name='forecasts')¶ Record fixed-probability central credible intervals for the expected observations.
Parameters: - peak_monitor – an instance of
PeakMonitor
. - probs – an array of probabilities that define the size of each
central credible interval.
The default value is
numpy.uint8([0, 50, 90, 95, 99, 100])
. - name – the name of the table in the output file.
- peak_monitor – an instance of
Generating forecasts¶
import datetime
import epifx
# Simulation parameters
p_exp = 1.0 / 36
px_scale = 100
params = epifx.default_params(p_exp, px_scale)
epifx.init_prng(params, seed=42)
# Simulate from the 1st of May to the 31st of October, 2015.
start = datetime.datetime(2015, 5, 1)
until = start + datetime.timedelta(days=183)
# Load the observations and define the observation model parameters.
obs_list = ...
params['obs'][obs_unit] = {...}
# Generate weekly forecasts for the first 9 weeks (and the start date).
fs_dates = [start + datetime.timedelta(days=week * 7)
for week in range(10)]
# Summary statistics and output file.
summary = epifx.summary.make(params, obs_list))
out = "forecasts.hdf5"
# Generate the forecasts and save them to disk.
pypfilt.forecast(params, start, until, [obs_list], fs_dates, summary, out)
Comparing forecast outputs¶
Output files can be compared for equality, which is useful for ensuring that different systems produce identical results.
-
epifx.cmp.
files
(path1, path2, verbose=True, examine=None)¶ Compare two HDF5 files for identical simulation outputs.
Parameters: - path1 – The filename of the first HDF5 file.
- path2 – The filename of the second HDF5 file.
- verbose – Whether to report successful matches.
- examine – The data groups to examine for equality; the default is to
examine the simulation outputs (
'/data'
) and ignore the associated metadata ('/meta'
).
Returns: True
if the files contain identical simulation outputs, otherwiseFalse
.
This functionality is also provided as a command-line script:
cmp-output --help
cmp-output file1.hdf5 file2.hdf5
Note that simulation outputs have been observed to differ depending on the installed versions of the NumPy and SciPy packages, due to changes in the random number generator.