---------------------------------------------
lines 5-203 of file: example/user/covid_19.py
---------------------------------------------

# {xrst_begin user_covid_19.py}
# {xrst_spell
#       dt
#       rcll
# }
# {xrst_comment_ch #}
#
# Model The Covid-19 Epidemic
# ###########################
#
# Goal
# ****
# Use previous cumulative death data for a location to predict future
# cumulative deaths.
#
# Data File
# *********
# The cumulative death data is the most reliable data available for the
# Covid-19 epidemic. It is also the statistic we are most interested in predicting.
# Our example data has the following columns:
#
# .. list-table::
#       :widths: auto
#
#       * - *day*
#         - day that this row of data corresponds to
#       * - *death*
#         - cumulative covid_19 death as fraction of population
#       * - *mobility*
#         - mobility shifted and scaled to be in [-1, 0.]
#       * - *testing*
#         - testing shifted and scaled to be in [0., 1]
#
# We assume that difference in the cumulative death data are independent; i.e.,
# the amount added to the cumulative death is the actual measurement recorded
# at the end of a time interval.
# For the purpose of this example, we mimic a csv file with a string that
# has this information.
#
# SIR Model
# *********
#
# ODE
# ===
# We use :math:`Q` in place of :math:`S` in the SIR Model
# to avoid confusion with the dismod meaning of :math:`S`.
# The SIR Model for an epidemic is the following differential equations:
#
# .. math::
#
#   \begin{array}{rcll}
#    \dot{Q} & = & - \beta Q I \\
#    \dot{I} & = & + \beta Q I & - ( \gamma + \chi ) I \\
#    \dot{R} & = & + \gamma I
#   \end{array}
#
# .. csv-table::
#       :widths: auto
#
#       :math:`Q(t)`,susceptible fraction of the population
#       :math:`I(t)`,infectious fraction of the population
#       :math:`R(t)`,recovered fraction of the population
#       :math:`\beta(t)`,infectious rate
#       :math:`\gamma(t)`,recovery rate
#       :math:`\chi(t)`,excess mortality rate (for this disease)
#
# Data Model
# ==========
# The model for a measurement of the i-th difference in cumulative death is
#
# .. math::
#
#   d_i = e_i + \int_{a(i)}^{b(i)} \chi(t) I (t) dt
#
# .. csv-table::
#       :widths: auto
#
#       :math:`d_i`,the i-th difference in cumulative death
#       :math:`e_i`,noise in the i-th difference in cumulative death
#       :math:`a_i`,start time for i-th difference in cumulative death
#       :math:`b_i`,end time for i-th difference in cumulative death
#
# Discussion
# ==========
# The model does not include births and deaths due to other causes, but this
# would introduce even more unknowns.
# Another problem with the model above is that it is not identifiable given just
# measurements of cumulative death; i.e., there are two many unknown functions.
# One approach to this problem is to assume we know
# the rates :math:`\gamma(t)`, :math:`\chi(t)` and
# only estimate :math:`\beta(t)` and the initial conditions
# :math:`Q(0)`, :math:`I(0)`, :math:`R(0)`.
# Another problem is that deaths are often under reported; e.g., they might only
# be the deaths in the hospitals or for people who have been confirmed to have the
# disease.
#
# Dismod Model
# ************
# The Dismod model for an epidemic is the following ODE:
#
# .. math::
#
#   \begin{array}{rcll}
#    \dot{S} & = & - \iota S & + \rho            C    \\
#    \dot{C} & = & + \iota S & - ( \rho + \chi ) C
#   \end{array}
#
# together with the following cumulative death model
#
# .. math::
#
#   d_i = e_i + \int_{a(i)}^{b(i)} \chi(t) \frac{C(t)}{S(t) + C(t)} dt
#
# Remission :math:`\rho` is :math:`\gamma` in the SIR model.
# Prevalence :math:`C(t) / [ S(t) + C(t) ]`
# is to :math:`I(t)` in the SIR model.
# Susceptible :math:`S(t)` is :math:`Q(t) + R(t)` in the SIR model.
# Not all the members of S are susceptible to the disease; i.e., the members of R.
# It would be nice to complete the :ref:`wish_list@Immunity`
# wish list item so this would not be necessary.
# However, because of under reporting of deaths, :math:`Q(t)` in the SIR model
# also has members that are not susceptible to the disease.
#
# Data
# ****
# For this example, we get the cumulative death data and covariates from a
# text string version of a CSV file with the following columns:
#
# .. csv-table::
#       :widths: auto
#
#       *day*,days since the first cumulative death value
#       *death*,cumulative death value for this day
#       *mobility*,a social mobility covariate
#       *testing*,level of testing for the disease covariate
#
# The *mobility* covariate has been shifted and scaled to be in the
# interval [-1, 0] with zero corresponding to normal mobility.
# The *testing* covariate has been scaled to the interval [0, 1]
# with zero corresponding to no testing.
#
# Model Variables
# ***************
# There is only one location for this example, so there are no random effects.
# If you had data from multiple locations you could use random effects to improve
# the estimation.
#
# Rates
# =====
# The prior for the value of :math:`\iota(t)` is uniform with
# lower limit 1e-6, upper limit 1.0, and mean 1e-3.
# The mean is only used as a starting point and scaling point for the optimization.
# The prior for the forward difference of :math:`\iota(t)` between grid points is
# log Gaussian with mean zero, standard deviation 0.1 and the offset in the
# log transformation is 1e-5. The corresponds to a difference of 10 percent between
# days being a weighted residual of one.
# The other rates are modeled as the following known constants:
# :math:`\rho = 0.1` and :math:`\chi = 0.01`.
#
# Covariates
# ==========
# There are two covariate multipliers for this example, one for mobility
# and the other for testing. These affect the rate :math:`\iota(t)`.
# They are both have one grid point (are constant in time) and
# have a uniform prior with lower limit -1, upper limit +1, and mean zero.
# It appears that the bounds on the covariate multipliers are active at the
# solution; i.e., more effect would improve the objective; see the file
# :ref:`db2csv_command@variable.csv` generated by the
# ``db2csv`` command after the fit was done.
#
# Predictions
# ***********
# To do predictions for the next week, we could fit a function to the previous
# weeks baseline value of :math:`\iota(t)` (value without the covariates effects).
# This gives us a prediction for the baseline value of :math:`\iota(t)`
# for the next week. We would then include some assumed covariate values for the next
# and predict the differences in cumulative death for each day during the next week.
# The subtitle point here is that the prior for :math:`\iota(t)` is a constant,
# but the posterior is not. Actually :math:`\beta(t)` in the SIR model
# is more likely to be constant and it might be better to fit the :math:`\beta(t)`
# that corresponds to the previous fit for :math:`I(t) = C(t) / [ S(t) + C(t) ]`.
#
# Display Fit
# ***********
# If this variable is true, some of the fit results will be printed and plotted:
# {xrst_spell_off}
# {xrst_code py}
display_fit = True
# {xrst_code}
# {xrst_spell_on}
#
# Source Code
# ***********
# {xrst_literal
#       BEGIN PYTHON
#       END PYTHON
# }
#
# {xrst_end user_covid_19.py}