create_database#
Create a Dismod_at Database#
Prototype#
def create_database(
    file_name,
    age_list,
    time_list,
    integrand_table = list(),
    node_table = list(),
    subgroup_table = list(),
    weight_table = list(),
    covariate_table = list(),
    avgint_table = list(),
    data_table = list(),
    prior_table = list(),
    smooth_table = list(),
    nslist_dict = dict(),
    rate_table = list(),
    mulcov_table = list(),
    option_table = list(),
    rate_eff_cov_table = list(),
) :
Purpose#
This routine makes it easy to create a dismod_at database with all of its input tables.
It is only meant for small examples and test cases and is not efficient.
Primary Key#
For each of the lists above, the order of the elements in the corresponding table is the same as in the corresponding list.
For example, age_list[i] corresponds to the i-th row of the age table, which has Primary Key value age_id = i.
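The correspondence between list position and primary key can be sketched as follows. The age values are made up for illustration, and age_rows is a hypothetical name used only to show the mapping; it is not part of dismod_at.

```python
# Hypothetical age values; any list of float would do.
age_list = [0.0, 5.0, 15.0, 50.0, 100.0]

# Row i of the age table holds age_list[i] and has primary key age_id = i.
age_rows = [{"age_id": i, "age": age} for i, age in enumerate(age_list)]

for row in age_rows:
    print(row["age_id"], row["age"])
```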
Name Column#
The name columns are created with the unique constraint; i.e., it is an error for the same value to appear twice in the column table_name_name of the table table_name.
file_name#
is a str containing the name of the file where the database is stored.
If this file already exists, it is deleted and a new database is created.
age_list#
is a list of float that specifies the age values, which are referenced by their indices in this list.
time_list#
is a list of float that specifies the time values, which are referenced by their indices in this list.
integrand_table#
This is a list of dict that define the rows of the integrand_table.
The dictionary integrand_table[i] has the following:
Key | Value Type | Description |
---|---|---|
name | str | name for the i-th integrand |
minimum_meas_cv | str | minimum measurement cv for this integrand |
The key minimum_meas_cv is optional.
If it is not present, 0.0 is used for the corresponding value.
node_table#
This is a list of dict that define the rows of the node_table.
The dictionary node_table[i] has the following:
Key | Value Type | Description |
---|---|---|
name | str | name for the i-th node |
parent | str | name of parent of the i-th node |
Note that if the i-th node does not have a parent, the empty string should be used for the parent of that node.
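As an illustration of the parent convention, here is a small node_table with made-up node names. The helper check_node_table is hypothetical (it is not part of dismod_at); it only checks that names are unique and that each non-empty parent names some node in the list.

```python
# Hypothetical node names; the root node uses the empty string as its parent.
node_table = [
    {"name": "world", "parent": ""},
    {"name": "north_america", "parent": "world"},
    {"name": "united_states", "parent": "north_america"},
    {"name": "canada", "parent": "north_america"},
]


def check_node_table(table):
    """Return True if names are unique and every non-empty parent is a node name."""
    names = [row["name"] for row in table]
    if len(names) != len(set(names)):
        return False
    return all(row["parent"] == "" or row["parent"] in names for row in table)


print(check_node_table(node_table))
```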
subgroup_table#
This is a list of dict that define the rows of the subgroup_table.
The dictionary subgroup_table[i] has the following:
Key | Value Type | Description |
---|---|---|
subgroup | str | name for the i-th subgroup |
group | str | name of group that subgroup is in |
Backward Compatibility#
To get backward compatibility to before the subgroup information was added, add the following table to the create_database call (just after the node_table):
subgroup_table = [ { 'subgroup':'world', 'group':'world' } ]
No other changes to the create_database call should be necessary (for backward compatibility).
weight_table#
This is a list of dict that define the rows of the weight_table and weight_grid_table.
The dictionary weight_table[i] has the following:
Key | Value Type | Description |
---|---|---|
name | str | name of i-th weighting |
age_id | list of int | indices for age grid |
time_id | list of int | indices for time grid |
fun | function | w = fun(a, t) |
The float w is the value of this weighting at the corresponding float age a and float time t. Note that there is an i, j such that a = age_list[age_id[i]] and t = time_list[time_id[j]].
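A minimal sketch of one weight_table entry follows. The age and time values, the weighting name, and the function fun_linear_age are all made up for illustration. The final loop only mimics how fun would be evaluated on the grid; it is an assumption about usage, not the dismod_at implementation.

```python
age_list = [0.0, 50.0, 100.0]
time_list = [1990.0, 2000.0, 2010.0]


def fun_linear_age(a, t):
    # Hypothetical weighting: increases linearly with age, constant in time.
    return 1.0 + a / 100.0


weight_table = [
    {
        "name": "linear_age",
        "age_id": [0, 2],   # indices into age_list
        "time_id": [0, 2],  # indices into time_list
        "fun": fun_linear_age,
    }
]

# Evaluate fun at every (age, time) point of the grid for this weighting.
entry = weight_table[0]
grid = {
    (i, j): entry["fun"](age_list[i], time_list[j])
    for i in entry["age_id"]
    for j in entry["time_id"]
}
```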
covariate_table#
This is a list of dict that define the rows of the covariate_table.
The dictionary covariate_table[i] has the following:
Key | Value Type | Description |
---|---|---|
name | str | name for the i-th covariate |
reference | float | reference value for i-th covariate |
max_difference | float | maximum difference for i-th covariate |
If max_difference is None, the corresponding table entry is null and this corresponds to an infinite maximum difference.
If max_difference does not appear, null is written for the corresponding covariate entry.
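The two ways of getting a null max_difference can be sketched as follows. The covariate names and values are made up, and the list stored is a hypothetical stand-in for what would be written to the table, with None representing null.

```python
covariate_table = [
    {"name": "sex", "reference": 0.0, "max_difference": 0.6},
    {"name": "income", "reference": 2000.0, "max_difference": None},
    {"name": "bmi", "reference": 25.0},  # max_difference key omitted
]

# None stands for a null entry; an omitted key is treated the same way.
stored = [row.get("max_difference") for row in covariate_table]
print(stored)
```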
avgint_table#
This is a list of dict that define the rows of the avgint_table.
The dictionary avgint_table[i] has the following:
Key | Value Type | Description |
---|---|---|
integrand | str | integrand for i-th data |
node | str | name of node in graph |
subgroup | str | name of subgroup |
weight | str | weighting function name |
age_lower | float | lower age limit |
age_upper | float | upper age limit |
time_lower | float | lower time limit |
time_upper | float | upper time limit |
c_0 | float | value of first covariate |
… | … | … |
c_J | float | value of last covariate |
subgroup#
If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.
weight#
The weighting function name identifies an entry in the weight_table by its name. If weight is the empty string, the constant weighting is used.
covariates#
Note that J = len(covariate_table) - 1 and, for j = 0, …, J,
c_j = covariate_table[j]['name']
We refer to the columns above as the required columns for avgint_table.
avgint_extra_columns#
If a row of option_table has row['name'] equal to 'avgint_extra_columns', the corresponding row['value'].split() is the list of extra avgint table columns.
Otherwise the list of extra avgint table columns is empty.
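The following sketch puts the pieces of an avgint_table row together: the required columns, one key per covariate name, and the extra columns named in option_table. The helper extra_columns is hypothetical (not part of dismod_at), and the covariate names, extra column names, and all values are made up for illustration.

```python
covariate_table = [
    {"name": "sex", "reference": 0.0},
    {"name": "income", "reference": 2000.0},
]
option_table = [
    {"name": "avgint_extra_columns", "value": "location_name year_label"},
]


def extra_columns(option_table, option_name):
    """Return row['value'].split() for the matching option, else an empty list."""
    for row in option_table:
        if row["name"] == option_name:
            return row["value"].split()
    return []


# c_j = covariate_table[j]['name'] for j = 0, ..., J
covariate_keys = [row["name"] for row in covariate_table]
avgint_extra = extra_columns(option_table, "avgint_extra_columns")

avgint_table = [
    {
        "integrand": "prevalence",
        "node": "world",
        "subgroup": "world",
        "weight": "",  # empty string selects the constant weighting
        "age_lower": 0.0,
        "age_upper": 10.0,
        "time_lower": 1990.0,
        "time_upper": 2000.0,
        "sex": 0.5,          # value for covariate c_0
        "income": 1000.0,    # value for covariate c_1
        "location_name": "Earth",  # extra column (hypothetical)
        "year_label": "1990s",     # extra column (hypothetical)
    }
]
```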
data_table#
This is a list of dict that define the rows of the data_table.
It has all the columns required for the avgint_table.
In addition, the dictionary data_table[i] has the following:
Key | Value Type | Description |
---|---|---|
hold_out | bool | hold out flag |
density | str | |
meas_value | float | measured value |
meas_std | float | standard deviation |
eta | float | offset in log-transform |
nu | float | Student’s-t degrees of freedom |
sample_size | int | sample size for a binomial distribution |
meas_std, eta, nu, sample_size#
The column keys meas_std, eta, nu, and sample_size are optional. If they are not present, the value null is used for the corresponding row of the data table.
subgroup#
If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.
data_extra_columns#
If a row of option_table has row['name'] equal to 'data_extra_columns', the corresponding row['value'].split() is the list of extra data table columns.
Otherwise the list of extra data table columns is empty.
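A data_table row, then, is an avgint_table row plus the measurement keys. The sketch below uses made-up values and no covariates, and the dictionary filled is a hypothetical illustration of how omitted optional keys end up as null (None here); it is not the dismod_at implementation.

```python
data_row = {
    # columns required for the avgint_table (no covariates in this sketch)
    "integrand": "prevalence",
    "node": "world",
    "subgroup": "world",
    "weight": "",
    "age_lower": 40.0,
    "age_upper": 50.0,
    "time_lower": 1995.0,
    "time_upper": 2005.0,
    # columns specific to the data_table
    "hold_out": False,
    "density": "gaussian",
    "meas_value": 0.05,
    "meas_std": 0.005,
    # eta, nu, and sample_size are omitted on purpose
}
data_table = [data_row]

# Optional keys that are absent correspond to null in the data table.
optional_keys = ["meas_std", "eta", "nu", "sample_size"]
filled = {key: data_row.get(key) for key in optional_keys}
print(filled)
```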
prior_table#
This is a list of dict that define the rows of the prior_table.
The dictionary prior_table[i] has the following:
Key | Value Type | Description |
---|---|---|
name | str | name of i-th prior |
lower | float | lower limit |
upper | float | upper limit |
std | float | standard deviation |
density | str | |
eta | float | offset in log densities |
nu | float | degrees of freedom in Student densities |
The column keys lower, upper, std, eta, and nu are optional. If they are not present, the value null is used for the corresponding row of the prior table.
smooth_table#
This is a list of dict that define the rows of the smooth_table and smooth_grid_table.
The dictionary smooth_table[i] has the following keys:
name#
a str specifying the name used to reference the i-th smoothing.
age_id#
a list of int specifying the age values for this smoothing as indices in age_list.
time_id#
a list of int specifying the time values for this smoothing as indices in time_list.
mulstd_value_prior_name#
a str specifying the prior used for the value multiplier for the i-th smoothing; see mulstd_value_prior_id.
This key is optional and its default value is None, which corresponds to null in the database.
mulstd_dage_prior_name#
a str specifying the prior used for the age difference multiplier for the i-th smoothing; see mulstd_dage_prior_id.
This key is optional and its default value is None, which corresponds to null in the database.
mulstd_dtime_prior_name#
a str specifying the prior used for the time difference multiplier for the i-th smoothing; see mulstd_dtime_prior_id.
This key is optional and its default value is None, which corresponds to null in the database.
fun#
This is a function with the following syntax:
(v, da, dt) = fun(a, t)
The str results v, da, and dt are the names for the value prior, age difference prior, and time difference prior corresponding to the i-th smoothing.
The value da is not used when age a = age_id[-1].
The value dt is not used when time t = time_id[-1].
Note that there is an i, j such that a = age_list[age_id[i]] and t = time_list[time_id[j]].
const_value#
The fun return value v may be a float.
In this case, the value of the smoothing, at the corresponding age and time, is constrained to be v using the const_value column in the smooth_grid table.
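The two kinds of fun return value can be sketched as follows. The prior names, smoothing names, and grid values are made up for illustration, and grid_points is a hypothetical helper (not part of dismod_at) that only mimics how fun would be called on the grid and how a float v would be told apart from a prior name.

```python
age_list = [0.0, 50.0, 100.0]
time_list = [1990.0, 2010.0]


def fun_prior(a, t):
    # All three results are prior names (hypothetical names).
    return ("prior_value", "prior_dage", "prior_dtime")


def fun_const_at_zero(a, t):
    # At age zero the value is constrained to the float 0.0;
    # elsewhere a prior name is returned for the value.
    v = 0.0 if a == 0.0 else "prior_value"
    return (v, "prior_dage", "prior_dtime")


smooth_table = [
    {"name": "smooth_prior", "age_id": [0, 1, 2], "time_id": [0, 1], "fun": fun_prior},
    {"name": "smooth_const", "age_id": [0, 2], "time_id": [0], "fun": fun_const_at_zero},
]


def grid_points(entry):
    """Call fun at each grid point and label v as a constant or a prior name."""
    points = []
    for i in entry["age_id"]:
        for j in entry["time_id"]:
            v, da, dt = entry["fun"](age_list[i], time_list[j])
            kind = "const_value" if isinstance(v, float) else "value_prior"
            points.append((i, j, kind, v))
    return points
```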
nslist_dict#
This is a dict that specifies the nslist_table and the nslist_pair_table.
For each nslist_name,
nslist_dict[nslist_name] = [ (node_name, smooth_name), … ]
Note that each pair above is a Python tuple:
Variable | Value Type | Description |
---|---|---|
nslist_name | str | name of one list of node,smoothing pairs |
node_name | str | name of the node for this pair |
smooth_name | str | name of the smoothing for this pair |
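A small nslist_dict might look like the following. The list name, node names, and smoothing names are made up, and the flattened list pairs is only a hypothetical view of the (nslist_name, node_name, smooth_name) triples that the dictionary encodes.

```python
nslist_dict = {
    "child_iota_list": [
        ("united_states", "smooth_iota_us"),
        ("canada", "smooth_iota_canada"),
    ],
}

# One triple per (node_name, smooth_name) tuple in each named list.
pairs = [
    (nslist_name, node_name, smooth_name)
    for nslist_name, pair_list in nslist_dict.items()
    for node_name, smooth_name in pair_list
]
print(pairs)
```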
rate_table#
This is a list of dict that define the rows of the rate_table.
The dictionary rate_table[i] has the following:
Key | Value Type | Description |
---|---|---|
name | str | pini, iota, rho, chi, or omega |
parent_smooth | str | parent smoothing |
child_smooth | str | a single child smoothing |
child_nslist | str | list of child smoothings |
The value None is used to represent a null value for the parent and child smoothings.
If a key name does not appear, null is used for the corresponding value.
If a name, e.g. rho, does not appear, the value null is used for the parent and child smoothings for the corresponding rate.
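The defaulting rules for rate_table can be sketched as follows. The smoothing and nslist names are made up, and the dictionary filled is a hypothetical illustration of the resulting null entries (None here) for missing keys and missing rates; it is not the dismod_at implementation.

```python
rate_table = [
    {
        "name": "iota",
        "parent_smooth": "smooth_iota_parent",
        "child_smooth": "smooth_iota_child",
    },
    {
        "name": "omega",
        "parent_smooth": "smooth_omega_parent",
        "child_nslist": "child_omega_list",
    },
]

rate_names = ["pini", "iota", "rho", "chi", "omega"]
smooth_keys = ["parent_smooth", "child_smooth", "child_nslist"]

# Missing rates and missing keys both end up as None (null).
by_name = {row["name"]: row for row in rate_table}
filled = {
    rate: {key: by_name.get(rate, {}).get(key) for key in smooth_keys}
    for rate in rate_names
}
```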
mulcov_table#
This is a list of dict that define the rows of the mulcov_table.
The dictionary mulcov_table[i] has the following:
Key | Value Type | Description |
---|---|---|
covariate | str | is the covariate column |
type | str | |
effected | str | integrand or rate affected |
group | str | the group that is affected |
smooth | str | smoothing at group level |
subsmooth | str | smoothing at subgroup level |
effected#
If type is rate_value, effected is a rate.
Otherwise it is an integrand.
group#
If the group key is not present, the first group in subgroup_table is used.
subsmooth#
If the subsmooth key is not present, the value null is used for the subgroup smoothing in the corresponding row and a warning is printed.
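As an illustration of how type determines what effected refers to, here is a sketch with two rows. The covariate, group, and smoothing names are made up, and effected_kind is a hypothetical helper that is not part of dismod_at.

```python
mulcov_table = [
    {
        "covariate": "income",
        "type": "rate_value",
        "effected": "iota",        # a rate, because type is rate_value
        "group": "world",
        "smooth": "smooth_mulcov_income",
        "subsmooth": "smooth_mulcov_income_sub",
    },
    {
        "covariate": "sex",
        "type": "meas_value",
        "effected": "prevalence",  # an integrand, because type is not rate_value
        "group": "world",
        "smooth": "smooth_mulcov_sex",
        "subsmooth": "smooth_mulcov_sex_sub",
    },
]


def effected_kind(row):
    """Return 'rate' when type is rate_value, otherwise 'integrand'."""
    return "rate" if row["type"] == "rate_value" else "integrand"


for row in mulcov_table:
    print(row["effected"], effected_kind(row))
```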
option_table#
This is a list of dict that define the values option_name, option_value in the option table.
The i-th row of the table will have
option_name = option_table[i]['name']
option_value = option_table[i]['value']
rate_eff_cov_table#
This is a list of dict that define the rows of the rate_eff_cov_table.
The dictionary rate_eff_cov_table[i] has the following:
Key | Value Type | Description |
---|---|---|
 | str | identifies the node for the i-th row |
 | str | identifies the covariate for the i-th row |
 | float | value of the splitting covariate |
 | str | identifies weighting for this row |
Contents#
Name |
Title |
---|---|
create_database.py |
Example#
The file create_database.py contains an example and test of create_database.