create_database¶
Create a Dismod_at Database¶
Prototype¶
def create_database(
    file_name,
    age_list,
    time_list,
    integrand_table = list(),
    node_table = list(),
    subgroup_table = list(),
    weight_table = list(),
    covariate_table = list(),
    avgint_table = list(),
    data_table = list(),
    prior_table = list(),
    smooth_table = list(),
    nslist_dict = dict(),
    rate_table = list(),
    mulcov_table = list(),
    option_table = list(),
    rate_eff_cov_table = list(),
) :
Purpose¶
This routine makes it easy to create a dismod_at database
with all of its input tables.
This is only meant for small example and testing cases and is not efficient.
Primary Key¶
For each of the lists above, the order of the
elements in the corresponding table is the same as the corresponding list.
For example, age_list [ i ] corresponds to the i-th row
of the age table which has
Primary Key value age_id = i .
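As a minimal sketch of this convention (the age values are made up for illustration), the position of each element in a list is the primary key of the corresponding row:

```python
# Hypothetical age grid; the values are made up for illustration.
age_list = [0.0, 50.0, 100.0]

# Row i of the age table holds age_list[i], so age_id is the list index.
age_rows = [{"age_id": i, "age": age} for i, age in enumerate(age_list)]

for row in age_rows:
    print(row["age_id"], row["age"])
```

The same indexing applies to every list argument, which is why other arguments can refer to ages and times by integer index.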
Name Column¶
The name columns are created with the unique
constraint; i.e., it is an error for the same value to appear
twice in the column table_name _ name of the table
table_name .
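Because a repeated name is an error, it can help to check a table before it is passed in. The helper below is a hypothetical pre-check written for this page, not part of dismod_at; the node names are also made up:

```python
from collections import Counter

def duplicate_names(table, key="name"):
    """Return the sorted values that appear more than once under key."""
    counts = Counter(row[key] for row in table)
    return sorted(name for name, n in counts.items() if n > 1)

# Hypothetical node table in which 'world' is listed twice by mistake.
node_table = [
    {"name": "world", "parent": ""},
    {"name": "north_america", "parent": "world"},
    {"name": "world", "parent": ""},
]

print(duplicate_names(node_table))
```

An empty result means the names in that table are unique.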
file_name¶
is a str containing the name of the file where the database
is stored.
If this file already exists, it is deleted and a new database is created.
age_list¶
is a list of float that
specify age values by indices.
time_list¶
is a list of float that
specify time values by indices.
integrand_table¶
This is a list of dict
that define the rows of the integrand_table .
The dictionary integrand_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| name | str | name for the i-th integrand |
| minimum_meas_cv | str | minimum measurement cv for this integrand |
The key minimum_meas_cv is optional.
If it is not present, 0.0 is used for the corresponding value.
node_table¶
This is a list of dict
that define the rows of the node_table .
The dictionary node_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| name | str | name for the i-th node |
| parent | str | name of parent of the i-th node |
Note that if the i-th node does not have a parent, the empty string should be used for the parent of that node.
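A small node table might look like the sketch below. The node names are made up, and the lookup of parent indices only illustrates how parent names relate to list positions; it is not necessarily how create_database resolves them internally:

```python
# Hypothetical node hierarchy; the root uses the empty string as its parent.
node_table = [
    {"name": "world", "parent": ""},
    {"name": "north_america", "parent": "world"},
    {"name": "united_states", "parent": "north_america"},
    {"name": "canada", "parent": "north_america"},
]

# Map each node name to its position in the list (its node_id).
node_id = {row["name"]: i for i, row in enumerate(node_table)}

# Index of each node's parent, or None for a node without a parent.
parent_id = [
    node_id[row["parent"]] if row["parent"] else None
    for row in node_table
]
print(parent_id)
```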
subgroup_table¶
This is a list of dict
that define the rows of the subgroup_table .
The dictionary subgroup_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| subgroup | str | name for the i-th subgroup |
| group | str | name of group that subgroup is in |
Backward Compatibility¶
To get backward compatibility to before the subgroup information was added,
add the following table to the create_database call
(just after the node_table ):
subgroup_table = [ { 'subgroup':'world', 'group':'world' } ]
No other changes to the create_database call should be necessary
(for backward compatibility).
weight_table¶
This is a list of dict
that define the rows of the weight_table and
weight_grid_table .
The dictionary weight_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| name | str | name of i-th weighting |
| age_id | list of int | indices for age grid |
| time_id | list of int | indices for time grid |
| fun | function | w = fun ( a , t ) |
The float w is the value of this weighting at the corresponding float age a and float time t . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .
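The following sketch shows one weighting entry and the grid points at which its function would be evaluated. The grids, the weighting name, and fun_linear are hypothetical and chosen only for illustration:

```python
# Hypothetical age and time grids.
age_list = [0.0, 50.0, 100.0]
time_list = [1990.0, 2000.0, 2010.0]

def fun_linear(a, t):
    """Hypothetical weighting that increases linearly with age."""
    return 1.0 + a / 100.0

weight_table = [
    {"name": "linear_in_age", "age_id": [0, 2], "time_id": [1], "fun": fun_linear},
]

# Evaluate the weighting at every (age, time) pair selected by the indices.
entry = weight_table[0]
grid_values = {
    (i, j): entry["fun"](age_list[i], time_list[j])
    for i in entry["age_id"]
    for j in entry["time_id"]
}
print(grid_values)
```

Here age_id and time_id hold indices into age_list and time_list, while fun receives the float age and time at those indices.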
covariate_table¶
This is a list of dict
that define the rows of the covariate_table .
The dictionary covariate_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| name | str | name for the i-th covariate |
| reference | float | reference value for i-th covariate |
| max_difference | float | maximum difference for i-th covariate |
If max_difference is None , the corresponding table entry
is null and this corresponds to an infinite maximum difference.
If max_difference does not appear, null is written for the
corresponding covariate entry.
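The three cases for max_difference (a value, None, and an absent key) can be sketched as follows. The covariate names and values are hypothetical, and dict.get is used here only to mimic the described mapping to null:

```python
# Hypothetical covariates showing the three ways to specify max_difference.
covariate_table = [
    {"name": "sex", "reference": 0.0, "max_difference": 0.6},
    {"name": "income", "reference": 2000.0, "max_difference": None},
    {"name": "bmi", "reference": 25.0},
]

# None stands for null here; both the None entry and the absent key map to it.
max_difference = [row.get("max_difference") for row in covariate_table]
print(max_difference)
```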
avgint_table¶
This is a list of dict
that define the rows of the avgint_table .
The dictionary avgint_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| integrand | str | integrand for i-th data |
| node | str | name of node in graph |
| subgroup | str | name of subgroup |
| weight | str | weighting function name |
| age_lower | float | lower age limit |
| age_upper | float | upper age limit |
| time_lower | float | lower time limit |
| time_upper | float | upper time limit |
| c_0 | float | value of first covariate |
| … | … | … |
| c_J | float | value of last covariate |
subgroup¶
If the subgroup key is not present, the first subgroup in
subgroup_table is used
and a warning is printed.
weight¶
The weighting function name identifies an entry in the weight_table by its name . If weight is the empty string, the constant weighting is used.
covariates¶
Note that J = len ( covariate_table ) - 1 and for
j = 0 , … , J ,
c_j = covariate_table [ j ][ 'name' ]
We refer to the columns above as the required columns for avgint_table .
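As an illustration, the row below uses each covariate name as a key alongside the fixed keys. All names and values are hypothetical, and the check for missing covariate keys is only a convenience written for this sketch:

```python
# Hypothetical covariates; their names become keys in each avgint row.
covariate_table = [
    {"name": "sex", "reference": 0.0},
    {"name": "income", "reference": 2000.0},
]

avgint_row = {
    "integrand": "prevalence",
    "node": "world",
    "subgroup": "world",
    "weight": "",            # empty string selects the constant weighting
    "age_lower": 40.0,
    "age_upper": 50.0,
    "time_lower": 2000.0,
    "time_upper": 2000.0,
    "sex": 0.5,              # c_0
    "income": 1000.0,        # c_1
}

# Covariate keys c_0, ..., c_J and any that the row fails to provide.
covariate_keys = [row["name"] for row in covariate_table]
missing = [key for key in covariate_keys if key not in avgint_row]
print(covariate_keys, missing)
```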
avgint_extra_columns¶
If a row of option_table has row [ 'name' ]
equal to 'avgint_extra_columns' , the corresponding
row [ 'value' ]. split () is the list of extra avgint table columns.
Otherwise the list of extra avgint table columns is empty.
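The rule above can be sketched in a few lines. The extra column names site_id and source are made up for illustration:

```python
# Hypothetical option table with two extra avgint columns.
option_table = [
    {"name": "parent_node_name", "value": "world"},
    {"name": "avgint_extra_columns", "value": "site_id  source"},
]

# Default is an empty list; split() separates names on any whitespace.
avgint_extra_columns = []
for row in option_table:
    if row["name"] == "avgint_extra_columns":
        avgint_extra_columns = row["value"].split()

print(avgint_extra_columns)
```

Because split() is called without arguments, repeated spaces between column names are harmless.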
data_table¶
This is a list of dict
that define the rows of the data_table .
It has all the columns required for the avgint_table .
In addition, the dictionary data_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| hold_out | bool | hold out flag |
| density | str |  |
| meas_value | float | measured value |
| meas_std | float | standard deviation |
| eta | float | offset in log-transform |
| nu | float | Student’s-t degrees of freedom |
| sample_size | int | sample size for a binomial distribution |
meas_std, eta, nu, sample_size¶
The column keys meas_std , eta , nu , and sample_size
are optional. If they are not present, the value null is used
for the corresponding row of the data table.
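A data row with a Gaussian density, for example, can leave out eta, nu, and sample_size. The values below are hypothetical, and dict.get with its default of None is used only to mimic the null that would be stored:

```python
# Hypothetical data row; eta, nu, and sample_size are left out.
data_row = {
    "integrand": "prevalence",
    "node": "world",
    "subgroup": "world",
    "weight": "",
    "age_lower": 40.0,
    "age_upper": 50.0,
    "time_lower": 2000.0,
    "time_upper": 2000.0,
    "hold_out": False,
    "density": "gaussian",
    "meas_value": 0.05,
    "meas_std": 0.005,
}

# None stands for the null value used when an optional key is absent.
optional_keys = ["meas_std", "eta", "nu", "sample_size"]
stored = {key: data_row.get(key) for key in optional_keys}
print(stored)
```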
subgroup¶
If the subgroup key is not present, the first subgroup in
subgroup_table is used
and a warning is printed.
data_extra_columns¶
If a row of option_table has row [ 'name' ]
equal to 'data_extra_columns' , the corresponding
row [ 'value' ]. split () is the list of extra data table columns.
Otherwise the list of extra data table columns is empty.
prior_table¶
This is a list of dict
that define the rows of the prior_table .
The dictionary prior_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| name | str | name of i-th prior |
| lower | float | lower limit |
| upper | float | upper limit |
| std | float | standard deviation |
| density | str |  |
| eta | float | offset in log densities |
| nu | float | degrees of freedom in Student’s-t densities |
The column keys
lower , upper , std , eta , and nu
are optional. If they are not present, the value null is used
for the corresponding row of the prior table.
smooth_table¶
This is a list of dict
that define the rows of the smooth_table and
smooth_grid_table .
The dictionary smooth_table [ i ] has the following keys:
name¶
a str specifying the name used to reference the i-th smoothing.
age_id¶
a list of int specifying the age values for this smoothing
as indices in age_list .
time_id¶
a list of int specifying the time values for this smoothing
as indices in time_list .
mulstd_value_prior_name¶
a str specifying the prior used for the value multiplier
for the i-th smoothing; see
mulstd_value_prior_id
This key is optional and its default value is None which corresponds
to null in the database.
mulstd_dage_prior_name¶
a str specifying the prior used for the age difference multiplier
for the i-th smoothing; see
mulstd_dage_prior_id
This key is optional and its default value is None which corresponds
to null in the database.
mulstd_dtime_prior_name¶
a str specifying the prior used for the time difference multiplier
for the i-th smoothing; see
mulstd_dtime_prior_id
This key is optional and its default value is None which corresponds
to null in the database.
fun¶
This is a function with the following syntax:
( v , da , dt ) = fun ( a , t )
The str results v , da , and dt
are the names for the value prior, age difference prior,
and time difference prior corresponding to the i-th smoothing.
The value da is not used
when age a = age_list [ age_id [ -1 ]] .
The value dt is not used
when time t = time_list [ time_id [ -1 ]] .
Note that there is an i , j such that
a = age_list [ age_id [ i ]] and
t = time_list [ time_id [ j ]] .
const_value¶
The fun return value v may be a float .
In this case, the value of the smoothing, at the corresponding age and time,
is constrained to be v using the
const_value column in the
smooth_grid table.
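The sketch below shows a smoothing whose function returns a float value at age zero and prior names elsewhere. The grids, the smoothing name, and the prior names are hypothetical (the prior names would have to appear in prior_table), and the loop only illustrates which grid points end up constrained:

```python
# Hypothetical age and time grids.
age_list = [0.0, 50.0, 100.0]
time_list = [1990.0, 2010.0]

def fun_iota(a, t):
    """Return (v, da, dt); v is a float at age zero, a prior name otherwise."""
    if a == 0.0:
        return (0.0, "prior_dage", "prior_dtime")
    return ("prior_value", "prior_dage", "prior_dtime")

smooth_table = [
    {"name": "smooth_iota", "age_id": [0, 1, 2], "time_id": [0, 1], "fun": fun_iota},
]

# Classify each grid point by whether v is a constant or a prior name.
entry = smooth_table[0]
grid = {}
for i in entry["age_id"]:
    for j in entry["time_id"]:
        v, da, dt = entry["fun"](age_list[i], time_list[j])
        grid[(i, j)] = "const_value" if isinstance(v, float) else "value_prior"

print(grid)
```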
nslist_dict¶
This is a dict that specifies the
nslist_table and the nslist_pair_table .
For each nslist_name ,
nslist_dict [ nslist_name ] = [ ( node_name , smooth_name ), … ]
Note that each pair above is a python tuple :
| Variable | Value Type | Description |
|---|---|---|
| nslist_name | str | name of one list of node,smoothing pairs |
| node_name | str | name of the node for this pair |
| smooth_name | str | name of the smoothing for this pair |
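For instance, a dictionary with a single list of pairs could be written as below. The list name, node names, and smoothing names are hypothetical:

```python
# Hypothetical nslist_dict with one list of (node_name, smooth_name) tuples.
nslist_dict = {
    "child_iota_list": [
        ("united_states", "smooth_iota_us"),
        ("canada", "smooth_iota_canada"),
    ],
}

# Walk the structure the same way the description above reads it.
for nslist_name, pairs in nslist_dict.items():
    for node_name, smooth_name in pairs:
        print(nslist_name, node_name, smooth_name)
```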
rate_table¶
This is a list of dict
that define the rows of the rate_table .
The dictionary rate_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| name | str | pini, iota, rho, chi, or omega |
| parent_smooth | str | parent smoothing |
| child_smooth | str | a single child smoothing |
| child_nslist | str | list of child smoothings |
The value None is used to represent a null value for
the parent and child smoothings.
If a key name does not appear, null is used for the corresponding value.
If a rate name (e.g., rho ) does not appear, the value
null is used for the parent and child smoothings for the corresponding rate.
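The defaults described above can be made explicit with a short sketch. The smoothing and nslist names are hypothetical, and the expansion into five full rows is written for illustration only; it is not necessarily how create_database builds the table:

```python
# Hypothetical rate table that mentions only iota and omega.
rate_table = [
    {"name": "iota", "parent_smooth": "smooth_iota_parent",
     "child_smooth": "smooth_iota_child"},
    {"name": "omega", "parent_smooth": "smooth_omega_parent",
     "child_nslist": "child_omega_list"},
]

rate_names = ["pini", "iota", "rho", "chi", "omega"]
smooth_keys = ["parent_smooth", "child_smooth", "child_nslist"]

# Fill in None (null) for absent keys and for rates that are not listed.
by_name = {row["name"]: row for row in rate_table}
expanded = []
for name in rate_names:
    row = by_name.get(name, {})
    expanded.append({"name": name, **{key: row.get(key) for key in smooth_keys}})

for row in expanded:
    print(row)
```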
mulcov_table¶
This is a list of dict
that define the rows of the mulcov_table .
The dictionary mulcov_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
| covariate | str | is the covariate column |
| type | str |  |
| effected | str | integrand or rate affected |
| group | str | the group that is affected |
| smooth | str | smoothing at group level |
| subsmooth | str | smoothing at subgroup level |
effected¶
If type is rate_value , effected is a rate.
Otherwise it is an integrand.
group¶
If the group key is not present, the first group in
subgroup_table is used.
subsmooth¶
If the subsmooth key is not present, the value null is used for
the subgroup smoothing in the corresponding row and a warning is printed.
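The two rows below sketch a rate covariate multiplier and a measurement covariate multiplier. The covariate, group, and smoothing names are hypothetical, and setting subsmooth to None explicitly is assumed here to be written as null in the same way as other None values in this interface:

```python
# Hypothetical covariate multipliers; subsmooth is given explicitly as None.
mulcov_table = [
    {"covariate": "income", "type": "rate_value", "effected": "iota",
     "group": "world", "smooth": "smooth_mulcov_income", "subsmooth": None},
    {"covariate": "sex", "type": "meas_value", "effected": "prevalence",
     "group": "world", "smooth": "smooth_mulcov_sex", "subsmooth": None},
]

# effected names a rate when type is rate_value, and an integrand otherwise.
effected_kind = [
    "rate" if row["type"] == "rate_value" else "integrand"
    for row in mulcov_table
]
print(effected_kind)
```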
option_table¶
This is a list of dict
that define the values
option_name ,
option_value in the option table.
The i-th row of the table will have
option_name = option_table [ i ][ 'name' ]
option_value = option_table [ i ][ 'value' ]
rate_eff_cov_table¶
This is a list of dict
that define the rows of the rate_eff_cov_table .
The dictionary rate_eff_cov_table [ i ] has the following:
| Key | Value Type | Description |
|---|---|---|
|  | str | identifies the node for the i-th row |
|  | str | identifies the covariate for the i-th row |
|  | float | value of the splitting covariate |
|  | str | identifies weighting for this row |
Contents¶
| Name | Title |
|---|---|
| create_database.py |  |
Example¶
The file create_database.py contains
an example and test of create_database .