create_database¶

Create a Dismod_at Database¶

Prototype¶

def create_database(
   file_name,
   age_list,
   time_list,
   integrand_table = list(),
   node_table      = list(),
   subgroup_table  = list(),
   weight_table    = list(),
   covariate_table = list(),
   avgint_table    = list(),
   data_table      = list(),
   prior_table     = list(),
   smooth_table    = list(),
   nslist_dict     = dict(),
   rate_table      = list(),
   mulcov_table    = list(),
   option_table    = list(),
   rate_eff_cov_table  = list(),
) :

Purpose¶

This routine makes it easy to create a dismod_at database with all of its input tables. It is intended only for small examples and test cases and is not efficient.

Primary Key¶

For each of the lists above, the order of the elements in the corresponding table is the same as the corresponding list. For example, age_list [ i ] corresponds to the i-th row of the age table which has Primary Key value age_id = i .
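The convention above can be sketched in a few lines of Python. The age values are hypothetical, and the dictionaries only illustrate how a list position maps to a primary key; they are not what create_database builds internally.

```python
# Hypothetical age values; the position i in the list is the age_id of row i.
age_list = [0.0, 5.0, 15.0, 50.0, 100.0]

# Pair each age with the primary key value it would receive in the age table.
age_rows = [{"age_id": i, "age": age} for i, age in enumerate(age_list)]

for row in age_rows:
    print(row["age_id"], row["age"])
```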

Name Column¶

The name columns are created with the unique constraint; i.e., it is an error for the same value to appear twice in the column table_name _ name of the table table_name .
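The effect of a unique constraint can be seen with the standard sqlite3 module. This is a minimal sketch assuming an SQLite database; the two-column node table below is a simplified stand-in, not the schema that create_database writes.

```python
import sqlite3

# In-memory database with a simplified, hypothetical node table.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE node (node_id INTEGER PRIMARY KEY, node_name TEXT UNIQUE)"
)
con.execute("INSERT INTO node VALUES (0, 'world')")

# A second row with the same node_name violates the unique constraint.
try:
    con.execute("INSERT INTO node VALUES (1, 'world')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
con.close()
```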

file_name¶

is a str containing the name of the file where the database is stored. If this file already exists, it is deleted and a new database is created.

age_list¶

is a list of float that specifies the age values. Other arguments refer to these values by their index in this list.

time_list¶

is a list of float that specifies the time values. Other arguments refer to these values by their index in this list.

integrand_table¶

This is a list of dict that define the rows of the integrand_table . The dictionary integrand_table [ i ] has the following:

Key	Value Type	Description
name	str	name for the i-th integrand
minimum_meas_cv	float	minimum measurement cv for this integrand

The key minimum_meas_cv is optional. If it is not present, 0.0 is used for the corresponding value.

node_table¶

This is a list of dict that define the rows of the node_table . The dictionary node_table [ i ] has the following:

Key	Value Type	Description
name	str	name for the i-th node
parent	str	name of parent of the i-th node

Note that if the i-th node does not have a parent, the empty string should be used for the parent of that node.
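A small node_table following this layout might look like the sketch below. The node names are hypothetical, and the two checks at the end are only an illustration of the parent convention, not something create_database is known to perform.

```python
# Hypothetical node hierarchy; the root uses the empty string as its parent.
node_table = [
    {"name": "world", "parent": ""},
    {"name": "north_america", "parent": "world"},
    {"name": "united_states", "parent": "north_america"},
    {"name": "canada", "parent": "north_america"},
]

# Nodes without a parent, and parents that do not name any node.
names = {row["name"] for row in node_table}
roots = [row["name"] for row in node_table if row["parent"] == ""]
missing = [
    row["parent"]
    for row in node_table
    if row["parent"] != "" and row["parent"] not in names
]
```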

subgroup_table¶

This is a list of dict that define the rows of the subgroup_table . The dictionary subgroup_table [ i ] has the following:

Key	Value Type	Description
subgroup	str	name for the i-th subgroup
group	str	name of group that subgroup is in

Backward Compatibility¶

To get backward compatibility to before the subgroup information was added, add the following table to the create_database call (just after the node_table ):

subgroup_table = [ { 'subgroup' : 'world' , 'group' : 'world' } ]

No other changes to the create_database call should be necessary (for backward compatibility).

weight_table¶

This is a list of dict that define the rows of the weight_table and weight_grid_table . The dictionary weight_table [ i ] has the following:

Key	Value Type	Description
name	str	name of i-th weighting
age_id	list of int	indices for age grid
time_id	list of int	indices for time grid
fun	function	w = fun ( a , t )

The float w is the value of this weighting at the corresponding float age a and float time t . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .
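The following sketch shows one way to write weight_table entries and how fun would be evaluated at the grid points. The age and time values, the weighting names, and both functions are hypothetical.

```python
# Hypothetical age and time grids.
age_list = [0.0, 50.0, 100.0]
time_list = [1990.0, 2010.0]

def constant_weight(a, t):
    # Same weight at every age and time.
    return 1.0

def age_weight(a, t):
    # Hypothetical weighting that increases linearly with age.
    return 1.0 + a / 100.0

weight_table = [
    {"name": "constant", "age_id": [0], "time_id": [0], "fun": constant_weight},
    {"name": "by_age", "age_id": [0, 1, 2], "time_id": [0, 1], "fun": age_weight},
]

# Evaluate the second weighting at each of its (age_id, time_id) grid points.
weighting = weight_table[1]
grid = {
    (i, j): weighting["fun"](age_list[i], time_list[j])
    for i in weighting["age_id"]
    for j in weighting["time_id"]
}
```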

covariate_table¶

This is a list of dict that define the rows of the covariate_table . The dictionary covariate_table [ i ] has the following:

Key	Value Type	Description
name	str	name for the i-th covariate
reference	float	reference value for i-th covariate
max_difference	float	maximum difference for i-th covariate

If max_difference is None , the corresponding table entry is null and this corresponds to an infinite maximum difference. If max_difference does not appear, null is written for the corresponding covariate entry.
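The three cases for max_difference can be written as follows. The covariate names and values are hypothetical, and the use of dict.get only mimics the defaulting rule described above.

```python
# Hypothetical covariates showing the three ways to specify max_difference.
covariate_table = [
    {"name": "sex", "reference": 0.0, "max_difference": 0.6},      # finite limit
    {"name": "income", "reference": 1.0, "max_difference": None},  # explicit null
    {"name": "bmi", "reference": 25.0},                            # key omitted
]

# None stands for null in the database, i.e. an infinite maximum difference.
max_difference = [row.get("max_difference") for row in covariate_table]
```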

avgint_table¶

This is a list of dict that define the rows of the avgint_table . The dictionary avgint_table [ i ] has the following:

Key	Value Type	Description
integrand	str	integrand for i-th data
node	str	name of node in graph
subgroup	str	name of subgroup
weight	str	weighting function name
age_lower	float	lower age limit
age_upper	float	upper age limit
time_lower	float	lower time limit
time_upper	float	upper time limit
c_0	float	value of first covariate
…	…	…
c_J	float	value of last covariate

subgroup¶

If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.

weight¶

The weighting function name identifies an entry in the weight_table by its name . If weight is the empty string, the constant weighting is used.

covariates¶

Note that J = len ( covariate_table ) - 1 and for j = 0 , … , J ,

c_j = covariate_table [ j ][ 'name' ]

We refer to the columns above as the required columns for avgint_table .
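As a sketch, one avgint_table row with two covariates could look like this. The integrand name Sincidence, the node and subgroup names, and the covariate values are assumptions made for the illustration.

```python
# Hypothetical covariates; their names become keys in each avgint row.
covariate_table = [
    {"name": "sex", "reference": 0.0},
    {"name": "income", "reference": 1.0},
]

avgint_row = {
    "integrand": "Sincidence",   # assumed to be a valid integrand name
    "node": "world",
    "subgroup": "world",
    "weight": "",                # empty string selects the constant weighting
    "age_lower": 0.0,
    "age_upper": 100.0,
    "time_lower": 1990.0,
    "time_upper": 2010.0,
    "sex": 0.5,                  # c_0
    "income": 1.2,               # c_1
}

# Every covariate name should appear as a key in the row.
covariate_keys = [row["name"] for row in covariate_table]
missing = [key for key in covariate_keys if key not in avgint_row]
```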

avgint_extra_columns¶

If a row of option_table has row [ 'name' ] equal to 'avgint_extra_columns' , the corresponding row [ 'value' ]. split () is the list of extra avgint table columns. Otherwise the list of extra avgint table columns is empty.
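The rule above amounts to a whitespace split of the option value. In this sketch the extra column names are hypothetical and the loop only mirrors the description; it is not taken from the create_database source.

```python
# Hypothetical option table with two extra avgint columns.
option_table = [
    {"name": "avgint_extra_columns", "value": "country_code  survey_year"},
]

# Default is an empty list; split() ignores repeated white space.
avgint_extra_columns = []
for row in option_table:
    if row["name"] == "avgint_extra_columns":
        avgint_extra_columns = row["value"].split()
```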

data_table¶

This is a list of dict that define the rows of the data_table . It has all the columns required for the avgint_table . In addition, the dictionary data_table [ i ] has the following:

Key	Value Type	Description
hold_out	bool	hold out flag
density	str	density_name
meas_value	float	measured value
meas_std	float	standard deviation
eta	float	offset in log-transform
nu	float	Student’s-t degrees of freedom
sample_size	int	sample size for a binomial distribution

meas_std, eta, nu, sample_size¶

The column keys meas_std , eta , nu , and sample_size are optional. If they are not present, the value null is used for the corresponding row of the data table.
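A data_table row that omits some optional keys might look like the sketch below. All names and values are hypothetical (the integrand Sincidence and density gaussian are assumed to be valid), and the final dictionary only illustrates the null defaulting described above.

```python
OPTIONAL_KEYS = ("meas_std", "eta", "nu", "sample_size")

data_row = {
    # Columns shared with avgint_table (no covariates in this sketch).
    "integrand": "Sincidence",
    "node": "world",
    "subgroup": "world",
    "weight": "",
    "age_lower": 40.0,
    "age_upper": 50.0,
    "time_lower": 2000.0,
    "time_upper": 2005.0,
    # Columns specific to data_table.
    "hold_out": False,
    "density": "gaussian",
    "meas_value": 0.01,
    "meas_std": 0.001,
}

# Optional keys that are absent correspond to null (None) in the data table.
optional_values = {key: data_row.get(key) for key in OPTIONAL_KEYS}
```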

subgroup¶

If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.

data_extra_columns¶

If a row of option_table has row [ 'name' ] equal to 'data_extra_columns' , the corresponding row [ 'value' ]. split () is the list of extra data table columns. Otherwise the list of extra data table columns is empty.

prior_table¶

This is a list of dict that define the rows of the prior_table . The dictionary prior_table [ i ] has the following:

Key	Value Type	Description
name	str	name of i-th prior
lower	float	lower limit
upper	float	upper limit
std	float	standard deviation
density	str	density_name
eta	float	offset in log densities
nu	float	degrees of freedom in Student densities

The column keys lower , upper , std , eta , and nu are optional. If they are not present, the value null is used for the corresponding row of the prior table.

smooth_table¶

This is a list of dict that define the rows of the smooth_table and smooth_grid_table . The dictionary smooth_table [ i ] has the following keys:

name¶

a str specifying the name used to reference the i-th smoothing.

age_id¶

a list of int specifying the age values for this smoothing as indices in age_list .

time_id¶

a list of int specifying the time values for this smoothing as indices in time_list .

mulstd_value_prior_name¶

a str specifying the prior used for the value multiplier for the i-th smoothing; see mulstd_value_prior_id . This key is optional and its default value is None , which corresponds to null in the database.

mulstd_dage_prior_name¶

a str specifying the prior used for the age difference multiplier for the i-th smoothing; see mulstd_dage_prior_id . This key is optional and its default value is None , which corresponds to null in the database.

mulstd_dtime_prior_name¶

a str specifying the prior used for the time difference multiplier for the i-th smoothing; see mulstd_dtime_prior_id . This key is optional and its default value is None , which corresponds to null in the database.

fun¶

This is a function with the following syntax:

( v , da , dt ) = fun ( a , t )

The str results v , da , and dt are the names of the value prior, age difference prior, and time difference prior for the i-th smoothing at age a and time t . The value da is not used when a = age_list [ age_id [ -1 ]] . The value dt is not used when t = time_list [ time_id [ -1 ]] . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .

const_value¶

The fun return value v may be a float . In this case, the value of the smoothing, at the corresponding age and time, is constrained to be v using the const_value column in the smooth_grid table.
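A smoothing entry whose fun returns prior names at most grid points and a float at one of them can be sketched as follows. The smoothing name, the prior names (which would have to appear in prior_table ), and the grids are all hypothetical.

```python
# Hypothetical age and time grids.
age_list = [0.0, 50.0, 100.0]
time_list = [1990.0, 2010.0]

def smooth_fun(a, t):
    # At age zero, return a float so the value is constrained to 0.0.
    if a == 0.0:
        return (0.0, "prior_dage", "prior_dtime")
    # Elsewhere, return the names of the value, dage, and dtime priors.
    return ("prior_value", "prior_dage", "prior_dtime")

smooth_table = [
    {
        "name": "smooth_rate",
        "age_id": [0, 1, 2],
        "time_id": [0, 1],
        "fun": smooth_fun,
    },
]

# Evaluate fun at every (age_id, time_id) grid point of the smoothing.
smoothing = smooth_table[0]
grid = {
    (i, j): smoothing["fun"](age_list[i], time_list[j])
    for i in smoothing["age_id"]
    for j in smoothing["time_id"]
}
```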

nslist_dict¶

This is a dict that specifies the nslist_table and the nslist_pair_table . For each nslist_name ,

nslist_dict [ nslist_name ] = [ ( node_name , smooth_name ), … ]

Note that each pair above is a python tuple :

Variable	Value Type	Description
nslist_name	str	name of one list of node,smoothing pairs
node_name	str	name of the node for this pair
smooth_name	str	name of the smoothing for this pair
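For illustration, an nslist_dict with a single list of pairs could be written as below. The list name, node names, and smoothing names are hypothetical; the flattening step only shows how the tuples unpack.

```python
# One hypothetical list that assigns a smoothing to each of two nodes.
nslist_dict = {
    "child_smooth_list": [
        ("north_america", "smooth_north_america"),
        ("europe", "smooth_europe"),
    ],
}

# Flatten into (nslist_name, node_name, smooth_name) triples.
triples = [
    (nslist_name, node_name, smooth_name)
    for nslist_name, pair_list in nslist_dict.items()
    for (node_name, smooth_name) in pair_list
]
```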

rate_table¶

This is a list of dict that define the rows of the rate_table . The dictionary rate_table [ i ] has the following:

Key	Value Type	Description
name	str	pini, iota, rho, chi, or omega
parent_smooth	str	parent smoothing
child_smooth	str	a single child smoothing
child_nslist	str	list of child smoothings

The value None is used to represent a null value for the parent and child smoothings. If a key does not appear, null is used for the corresponding value. If a rate name, e.g. rho , does not appear in rate_table , the value null is used for the parent and child smoothings for the corresponding rate.
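The defaulting rules for rates can be illustrated with a short sketch. The smoothing and nslist names are hypothetical, and the resolved dictionary only mimics the rules in the text; it is not the create_database implementation.

```python
RATE_NAMES = ("pini", "iota", "rho", "chi", "omega")
SMOOTH_KEYS = ("parent_smooth", "child_smooth", "child_nslist")

# Hypothetical rate table; pini and rho are left out entirely.
rate_table = [
    {"name": "iota", "parent_smooth": "smooth_iota_parent",
     "child_smooth": "smooth_iota_child"},
    {"name": "omega", "parent_smooth": "smooth_omega_parent",
     "child_nslist": "child_smooth_list"},
    {"name": "chi", "parent_smooth": None},
]

# Missing rates and missing keys both resolve to None (null).
by_name = {row["name"]: row for row in rate_table}
resolved = {
    rate: {key: by_name.get(rate, {}).get(key) for key in SMOOTH_KEYS}
    for rate in RATE_NAMES
}
```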

mulcov_table¶

This is a list of dict that define the rows of the mulcov_table . The dictionary mulcov_table [ i ] has the following:

Key	Value Type	Description
covariate	str	is the covariate column
type	str	rate_value , meas_value , or meas_noise
effected	str	integrand or rate affected
group	str	the group that is affected
smooth	str	smoothing at group level
subsmooth	str	smoothing at subgroup level

effected¶

If type is rate_value , effected is a rate. Otherwise it is an integrand.

group¶

If the group key is not present, the first group in subgroup_table is used.

subsmooth¶

If the subsmooth key is not present, the value null is used for the subgroup smoothing in the corresponding row and a warning is printed.
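The sketch below shows two mulcov_table rows and the rule that decides whether effected names a rate or an integrand. The covariate, group, and smoothing names are hypothetical, and Sincidence is assumed to be a valid integrand name.

```python
RATE_NAMES = {"pini", "iota", "rho", "chi", "omega"}

mulcov_table = [
    {   # covariate multiplier that affects a rate
        "covariate": "income", "type": "rate_value", "effected": "iota",
        "group": "world", "smooth": "smooth_mulcov", "subsmooth": None,
    },
    {   # covariate multiplier that affects a measurement value
        "covariate": "sex", "type": "meas_value", "effected": "Sincidence",
        "group": "world", "smooth": "smooth_mulcov", "subsmooth": None,
    },
]

def effected_kind(row):
    # rate_value multipliers act on a rate; the other types act on an integrand.
    return "rate" if row["type"] == "rate_value" else "integrand"

kinds = [effected_kind(row) for row in mulcov_table]
```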

option_table¶

This is a list of dict that define the values option_name , option_value in the option table. The i-th row of the table will have

option_name = option_table [ i ][ 'name' ]

option_value = option_table [ i ][ 'value' ]

rate_eff_cov_table¶

This is a list of dict that define the rows of the rate_eff_cov_table . The dictionary rate_eff_cov_table [ i ] has the following:

Key	Value Type	Description
node_name	str	identifies the node for the i-th row
covariate_name	str	identifies the covariate for the i-th row
split_value	float	value of the splitting covariate
weight_name	str	identifies weighting for this row

Contents¶

Name	Title
create_database.py	create_database: Example and Test

Example¶

The file create_database.py contains an example and test of create_database .