create_database#


Create a Dismod_at Database#

Prototype#

def create_database(
   file_name,
   age_list,
   time_list,
   integrand_table = list(),
   node_table      = list(),
   subgroup_table  = list(),
   weight_table    = list(),
   covariate_table = list(),
   avgint_table    = list(),
   data_table      = list(),
   prior_table     = list(),
   smooth_table    = list(),
   nslist_dict     = dict(),
   rate_table      = list(),
   mulcov_table    = list(),
   option_table    = list(),
   rate_eff_cov_table  = list(),
) :

Purpose#

This routine makes it easy to create a dismod_at database with all of its input tables. It is only meant for small examples and test cases and is not efficient.

Primary Key#

For each of the lists above, the order of the elements in the corresponding table is the same as the corresponding list. For example, age_list [ i ] corresponds to the i-th row of the age table which has Primary Key value age_id = i .
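As a minimal sketch of this convention (plain Python, not a call into dismod_at), the rows of the age table can be pictured as the enumeration of age_list. The variable age_rows is a hypothetical name used only for illustration.

```python
# Illustration only: the list position becomes the primary key value.
age_list = [0.0, 50.0, 100.0]

# age_rows is a hypothetical picture of the age table, one dict per row
age_rows = [{'age_id': i, 'age': age} for i, age in enumerate(age_list)]

for row in age_rows:
    print(row['age_id'], row['age'])
```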

Name Column#

The name columns are created with the unique constraint; i.e., it will be an error to have the same value appear twice in a column table_name _ name in the table table_name .

file_name#

is a str containing the name of the file where the database is stored. If this file already exists, it is deleted and a new database is created.

age_list#

is a list of float specifying the age values. Other arguments refer to these values by their index in this list.

time_list#

is a list of float specifying the time values. Other arguments refer to these values by their index in this list.

integrand_table#

This is a list of dict that define the rows of the integrand_table . The dictionary integrand_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| name | str | name for the i-th integrand |
| minimum_meas_cv | str | minimum measurement cv for this integrand |

The key minimum_meas_cv is optional. If it is not present, 0.0 is used for the corresponding value.

node_table#

This is a list of dict that define the rows of the node_table . The dictionary node_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| name | str | name for the i-th node |
| parent | str | name of parent of the i-th node |

Note that if the i-th node does not have a parent, the empty string should be used for the parent of that node.
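The following sketch shows one way such a node_table might look, with the empty string marking the root node. The node names and the helper variables node_id and parent_id are made up for illustration; they only mimic how names could be resolved to list positions.

```python
# Hypothetical node hierarchy; the empty string marks a node without a parent.
node_table = [
    {'name': 'world',         'parent': ''},
    {'name': 'north_america', 'parent': 'world'},
    {'name': 'canada',        'parent': 'north_america'},
    {'name': 'united_states', 'parent': 'north_america'},
]

# The position in the list plays the role of the primary key node_id.
node_id = {row['name']: i for i, row in enumerate(node_table)}

# Resolve each parent name to its node_id; None stands for "no parent".
parent_id = [
    node_id[row['parent']] if row['parent'] != '' else None
    for row in node_table
]
print(parent_id)
```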

subgroup_table#

This is a list of dict that define the rows of the subgroup_table . The dictionary subgroup_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| subgroup | str | name for the i-th subgroup |
| group | str | name of group that subgroup is in |

Backward Compatibility#

To get backward compatibility to before the subgroup information was added, add the following table to the create_database call (just after the node_table ):

subgroup_table = [ { 'subgroup' : 'world' , 'group' : 'world' } ]

No other changes to the create_database call should be necessary (for backward compatibility).

weight_table#

This is a list of dict that define the rows of the weight_table and weight_grid_table . The dictionary weight_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| name | str | name of i-th weighting |
| age_id | list of int | indices for age grid |
| time_id | list of int | indices for time grid |
| fun | function | w = fun ( a , t ) |

The float w is the value of this weighting at the corresponding float age a and float time t . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .
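A small sketch of a weight_table entry follows. The weighting name, the function age_linear_weight, and the loop that evaluates fun on the grid are illustrative assumptions; the loop only mimics how the grid points relate to age_list and time_list.

```python
age_list  = [0.0, 50.0, 100.0]
time_list = [1990.0, 2010.0]

def age_linear_weight(a, t):
    # hypothetical weighting that increases linearly with age
    return 1.0 + a / 100.0

weight_table = [
    {
        'name':    'age_linear',
        'age_id':  [0, 2],     # indices into age_list
        'time_id': [0, 1],     # indices into time_list
        'fun':     age_linear_weight,
    }
]

# Evaluate fun at every (age, time) grid point of each weighting.
grid_values = []
for row in weight_table:
    for i in row['age_id']:
        for j in row['time_id']:
            a = age_list[i]
            t = time_list[j]
            grid_values.append((i, j, row['fun'](a, t)))
print(grid_values)
```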

covariate_table#

This is a list of dict that define the rows of the covariate_table . The dictionary covariate_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| name | str | name for the i-th covariate |
| reference | float | reference value for i-th covariate |
| max_difference | float | maximum difference for i-th covariate |

If max_difference is None , the corresponding table entry is null and this corresponds to an infinite maximum difference. If max_difference does not appear, null is written for the corresponding covariate entry.
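The sketch below illustrates the three cases for max_difference: a value, an explicit None, and a missing key. The covariate names and values are made up, and the list max_difference only mimics the null handling described above.

```python
covariate_table = [
    {'name': 'sex',    'reference': 0.0,    'max_difference': 0.6},
    {'name': 'bmi',    'reference': 25.0,   'max_difference': None},
    {'name': 'income', 'reference': 2000.0},   # key omitted
]

# None and a missing key are both treated as null (no limit on the difference).
max_difference = [row.get('max_difference') for row in covariate_table]
print(max_difference)
```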

avgint_table#

This is a list of dict that define the rows of the avgint_table . The dictionary avgint_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| integrand | str | integrand for i-th data |
| node | str | name of node in graph |
| subgroup | str | name of subgroup |
| weight | str | weighting function name |
| age_lower | float | lower age limit |
| age_upper | float | upper age limit |
| time_lower | float | lower time limit |
| time_upper | float | upper time limit |
| c_0 | float | value of first covariate |
| c_J | float | value of last covariate |

subgroup#

If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.

weight#

The weighting function name identifies an entry in the weight_table by its name . If weight is the empty string, the constant weighting is used.

covariates#

Note that J = len ( covariate_table ) - 1 and for j = 0 , … , J ,

c_j = covariate_table [ j ][ 'name' ]

We refer to the columns above as the required columns for avgint_table .
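As a sketch, one avgint_table row for a model with two covariates might look as follows. The covariate names, node name, and numeric values are illustrative assumptions. The check at the end is not part of dismod_at; it only lists the required keys implied by the text above.

```python
covariate_table = [
    {'name': 'sex',    'reference': 0.0},
    {'name': 'income', 'reference': 2000.0},
]

avgint_row = {
    'integrand':  'Sincidence',
    'node':       'world',
    'subgroup':   'world',
    'weight':     '',        # empty string selects the constant weighting
    'age_lower':  40.0,
    'age_upper':  60.0,
    'time_lower': 1990.0,
    'time_upper': 2010.0,
    'sex':        0.5,       # key c_0 = covariate_table[0]['name']
    'income':     1000.0,    # key c_1 = covariate_table[1]['name']
}

# Required keys: the fixed ones plus one key per covariate name.
required = [
    'integrand', 'node', 'subgroup', 'weight',
    'age_lower', 'age_upper', 'time_lower', 'time_upper',
] + [cov['name'] for cov in covariate_table]
missing = [key for key in required if key not in avgint_row]
print(missing)
```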

avgint_extra_columns#

If a row of option_table has row [ 'name' ] equal to 'avgint_extra_columns' , the corresponding row [ 'value' ]. split () is the list of extra avgint table columns. Otherwise the list of extra avgint table columns is empty.
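The lookup described above can be sketched in a few lines. The function extra_columns is a hypothetical helper, not part of dismod_at, and the column names in the option value are made up.

```python
option_table = [
    {'name': 'parent_node_name',     'value': 'world'},
    {'name': 'avgint_extra_columns', 'value': 'location_name year_id'},
]

def extra_columns(option_table, option_name):
    # hypothetical helper: split the option value on white space,
    # or return an empty list when the option is not present
    for row in option_table:
        if row['name'] == option_name:
            return row['value'].split()
    return []

avgint_extra = extra_columns(option_table, 'avgint_extra_columns')
data_extra   = extra_columns(option_table, 'data_extra_columns')
print(avgint_extra, data_extra)
```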

data_table#

This is a list of dict that define the rows of the data_table . It has all the columns required for the avgint_table . In addition, the dictionary data_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| hold_out | bool | hold out flag |
| density | str | density_name |
| meas_value | float | measured value |
| meas_std | float | standard deviation |
| eta | float | offset in log-transform |
| nu | float | Student's-t degrees of freedom |
| sample_size | int | sample size for a binomial distribution |

meas_std, eta, nu, sample_size#

The column keys meas_std , eta , nu , and sample_size are optional. If they are not present, the value null is used for the corresponding row of the data table.
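A sketch of a data_table row with a Gaussian density follows. The node, covariate name, and values are illustrative assumptions, and the dictionary stored only mimics how omitted optional keys would map to null.

```python
data_row = {
    # columns shared with avgint_table
    'integrand':  'prevalence',
    'node':       'world',
    'subgroup':   'world',
    'weight':     '',
    'age_lower':  40.0,
    'age_upper':  60.0,
    'time_lower': 1990.0,
    'time_upper': 2010.0,
    'sex':        0.5,          # assumes one covariate named 'sex'
    # columns specific to data_table
    'hold_out':   False,
    'density':    'gaussian',
    'meas_value': 0.01,
    'meas_std':   0.001,        # eta, nu, sample_size are omitted
}

optional_keys = ['meas_std', 'eta', 'nu', 'sample_size']
stored = {key: data_row.get(key) for key in optional_keys}
print(stored)
```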

subgroup#

If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.

data_extra_columns#

If a row of option_table has row [ 'name' ] equal to 'data_extra_columns' , the corresponding row [ 'value' ]. split () is the list of extra data table columns. Otherwise the list of extra data table columns is empty.

prior_table#

This is a list of dict that define the rows of the prior_table . The dictionary prior_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| name | str | name of i-th prior |
| lower | float | lower limit |
| upper | float | upper limit |
| std | float | standard deviation |
| density | str | density_name |
| eta | float | offset in log densities |
| nu | float | degrees of freedom in Student densities |

The column keys lower , upper , std , eta , and nu are optional. If they are not present, the value null is used for the corresponding row of the prior table.
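The following sketch shows two hypothetical priors and how the optional keys could default to null. The 'mean' key is an assumption: it does not appear in the key list above, but the dismod_at prior table has a mean column, so it is included here with that caveat. The prior names are made up.

```python
prior_table = [
    {   # uniform prior with limits; 'mean' is an assumed key (see text)
        'name':    'prior_rate_parent',
        'density': 'uniform',
        'lower':   1e-4,
        'upper':   1.0,
        'mean':    0.1,
    },
    {   # Gaussian prior without limits; 'mean' is an assumed key
        'name':    'prior_gauss_zero',
        'density': 'gaussian',
        'mean':    0.0,
        'std':     0.01,
    },
]

# Mimic the defaults: optional keys that are absent become None (null).
optional_keys = ['lower', 'upper', 'std', 'eta', 'nu']
rows = [
    {**{key: None for key in optional_keys}, **row}
    for row in prior_table
]
print(rows[1])
```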

smooth_table#

This is a list of dict that define the rows of the smooth_table and smooth_grid_table . The dictionary smooth_table [ i ] has the following keys:

name#

a str specifying the name used to reference the i-th smoothing.

age_id#

a list of int specifying the age values for this smoothing as indices in age_list .

time_id#

a list of int specifying the time values for this smoothing as indices in time_list .

mulstd_value_prior_name#

a str specifying the prior used for the value multiplier for the i-th smoothing; see mulstd_value_prior_id . This key is optional, and its default value is None , which corresponds to null in the database.

mulstd_dage_prior_name#

a str specifying the prior used for the age difference multiplier for the i-th smoothing; see mulstd_dage_prior_id . This key is optional, and its default value is None , which corresponds to null in the database.

mulstd_dtime_prior_name#

a str specifying the prior used for the time difference multiplier for the i-th smoothing; see mulstd_dtime_prior_id . This key is optional, and its default value is None , which corresponds to null in the database.

fun#

This is a function with the following syntax:

( v , da , dt ) = fun ( a , t )

The str results v , da , and dt are the names for the value prior, age difference prior, and time difference prior corresponding to the i-th smoothing. The value da is not used when a is the last age in the grid; i.e., a = age_list [ age_id [ -1 ]] . The value dt is not used when t is the last time in the grid; i.e., t = time_list [ time_id [ -1 ]] . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .

const_value#

The fun return value v may be a float . In this case, the value of the smoothing, at the corresponding age and time, is constrained to be v using the const_value column in the smooth_grid table.
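The sketch below shows two hypothetical smoothings: one whose fun returns prior names, and one whose fun returns a float for v. The prior and smoothing names are made up, and the loop only mimics the distinction between a value prior and a const_value entry; it is not the dismod_at implementation.

```python
age_list  = [0.0, 50.0, 100.0]
time_list = [1990.0, 2010.0]

def fun_rate_parent(a, t):
    # names of the value, age difference, and time difference priors
    return ('prior_rate_parent', 'prior_gauss_zero', 'prior_gauss_zero')

def fun_omega_const(a, t):
    # a float for v constrains the smoothing value at this grid point
    return (0.01, 'prior_gauss_zero', 'prior_gauss_zero')

smooth_table = [
    {'name': 'smooth_rate_parent', 'age_id': [0, 2], 'time_id': [0, 1],
     'fun': fun_rate_parent},
    {'name': 'smooth_omega_const', 'age_id': [0], 'time_id': [0],
     'fun': fun_omega_const},
]

# Classify each grid point by the type of the returned v.
grid_rows = []
for row in smooth_table:
    for i in row['age_id']:
        for j in row['time_id']:
            v, da, dt = row['fun'](age_list[i], time_list[j])
            kind = 'const_value' if isinstance(v, float) else 'value_prior'
            grid_rows.append((row['name'], i, j, kind, v))
print(grid_rows[-1])
```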

nslist_dict#

This is a dict that specifies the nslist_table and the nslist_pair_table . For each nslist_name ,

nslist_dict [ nslist_name ] = [ ( node_name , smooth_name ), … ]

Note that each pair above is a python tuple :

| Variable | Value Type | Description |
| --- | --- | --- |
| nslist_name | str | name of one list of node,smoothing pairs |
| node_name | str | name of the node for this pair |
| smooth_name | str | name of the smoothing for this pair |
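As a sketch, an nslist_dict with a single list of (node, smoothing) pairs might look like this. The list, node, and smoothing names are made up, and pair_rows is a hypothetical flattening that only illustrates how the pairs relate to their list.

```python
nslist_dict = {
    'child_smooth_list': [
        ('canada',        'smooth_canada'),
        ('united_states', 'smooth_united_states'),
    ],
}

# Flatten to one row per pair, tagged with the position of its list.
pair_rows = []
for nslist_id, (nslist_name, pairs) in enumerate(nslist_dict.items()):
    for node_name, smooth_name in pairs:
        pair_rows.append((nslist_id, node_name, smooth_name))
print(pair_rows)
```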

rate_table#

This is a list of dict that define the rows of the rate_table . The dictionary rate_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| name | str | pini, iota, rho, chi, or omega |
| parent_smooth | str | parent smoothing |
| child_smooth | str | a single child smoothing |
| child_nslist | str | list of child smoothings |

The value None is used to represent a null value for the parent and child smoothings. If a key name does not appear, null is used for the corresponding value. If a name ; e.g. rho , does not appear, the value null is used for the parent and child smoothings for the corresponding rate.
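The defaults described above can be pictured with the following sketch. The smoothing and list names are made up, and full_rows is a hypothetical expansion to all five rates that mimics the null handling; it is not the dismod_at implementation.

```python
rate_table = [
    {'name': 'iota',
     'parent_smooth': 'smooth_rate_parent',
     'child_smooth':  'smooth_rate_child'},
    {'name': 'omega',
     'parent_smooth': 'smooth_omega_const',
     'child_nslist':  'child_smooth_list'},
]

rate_names = ['pini', 'iota', 'rho', 'chi', 'omega']
keys       = ['parent_smooth', 'child_smooth', 'child_nslist']
by_name    = {row['name']: row for row in rate_table}

# Missing rates and missing keys both end up as None (null).
full_rows = [
    {'name': name, **{key: by_name.get(name, {}).get(key) for key in keys}}
    for name in rate_names
]
print(full_rows[2])
```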

mulcov_table#

This is a list of dict that define the rows of the mulcov_table . The dictionary mulcov_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| covariate | str | is the covariate column |
| type | str | rate_value , meas_value , or meas_noise |
| effected | str | integrand or rate affected |
| group | str | the group that is affected |
| smooth | str | smoothing at group level |
| subsmooth | str | smoothing at subgroup level |

effected#

If type is rate_value , effected is a rate. Otherwise it is an integrand.

group#

If the group key is not present, the first group in subgroup_table is used.

subsmooth#

If the subsmooth key is not present, the value null is used for the subgroup smoothing in the corresponding row and a warning is printed.
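A sketch of two mulcov_table rows follows. The covariate and smoothing names are made up, and effected_kind is a hypothetical helper that only restates the rule for interpreting effected.

```python
mulcov_table = [
    {   # covariate multiplier on the rate iota
        'covariate': 'income',
        'type':      'rate_value',
        'effected':  'iota',
        'group':     'world',
        'smooth':    'smooth_income',
        'subsmooth': None,          # explicit null at the subgroup level
    },
    {   # covariate multiplier on the prevalence measurement value
        'covariate': 'sex',
        'type':      'meas_value',
        'effected':  'prevalence',
        'group':     'world',
        'smooth':    'smooth_sex',
        'subsmooth': None,
    },
]

def effected_kind(row):
    # effected names a rate for rate_value, otherwise an integrand
    return 'rate' if row['type'] == 'rate_value' else 'integrand'

kinds = [effected_kind(row) for row in mulcov_table]
print(kinds)
```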

option_table#

This is a list of dict that define the values option_name , option_value in the option table. The i-th row of the table will have

      option_name = option_table [ i ][ 'name' ]
      option_value = option_table [ i ][ 'value' ]

rate_eff_cov_table#

This is a list of dict that define the rows of the rate_eff_cov_table . The dictionary rate_eff_cov_table [ i ] has the following:

| Key | Value Type | Description |
| --- | --- | --- |
| node_name | str | identifies the node for the i-th row |
| covariate_name | str | identifies the covariate for the i-th row |
| split_value | float | value of the splitting covariate |
| weight_name | str | identifies weighting for this row |

Contents#

| Name | Title |
| --- | --- |
| create_database.py | create_database: Example and Test |

Example#

The file create_database.py contains an example and test of create_database .