create_database

View page source

Create a Dismod_at Database

Prototype

def create_database(
   file_name,
   age_list,
   time_list,
   integrand_table = list(),
   node_table      = list(),
   subgroup_table  = list(),
   weight_table    = list(),
   covariate_table = list(),
   avgint_table    = list(),
   data_table      = list(),
   prior_table     = list(),
   smooth_table    = list(),
   nslist_dict     = dict(),
   rate_table      = list(),
   mulcov_table    = list(),
   option_table    = list(),
   rate_eff_cov_table  = list(),
) :

Purpose

This routine makes it easy to create a dismod_at database with all of its input tables. This is only meant for small example and testing cases and is not efficient.

Primary Key

For each of the lists above, the order of the elements in the corresponding table is the same as the corresponding list. For example, age_list [ i ] corresponds to the i-th row of the age table which has Primary Key value age_id = i .

Name Column

The name columns are created with th unique constraint; i.e., it will be an error to have the same value appear twice in a column table_name _ name in the table table_name .

file_name

is as str containing the name of the file where the data base is stored. If this file already exists, it is deleted and a database is created.

age_list

is a list of float that specify age values by indices.

time_list

is a list of float that specify time values by indices.

integrand_table

This is a list of dict that define the rows of the integrand_table . The dictionary integrand_table [ i ] has the following:

Key

Value Type

Description

name

str

name for the i-th integrand

minimum_meas_cv

str

minimum measurement cv for this integrand

The key minimum_meas_cv is optional. If it is not present, 0.0 is used for the corresponding value.

node_table

This is a list of dict that define the rows of the node_table . The dictionary node_table [ i ] has the following:

Key

Value Type

Description

name

str

name for the i-th node

parent

str

name of parent of the i-th node

Note that if the i-th node does not have a parent, the empty string should be used for the parent of that node.

subgroup_table

This is a list of dict that define the rows of the subgroup_table . The dictionary node_table [ i ] has the following:

Key

Value Type

Description

subgroup

str

name for the i-th subgroup

group

str

name of group that subgroup is in

Backward Compatibility

To get backward compatibility to before the subgroup information was added, add the following table to the create_database call (just after the node_table ):

subgroup_table = [ { 'subgroup' : 'world' , 'group' : 'world' } ]

No other changes to the create_database call should be necessary (for backward compatibility).

weight_table

This is a list of dict that define the rows of the weight_table and weight_grid_table . The dictionary weight_table [ i ] has the following:

Key

Value Type

Description

name

str

name of i-th weighting

age_id

list of int

indices for age grid

time_id

list of int

indices for time grid

fun

function

w = fun ( a , t )

The float w is the value of this weighting a the corresponding float age a and float time t . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .

covariate_table

This is a list of dict that define the rows of the covariate_table . The dictionary covariate_table [ i ] has the following:

Key

Value Type

Description

name

str

name for the i-th covariate

reference

float

reference value for i-th covariate

max_difference

float

maximum difference for i-th covariate

If max_difference is None , the corresponding table entry is null and this corresponds to an infinite maximum difference. If max_difference does not appear, null is written for the corresponding covariate entry.

avgint_table

This is a list of dict that define the rows of the avgint_table . The dictionary avgint_table [ i ] has the following:

Key

Value Type

Description

integrand

str

integrand for i-th data

node

str

name of node in graph

subgroup

str

name of subgroup

weight

str

weighting function name

age_lower

float

lower age limit

age_upper

float

upper age limit

time_lower

float

lower time limit

time_lower

float

upper time limit

c_0

float

value of first covariate

c_J

float

value of last covariate

subgroup

If the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.

weight

The weighting function name identifies an entry in the weight_table by its name . If weight is the empty string, the constant weighting is used.

covariates

Note that J = len ( covariate_table ) - 1 and for j = 0 , … , J ,

c_j = covariate_table [ j ][ 'name' ]

We refer to the columns above as the required columns for avgint_table .

avgint_extra_columns

If a row of option_table has row [ 'name' ] equal to 'avgint_extra_columns' , the corresponding row [ 'value' ]. split () is the list of extra avgint table columns. Otherwise the list of extra avgint table columns is empty.

data_table

This is a list of dict that define the rows of the data_table . It has all the columns required for the avgint_table . In addition, the dictionary data_table [ i ] has the following:

Key

Value Type

Description

hold_out

bool

hold out flag

density

str

density_name

meas_value

float

measured value

meas_std

float

standard deviation

eta

float

offset in log-transform

nu

float

Student’s-t degrees of freedom

sample_size

int

sample size for a binomial distribution

meas_std, eta, nu, sample_size

The columns keys meas_std , eta , nu , and sample_size are optional. If they are not present, the value null is used for the corresponding row of the data table.

subgroup

if the subgroup key is not present, the first subgroup in subgroup_table is used and a warning is printed.

data_extra_columns

If a row of option_table has row [ 'name' ] equal to 'data_extra_columns' , the corresponding row [ 'value' ]. split () is the list of extra data table columns. Otherwise the list of extra data table columns is empty.

prior_table

This is a list of dict that define the rows of the prior_table . The dictionary prior_table [ i ] has the following:

Key

Value Type

Description

name

str

name of i-th prior

lower

float

lower limit

upper

float

upper limit

std

float

standard deviation

density

str

density_name

eta

float

offset in log densities

nu

float

degrees of freed in Student densities

The columns keys lower , upper , std , eta , and nu are optional. If they are not present, the value null is used for the corresponding row of the prior table.

smooth_table

This is a list of dict that define the rows of the smooth_table and smooth_grid_table . The dictionary smooth_table [ i ] has the following keys:

name

an str specifying the name used to reference the i-th smoothing.

age_id

a list of int specifying the age values for this smoothing as indices in age_list .

time_id

a list of int specifying the time values for this smoothing as indices in time_list .

mulstd_value_prior_name

an str specifying the prior used for the value multiplier for the i-th smoothing; see mulstd_value_prior_id This key is optional and its default value is None which corresponds to null in the database.

mulstd_dage_prior_name

an str specifying the prior used for the age difference multiplier for the i-th smoothing; see mulstd_dage_prior_id This key is optional and its default value is None which corresponds to null in the database.

mulstd_dtime_prior_name

an str specifying the prior used for the time difference multiplier for the i-th smoothing; see mulstd_dtime_prior_id This key is optional and its default value is None which corresponds to null in the database.

fun

This is a function with the following syntax:

( v , da , dt ) = fun ( a , t )

The str results v , da , and dt are the names for the value prior, age difference prior, and time difference prior corresponding to the i-th smoothing. The value da is not used, when age a = age_id [ -1 ] . The value dt is not used, when time t = time_id [ -1 ] . Note that there is an i , j such that a = age_list [ age_id [ i ]] and t = time_list [ time_id [ j ]] .

const_value

The fun return value v may be a float . In this case, the value of the smoothing, at the corresponding age and time, is constrained to be v using the const_value column in the smooth_grid table.

nslist_dict

This is a dict that specifies the nslist_table and the nslist_pair_table . For each nslist_name ,

nslist_dict [ nslist_name ] = [ ( node_name , smooth_name ), … ]

Note that each pair above is a python tuple :

Variable

Value Type

Description

nslist_name

str

name of one list of node,smoothing pairs

node_name

str

name of the node for this pair

smooth_name

str

name of the smoothing for this pair

rate_table

This is a list of dict that define the rows of the rate_table . The dictionary rate_table [ i ] has the following:

Key

Value Type

Description

name

str

pini, iota, rho, chi, or omega

parent_smooth

str

parent smoothing

child_smooth

str

a single child smoothing

child_nslist

str

list of child smoothings

The value None is used to represent a null value for the parent and child smoothings. If a key name does not appear, null is used for the corresponding value. If a name ; e.g. rho , does not appear, the value null is used for the parent and child smoothings for the corresponding rate.

mulcov_table

This is a list of dict that define the rows of the mulcov_table . The dictionary mulcov_table [ i ] has the following:

Key

Value Type

Description

covariate

str

is the covariate column

type

str

rate_value , meas_value , or meas_noise

effected

str

integrand or rate affected

group

str

the group that is affected

smooth

str

smoothing at group level

subsmooth

str

smoothing at subgroup level

effected

If type is rate_value , effected is a rate. Otherwise it is an integrand.

group

If the group key is not present, the first group in subgroup_table is used.

subsmooth

If the subsmooth key is not present, the value null is used for the subgroup smoothing in the corresponding row and a warning is printed.

option_table

This is a list of dict that define the values option_name , option_value in the option table. The i-th row of the table will have

      option_name = option_table [ i ][ 'name' ]
      option_value = option_table [ i ][ 'value' ]

rate_eff_cov_table

This is a list of dict that define the rows of the rate_eff_cov_table . The dictionary rate_eff_cov_table [ i ] has the following:

Key

Value Type

Description

'node_name'

str

identifies the node for the i-th row

'covariate_name'

str

identifies the covariate for the i-th row

split_value

float

value of the splitting covariate

'weight_name'

str

identifies weighting for this row

Contents

Name

Title

create_database.py

create_database: Example and Test

Example

The file create_database.py contains and example and test of create_database .