-------------------------------------------------- lines 5-311 of file: xrst/math/censor_density.xrst -------------------------------------------------- {xrst_begin censor_density} {xrst_spell erf ll } The Censored Gaussian and Laplace Densities ########################################### References ********** See `censoring `_ and the heading *Likelihoods for mixed continuous-discrete distributions* on the `wiki page `_ for likelihood functions. Discussion ********** We use :math:`\mu` for the mean and :math:`\delta > 0` for the standard deviation of a Gaussian or Laplace random variable :math:`y`. We use :math:`c \leq \mu` for the value we are censoring the random variable at. The censored random variable is defined by .. math:: \underline{y} = \left\{ \begin{array}{ll} c & \R{if} \; y \leq c \\ y & \R{otherwise} \end{array} \right. The crucial property is that the censored density functions (defined below) are smooth function with respect to the mean value :math:`\mu` (but not even continuous with respect to :math:`c` or :math:`y`). Simulation Test *************** The file ``test/user/censor_density.py`` contains a test of maximum likelihood estimation using the continuous-discrete densities proposed below. Gaussian ******** Density, G(y,mu,delta) ====================== The Gaussian density function is given by .. math:: G( y , \mu , \delta ) = \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] Error Function ============== The Error function is defined (for :math:`0 \leq x`) by .. math:: \R{erf}(x) = \sqrt{ \frac{1}{\pi} } \int_{-x}^{+x} \exp \left( - t^2 \right) \; \R{d} t Using he change of variables :math:`t = \sqrt{2}^{-1} (y - \mu) / \delta )` we have :math:`y = \mu + t \delta \sqrt{2}` and .. math:: \R{erf}(x) = \sqrt{ \frac{1}{2 \pi \delta^2} } \int_{\mu - x \delta \sqrt{2}}^{\mu + x \delta \sqrt{2}} \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y Setting :math:`x = \sqrt{2}^{-1} ( \mu - c ) / \delta` we obtain .. math:: \R{erf}\left( \sqrt{2}^{-1} ( \mu - c ) / \delta \right) = \sqrt{ \frac{1}{2 \pi \delta^2} } \int_{c}^{2 \mu - c} \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y Note that this integral is negative when :math:`c > \mu`. The Gaussian density is symmetric about :math:`y = \mu` and its integral from minus infinity to plus infinity is one. Hence .. math:: \frac{ 1 - \R{erf}\left( \sqrt{2}^{-1} ( \mu - c ) / \delta \right) }{2} = \sqrt{ \frac{1}{2 \pi \delta^2} } \int_{-\infty}^{c} \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y Censored Density, G(y,mu,delta,c) ================================= The censored Gaussian density is defined by .. math:: G ( \underline{y} , \mu , \delta , c ) = \left\{ \begin{array}{ll} \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) / 2 & \R{if} \; \underline{y} = c \\ G( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right. This density function is with respect to the Lebesgue measure plus an atom with mass one at :math:`\underline{y} = c`. Difference Between Means ======================== We use :math:`\overline{\underline{y}}` to denote the mean after censoring the distribution: .. math:: \frac{ \overline{\underline{y}} - \mu }{ \delta } = \frac{c - \mu}{2 \delta } \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) + \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \int_c^{+\infty} \frac{y - \mu}{ \delta } \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y .. math:: \frac{ \overline{\underline{y}} - \mu }{ \delta } = \frac{c - \mu}{2 \delta } \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) - \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \left[ \exp \left( - \frac{1}{2} \left[ \frac{y - \mu}{\delta} \right]^2 \right) \right]_c^{+\infty} .. math:: \overline{\underline{y}} - \mu = \frac{c - \mu}{2} \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) + \sqrt{ \frac{1}{ 2 \pi } } \exp \left( - \frac{1}{2} \left[ \frac{c - \mu}{\delta} \right]^2 \right) Laplace ******* Density, L(y,mu,delta) ====================== The Laplace density function is given by .. math:: L( y , \mu , \delta ) = \sqrt{ \frac{1}{2 \delta^2 } } \exp \left[ - \sqrt{2} \left| \frac{y - \mu}{\delta} \right| \right] Indefinite Integral =================== The indefinite integral with respect to :math:`y`, for :math:`x \leq \mu`, is .. math:: \int_{-\infty}^{x} L( y , \mu , \delta ) \; \R{d} y = \sqrt{ \frac{1}{2 \delta^2 } } \int_{-\infty}^{x} \exp \left( - \sqrt{2} \frac{\mu - y}{\delta} \right) \; \R{d} y Using :math:`c \leq \mu`, we obtain .. math:: \int_{-\infty}^{c} L( y , \mu , \delta ) \; \R{d} y = \frac{1}{2} \exp \left( - \sqrt{2} \frac{\mu - c}{\delta} \right) Censored Density, L(y,mu,delta,c) ================================= The censored Laplace density is defined by .. math:: L ( \underline{y} , \mu , \delta , c ) = \left\{ \begin{array}{ll} (1 / 2 ) \exp \left( - ( \mu - c ) \sqrt{2} / \delta \right) & \R{if} \; \underline{y} = c \\ L( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right. This density function is with respect to the Lebesgue measure plus an atom with mass one at :math:`\underline{y} = c`. Difference Between Means ======================== We use :math:`\overline{\underline{y}}` to denote the mean after censoring the distribution: .. math:: \frac{ \overline{\underline{y}} - \mu }{ \delta } = \frac{c - \mu}{2 \delta } \exp \left( - \sqrt{2} \frac{\mu - c}{\delta} \right) + \sqrt{ \frac{1}{2 \delta^2 } } \int_c^{+\infty} \frac{y - \mu}{\delta} \exp \left[ - \sqrt{2} \left| \frac{y - \mu}{\delta} \right| \right] \; \R{d} y Using integration by parts, one can obtain a formula for :math:`\overline{\underline{y}} - \mu` in a manner similar to calculation of the :ref:`censor_density@Gaussian@Difference Between Means` for the Gaussian case. Log Gaussian ************ Suppose that :math:`\log(y + \eta )` has a Gaussian distribution with mean :math:`\log( \mu + \eta )` and standard deviation :math:`\delta` , and we are censoring the distribution at the value :math:`\log( c + \eta)` . For this case .. math:: G( y , \mu , \delta ) & = \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \exp \left[ - \frac{1}{2} \left( \frac{ \log(y + \eta) - \log( \mu + \eta) } {\delta} \right)^2 \right] \\ G ( \underline{y} , \mu , \delta , c ) & = \left\{ \begin{array}{ll} \left[ 1 - \R{erf}\left( \sqrt{2}^{-1} [ \log( \mu + \eta ) - \log( c + \eta )] / \delta \right) \right] / 2 & \R{if} \; \underline{y} = c \\ G( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right. Log Laplace *********** Suppose that :math:`\log(y + \eta )` has a Laplace distribution with mean :math:`\log( \mu + \eta )` and standard deviation :math:`\delta` , and we are censoring the distribution at the value :math:`\log( c + \eta)` . For this case .. math:: L( y , \mu , \delta ) & = \sqrt{ \frac{1}{2 \delta^2 } } \exp \left[ - \sqrt{2} \left| \frac{\log(y + \eta) - \log(\mu + \eta)}{\delta} \right| \right] \\ L ( \underline{y} , \mu , \delta , c ) & = \left\{ \begin{array}{ll} (1 / 2 ) \exp \left( - [ \log( \mu + \eta ) - \log( c + \eta) ] \sqrt{2} / \delta \right) & \R{if} \; \underline{y} = c \\ L( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right. {xrst_end censor_density}