censor_density#

The Censored Gaussian and Laplace Densities#

References#

See censoring and the heading Likelihoods for mixed continuous-discrete distributions on the wiki page for likelihood functions.

Discussion#

We use \(\mu\) for the mean and \(\delta > 0\) for the standard deviation of a Gaussian or Laplace random variable \(y\). We use \(c \leq \mu\) for the value we are censoring the random variable at. The censored random variable is defined by

\[\begin{split}\underline{y} = \left\{ \begin{array}{ll} c & \R{if} \; y \leq c \\ y & \R{otherwise} \end{array} \right.\end{split}\]
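The definition above amounts to taking the larger of \(y\) and \(c\). As a minimal sketch (the function name `censor` is hypothetical and used only for illustration):

```python
def censor(y, c):
    """Return the censored value: c if y <= c, otherwise y."""
    return c if y <= c else y


samples = [-1.5, 0.0, 0.2, 3.0]
censored = [censor(y, c=0.0) for y in samples]
print(censored)  # [0.0, 0.0, 0.2, 3.0]
```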

The crucial property is that the censored density functions (defined below) are smooth functions of the mean value \(\mu\) (but are not even continuous with respect to \(c\) or \(y\)).

Simulation Test#

The file test/user/censor_density.py contains a test of maximum likelihood estimation using the continuous-discrete densities proposed below.
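To illustrate the idea of such a test, here is a minimal standalone sketch. It is not the contents of test/user/censor_density.py; it only assumes the censored Gaussian density defined later on this page. All names (`censored_gaussian_log_density`, `estimate_mu`) and the choice of a golden-section search are assumptions made for this example.

```python
import math
import random


def censored_gaussian_log_density(y_censored, mu, delta, c):
    """Log of the censored Gaussian density (atom at c, Gaussian elsewhere)."""
    if y_censored == c:
        x = (mu - c) / (delta * math.sqrt(2.0))
        # math.erfc(x) equals 1 - erf(x)
        return math.log(math.erfc(x) / 2.0)
    z = (y_censored - mu) / delta
    return -0.5 * z * z - math.log(delta * math.sqrt(2.0 * math.pi))


def estimate_mu(data, delta, c, lo, hi, tol=1e-6):
    """Golden-section search for the mu in [lo, hi] maximizing the likelihood."""
    def neg_loglik(mu):
        return -sum(censored_gaussian_log_density(y, mu, delta, c) for y in data)

    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        x1 = b - ratio * (b - a)
        x2 = a + ratio * (b - a)
        if neg_loglik(x1) < neg_loglik(x2):
            b = x2  # minimum lies in [a, x2]
        else:
            a = x1  # minimum lies in [x1, b]
    return 0.5 * (a + b)


random.seed(0)
mu_true, delta, c = 1.0, 2.0, 0.0
# Simulate Gaussian data and censor it from below at c.
data = [max(random.gauss(mu_true, delta), c) for _ in range(2000)]
mu_hat = estimate_mu(data, delta, c, lo=c, hi=c + 5.0)
print(mu_hat)
```

The search relies on the negative log-likelihood having a single minimum in \(\mu\) on the search interval; the estimate should land near `mu_true` up to sampling error.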

Gaussian#

Density, G(y,mu,delta)#

The Gaussian density function is given by

\[G( y , \mu , \delta ) = \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right]\]

Error Function#

The error function is defined (for \(0 \leq x\)) by

\[\R{erf}(x) = \sqrt{ \frac{1}{\pi} } \int_{-x}^{+x} \exp \left( - t^2 \right) \; \R{d} t\]

Using the change of variables \(t = \sqrt{2}^{-1} (y - \mu) / \delta\) we have \(y = \mu + t \delta \sqrt{2}\) and

\[\R{erf}(x) = \sqrt{ \frac{1}{2 \pi \delta^2} } \int_{\mu - x \delta \sqrt{2}}^{\mu + x \delta \sqrt{2}} \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y\]

Setting \(x = \sqrt{2}^{-1} ( \mu - c ) / \delta\) we obtain

\[\R{erf}\left( \sqrt{2}^{-1} ( \mu - c ) / \delta \right) = \sqrt{ \frac{1}{2 \pi \delta^2} } \int_{c}^{2 \mu - c} \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y\]

Note that this integral is negative when \(c > \mu\). The Gaussian density is symmetric about \(y = \mu\) and its integral from minus infinity to plus infinity is one. Hence

\[\frac{ 1 - \R{erf}\left( \sqrt{2}^{-1} ( \mu - c ) / \delta \right) }{2} = \sqrt{ \frac{1}{2 \pi \delta^2} } \int_{-\infty}^{c} \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y\]
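As a quick numerical sanity check of this identity, the sketch below compares the erf expression with the Gaussian cumulative distribution function from the Python standard library. The helper name `gaussian_mass_below` is hypothetical.

```python
import math
from statistics import NormalDist


def gaussian_mass_below(c, mu, delta):
    """Probability that a Gaussian(mu, delta) variable is <= c, via erf."""
    x = (mu - c) / (delta * math.sqrt(2.0))
    return (1.0 - math.erf(x)) / 2.0


mu, delta = 1.0, 2.0
for c in (-3.0, 0.0, 1.0):
    via_erf = gaussian_mass_below(c, mu, delta)
    via_cdf = NormalDist(mu, delta).cdf(c)
    print(c, via_erf, via_cdf)
```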

Censored Density, G(y,mu,delta,c)#

The censored Gaussian density is defined by

\[\begin{split}G ( \underline{y} , \mu , \delta , c ) = \left\{ \begin{array}{ll} \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) / 2 & \R{if} \; \underline{y} = c \\ G( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right.\end{split}\]

This density function is with respect to the Lebesgue measure plus an atom with mass one at \(\underline{y} = c\).
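A minimal sketch of this definition in Python follows. The function names are hypothetical, and the error checks for \(c > \mu\) and \(\underline{y} < c\) are assumptions added for the example, reflecting the conventions stated in the Discussion.

```python
import math


def gaussian_density(y, mu, delta):
    """Gaussian density G(y, mu, delta)."""
    z = (y - mu) / delta
    return math.exp(-0.5 * z * z) / (delta * math.sqrt(2.0 * math.pi))


def censored_gaussian_density(y_censored, mu, delta, c):
    """Censored Gaussian density G(y_censored, mu, delta, c)."""
    if c > mu:
        raise ValueError("this page assumes c <= mu")
    if y_censored < c:
        raise ValueError("a censored value cannot be below c")
    if y_censored == c:
        # Mass of the atom at c: probability that y <= c.
        x = (mu - c) / (delta * math.sqrt(2.0))
        return (1.0 - math.erf(x)) / 2.0
    return gaussian_density(y_censored, mu, delta)


print(censored_gaussian_density(0.0, mu=1.0, delta=2.0, c=0.0))  # atom at c
print(censored_gaussian_density(1.5, mu=1.0, delta=2.0, c=0.0))  # Gaussian part
```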

Difference Between Means#

We use \(\overline{\underline{y}}\) to denote the mean after censoring the distribution:

\[\frac{ \overline{\underline{y}} - \mu }{ \delta } = \frac{c - \mu}{2 \delta } \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) + \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \int_c^{+\infty} \frac{y - \mu}{ \delta } \exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right] \; \R{d} y\]
\[\frac{ \overline{\underline{y}} - \mu }{ \delta } = \frac{c - \mu}{2 \delta } \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) - \sqrt{ \frac{1}{ 2 \pi } } \left[ \exp \left( - \frac{1}{2} \left[ \frac{y - \mu}{\delta} \right]^2 \right) \right]_c^{+\infty}\]
\[\overline{\underline{y}} - \mu = \frac{c - \mu}{2} \left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) + \delta \sqrt{ \frac{1}{ 2 \pi } } \exp \left( - \frac{1}{2} \left[ \frac{c - \mu}{\delta} \right]^2 \right)\]
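The sketch below cross-checks this calculation numerically. It evaluates the integral in the first equation with the midpoint rule and compares the result with the closed form in which the exponential term carries a factor \(\delta\), which is what carrying out the integral gives. The function names, the truncation of the integral at \(\mu + 12 \delta\), and the number of subintervals are assumptions made for this example.

```python
import math


def gaussian_mean_shift(mu, delta, c):
    """Closed form for (mean after censoring) - mu."""
    z = (c - mu) / delta
    mass_at_c = (1.0 - math.erf(-z / math.sqrt(2.0))) / 2.0
    tail_term = delta * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (c - mu) * mass_at_c + tail_term


def gaussian_mean_shift_numeric(mu, delta, c, n=200_000):
    """Same quantity with the integral over [c, mu + 12 delta] done numerically."""
    upper = mu + 12.0 * delta
    h = (upper - c) / n
    total = 0.0
    for i in range(n):
        y = c + (i + 0.5) * h  # midpoint of subinterval i
        z = (y - mu) / delta
        total += (y - mu) * math.exp(-0.5 * z * z)
    integral = total * h / (delta * math.sqrt(2.0 * math.pi))
    mass_at_c = (1.0 - math.erf((mu - c) / (delta * math.sqrt(2.0)))) / 2.0
    return (c - mu) * mass_at_c + integral


mu, delta, c = 1.0, 2.0, -1.0
print(gaussian_mean_shift(mu, delta, c))
print(gaussian_mean_shift_numeric(mu, delta, c))
```

Censoring from below can only raise the mean, so both values should be non-negative and should shrink toward zero as \(c\) moves far below \(\mu\).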

Laplace#

Density, L(y,mu,delta)#

The Laplace density function is given by

\[L( y , \mu , \delta ) = \sqrt{ \frac{1}{2 \delta^2 } } \exp \left[ - \sqrt{2} \left| \frac{y - \mu}{\delta} \right| \right]\]

Indefinite Integral#

The integral of the density with respect to \(y\) from minus infinity to \(x\), for \(x \leq \mu\), is

\[\int_{-\infty}^{x} L( y , \mu , \delta ) \; \R{d} y = \sqrt{ \frac{1}{2 \delta^2 } } \int_{-\infty}^{x} \exp \left( - \sqrt{2} \frac{\mu - y}{\delta} \right) \; \R{d} y\]

Evaluating this integral at \(x = c\), which is valid because \(c \leq \mu\), we obtain

\[\int_{-\infty}^{c} L( y , \mu , \delta ) \; \R{d} y = \frac{1}{2} \exp \left( - \sqrt{2} \frac{\mu - c}{\delta} \right)\]
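As a numerical sanity check of this formula, the sketch below integrates the Laplace density with the midpoint rule and compares the result with the closed form. The function names, the truncation of the lower limit at \(c - 40 \delta\), and the number of subintervals are assumptions made for this example.

```python
import math


def laplace_density(y, mu, delta):
    """Laplace density L(y, mu, delta) with standard deviation delta."""
    return math.exp(-math.sqrt(2.0) * abs(y - mu) / delta) / (delta * math.sqrt(2.0))


def laplace_mass_below(c, mu, delta):
    """Closed form for the integral up to c, valid for c <= mu."""
    return 0.5 * math.exp(-math.sqrt(2.0) * (mu - c) / delta)


def laplace_mass_below_numeric(c, mu, delta, n=200_000):
    """Midpoint-rule integral of the density over [c - 40 delta, c]."""
    lower = c - 40.0 * delta
    h = (c - lower) / n
    return h * sum(laplace_density(lower + (i + 0.5) * h, mu, delta) for i in range(n))


mu, delta, c = 1.0, 2.0, -1.0
print(laplace_mass_below(c, mu, delta))
print(laplace_mass_below_numeric(c, mu, delta))
```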

Censored Density, L(y,mu,delta,c)#

The censored Laplace density is defined by

\[\begin{split}L ( \underline{y} , \mu , \delta , c ) = \left\{ \begin{array}{ll} (1 / 2 ) \exp \left( - ( \mu - c ) \sqrt{2} / \delta \right) & \R{if} \; \underline{y} = c \\ L( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right.\end{split}\]

This density function is with respect to the Lebesgue measure plus an atom with mass one at \(\underline{y} = c\).
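A minimal sketch of the censored Laplace density in Python follows. The function names are hypothetical, and the error checks for \(c > \mu\) and \(\underline{y} < c\) are assumptions added for the example.

```python
import math


def laplace_density(y, mu, delta):
    """Laplace density L(y, mu, delta) with standard deviation delta."""
    return math.exp(-math.sqrt(2.0) * abs(y - mu) / delta) / (delta * math.sqrt(2.0))


def censored_laplace_density(y_censored, mu, delta, c):
    """Censored Laplace density L(y_censored, mu, delta, c)."""
    if c > mu:
        raise ValueError("the atom formula on this page assumes c <= mu")
    if y_censored < c:
        raise ValueError("a censored value cannot be below c")
    if y_censored == c:
        # Mass of the atom at c: probability that y <= c.
        return 0.5 * math.exp(-math.sqrt(2.0) * (mu - c) / delta)
    return laplace_density(y_censored, mu, delta)


print(censored_laplace_density(0.0, mu=1.0, delta=2.0, c=0.0))  # atom at c
print(censored_laplace_density(1.5, mu=1.0, delta=2.0, c=0.0))  # Laplace part
```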

Difference Between Means#

We use \(\overline{\underline{y}}\) to denote the mean after censoring the distribution:

\[\frac{ \overline{\underline{y}} - \mu }{ \delta } = \frac{c - \mu}{2 \delta } \exp \left( - \sqrt{2} \frac{\mu - c}{\delta} \right) + \sqrt{ \frac{1}{2 \delta^2 } } \int_c^{+\infty} \frac{y - \mu}{\delta} \exp \left[ - \sqrt{2} \left| \frac{y - \mu}{\delta} \right| \right] \; \R{d} y\]

Using integration by parts, one can obtain a formula for \(\overline{\underline{y}} - \mu\) in a manner similar to the calculation of the Difference Between Means for the Gaussian case.
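Carrying out that calculation by hand (this result is derived for this example and is not stated above), the terms proportional to \(c - \mu\) cancel and one obtains \(\overline{\underline{y}} - \mu = \delta \sqrt{2}^{-1} \exp \left( - \sqrt{2} ( \mu - c ) / \delta \right) / 2\) for \(c \leq \mu\). The sketch below checks this candidate closed form against a midpoint-rule evaluation of the integral in the equation above. The function names, the truncation at \(\mu + 40 \delta\), and the number of subintervals are assumptions.

```python
import math


def laplace_mean_shift(mu, delta, c):
    """Candidate closed form for (mean after censoring) - mu, for c <= mu."""
    k = math.sqrt(2.0)
    return delta * math.exp(-k * (mu - c) / delta) / (2.0 * k)


def laplace_mean_shift_numeric(mu, delta, c, n=200_000):
    """Same quantity with the integral over [c, mu + 40 delta] done numerically."""
    k = math.sqrt(2.0)
    upper = mu + 40.0 * delta
    h = (upper - c) / n
    total = 0.0
    for i in range(n):
        y = c + (i + 0.5) * h  # midpoint of subinterval i
        total += (y - mu) * math.exp(-k * abs(y - mu) / delta)
    integral = total * h / (delta * k)
    mass_at_c = 0.5 * math.exp(-k * (mu - c) / delta)
    return (c - mu) * mass_at_c + integral


mu, delta, c = 1.0, 2.0, -1.0
print(laplace_mean_shift(mu, delta, c))
print(laplace_mean_shift_numeric(mu, delta, c))
```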

Log Gaussian#

Suppose that \(\log(y + \eta )\) has a Gaussian distribution with mean \(\log( \mu + \eta )\) and standard deviation \(\delta\), and we are censoring the distribution at the value \(\log( c + \eta)\). For this case

\[\begin{split}G( y , \mu , \delta ) & = \sqrt{ \frac{1}{ 2 \pi \delta^2 } } \exp \left[ - \frac{1}{2} \left( \frac{ \log(y + \eta) - \log( \mu + \eta) } {\delta} \right)^2 \right] \\ G ( \underline{y} , \mu , \delta , c ) & = \left\{ \begin{array}{ll} \left[ 1 - \R{erf}\left( \sqrt{2}^{-1} [ \log( \mu + \eta ) - \log( c + \eta )] / \delta \right) \right] / 2 & \R{if} \; \underline{y} = c \\ G( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right.\end{split}\]
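A minimal sketch of these two formulas in Python, assuming \(y + \eta\), \(\mu + \eta\), and \(c + \eta\) are all positive so the logarithms are defined. The function name is hypothetical.

```python
import math


def log_gaussian_censored_density(y_censored, mu, delta, c, eta):
    """Censored density when log(y + eta) is Gaussian with mean log(mu + eta)."""
    if y_censored == c:
        # Mass of the atom at c, using differences of logs in place of mu - c.
        x = (math.log(mu + eta) - math.log(c + eta)) / (delta * math.sqrt(2.0))
        return (1.0 - math.erf(x)) / 2.0
    z = (math.log(y_censored + eta) - math.log(mu + eta)) / delta
    return math.exp(-0.5 * z * z) / (delta * math.sqrt(2.0 * math.pi))


print(log_gaussian_censored_density(0.5, mu=2.0, delta=0.3, c=0.5, eta=0.1))
print(log_gaussian_censored_density(1.8, mu=2.0, delta=0.3, c=0.5, eta=0.1))
```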

Log Laplace#

Suppose that \(\log(y + \eta )\) has a Laplace distribution with mean \(\log( \mu + \eta )\) and standard deviation \(\delta\), and we are censoring the distribution at the value \(\log( c + \eta)\). For this case

\[\begin{split}L( y , \mu , \delta ) & = \sqrt{ \frac{1}{2 \delta^2 } } \exp \left[ - \sqrt{2} \left| \frac{\log(y + \eta) - \log(\mu + \eta)}{\delta} \right| \right] \\ L ( \underline{y} , \mu , \delta , c ) & = \left\{ \begin{array}{ll} (1 / 2 ) \exp \left( - [ \log( \mu + \eta ) - \log( c + \eta) ] \sqrt{2} / \delta \right) & \R{if} \; \underline{y} = c \\ L( \underline{y} , \mu , \delta ) & \R{otherwise} \end{array} \right.\end{split}\]
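A minimal sketch of these two formulas in Python, assuming \(c \leq \mu\) and that \(y + \eta\), \(\mu + \eta\), and \(c + \eta\) are all positive so the logarithms are defined. The function name is hypothetical.

```python
import math


def log_laplace_censored_density(y_censored, mu, delta, c, eta):
    """Censored density when log(y + eta) is Laplace with mean log(mu + eta)."""
    k = math.sqrt(2.0)
    if y_censored == c:
        # Mass of the atom at c, using differences of logs in place of mu - c.
        return 0.5 * math.exp(-k * (math.log(mu + eta) - math.log(c + eta)) / delta)
    z = (math.log(y_censored + eta) - math.log(mu + eta)) / delta
    return math.exp(-k * abs(z)) / (delta * k)


print(log_laplace_censored_density(0.5, mu=2.0, delta=0.3, c=0.5, eta=0.1))
print(log_laplace_censored_density(1.8, mu=2.0, delta=0.3, c=0.5, eta=0.1))
```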