\(\newcommand{\B}[1]{ {\bf #1} }\)
\(\newcommand{\R}[1]{ {\rm #1} }\)
\(\newcommand{\W}[1]{ \; #1 \; }\)
censor_density
The Censored Gaussian and Laplace Densities
References
See censoring and the heading
Likelihoods for mixed continuous-discrete distributions
on the wiki page for likelihood functions.
Discussion
We use \(\mu\) for the mean and
\(\delta > 0\) for the standard deviation
of a Gaussian or Laplace random variable \(y\) .
We use \(c \leq \mu\) for the value at which
the random variable is censored.
The censored random variable is defined by
\[\begin{split}\underline{y} = \left\{ \begin{array}{ll}
c & \R{if} \; y \leq c
\\
y & \R{otherwise}
\end{array} \right.\end{split}\]
The crucial property is that the
censored density functions (defined below)
are smooth functions of the mean value \(\mu\)
(but are not even continuous with respect to \(c\) or \(y\) ).
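The definition of \(\underline{y}\) can be illustrated with a minimal Python sketch. The function name `censor`, the parameter values, and the random seed are assumptions chosen for illustration; none of them come from the source.

```python
import random


def censor(y, c):
    """Return c when y <= c, otherwise y (the censored value)."""
    return c if y <= c else y


# Illustrative values only; they are not taken from the source.
mu, delta, c = 1.0, 0.5, 0.8
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [censor(rng.gauss(mu, delta), c) for _ in range(10_000)]

# Fraction of samples that sit exactly on the atom at c.
fraction_at_c = sum(s == c for s in samples) / len(samples)
print(f"fraction censored: {fraction_at_c:.3f}")
```

The printed fraction should be close to the probability that \(y \leq c\), which is the mass that the censored densities assign to the point \(\underline{y} = c\).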
Simulation Test
The file test/user/censor_density.py
contains a test
of maximum likelihood estimation using the continuous-discrete densities
proposed below.
Gaussian
Density, G(y,mu,delta)
The Gaussian density function is given by
\[G( y , \mu , \delta )
=
\sqrt{ \frac{1}{ 2 \pi \delta^2 } }
\exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right]\]
Error Function
The error function is defined (for \(0 \leq x\) ) by
\[\R{erf}(x)
=
\sqrt{ \frac{1}{\pi} } \int_{-x}^{+x} \exp \left( - t^2 \right) \; \R{d} t\]
Using the change of variables \(t = \sqrt{2}^{-1} (y - \mu) / \delta\)
we have \(y = \mu + t \delta \sqrt{2}\) and
\[\R{erf}(x)
=
\sqrt{ \frac{1}{2 \pi \delta^2} }
\int_{\mu - x \delta \sqrt{2}}^{\mu + x \delta \sqrt{2}}
\exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right]
\; \R{d} y\]
Setting \(x = \sqrt{2}^{-1} ( \mu - c ) / \delta\) we obtain
\[\R{erf}\left( \sqrt{2}^{-1} ( \mu - c ) / \delta \right)
=
\sqrt{ \frac{1}{2 \pi \delta^2} }
\int_{c}^{2 \mu - c}
\exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right]
\; \R{d} y\]
Note that this integral is negative when \(c > \mu\) .
The Gaussian density is symmetric about \(y = \mu\)
and its integral from minus infinity to plus infinity is one.
Hence
\[\frac{
1 - \R{erf}\left( \sqrt{2}^{-1} ( \mu - c ) / \delta \right)
}{2}
=
\sqrt{ \frac{1}{2 \pi \delta^2} }
\int_{-\infty}^{c}
\exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right]
\; \R{d} y\]
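The identity above can be checked numerically. The sketch below compares the closed form with a simple midpoint-rule integral of \(G( y , \mu , \delta )\). The helper names, the parameter values, and the truncation of the lower limit at \(\mu - 12 \delta\) are assumptions made for this example.

```python
import math


def gaussian_density(y, mu, delta):
    """G(y, mu, delta) as defined above."""
    z = (y - mu) / delta
    return math.exp(-0.5 * z * z) / (delta * math.sqrt(2.0 * math.pi))


def gaussian_mass_below(c, mu, delta):
    """Closed form (1 - erf((mu - c) / (delta * sqrt(2)))) / 2."""
    return 0.5 * (1.0 - math.erf((mu - c) / (delta * math.sqrt(2.0))))


def midpoint_integral(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


mu, delta, c = 1.0, 0.5, 0.8  # illustrative values
# The lower limit is truncated; the tail below mu - 12 * delta is negligible.
numeric = midpoint_integral(
    lambda y: gaussian_density(y, mu, delta), mu - 12.0 * delta, c
)
closed = gaussian_mass_below(c, mu, delta)
print(numeric, closed)
```

The two printed values should agree to within the accuracy of the quadrature.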
Censored Density, G(y,mu,delta,c)
The censored Gaussian density is defined by
\[\begin{split}G ( \underline{y} , \mu , \delta , c )
=
\left\{ \begin{array}{ll}
\left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right) / 2
& \R{if} \; \underline{y} = c
\\
G( \underline{y} , \mu , \delta ) & \R{otherwise}
\end{array} \right.\end{split}\]
This density function is with respect to the
Lebesgue measure plus an atom with mass one at \(\underline{y} = c\) .
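As a rough sketch of how this density might enter a likelihood, the Python below evaluates \(G ( \underline{y} , \mu , \delta , c )\) and sums its negative logarithm over a small data set. The function names and the data are hypothetical, and this is not the code in test/user/censor_density.py. The exact comparison `y_cen == c` assumes that censored observations are stored as exactly the value `c`.

```python
import math


def censored_gaussian_density(y_cen, mu, delta, c):
    """G(y_cen, mu, delta, c): atom mass at c, Gaussian density elsewhere."""
    if y_cen == c:
        return 0.5 * (1.0 - math.erf((mu - c) / (delta * math.sqrt(2.0))))
    z = (y_cen - mu) / delta
    return math.exp(-0.5 * z * z) / (delta * math.sqrt(2.0 * math.pi))


def negative_log_likelihood(data, mu, delta, c):
    """Sum of -log density over observations that are already censored."""
    return -sum(
        math.log(censored_gaussian_density(y, mu, delta, c)) for y in data
    )


# Hypothetical censored data: two values on the atom, three above it.
c = 0.8
data = [0.8, 0.8, 0.95, 1.1, 1.4]
for mu in (0.9, 1.0, 1.1):
    print(mu, negative_log_likelihood(data, mu, 0.5, c))
```

Because both branches are smooth in \(\mu\), an objective of this form can be handed to a derivative-based optimizer.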
Difference Between Means
We use \(\overline{\underline{y}}\) to
denote the mean after censoring the distribution:
\[\frac{ \overline{\underline{y}} - \mu }{ \delta }
=
\frac{c - \mu}{2 \delta }
\left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right)
+
\sqrt{ \frac{1}{ 2 \pi \delta^2 } }
\int_c^{+\infty} \frac{y - \mu}{ \delta }
\exp \left[ - \frac{1}{2} \left( \frac{y - \mu}{\delta} \right)^2 \right]
\; \R{d} y\]
\[\frac{ \overline{\underline{y}} - \mu }{ \delta }
=
\frac{c - \mu}{2 \delta }
\left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right)
-
\sqrt{ \frac{1}{ 2 \pi } }
\left[
\exp \left( - \frac{1}{2} \left[ \frac{y - \mu}{\delta} \right]^2 \right)
\right]_c^{+\infty}\]
\[\overline{\underline{y}} - \mu
=
\frac{c - \mu}{2}
\left( 1 - \R{erf}\left( \sqrt{2}^{-1} (\mu - c) / \delta \right) \right)
+
\delta \sqrt{ \frac{1}{ 2 \pi } }
\exp \left( - \frac{1}{2} \left[ \frac{c - \mu}{\delta} \right]^2 \right)\]
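The shift in the mean can also be checked against its definition. The sketch below evaluates the standard closed form, \((c - \mu)\) times the mass below \(c\) plus \(\delta\) times the standard normal density at \((c - \mu) / \delta\), and compares it with a midpoint-rule evaluation of the defining integral. The function names, the parameter values, and the truncation of the upper limit at \(\mu + 12 \delta\) are assumptions for this example.

```python
import math


def gaussian_mean_shift(mu, delta, c):
    """Closed form for (mean of the censored variable) minus mu."""
    z = (c - mu) / delta
    mass_below = 0.5 * (1.0 - math.erf(-z / math.sqrt(2.0)))
    tail_term = delta * math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (c - mu) * mass_below + tail_term


def gaussian_mean_shift_numeric(mu, delta, c, n=20_000):
    """Same quantity from the definition, using a midpoint rule."""
    mass_below = 0.5 * (1.0 - math.erf((mu - c) / (delta * math.sqrt(2.0))))
    upper = mu + 12.0 * delta  # the tail beyond this point is negligible
    h = (upper - c) / n
    total = 0.0
    for i in range(n):
        y = c + (i + 0.5) * h
        z = (y - mu) / delta
        density = math.exp(-0.5 * z * z) / (delta * math.sqrt(2.0 * math.pi))
        total += (y - mu) * density
    return (c - mu) * mass_below + h * total


mu, delta, c = 1.0, 0.5, 0.8  # illustrative values
print(gaussian_mean_shift(mu, delta, c))
print(gaussian_mean_shift_numeric(mu, delta, c))
```

Both values are positive here, as expected: replacing every outcome below \(c\) by \(c\) can only raise the mean.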
Laplace
Density, L(y,mu,delta)
The Laplace density function is given by
\[L( y , \mu , \delta )
=
\sqrt{ \frac{1}{2 \delta^2 } }
\exp \left[ - \sqrt{2} \left| \frac{y - \mu}{\delta} \right| \right]\]
Indefinite Integral
The indefinite integral with respect to \(y\) ,
for \(x \leq \mu\) , is
\[\int_{-\infty}^{x} L( y , \mu , \delta ) \; \R{d} y
=
\sqrt{ \frac{1}{2 \delta^2 } }
\int_{-\infty}^{x}
\exp \left( - \sqrt{2} \frac{\mu - y}{\delta} \right) \; \R{d} y\]
Using \(c \leq \mu\) , we obtain
\[\int_{-\infty}^{c} L( y , \mu , \delta ) \; \R{d} y
=
\frac{1}{2}
\exp \left( - \sqrt{2} \frac{\mu - c}{\delta} \right)\]
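A numerical check of this result, under the stated condition \(c \leq \mu\), might look like the following sketch. The helper names, the parameter values, and the truncation of the lower limit at \(\mu - 20 \delta\) are assumptions for this example.

```python
import math


def laplace_density(y, mu, delta):
    """L(y, mu, delta) as defined above."""
    scaled = math.sqrt(2.0) * abs(y - mu) / delta
    return math.exp(-scaled) / (delta * math.sqrt(2.0))


def laplace_mass_below(c, mu, delta):
    """Closed form exp(-sqrt(2) (mu - c) / delta) / 2, valid for c <= mu."""
    if c > mu:
        raise ValueError("this closed form assumes c <= mu")
    return 0.5 * math.exp(-math.sqrt(2.0) * (mu - c) / delta)


def midpoint_integral(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


mu, delta, c = 1.0, 0.5, 0.8  # illustrative values
# The lower limit is truncated; the tail below mu - 20 * delta is negligible.
numeric = midpoint_integral(
    lambda y: laplace_density(y, mu, delta), mu - 20.0 * delta, c
)
closed = laplace_mass_below(c, mu, delta)
print(numeric, closed)
```

At \(c = \mu\) the closed form gives one half, which matches the symmetry of the Laplace density about its mean.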
Censored Density, L(y,mu,delta,c)
The censored Laplace density is defined by
\[\begin{split}L ( \underline{y} , \mu , \delta , c )
=
\left\{ \begin{array}{ll}
(1 / 2 )
\exp \left( - ( \mu - c ) \sqrt{2} / \delta \right)
& \R{if} \; \underline{y} = c
\\
L( \underline{y} , \mu , \delta )
& \R{otherwise}
\end{array} \right.\end{split}\]
This density function is with respect to the
Lebesgue measure plus an atom with mass one at \(\underline{y} = c\) .
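One property worth checking is that the atom at \(c\) plus the integral of the density over \(y > c\) is one. The sketch below does this with a midpoint rule that is split at \(\mu\), where the Laplace density has a kink. The function names, the parameter values, and the truncation of the upper limit at \(\mu + 20 \delta\) are assumptions for this example.

```python
import math


def censored_laplace_density(y_cen, mu, delta, c):
    """L(y_cen, mu, delta, c) for c <= mu: atom at c, Laplace density elsewhere."""
    if y_cen == c:
        return 0.5 * math.exp(-(mu - c) * math.sqrt(2.0) / delta)
    scaled = math.sqrt(2.0) * abs(y_cen - mu) / delta
    return math.exp(-scaled) / (delta * math.sqrt(2.0))


def midpoint_integral(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


mu, delta, c = 1.0, 0.5, 0.8  # illustrative values


def density(y):
    return censored_laplace_density(y, mu, delta, c)


atom = censored_laplace_density(c, mu, delta, c)
# Integrate over y > c in two pieces so the kink at mu is a piece boundary.
continuous = (
    midpoint_integral(density, c, mu)
    + midpoint_integral(density, mu, mu + 20.0 * delta)
)
total_mass = atom + continuous
print(total_mass)
```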
Difference Between Means
We use \(\overline{\underline{y}}\) to
denote the mean after censoring the distribution:
\[\frac{ \overline{\underline{y}} - \mu }{ \delta }
=
\frac{c - \mu}{2 \delta }
\exp \left( - \sqrt{2} \frac{\mu - c}{\delta} \right)
+
\sqrt{ \frac{1}{2 \delta^2 } }
\int_c^{+\infty} \frac{y - \mu}{\delta}
\exp \left[ - \sqrt{2} \left| \frac{y - \mu}{\delta} \right| \right]
\; \R{d} y\]
Using integration by parts,
one can obtain a formula for \(\overline{\underline{y}} - \mu\)
in a manner similar to the calculation of the
Difference Between Means for the Gaussian case.
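Carrying out that integration by parts under the assumption \(c \leq \mu\) appears to give \(\overline{\underline{y}} - \mu = \delta \, 2^{-3/2} \exp \left( - \sqrt{2} ( \mu - c ) / \delta \right)\). This closed form is not stated in the source, so the sketch below treats it as a candidate and compares it with a numerical evaluation of the defining expression. All function names and parameter values are assumptions for this example.

```python
import math


def laplace_density(y, mu, delta):
    """L(y, mu, delta) as defined above."""
    scaled = math.sqrt(2.0) * abs(y - mu) / delta
    return math.exp(-scaled) / (delta * math.sqrt(2.0))


def midpoint_integral(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


def laplace_mean_shift_numeric(mu, delta, c):
    """(c - mu) * atom + integral of (y - mu) L(y) over y > c, for c <= mu."""
    atom = 0.5 * math.exp(-math.sqrt(2.0) * (mu - c) / delta)

    def integrand(y):
        return (y - mu) * laplace_density(y, mu, delta)

    # Split at mu (kink) and truncate the negligible upper tail.
    integral = (
        midpoint_integral(integrand, c, mu)
        + midpoint_integral(integrand, mu, mu + 30.0 * delta)
    )
    return (c - mu) * atom + integral


def laplace_mean_shift_candidate(mu, delta, c):
    """Candidate closed form (an assumption, checked numerically here)."""
    decay = math.exp(-math.sqrt(2.0) * (mu - c) / delta)
    return delta * decay / (2.0 * math.sqrt(2.0))


mu, delta, c = 1.0, 0.5, 0.8  # illustrative values
print(laplace_mean_shift_numeric(mu, delta, c))
print(laplace_mean_shift_candidate(mu, delta, c))
```

As a sanity check on the candidate, at \(c = \mu\) it reduces to half of the mean absolute deviation of the Laplace distribution, which is what censoring at the mean should add.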
Log Gaussian
Suppose that
\(\log(y + \eta )\) has a Gaussian distribution with mean
\(\log( \mu + \eta )\) and standard deviation \(\delta\) ,
and we are censoring the distribution at the value \(\log( c + \eta)\) .
For this case
\[\begin{split}G( y , \mu , \delta )
& =
\sqrt{ \frac{1}{ 2 \pi \delta^2 } }
\exp \left[ - \frac{1}{2} \left(
\frac{ \log(y + \eta) - \log( \mu + \eta) } {\delta}
\right)^2 \right]
\\
G ( \underline{y} , \mu , \delta , c )
& =
\left\{ \begin{array}{ll}
\left[ 1 - \R{erf}\left(
\sqrt{2}^{-1} [ \log( \mu + \eta ) - \log( c + \eta )] / \delta
\right) \right] / 2
& \R{if} \; \underline{y} = c
\\
G( \underline{y} , \mu , \delta ) & \R{otherwise}
\end{array} \right.\end{split}\]
Log Laplace
Suppose that
\(\log(y + \eta )\) has a Laplace distribution with mean
\(\log( \mu + \eta )\) and standard deviation \(\delta\) ,
and we are censoring the distribution at the value \(\log( c + \eta)\) .
For this case
\[\begin{split}L( y , \mu , \delta )
& =
\sqrt{ \frac{1}{2 \delta^2 } }
\exp \left[ - \sqrt{2} \left|
\frac{\log(y + \eta) - \log(\mu + \eta)}{\delta}
\right| \right]
\\
L ( \underline{y} , \mu , \delta , c )
& =
\left\{ \begin{array}{ll}
(1 / 2 )
\exp \left(
- [ \log( \mu + \eta ) - \log( c + \eta) ] \sqrt{2} / \delta
\right)
& \R{if} \; \underline{y} = c
\\
L( \underline{y} , \mu , \delta )
& \R{otherwise}
\end{array} \right.\end{split}\]
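Both log cases amount to evaluating the earlier censored densities at log-transformed arguments. The sketch below makes that explicit with a hypothetical wrapper `log_censored_density`. The function names and the parameter values are assumptions, and the code assumes \(y + \eta\), \(\mu + \eta\) and \(c + \eta\) are all positive.

```python
import math


def censored_gaussian_density(y_cen, mu, delta, c):
    """Censored Gaussian density: atom mass at c, Gaussian density elsewhere."""
    if y_cen == c:
        return 0.5 * (1.0 - math.erf((mu - c) / (delta * math.sqrt(2.0))))
    z = (y_cen - mu) / delta
    return math.exp(-0.5 * z * z) / (delta * math.sqrt(2.0 * math.pi))


def censored_laplace_density(y_cen, mu, delta, c):
    """Censored Laplace density for c <= mu."""
    if y_cen == c:
        return 0.5 * math.exp(-(mu - c) * math.sqrt(2.0) / delta)
    scaled = math.sqrt(2.0) * abs(y_cen - mu) / delta
    return math.exp(-scaled) / (delta * math.sqrt(2.0))


def log_censored_density(base_density, y_cen, mu, delta, c, eta):
    """Evaluate base_density with y_cen, mu and c replaced by log(. + eta)."""

    def transform(value):
        return math.log(value + eta)

    return base_density(transform(y_cen), transform(mu), delta, transform(c))


# Illustrative values only; they are not taken from the source.
mu, delta, c, eta = 1.0, 0.3, 0.5, 0.1
for y_cen in (c, 2.0):
    print(
        y_cen,
        log_censored_density(censored_gaussian_density, y_cen, mu, delta, c, eta),
        log_censored_density(censored_laplace_density, y_cen, mu, delta, c, eta),
    )
```

Since the logarithm is increasing, an observation equal to \(c\) maps exactly to \(\log( c + \eta )\), so the atom test inside the base densities still applies after the transformation.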