Concentration models

Let \(x\) denote a random variable from a lognormal distribution. Then the log-transformed variable \(y = \ln(x)\) is normally distributed with mean \(\mu_{y}\) and variance \(\sigma_{y}^2\). The probability density function (p.d.f.) of \(y\) may be expressed as:

\[f_{y}(y; p_{0}, \mu_{y}, \sigma_{y}^2) = p_{0}I(y;0) + (1 - p_{0})(1 - I(y;0)) \cdot \frac{1}{\sqrt{2\pi\sigma_{y}^2}} \exp \left( -\frac{(y - \mu_{y})^2}{2\sigma_{y}^2} \right)\]

where \(p_{0} = \mathit{Pr}(y < \log(x_{lor}))\), \(x_{lor}\) is the limit of reporting (LOR) and \(I(y;0)\) is an indicator function for \(y < \log(x_{lor})\). For \(p_{0} = 0\) the p.d.f. of \(y\) reduces to the usual normal density, so that \(x\) follows the usual lognormal distribution. The left-truncated density for \(y \geq \log(x_{lor})\) may be expressed as:

\[f_{y}(y; \mu_{y}, \sigma_{y}^2) = \frac{1}{1 - \Phi(z)} \cdot \frac{1}{\sqrt{2\pi\sigma_{y}^2}} \exp \left( -\frac{(y - \mu_{y})^2}{2\sigma_{y}^2} \right)\]

with \(\Phi(\cdot)\) the standard normal c.d.f. and \(z = (\log(x_{lor}) - \mu_{y}) / \sigma_{y}\). Model parameters are estimated by maximum likelihood, based on the loglikelihood functions specified below. The loglikelihood functions are evaluated in R, and the optim function is used to find estimates for \(\mu_{y}\), \(\sigma_{y}^2\) and \(p_{0}\).
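The left-truncated density above can be checked numerically. The following is a minimal sketch in Python (standard library only), not the R code referred to in the text; the names truncated_log_density and integrate and the parameter values are hypothetical and chosen for illustration. It evaluates the normal density of \(y\) divided by \(1 - \Phi(z)\) and confirms that the result integrates to approximately one above \(\log(x_{lor})\).

```python
from statistics import NormalDist


def truncated_log_density(y, mu, sigma, log_lor):
    """Density of y = log(x) for a normal(mu, sigma^2) left-truncated at log_lor."""
    if y < log_lor:
        return 0.0  # the truncated density has no mass below the limit of reporting
    z = (log_lor - mu) / sigma
    return NormalDist(mu, sigma).pdf(y) / (1.0 - NormalDist().cdf(z))


def integrate(f, lower, upper, steps=20000):
    """Plain midpoint rule, used only to sanity-check the normalisation."""
    width = (upper - lower) / steps
    return sum(f(lower + (k + 0.5) * width) for k in range(steps)) * width


# Illustrative (assumed) parameter values on the log scale.
mu, sigma, log_lor = 0.0, 1.0, -0.5
area = integrate(
    lambda y: truncated_log_density(y, mu, sigma, log_lor),
    log_lor,
    mu + 10 * sigma,  # the upper tail beyond this point is negligible
)
print(round(area, 6))
```

Dividing by \(1 - \Phi(z)\) rescales the part of the normal density above the limit of reporting so that it is a proper density again.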

Mixture zero spike and censored lognormal

The loglikelihood may be expressed as:

\[\log L ( p_{0}, \mu_{y}, \sigma_{y}^2) = \sum_{i=1}^{n_{0}} \log \left( p_{0} + (1 - p_{0})\Phi(z_{i}) \right) + n_{1} \log \left( \frac{1 - p_{0}} {\sqrt{2\pi\sigma_{y}^2}} \right) - \sum_{i = n_{0} + 1}^n \frac{(y_{i} - \mu_{y})^2}{2\sigma_{y}^2}\]

where \(y_{i} = \log(x_{i})\), \(\Phi(\cdot)\) is the standard normal c.d.f. and \(z_{i} = (\log(x_{i,lor}) - \mu_{y}) / \sigma_{y}\); with a single limit of reporting, every \(z_{i}\) equals \(z_{lor} = (\log(x_{lor}) - \mu_{y}) / \sigma_{y}\). Here \(n_{0}\) is the number of censored values \((x_{i} < x_{i,lor})\), \(n_{1}\) is the number of uncensored values \((x_{i} \geq x_{i,lor})\) and \(i = 1, \ldots, n\) with \(n = n_{0} + n_{1}\).

Multiple values for LOR are allowed.
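As an illustration, the loglikelihood of the zero spike and censored lognormal mixture can be written out term by term. This is a sketch in Python under stated assumptions, not the R implementation mentioned earlier: the function name loglik_zero_spike_censored, its argument layout (one LOR per censored record, a list of reported positive concentrations) and the example data are all hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal c.d.f.


def loglik_zero_spike_censored(p0, mu, sigma, censored_lors, positives):
    """Loglikelihood of the zero spike / censored lognormal mixture.

    censored_lors: limit of reporting x_{i,lor} of each censored record
                   (values may differ between records)
    positives:     reported concentrations with x_i >= x_{i,lor}
    """
    ll = 0.0
    # Censored records: either a true zero (p0) or a positive below the LOR.
    for lor in censored_lors:
        z_i = (math.log(lor) - mu) / sigma
        ll += math.log(p0 + (1.0 - p0) * PHI(z_i))
    # Uncensored records: normal density of y_i = log(x_i), weighted by (1 - p0).
    n1 = len(positives)
    ll += n1 * math.log((1.0 - p0) / math.sqrt(2.0 * math.pi * sigma ** 2))
    ll -= sum((math.log(x) - mu) ** 2 for x in positives) / (2.0 * sigma ** 2)
    return ll


# Hypothetical data: three censored records with two different LORs.
example = loglik_zero_spike_censored(
    p0=0.2, mu=-1.0, sigma=0.8,
    censored_lors=[0.05, 0.05, 0.1],
    positives=[0.12, 0.35, 0.9, 0.27],
)
print(example)
```

Because each censored record carries its own LOR, the first sum handles multiple reporting limits without further changes. In practice the negative of this function would be passed to a numerical optimiser.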

Censored lognormal

When \(p_{0} = 0\) the loglikelihood reduces to:

\[\log L ( \mu_{y}, \sigma_{y}^2) = \sum_{i=1}^{n_{0}} \log(\Phi(z_{i})) + n_{1} \log \left( \frac{1} {\sqrt{2\pi\sigma_{y}^2}} \right) - \sum_{i = n_{0} + 1}^n \frac{(y_{i} - \mu_{y})^2}{2\sigma_{y}^2}\]

Multiple values for LOR are allowed.

Mixture censored spike and truncated lognormal

Ignoring the \(n_{0}\) values below \(x_{lor}\), the loglikelihood may be expressed as:

\[\log L ( \mu_{y}, \sigma_{y}^2) = -n_{1} \log(1 - \Phi(z)) + n_{1} \log \left( \frac{1} {\sqrt{2\pi\sigma_{y}^2}} \right) - \sum_{i = n_{0} + 1}^n \frac{(y_{i} - \mu_{y})^2}{2\sigma_{y}^2}\]

Only one value for LOR is allowed.
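The truncated loglikelihood can be sketched in the same way. The Python function below is an illustrative assumption (the name loglik_truncated and the example values are hypothetical). It uses only the \(n_{1}\) values at or above the single LOR and adds the correction term \(-n_{1} \log(1 - \Phi(z))\).

```python
import math
from statistics import NormalDist


def loglik_truncated(mu, sigma, lor, positives):
    """Loglikelihood of a lognormal left-truncated at a single LOR."""
    if any(x < lor for x in positives):
        raise ValueError("all values must be at or above the limit of reporting")
    z = (math.log(lor) - mu) / sigma
    n1 = len(positives)
    # Truncation correction: renormalise the density above log(lor).
    ll = -n1 * math.log(1.0 - NormalDist().cdf(z))
    # Usual normal loglikelihood of y_i = log(x_i).
    ll += n1 * math.log(1.0 / math.sqrt(2.0 * math.pi * sigma ** 2))
    ll -= sum((math.log(x) - mu) ** 2 for x in positives) / (2.0 * sigma ** 2)
    return ll


# Hypothetical positive concentrations and a single LOR of 0.1.
value = loglik_truncated(mu=0.2, sigma=0.8, lor=0.1, positives=[0.5, 1.2, 3.0])
print(value)
```

The correction term depends on one value of \(z\) only, which is why a single LOR is required here.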

Mixture censored spike and lognormal

Ignoring the \(n_{0}\) values below \(x_{lor}\), the loglikelihood may be expressed as:

\[\log L ( \mu_{y}, \sigma_{y}^2) = n_{1} \log \left( \frac{1} {\sqrt{2\pi\sigma_{y}^2}} \right) - \sum_{i = n_{0} + 1}^n \frac{(y_{i} - \mu_{y})^2}{2\sigma_{y}^2}\]

Only one value for LOR is allowed.
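For this last loglikelihood the maximum has a well-known closed form: the mean of the \(y_{i}\) for \(\mu_{y}\) and their variance with divisor \(n_{1}\) for \(\sigma_{y}^2\). The Python sketch below illustrates this under assumptions; the names loglik_lognormal and fit_lognormal and the data are hypothetical. It checks that the closed-form estimates give a higher loglikelihood than nearby parameter values.

```python
import math
from statistics import fmean, pvariance


def loglik_lognormal(mu, sigma, positives):
    """Normal loglikelihood of y_i = log(x_i) for the uncensored values only."""
    n1 = len(positives)
    ll = n1 * math.log(1.0 / math.sqrt(2.0 * math.pi * sigma ** 2))
    ll -= sum((math.log(x) - mu) ** 2 for x in positives) / (2.0 * sigma ** 2)
    return ll


def fit_lognormal(positives):
    """Closed-form maximum likelihood estimates (variance uses divisor n1)."""
    logs = [math.log(x) for x in positives]
    mu_hat = fmean(logs)
    sigma_hat = math.sqrt(pvariance(logs, mu_hat))
    return mu_hat, sigma_hat


# Hypothetical concentrations above the LOR.
positives = [0.12, 0.35, 0.08, 0.9, 0.27]
mu_hat, sigma_hat = fit_lognormal(positives)
best = loglik_lognormal(mu_hat, sigma_hat, positives)
print(mu_hat, sigma_hat, best)
```

A numerical optimiser started near these values should return essentially the same estimates, which gives a simple check on an optimiser-based implementation.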