Unit variability

A composite sample for food \(k\) is composed of \(nu_{k}\) units with nominal unit weight \(wu_{k}\). The weight of a composite sample is \(wm_{k} = nu_{k} \cdot wu_{k}\) with mean residue value \(cm_{k}\).

Beta distribution

Under the beta model simulated unit values are drawn from a bounded distribution on the interval (0, \(c_{max}\)) with \(c_{max} = nu_{k} \cdot cm_{k}\). The standard beta distribution is defined on the interval (0, 1) and is usually characterised by two parameters \(a\) and \(b\), with \(a>0\), \(b>0\) (see e.g. Mood et al. 1974) [33]. Alternatively, it can be parameterised by the mean

\[\mu = a/(a+b)\]

and the variance

\[\sigma^2 = ab/(a+b+1)^{-1}(a+b)^{-2}\]

or, as applied in MCRA, by the mean \(\mu\) and the squared coefficient of variation

\[cv^2 = ba^{-1}(a+b+1)^{-1}\]

For the simulated unit values in each iteration of the program we require an expected value \(cm_{k}\). This scales down to a mean value \(\mu = cm_{k}/c_{max} = 1/nu_{k}\) in the (standard) beta distribution. From this value for \(\mu\) and an externally specified value for \(cv_{k}\) the parameters \(a\) and \(b\) of the beta distribution are calculated as:

\[a = b(nu_{k} - 1)^{-1}\]

and

\[b = \frac{(nu_{k} - 1)(nu_{k} - 1 - cv_{k}^2)}{nu_{k}cv_{k}^2}\]

From the second formula it can be seen that \(cv_{k}\) should not be larger than \(\sqrt{nu_{k} -1}\) in order to avoid negative values for \(b\). When the unit variability is specified by a variability factor

\[v_{k} = \frac{p97.5_{k}}{cm_{k}}\]

instead of a coefficient of variation \(cv_{k}\) then MCRA applies a bisection algorithm to find a such that the cumulative probability

\[P[Beta(a,b)] = 0.975\]

for \(b = a(nu_{k}-1)\).

Sampled values from the beta distribution are rescaled by multiplication with \(cm_{max}\) to unit concentrations \(c_{ijk}\) on the interval \((0, cm_{max}\)).

Lognormal distribution

The lognormal distribution is characterised by \(\mu\) and \(\sigma\), which are the mean and standard deviation of the log-transformed concentrations. The unit log-concentrations are drawn from a normal distribution with mean \(\mu = ln(cm_{ik}) - 1/2\sigma^2\). The coefficient of variation \(cv\) is turned into the standard deviation \(\sigma\) on the log-transformed scale with:

\[\sigma = \sqrt{ln(cv^2 + 1)}\]

The variability factor is defined as the 97.5th percentile of the concentration in the individual measurements divided by the corresponding mean concentration seen in the composite sample. A variability factor \(v\) is converted into the standard deviation \(\sigma\) as follows:

\[v = \frac{p97.5}{mean} = \frac{e^{\mu + 1.96\sigma} }{e^{\mu + 1/2 \sigma^2}} = e^{1.96\sigma - 1/2 \sigma^2}\]

with μ and \(\sigma\) representing the mean and standard deviation of the log-transformed concentrations. So

\[ln(v) = 1.96\sigma – 1/2\sigma^2\]

Solving for \(\sigma\) gives:

\[\sigma^2 – 2 \cdot 1.96\sigma + 2log(v) = 0\]

with roots for \(\sigma\) according to:

\[\sigma = 1.96 \pm \sqrt{(1.96^2 – 2log(v))}\]

The smallest positive root is taken as an estimate for \(\sigma\).

Bernoulli distribution

The bernoulli model is a limiting case of the beta model, which can be used if no information on unit variability is available, but only the number of units in a composite sample is known (see van der Voet et al. 2001). As a worst case approach we may take the coefficient of variation \(cv\) as large as possible. When \(cv\) is equal to the maximum possible value \(\sqrt{nu_{k}-1}\), the (unstandardised) beta distribution simplifies to a bernoulli distribution with probability

\[\left( nu_{k} – 1 \right) / nu_{k}\]

or

\[(v_{k}-1)/ v_{k}\]

for the value 0 and probability

\[1/nu_{k}\]

or

\[1/v_{k}\]

for the value \(c_{max} = nu_{k} \cdot cm_{k}\).

In MCRA values 0 are actually replaced by \(cm_{k}\), to keep all values on the conservative side. For example, with \(nu_{k}\) = 5, there will be 80% probability at \(c_{ijk} = cm_{k}\) and 20% probability at \(c_{ijk} = c_{max}\). When the number of units \(nu_{k}\) in the composite sample is missing, the nominal unit weight \(wu_{k}\) is used to calculate the parameter for unit variability.