Concentration model types

Empirical model

Data points are sampled at random from the available set. Non-detects are handled by imputation. If occurrence patterns are used, a proportion \(p_{0} / p_{ND}\) of non-detects is set as 0. See also appendix.

../../../_images/empirical.svg

Figure 11 Empirical distribution
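The empirical sampling scheme can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `sample_empirical`, the `lor_factor` imputation rule (a non-detect is replaced by a fixed fraction of its LOR) and the choice to zero the first non-detects in the list are hypothetical and are not taken from the text above.

```python
import random


def sample_empirical(positives, lors, n_draws, p_zero=0.0, lor_factor=0.5, rng=None):
    """Draw concentrations at random from positives and imputed non-detects.

    positives  : measured concentrations (detects)
    lors       : one LOR per non-detect sample
    p_zero     : assumed proportion of true zeroes (p_0) among all samples
    lor_factor : hypothetical imputation rule, non-detect -> lor_factor * LOR
    """
    rng = rng or random.Random()
    n_total = len(positives) + len(lors)
    p_nd = len(lors) / n_total
    # Fraction p_0 / p_ND of the non-detects is set to 0 (capped at 1).
    frac_zero = min(p_zero / p_nd, 1.0) if p_nd > 0 else 0.0
    n_zero = round(frac_zero * len(lors))
    # Remaining non-detects are imputed from their LOR.
    imputed = [0.0] * n_zero + [lor_factor * lor for lor in lors[n_zero:]]
    pool = list(positives) + imputed
    # Sample with replacement from the full pool.
    return [rng.choice(pool) for _ in range(n_draws)]
```

With two positives, four non-detects and `p_zero = 1/3`, half of the non-detects end up as zeroes and the other half as imputed values.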

Non-detect spike lognormal model

A binomial model is used to estimate the proportion \(p\) of positive values (detects). This is just the proportion observed in the data (unless agricultural use data have been used to set a proportion of true zeroes). A lognormal model is fitted to the positive data. This provides estimates of \(\mu\) and \(\sigma\), which are the mean and standard deviation of the natural logarithm of the concentration. Simulated concentrations are a non-detect with probability \(p_{ND} = 1-p\) or a value sampled from the fitted lognormal distribution with probability \(p\). Non-detects are handled by imputation. If occurrence patterns are used, a proportion \(p_{0} / p_{ND}\) of non-detects is set as 0. Minimum requirements: at least two positive concentration values. See also appendix.

../../../_images/nondetectspikelognormal.svg

Figure 12 Nondetect Spike Lognormal distribution
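A minimal sketch of the fit and the simulation step described above, assuming a single LOR and a hypothetical imputation rule (`lor_factor * lor`) for non-detects. The function names and the use of the sample standard deviation (divisor \(n-1\)) are illustrative choices, not a statement of how the model is implemented.

```python
import math
import random
import statistics


def fit_nd_spike_lognormal(positives, n_nondetects):
    """Estimate p, mu and sigma from the positive concentrations."""
    if len(positives) < 2:
        raise ValueError("at least two positive concentration values are required")
    logs = [math.log(x) for x in positives]
    # Proportion of detects as observed in the data.
    p = len(positives) / (len(positives) + n_nondetects)
    mu = statistics.fmean(logs)
    sigma = statistics.stdev(logs)  # assumption: sample standard deviation
    return p, mu, sigma


def simulate_nd_spike_lognormal(p, mu, sigma, lor, n_draws, lor_factor=0.5, rng=None):
    """Simulate a lognormal draw with probability p, else an imputed non-detect."""
    rng = rng or random.Random()
    draws = []
    for _ in range(n_draws):
        if rng.random() < p:
            draws.append(rng.lognormvariate(mu, sigma))
        else:
            draws.append(lor_factor * lor)  # hypothetical imputation rule
    return draws
```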

Non-detect spike truncated lognormal model

A binomial model is used to estimate the proportion \(p\) of positive values (detects). This is just the proportion observed in the data (unless agricultural use data have been used to set a proportion of true zeroes, in which case \(p\) is calculated on the remaining proportion). A truncated lognormal model, with LOR as the truncation limit, is fitted to the positive data. This provides estimates of \(\mu\) and \(\sigma\), which are the mean and standard deviation of the natural logarithm of the concentration. Simulated concentrations are a non-detect with probability \(p_{ND} = 1-p\) or a value sampled from the fitted lognormal distribution with probability \(p\). Non-detects are handled by imputation. If occurrence patterns are used, a proportion \(p_{0} / p_{ND}\) of non-detects is set as 0. Minimum requirements: at least two positive concentration values, and all non-detects must share a single LOR value. See also appendix.

../../../_images/NonDetectSpikeTruncatedLogNormal.svg

Figure 13 Nondetect Spike Truncated Lognormal distribution
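For orientation, the fit of the truncated lognormal can be written as maximising a log-likelihood. The following is the standard form for a left-truncated normal on the log scale, given as a sketch rather than as the exact expression used by the model. The notation \(y_i = \ln x_i\) for the \(n\) positive concentrations, \(\ell = \ln(\mathrm{LOR})\), and \(\phi\), \(\Phi\) for the standard normal density and distribution function is introduced here for illustration; terms that do not depend on \(\mu\) and \(\sigma\) are dropped.

\[
\log L(\mu, \sigma) = \sum_{i=1}^{n} \left[ \log \phi\!\left(\frac{y_i - \mu}{\sigma}\right) - \log \sigma - \log\!\left(1 - \Phi\!\left(\frac{\ell - \mu}{\sigma}\right)\right) \right]
\]

The last term renormalises the density over the region above the LOR, which is what distinguishes this fit from the plain lognormal fit of the non-detect spike lognormal model.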

Censored lognormal model

A censored lognormal model, with LOR as the censoring limit, is fitted to the data, both positives and non-detects. This provides estimates of \(\mu\) and \(\sigma\), which are the mean and standard deviation of the natural logarithm of the concentration. If agricultural use data are being used, then a proportion \(p_{0} / p_{ND}\) of non-detects will be excluded, where \(p_{0}\) will be lowered to \(p_{ND}\) if it would be higher. Simulated concentrations are sampled from the fitted lognormal distribution. If agricultural use data have been used, simulated concentrations are 0 with probability \(p_{0}\) or are sampled from the fitted lognormal distribution with probability \(1-p_{0}\). Minimum requirements: at least one positive concentration value. See also appendix.

../../../_images/CensoredLogNormal.svg

Figure 14 Censored Lognormal distribution
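The censored fit can be illustrated with a small log-likelihood function: positives contribute the lognormal density, and each non-detect contributes the probability of falling below its LOR. This is a sketch under assumptions. The function names are hypothetical, and the crude grid search stands in for whatever optimiser is actually used.

```python
import math


def censored_lognormal_loglik(mu, sigma, positives, lors):
    """Log-likelihood of a lognormal with left-censoring at each non-detect's LOR."""
    ll = 0.0
    for x in positives:
        z = (math.log(x) - mu) / sigma
        # Log of the lognormal density at x.
        ll += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma) - math.log(x)
    for lor in lors:
        z = (math.log(lor) - mu) / sigma
        # Log of P(concentration < LOR); erfc keeps precision in the lower tail.
        ll += math.log(0.5 * math.erfc(-z / math.sqrt(2.0)))
    return ll


def fit_censored_lognormal(positives, lors, mu_grid, sigma_grid):
    """Return the (mu, sigma) grid point with the highest log-likelihood."""
    return max(
        ((m, s) for m in mu_grid for s in sigma_grid),
        key=lambda ms: censored_lognormal_loglik(ms[0], ms[1], positives, lors),
    )
```

Because the non-detects carry information about the lower tail, the fitted \(\mu\) is pulled below the mean of the log positives, which is the main difference from fitting the positives alone.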

Zero-spike censored lognormal model

A mixture distribution of a spike of true zeroes and a censored lognormal model, with LOR as the censoring limit, is fitted to the data (non-detects and positives). This provides estimates of \(p_{0}\), which is the proportion of true zeroes, and of \(\mu\) and \(\sigma\), which are the mean and standard deviation of the natural logarithm of the concentration. Simulated concentrations are 0 with probability \(p_{0}\) and are sampled from the fitted lognormal distribution with probability \(1-p_{0}\). Minimum requirements: at least one positive concentration value, and no agricultural use data for the food-compound combination (such data directly specify \(p_{0}\), which should then not be estimated from the data). See also appendix.

../../../_images/ZeroSpikeCensoredLogNormal.svg

Figure 15 Zero Spike Censored Lognormal distribution
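The mixture fit can be summarised by its log-likelihood. The expression below is the usual form for a zero-spike mixture with a left-censored lognormal component and is meant as a sketch, not as the exact objective used by the model. The notation is introduced here for illustration: \(y_i = \ln x_i\) for the positive concentrations, \(\ell_j\) for the natural logarithm of the LOR of non-detect \(j\), and \(\phi\), \(\Phi\) for the standard normal density and distribution function. Terms that do not depend on the parameters are dropped.

\[
\log L(p_{0}, \mu, \sigma) = \sum_{i \in \mathrm{pos}} \left[ \log(1 - p_{0}) + \log \phi\!\left(\frac{y_i - \mu}{\sigma}\right) - \log \sigma \right] + \sum_{j \in \mathrm{ND}} \log\!\left[ p_{0} + (1 - p_{0})\, \Phi\!\left(\frac{\ell_j - \mu}{\sigma}\right) \right]
\]

A non-detect can arise either as a true zero or as a lognormal value below the LOR, which is why both paths appear inside the second logarithm. Setting \(p_{0} = 0\) recovers the censored lognormal likelihood.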

Non-detect spike MRL model

This model takes the values specified in an input table as the Maximum Residue Limit (MRL) and uses them for the proportion of positive values in the concentration dataset. It can be used to force the use of a pessimistic value.

Summary statistics model

For this model, no individual measurements on raw agricultural commodities are needed. The final estimates of \(\mu\) and \(\sigma\) are simply provided, pooled, or estimated using, e.g., a coefficient of variation. This model is used specifically in Total Diet Studies (TDS). In general, each TDS food sample is prepared only once, yielding one measurement per TDS food sample, so the variability of the underlying distribution is unknown. However, a rough guess can be made using, e.g., the coefficient of variation of the subsamples (in general raw agricultural commodities) that compose the TDS food sample. The estimated standard deviation is calculated as a pooled estimate using the coefficient of variation and the count of each subsample in the TDS food.
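One way such a pooled estimate could be computed is sketched below. Two assumptions are made that the text does not confirm: each subsample's coefficient of variation is converted to a log-scale variance with the lognormal identity \(\sigma^2 = \ln(1 + CV^2)\), and the variances are then averaged with the subsample counts as weights. The function name `pooled_log_sd` is hypothetical.

```python
import math


def pooled_log_sd(cvs, counts):
    """Count-weighted pooled standard deviation on the natural-log scale.

    cvs    : coefficient of variation per subsample (e.g. 0.5 for 50 %)
    counts : number of units of each subsample in the TDS food
    """
    if len(cvs) != len(counts) or not cvs:
        raise ValueError("cvs and counts must be non-empty and of equal length")
    # Lognormal identity: sigma^2 = ln(1 + CV^2).
    variances = [math.log(1.0 + cv * cv) for cv in cvs]
    # Assumed pooling rule: average the variances, weighted by count.
    pooled_var = sum(n * v for n, v in zip(counts, variances)) / sum(counts)
    return math.sqrt(pooled_var)
```

When all subsamples share the same coefficient of variation, the pooled value reduces to \(\sqrt{\ln(1 + CV^2)}\) regardless of the counts.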