Occurrence patterns calculation
Occurrence assumptions are made for each food on the basis of the positive findings in the concentration data. Two tiers are available.
Tier 1: 0% occurrence is assumed for all substance-food combinations with no positive concentrations at all; 100% occurrence is assumed for all substance-food combinations with at least one positive concentration.
Tier 2: 0% occurrence is assumed for all substance-food combinations with no positive concentrations at all; for substance-food combinations with at least one positive concentration, the patterns of findings are used to implement a specific interpretation of Option 5 in the SANTE document, as described below.
In both tiers, therefore, substance-food combinations without any positive finding are handled optimistically: every censored observation is assumed to be a true zero.
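The Tier 1 rule can be written down in a few lines. The following is only a minimal sketch: the function name tier1_occurrence and the data representation (None for a missing value, 0.0 for a censored observation) are assumptions made for illustration and are not taken from the software.

```python
def tier1_occurrence(concentrations):
    """Assumed occurrence fraction for one substance-food combination.

    concentrations: measured values for the combination, where None marks
    a missing value and 0.0 a censored observation (assumed encoding).
    """
    # At least one positive concentration -> 100% occurrence, otherwise 0%.
    has_positive = any(c is not None and c > 0 for c in concentrations)
    return 1.0 if has_positive else 0.0


only_censored = tier1_occurrence([0.0, None, 0.0])
one_positive = tier1_occurrence([0.0, 0.3, 0.0])
```

In Tier 2 the 0% branch is the same; only the handling of combinations with at least one positive concentration differs.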
If Tier 2 is selected, then for each of the modelled foods a tabulation is made of the observed frequencies of positives for all substance combinations (including the empty set), based on the active substance concentrations. Each such combination is called an occurrence pattern (OP). For an OP consisting of just one substance, the basic frequency is the number of samples with a positive concentration divided by the number of samples in which the substance has been measured (i.e., is not a missing value, MV). For an OP consisting of multiple substances, the basic frequency is the number of samples in which the concentrations of all members are positive divided by the number of samples in which all members of the set have been measured.
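As an illustration, the basic frequencies for one food could be tabulated as sketched below. This is a hedged example under stated assumptions, not the implemented code: the function name basic_frequencies, the sample layout (one dict per sample, with None for a missing value) and the example numbers are hypothetical. The empty pattern is left out, because the text above gives no formula for its basic frequency.

```python
from itertools import combinations


def basic_frequencies(samples, substances):
    """Basic frequency per occurrence pattern for one food.

    samples: list of dicts mapping substance -> concentration, where None
    marks a missing value (assumed encoding).
    Returns a dict mapping frozenset of substances -> basic frequency.
    """
    freqs = {}
    for size in range(1, len(substances) + 1):
        for pattern in combinations(substances, size):
            # Denominator: samples in which every member has been measured.
            measured = [s for s in samples
                        if all(s.get(x) is not None for x in pattern)]
            if not measured:
                continue  # pattern cannot be evaluated for this food
            # Numerator: samples with all member concentrations positive.
            positives = sum(all(s[x] > 0 for x in pattern) for s in measured)
            freqs[frozenset(pattern)] = positives / len(measured)
    return freqs


# Made-up samples for one food and two substances.
samples = [
    {"A": 0.2, "B": 0.0},
    {"A": 0.0, "B": None},
    {"A": 0.1, "B": 0.3},
    {"A": None, "B": 0.0},
]
freqs = basic_frequencies(samples, ["A", "B"])
```

With these samples, substance A is measured in three samples and positive in two, B is measured in three and positive in one, and both are measured together in two samples, of which one has both positive.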
After the basic frequencies have been calculated for all occurrence patterns, they are rescaled such that they sum to 100%. When substance authorisations are available, patterns involving one or more unauthorised substances keep their basic frequencies; only the patterns in which all substances are authorised are rescaled, such that the sum of all frequencies is 100%.
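The rescaling step might look as follows. This sketch rests on one reading of the text: patterns with an unauthorised substance keep their basic frequency, and the remaining patterns share what is left of the 100%. The function name rescale_frequencies, the error handling for cases the text does not cover, and the example numbers are assumptions.

```python
def rescale_frequencies(freqs, authorised=None):
    """Rescale basic pattern frequencies so that they sum to 1 (100%).

    freqs: dict mapping frozenset of substances -> basic frequency.
    authorised: optional set of authorised substances for the food.
    """
    if authorised is None:
        total = sum(freqs.values())
        if total == 0:
            raise ValueError("cannot rescale: all frequencies are zero")
        return {p: f / total for p, f in freqs.items()}

    # Patterns with any unauthorised substance keep their basic frequency.
    fixed = {p: f for p, f in freqs.items() if not p <= authorised}
    # Patterns with only authorised substances (incl. the empty set) are scaled.
    free = {p: f for p, f in freqs.items() if p <= authorised}

    remaining = 1.0 - sum(fixed.values())
    free_total = sum(free.values())
    if free_total == 0 or remaining < 0:
        # Not covered by the text; reported rather than guessed at.
        raise ValueError("cannot rescale authorised patterns to a 100% sum")
    scale = remaining / free_total
    return {**fixed, **{p: f * scale for p, f in free.items()}}


# Made-up basic frequencies, including the empty pattern.
basic = {frozenset(): 0.5, frozenset({"A"}): 0.4, frozenset({"B"}): 0.3}
plain = rescale_frequencies(basic)
with_auth = rescale_frequencies(basic, authorised={"A"})
```

In the second call substance B is treated as unauthorised, so its pattern stays at 0.3 while the empty pattern and the pattern for A are scaled down to fill the remaining 0.7.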
Note: the Tier 2 procedure is not what is literally written in the SANTE document, but is an interpretation agreed upon by EFSA and RIVM. An alternative model, not yet implemented but perhaps more in line with the text of the SANTE document, would be to use twice the basic frequencies as the modelled occurrence pattern frequencies. The set of frequencies would then be normalised to a 100% sum only if the sum of all frequencies exceeded 100%.
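For comparison, the alternative model of the note could be sketched as below. Since this model is not implemented, the code is purely illustrative; the function name alternative_frequencies and the example numbers are hypothetical.

```python
def alternative_frequencies(freqs):
    """Double the basic frequencies; normalise only if the sum exceeds 1."""
    doubled = {p: 2 * f for p, f in freqs.items()}
    total = sum(doubled.values())
    if total > 1.0:
        # Sum above 100%: normalise the whole set to a 100% sum.
        return {p: f / total for p, f in doubled.items()}
    return doubled


# Doubled sum is 0.6, so the doubled values are kept as they are.
low = alternative_frequencies({frozenset({"A"}): 0.1, frozenset({"B"}): 0.2})
# Doubled sum is 1.4, so the doubled values are normalised.
high = alternative_frequencies({frozenset({"A"}): 0.4, frozenset({"B"}): 0.3})
```

Unlike the implemented rescaling, this variant leaves the frequencies summing to less than 100% when the doubled values stay below that bound.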