Chronic exposure assessment, episodically consumed foods

For episodically consumed foods we need to take the probability of consumption into account. Define \(p_{i}\) as the probability that individual \(i\) consumes the food on any given day. The usual intake for this individual is then given by the product of \(p_{i}\) and \(T_{i}\) which is now defined as the usual amount on consumption days. Since individuals will vary in their probability pi, besides modelling the amounts as for daily consumed foods, it is also necessary to model the frequency of consumption. A three stage analysis of 24-hour recall data is the necessary:

  1. A model for the frequency of consumption

  2. A model for the intakes on consumption days

  3. Integration of both models in order to obtain a usual intake distribution.

Step 2 uses the analysis outlined in the previous section for the positive intakes only. For step 1 two popular models which describe between-individual variation for the probability of consumption are the beta-binomial model and the logistic-normal model.

Beta-Binomial model for frequencies (BBN)

Let \(n_{i}\) be the total number of recall days for individual \(i\) and \(X_{i}\) the number of days with a positive intake. The distribution of \(X_{i}\), with \(p_{i}\) the probability of consumption for individual \(i\), is given by

\[X_{i} = \mathit{Binomial}(n_{i}, p_{i})\]

In this model the probability \(p_{i}\) varies among individuals according to the Beta distribution:

\[f(p) = B^{-1} (\alpha, \beta) p^{\alpha - 1} (1 - p)^{\beta-1}\]

with

\[B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma{(\alpha + \beta)}}\]

Combining the binomial and the Beta distribution results in the betabinomial distribution:

\[\begin{split}P(X_{i} = x) = \left( \begin{array}{c} n_{i} \\ r \end{array} \right) \frac{B(\alpha + x, n_{i} + \beta - x )}{B(\alpha, \beta)}\end{split}\]

The mean and variance of the betabinomial distribution are given by

\[E[X_{i}] = n_{i} \frac{\alpha}{\alpha + \beta}\]

and

\[\mathit{Var} [X_{i}] = n_{i} \frac{\alpha \beta(\alpha + \beta + n_{i})}{(\alpha + \beta)^2 (\alpha + \beta + 1)}\]

Using the reparameterization \(\pi = \alpha/(\alpha+\beta)\) and \(\phi= 1/(\alpha + \beta + 1)\), it follows that

\[E[X_{i}] = n_{i} \pi\]

and

\[\mathit{Var}[X_{i} ]= n_{i} \pi(1-\pi)[1+(n_{i}-1)\phi]\]

This reparameterization enables to model the probability \(\pi_{i}\) of consumption for individual \(i\) directly as a logistic regression:

\[\mathit{logit}(\pi_{i}) = \gamma_{0} + \gamma_{1}^t Z_{i}\]

Note that the dispersion parameter \(\phi\): is assumed to be equal for all individuals. The betabinomial logistic regression model can be fitted by means of maximum likelihood.

Model based frequencies for usual intake

For the model based usual intake distribution the estimated parameters \(\pi_{i}\) and \(\phi\) are backtransformed using \(\alpha_{i}=\pi_{i} \phi/(1-\phi)\) and \(\beta_{i}=(1-\pi_{i} ) \phi/(1-\phi )\). These can then be used to draw from the Beta distribution.

Model assisted frequencies for usual intake

For the model assisted usual intake distribution a prediction of the consumption probability is required for every individual. Simple predictions are

  1. the observed frequencies for every individual or

  2. the fitted probability for every individual. When there are no covariables the fitted probability is the same for every individual.

  3. Alternatively one can use the approach outlined in Kipnis et al (2009) employing the conditional expectation of the probability given the observed frequency:

\[E(p_{i} | X_{i} = x ) = \int_{p} p f (p | X_{i} = x) \mathrm{d}p\]
\[= \int_{p} p \frac{f(X_{i} = x | p) f(p) }{ \int f(X_{i} = x | p) f(p) dp} \mathrm{d}p\]
\[\begin{split}= \frac{1}{P(x_{i} = x)} \int_{p} p \left( \begin{array}{c} n_{i} \\ r \end{array} \right) p^x (1 - p) ^{n_{i} - x} B^{-1} (\alpha_{i},\beta_{i}) p ^{\alpha_{i} - 1} (1 - p)^{\beta_{i} -1} \mathrm{d}p\end{split}\]
\[\begin{split}= \frac{B^{-1}(\alpha_{i}, \beta_{i})}{P(x_{i} = x)} \left( \begin{array}{c} n_{i} \\ r \end{array} \right) \int_p p^{\alpha_{i} + x} (1 - p) ^{n_{i} + \beta_{i} -x -1} \mathrm{d}p\end{split}\]
\[= \frac{B(\alpha_{i} + x + 1, n_{i} + \beta_{i} -x)}{B(\alpha_{i} + x, n_{i} + \beta_{i} -x)}\]
\[= \frac{\alpha_{i} + x}{\alpha_{i} + \beta_{i} -x}\]

For individual with zero intakes on all recall days a prediction for the random individual amount effect \(b_{i}\) is not available. There seem to be two option for predicting the usual intake for such individuals:

  • Set the individual intake to zero

  • Simulate a model based prediction for the amount and combine this with the conditional expected probability given above to obtain an individual usual intake.

Logistic-Normal model for frequencies (LNN0)

In this model the distribution of \(X_{i}\) is again binomial:

\[X_{i} = \mathit{Binomial}(n_{i}, p_{i})\]

The probability \(p_{i}\) is now given by a logistic regression with a random effect in the linear predictor which represents the between-individual variation in the probability \(p_{i}\)

\(\mathit{logit}(p_{i}) = \lambda_{i} + v_{i}\) with \(v_{i} \sim N(0,\sigma_{v}^2)\) and the regression equation \(\lambda_{i}=\gamma_{0}+\gamma_{1}^t Z_{i}\)

The marginal probability \(\pi_{i}\) is obtained by integrating over the random effect \(v_{i}\), i.e. using Gauss-Hermite integration

\[\pi_{i} = \int H(\lambda_{i} + v )f(v)dv \approx \frac{1}{\sqrt{\pi}} \sum_{j=1}^{N} w_{j} H ( \lambda_{i} + \sqrt{2} \sigma_{v}x_{j})\]

in which \(H()\) is the inverse of the logit transformation. Note that this is different from \(\mathit{logit}^{-1} (\lambda_{i})\) which is the median probability. The model can be fitted by maximum likelihood using Gauss-Hermite integration. An (approximate) maximum likelihood procedure is implemented in routine glmer of the lme4 package in R. For a new vector of covariates \(Z_{i}^*\) the linear predictor \(\lambda_{i}^*\) can be calculated along with its standard error \(\mathit{Se}(\lambda_{i}^* )\). The marginal predicted probability \(\pi_{i}^*\) can be calculated by means of Gauss-Hermite integration and the standard error of the predicted probability can be calculated by means of the usual Taylor series expansion:

\[\mathit{Se}(\pi_{i}^*) \approx \frac{\mathit{Se}(\lambda_{i}^*)}{\sqrt{\pi}} \sum_{j=1}^{N} w_{j} \frac{d}{d\lambda_{i}^*} H(\lambda_{i}^* + \sqrt{2} \sigma_{v}x_{j})\]
\[= \frac{\mathit{Se}(\lambda_{i}^*)}{\sqrt{\pi}} \sum_{j=1}^{N} w_{j} H(\lambda_{i}^* + \sqrt{2} \sigma_{v}x_{j}) [1 - H(\lambda_{i}^* + \sqrt{2} \sigma_{v}x_{j})]\]

Model based frequencies for usual intake

For the model based usual intake distribution the estimated parameters \(\lambda_{i}\) and \(\sigma_{v}^2\) can be used to generate individual probabilities.

Model assisted frequencies for usual intake

For the model assisted usual intake distribution simple predictors are (a) the observed frequencies and (b) the marginal probability \(\pi_{i}\). The conditional expectation (c) is given by

\[E(p_{i} | X_{i} = x) = \int_{v} H(\lambda_{i} + v) f(v|X_{i} = x) \mathrm{d}v\]
\[= \int_{v} H(\lambda_{i} + v) \frac{f(X_{i} = x_{i} | v ) f(v)}{\int f(X_{i} = x_{i} | v) f(v) \mathrm{d}v } \mathrm{d}v\]
\[=\frac{ \int_{v} H(\lambda{_i} + v)[H(\lambda{_i} + v)]^{x_{i}} [1 - H(\lambda{_i} + v)]^{n_{i} - x_{i}} f(v) \mathrm{d}v }{\int_{v} [H(\lambda{_i} + v)]^{x_{i}} [1 - H(\lambda{_i} + v)]^{n_{i} - x_{i}} f(v) \mathrm{d}v}\]

and both nominator and denominator can be approximated by means of the Gauss-Hermite integration. For individual with zero intakes on all recall days see above for the two options.

Logistic-Normal model for frequencies correlated with amounts (LNN)

This model is extends the LNN0 model with a correlation between the individual random effect \(b_{i}\) for amounts and the individual random effect \(v_{i}\) for frequencies. This model is also known as the NCI model and is introduced by Tooze et al (2006) [Tooze et al., 2006] with further mathematical details in Kipnis et al (2009) [Kipnis et al., 2009]. The model can be written as

\[\mathit{logit}(P(Y_{ij} > 0)) = \lambda_{i} + v_{i}\]
\[g(Y_{ij}) = \mu_{i} + b_{i} + w_{ij}\]

and \((v_{i}, b_{i}) \sim \mathit{BivariateNormal} (0,0,\sigma_{v}^2, \sigma_{b}^2, \rho)\) and \(w_{ij} \sim N(0,\sigma_{w}^2)\)

The model can be fitted by maximum likelihood employing two-dimensional Gauss-Hermite integration.

Model based usual intake

Model based usual intake requires generation of the pair \((v_{i}, b_{i})\) for many hypothetical individual. The usual intake \(U_{i}\) for such a hypothetical individual is then given by

\[U_{i}= H (\lambda_{i}+\nu_{i} ) T_{i}\]
\[= H(\lambda_{i}+\nu_{i} ) E_{w} [g^{-1} (\mu_{i}+b_{i}+w_{ij} ) | b_{i} ]\]
\[= H(\lambda_{i}+\nu_{i} ) F(b_{i} )\]

The second term can be calculated using the method outlined for daily intakes.

Model assisted usual intake

This requires simultaneous prediction of the random effect for frequency and for amount as outlined in Kipnis et al (2009) [Kipnis et al., 2009]. We have for individual \(i\) in the study \((U_{i} | Y_{i1}, \cdots, Y_{in_{i} }) = (U_{i} | Y_{i1}^*, \cdots ,Y_{in{_i}}^*)= (U_{i} | x_{i}, \bar{Y} _{i}^*)\) where \(x_{i}\) is the number of positive intakes and \(\bar{Y} _{i}^*\) is the mean of the transformed positive intakes. It follows that the required conditional expectation \(P_{i}\) equals

\[P_{i} = E[U_{i} | x_{i}, \bar{Y} _{i}^*]\]
\[= E_{v_{i}, b_{i}} [H(\lambda_{i} + v_{i})F(b_{i}) | x_{i}, \bar{Y} _{i}^*]\]
\[= \frac{ \int \int H(\lambda_{i} + v_{i})F(b_{i}) f(x_{i}, \bar{Y}_{i}^* | v_{i}, b_{i})\phi(v_{i}, b_{i})dv_{i}db_{i} } {\int\int f(x_{i}, \bar{Y}_{i}^* | v_{i}, b_{i}) \phi(v_{i}, b_{i})dv_{i}db_{i} }\]

where

\[f(x_{i}, \bar{Y}_{i}^* | v_{i}, b_{i}) = [H(\lambda_{i} + v_{i})]^{x_{i}} [1 - H(\lambda_{i} + v_{i})]^{n_{i}-x_{i}} \phi(\bar{Y}_{i}^* - \mu_{i} - b_{i} ; 0, \sigma_{w}^2/x_{i} )\]

Both nominator and denominator can be approximated by a two-dimensional Gauss-Hermite integration. Note that for the log-transform \(F(b_{i} )=T_{i}\) = \(\exp(\mu_{i}+b_{i} + \sigma_{w}^2)/ 2)\) can be calculated exactly; for the power transformation an approximation must be used. It can be expected that the predicted usual intake will not have the correct variance. This can possibly be remedied by equating the mean and variance of \(U_{i}\) and \(P_{i}\). These are however rather involved to calculate.

For individual with zero intakes on all recall days the model assisted usual intake can be set to zero, or can be simulated as follows

  1. Calculate the Model assisted frequency \(P_{0}\) for usual intake (see LNN0)

  2. Transform \(P_{0}\) back to the logistic scale, i.e. \(L_{0}= \mathit{logit} (P_{0} )\) . Get the conditional distribution of

\[(b | v = L_{0} - \lambda_{i}) \sim N \left( \frac{\sigma_{b}}{\sigma_{v}} \rho ( L_{0} - \lambda_{i}), (1 - \rho^2)\sigma_{b}^2 \right)\]
  1. Simulate a draw \(b_{0}\) from this conditional distribution and obtain the usual intake as \(P_{0}\exp(\mu_{i}+b_{0}+\sigma_{w}^2)\)

Note that the backtransformation from \(P_{0}\) to \(L_{0}\) is according to the median of the distribution rather than the mean.