PARC harmonised HBM data format

Note

The PARC harmonised data format and the MCRA data format are both under development and may change in the future.

In PARC, VITO is developing a harmonised data format for human biomonitoring (HBM) data. Data harmonisation improves the comparability of data from different HBM studies and their interoperability with analysis tools such as MCRA and the tool for calculating summary statistics of HBM data, which can be made available via the IPCHEM portal and/or integrated into the European HBM dashboard. More information about this data format, and instructions for preparing compliant data files, can be found at the PARC HBM data harmonization web page. That page also provides a tool for validating data files prepared in this format and an example data file that can be uploaded to MCRA for testing.

Excel data files provided to MCRA in this format are mapped to the internal data structure/format of MCRA during file upload, using a custom mapping/conversion procedure. Most of this mapping is straightforward. However, for some fields and entities explicit choices are made that users need to be aware of.

Survey/study
- Each data file corresponds to one HBM survey/study in MCRA.
- Survey StartDate refers to the first sampling date of all repeated samples.
- Survey EndDate refers to the last sampling date of all repeated samples.
- When all subjects report the same country, that country is used as the location of the survey/study.
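
The survey-level rules above can be illustrated with a minimal Python sketch. It is not MCRA's implementation: the function name `summarise_survey` and the dictionary keys are hypothetical, and returning `None` when subjects report different countries is an assumption, since the text does not say what happens in that case.

```python
from datetime import date


def summarise_survey(sampling_dates, countries):
    """Derive survey-level fields from sample dates and subject countries.

    Assumes at least one sampling date is available.
    """
    unique_countries = set(countries)
    return {
        # First and last sampling date over all (repeated) samples.
        "StartDate": min(sampling_dates),
        "EndDate": max(sampling_dates),
        # Location is only set when every subject reports the same country.
        # Returning None otherwise is an assumption of this sketch.
        "Location": unique_countries.pop() if len(unique_countries) == 1 else None,
    }


survey = summarise_survey(
    [date(2021, 3, 1), date(2021, 9, 15), date(2021, 5, 20)],
    ["BE", "BE", "BE"],
)
mixed = summarise_survey([date(2021, 3, 1)], ["BE", "NL"])
```

Here `survey` gets the earliest and latest sampling dates and the shared country, while `mixed` has no location because the subjects report more than one country.
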
Subjects/individuals
- Each subject maps to one MCRA individual of the study/survey.
- Only a selection of the subject/individual properties is mapped to the MCRA data format. Because MCRA does not support repeated properties, each property that is recorded repeatedly for a subject in the harmonised HBM data format is reduced to a single value.
- Repeated recordings for each subject’s weight are averaged to one value: BodyWeight.
- Repeated recordings for each subject’s height are averaged to one value: Height.
- Of the repeated recordings for smoking, only the last recording is used: Smoking_status.
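
As an illustration of how repeated subject properties are reduced to single values, the following Python sketch averages weight and height and keeps the last smoking recording. The input keys (`weight`, `height`, `smoking`) and the function name are hypothetical, and skipping missing entries before averaging or picking the last recording is an assumption of this sketch, not documented MCRA behaviour.

```python
from statistics import mean


def collapse_subject(records):
    """Reduce repeated recordings of one subject to single property values.

    `records` is a list of dicts ordered by time point (hypothetical layout).
    """
    weights = [r["weight"] for r in records if r.get("weight") is not None]
    heights = [r["height"] for r in records if r.get("height") is not None]
    smoking = [r["smoking"] for r in records if r.get("smoking") is not None]
    return {
        # Repeated weight and height recordings are averaged.
        "BodyWeight": mean(weights) if weights else None,
        "Height": mean(heights) if heights else None,
        # For smoking, only the last recording is kept.
        "Smoking_status": smoking[-1] if smoking else None,
    }


individual = collapse_subject([
    {"weight": 70.0, "height": 175.0, "smoking": "smoker"},
    {"weight": 72.0, "height": 175.0, "smoking": "non-smoker"},
])
```

For the two recordings shown, the body weight becomes the average of 70.0 and 72.0, and the smoking status is taken from the second (last) recording.
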
Samples
- Each sample maps to an MCRA sample.
- The matrix code is translated to a biological matrix and sampling type, e.g. US translates to Urine and Spot, BP to Blood and Plasma.
- id_timepoint refers to the sampling time and is translated to DayOfSurvey.
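
The sample mapping can be sketched in Python as a simple lookup. Only the two matrix codes mentioned above are included; the full code list is defined by the harmonised format and is not reproduced here. The input key `matrix` and the output keys `BiologicalMatrix` and `SamplingType` are hypothetical names, and copying `id_timepoint` unchanged into `DayOfSurvey` is an assumption about the translation.

```python
# Only the two codes named in the text; other codes are intentionally omitted.
MATRIX_CODES = {
    "US": ("Urine", "Spot"),
    "BP": ("Blood", "Plasma"),
}


def map_sample(sample):
    """Translate one harmonised-format sample record (hypothetical layout)."""
    matrix, sampling_type = MATRIX_CODES[sample["matrix"]]
    return {
        "BiologicalMatrix": matrix,
        "SamplingType": sampling_type,
        # Assumption: the time point identifier is carried over as-is.
        "DayOfSurvey": sample["id_timepoint"],
    }


mapped = map_sample({"matrix": "US", "id_timepoint": "t1"})
```

An unknown matrix code raises a `KeyError` in this sketch, which makes unmapped codes visible rather than silently dropping them.
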
Analytical methods
- MCRA analytical methods are derived/reconstructed from the concentration data sheets of the harmonised HBM data format, based on the reported substances, their reported limits (LODs/LOQs), and the matrix of the samples.
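
One possible way to reconstruct analytical methods from concentration data is sketched below in Python. The exact procedure used by MCRA is not described here, so this is only an assumption: samples that share the same matrix and the same set of (substance, LOD, LOQ) combinations are assigned the same method. All field names and the `AM1`, `AM2`, ... identifiers are hypothetical.

```python
def derive_methods(samples):
    """Assign one method id per distinct (matrix, substance limits) signature."""
    methods = {}      # signature -> method id
    assignment = {}   # sample id -> method id
    for sample in samples:
        signature = (
            sample["matrix"],
            frozenset(
                (m["substance"], m["lod"], m["loq"])
                for m in sample["measurements"]
            ),
        )
        # Reuse the id of an identical signature, otherwise create a new one.
        method_id = methods.setdefault(signature, f"AM{len(methods) + 1}")
        assignment[sample["id"]] = method_id
    return methods, assignment


samples = [
    {"id": "S1", "matrix": "US",
     "measurements": [{"substance": "BPA", "lod": 0.1, "loq": 0.3}]},
    {"id": "S2", "matrix": "US",
     "measurements": [{"substance": "BPA", "lod": 0.2, "loq": 0.6}]},
    {"id": "S3", "matrix": "US",
     "measurements": [{"substance": "BPA", "lod": 0.1, "loq": 0.3}]},
]
methods, assignment = derive_methods(samples)
```

In this example, samples S1 and S3 share one reconstructed method, while S2 gets a second method because its reported limits differ.
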
Sample concentrations/measurements
- Concentrations without a value (blanks) are recorded as missing values (MV).
- Concentrations recorded as -1 are interpreted as values below the limit of detection (LOD).
- Concentrations recorded as -2 or -3 are interpreted as values below the limit of quantification (LOQ).
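
The interpretation of the special concentration codes can be summarised in a short Python sketch. The result labels (`"MV"`, `"BelowLOD"`, `"BelowLOQ"`, `"Quantified"`) and the function name are hypothetical, and treating every other value as a measured concentration is an assumption of this sketch.

```python
def interpret_concentration(value):
    """Classify one reported concentration value.

    Returns a (result type, concentration) pair.
    """
    if value is None or value == "":
        return ("MV", None)          # blank cell: missing value
    if value == -1:
        return ("BelowLOD", None)    # below the limit of detection
    if value in (-2, -3):
        return ("BelowLOQ", None)    # below the limit of quantification
    # Assumption: any other value is a measured concentration.
    return ("Quantified", value)


results = [interpret_concentration(v) for v in (None, -1, -2, -3, 0.42)]
```

The list `results` holds one classification per input value, in the same order as the inputs.
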