The output of LC-MS metabolomics experiments consists of mass-peak intensities identified through a peak-picking/alignment procedure. Besides imperfections in biological samples and instrumentation, data accuracy depends strongly on the applied algorithms and their parameters. Consequently, quality control (QC) is essential before further data analysis. Here, we present a QC approach based on discrepancies between replicate samples. First, quantile normalization of per-sample log-signal distributions is applied to each group of biologically homogeneous samples. Next, the overall quality of each replicate group is characterized by the Z-transformed correlation coefficients between samples. This general QC allows the parameters of the peak-picking/alignment procedure to be tuned so as to minimize inter-replicate discrepancies in the generated output. Subsequently, an in-depth QC measure detects local neighborhoods on a template of aligned chromatograms that are enriched in divergences between the intensity profiles of replicate samples. These neighborhoods are determined through a segmentation algorithm. The retention time (RT)-m/z positions of neighborhoods with local divergences are indicative of incorrect alignment of chromatographic features, technical problems in the chromatograms, or a true biological discrepancy between replicates for particular metabolites. We expect this method to aid in the accurate analysis of metabolomics data and in the development of new peak-picking/alignment procedures.
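The general QC step described above (quantile normalization of log-signals within a replicate group, followed by Z-transformed inter-sample correlations) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "Z-transformed" refers to the Fisher transform (z = atanh(r)) of Pearson correlations, that every sample reports a strictly positive intensity for the same aligned peaks, and that the group is summarized by the mean z over all sample pairs. All function names are hypothetical.

```python
import math
from itertools import combinations


def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (lists of log-intensities).

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Ties are broken by position (a simplification).
    """
    n = len(samples[0])
    if any(len(s) != n for s in samples):
        raise ValueError("all samples must cover the same aligned peaks")
    sorted_samples = [sorted(s) for s in samples]
    # Reference distribution: mean of the k-th smallest value over samples.
    reference = [
        sum(s[k] for s in sorted_samples) / len(samples) for k in range(n)
    ]
    normalized = []
    for s in samples:
        order = sorted(range(n), key=lambda i: s[i])  # peak indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r, eps=1e-12):
    """Fisher z-transform; r is clamped so that |r| = 1 stays finite."""
    r = max(-1.0 + eps, min(1.0 - eps, r))
    return math.atanh(r)


def replicate_group_quality(intensities):
    """Mean Fisher-z of pairwise correlations within one replicate group.

    `intensities` holds one list of raw, strictly positive peak intensities
    per replicate sample. Averaging over pairs is an assumption of this
    sketch, not something stated in the abstract.
    """
    logs = [[math.log2(v) for v in sample] for sample in intensities]
    normalized = quantile_normalize(logs)
    zs = [fisher_z(pearson(a, b)) for a, b in combinations(normalized, 2)]
    return sum(zs) / len(zs)
```

Under these assumptions, a higher score means closer agreement among replicates, so the score could be compared across runs of a peak-picking/alignment procedure with different parameter settings. Zero or missing intensities would need separate handling before the log transform.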
ASJC Scopus subject areas
- Analytical Chemistry