A universal language for finding mass spectrometry data patterns

Tito Damiani, Alan K. Jarmusch, Allegra T. Aron, Daniel Petras, Vanessa V. Phelan, Haoqi Nina Zhao, Wout Bittremieux, Deepa D. Acharya, Mohammed M.A. Ahmed, Anelize Bauermeister, Matthew J. Bertin, Paul D. Boudreau, Ricardo M. Borges, Benjamin P. Bowen, Christopher J. Brown, Fernanda O. Chagas, Kenneth D. Clevenger, Mario S.P. Correia, William J. Crandall, Max Crüsemann, Eoin Fahy, Oliver Fiehn, Neha Garg, William H. Gerwick, Jeffrey R. Gilbert, Daniel Globisch, Paulo Wender P. Gomes, Steffen Heuckeroth, C. Andrew James, Scott A. Jarmusch, Sarvar A. Kakhkhorov, Kyo Bin Kang, Nikolas Kessler, Roland D. Kersten, Hyunwoo Kim, Riley D. Kirk, Oliver Kohlbacher, Eftychia E. Kontou, Ken Liu, Itzel Lizama-Chamu, Gordon T. Luu, Tal Luzzatto Knaan, Helena Mannochio-Russo, Michael T. Marty, Yuki Matsuzawa, Andrew C. McAvoy, Laura Isobel McCall, Osama G. Mohamed, Omri Nahor, Heiko Neuweger, Timo H.J. Niedermeyer, Kozo Nishida, Trent R. Northen, Kirsten E. Overdahl, Johannes Rainer, Raphael Reher, Elys Rodriguez, Timo T. Sachsenberg, Laura M. Sanchez, Robin Schmid, Cole Stevens, Shankar Subramaniam, Zhenyu Tian, Ashootosh Tripathi, Hiroshi Tsugawa, Justin J.J. van der Hooft, Andrea Vicini, Axel Walter, Tilmann Weber, Quanbo Xiong, Tao Xu, Tomáš Pluskal, Pieter C. Dorrestein, Mingxun Wang

Research output: Contribution to journal › Article › peer-review

Abstract

Despite being information rich, the vast majority of untargeted mass spectrometry data are underutilized; most analytes are not used for downstream interpretation or reanalysis after publication. The inability to dive into these rich raw mass spectrometry datasets is due to the limited flexibility and scalability of existing software tools. Here we introduce a new language, the Mass Spectrometry Query Language (MassQL), and an accompanying software ecosystem that addresses these issues by enabling the community to directly query mass spectrometry data with an expressive set of user-defined mass spectrometry patterns. Illustrated by real-world examples, MassQL provides a data-driven definition of chemical diversity by enabling the reanalysis of all public untargeted metabolomics data, empowering scientists across many disciplines to make new discoveries. MassQL has been widely implemented in multiple open-source and commercial mass spectrometry analysis tools, which enhances the ability, interoperability and reproducibility of mining of mass spectrometry data for the research community.

Original language: English
Article number: 113
Number of pages: 8
Journal: Nature Methods
Early online date: 12 May 2025
State: Published - Jun 2025

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer Nature America, Inc. 2025.

ASJC Scopus subject areas

  • Biotechnology
  • Biochemistry
  • Molecular Biology
  • Cell Biology
