Discrimination of germline V genes at different sequencing lengths and mutational burdens: A new tool for identifying and evaluating the reliability of V gene assignment

Bochao Zhang, Wenzhao Meng, Eline T. Luning Prak, Uri Hershberg

Research output: Contribution to journal › Article › peer-review


Immune repertoires are collections of lymphocytes that express diverse antigen receptor gene rearrangements consisting of Variable (V), Diversity (D, in the case of heavy chains) and Joining (J) gene segments. Clonally related cells typically share the same germline gene segments and have highly similar junctional sequences within their third complementarity determining regions. Identifying clonal relatedness of sequences is a key step in the analysis of immune repertoires. The V gene is the most important for clone identification because it has the longest sequence and the greatest number of sequence variants. However, accurate identification of a clone's germline V gene source is challenging because there is a high degree of similarity between different germline V genes. This difficulty is compounded in antibodies, which can undergo somatic hypermutation. Furthermore, high-throughput sequencing experiments often generate partial sequences and have significant error rates. To address these issues, we describe a novel method to estimate which germline V genes (or alleles) cannot be discriminated under different conditions (read lengths, sequencing errors or somatic hypermutation frequencies). Starting with any set of germline V genes, this method measures their similarity at different sequencing lengths and calculates their likelihood of unambiguous assignment under different levels of mutation. Hence, one can identify, under different experimental and biological conditions, the germline V genes (or alleles) that cannot be uniquely identified and bundle them together into groups of specific V genes with highly similar sequences.
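The grouping idea in the abstract can be illustrated with a minimal sketch. This is not the authors' tool. It assumes pre-aligned, equal-length germline sequences, assumes reads cover the 3' end of the V gene, and uses a plain mismatch tolerance as a crude stand-in for the likelihood calculation under mutation that the abstract describes. The gene names and sequences (`V-A`, `V-B`, `V-C`) and the function names are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length (pre-aligned)")
    return sum(x != y for x, y in zip(a, b))


def group_indistinguishable(germlines, read_length, max_mismatches=0):
    """Bundle genes whose last `read_length` bases differ at no more than
    `max_mismatches` positions (single-linkage grouping)."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    # Assumption: reads cover the 3' end of the V gene, so keep the tail.
    tails = {name: seq[-read_length:] for name, seq in germlines.items()}

    # Union-find over gene names, so that similarity chains merge groups.
    parent = {name: name for name in tails}

    def find(name):
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(tails, 2):
        if hamming(tails[a], tails[b]) <= max_mismatches:
            parent[find(a)] = find(b)

    groups = {}
    for name in tails:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical toy germline set; real V genes are roughly 300 bases long.
toy = {
    "V-A": "ACGTACGTACGT",
    "V-B": "TCGTACGTACGT",  # differs from V-A only at the 5' end
    "V-C": "ACGTACGTTCGT",  # differs from V-A near the 3' end
}

for length in (12, 6, 3):
    print(length, group_indistinguishable(toy, length))
```

With the toy set, shorter reads lose the positions that separate the genes, so the groups merge as `read_length` shrinks. Raising `max_mismatches` has a similar effect and loosely mimics a higher mutational burden.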

Original language: English
Pages (from-to): 105-116
Number of pages: 12
Journal: Journal of Immunological Methods
State: Published - 1 Dec 2015
Externally published: Yes

Bibliographical note

Funding Information:
The authors would like to thank Gregory Schwartz and Aaron Rosenfeld for their insightful comments and helpful arguments. Research reported in this publication was supported by the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under Award Number P01AI106697 and by NIH P30-CA016520. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Publisher Copyright:
© 2015 Elsevier B.V.


  • B cells
  • Gene identification
  • Germline annotation
  • High throughput sequencing

ASJC Scopus subject areas

  • Immunology and Allergy
  • Immunology

