Co-reporter:Hiroshi Tsugawa;Kazutaka Ikeda;Wataru Tanaka;Yuya Senoo
Journal of Cheminformatics 2017 Volume 9(Issue 1) pp:
Publication Date(Web):2017 December
DOI:10.1186/s13321-017-0205-3
Liquid chromatography coupled with electrospray ionization tandem mass spectrometry (LC–ESI–MS/MS) is used for comprehensive metabolome and lipidome analyses. Compound identification relies on matching the retention time (RT), precursor m/z, isotopic ratio, and MS/MS spectrum against those of reference compounds. For sphingolipids, however, little reference information on RT and MS/MS is available.
Negative-ion ESI–MS/MS is a useful method for the structural characterization of sphingolipids. We created theoretical MS/MS spectra for 21 sphingolipid classes in human and mouse (109,448 molecules), with substructure-level annotation of unique fragment ions by the MS-FINDER software. The existence of ceramides with β-hydroxy fatty acids in mouse tissues was confirmed by cheminformatic and quantum chemical evidence. The RTs of sphingolipid and glycerolipid species were also predicted for our LC conditions. With this information, the MS-DIAL software for untargeted metabolome profiling could identify 415 unique structures, including 282 glycerolipids and 133 sphingolipids, from human cells (HEK and HeLa) and mouse tissues (ear and liver).
The MS-DIAL and MS-FINDER programs can identify 42 lipid classes (21 sphingolipid and 21 glycerolipid classes) using the in silico RT and MS/MS library. The library is freely available as Microsoft Excel files in the software section of our RIKEN PRIMe website (http://prime.psc.riken.jp/).
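The identification step described in this abstract (matching RT, precursor m/z, and MS/MS spectrum against a reference library) can be illustrated with a minimal Python sketch. This is not the MS-DIAL scoring scheme: the tolerances, the greedy peak pairing, the plain cosine score, the record layout, and all numeric values below are assumptions chosen only for illustration.

```python
import math

def cosine_similarity(spec_a, spec_b, mz_tol=0.01):
    """Cosine score between two centroided spectra given as [(mz, intensity), ...]."""
    used = set()
    dot = 0.0
    for mz_a, int_a in spec_a:
        # Greedily pair each peak with the closest unused reference peak in tolerance.
        best_j, best_diff = None, mz_tol
        for j, (mz_b, _) in enumerate(spec_b):
            diff = abs(mz_a - mz_b)
            if j not in used and diff <= best_diff:
                best_j, best_diff = j, diff
        if best_j is not None:
            used.add(best_j)
            dot += int_a * spec_b[best_j][1]
    norm_a = math.sqrt(sum(i * i for _, i in spec_a))
    norm_b = math.sqrt(sum(i * i for _, i in spec_b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def match_feature(feature, library, mz_tol=0.01, rt_tol=0.5, min_score=0.7):
    """Return (name, score) hits passing precursor m/z, RT, and MS/MS filters."""
    hits = []
    for ref in library:
        if abs(feature["mz"] - ref["mz"]) > mz_tol:
            continue  # precursor m/z outside tolerance
        if abs(feature["rt"] - ref["rt"]) > rt_tol:
            continue  # retention time (minutes) outside tolerance
        score = cosine_similarity(feature["msms"], ref["msms"], mz_tol)
        if score >= min_score:
            hits.append((ref["name"], score))
    return sorted(hits, key=lambda h: h[1], reverse=True)

# Hypothetical library entries and measured feature (values are made up).
library = [
    {"name": "ref_A", "mz": 536.5, "rt": 6.2, "msms": [(237.2, 100.0), (280.3, 40.0)]},
    {"name": "ref_B", "mz": 536.5, "rt": 9.0, "msms": [(237.2, 100.0)]},
]
feature = {"mz": 536.504, "rt": 6.3, "msms": [(237.201, 100.0), (280.299, 40.0)]}
print(match_feature(feature, library))
```

Here the predicted RT acts as a filter that separates two entries sharing the same precursor m/z, which is the role an in silico RT library would play for isomeric lipids.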
Co-reporter:Ipputa Tada;Yasuhiro Tanizawa
BMC Genomics 2017 Volume 18(Issue 2 Supplement) pp:
Publication Date(Web):2017 March
DOI:10.1186/s12864-017-3499-7
Standard graphical tools for whole-genome comparison require a reference genome. However, any reference is itself subject to annotation biases and rearrangements, and may not serve as a standard except in extensively studied model species. To fully exploit the sequence data rapidly accumulating from recent sequencing technologies, genome comparison without any reference has been awaited.
We introduce a circular genome visualizer to compare complete genomes of closely related species. This tool visualizes the positions of orthologous gene clusters rather than actual sequences or their features, thereby achieving a comparative view without using a single reference genome. The essential information is the matrix of orthologous gene clusters, whose positions (not sequences) are color-coded in circular graphics. As a demonstration, comparison of 14 Lactobacillus paracasei strains and one L. casei strain revealed not only large-scale rearrangements but also strain-specific genomic islands. Comparison of 73 Helicobacter pylori strains confirmed their genetic consistency and also revealed three general patterns of large-scale genome inversions.
From the ample sequence information in the GenBank/ENA/DDBJ repository, we can reconstruct a genomic consensus for a particular species. By visualizing multiple strains at a glance, we can identify conserved as well as strain-specific regions in multiply sequenced genomes. Positional consistency of orthologous genes provides information orthogonal to major sequence features such as GC content or the sequence similarity of marker genes. The positional comparison is therefore useful for identifying large-scale genome rearrangements or gene transfers.
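The core data structure in this abstract, a matrix of orthologous gene clusters whose positions are color-coded, can be sketched in Python as follows. The input layout, the function names, the fractional-position encoding, and the position-to-hue mapping are all assumptions made for illustration; the published tool may compute and color positions differently.

```python
def build_position_matrix(genomes):
    """genomes: {strain: [cluster_id, ...]} listed in chromosomal order.

    Returns {cluster_id: {strain: fractional position in [0, 1)}}.
    """
    matrix = {}
    for strain, genes in genomes.items():
        n = len(genes)
        for idx, cluster in enumerate(genes):
            # Keep only the first occurrence if a cluster has paralogs in a strain.
            matrix.setdefault(cluster, {}).setdefault(strain, idx / n)
    return matrix

def classify_clusters(matrix, strains):
    """Split clusters into core (present in all strains) and strain-specific ones."""
    core = sorted(c for c, pos in matrix.items() if len(pos) == len(strains))
    specific = {c: next(iter(pos)) for c, pos in matrix.items() if len(pos) == 1}
    return core, specific

def position_to_hue(fraction):
    """Map a fractional genome position to a hue angle (degrees) on a color wheel."""
    return round(fraction * 360) % 360

# Toy input: two hypothetical strains; clusters B and C are swapped in s2.
genomes = {"s1": ["A", "B", "C", "D"], "s2": ["A", "C", "B", "E"]}
matrix = build_position_matrix(genomes)
core, specific = classify_clusters(matrix, list(genomes))
print(core)      # clusters shared by every strain
print(specific)  # cluster -> the only strain that carries it
print({s: position_to_hue(p) for s, p in matrix["B"].items()})
```

Because each cluster is colored by where it sits in each strain, a rearranged cluster shows a different hue across strains without any strain being designated as the reference.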
Co-reporter:Hiroshi Tsugawa, Tobias Kind, Ryo Nakabayashi, Daichi Yukihira, Wataru Tanaka, Tomas Cajka, Kazuki Saito, Oliver Fiehn, and Masanori Arita
Analytical Chemistry 2016 Volume 88(Issue 16) pp:7946
Publication Date(Web):July 15, 2016
DOI:10.1021/acs.analchem.6b00770
Compound identification from accurate mass MS/MS spectra is a bottleneck for untargeted metabolomics. In this study, we propose nine rules of hydrogen rearrangement (HR) during bond cleavages in low-energy collision-induced dissociation (CID). These rules are based on the classic even-electron rule and cover heteroatoms and multistage fragmentation. We evaluated our HR rules using statistics of MassBank MS/MS spectra and enthalpy calculations, yielding three levels of computational MS/MS annotation: “resolved” (regular HR behavior following HR rules), “semiresolved” (irregular HR behavior), and “formula-assigned” (lacking structure assignment). With this nomenclature, 78.4% of a total of 18506 MS/MS fragment ions in the MassBank database and 84.8% of a total of 36370 MS/MS fragment ions in the GNPS database were (semi-) resolved by predicted bond cleavages. We also introduce the MS-FINDER software for structure elucidation. Molecular formulas of precursor ions are determined from accurate mass, isotope ratio, and product ion information. All isomer structures of the predicted formula are retrieved from metabolome databases, and MS/MS fragmentations are predicted in silico. The structures are ranked by a combined weighting score considering bond dissociation energies, mass accuracies, fragment linkages, and, most importantly, the nine HR rules. The program was validated by its ability to correctly calculate molecular formulas with 98.0% accuracy for 5063 MassBank MS/MS records and to yield the correct structural isomer with 82.1% accuracy within the top-3 candidates. In a test with 936 manually identified spectra from an untargeted HILIC-QTOF MS data set of human plasma, formulas were correctly predicted in 90.4% of the cases, and the correct isomer structure was retrieved at 80.4% probability within the top-3 candidates, including for compounds that were absent in mass spectral libraries.
The MS-FINDER software is freely available at http://prime.psc.riken.jp/.
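The first step mentioned in this abstract, determining molecular formulas from accurate mass, can be illustrated with a small Python sketch. This is not the MS-FINDER algorithm, which also uses isotope ratios and product ion information: the sketch assumes a neutral monoisotopic mass, restricts elements to C, H, N, and O, and uses hypothetical search bounds, a 5 ppm tolerance, and a simple upper limit on hydrogen count.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

def candidate_formulas(neutral_mass, ppm=5.0, max_c=20, max_n=5, max_o=10):
    """Enumerate CHNO formulas whose monoisotopic mass lies within ppm tolerance."""
    tol = neutral_mass * ppm / 1e6
    hits = []
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                rest = neutral_mass - (c * MONO["C"] + n * MONO["N"] + o * MONO["O"])
                if rest < -tol:
                    break  # adding more oxygen only overshoots further
                h = round(rest / MONO["H"])
                # Assumed plausibility filter: at most 2C + N + 2 hydrogens.
                if h < 0 or h > 2 * c + n + 2:
                    continue
                mass = (c * MONO["C"] + h * MONO["H"]
                        + n * MONO["N"] + o * MONO["O"])
                error_ppm = (mass - neutral_mass) / neutral_mass * 1e6
                if abs(error_ppm) <= ppm:
                    formula = (f"C{c}H{h}" + (f"N{n}" if n else "")
                               + (f"O{o}" if o else ""))
                    hits.append((formula, error_ppm))
    # Smallest absolute mass error first.
    return sorted(hits, key=lambda x: abs(x[1]))

# Example: neutral monoisotopic mass of a hexose (C6H12O6).
for formula, err in candidate_formulas(180.06339):
    print(formula, round(err, 3))
```

Mass alone often leaves several candidate formulas at higher masses, which is why the abstract combines it with isotope ratio and product ion information before ranking structures.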