CNCP 2021

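Computational Glycoproteomics: Software and Algorithms, Opportunities and Challenges

Online Sharing Session on Computational Glycoproteomics Work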

First Session

Nov. 26, 2021 20:00 (UTC+8)

Zoom Meeting ID: 967 1224 8816

申洁晨
Northwest University

StrucGP: De Novo Structural Sequencing of Site-specific N-Glycans on Glycoproteins Using a Modularization Strategy

Abstract:

Precise mapping of glycans at the structural and site-specific levels remains one of the most challenging tasks in glycobiology. Here, we describe a novel modularization strategy for de novo interpretation of N-glycan structures on intact glycopeptides using tandem mass spectrometry. We also developed a new algorithm, StrucGP, to automate the interpretation process for large-scale analysis. By dividing an N-glycan into three modules and identifying each module using distinct patterns of Y ions or a combination of distinguishable B/Y ions, the method enables determination of detailed glycan structures on thousands of glycosites in mouse brain, which comprise four types of core structures and seventeen branch structures with three glycan subtypes. Owing to its database-independent glycan mapping strategy, StrucGP also facilitates the identification of rare or new glycan structures. The approach will greatly benefit in-depth structural and functional studies of glycoproteins in biomedical research.

Keywords:

Glycoproteomics, Glycan structure, Mass spectrometry, Glycosylation

杨奕
Fudan University

GproDIA enables data-independent acquisition glycoproteomics with comprehensive statistical control

Abstract:

Large-scale profiling of intact glycopeptides is critical but challenging in glycoproteomics. Data-independent acquisition (DIA) is an emerging technology with deep proteome coverage and accurate quantitative capability in proteomics studies, but it is still at an early stage of development in glycoproteomics. We propose GproDIA, a framework that applies the concept of peptide-centric DIA analysis to proteome-wide characterization of intact glycopeptides, with comprehensive statistical control by a two-dimensional false discovery rate approach and a glycoform inference algorithm, enabling accurate identification of intact glycopeptides using wide isolation windows. We further use a semi-empirical spectrum prediction strategy to expand the coverage of glycopeptide spectral libraries. We benchmark our method for N-glycopeptide profiling on DIA data of yeast and human serum samples, demonstrating that DIA with GproDIA outperforms data-dependent acquisition-based methods for glycoproteomics in terms of capacity and data completeness of identification, as well as accuracy and precision of quantification. We expect that this work can provide a powerful tool for glycoproteomic studies.

Keywords:

Glycoproteomics, Data-independent acquisition, Mass spectrometry
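
The statistical control described in the GproDIA abstract above rests on false discovery rate (FDR) estimation. The following is a minimal sketch of generic target-decoy q-value estimation, applied separately to a peptide-level score and a glycan-level score. It is not GproDIA's two-dimensional FDR procedure: the function names, the rule of requiring both levels to pass, and the 1% threshold are assumptions made for illustration only.

```python
def q_values(scores, is_decoy):
    """Return one q-value per match; higher scores are assumed to be better."""
    # Rank matches from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fdr_at_rank = []
    targets = decoys = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR when accepting everything down to this rank.
        fdr_at_rank.append(decoys / max(targets, 1))
    # q-value: the smallest FDR at this rank or at any lower-scoring rank.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr_at_rank[rank])
        q[order[rank]] = running_min
    return q


def accept_two_level(peptide_scores, peptide_decoy,
                     glycan_scores, glycan_decoy, threshold=0.01):
    """Hypothetical rule: keep a target match only if both levels pass."""
    peptide_q = q_values(peptide_scores, peptide_decoy)
    glycan_q = q_values(glycan_scores, glycan_decoy)
    return [
        (not pd) and (not gd) and pq <= threshold and gq <= threshold
        for pd, gd, pq, gq in zip(peptide_decoy, glycan_decoy,
                                  peptide_q, glycan_q)
    ]


if __name__ == "__main__":
    kept = accept_two_level(
        [10, 9, 8, 7, 6], [False, False, True, False, True],
        [5, 1, 4, 3, 2], [False, True, False, False, False],
    )
    print(kept)
```

Treating the two levels independently keeps the sketch short; a real glycopeptide pipeline would also need to account for matches where both the peptide and the glycan are wrong.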

Second Session

Dec. 2, 2021 08:00 (UTC+8)

Zoom Meeting ID: 953 5589 3407

Daniel Polasky
University of Michigan

Peptide-first Glycopeptide Search: Combining MSFragger Glyco Search with Glycan FDR Control in PTM-Shepherd

Abstract:

Advances in methods for enrichment and mass spectrometric analysis of intact glycopeptides are increasingly producing large-scale, high-quality glycoproteomics datasets, but confidently annotating both peptide and glycan identities in the resulting spectra remains challenging. We have developed a “peptide-first” glyco search strategy that uses the mass offset search of MSFragger to identify glycopeptides as the combination of a peptide sequence and a glycan mass. This approach takes advantage of glycan fragmentation and the indexed search of MSFragger to greatly improve the sensitivity of glycopeptide spectrum matching in CID/HCD data. We have recently introduced a module in the post-search annotation tool PTM-Shepherd to convert the glycan mass to a specific glycan composition and perform glycan composition-specific FDR estimation. Matching the peptide sequence first greatly reduces the number of possible glycans considered in glycan matching and, along with the use of both Y ions and oxonium ions from the spectrum, allows our method to achieve sensitive and robust glycan assignment in the presence of entrapment glycans known not to be present in the sample. Combined with tools for quantitation, we now have a complete pipeline in the FragPipe computational environment for analysis of glycopeptide tandem MS data.
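
The step of converting a glycan mass into a glycan composition, mentioned in the abstract above, can be illustrated with a minimal sketch. This is not PTM-Shepherd's implementation: it matches on mass alone (no Y-ion or oxonium-ion evidence, no FDR), and the function names, the count limits in `MAX_COUNT`, and the 20 ppm tolerance are assumptions chosen for illustration. The residue masses are standard monoisotopic values.

```python
from itertools import product

# Monoisotopic residue masses (Da) of four common monosaccharides.
RESIDUE_MASS = {
    "HexNAc": 203.07937,
    "Hex": 162.05282,
    "Fuc": 146.05791,
    "NeuAc": 291.09542,
}

# Hypothetical upper bounds per residue; real tools use curated glycan lists.
MAX_COUNT = {"HexNAc": 8, "Hex": 12, "Fuc": 4, "NeuAc": 4}


def composition_mass(composition):
    """Sum residue masses for a composition such as {"HexNAc": 2, "Hex": 5}."""
    return sum(RESIDUE_MASS[name] * count for name, count in composition.items())


def candidate_compositions(observed_offset, tolerance_ppm=20.0):
    """List compositions whose mass matches the observed mass offset."""
    names = list(RESIDUE_MASS)
    tolerance = observed_offset * tolerance_ppm / 1e6
    matches = []
    # Brute-force enumeration over all allowed residue counts.
    for counts in product(*(range(MAX_COUNT[n] + 1) for n in names)):
        composition = dict(zip(names, counts))
        mass = composition_mass(composition)
        if abs(mass - observed_offset) <= tolerance:
            # Drop zero counts so the result reads like a glycan composition.
            matches.append(({n: c for n, c in composition.items() if c}, mass))
    # Closest mass first.
    return sorted(matches, key=lambda m: abs(m[1] - observed_offset))


if __name__ == "__main__":
    offset = composition_mass({"HexNAc": 2, "Hex": 5})
    for composition, mass in candidate_compositions(offset):
        print(composition, round(mass, 4))
```

Because several compositions can share nearly the same mass, mass matching alone only narrows the candidates, which is why the abstract's method also draws on fragment ions from the spectrum.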

卢磊
University of Wisconsin

Glycopeptide characterization with MetaMorpheus

Abstract:

Mass spectrometry (MS) is the gold standard for interrogating the glycoproteome, enabling the localization of glycans to specific glycosites. Recent applications of electron-driven dissociation methods have shown promise in localizing modified O-glycosites even in multiply glycosylated peptides. Yet, standard approaches for interpreting MS/MS spectra are ill-suited to the heterogeneity of O-glycopeptides, especially for the most challenging mucin-type O-glycosylation. O-glycoproteomic analysis pipelines are needed to search for multiply O-glycosylated peptides within reasonable time frames for simple mixtures of O-glycoproteins and proteome-scale experiments.

We developed O-Pair Search, which identifies O-glycopeptides via an ion-indexed open modification search and localizes O-glycosites using graph theory and probability-based localization on paired collision- and electron-based dissociation spectra. O-Pair Search reduces search times compared to the widely used O-glycopeptide processing software Byonic, while defining O-glycosite localization confidence levels and generating more O-glycopeptide identifications.

方盼
Soochow University

Multiplexed quantitative site-specific N-glycoproteomics method development and applications

Abstract:

Regulation of protein N-glycosylation is essential in human cells. However, large-scale, accurate, and site-specific quantification of glycosylation is still technically challenging. We introduce SugarQuant, an integrated mass spectrometry-based pipeline comprising protein aggregation capture (PAC)-based sample preparation, multi-notch MS3 acquisition (Glyco-SPS-MS3), and a data-processing tool (GlycoBinder), which enables confident identification and quantification of intact glycopeptides in complex biological samples. We apply SugarQuant to identify and quantify more than 5,000 unique glycoforms in Burkitt’s lymphoma cells and to determine, with high confidence, the site-specific glycosylation changes that occur upon inhibition of fucosylation. We further demonstrate that implementing FAIMS in SugarQuant provides the highest accuracy and precision for glycoproteomics.

Third Session

Dec. 10, 2021 20:00 (UTC+8)

Zoom Meeting ID: 986 7749 6425

朱赫
Dalian Institute of Chemical Physics

MS-Decipher: a user-friendly proteome database search software with an emphasis on deciphering the spectra of O-linked glycopeptides

Abstract:

The interpretation of mass spectrometry (MS) data is a key step in proteomics analysis, and the identification of glycosylation, one of the post-translational modifications (PTMs), is essential for understanding biological functions in living systems. To simplify the analysis of proteomic data sets, especially O-glycoproteomic data sets, we provide a user-friendly proteomics database search platform, MS-Decipher, to identify peptides from MS data. Two scoring schemes can be used for peptide matching, and more than one method is available for the result validation step. In addition, O-search, a special search mode we developed previously, is included for searching O-glycopeptides in O-glycoproteomic analysis. MS-Decipher performed well in peptide search and O-glycopeptide search compared with traditional database search software. MS-Decipher also has a user-friendly graphical user interface, making it easy to operate. Data and result files in multiple formats can be used for the search and validation steps. MS-Decipher is implemented in Java and can be used across platforms. It is free for academic use.

曾文锋
Max Planck Institute of Biochemistry

Towards flexible and comprehensive glycopeptide analysis with pGlyco Series

Abstract:

I believe glycoproteomics is becoming one of the most general extensions of proteomics after phosphoproteomics. In recent years, we have developed a series of glycan-first glycopeptide search engines, called pGlyco, for both N- and O-glycopeptide identification. In addition to analysis of the peptide part, pGlyco focuses more on glycan search and quality control. For glycan identification, pGlyco3, the newest version of pGlyco, mainly supports glycan structure database search with built-in or user-defined glycan structures. This feature allows us to analyze glycans with customized modifications. We also provide a glycan database-free search, named pGlycoNovo, in pGlyco3 as an experimental feature. pGlyco3 integrates a site-specific glycan localization (SSGL) algorithm, pGlycoSite, to localize glycans quickly and accurately using ETD/EThcD spectra. All these modules enable flexible and comprehensive analyses of glycopeptides. However, many issues remain in both ‘wet’ and ‘dry’ experiments, and I will also discuss possible future developments of pGlyco3 in this talk.

邱继辉
Academia Sinica, Taiwan

Memos for Glycoproteomics 2.0

Abstract:

Despite a flourish of new glycoproteomic software tools being introduced lately, plenty of technical issues remain unsolved. As we enter the last month of 2021 and start looking into our crystal balls, a reality check on what is missing and desirable, calibrated against what is feasible and impactful, would be most wise and timely. Glycoproteomics is a holistic approach to map the site-specific glycosylation profile of all expressed proteins at any one time and track their changes over onco-developmental and myriad pathophysiological stages. The elephant in the room is the parable of the six blind men: seeing plenty of randomly sampled trees but not the forest, often exacerbated by failing to pin down with high precision the exquisitely distributed flowers and fruits during our brute-force swath harvesting. The question to ask is what is needed to advance glycobiology, not what we can seemingly offer. In this short closing talk, following a series of impressive presentations on computational advances in mining glycoproteomic data, I would like to pose some searching questions on the way forward, to invite deep thinking by human intelligence on the core matters. Together, we shall write the memos for next-generation glycoproteomics.