CNCP 2021

Invited Speakers

陈鹏 陈顺兴 陈兴 丁明 黄河 姜颖 刘泽先 李雪明 李衍常 乔亮 秦伟捷 申华莉 水雯箐 孙世伟 唐淳 谭敏佳 田瑞军 瑕瑜 谢鹭 杨明坤 张莹 赵群 郑杰 周虎 朱力
陈鹏
Peking University

Bioorthogonal Chemistry-enabled Spatial-Temporal Proteomics

Abstract:

Employing small molecules or other chemical means to modulate the function of an intracellular protein of interest, particularly in a gain-of-function fashion, remains highly desired but challenging. In this talk, I will introduce a “genetically encoded chemical decaging” strategy that relies on our recently developed bioorthogonal cleavage reactions to control protein activation in living systems. These reactions exhibit high efficiency and low toxicity for decaging chemically “masked” lysine or tyrosine residues on intracellular proteins, enabling spatially and temporally resolved proteomics studies in living systems. Most recently, with the assistance of computer-based design and screening, we further expanded our method from “precise decaging” of enzyme active sites to “proximal decaging” of enzyme pockets. This new method, termed “Computationally Aided and Genetically Encoded Proximal Decaging” (CAGE-prox), showed general applicability for switching on the activity of a broad range of proteins under living conditions. I will end by showcasing exciting applications of our CAGE-prox technique in: i) constructing orthogonal and mutually exclusive kinase signaling cascades; ii) temporal caspase activation for time-resolved profiling of proteolytic events upon apoptosis; and iii) on-demand activation of bacterial effectors as potential protein prodrugs for cancer therapy.

Keywords:

proteomics, bioorthogonal reaction, spatial and temporal control, living systems
陈顺兴
Southern University of Science and Technology

Profiling Intracellular Protein-Protein and Protein-Chemical Interactions at Scale with Cellular Biophysics Proteomics

Abstract:

The interaction of proteins with chemicals and other proteins underlies all cellular activities, and many bioactive compounds modulate the molecular functions of proteins through direct physical interactions. Mapping these interactions will reveal the internal wiring of cells, providing insight into how these interactions are dysregulated in diseases and perturbed by chemicals. However, there are currently few time- and cost-effective techniques for global profiling of intracellular protein-protein and protein-chemical interactions. The recent Cellular Thermal Shift Assay (CETSA) verifies the intracellular binding of experimental drugs to their intended protein targets, and has been integrated with quantitative MS (termed Thermal Proteome Profiling, TPP) into a modification-free drug target deconvolution technique. My laboratory had the privilege of contributing to the extension of this technique to metabolite-binding and membrane proteins, and has recently adapted it to simultaneously assess the intracellular assembly state of thousands of protein complexes, based on an unorthodox concept termed Thermal Proximity Co-aggregation (TPCA). Both MS-based CETSA and TPCA require neither tedious modification of chemicals nor cell engineering, which greatly increases their throughput potential. I will describe recent experimental and computational advancements made in my laboratory to these techniques for profiling intracellular protein-protein interaction and protein-chemical interaction networks at scale toward “inter”Omics.
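The idea behind TPCA can be illustrated with a small sketch: if subunits of a complex co-aggregate on heating, their melting (solubility) curves should lie closer together than those of unrelated proteins. The code below is only an illustration under that assumption. The function names, the Euclidean distance measure, and all numbers are hypothetical and are not taken from the speaker's pipeline.

```python
import math

def curve_distance(curve_a, curve_b):
    """Euclidean distance between two melting curves sampled at the same temperatures."""
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must be sampled at the same temperatures")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(curve_a, curve_b)))

def mean_pairwise_distance(curves):
    """Average curve distance over all protein pairs in a putative complex."""
    names = sorted(curves)
    pairs = [(p, q) for i, p in enumerate(names) for q in names[i + 1:]]
    if not pairs:
        raise ValueError("need at least two proteins")
    return sum(curve_distance(curves[p], curves[q]) for p, q in pairs) / len(pairs)

# Hypothetical soluble fractions at five increasing temperatures (made-up values).
complex_members = {
    "subunit_A": [1.00, 0.95, 0.70, 0.30, 0.05],
    "subunit_B": [1.00, 0.93, 0.68, 0.33, 0.06],
}
unrelated_pair = {
    "subunit_A": [1.00, 0.95, 0.70, 0.30, 0.05],
    "protein_X": [1.00, 0.60, 0.20, 0.05, 0.01],
}

# A smaller mean distance is read here as weak evidence of co-aggregation.
print(mean_pairwise_distance(complex_members))
print(mean_pairwise_distance(unrelated_pair))
```

In practice such a score would need a statistical baseline, for example comparison against randomly drawn protein sets, before any complex is called assembled.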

Keywords:

cellular thermal shift assay, thermal proximity co-aggregation, thermal proteome profiling, protein-protein interaction, protein-chemical interaction
陈兴
Peking University

Chemical Labeling-Assisted Glycoproteomics

Abstract:

Protein glycosylation, attachment of glycans to specific amino acids via glycosidic bonds, is the most ubiquitous and complex posttranslational modification. Based on the glycosidic bond and glycan structure, the major types of protein glycosylation include N-linked glycosylation, mucin-type O-linked glycosylation, and O-GlcNAcylation. Comprehensive analysis of protein glycosylation is a prerequisite for understanding the biological function of protein glycosylation, but remains challenging. To address this challenge, we take advantage of chemical labeling of glycans with clickable unnatural sugars and develop an effective platform for comprehensive analysis of intact N- and O-glycopeptides in one sample from cell lysates and tissues. The chemical labeling-assisted glycoproteomics strategy is applied to generate large-scale datasets of protein glycosylation in various tissues of mice, demonstrating its potential in facilitating our understanding of glycobiology.

Keywords:

metabolic glycan labeling, glycomics, glycoproteomics, click chemistry, intact glycopeptide
丁明
China Pharmaceutical University

Discovering Molecular Targets of the Natural Product Tanshinone with Quantitative Proteomics

Abstract:

Ischemia-reperfusion injury is an important cause of cell and tissue damage in the clinic, especially during the treatment of myocardial infarction. Natural products from the plant Salvia miltiorrhiza Bge have been used clinically and show benefit in cardiovascular diseases. Our lab uses quantitative proteomics methods to elucidate the molecular mechanisms of ischemia-reperfusion injury and to discover the potential drug targets of Salvia miltiorrhiza Bge. We also developed a biotin-maleimide probe for electrophilic cysteine profiling, which can quantify more than 18,000 cysteine sites in one experiment. With this method, we discovered several potential targets of the natural product tanshinone.

Keywords:

reperfusion injury, phosphorylation, chemical proteome, tanshinone
黄河
Shanghai Institute of Materia Medica, Chinese Academy of Sciences

Systematic Investigation of Key Regulatory Elements for Lysine β-hydroxybutyrylation

Abstract:

Short-chain fatty acids and their corresponding acyl-CoAs sit at the crossroads of multiple metabolic pathways and play important roles in diverse cellular processes. A noteworthy example is the newly identified protein post-translational modification (PTM) lysine β-hydroxybutyrylation (Kbhb), which is derived from one of the ketone bodies, β-hydroxybutyrate. We have demonstrated that histone Kbhb directly stimulates transcription, and established novel functions for β-hydroxybutyrate in regulating gene expression. However, key elements regulating this physiologically relevant pathway remain unknown, hindering characterization of the mechanisms by which this modification exerts its biological functions. Here we systematically investigate the key regulatory enzymes and substrates of Kbhb, which illuminates the landscape of the Kbhb pathway and lays a solid foundation for future studies of this pathway in cellular physiology and human diseases.

Keywords:

posttranslational modifications, lysine β-hydroxybutyrylation, regulatory enzymes, substrates
姜颖
National Center for Protein Sciences (Beijing)

Proteomic Subtyping as a Postoperative Recurrence Risk Assessment Analysis in Hepatocellular Carcinoma

Abstract:

Proteomics is increasingly important in predicting clinical outcomes. Quantitative proteomic data have suggested heterogeneity in early-stage hepatocellular carcinoma (HCC) and have been used to stratify a cohort into three subtypes (i.e., S-I, S-II and S-III) with different clinical outcomes, but larger cohorts and comprehensive proteomic analysis are needed to provide definitive answers. We established a proteomic pipeline integrating data-independent acquisition (DIA) on four mass spectrometers (MS) in parallel with spectral library-based database searching. The pipeline was robust not only for snap-frozen samples but also for formalin-fixed, paraffin-embedded (FFPE) samples. Our analyses included 1,024 patients with primary HCC from three independent cohorts. To further verify the proteomic subtyping in a real-world setting, we identified the subtypes of all HCC patients recruited in this study using the SRPS algorithm. Cox regression analysis showed that the proteomic subtypes were significantly associated with overall survival and disease-free survival, independent of clinical and pathological features. In summary, the proteomic subtypes apply not only to early-stage HCC patients with HBV infection, but also to HCC patients with BCLC stage B and to HCC patients without HBV infection or cirrhosis, showing the universality of proteomic subtypes in HCC patients.

Keywords:

proteomic stratification, data-independent acquisition, recurrence risk, multi-centric and well characterized cohorts
刘泽先
Sun Yat-sen University

Genetic variation driven PTM aberrances: from molecular mechanism to targeted cancer therapies

Abstract:

Post-translational modifications (PTMs) are critical for regulating cellular processes, and their aberrations are heavily implicated in cancer. A massive number of PTM sites have been identified through experimental identification and high-throughput proteomics techniques; however, their enzyme-specific regulation remains largely unknown. Recently, we developed the Deep-PLA software for HAT/HDAC-specific acetylation prediction based on deep learning, and employed protein–protein interaction and co-sublocalization to filter out false-positive predictions. Through large-scale prediction based on TCGA cancer omics data, we observed that mutations occurred more frequently in the regions around acetylation sites, and that acetylation-related mutations (ARMs) had higher variant allele fraction values than non-ARMs, which suggests that these mutations might be more functional in cancer. Furthermore, ARM proteins were significantly enriched in cancer genes and druggable proteins, and clinical survival analysis demonstrated that patients with at least one ARM had a significantly worse clinical prognosis in cancers such as head-neck squamous cell carcinoma. Besides the substrates and sites, we also studied the enzymes of PTMs, for example HER2, the kinase targeted by trastuzumab in HER2-positive gastric cancer. By monitoring patients through ctDNA, we observed that mutations of PIK3CA/R1/C3 or ERBB2/4 could indicate trastuzumab resistance. Additionally, mutations in NF1 contributed to trastuzumab resistance, which was further confirmed through in vitro and in vivo studies, while combined HER2 and MEK/ERK blockade overcame trastuzumab resistance. Taken together, the PTM systems, including the substrates, sites and enzymes, are critical in cancer, and further studies should be devoted to this area.

Keywords:

acetylation, phosphorylation, mutation, cancer
李雪明
Tsinghua University

Continual learning in cryoEM particle picking

Abstract:

Before cryo-electron microscopy (cryoEM) images can be processed, the sample of interest must first be located, so methods such as particle picking are needed to find the proteins or other objects of interest. However, limited by radiation damage, the signal-to-noise ratio of cryoEM micrographs is very low, and hence particle picking is often challenging and labor-intensive. Template matching has been widely used to accelerate particle picking, but it requires manual intervention and the provision of templates; consequently, it relies heavily on the user's experience and is not well suited to automated processing.

Deep learning methods show great potential owing to their ability in object identification. Although deep learning methods no longer require users to directly provide templates, they still need enough training data to ensure accurate recognition. To achieve fully automated, even intelligent, particle picking, we have introduced a continual learning method based on a deep learning framework. The significance of continual learning is that deep neural networks can be continuously trained and enhanced. In continual applications, the computer can keep learning new features, accumulate knowledge, and become more and more powerful.

李衍常
National Center for Protein Sciences (Beijing)

Quantitative Proteomics for Ubiquitination Detection and Functional Research

Abstract:

Ubiquitin chains, as carriers of biological information, perform specific biological functions and constitute the "Ubiquitin Code" system. The lack of systematic and efficient screening strategies for substrates modified by atypical ubiquitin chains limits the identification of these substrates and the study of their functional mechanisms. In addition, the ubiquitination signal network is trimmed in reverse by deubiquitinating enzymes (DUBs). Determining the specificity of DUBs for ubiquitin chains is challenging but worthwhile, as it helps us understand the precise regulatory mechanisms of UPS enzymes and substrates. Technically, it remains a challenge to directly display the specificity of a given DUB for its corresponding ubiquitin linkages. To improve the detection sensitivity and coverage for ubiquitin chains and modified sites, we developed a tandem hybrid UBD (ThUBD) to enrich ubiquitinated proteins and constructed a trypsin and LysargiNase tandem digestion strategy to improve the identification of modified sites. Building on these high-performance technologies, we used budding yeast to establish a high-throughput profiling and validation strategy, based on quantitative proteomics, for substrates modified by atypical ubiquitin chains. High-throughput screening of substrates modified by the K11 atypical ubiquitin chain was conducted to reveal its molecular function in the transcriptional activation of Met4, providing a theoretical basis for the discovery of new functions of the K11 atypical ubiquitin chain. We also employed SILAC quantitative proteomics approaches to systematically evaluate the specificity of DUBs for all seven types of ubiquitin chains. Based on this specificity, we demonstrated the precise regulation and functions of ubiquitin modification on substrates through DUBs. The signal “DUBs – Ub chains – substrate – function” forms the basis of the precise regulation mechanism of ubiquitin networks.

Keywords:

quantitative proteomics, ubiquitin, atypical ubiquitin chains, deubiquitinase
乔亮
Fudan University

DIA proteomics with in silico spectral libraries by deep learning and DIA glycoproteomics

Abstract:

Data-independent acquisition (DIA) is an emerging technology for quantitative proteomic analysis of large cohorts of samples. However, sample-specific spectral libraries built by data-dependent acquisition (DDA) experiments are required prior to DIA analysis, which is time-consuming and limits identification and quantification by DIA to the peptides identified by DDA. Recently, we developed DeepDIA, a deep learning-based approach to generate in silico spectral libraries for DIA analysis [1]. We demonstrate that the quality of in silico libraries predicted by instrument-specific models using DeepDIA is comparable to that of experimental libraries, and that these libraries outperform libraries generated by global models. With peptide detectability prediction, in silico libraries can be built directly from protein sequence databases. We further illustrate that DeepDIA can break through the limitation of DDA on peptide/protein detection, and enhance DIA analysis of human serum samples compared to the state-of-the-art protocol using a DDA library. With the emergence of the timsTOF Pro and FAIMS, we further extended the DeepDIA toolbox to ion mobility prediction. DeepDIA now also supports data from the timsTOF Pro and FAIMS Orbitrap.

On a separate topic, we recently developed GproDIA, a framework for DIA glycoproteomics with comprehensive statistical control by a 2-dimensional false discovery rate approach and a glycoform inference algorithm, enabling accurate identification of intact glycopeptides using wide isolation windows [2]. We benchmark our method for N-glycopeptide profiling on DIA data of yeast and human serum samples, demonstrating that DIA with GlycoSWATH outperforms data-dependent acquisition (DDA)-based methods for glycoproteomics in terms of capacity and data completeness of identification, as well as accuracy and precision of quantification. We expect that this work can provide a powerful tool for glycoproteomic studies.

[1] Yi Yang, Xiaohui Liu, Chengpin Shen, Yu Lin, Pengyuan Yang, Liang Qiao, Nature Communications, 2020, 11, 146
[2] Yi Yang, Weiqian Cao, Guoquan Yan, Siyuan Kong, Mengxi Wu, Pengyuan Yang, Liang Qiao, bioRxiv, 2021, doi: https://doi.org/10.1101/2021.03.20.436117

Keywords:

data independent acquisition, proteomics, glycoproteomics, deep learning
秦伟捷
National Center for Protein Sciences (Beijing)

An RNA tagging approach for system-wide RNA-binding proteome profiling and dynamics investigation upon transcription inhibition

Abstract:

RNA-protein interactions play key roles in epigenetic, transcriptional and posttranscriptional regulation. To reveal the regulatory mechanisms of these interactions, global investigation of RNA-binding proteins (RBPs) and monitoring of their changes under various physiological conditions are needed. Herein, we developed a psoralen probe (PP)-based method for RNA tagging and ribonucleoprotein complex (RNP) enrichment. Isolation of both coding and noncoding RNAs and mapping of 2986 RBPs, including 782 unknown candidate RBPs, from HeLa cells were achieved by PP enrichment, RNA sequencing and mass spectrometry analysis. The dynamics study of RNPs by PP enrichment after the inhibition of RNA synthesis provides the first large-scale distribution profile of RBPs bound to RNAs with different decay rates. Furthermore, the remarkably greater decreases in the abundance of RBPs obtained by PP enrichment than by global proteome profiling suggest that PP enrichment after transcription inhibition offers a valuable way to evaluate candidate RBPs on a large scale.

Keywords:

RNA-binding proteins, psoralen probe, large-scale, enrichment, mass spectrometry
申华莉
Fudan University

Cancer Serum Atlas combining pan-targeted mass spectrometry supports proteomics-based multi-cancer diagnosis

Abstract:

Early cancer detection could give cancer patients a better chance of long-term survival. The emerging multi-cancer diagnosis approach has the potential to address this large unmet need in a more inclusive and cost-effective way. Yet such an approach requires high specificity, high sensitivity, and highly accurate tissue of origin (TOO) identification. In this study, we developed a proteomics-based approach for multi-cancer diagnosis. First, we conducted systematic data mining of potentially secreted, cancer-associated proteins from published clinical proteomics datasets of seven common cancer types with high morbidity and mortality. Over two thousand proteins were selected as candidate cancer biomarkers that could be detectable in blood. Unique peptides of each protein were synthesized, and high-quality MS/MS and PRM spectra were acquired. All the resulting data are presented in a database named “Cancer Serum Atlas” (www.cancerserumatlas.com). We then developed a pan-targeted MS strategy that can precisely quantify up to 800 proteins in one run and applied it to quantify 485 detectable cancer biomarkers in sera of 293 individuals who were healthy or had one of 4 different types of cancer. To further improve the specificity of multi-cancer diagnosis, we introduced a previously developed PPC-VDE algorithm, which generates a large number of cancer-specific features by quantifying protein-protein co-regulation. Taken together, the Cancer Serum Atlas combined with the pan-targeted MS approach showed great effectiveness in multi-cancer diagnosis and can be widely applied to other blood-based cancer studies.

Keywords:

multi-cancer diagnosis, cancer serum atlas, pan-targeted mass spectrometry, protein-protein co-regulation
水雯箐
ShanghaiTech University

Bridging Mass Spectrometry with GPCR Biology: Discovery of Potential Therapeutic Targets

Abstract:

Transmembrane proteins play vital roles in mediating synaptic transmission, plasticity and homeostasis in the brain. However, these proteins, especially the G protein-coupled receptors (GPCRs), are under-represented in most large-scale proteomic surveys. Here, we present a new proteomic approach aided by deep learning-based spectral library prediction for comprehensive profiling of transmembrane protein families in multiple mouse brain regions. Our multiregional proteome profiling highlights the considerable discrepancy between mRNA and protein distribution, especially for region-enriched GPCRs, and predicts an endogenous GPCR interaction network in the brain. Furthermore, our new approach reveals the transmembrane proteome remodeling landscape in the brain of a mouse depression model, which led to the identification of two novel GPCR regulators of depressive-like behaviors. Our study provides an enabling technology and rich data resource to expand the understanding of transmembrane proteome organization and dynamics in the brain as well as accelerate the discovery of potential therapeutic targets for depression treatment.

Keywords:

transmembrane proteins, GPCRs, spectral library prediction, brain proteomics, regulators of depression
孙世伟
Institute of Computing Technology, Chinese Academy of Sciences

Toward Automated Identification of Glycan Branching Patterns Using Multistage Mass Spectrometry with Intelligent Precursor Selection

Abstract:

Glycans play important roles in a variety of biological processes. Their activities are closely related to the fine details of their structures. Unlike the simple linear chains of proteins, branching is a unique feature of glycan structures, making their identification extremely challenging. Multistage mass spectrometry (MSn) has become the primary method for glycan structural identification. The major difficulty for MSn is the selection of fragment ions as precursors for the next stage of scanning. Widely used strategies are either manual selection by experienced experts, which requires considerable expertise and time, or simply selecting the most intense peaks, in which case the resulting product-ion spectrum may not be structurally informative and may therefore fail to support the assignment. We here report an ‘intelligent precursor selection’ strategy (GIPS) to guide MSn experiments. Our approach consists of two key elements: an empirical model to calculate each candidate glycan’s ‘probability’ and a statistical model to calculate each fragment ion’s ‘distinguishing power’, in order to select the structurally most informative peak as the precursor for next-stage scanning. Using 13 glycan standards, including 3 pairs with isomeric sequences, and 8 variously fucosylated oligosaccharides on linear or branched hexasaccharide backbones obtained from a human milk oligosaccharide fraction by HPLC, we demonstrate its successful application to branching pattern analysis with improved efficiency and sensitivity, and also the potential for automated operation.
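The selection step can be pictured with a toy sketch. The abstract does not give the actual probability or distinguishing-power models, so the code below substitutes a generic stand-in: each candidate structure carries a probability, each peak is assigned the substructure label it would reveal under each candidate, and the peak whose labels split the probability mass most evenly (highest Shannon entropy) is chosen. All names, labels, and numbers are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def distinguishing_power(peak, candidates):
    """Entropy of the label distribution that `peak` induces over the candidates.

    Stand-in score only: a peak that looks the same under every candidate scores 0.
    """
    mass = {}
    for cand in candidates:
        label = cand["fragments"][peak]
        mass[label] = mass.get(label, 0.0) + cand["prob"]
    return entropy(mass.values())

def pick_precursor(peaks, candidates):
    """Choose the peak expected to discriminate best among the candidates."""
    return max(peaks, key=lambda peak: distinguishing_power(peak, candidates))

# Three hypothetical candidate structures with made-up probabilities.
candidates = [
    {"prob": 0.5, "fragments": {"m1": "A", "m2": "A", "m3": "A"}},
    {"prob": 0.3, "fragments": {"m1": "A", "m2": "B", "m3": "A"}},
    {"prob": 0.2, "fragments": {"m1": "A", "m2": "B", "m3": "B"}},
]

print(pick_precursor(["m1", "m2", "m3"], candidates))
```

Here peak m1 is uninformative, m3 separates only the least likely candidate, and m2 splits the probability mass evenly, so m2 is selected even if it is not the most intense peak.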
唐淳
Peking University

A personal guide for distilling protein structure and dynamics information from cross-linking MS data

Abstract:

Cross-linking mass spectrometry (XLMS) has been increasingly employed for the structural characterization of proteins and protein complexes. Photo- or chemical cross-linking connects two adjacent residues within a relatively short distance, and the cross-linked peptides can be identified by mass spectrometry with high confidence. However, there are three caveats associated with XLMS-based structural biology and structural proteomics:
  1. Proteins and protein complexes are usually dynamic. Therefore, the cross-linking reaction can capture and manifest alternative protein conformations, and the observed XLMS data should be interpreted with an ensemble of structures.
  2. Cross-linking implicitly involves two consecutive reactions. Thus, the dynamic timescale of the cross-linker relative to the dynamic timescale of the protein can affect the observed XLMS data. This reaction kinetics issue matters especially for intrinsically disordered proteins.
  3. XLMS data manifest inter-residue distances, which have mostly been represented as straight-line distances. Since the cross-linker cannot penetrate the protein, a new type of distance restraint has been developed to recapitulate the inter-residue solvent-accessible distance.
Together, proper care and controls are needed when characterizing protein structure and dynamics using XLMS data.
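The third caveat can be made concrete with a toy example. The sketch below compares a straight-line distance with the shortest path that stays outside an obstacle, using a 2D grid and breadth-first search. It is a deliberately simplified picture, not the restraint used by the speaker: the grid, the cell labels, and the function names are all assumptions made for illustration.

```python
import math
from collections import deque

def straight_line(a, b):
    """Euclidean distance between two grid positions."""
    return math.dist(a, b)

def shortest_open_path(grid, start, goal):
    """Breadth-first search over open cells ('.'), avoiding protein cells ('#').

    Returns the number of steps, or None if the goal cannot be reached.
    """
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), steps = queue.popleft()
        if (r, c) == goal:
            return steps
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] != "#" and (nr, nc) not in seen):
                seen.add((nr, nc))
                queue.append(((nr, nc), steps + 1))
    return None

# '#' marks the protein interior; '.' marks solvent-accessible space.
grid = [
    ".....",
    ".###.",
    ".###.",
    ".....",
]
start, goal = (1, 0), (1, 4)  # two surface positions on opposite sides

print(straight_line(start, goal))             # cuts through the interior
print(shortest_open_path(grid, start, goal))  # goes around it
```

The path around the obstacle is longer than the straight line, which is why a cross-link that looks feasible by straight-line distance may be infeasible once the protein body is taken into account.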

Keywords:

integrative structural biology, distance restraint, cross-linking mass spectrometry, reaction kinetics, protein dynamics
谭敏佳
Shanghai Institute of Materia Medica, Chinese Academy of Sciences

Proteomic characterization of protein post-translational modifications identifies new therapeutic opportunities

Abstract:

Protein post-translational modifications (PTMs) play fundamental roles in cellular physiology and disease development. Yet our current understanding of the inventory and function of PTMs is still limited. In this talk, I will present our recent studies on the systematic characterization of PTMs in several types of disease models and clinical samples using mass spectrometry-based proteomics technologies. Our previous study demonstrated targeting epigenetic crosstalk as a therapeutic strategy for EZH2-aberrant solid tumors. Integrative proteomics analysis of 103 cases of human lung adenocarcinoma enabled a more comprehensive understanding of its molecular landscape. Global identification of phospho-dependent SCF substrates revealed an FBXO22 phosphodegron and an ERK-FBXO22-BAG3 axis in tumorigenesis. Our integrative proteomics also identified the combination of DOT1L and SHP2 inhibitors as an effective treatment for a subset of KRAS-mutant cancers. These studies led to the identification of new mechanisms, biomarkers, and therapeutic approaches in diseases.

Keywords:

phosphorylation, acetylation, methylation, ubiquitylation, SCF substrates, lung adenocarcinoma, KRAS mutant cancer, combination therapy
田瑞军
Southern University of Science and Technology

MS-based protein complex profiling in time and space

Abstract:

Proteins are major building blocks of the cell and play structural, catalytic, and regulatory roles through more than 100,000 dynamic protein−protein interactions in the cell at any given time. To precisely coordinate these protein machines, it is critical that protein complexes form at the right time and in the right place. To systematically characterize these dynamic protein complexes, we have developed a series of integrated sample preparation methods for specifically capturing and enriching them directly from clinical samples and living cells. By combining these methods with advanced mass spectrometry, we can systematically identify functional protein complexes and accurately quantify their dynamic modulation in a spatiotemporal manner. In this talk, I will mainly present our recent progress in applying these new proteomic methods to characterize tyrosine phosphorylation-mediated membrane receptor complexes, and specifically discuss related bioinformatic efforts to improve analysis selectivity and sensitivity.

Keywords:

protein complex, integrated sample preparation, bioinformatics, signal transduction
瑕瑜
Tsinghua University

Large-Scale Lipid Profiling with Isomer Resolving Capabilities

Abstract:

Mass spectrometry (MS) has become a primary tool in lipidomics for global lipid identification and quantitation. Despite the fact that multi-level structural information is available from MS and tandem mass spectrometry (MS/MS), localization of carbon-carbon double bond (C=C) is difficult from current analysis workflows. Our group is interested in harnessing radical chemistry for enhanced lipid analysis. We have paired the Paternò–Büchi (PB) reaction with tandem mass spectrometry (PB-MS/MS) for pinpointing C=Cs in unsaturated lipids. Acetone and aryl ketones are used as the PB reagents. Upon 254 nm ultra-violet irradiation, the PB reagent adds onto a C=C via [2+2] cycloaddition, forming the PB products. Collision-induced dissociation (CID) of the PB products produces C=C diagnostic fragment ions, allowing localization of C=C as well as isomer quantitation. The PB-MS/MS approach has been applied for shotgun lipid analysis and more recently hyphenated with liquid chromatography (LC)-MS. The LC-PB-MS platform enabled large-scale identification of unsaturated glycerophospholipids. It was found that the ratios of C=C isomers were much less affected by interpersonal variations than their individual abundances, allowing more sensitive discovery of lipid markers.
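The last observation, that isomer ratios vary less between individuals than individual abundances, can be illustrated with a coefficient-of-variation comparison. The sketch below uses made-up intensities for two hypothetical C=C location isomers across four individuals. The isomer labels and all values are assumptions for illustration, not data from the study.

```python
from statistics import mean, stdev

def cv(values):
    """Coefficient of variation: sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

# Hypothetical diagnostic-ion intensities for two C=C isomers in four individuals.
isomer_d9 = [1000.0, 2000.0, 1500.0, 3000.0]
isomer_d11 = [520.0, 980.0, 760.0, 1480.0]

# Per-individual isomer ratio; overall abundance differences largely cancel out.
ratios = [a / b for a, b in zip(isomer_d9, isomer_d11)]

print(cv(isomer_d9), cv(isomer_d11), cv(ratios))
```

When both isomers scale together with total lipid abundance, the ratio stays nearly constant, so a shift in the ratio stands out more clearly than a shift in either abundance alone.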

Keywords:

lipidomics, unsaturated lipids, isomers, Paternò–Büchi reaction, tandem mass spectrometry
谢鹭
Shanghai Center for Bioinformation Technology

Proteogenomics analysis for tumor neoantigen prediction and identification

Abstract:

Neoantigens can function as actual antigens to facilitate tumor rejection and play a crucial role in cancer immunotherapy. However, timely and efficient identification of neoantigens is still a major obstacle to personalized neoantigen-based cancer immunotherapy. To this end, our previous studies provide a platform for identifying tumor neoantigens, which includes a database of human tumor neoantigen peptides, dbPepNeo; a proteogenomics neoantigen prediction pipeline, ProGeo-neo; and a machine learning algorithm for predicting neoepitope immunogenicity, INeo-Epp. With the application of neoantigens in immunotherapy and the development of proteogenomics, the requirements for the source range and prediction accuracy of neoantigens have increased. Therefore, we have extended the neoantigen prediction platform. dbPepNeo2.0 catalogs more than 800 experimentally validated immunogenic neoantigens (MHC-Ⅰ/MHC-Ⅱ) and 648 corresponding TCR sequences. In addition, 251 medium-confidence and 864,884 low-confidence neoantigens are included. Furthermore, dbPepNeo2.0 provides a deep learning model for predicting the immunogenicity of neoantigens based on a convolutional neural network, DeepCNN-Ineo. ProGeo-neo2.0 is an integrated computational pipeline to identify neoantigens based on proteogenomics, which can predict neoantigens from a variety of mutation types, including SNVs, InDels and gene fusions. In addition, ProGeo-neo2.0 adds functional modules for predicting neoantigens bound to MHC-Ⅱ molecules. Furthermore, to provide guidance for tumor types with a low tumor mutation load and to contribute to a comprehensive understanding of the tumor immune landscape, we extended the neoantigen source to noncoding regions and constructed a proteogenomics-based neoantigen prediction pipeline for noncoding regions, namely PGNneo. In summary, our work results in a proteogenomic platform to promote the prediction and confirmation of potential neoantigens in cancer immunotherapy.

Keywords:

neoantigen, proteogenomics, prediction platform, cancer immunotherapy
杨明坤
Institute of Hydrobiology, Chinese Academy of Sciences

Proteogenomic analyses for Genome Annotation and Global Profiling of Post-Translational Modifications

Abstract:

Proteogenomics refers to the use of mass spectrometry (MS)-derived proteomic data to annotate protein-coding genes and improve genome annotation quality. Currently there is no software dedicated to conducting eukaryotic proteogenomic analyses, and most previous studies did not address the whole process in a holistic way. We recently developed an integrated proteogenomic pipeline (GAPE) for annotating genomes and conducted a global analysis of PTMs common in eukaryotes. Using this pipeline, which was designed to be integrated and automated, we generated a customized protein database, performed spectral searches, integrated identified PSMs, identified novel events (novel proteins, AS genes, and SAAVs), revised annotated proteins, and performed global profiling of PTM events. Through this approach, we were able to unambiguously identify approximately 8300 genes and reveal 606 novel proteins, 506 revised genes, 94 splice variants, 58 single amino acid variants, and a holistic view of post-translational modifications in Phaeodactylum tricornutum. We experimentally confirmed a subset of novel events and obtained MS evidence for more than 200 micropeptides in Phaeodactylum tricornutum. The proteogenomic pipeline we developed in this study is applicable to any sequenced eukaryote and thus represents a significant contribution to the toolset for eukaryotic proteogenomic analysis.

Keywords:

proteogenomics, genome annotation, posttranslational modifications (PTMs), eukaryotes
张莹
复旦大学

MS-based Approaches for Analysis of Glycosylation and Application

Abstract:

Glycosylation is a complex form of protein modification occurring on eukaryotic proteins. It affects both the structure and function of proteins. To analyze protein N-glycosylation sensitively by MS, we have developed a series of new approaches. To selectively enrich glycopeptides, we explored several chemical reactions that occur specifically between glycoproteins and solid phases, including reductive amination and the oxime click reaction. These methods greatly reduced the enrichment time and improved the selectivity of N-glycoprotein analysis. Using the oxime click reaction, we further designed a cross-linker that labels glycans and glycoproteins on the bacterial surface in vivo and then cross-links the bacteria to their host interactors upon UV irradiation. This enabled a time-resolved chemical proteomics strategy, host and pathogen temporal interaction profiling (HAPTIP), for tracking the entry of a pathogen into the host cell. Moreover, to enable accurate quantification of the N-glycome, we developed several novel N-glycan quantitation approaches based on isotope labeling combined with mass spectrometric analysis, including metallic element chelated tag labeling (MeCTL) to increase sample throughput and duplex stable isotope labeling (DuSIL) to quantify sialylated and neutral glycans simultaneously. Recently, in response to the technical challenge of site-specific N-glycosylation analysis, we reported a chemical labeling strategy that improves the electron transfer dissociation efficiency of intact glycopeptides. This comprehensive glycosylation analysis strategy allows, for the first time, the discrimination of IgG3 and IgG4 intact N-glycopeptides, which are highly similar in sequence, without antibody-based pre-separation. In summary, these novel strategies support highly sensitive and specific MS analysis of protein glycosylation.

Keywords:

glycosylation, chemical proteomics, posttranslational modification, mass spectrometry
赵群
中国科学院大连化学物理研究所

Novel Methods for Chemical Crosslinking Based Protein Complex Analysis

Abstract:

Chemical cross-linking combined with mass spectrometry (CXMS) has emerged as a powerful tool that complements traditional technologies for studying protein structure and protein-protein interactions, with the advantages of providing direct interaction sites, being less time-consuming, and being less demanding on sample purity. However, the application of CXMS is still limited by factors such as the high complexity of CXMS samples and the low abundance of cross-linked peptides. In addition, precisely characterizing protein complexes and elucidating their functions requires in-situ analysis with minimal interference to the cell, as well as analysis of dynamic changes in conformation and interaction in both temporal and spatial dimensions. In response to these problems, our team has developed a series of methods to improve the depth of chemical cross-link identification and to realize in-situ dynamic analysis of protein complexes in temporal and spatial dimensions. The results suggest that the developed strategy may be a promising tool for the global analysis of protein complex assembly.

Keywords:

chemical cross-linking coupled with mass spectrometry (CXMS), protein complex, in-situ, dynamic analysis, temporal and spatial dimension
郑杰
中国科学院上海药物研究所

Hydrogen/Deuterium Exchange Mass Spectrometry reveals a tethering mechanism of MDA5-MAVS signaling cascade by long K63-polyUbiquitin chains

Abstract:

Homotypic ubiquitin chains play critical roles in a wide range of innate immune signaling pathways. MDA5 senses cytosolic viral RNAs and endogenous retroelements to activate MAVS via a CARD-CARD interaction. Here, we used HDX-MS to probe the K63-polyUbn-mediated tetramerization of MDA5CARDs. We resolved cryo-EM structures (a polyUb13-bound MDA5CARDs tetramer and a polyUb11-bound MDA5CARDs-MAVSCARD assembly) that resemble a hierarchical signaling tower. HDX-MS studies further reveal that MDA5-RNA engagement upon ATP binding and hydrolysis allows the multiprotein complex to remotely stabilize the CARDs-polyUb complex for sustained signaling prior to MAVS activation. Yet abundant ATP could prevent unwanted basal activation of apo MDA5 against unanchored K63-polyUb chains. Our data reveal a K63-polyUb-mediated tethering mechanism preferentially adopted by the MDA5-MAVS signaling cascade, which is crucial for antiviral responses and immune homeostasis.
周虎
中国科学院上海药物研究所

In-depth proteogenomic analysis characterizes the molecular features of HBV-related hepatocellular carcinoma

Abstract:

To obtain a comprehensive molecular understanding of Chinese HCC patients with HBV infection (CHCC-HBV), we performed in-depth proteogenomic analysis of paired tumor and non-tumor liver tissues from 159 HCC patients using whole-exome sequencing (WES), RNA-seq, proteomics and phosphoproteomics. WES data identified 10,235 mutated genes, including 20,369 non-silent point mutations and 1,363 small insertions-deletions. RNA-seq data analysis identified 19,860 protein-coding genes. Isobaric tandem mass tag (TMT)-based global proteomics identified 10,783 proteins; phosphoproteomics identified 59,746 highly reliable phosphosites from 9,224 phosphoproteins. We found that 35.2% (56/159) of the patients harbored an aristolochic acid (AA) signature (A:T>T:A transversions), and single amino acid mutations resulting from the AA signature were also identified in the proteomic data. Proteomic profiling identified three subgroups associated with clinical and molecular attributes including patient survival, tumor thrombus, genetic profile, and the liver-specific proteome. These proteomic subgroups have distinct features in metabolic reprogramming, microenvironment dysregulation, cell proliferation, and potential therapeutics. CTNNB1-associated ALDOA phosphorylation was validated to promote glycolysis and cell proliferation. Our study provides a valuable resource that significantly expands the knowledge of HBV-related HCC and may eventually benefit clinical practice.
朱力
军事医学科学院生物工程研究所

Proteomics in the Development of Vaccines against Bacterial Infections

Abstract:

Mass spectrometry-based proteomics has become one of the most commonly used methods in the life sciences. Vaccines are an important means of fighting pathogens, including bacteria and viruses. Here, applications of proteomics in research on bacterial pathogenesis and in vaccine development will be discussed. First, the identification of bacterial inteins and cyclic peptides might be improved by computational proteomics. In addition, computational algorithms also help in the screening of antigen epitopes (MHC-binding peptides from pathogens) and the design of delivery vectors (protein nanoparticles) during the vaccine development process.