PhD Scientific Days 2021

Budapest, 7-8 July 2021

MO_II_P: Molecular Sciences II. Posters

The Importance of Data Analysis in Glycoproteomics

Simon Sugár 1,2, Gábor Tóth 1,3, Fanni Bugyi 1,4, Károly Vékey 1, László Drahos 1, Lilla Turiák 1

1 Eötvös Loránd Research Network, Research Centre for Natural Sciences, MS Proteomics Research Group
2 Semmelweis University, Doctoral School of Pharmaceutical Sciences
3 Budapest University of Technology and Economics, György Oláh Doctoral School
4 Eötvös Loránd University, György Hevesy Doctoral School of Chemistry

Text of the abstract

Introduction:
The results of omics research projects are strongly influenced by data handling procedures. Careful data handling is especially important in quantitative mass spectrometry (MS)-based proteomics and glycoproteomics, where it helps minimize errors arising from multi-step sample preparation and MS analysis. Although this is widely recognized, there is currently no scientific consensus on the optimal way to analyze proteomics and glycoproteomics data. This is primarily because no two datasets are identical, so the ideal method may differ from one dataset to the next.
Aims:
The main goal of the research was to establish which statistical data analysis methods are best suited to intact N-glycopeptides from complex biological samples. Particular emphasis was placed on normalization and missing value imputation.
Methods:
The glycoproteomics dataset was obtained by nanoHPLC-MS/MS analysis of healthy and cancerous prostate tissue microarray biopsy samples. N-glycopeptides were identified using Byonic and then quantified using GlycoPattern. Subsequent data analysis and visualization were performed in R using RStudio.
Results:
Several of the most popular normalization and imputation methods, and their combinations, were tested. The presented results show how the choice of data analysis method influences a prostate tissue glycoproteomics dataset, including changes in the most important N-glycosylation metrics and in the statistically significant differences between healthy and cancerous tissues.
Conclusion:
The applied data analysis methods greatly influence the results and the subsequent biological interpretation of N-glycoproteomics research projects.
Funding:
Supported by the ÚNKP-20-3-I New National Excellence Program of the Ministry for Innovation and Technology from the source of the National Research, Development and Innovation Fund.

University and Doctoral School

Semmelweis University, Doctoral School of Pharmaceutical Sciences