Recovering Proteoforms from Cardiovascular Omics Datasets: A Multi-omics Secondary Analysis
PROJECT SUMMARY
Large-scale omics techniques, including proteomics and RNA-seq, have become important tools for identifying disease mechanisms and therapeutic targets. However, these experiments have largely not considered "proteoforms": protein variants encoded by the same gene, arising for example through alternative splicing (AS) and post-translational modifications (PTMs), that can serve different cellular functions and whose distributions are often altered in disease. In the heart in particular, alternative splicing is implicated in broad pathological processes in heart failure and cardiomyopathy, but at present we have a poor understanding of the expression status and molecular functions of many alternative splice isoform products at the protein level. We recently developed and optimized a computational pipeline that integrates RNA-seq and proteomics data to recover protein isoform information that is otherwise lost in proteomics analyses. Our goal now is to perform a targeted secondary analysis of publicly available quantitative proteomics data on heart diseases housed in persistent data repositories. Specifically, Aim 1 will (i) identify and quantify alternative splice isoforms in heart failure and atrial fibrillation proteomics data, using custom sequence databases constructed from RNA-seq data; and (ii) determine the intersections between AS isoforms and PTM sites at regulatory hotspots, with the aid of mass-tolerant open-search algorithms that can recover unexpected PTMs in proteomics data. By reanalyzing existing datasets with our pipeline, we aim to extract isoform-level knowledge from existing data, which we are confident has a strong likelihood of opening unforeseen avenues of research into heart diseases and of adding value to the rich data resources already available to our research community.