In shotgun proteomics, generally only a fraction of the peptides from a parent protein are actually detected. Because a large portion of the protein sequence goes undetected, it is often impossible to determine whether the expressed protein is present in a modified, spliced, or truncated form. Provided herein are methods and systems for analyzing polypeptides that increase the mean sequence coverage of a protein and combine that coverage with bioinformatics analysis in order to distinguish putative proteoforms with improved amino acid resolution. Aspects of the invention include (1) a deep sequencing strategy that provides more protein sequence coverage than is typically achieved, and (2) a computational approach to view protein expression across the full length of the protein and identify regions that are potentially subject to modification, splicing, or truncation. This technology has broad utility in proteomics and will be of particular use for the analysis of biosimilar protein drug therapeutics.
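As an illustration of what sequence coverage means in this context, the following is a minimal sketch in Python, not the claimed method. It assumes detected peptides are matched to the parent protein by exact substring search; the function names (`residue_coverage`, `sequence_coverage`, `uncovered_regions`) and the example sequence are hypothetical and chosen only for this example.

```python
from typing import Iterable, List, Tuple


def residue_coverage(protein: str, peptides: Iterable[str]) -> List[int]:
    """Count, for each residue, how many detected peptides overlap it.

    Assumption: peptides map to the protein by exact substring match.
    """
    depth = [0] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                depth[i] += 1
            # Look for further (possibly overlapping) occurrences.
            start = protein.find(pep, start + 1)
    return depth


def sequence_coverage(depth: List[int]) -> float:
    """Fraction of residues covered by at least one peptide."""
    if not depth:
        return 0.0
    return sum(1 for d in depth if d > 0) / len(depth)


def uncovered_regions(depth: List[int]) -> List[Tuple[int, int]]:
    """Half-open (start, end) index ranges with no peptide evidence."""
    regions = []
    start = None
    for i, d in enumerate(depth):
        if d == 0 and start is None:
            start = i
        elif d > 0 and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(depth)))
    return regions


# Hypothetical protein sequence and detected peptides.
protein = "MKTAYIAKQRQISFVKSHFSRQ"
peptides = ["TAYIAK", "SHFSR"]

depth = residue_coverage(protein, peptides)
coverage = sequence_coverage(depth)
gaps = uncovered_regions(depth)
```

The uncovered ranges are the stretches where a modification, splice variant, or truncation could not be ruled in or out from the detected peptides, which is why raising coverage improves the resolution at which proteoforms can be distinguished.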