Nanopore direct sequencing of proteins with up to 100% accuracy

    Proteins are the primary constituents of living organisms and carry out most of the activities of life. A biologically functional protein usually has a defined spatial structure that is described at several levels. The most important of these is the primary structure, i.e., the type and order of its amino acids, which determines the higher-order structure of the protein.

     

    Reading the primary structure of a protein directly, however, has been extremely challenging. In most cases, scientists infer the amino acid sequence from the gene sequence using the codon table. Because of post-transcriptional and post-translational modifications, the inferred sequence is not entirely accurate and can differ substantially from the actual amino acid sequence.

     

    On November 4, 2021, researchers at Delft University of Technology in the Netherlands published "Multiple rereads of single proteins at single-amino acid resolution using nanopores" in Science.

     

    The researchers successfully scanned and read the amino acid sequences of individual molecules using nanopore sequencing technology. A linearized DNA-peptide complex was passed slowly through a tiny nanopore, and from the changes in the magnitude of the ionic current the researchers were able to read out the sequence information of the peptide directly.

     

    The proteins of all living organisms are long peptide chains made up of around 20 different amino acids, much like necklaces strung from several types of beads. Existing protein sequencing technologies, however, are prohibitively expensive and fail to detect a large number of rare proteins.

     

    In recent years, nanopore sequencing technology has made it possible to directly scan and sequence individual DNA molecules. The new work demonstrates that the amino acid sequence of a protein can be read with a nanopore in much the same way as a DNA sequence.

     

    Over the past 30 years, nanopore-based DNA sequencing has evolved from an idea into a working device, and commercially available portable nanopore sequencers have been developed that serve the multi-billion dollar genome sequencing market, according to Prof. Cees Dekker, corresponding author of the study. "In our paper, we extend the concept of nanopores to the reading of individual proteins. This could have a significant impact on basic protein research and medical diagnostics."

     

    With this technique, researchers can precisely measure the magnitude of the current through the nanopore and infer the corresponding amino acid. More importantly, the process leaves the peptide chain intact, so the same chain can be read many times over. Combining all of the reads yields the sequence of the peptide chain with essentially 100% accuracy.
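
    To illustrate why rereading the same molecule helps, the following is a minimal sketch rather than the authors' analysis. It assumes that each read independently picks the correct one of two candidate variants with probability p_single, and that the final call is a simple majority vote. The function name and the example value 0.87 are hypothetical and not taken from the article.

```python
from math import comb


def majority_vote_accuracy(p_single: float, n_reads: int) -> float:
    """Probability that a strict majority of n_reads independent calls is correct.

    Assumes a two-way choice and independent reads (a simplification).
    """
    if n_reads < 1 or n_reads % 2 == 0:
        raise ValueError("n_reads must be a positive odd number to avoid ties")
    k_min = n_reads // 2 + 1  # smallest number of correct reads that wins the vote
    return sum(
        comb(n_reads, k) * p_single**k * (1 - p_single) ** (n_reads - k)
        for k in range(k_min, n_reads + 1)
    )


if __name__ == "__main__":
    # 0.87 is an illustrative single-read accuracy, not a figure from the article.
    for n in (1, 5, 11, 21, 31):
        print(n, majority_vote_accuracy(0.87, n))
```

    Under these assumptions the error of the combined call shrinks rapidly as reads are added, which is the intuition behind "essentially 100%" accuracy from many rereads. Real reads need not be independent, so the numbers are only indicative.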

     

    The technique may lay the groundwork for future protein sequencing, but for the moment de novo protein sequencing remains a major challenge, according to Dr. Henry Brinkerhoff, the paper's first author. "To develop a 'codebook' that maps electrical signals to protein sequences, we still need to characterize a huge number of electrical signals from various sequences." Nonetheless, the team was able to distinguish specific amino acid changes in protein sequences, a significant achievement with numerous direct applications.

     

    To summarize, a proteomics technology capable of directly reading protein sequences is critical for cell biology research and its applications. This study shows how the DNA helicase Hel308 can be used to move DNA-peptide conjugates through the biological nanopore MspA, so that the amino acid sequence of a linearized peptide can be read from changes in the current. More importantly, the approach can distinguish specific amino acid changes with high fidelity and throughput.
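
    As a rough illustration of how a single amino acid change could be told apart from the current signal, here is a minimal sketch under stated assumptions, not the method used in the study. Each read is reduced to a short list of current levels, one per step through the pore, and the read is assigned to whichever reference trace is closest in the least-squares sense. The function name, the variant labels, and all numbers are made up for the example.

```python
def closest_variant(measured, templates):
    """Return the name of the reference trace nearest to `measured`.

    Distance is the sum of squared differences between current levels.
    """
    def sq_dist(a, b):
        if len(a) != len(b):
            raise ValueError("traces must have the same number of steps")
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(templates, key=lambda name: sq_dist(measured, templates[name]))


# Hypothetical normalized current levels per step; not data from the paper.
templates = {
    "variant_A": [0.42, 0.55, 0.31, 0.60],
    "variant_B": [0.42, 0.48, 0.22, 0.60],
    "variant_C": [0.42, 0.58, 0.37, 0.60],
}
measured = [0.43, 0.49, 0.24, 0.59]

if __name__ == "__main__":
    print(closest_variant(measured, templates))
```

    With these made-up numbers the read is closest to variant_B, whose levels differ from the others only in the two middle steps. A real codebook would also have to handle noise, missed steps, and far more sequences.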

     

    This single-molecule peptide reader represents a significant advancement in protein identification, paving the path for the sequencing and classification of single-molecule proteins within single cells.