Computational Methods for Rational Oligonucleotide PCR Primer Design and Analysis: Two Scenarios Using GCG’s SeqLab. By Steven M. Thomson (stevet@bio.fsu.edu)

Introduction

The Polymerase Chain Reaction (PCR), developed at Cetus Corporation by Kary Mullis in the mid-1980s (Saiki et al., 1988), for which he won the Nobel Prize, and patented by Hoffmann-La Roche and Perkin-Elmer Corporation, has revolutionized modern molecular biology. From Jurassic Park scenarios in popular novels, to everyday research in countless laboratories across the world, to cutting-edge forensic pathology techniques, PCR is being used to analyze smaller quantities of DNA than was ever before imagined possible. PCR allows the investigator to analyze any stretch of DNA in any organism for which at least some sequence information is known, either in that organism or in related organisms. It can isolate, and amplify up to around a million-fold, just a few molecules of DNA from complex environmental mixtures, even where the DNA is significantly degraded; the ramifications are far-reaching. It has been employed, among many examples, to analyze DNA in Egyptian mummies, prehistoric insects preserved in amber, ancient fossilized leaves, and both ice-age frozen and tar-pit preserved mastodons and other animals from the ‘great age of mammals.’ Claims were even made of dinosaur DNA recovery from specimens recovered in a Utah coal mine, though the results were later shown to be contamination.

The practical applications are extensive in medicine, especially in prenatal genetics and, particularly with HIV, in immediate postnatal diagnosis. Other pathologies such as Lyme disease are also extremely amenable to PCR diagnosis. Molecular evolutionists now have a tremendous tool for inferring the phylogeny of any organism, whether or not it can be cultured. Forensics has also been transformed: investigators can now isolate DNA from incredibly obscure bits of physical evidence, à la CSI, to positively exclude suspects based on distinct patterns, or fingerprints, within their DNA. Using it to ‘prove’ guilt is more difficult because of the population genetics statistics involved; however, even these probabilities can be estimated to within several orders of magnitude. PCR has truly changed the face of molecular biology.

PCR is a modified primer extension reaction that uses a thermostable DNA polymerase. The enzyme’s heat stability allows newly formed complementary DNA strands to be dissociated by heat, and oligonucleotide probes then to be hybridized to the target regions, for further rounds of amplification. The scope and methods of PCR are vast, varied, and well beyond the aim of this workshop; I will not attempt to teach anything of the actual procedure. Refer to any good, modern text in molecular biology for details (for some good, early reviews of PCR methodology see Mullis [1990], White et al. [1989], and Cherfas [1990]). What I will attempt to teach is a rational method for inferring appropriate oligonucleotide probes, often known as primers, for PCR or hybridization screening analysis. These oligonucleotides are usually about 20 or more bases in length and target the beginning and ending locations of the region to be amplified.

Coupled with PCR techniques and/or ultrasensitive hybridization screening, oligonucleotide primers have allowed the ‘fishing out’ of thousands of genes from complex genomes that would previously have been extremely difficult even to find, let alone sequence. Present-day economical, automated synthesis and the ready availability of nucleotides have made primers commonplace. (This has also facilitated the development of reliable methods for introducing site-specific mutations into known sequences.) Because of the high specificity and adjustable stringency of oligonucleotide hybridization, knowledge of the sequence of a relatively short stretch of unique DNA is sufficient to rapidly isolate and/or amplify, clone if desired, and sequence the corresponding gene. Whatever technique one may use, however, primers are essential ingredients.

PCR and hybridization screening both require the design of appropriate primers. This can be a ‘hit-or-miss’ affair, or you can use computational methods to greatly improve the efficiency of the process. Several strategies can be imagined for the design of oligonucleotide primers. If an exact nucleotide sequence is known, then a single oligonucleotide probe for hybridization, or a pair of primers for PCR of a defined sequence, can simply be selected, tested, and synthesized. In the absence of a defined DNA sequence, a group of similar DNA sequences can sometimes be aligned and a consensus sequence created from which primers can be designed. However, this is often not possible because DNA can be very difficult to align. In some cases one may even be forced to work from either a small portion of a protein sequence obtained by Edman degradation or, as will be illustrated in this exercise, a consensus pattern from a group of related proteins; the luxury of using DNA directly is often not available.
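For the simplest case described above, where the exact nucleotide sequence is known, selecting a primer pair can be sketched in a few lines of Python. This is only an illustration: the function names and the 20-base default are assumptions rather than part of any GCG program, and a real design would still need the candidates tested as the text says.

```python
# Minimal sketch: take a PCR primer pair from the two ends of a known target.
# Function names and the 20-base default length are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(target, length=20):
    """Return (forward, reverse) primers, both written 5' to 3'.

    The forward primer is the first `length` bases of the given strand.
    The reverse primer is the reverse complement of the last `length` bases,
    so that it anneals to the given strand and extends back toward the start.
    """
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target is shorter than two primer lengths")
    forward = target[:length]
    reverse = reverse_complement(target[-length:])
    return forward, reverse
```

Calling `end_primers(sequence)` on a target of at least 40 bases returns both primers in the 5' to 3' orientation in which they would be ordered for synthesis.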

When nucleotide data is lacking or problematic, amino acid sequences can be back-translated to provide the necessary primers. In the absence of exact protein sequence data, a consensus pattern from a group of related proteins can often be used. Back translation is not a trivial chore, however, because of the degeneracy of the genetic code: there are 64 possible codons, 61 of which encode the 20 amino acids while the remaining 3 are stop signals. Because of this, many different back-translation probe techniques have been employed. Two of them are to use large pools of short oligonucleotides whose sequences are highly degenerate, or to use small pools, or even just one pair, of longer oligonucleotides of lesser or no degeneracy. All organisms have preferential biases in codon usage, and this information can be used to advantage in deciding which codons to synthesize out of all the possible choices. This strategy of choosing the longest defined stretches of unambiguous peptide and back-translating them to their most probable oligonucleotides is known as designing “guessmers.”
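To make the cost of degeneracy concrete, the following Python sketch counts how many distinct oligonucleotides a fully degenerate pool would need for a given peptide, and finds the least degenerate stretch of a protein. The codon counts are those of the standard genetic code; the function names are illustrative assumptions.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (61 in total).
CODONS_PER_AA = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}


def degeneracy(peptide):
    """Number of different DNA sequences that could encode `peptide`."""
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


def least_degenerate_window(protein, width):
    """Return (start, peptide, degeneracy) for the window of `width` residues
    whose back translation needs the smallest oligonucleotide pool."""
    protein = protein.upper()
    if width > len(protein):
        raise ValueError("window is longer than the protein")
    starts = range(len(protein) - width + 1)
    best = min(starts, key=lambda i: degeneracy(protein[i:i + width]))
    peptide = protein[best:best + width]
    return best, peptide, degeneracy(peptide)
```

A stretch rich in methionine and tryptophan needs very few oligonucleotides, while one rich in leucine, serine, or arginine multiplies the pool size by six at each such residue.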

Guessmers contain the combination of codons most likely to match the authentic gene. Guessmers work because the decrease in hybridization stability caused by mismatched bases is offset by an increase in stability from using longer sequences. In most cases, mismatches will occur in only the third position of incorrect codon choices and, therefore, at least two of the three bases will still be matched. Naturally, the biggest constraint on this type of strategy is that relatively long stretches of amino acid sequence are required. Because of this, guessmers are particularly appropriate when strong and sufficiently long consensus elements can be discovered in a protein family. They should be at least 30 nucleotides in length, in order to ensure sufficient hybridization despite potential mismatches, though PCR primers are seldom designed as long as hybridization probes. It is also not worth the extra effort to synthesize them longer than about 70 bases. For some early, very good descriptions of the factors involved in guessmer design and analysis, and for references to the primary literature, see Sambrook et al. (1990) and Wood (1987).
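The idea of a guessmer can be sketched as a simple table lookup: each residue is back-translated to a single preferred codon. In the Python sketch below, the `PREFERRED_CODON` table is an illustrative placeholder and not real codon-usage data for any particular organism; it would have to be replaced with a codon-usage table for the organism under study. The function names and the way the 30 to 70 nucleotide guideline is checked are likewise assumptions.

```python
# Placeholder codon preferences for illustration only. Each codon does encode
# the amino acid shown, but the choice among synonymous codons is an
# assumption; substitute a real codon-usage table for the target organism.
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
}


def guessmer(peptide, preferred=PREFERRED_CODON):
    """Back-translate `peptide` using one preferred codon per residue."""
    return "".join(preferred[aa] for aa in peptide.upper())


def length_ok(oligo, shortest=30, longest=70):
    """Check the rough guideline of about 30 to 70 nucleotides."""
    return shortest <= len(oligo) <= longest
```

Under the 30-nucleotide guideline, a guessmer needs at least ten consecutive, unambiguous residues of peptide sequence.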

The first portion of today’s tutorial will explore guessmer design. In order to discover possible consensus patterns within a known protein family for the design of a guessmer, the individual members must be maximally aligned and then a consensus must be created. Alignment is usually achieved through an automated progressive, pairwise alignment procedure, here the GCG program PileUp, which inserts gaps to align the full length of its members. Other automated alignment methods are also available, such as Thompson and Higgins’ ClustalW (1994), Smith and Smith’s PIMA (1995), and Gupta et al.’s MSA (1995), as are several different manual alignment editors. Consensus sequences can then be created from the alignment. Many methods merely rely on the positional frequency of individual symbols; however, some utilize much more information. Profile analysis (Gribskov et al., 1989) is one of these. It takes advantage of the BLOSUM (Henikoff and Henikoff, 1992) and Dayhoff-style (Schwartz and Dayhoff, 1979) scoring matrices to weight the relative conservation of the various amino acid substitutions within the alignment. Therefore, the resultant consensus residues are the most evolutionarily conserved rather than just the statistically most frequent. This can mean much more to us than an ordinary consensus and is especially appropriate in the design of the type of guessmer that we will be simulating, that is, a situation in which much sequence information for the protein of interest is known in other organisms but not in the one we are studying.
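For contrast with profile analysis, the simpler positional-frequency approach mentioned above can be sketched as follows. This is a plain majority-rule consensus, not the profile method and not what any GCG program does internally; the function name, the 50% threshold, and the `x` symbol for unresolved columns are all assumptions.

```python
from collections import Counter


def majority_consensus(alignment, min_fraction=0.5, unknown="x", gap="-"):
    """Majority-rule consensus of equal-length aligned sequences.

    A column's most common residue is reported only if it occurs in at least
    `min_fraction` of all sequences; otherwise `unknown` is reported.
    Columns consisting only of gaps are reported as a gap.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    consensus = []
    for column in zip(*alignment):
        residues = [symbol for symbol in column if symbol != gap]
        if not residues:
            consensus.append(gap)
            continue
        symbol, count = Counter(residues).most_common(1)[0]
        consensus.append(symbol if count / len(column) >= min_fraction else unknown)
    return "".join(consensus)
```

A frequency consensus of this kind treats every substitution as equally costly, which is exactly the limitation that a scoring-matrix-based profile is meant to address.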

I will illustrate the design of guessmers using the prion protein as an example. The prion molecule is responsible for a debilitating disease in animals and yet is encoded by the organism’s own DNA; the gene is expressed in both normal and afflicted cells. Large amounts of proteinaceous plaques aggregate and are deposited in the brains of afflicted animals. The prion protein has an unknown natural function but is found in very high quantities in the brains of those afflicted with degenerative neurological diseases: scrapie and Bovine Spongiform Encephalopathy in wild stock, and kuru, Creutzfeldt-Jakob Disease, or Gerstmann-Sträussler Syndrome in humans. It is also involved in Fatal Familial Insomnia and gained notoriety as the harbinger of “Mad-Cow Disease.” In humans the gene maps to position 20p12-pter and the disease can be inherited in an autosomal dominant fashion. Seventeen pathologic allelic variants are listed in OMIM (1995). One of the most peculiar aspects of the prion is that no infective nucleic acid entity has ever been found, yet the protein particle itself is highly infectious. Somehow the infectious protein particle induces a posttranslational, pathological change in the host’s normal protein that converts it to the aberrant isoform. The primary amino acid sequence is not changed; only the structural conformation of the protein is different. Stanley B. Prusiner of the University of California, San Francisco, won the 1997 Nobel Prize in Physiology or Medicine, awarded by the Karolinska Institute in Stockholm, for his work on this system. For further information, see Prusiner’s article in Science, available on the World Wide Web at: http://www.sciencemag.org/feature/data/prusiner/245.shl.

The second scenario utilizes a human papillomavirus (HPV) dataset. HPV is known to be associated with many varieties of human genital cancers. The DNA from certain types of HPV, in particular types 16 and 18, has been found integrated into various sites on human chromosomes, especially 12q13, and is often associated with the cis-activation of cellular oncogenes and/or the establishment of heritable fragile sites (OMIM). HPV exists in a dizzying number of genetic types: there are almost 2000 HPV nucleotide sequences, including around 50 complete HPV genomes, in GenBank (Bilofsky et al., 1986)! Some types appear relatively benign while others have powerful etiologic roles.

The ability to easily discriminate between HPV types is obviously a valuable diagnostic tool. PCR provides a proven methodology for achieving just this. The gene for the HPV major capsid protein, known as L1, has proven to be a reliable locus for this technique. The HPV viral coat is largely built from this protein, which therefore represents the first and major antigen presented to the host. Hence, the selective pressure on the molecule is quite intense: it evolves quickly enough to provide sufficient variation between types for screening purposes and yet has strongly conserved areas that provide for ‘universal’ primers. One paired set, the so-called MY09/11 consensus, has been extensively used for this purpose. For other historic examples, see the articles by Tenti, Nagano, Stewart and their collaborators (all 1996).

I have already prepared a multiply aligned DNA sequence dataset of the L1 region from about 50 different HPV sequences most similar to type 16 for the second scenario. This dataset will not require the design of guessmers, as these sequences have a high enough degree of similarity to make this region quite easy to align at the DNA level. From the multiple sequence alignment provided, you will be able to design your own ‘universal’ and type/strain-specific primers. Furthermore, using the GCG primer design software, you can test the efficiency of the commercial MY09/11 universal set and compare it with your newly designed primers. Finally, you can review the results of a database search that I completed using the MY09/11 primers to see just how specific and/or universal they are for HPV L1 genes.
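The search for ‘universal’ primer sites in a DNA alignment can be sketched as a scan for windows in which every sequence agrees. The Python below is a minimal illustration of that idea only; it is not the GCG primer design software, and the function name, the 20-base default window, and the `max_variable` tolerance are assumptions.

```python
def conserved_windows(alignment, width=20, max_variable=0, gap="-"):
    """Find candidate 'universal' primer sites in a DNA alignment.

    Returns a list of (start, window) pairs, where `window` is taken from the
    first sequence. A window qualifies if none of its columns contains a gap
    and at most `max_variable` of its columns differ between sequences.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must all have the same length")

    columns = list(zip(*alignment))
    has_gap = [gap in column for column in columns]
    is_variable = [len(set(column)) > 1 for column in columns]

    hits = []
    for start in range(length - width + 1):
        end = start + width
        if any(has_gap[start:end]):
            continue  # a gapped column cannot sit under a primer
        if sum(is_variable[start:end]) <= max_variable:
            hits.append((start, alignment[0][start:end]))
    return hits
```

Type-specific primers call for the opposite criterion, namely windows in which one sequence differs from all the others, and either kind of candidate would still need to be checked with proper primer analysis software.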


