This optional lab contains two problems dealing with detecting segments in transmembrane proteins. At the end of the lab is a section on displaying these proteins. This exercise assumes that you will return to earlier labs if you need help with a given web site.
Lab 1 help guide Lab 2 help guide Lab 3 help guide
Introduction
Transmembrane proteins are very interesting and yet difficult to study. This is due to the complex nature of the proteins themselves and the fact that it is extremely difficult to determine their 3D structures.
Transmembrane proteins can contain from 1 to over 20 transmembrane segments. A transmembrane segment can either completely span a membrane or be only partially inserted into it. Both helical and sheet secondary structural elements can be used to span a membrane. A single protein may perform the entire transporter function or be one of several proteins that together perform it.
Transmembrane prediction programs are all based on the premise that a primarily hydrophobic ("phobic") stretch of consecutive amino acids in such a protein forms a helix that spans the membrane. The earliest method of predicting transmembrane segments was simply to examine the protein's hydrophobic profile. However, this technique cannot distinguish transmembrane segments from phobic core regions in a protein.
The main difference between the various prediction programs is the way in which each defines the characteristics of a transmembrane segment. Not all transmembrane segments show the same characteristics. Some are composed entirely of phobic amino acids; others have some charged residues. The ends of the segments that lie at the membrane faces need to be compatible with the hydrophilic ("philic") environment found there. This observation is sometimes used to help separate transmembrane segments from phobic protein core regions.
Because transmembrane proteins can and do interact with ions and/or small charged molecules, there need to be charged residues within the structure of the channel they form. None of the existing tools is very good at predicting this type of segment.
There currently is no prediction technique that works for those transmembrane proteins which possess sheets spanning a membrane.
Additional information is often required in order to be able to properly assign the transmembrane segments. This information can come from physical experiments, motif determinations or database searches.
Additional information:
Computational Methods for Studying Transmembrane alpha-Helices (Yale)
Problem 1
>trans1
MDLKESPSEGSLQPSSIQIFANTSTLHGIRHIFVYGPLTIRRVLWAVAFVGSLGLLLVES
SERVSYYFSYQHVTKVDEVVAQSLVFPAVTLCNLNGFRFSRLTTNDLYHAGELLALLDVN
LQIPDPHLADPSVLEALRQKANFKHYKPKQFSMLEFLHRVGHDLKDMMLYCKFKGQECGH
QDFTTVFTKYGKCYMFNSGEDGKPLLTTVKGGTGNGLEIMLDIQQDEYLPIWGETEETTF
EAGVKVQIHSQSEPPFIQELGFGVAPGFQTFVATQEQRLTYLPPPWGECRSSEMGLDFFP
VYSITACRIDCETRYIVENCNCRMVHMPGDAPFCTPEQHKECAEPALGLLAEKDSNYCLC
RTPCNLTRYNKELSMVKIPSKTSAKYLEKKFNKSEKYISENILVLDIFFEALNYETIEQK
KAYEVAALLGDIGGQMGLFIGASILTILELFDYIYELIKEKLLDLLGKEEDEGSHDENVS
TCDTMPNHSETISHTVNVPLQTTLGTLEEIAC

The classic means of finding transmembrane segments is to determine a hydropathic profile for the protein in question. A description of the process is given below (text taken from the Weizmann help pages).
Calculating hydrophilicity/hydrophobicity
The hydropathic profile of a protein is calculated by assigning each amino acid a numerical value ("hydropathy index") and then repetitively averaging these values along the peptide chain. The values assigned to each amino acid can either be the Hopp-Woods values or the Kyte-Doolittle values. The window length over which the hydropathy indices are averaged must also be set. This changes the hydropathy of the protein and will affect which and how many proteins are returned in the protein search. Further information about the calculation of hydrophilicity can be found in
- Hopp TP, Woods KR. "Prediction of protein antigenic determinants from amino acid sequences." Proc. Nat. Acad. Sci. USA 78(6): 3824-3828, June 1981.
- Kyte J, Doolittle RF. "A Simple Method for Displaying the Hydropathic Character of a Protein." Journal of Molecular Biology 157(1): 105-132, 1982.
Visit the Weizmann Institute's Hydropathic Profile site and determine the protein's profile. This time change the window sizes (7, 17, 27) to see how this affects the resulting profiles.
[Weizmann's Hydropathic Profile run - input format: raw sequence] The profile determination uses software based on a windowing technique, and the size of the window affects the output. Too small a window produces a noisy graph; too large a window smooths the graph and may remove the signal altogether. The window size should be about the size of the feature being sought (here a transmembrane helix, assumed to be about 20 residues long). A window of 27 will smooth the signal somewhat, but should leave the strong phobic peaks intact.
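The windowed averaging described above can be sketched in a few lines of Python. This is only an illustration of the technique, not the Weizmann server's own code. The function name `hydropathy_profile` is hypothetical; the index values are the published Kyte-Doolittle scale, and the test fragment is the first 60 residues of trans1.

```python
# Kyte-Doolittle hydropathy indices (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window):
    """Return (center_residue, mean_hydropathy) for every full window.

    Residue numbers are 1-based. The center is exact for odd windows;
    for even windows it is the residue just right of the midpoint.
    """
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    if not 1 <= window <= len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[aa] for aa in seq]
    half = window // 2
    profile = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        profile.append((start + half + 1, mean))
    return profile


if __name__ == "__main__":
    # First 60 residues of trans1, used here only as a short demo input.
    fragment = "MDLKESPSEGSLQPSSIQIFANTSTLHGIRHIFVYGPLTIRRVLWAVAFVGSLGLLLVES"
    for window in (7, 17, 27):
        pos, score = max(hydropathy_profile(fragment, window), key=lambda p: p[1])
        print(f"window {window}: highest mean {score:.2f} centered at residue {pos}")
```

Running it with the three window sizes shows how the strongest peak shifts and flattens as the window grows. The values need not match the server's plot exactly, since the server may use a different scale or edge handling.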
1. Where are the phobic regions in trans1? How big are these regions?
Protein characterization can also provide information that can be used to identify transmembrane segments. There are a number of motifs that are transmembrane specific. Other motifs can be used to identify possible extracellular regions.
Visit one of the EXPASY Prosite sites (Canada, China, Korea, Taiwan, USA) to find text pattern-based functional motifs in your sequence. Choose the option to Exclude patterns with a high probability of occurrence. Run the scan by clicking the START THE SCAN button.
[EXPASY's Prosite run - input format: raw sequence] On the results page follow the PDOC link to detailed information about the located pattern. Record your findings.
2. What types of motifs were located? Are they membrane related?
Repeat this process, this time not excluding high probability patterns. Many of these patterns are short and can give false positive hits, but they also can help identify possible extracellular regions of a transmembrane protein. ASN_GLYCOSYLATION is one of these patterns. N-glycosylation sites, if real, are on the extracellular side of a membrane. Record your findings.
3. Is there any indication of an extracellular portion to trans1? Where is it located?
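To see why short patterns such as ASN_GLYCOSYLATION produce many hits, it can help to scan for the pattern directly. The sketch below assumes the PROSITE definition of this pattern, N-{P}-[ST]-{P}, and translates it into a regular expression. The function name `find_nglyc_sites` is hypothetical, and a match is only a candidate site, not evidence of actual glycosylation.

```python
import re

# PROSITE ASN_GLYCOSYLATION pattern N-{P}-[ST]-{P}:
# Asn, any residue except Pro, Ser or Thr, any residue except Pro.
# The lookahead lets overlapping candidate sites all be reported.
NGLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_nglyc_sites(seq):
    """Return (1-based position of the Asn, matched tetrapeptide) pairs."""
    seq = "".join(seq.split()).upper()  # drop whitespace and line breaks
    return [(m.start() + 1, m.group(1)) for m in NGLYC.finditer(seq)]


if __name__ == "__main__":
    # First 60 residues of trans1, used here only as a short demo input.
    fragment = "MDLKESPSEGSLQPSSIQIFANTSTLHGIRHIFVYGPLTIRRVLWAVAFVGSLGLLLVES"
    for position, site in find_nglyc_sites(fragment):
        print(position, site)
```

Comparing the reported positions with the predicted transmembrane segments gives a first hint about which loops might face the extracellular side.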
Determine profile-based functional information for trans1 by going to the ProfileScan server at ISREC. Select all the database options you can.
[ISREC's PSCAN run - input format: raw or fasta sequence] Follow any new database profile documentation links to find out more about the protein. Record your findings.
4. Were there any new hits? If so, did they provide any new information?
Use trans1 in the following transmembrane prediction sites to determine if it is a transmembrane protein.
HMMTOP SOSUI TMHMM PRED-TMR TMAP (single) TMPRED
[HMMTOP run - input format: raw sequence] [SOSUI run - input format: raw sequence] [TMHMM run - input format: fasta file]
[PRED-TMR run - input format: raw sequence or fasta file] [TMAP run - input format: raw sequence] [TMPRED run - input format: raw sequence]
5. Do all the sites agree? What is the majority opinion on the trans1 protein?
6. Where are the transmembrane segments?
7. Which of the possible regions agree with the hydrophobic profile data?
Next, determine if trans1 is similar to any annotated proteins at NCBI by doing a BLASTP search against the nr database. Be sure to have filtering turned off.
[BLASTP run - input format: fasta file] Examine the list of hits to find a perfect match for the entire sequence. Go to the documentation for that hit and see if it contains any information about the function of the protein and its features. Record that information.
8. How many identical sequences with different names were there for the best hit?
9. Of those links that gave feature information, do they agree? How do the "feature information" parts differ?
This protein has been known by a number of names, including MDEG and ASIC2, which can make searching the literature for information on it confusing.
In a review article on this family of transporters, the following statements are made.
... The existence of slight differences between the theoretical and experimental molecular weight of the extracellular domains suggested to Renard et al. (116) that the structure of the two transmembrane domains might be more complex than classical hydrophobic alpha-helices. It probably involves structures similar to the pore loop found in several voltage-dependent ion channels (56). ...
... Many experimental observations confer a high functional importance to this region (37,49,64,65,116,125,147) and are consistent with a structural model in which the second transmembrane domain is divided into two distinct parts. The COOH-terminal segment is likely to correspond to a classical transmembrane alpha-helix. One section of this helix interacts with ions and with the amiloride molecule, while the others interact with the lipid bilayer. The NH2-terminal segment in which two putative beta-strand structures are linked by a coil region that contains one (or two) conserved glycine(s), participates in the formation of the ion pore (20,49,116). ...
The pore loop region for the MDEG proteins was given as being at residues 422-440 in figure 2 of the review article.
- Barbry P, Hofman P., "Molecular biology of Na+ absorption.", Am J Physiol. 1997 Sep; 273(3 Pt 1):G571-85. Review.
One of the principal investigators in this area, Dr. Michel Lazdunski, has published the following statements about the protein.
"The region located just before TM2 has been shown to be important for gating and for the pharmacological properties of ASIC2. The ASIC2 mutations G430T, G430V or G430F, which correspond to mutations leading to gain of function or neurodegeneration in C. elegans degenerins, cause large changes in pH dependence and the inactivation process as well as changes in amiloride sensitivity. Participation of the pre-TM2 region in ion selectivity and conductance has also been established for ENaC."
"This study suggests that the pre-TM1 domain contributes to the ion pore of the ASICs and plays a crucial role in the selectivity filter, possibly as a re-entrant loop from the cytoplasmic side. The present data suggests a model for the ion pore of ASICs that involves the transmembrane domains (TM2 and perhaps TM1) and the pre-TM2 and pre-TM1 regions."
Table 1 of the paper places the pre-TM1 region at residues 17-28, with the TM1 segment starting at residue 42.
Coscoy S, de Weille JR, Lingueglia E, Lazdunski M. "The Pre-transmembrane 1 Domain of Acid-sensing Ion Channels Participates in the Ion Pore.", J Biol Chem. 1999 Apr 9;274(15):10129-32.
Taking this information into account along with the data from the BLASTP search reference and your previously collected information, create a model for the trans1 protein that fits.
Problem 2
>trans2
MTRAGDHNRQRGCCGSLADYLTSAKFLLYLGHSLSTWGDRMWHFAVSVFLVELYGNSLLL
TAVYGLVVAGSVLVLGAIIGDWVDKNARLKVAQTSLVVQNVSVILCGIILMMVFLHKHEL
LTMYHGWVLTSCYILIITIANIANLASTATAITIQRDWIVVVAGEDRSKLANMNATIRRI
DQLTNILAPMAVGQIMTFGSPVIGCGFISGWNLVSMCVEYVLLWKVYQKTPALAVKAGLK
EEETELKQLNLHKDTEPKPLEGTHLMGVKDSNIHELEHEQEPTCASQMAEPFRTFRDGWV
SYYNQPVFLAGMGLAFLYMTVLGFDCITTGYAYTQGLSGSILSILMGASAITGIMGTVAF
TWLRRKCGLVRTGLISGLAQLSCLILCVISVFMPGSPLDLSVSPFEDIRSRFIQGESITP
TKIPEITTEIYMSNGSNSANIVPETSPESVPIISVSLLFAGVIAARIGLWSFDLTVTQLL
QENVIESERGIINGVQNSMNYLLDLLHFIMVILAPNPEAFGLLVLISVSFVAMGHIMYFR
FAQNTLGNKLFACGPDAKEVRKENQANTSVV

Visit the Weizmann Institute's Hydropathic Profile site and determine trans2's profile.
[Weizmann's Hydropathic Profile run - input format: raw sequence]
1. How many possible transmembrane segments are there in trans2?
Go to one of the EXPASY Prosite sites (Canada, China, Korea, Taiwan, USA) and find any text pattern-based functional motifs in trans2. Choose the option to Exclude patterns with a high probability of occurrence.
[EXPASY's Prosite run - input format: raw sequence] On the results page follow the PDOC link to detailed information about the located pattern. Record your findings.
2. Did the found hit(s) provide any real insight on trans2? What is the nature of the found motif(s)?
Determine trans2's profile-based functional information by going to the ProfileScan server at ISREC. Select all the database options you can.
[ISREC's PSCAN run - input format: raw or fasta sequence] Follow any new database profile documentation links to find out more about the protein. Record your findings.
3. What information did the new hits add to the trans2 story?
4. Was the documentation consistent as to the information it provided? What were the inconsistencies?
Use trans2 in the following transmembrane prediction sites to determine if it is a transmembrane protein or not.
HMMTOP SOSUI TMHMM PRED-TMR TMAP (single) TMPRED
[HMMTOP run - input format: raw sequence] [SOSUI run - input format: raw sequence] [TMHMM run - input format: fasta file]
[PRED-TMR run - input format: raw sequence or fasta file] [TMAP run - input format: raw sequence] [TMPRED run - input format: raw sequence]
5. What was the range in the number of predicted transmembrane segments?
6. How many of the regions are consistently predicted by the 6 different methods?
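One way to compare the six result sets is to count, for each residue, how many methods place it inside a transmembrane segment, and then report the stretches that reach a chosen number of votes. The sketch below is only a bookkeeping aid under that assumption. The function name `consensus_regions` is hypothetical, and the segment boundaries in the demo are made-up placeholders, not output from any of the servers; substitute the ranges you recorded.

```python
def consensus_regions(predictions, seq_len, min_votes):
    """Return (start, end) stretches covered by at least min_votes methods.

    predictions maps a method name to a list of (start, end) segments,
    using 1-based, inclusive residue numbers.
    """
    votes = [0] * (seq_len + 1)  # index 0 is unused
    for segments in predictions.values():
        covered = set()
        for start, end in segments:
            covered.update(range(start, end + 1))
        for pos in covered:  # each method votes at most once per residue
            if 1 <= pos <= seq_len:
                votes[pos] += 1

    regions = []
    run_start = None
    for pos in range(1, seq_len + 1):
        if votes[pos] >= min_votes:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            regions.append((run_start, pos - 1))
            run_start = None
    if run_start is not None:
        regions.append((run_start, seq_len))
    return regions


if __name__ == "__main__":
    # Placeholder segments for illustration only; replace with your results.
    recorded = {
        "method_A": [(5, 10)],
        "method_B": [(8, 12)],
        "method_C": [(9, 20)],
    }
    for needed in (1, 2, 3):
        print(needed, consensus_regions(recorded, 25, needed))
```

Raising `min_votes` toward the number of methods isolates the core of each segment that every program agrees on, while a low threshold shows the widest plausible boundaries.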
In the paper referenced below these statements were made.
... The Ireg1 cDNA contained an open reading frame (ORF) encoding a novel protein of 570 amino acids with ten predicted transmembrane domains, and yielding a predicted 62 KDa protein. ...
... The IREG1 IRE-like motif sequence shows 100% conservation between the human, rat, and mouse species. ...
McKie AT, Marciani P, Rolfs A, Brennan K, Wehr K, Barrow D, Miret S, Bomford A, Peters TJ, Farzaneh F, Hediger MA, Hentze MW, Simpson RJ., "A novel duodenal iron-regulated transporter, IREG1, implicated in the basolateral transfer of iron to the circulation.", Mol Cell. 2000 Feb; 5(2):299-309.
The paper does not reference the process used to obtain this transmembrane prediction. The alignment given for human, rat and mouse IREG1 sequences shows them to be highly conserved. However, the human sequence is the most divergent of the three.
Determine if trans2 is similar to any annotated proteins at NCBI by doing a BLASTP search against the nr database. Be sure to have filtering turned off.
[BLASTP run - input format: fasta file] Examine the list of hits to find a perfect match for the entire sequence. Go to the documentation for that hit and see if it contains any new information about the function of the protein.
7. What new information did this process provide?
Save the mouse and rat sequences in fasta format on your local machine. Then combine them with the original sequence into a single multiple-sequence fasta file, and submit that file to the ClustalW site to generate your own alignment of the three sequences.
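A multiple-sequence fasta file is simply the individual records written one after another, each with its own `>` header line. If you prefer to build the file with a script rather than a text editor, a minimal sketch follows. The function name `to_multi_fasta`, the record names and the output file name are hypothetical, and the mouse and rat sequences must be the ones you saved from your own BLASTP results.

```python
def to_multi_fasta(records, width=60):
    """Format (name, sequence) pairs as one multi-record FASTA string."""
    lines = []
    for name, seq in records:
        cleaned = "".join(seq.split()).upper()  # drop whitespace and line breaks
        lines.append(">" + name)
        # Wrap the sequence at a fixed width, as in the listings in this lab.
        lines.extend(cleaned[i:i + width] for i in range(0, len(cleaned), width))
    return "\n".join(lines) + "\n"


# Example use (names are illustrative; fill in your own saved sequences):
#
#   records = [("trans2_human", human_seq),
#              ("ireg1_mouse", mouse_seq),
#              ("ireg1_rat", rat_seq)]
#   with open("ireg1_three.fasta", "w") as handle:
#       handle.write(to_multi_fasta(records))
```

Keeping the header names short and free of spaces makes the resulting alignment easier to read.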
Pick the sequence with the least amount of variation and use that in the transmembrane sites given above. Use your alignment and the transmembrane data from the two sequences to firm up the locations of the transmembrane regions in the human sequence. Small changes in sequence can greatly impact these prediction techniques, especially if a philic residue in a possible region is changed to a phobic one.
Use the data you collected, plus the information from the paper, to create a model of the trans2 protein. Consider N-glycosylation sites to be an indication of extracellular location. However, this can be overridden if all the prediction methods say that an area is part of a transmembrane segment.
Presenting 2D transmembrane structural information can be difficult. In the past this required the use of a drawing program and lots of effort to generate a suitable image.
Use the TOPO2 site to generate images of your two predictions. You will need to: paste in raw sequence data, provide the number and location of segments, provide the type of each segment (whether it crosses the membrane or is a partial looping one), and tell the program on which side of the membrane the protein starts.