Exercises in Data Retrieval and Using BLAST Searches

[The NCBI web site and its parts are updated periodically, so the
results given below may change with time.]

Step by step instructions with screen shots are given at the end of each exercise.
Instructions not in the step by step section of the handout are given in green type.


Table of Contents:


#1 What is the breadth of information available at NCBI on cystic fibrosis in humans?
  Hints: Do an All Databases search at NCBI (http://www.ncbi.nlm.nih.gov). Then repeat the search, narrowing the returned hits to human.

  Answer: Lots of data to be explored.
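
  The narrowed search in the hint can also be written as a query URL for NCBI's E-utilities instead of being typed into the web form. The following is a minimal sketch, assuming the public esearch endpoint and the standard [Organism] field tag. The helper name build_esearch_url and the choice of the gene database are illustrative and are not part of the handout.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities esearch service.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, organism=None, retmax=20):
    """Return an esearch URL for `term` in database `db`.

    Hypothetical helper: it only builds the URL and does not contact NCBI.
    """
    if organism:
        # Restrict the hits to one organism, as in "narrowing to human".
        term = f"({term}) AND {organism}[Organism]"
    return ESEARCH + "?" + urlencode({"db": db, "term": term, "retmax": retmax})


# Example: cystic fibrosis records in the gene database, human only.
url = build_esearch_url("gene", "cystic fibrosis", organism="Homo sapiens")
print(url)
```

  Changing the db argument (for example to "protein" or "pubmed") repeats the same query against another database, which is roughly what the All Databases page does in one step.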

#1 step by step instructions


#2 Besides the cystic fibrosis transmembrane conductance regulator gene (CFTR), what other genes are associated with cystic fibrosis in humans and what are their roles in the disease?
  Hints: Perform an Entrez Gene search (http://www.ncbi.nlm.nih.gov) to find the other genes and their function or relationship to the disease.

#2 step by step instructions


#3 Nocturnal asthma is associated with what gene in humans? What are the RefSeq codes for this gene's mRNA and protein sequences? On the GenBank accession pages, data can be displayed in different formats. What is the difference between the default and FASTA formats for these sequence files? How can these RefSeq codes be used to search for similar sequences in other species? What are the results of such a search?
  Hints: Do a Gene search at NCBI (http://www.ncbi.nlm.nih.gov) and record the codes. Compare the formats of the mRNA and protein sequences. Run a BLAST search (http://www.ncbi.nlm.nih.gov/BLAST/).
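
  To make the format comparison concrete: a FASTA file holds only a header line beginning with ">" followed by the sequence, whereas the default GenBank display also carries annotation. The sketch below parses FASTA text into header and sequence pairs. The function name parse_fasta is an assumption for illustration, and the example record is an invented placeholder, not a real RefSeq entry.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Placeholder record: the identifier and residues are invented for illustration.
example = """>EXAMPLE_0001 hypothetical protein
MKTAYIAKQR
QISFVKSHFS
"""
records = parse_fasta(example)
print(records)
```

  The header line is where the accession code appears in a real FASTA download, which is why the FASTA form is convenient to paste into a BLAST query box.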

#3 step by step instructions


#4 Are there any solved protein crystal structures for the product of the nocturnal asthma gene? Does the structure include the transmembrane segments? Are the protein in the structure and the nocturnal asthma protein related closely enough to believe the results?

  Hints: Use the protein accession code from the previous exercise and run a protein BLAST search (http://www.ncbi.nlm.nih.gov/BLAST/). This time, instead of using the default database, use the Swiss-Prot (swissprot) database and a structure database. Compare the two human proteins with BLAST 2 SEQUENCES (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi) to make the decision.

#4 step by step instructions


#5 Find proteins that are known to contribute to pulmonary artery hypertension and determine if animal models exist in which the disease can be studied. Can a full length dog protein sequence be found?

  Answer: Best dog matches:
    XP_536035  759 aa
    XP_851509  248 aa

#5 step by step instructions


#6 How conserved are the ATP2A2 proteins across vertebrate species? Should all the available protein sequences be used to make this assessment?
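
  One simple way to put a number on conservation is the percent identity between two sequences taken from an alignment. The sketch below is a minimal illustration, assuming the two sequences are already aligned (equal length, gaps written as "-") and counting only columns where neither sequence has a gap; other conventions for the denominator exist. The function name percent_identity and the short sequences are invented for illustration and are not ATP2A2 data.

```python
def percent_identity(a, b):
    """Percent identity of two aligned sequences, ignoring gapped columns."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue  # columns with a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy aligned pair: 8 ungapped columns, 7 of them identical.
pid = percent_identity("MKT-AYIAK", "MKTQAYLAK")
print(pid)
```

  Because gapped columns are skipped, a short fragment can score as highly as a full-length sequence, which is one reason to look at sequence lengths before deciding which sequences to include.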

#6 step by step instructions


#7 Are there knockout mice available to study the AGPAT6 gene? How would you order one of these cell lines?


#7 step by step instructions

last updated 4/27/2007