Bioinformatics

School

Texas A&M University

Course

112

Subject

Biology

Date

Apr 25, 2024

Name:_____________________ Section:____________________

Bioinformatics I Data Sheet

Lab Activity 1: Sequence Taxonomy

1. Approximately what proportion of the presently described species of life on Earth are accounted for in the Taxonomy database? ___10%___

2. How many bacterial species are represented in the taxonomy database? ___38,144___

3. Pick an extinct species to investigate.
   a. Scientific name: ___Campephilus imperialis___
   b. Common name (if available): ___Imperial Woodpecker___
   c. How many nucleotide sequences are available? ___15___
   d. What is the name of your selected gene? ___Campephilus imperialis mitochondrion, complete genome___
   e. How many nucleotides long is the sequence? (You can find this in the first line (LOCUS): xxx bp.) ___16858 bp___
   f. What is the citation for the paper reporting the sequence?
      Anmarkrud JA, Lifjeld JT. Complete mitochondrial genomes of eleven extinct or possibly extinct bird species. Mol Ecol Resour. 2017 Mar;17(2):334-341. doi: 10.1111/1755-0998.12600. Epub 2016 Oct 11. PMID: 27654125
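As a rough illustration of item 3e, the length can be read off a GenBank LOCUS line programmatically. This is a minimal sketch, not an official parser: the function name `locus_length` is made up for this example, and the sample line uses a placeholder accession, division and date, with only the 16858 bp figure taken from the data sheet.

```python
import re


def locus_length(locus_line: str) -> int:
    """Return the sequence length (in bp) stated on a GenBank LOCUS line."""
    # The length is the integer that directly precedes the token "bp".
    match = re.search(r"(\d+)\s+bp\b", locus_line)
    if match is None:
        raise ValueError("no 'bp' length found on LOCUS line")
    return int(match.group(1))


# Hypothetical LOCUS line: accession, division and date are placeholders,
# only the length matches the value recorded in item 3e.
example = "LOCUS       XX000000               16858 bp    DNA     circular VRT 01-JAN-2000"
print(locus_length(example))
```

The same number appears in the record header on the NCBI Nucleotide page, so the sketch is only a convenience when many records need to be checked.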
Bioinformatics II Data Sheet

Lab Activity 3: SARS CoV-2 Phylogeny

1. When looking at the Sequence Data Explorer window, how many variable sites are there? How many conserved sites? (Hint: pay attention to the bottom of the window as you toggle between variable and conserved sites.)
   Variable: 96/29885
   Conserved: 29727/29885

2. With the Sequence Data Explorer window still open, on the main menu bar select “Data” and “Translate Sequences.” Now how many variable sites are there? How many conserved sites? Why do you suppose there are fewer variable sites in the translated sequences?
   Variable: 53/9961
   Conserved: 9115/9961
   I think there are fewer variable sites in the translated sequences because of the degeneracy of the genetic code: several codons encode the same amino acid, so some nucleotide substitutions do not change the protein sequence.

3. With the Sequence Data Explorer window still open, on the main menu bar select “Data” and deselect “Translate Sequences.” Highlight the variable sites and scroll through the alignment to find one. Attach a screenshot of your variable site to this document. What is the position of the mutation? What is the mutation? What location was this variant detected? (Hint: you can click on any nucleotide to get the position information of that nucleotide at the bottom of the window.)
   The mutation is at site 1032/29885.

4. In the Tree Explorer window, select “Bootstrap Consensus Tree.” Attach an image of the tree to this document. (For Mac users, a screenshot is the best solution to do this: use shift+command+4 to select an area.)

5. Observe the bootstrap consensus tree. Which strain is most unlike the others? How do you suppose an isolate from New York is more similar to an isolate from Iowa than the others?
   The strain that is most unlike the others is the OM9434871 California Omicron. I think the New York and Iowa isolates are more similar to each other because they are in closer proximity to each other than to the other isolates.

6. Do you think generating a tree of isolates on a global scale can be a useful tool to help investigators trace the likely path of transmission? Why or why not?
   Yes, I believe it can be a useful tool because it can help investigators trace a variant back to where it originated.

Lab Activity 4: Protein Phylogenetic Tree

7. Look closely at the creatine kinase phylogenetic tree you have produced. Do you see any large groupings/clustering? Describe what you see and why you think it might be arranged that way.
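The reasoning behind questions 1 and 2 of Lab Activity 3 (fewer variable sites after translation) can be illustrated with a toy alignment. The sketch below is an assumption-laden illustration, not how MEGA computes its counts: the three sequences are invented, gaps and ambiguous bases are ignored, and the helper names `translate` and `count_sites` are hypothetical. It uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a gap-free DNA sequence in frame 1, dropping any partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def count_sites(aligned: list[str]) -> tuple[int, int]:
    """Return (variable, conserved) column counts for equal-length sequences."""
    variable = sum(1 for column in zip(*aligned) if len(set(column)) > 1)
    return variable, len(aligned[0]) - variable


# Invented toy alignment: one synonymous change (GCT -> GCC, both alanine)
# and one non-synonymous change (AAA lysine -> AGA arginine).
toy = ["ATGGCTAAA", "ATGGCCAAA", "ATGGCTAGA"]

print(count_sites(toy))                          # nucleotide columns
print(count_sites([translate(s) for s in toy]))  # amino acid columns
```

In this toy case two nucleotide columns vary but only one amino acid column does, because the GCT to GCC substitution is synonymous. Collapsing three nucleotide columns into one codon column also lowers the count whenever two changes fall in the same codon.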