This blog started with my Phyloinformatics course at the University of Glasgow. Now I also publish other work from my MSc and any interesting Bubble I find on the web. Enjoy! :)
Monday, 12 March 2012
NCBI_Blast_Tree
The process of making a phylogenetic tree is simple in essence. In our case, however, we had to deal with many new species, so making a proper molecular tree would be a challenge. The basic steps involved in creating a tree from molecular sequence data are:
(i) Obtaining a collection of sequences from the NCBI database with a BLAST search and (carefully) aligning them so that homologous residues line up: nucleobases, or amino acids if we want to compare proteins.
(ii) Identifying other sequences that are related to the sequence of interest and obtaining the data for those sequences, which is also a crucial step.
(iii) Aligning the sequences and finding the differences between all pairs of sequences; we would use ClustalX for this step.
(iv) Using the alignment results to generate a phylogenetic tree.
(source: "Phylogenetic Trees Made Easy - A How-To Manual for Molecular Biologists" - Barry G. Hall)
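To make step (iii) a bit more concrete, here is a minimal sketch of what "finding the differences between all pairs of sequences" can mean once an alignment exists. It computes the simple p-distance (the proportion of sites that differ), skipping any column where either sequence has a gap. This is not what ClustalX does internally; the function names and the toy alignment are made up for illustration.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) sites")
    return differing / compared


def distance_table(alignment):
    """Map every pair of sequence names to its p-distance."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, not real data from our species list.
toy_alignment = {
    "species_A": "ATGCTAGCTA",
    "species_B": "ATGCTTGCTA",
    "species_C": "ATG-TTGGTA",
}

distances = distance_table(toy_alignment)
for pair, value in sorted(distances.items()):
    print(pair, round(value, 3))
```

A table like this is the kind of input that distance-based tree methods (neighbour joining, for example) start from in step (iv).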
Our first step was to clean up our data table in Google Refine and find the NCBI codes for each species.
Filtering our data for all the species that have an NCBI code.
As we can see, most of the species did not have a valid NCBI ID code, since they are still new species. Another reason could be a spelling mistake in the scientific name, or that the same species is known under another scientific name, causing confusion. After filtering, only 100 of the 328 species in the list had an NCBI ID number, and only 70 species had both NCBI and uBio IDs.
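The same kind of filtering can also be done outside Google Refine. Below is a rough sketch, assuming the table has been exported as CSV. The column names (`scientific_name`, `ncbi_id`, `ubio_id`) and the sample rows are hypothetical placeholders, not our real data.

```python
import csv
import io

# Hypothetical export of the data table; the real column names may differ.
SAMPLE_CSV = """scientific_name,ncbi_id,ubio_id
Species one,12345,678
Species two,,
Species three,23456,
Species four,,910
"""


def has_value(row, column):
    """True if the column exists in the row and is not blank."""
    return bool((row.get(column) or "").strip())


def summarise(rows):
    """Count rows in total, with an NCBI ID, and with both NCBI and uBio IDs."""
    rows = list(rows)
    with_ncbi = [r for r in rows if has_value(r, "ncbi_id")]
    with_both = [r for r in with_ncbi if has_value(r, "ubio_id")]
    return {
        "total": len(rows),
        "with_ncbi": len(with_ncbi),
        "with_both": len(with_both),
    }


summary = summarise(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summary)
```

To run it on a real export, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file.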
I then decided to look in the NCBI Taxonomy database for a small group of new mammals, compare them, and try to make a good molecular tree for this small group.
Unfortunately, it was very hard to find sequences of the same protein or molecule for each of the species, which we needed before we could make a sequence alignment with ClustalX.
.....
The best tree I could manage to produce is a typical tree from NCBI, for which I used the NCBI codes from my Google Refine table.
NCBI Tree
In our data, only 169 species appeared with NCBI codes, and these made it into our taxonomy tree in NCBI. Dominic and I tried to save the file in the different available formats. Unfortunately, we do not know how to open the file once it is saved. We tried using PHYLIP, since Dominic had used it while attempting to make a tree, but it didn't work for some reason.
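One thing that might help when checking a saved tree file: PHYLIP reads and writes trees in Newick format, which is plain text made of nested parentheses. Assuming the file we saved is Newick (I have not confirmed this for our file), a small sketch like the one below could at least list the species names in it. The function name and the example tree are made up for illustration, and this is not a full Newick parser.

```python
import re

# A leaf label comes directly after '(' or ','. Internal node labels come
# after ')', so they are not matched. Labels may be quoted with single quotes.
LEAF_PATTERN = re.compile(r"[(,]\s*('(?:[^']|'')*'|[^(),:;\s][^(),:;]*)")


def leaf_names(newick):
    """Return the leaf labels of a Newick string, in order of appearance.

    Assumes a tree with at least one pair of parentheses.
    """
    names = []
    for match in LEAF_PATTERN.finditer(newick.strip()):
        label = match.group(1).strip()
        if label.startswith("'"):
            # Strip the surrounding quotes and unescape doubled quotes.
            label = label[1:-1].replace("''", "'")
        names.append(label)
    return names


# Hypothetical example tree, not the one we downloaded.
example_tree = (
    "((Homo_sapiens:0.1,Pan_troglodytes:0.1)Hominini:0.2,"
    "'Mus musculus':0.5)Mammalia;"
)

leaves = leaf_names(example_tree)
print(leaves)
```

If the names printed match the species in our table, the file is probably fine and the problem lies in how we tried to open it.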