Monday, December 28, 2009

Protein sequences are still badly annotated in major databases

A new paper in PLoS Comp Bio reminds us that sequences are still often misannotated!

Schnoes AM, Brown SD, Dodevski I, Babbitt PC (2009) Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies. PLoS Comput Biol 5(12): e1000605. doi:10.1371/journal.pcbi.1000605

http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000605

I had expected the problem to go away as more data became available and database managers gained experience, but this does not seem to be the case. SwissProt annotators, however, can pride themselves on having few errors.

Friday, October 19, 2007

Structural variation in the human genome has been underestimated

In today's issue of Science, a study concludes that the structural variation in our genome is probably larger than previously thought. The authors suggest that this variation surpasses single-nucleotide variation (as revealed by the HapMap project).


The study was conducted with massive paired-end sequencing on two individuals from the HapMap project, one European and one African, so the sample is somewhat small. On the other hand, we have not previously thought of the variation between two random individuals as being as drastic as is shown here.


I am curious whether this variation is also the actual cause of the differences between the HPR assembly and the Celera assembly. Obviously, putting together a genome assembly for a pool of individuals now looks like a somewhat ill-posed problem! I think Celera pooled data from five people of different ethnic populations, and then they added data from the HPR sample: all in all a problematic task.

Wednesday, September 19, 2007

New Master's program at KTH

I would like to draw your attention to a new Master's program in
Computational and Systems Biology starting in the fall of 2008 at
The School of Computer Science and Communication of KTH Royal
Institute of Technology (Stockholm, Sweden).


The Master's program runs for two years, in the Bologna format of
European higher education. A web page with more information is at
http://www.csc.kth.se/utbildning/program/compsysbio/home


Tuition in Swedish higher education is free, for anyone,
from any country.


Stockholm/Uppsala is a major center of the biotech/pharma
industries in Europe, and one of the major hubs in biomedical
research, worldwide.


KTH has strong traditions in Biotechnology, Theoretical Computer
Science (Algorithmic complexity theory) and many other fields,
and pursues a vigorous program in Entrepreneurship and Innovation.


For any questions or queries, please feel free to get in
touch with me, or with any faculty member listed on the
program web page.

Wednesday, September 5, 2007

Ultraconserved elements investigated

One of the surprises when numerous whole genomes started to come out was that they contain short regions that are extremely conserved. They were named ultra-conserved elements and were defined to be at least 200 bp long and almost perfectly conserved. These elements were found in species as far apart as human and mouse. Usually, such conservation hints at an important function. Notice that this conservation is much stronger than what we usually see in genes, and it was therefore expected that we would find an exciting new mechanism using these regions.

However, a recent publication (Ahituv et al., PLoS Biology) details how knockouts of these regions in mouse gave no hints about their function. In fact, the knockout mice were as viable as any other mouse. This does not rule out that the regions are important and functional, but extreme conservation no longer implies extreme importance.
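
To make the definition above concrete (at least 200 bp, almost perfectly conserved), here is a minimal sketch of how one could scan a pairwise alignment for such regions. It is only an illustration: the function name conserved_runs and the min_len parameter are my own, the input is assumed to be two already-aligned sequences of equal length with "-" for gaps, and I simplify "almost perfectly conserved" to exact identity.

```python
def conserved_runs(seq_a, seq_b, min_len=200):
    """Find stretches of identical, gap-free alignment columns.

    Hypothetical helper for illustration. seq_a and seq_b are assumed to be
    two rows of the same pairwise alignment (equal length, '-' for gaps).
    Returns half-open (start, end) intervals in alignment coordinates.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    runs = []
    start = None  # start column of the current identical run, if any
    for i, (a, b) in enumerate(zip(seq_a, seq_b)):
        if a == b and a != "-":
            if start is None:
                start = i
        else:
            # The run (if any) ends here; keep it only if long enough.
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None

    # Handle a run that extends to the end of the alignment.
    if start is not None and len(seq_a) - start >= min_len:
        runs.append((start, len(seq_a)))
    return runs


if __name__ == "__main__":
    # Toy example with a tiny threshold instead of 200 bp.
    print(conserved_runs("ACGTACGT", "ACGAACGT", min_len=4))
```

With real data one would of course start from a whole-genome alignment and keep the default threshold of 200 columns; the toy call only shows the shape of the output.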