Learning the alphabet of gene control

Jan 17, 2013

Scientists at Karolinska Institutet in Sweden have taken a large step towards understanding how human genes are regulated. In a new study, published in the journal Cell, they identified the DNA sequences that bind to over four hundred proteins that control the expression of genes. This knowledge is required for understanding how differences in the genomes of individuals affect their risk of developing disease.

After the human genome was sequenced in 2000, it was hoped that knowledge of the entire sequence of human DNA could rapidly be translated into medical benefits such as predictive tools that would identify individuals at risk of disease. This, however, turned out to be harder than anticipated. One reason was that only the 1 percent of the genome that codes for proteins could in fact be read. The remaining part, much of which describes how these proteins should be expressed in different cells and tissues, could not be understood. This, in turn, was because scientists did not know which DNA sequences are functional and bind to the specific proteins, called transcription factors, that regulate gene expression.

"The genome is like a book written in a foreign language, we know the letters but cannot understand why a human genome makes a human or a mouse genome a mouse", says Professor Jussi Taipale, who led the study at the Department of Biosciences and Nutrition. "Why some individuals have higher risk to develop diseases such as heart disease or cancer has been even less understood."

The human genome encodes approximately 1000 transcription factors, which bind specifically to short sequences of DNA and control the production of other proteins. In the work published in Cell, the scientists at Karolinska Institutet describe the DNA sequences that bind to over 400 such proteins, representing approximately half of all human transcription factors. The data were generated with a new method that uses a modern DNA sequencer producing hundreds of millions of sequences, giving the results unprecedented accuracy and reliability.

In addition, the binding specificities of human transcription factors were compared to those of the mouse. Surprisingly, no differences were found. According to the scientists, these results suggest that the basic machinery of gene expression is similar in humans and mice, and that the differences in size and shape between the two species are caused not by differences in transcription factor proteins, but by the presence or absence of the specific sequences that bind to them.

"Taken together, the work represents a large step towards deciphering the code that controls gene expression, and provides an invaluable resource to scientists all over the world to further understand the function of the whole human genome", says Professor Taipale. "The resulting increase in our ability to read the genome will also improve our ability to translate the rapidly accumulating genomic information to medical benefits."

More information: 'DNA-Binding Specificities of Human Transcription Factors', Jolma, A., Yan, J., Whitington, T., Toivonen, J., Nitta, K.R., Rastas, P., Morgunova, E., Enge, M., Taipale, M., Wei, G., Palin, K., Vaquerizas, J.M., Vincentelli, R., Luscombe, N.M., Hughes, T.R., Lemaire, P., Ukkonen, E., Kivioja, T. and Taipale, J., Cell, online 17 January 2013.
