Sequencing the genome of an organism allows scientists to investigate its unique genetic make-up, its evolutionary links to other creatures, and how it has adapted to its environment. Researchers at King Abdullah University of Science and Technology (KAUST), Saudi Arabia, have sequenced the first reef fish genome, that of the blacktail butterflyfish (Chaetodon austriacus), an iconic Red Sea species considered an 'indicator' species for coral health.
While genome sequences already exist for well-established model species such as the zebrafish, which is commonly used in medical research, there are no genomes publicly available for natural populations of tropical reef fish. Michael Berumen, Joseph DiBattista, and a multidisciplinary team at KAUST sought to fill this significant gap in fish genomic data.
"The blacktail butterflyfish has one of the most restricted ranges of any butterflyfish species, largely concentrated in the northern and central Red Sea," explains DiBattista. "Therefore, it is likely to have developed unique genomic adaptations to this environment."
Identifying these genetic mechanisms may also help predict how other marine organisms could adapt to challenging sea conditions in future.
The team faced a considerable task when it came to sequencing the new genome, partly because they had no reference genomes from closely related fish to compare it against. They took portions of gill filaments from a wild butterflyfish and generated a mix of DNA fragments or 'reads'.
"We then undertook a series of steps to figure out which reads connected with each other, and as a whole, how they overlapped," explains Berumen. "Imagine trying to reconstruct a lengthy book from tiny segments consisting of a few hundred characters, each taken from a random part of that book. This very quickly becomes a computer science problem since it would be impossible to do it manually. Most fish genomes consist of around a billion base pairs, or a book with a billion characters in our analogy!"
Berumen sought the bioinformatics expertise of Manuel Aranda's group at KAUST's Computational Bioscience Research Center. Once the team had assembled the genome, they analyzed it to ensure it made sense; for example, checking for the existence of genes previously identified in other organisms.
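The book analogy Berumen describes, where overlapping fragments are stitched back into one text, can be illustrated with a toy example. The Python sketch below is only an illustration under simplifying assumptions, not the pipeline the KAUST team used: it assumes error-free reads from a single strand with no repeats, and the names `overlap`, `greedy_assemble`, and `min_len`, as well as the example reads, are hypothetical. Real assemblers must cope with sequencing errors, repeated regions, and billions of bases, and use far more sophisticated graph-based methods.

```python
def overlap(a, b, min_len=3):
    """Return the length of the longest suffix of `a` that equals a prefix
    of `b`, or 0 if no such overlap of at least `min_len` characters exists."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Toy greedy assembly: repeatedly merge the pair of reads with the
    largest overlap until no pair overlaps by at least `min_len`."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # nothing left to merge; remaining reads stay separate
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical reads, given in scrambled order, that overlap by three letters.
reads = ["ACCTGA", "ATGCGT", "CGTACC"]
contigs = greedy_assemble(reads)
print(contigs)
```

Even in this tiny case every read is compared against every other read on each pass, which hints at why the task becomes, in Berumen's words, a computer science problem at the scale of a billion base pairs.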
Their final, high-quality genome includes 28,926 protein-coding genes. The team hope their genome will enable studies on the co-evolution of reef fish species and comparisons of gene sequences between closely related fish across the Indo-Pacific region.
The genome may also help stem trading in wild reef fish, because aquaculture specialists may eventually be able to use the data to produce new, aquarium-tolerant species to fulfill the market demand for decorative fish.
Joseph D. DiBattista et al. Draft genome of an iconic Red Sea reef fish, the blacktail butterflyfish (Chaetodon austriacus): current status and its characteristics, Molecular Ecology Resources (2016). DOI: 10.1111/1755-0998.12588