Using reference-assisted chromosome assemblies to study chromosome structures and evolution in vertebrates
Lead Research Organisation:
Royal Veterinary College
Department Name: Comparative Biomedical Sciences CBS
Abstract
Genomes contain genes that encode the proteins that build organisms. In the course of evolution genomes change, and these changes affect genes by altering when proteins are produced, or even by leading to the formation of new genes or the loss of old ones. Together these events form one of the sources of variation on which natural selection acts to form new species or to adapt species to their environment. Complete sequencing of a genome means identifying the sequence of nucleotides along its chromosomes. To understand which mechanisms drive changes in chromosome structure in different species, and how these changes affect the formation of new species or the adaptation of existing species to a changing environment, we will reconstruct complete chromosome structures of newly sequenced species using a novel algorithm called "reference-assisted chromosome assembly", or RACA. This algorithm compares the sequenced parts of one organism's genome to an existing complete chromosome assembly of another and reconstructs the chromosomes of their putative common ancestor. It then uses parts of the newly sequenced genome to search for differences between the ancestral organization of chromosomes and the organization suggested by the sequence scaffolds generated for the organism. In the final step it arranges the parts of ancestral chromosomes according to the order suggested by the sequence scaffolds. In this project we will develop several algorithms to verify these reconstructions by looking at specific features of chromosomes such as "telomeres" (chromosome ends) and "centromeres" (structures important for cell division). These structures contain specific sequence features that can be reconstructed from the data produced during sequencing projects. By detecting the positions of these features in the reconstructed chromosomes we will be able to check how close the structure of the reconstructed chromosomes is to that of the real chromosomes in the species of interest.
If there are issues, we will adapt the RACA algorithm to improve the assembly.
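The telomere check described above can be illustrated with a minimal sketch. It is not the project's verification pipeline: it only assumes that a correctly reconstructed vertebrate chromosome should carry tandem runs of the canonical telomeric repeat TTAGGG (or its reverse complement CCCTAA) near its ends. The function names, the window size and the repeat threshold are hypothetical choices made for illustration.

```python
import re

# Canonical vertebrate telomeric repeat and its reverse complement.
TELOMERE_FWD = "TTAGGG"
TELOMERE_REV = "CCCTAA"


def telomere_repeat_count(seq: str, motif: str) -> int:
    """Return the longest tandem run of `motif` in `seq`, in repeat units."""
    runs = re.findall(f"(?:{motif})+", seq.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


def check_chromosome_ends(seq: str, window: int = 60, min_repeats: int = 3) -> dict:
    """Report whether each end of a reconstructed chromosome looks telomeric.

    `window` and `min_repeats` are illustrative thresholds (assumptions),
    not values taken from the project.
    """
    def has_run(fragment: str) -> bool:
        longest = max(
            telomere_repeat_count(fragment, TELOMERE_FWD),
            telomere_repeat_count(fragment, TELOMERE_REV),
        )
        return longest >= min_repeats

    return {"start": has_run(seq[:window]), "end": has_run(seq[-window:])}


# Toy chromosome with telomere-like runs at both ends.
toy = "CCCTAA" * 5 + "ACGT" * 50 + "TTAGGG" * 5
print(check_chromosome_ends(toy))
```

A reconstructed chromosome for which one or both ends report `False` would be a candidate for closer inspection, although real data would also need to allow for degenerate repeats and incomplete scaffolds.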
In the next step we will use RACA-generated chromosomes to investigate the mechanisms driving chromosomal changes at the DNA level. We will test whether the distribution of chromosome parts that are not rearranged in any of the genomes included in our analysis can be explained by random breakage of chromosomes in evolution, or whether there is selection against chromosomal rearrangements in some parts of a vertebrate genome. If we analyze a large set of species we might be able to find "building blocks" of mammalian, amniote, or vertebrate genomes that cannot be rearranged without a lethal effect on the organism.
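One simple way to ask whether breakpoints are consistent with random breakage is sketched below. It is an illustration under stated assumptions, not the analysis used in the project: breakpoints are given as 0-based positions on a single linear genome, the genome is cut into fixed-size windows, and the clustering of the observed counts (variance-to-mean ratio) is compared with simulated genomes in which the same number of breakpoints is placed uniformly at random. All function names and parameters are hypothetical.

```python
import random
from statistics import mean, pvariance


def window_counts(breakpoints, genome_length, window):
    """Count breakpoints in each fixed-size window (positions are 0-based, < genome_length)."""
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in breakpoints:
        counts[pos // window] += 1
    return counts


def dispersion_index(counts):
    """Variance-to-mean ratio; values well above 1 indicate clustering."""
    m = mean(counts)
    return pvariance(counts) / m if m > 0 else 0.0


def random_breakage_pvalue(breakpoints, genome_length, window, n_sim=1000, seed=0):
    """Share of simulated random-breakage genomes at least as clustered as observed."""
    rng = random.Random(seed)
    observed = dispersion_index(window_counts(breakpoints, genome_length, window))
    hits = 0
    for _ in range(n_sim):
        simulated = [rng.randrange(genome_length) for _ in breakpoints]
        if dispersion_index(window_counts(simulated, genome_length, window)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Toy example: 20 breakpoints, all inside the first of ten windows.
clustered = list(range(0, 100, 5))
print(random_breakage_pvalue(clustered, genome_length=1000, window=100, n_sim=200))
```

A small p-value suggests that breakpoints are more clustered than uniform random breakage would predict, which is the pattern expected if some regions are fragile and others are protected from rearrangement.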
Evolutionary breakpoint regions are regions where chromosomes were broken and then rejoined in different combinations or orientations during evolution. We will use multiple RACA genomes to investigate which features of the genomes drive these events. An important question to answer is "which genes are more likely to be affected by these evolutionary events?" Previously we demonstrated that evolutionary breakpoint regions are enriched for genes associated with lineage-specific features. In this project we will perform bioinformatics analysis of these intervals in an attempt to classify the lineage-specific changes that happened in the ancestral genomes of some lineages and led to the formation of specific traits favoured by natural selection, e.g., the formation of the rumen in ruminant species. Our hypothesis is that the changes in the ancestral genomes of livestock species will be connected to those features that made these species an attractive source of protein for humans. Therefore, detection of these ancestral changes is an important step towards improving the genetics of these species, as it will identify the best genes and other targets for future artificial selection and breed improvement.
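The idea of a breakpoint can be made concrete with a small sketch. It assumes, purely for illustration, that each chromosome is represented as a list of synteny blocks labelled with signed integers, where the sign gives the orientation; the function names are hypothetical and the sketch handles a single chromosome only.

```python
def adjacencies(blocks):
    """Return the signed adjacencies of one chromosome's block order.

    Each adjacency (a, b) is stored together with its reverse-strand
    equivalent (-b, -a), so a region read from the other strand still matches.
    """
    pairs = set()
    for a, b in zip(blocks, blocks[1:]):
        pairs.add((a, b))
        pairs.add((-b, -a))
    return pairs


def breakpoints(reference, target):
    """List adjacencies in the target order that are absent from the reference."""
    ref_pairs = adjacencies(reference)
    return [(a, b) for a, b in zip(target, target[1:]) if (a, b) not in ref_pairs]


# Blocks 2 and 3 are inverted in the target relative to the reference.
print(breakpoints([1, 2, 3, 4, 5], [1, -3, -2, 4, 5]))  # [(1, -3), (-2, 4)]
```

In this toy case the inversion produces two breakpoints, one at each edge of the inverted segment, while the adjacency inside the inverted segment is preserved.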
Technical Summary
A novel in silico approach will be applied to predict the order of scaffolds in the chromosomes of species sequenced with NGS techniques. This will include the alignment of scaffolds to existing whole-genome assemblies ("reference genomes") and algorithmic prediction of the most probable genome organization of the common ancestor of the two genomes, followed by ordering of ancestral blocks in the newly sequenced genome based on the lineage-specific rearrangements found within its scaffolds. The chromosomal reconstructions will be verified using telomeric and centromeric sequences reconstructed from the NGS data of the newly sequenced genome. For chromosomes that show structural issues after verification, suboptimal solutions will be explored. Corrected RACA genomes from multiple species will be used to look for support for fragile breakage models of chromosome evolution, as well as to detect minimal blocks of genes in vertebrates that cannot be disrupted in evolution. We will also explore the mechanisms of chromosomal rearrangements by analyzing evolutionary breakpoint regions for enrichment for lineage-specific features, such as retrotransposable elements, genes, SNPs and CNVs. For the whole-genome set we will analyze sequence features that distinguish the genomes of different clades (orders, families) from the genomes of other groups. This will be especially important for gene networks and other genomic features that contribute to the agricultural importance of some species and clades.
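The enrichment analysis mentioned above can be illustrated with a one-sided hypergeometric test. This is a generic statistical sketch rather than the method used in the project: it assumes the genome has been divided into intervals, that each interval either does or does not carry the feature of interest (for example a retrotransposable element), and that the intervals falling in evolutionary breakpoint regions are treated as the sample. The function name and arguments are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(total, total_with_feature, sampled, sampled_with_feature):
    """Upper-tail hypergeometric probability P(X >= sampled_with_feature).

    total                -- all intervals in the genome
    total_with_feature   -- intervals that carry the feature
    sampled              -- intervals located in breakpoint regions
    sampled_with_feature -- breakpoint-region intervals that carry the feature
    """
    upper = min(sampled, total_with_feature)
    favourable = sum(
        comb(total_with_feature, k) * comb(total - total_with_feature, sampled - k)
        for k in range(sampled_with_feature, upper + 1)
    )
    return favourable / comb(total, sampled)


# Toy example: 5 of 10 intervals carry the feature, and all 5 intervals
# that fall in breakpoint regions carry it.
print(hypergeom_enrichment_pvalue(10, 5, 5, 5))  # 1/252, about 0.004
```

With many feature types tested at once, the resulting p-values would also need a multiple-testing correction.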
Planned Impact
We will detect chromosomal structures in mammalian species and compare their genome organization to that of other species, including human. The outcome of our programme has the potential to influence the UK and world economy, health, and services.
Impact on health and biomedicine:
a) The outcome of our research will be used to identify animal models for human genetic disorders by selecting species with disease phenotypes similar to human phenotypes and with the same genome organization as human in homologous genome intervals. An important advantage of our work is that we will produce ordered sequence maps for all genomes, an absolute requirement for a good animal model. Therefore, the outcome of our research will permit the selection of better animal models for testing human medicines than the traditionally used mouse.
b) Another part of our studies potentially connected to quality of life and health is the study of the mechanisms of evolutionary chromosomal rearrangements. We will investigate what makes some regions of chromosomes in meiotic cells fragile in evolution. Our results in this area could influence studies of human cancers that are accompanied by rearrangements of chromosomes in somatic cells. Our previous studies suggest that a correlation exists between regions of meiotic instability in evolution and mitotic instability in cancer cells. This project will extend knowledge of the mechanisms of instability, and our results could be used to predict chromosomal regions that could be rearranged in human cancers.
Impact on economy and services:
Our research could benefit the economy in the UK and other countries by using our findings on lineage-specific genome changes in livestock species to devise better strategies for improving the efficiency of agriculture and reducing its negative effect on the global climate. For example, lineage-specific changes found in all related species (e.g., ruminants) are likely to control the traits that made these species attractive for domestication. Therefore, the unique features of different genomes will form a dataset that can be explored for genes or other targets for the improvement of economically or environmentally important traits in livestock species, such as meat quality in cattle and sheep or greenhouse gas emissions from all ruminants. Our data should reduce the need for expensive genome-wide association studies, which use hundreds of animals without any guarantee that the gene of interest will be detected. Instead, it will be possible to generate an explicit list of all lineage-specific changes that are ready to be explored for markers associated with a particular trait of interest. Eventually, the cost of QTL hunting can be significantly decreased and its efficiency increased. We are already exploring the possibilities of using comparative genomics studies to understand the genetics of lineage-specific systems, such as the rumen, in order to decrease greenhouse gas emissions from livestock species.
Delivering highly skilled people:
We will provide the postdoctoral research associate appointed to this project with the opportunity to learn cutting-edge methods of bioinformatics and laboratory work related to genome sequencing, assembly, and genome analysis, thereby improving their professional skills and chances of a successful career in academia or industry in areas related to bioscience.
Distribution of knowledge:
To make our results widely known we will publicise them to the widest possible audience through publications in scientific journals, conference presentations, and Internet websites. The journals in which we expect to publish the results of this programme are listed in the academic beneficiaries section. We will work closely with local businesses to distribute the knowledge generated by this work among businesses working on animal production and breeding.
Organisations
People |
ORCID iD |
Denis Larkin (Principal Investigator) |
Publications
Beynon SE
(2015)
Population structure and history of the Welsh sheep breeds determined by whole genome genotyping.
in BMC genetics
Damas J
(2018)
Reconstruction of avian ancestral karyotypes reveals differences in the evolutionary history of macro- and microchromosomes.
in Genome biology
Damas J
(2017)
Upgrading short-read animal genome assemblies to chromosome level using comparative genomics and a universal probe set.
in Genome research
Fang X
(2014)
Genome-wide adaptive complexes to underground stresses in blind mole rats Spalax.
in Nature communications
Farré M
(2016)
Novel Insights into Chromosome Evolution in Birds, Archosaurs, and Reptiles
in Genome Biology and Evolution
Farré M
(2019)
Evolution of gene regulation in ruminants differs between evolutionary breakpoint regions and homologous synteny blocks.
in Genome research
Farré M
(2015)
An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity.
in BioEssays : news and reviews in molecular, cellular and developmental biology
Kim J
(2017)
Reconstruction and evolutionary history of eutherian chromosomes.
in Proceedings of the National Academy of Sciences of the United States of America
Larkin D
(2015)
The genetics of cattle
O'Connor RE
(2018)
Reconstruction of the diapsid ancestral genome permits chromosome evolution tracing in avian and non-avian dinosaurs.
in Nature communications
Description | During the course of this project we tuned our reference-assisted chromosome assembly algorithm (RACA), applied it to multiple animal genomes, and studied chromosome evolution in two classes of animals: mammals and birds. Specifically, we applied RACA to scaffold genome assemblies of Tibetan antelope, yak, blind mole rat, giraffe, gemsbok, Indian muntjac, fox and 18 avian genome assemblies. The blind mole rat RACA genome assembly was used to show that the blind mole rat has a highly conserved chromosome structure compared to other rodents. This could be directly related to the high resistance to cancer found in this species (the blind mole rat genome paper was published in Nature Communications; D. Larkin is one of the primary authors, and Marta Farré, a postdoc funded by this award, is a contributing author). Our blind mole rat Evolution Highway website containing the results of our RACA chromosome reconstructions is publicly available and listed under the databases section. We applied RACA to 18 avian genomes that were sequenced as part of the Avian Genome consortium (we also contributed to the main paper published in Science). In the Science paper and another paper published in Genome Biology and Evolution in 2016 we used these genomes to find novel patterns of avian evolution, addressing the stability of avian karyotypes and the association between chromosome structures and phenotypes in reptiles. Avian RACA assemblies are now being used to link avian genome assemblies to chromosomes in our BBSRC-funded collaborative project with Prof. Darren Griffin and to study avian chromosome evolution. RACA assemblies are available from our avian Evolution Highway website and our UCSC Genome Browser hub.
We contributed to the sequencing of several ruminant genomes (giraffe, gemsbok, Indian muntjac) to focus on the evolutionary analysis of ruminant chromosomes and to answer questions about how changes in ancestral genomes led to the formation of economically important traits or organs (e.g., the rumen, as initially proposed). In 2019 we published our ruminant studies in Genome Research. This paper shows the power of comparative genomics and RACA-assisted assemblies to reveal and understand the functional effects of structural chromosome changes on gene regulation in ruminants. Larkin's group continues to lead the ruminant genome analysis as part of the Genome 10K initiative. We built a local version of the UCSC Genome Browser focusing on ruminant genome data. We specifically looked for conserved non-coding elements and chromosome structures that can now be analysed thanks to the RACA predictions of ruminant genomes generated using funds from this grant. We found that a significant fraction of conserved non-coding elements were formed in the ancestral ruminant genome through insertions of ruminant-specific transposable elements. We detected genes in or near ruminant-specific chromosome rearrangements and identified expanded gene families and genes under selection. In collaboration with a group from BGI and UCD we used the RACA ruminant genomes built during the last two years to reconstruct the ancestral ruminant karyotype in 45 predicted chromosome fragments. We linked our findings on gene family expansions, gene selection and ruminant conserved non-coding elements to the structural differences that occurred during the formation of the ancestral ruminant and descendant ruminant genomes.
We generated reconstructions of Cetartiodactyl, Ruminant, Pecoran and Bovidae ancestral karyotype structures and confirmed these reconstructions using FISH on the chromosomes of a basal ruminant, the chevrotain, in collaboration with Prof. Graphodatsky, Novosibirsk, Russia. These data have been used to find associations between Ruminant and Pecoran evolutionary breakpoint regions and the phenotypes distinguishing these groups of ruminants. We developed a new approach to improve the quality of RACA assemblies that uses direct PCR verification of the scaffolds containing potential structural differences between the RACA-assembled genome and the reference and outgroup genomes. We developed a pipeline to map sequences to telomeric scaffolds and used these to verify our RACA assemblies. Finally, we developed a new software tool that uses conserved elements to quickly align mammalian genomes. We comfortably exceeded the minimum number of species (5) proposed in the original application. RACA-assisted assemblies of yak and giraffe have been verified using centromeric sequences generated by the RepeatNet algorithm, as initially proposed in our grant application. Overall, the three years of this project and the follow-up studies were very successful. We were able to achieve the goals stated in the original proposal, contribute as invited speakers to more than 10 conferences, be interviewed by the BBC, publish several press releases (including one on the BBSRC website), and publish five papers, including a paper in Science about avian genome evolution, and one book chapter. In 2016 we published a landmark paper in Genome Research on a new method of assembling avian genomes to the chromosome level, and in 2019 a paper on ruminant evolution in the same journal. Dr. Farré obtained training at EBI. Dr. Larkin ran training workshops on how to use RACA in South Africa, Russia and Hungary.
We are currently using RACA to verify and improve the assemblies produced by Dovetail Genomics for our ruminant and other mammalian sequencing projects, and expect to run our algorithm on 20+ ruminant genome assemblies from species currently being analysed by Guojie Zhang's group at Copenhagen University. |
Exploitation Route | Results of our research have been and will be used in several ways:
- The improved RACA algorithm is used to verify and adjust superscaffold genome assemblies produced by the Dovetail Genomics company and for PacBio genome assemblies.
- RACA allowed the development of an inexpensive and effective method of chromosome-level assembly of animal genomes.
- The stable chromosome structure of the blind mole rat will be studied further to understand how this stability affects the regulation of genes that control apoptosis and necrosis and therefore contribute to the high cancer resistance of the blind mole rat (not a single case of cancer was reported in 30 years of study). The fact that our results are available as a public database facilitates their use by other groups.
- Our improved reference-assisted chromosome assembly algorithm is used by the group of Prof. Steve O'Brien to predict chromosome structures and improve genome assemblies in large cats (see information about the visitor we had from this group) and by the group of Dr. Anna Kukekova (University of Illinois at Urbana-Champaign) to reconstruct the chromosome organization of the silver fox genome.
- RACA reconstructions for 18 avian genomes formed the basis for our BBSRC-funded collaborative work with Prof. Darren Griffin at the University of Kent, who is using them to map fragmented avian genome assemblies to chromosomes.
- Results of the ruminant comparative project have been and will be used in several ways: a) conserved non-coding elements were used to select and annotate mutations that are part of cattle and sheep genotyping arrays and are located within evolutionarily conserved sequences. Over 3000 SNPs within these conserved sequences were selected and placed on the new commercial SNP array. |
Sectors | Agriculture Food and Drink Education Healthcare |
Description | The major economic and societal impact of this grant is two-fold. First, the conserved non-coding elements that we identified from the ruminant RACA genomes were used in the construction of a new GeneSeek cattle genotyping array. This array, focused on functional mutations in the cattle genome, allows a significant improvement in the accuracy of detecting associations between genotypes and economically important phenotypes. We finished working on the final, ruminant chromosome evolution part of the project. We detected genomic sequences that contribute to the formation of the rumen and of economically important traits in ruminant genomes and the genomes of other cetartiodactyls, with the final paper being submitted for publication. Second, RACA made it possible to design a new method for assembling animal genomes to chromosomes. As shown for birds, this allows chromosome assemblies to be built inexpensively, providing industry with genome assemblies for neglected livestock species and poultry. We also produced societal impact through the public media (an interview given by D. Larkin to BBC online about the blind mole rat genome, multiple press releases and news stories) and by producing tools (Evolution Highway databases) that have been used for university-level teaching in the UK, Brazil, the USA and Russia. We also contributed a chapter to the main textbook on cattle genomics, widely used for student and postgraduate education in many countries (see publications, awards). Our paper on Welsh sheep genomics shed light on the history of human migration to the UK and will provide a basis for the future improvement of sheep breeding strategies in the UK. Dr. Larkin organized training workshops on genomics and the use of software tools in South Africa and Russia in 2015-2016 and participated in school visits in the UK.
Finally, based on resources developed during the grant (CNE datasets), we designed new software that allows fast alignment of mammalian genomes. The software is publicly available and was published in the journal GigaScience. |
Sector | Agriculture, Food and Drink, Education |
Impact Types | Cultural Societal Economic |
Description | Marie Curie Fellowship |
Amount | € 195,000 (EUR) |
Organisation | European Research Council (ERC) |
Sector | Public |
Country | Belgium |
Start | 04/2016 |
End | 04/2018 |
Description | Responsive mode |
Amount | 18,000,000 ₽ (RUB) |
Organisation | Russian Science Foundation |
Sector | Public |
Country | Russian Federation |
Start | 01/2016 |
End | 12/2018 |
Description | Responsive mode |
Amount | £387,536 (GBP) |
Funding ID | BB/P020062/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 09/2017 |
End | 09/2020 |
Description | Responsive mode |
Amount | 2,100,000 ₽ (RUB) |
Organisation | Russian Foundation for Basic Research |
Sector | Academic/University |
Country | Russian Federation |
Start | 01/2018 |
End | 12/2020 |
Description | Responsive mode |
Amount | 6,000,000 ₽ (RUB) |
Organisation | Russian Foundation for Basic Research |
Sector | Academic/University |
Country | Russian Federation |
Start | 09/2017 |
End | 09/2020 |
Title | Genome alignments of mammalian genomes against the cattle genome on the UCSC genome browser |
Description | Alignments of 15 mammalian genomes against the cattle genome visualised on the UCSC Genome Browser. The alignments were obtained using the lastz aligner and parsed with the Kent utility tools |
Type Of Material | Biological samples |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | A paper in Genome Research was published demonstrating effects of evolutionary rearrangements on the expression of nearby genes. |
URL | http://sftp.rvc.ac.uk/rvcpaper/ruminantsHUB/hub.txt |
Title | Genome alignments of mammalian genomes and reconstructed ancestors on Evolution Highway |
Description | 1. Visualisation of homologous synteny between the cattle genome and 12 additional mammalian species on our Evolution Highway Comparative Chromosome Browser was achieved by parsing lastz genome alignments with the Kent utilities and the maf2synteny tool to build comparative synteny blocks. 2. Visualisation of homologous synteny between reconstructed cetartiodactyl, ruminant, and pecoran ancestors (reconstructed with DESCHRAMBLER software) and the extant mammalian genomes and other reconstructed ancestors. |
Type Of Material | Biological samples |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | A paper was published in Genome Research demonstrating that chromosome rearrangements in ruminants have a functional effect on the expression of nearby genes. |
URL | http://eh-demo.ncsa.uiuc.edu/ruminants/ |
Title | Mammalian ancestral genome reconstructions |
Description | In collaboration with groups from the University of California, Davis and the University of Illinois we generated reconstructions of chromosome structures for 8 mammalian ancestors. |
Type Of Material | Biological samples |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | The paper describing this resource was published in PNAS. |
URL | http://eh-demo.ncsa.illinois.edu/ancestors/#/SynBlocks |
Title | Reference assisted genome assemblies of 17 avian species |
Description | We built RACA assemblies for 18 avian genomes previously sequenced by the AvianGenomes consortium and made these assemblies publicly available on our Evolution Highway website and the UCSC Genome Browser hub (http://sftp.rvc.ac.uk/rvcpaper/birdsHUB_test/hub.txt). The latter source contains 16 of these because two assemblies were upgraded to the chromosomal level and constitute a separate resource. |
Type Of Material | Biological samples |
Year Produced | 2015 |
Provided To Others? | Yes |
Impact | These assemblies are being used by our collaborators to upgrade these avian genome assemblies to the chromosome level. |
URL | http://eh-demo.ncsa.uiuc.edu/birds/ |
Title | Ruminant CNEs |
Description | We identified a set of 1.2 million ruminant conserved non-coding elements (CNEs) that could influence gene regulation and function in ruminants. |
Type Of Material | Biological samples |
Year Produced | 2015 |
Provided To Others? | Yes |
Impact | Over 3000 of the CNEs that have single nucleotide polymorphisms in cattle breeds have been placed on a new 250,000 SNP GeneSeek genotyping array aiming to screen for functional mutations in cattle. |
Title | Blind Mole Rat Evolution Highway Interactive Chromosome Browser |
Description | We built an interactive Evolution Highway website that contains the results of the reference-assisted chromosome assembly of blind mole rat chromosomes. This site provides users with an easy means of visualising and comparing the chromosome structures of blind mole rat, mouse and human, and of detecting regions of conserved synteny and evolutionary breakpoint regions. |
Type Of Material | Database/Collection of data |
Year Produced | 2013 |
Provided To Others? | Yes |
Impact | This website was used to produce and visualise results of the chromosome structure and evolution published as part of the Blind Mole Rat Genome paper (Fang et al., Nature Communications, 2014) |
URL | http://eh-demo.ncsa.uiuc.edu/spalax/ |
Title | Cat Evolution Highway Interactive Chromosome Browser |
Description | We built an interactive Evolution Highway website that contains the results of alignments of cat chromosomes with the chromosome sequences of 9 mammalian species. This site provides a user-friendly interface for visualising and comparing the chromosome structures of cats and other mammals, and for detecting regions of conserved synteny, potential assembly errors, and evolutionary breakpoint regions. |
Type Of Material | Database/Collection of data |
Year Produced | 2013 |
Provided To Others? | Yes |
Impact | This tool is currently used by our collaborators (Steve O'Brien at Dobzhansky Bioinformatic Center, St. Petersburg, Russia) to produce a collaborative paper that deals with the detection and correction of errors in the cat genome assembly. |
URL | http://eh-demo.ncsa.uiuc.edu/cat/ |
Title | Mammalian ancestors chromosome browser |
Description | An Evolution Highway Chromosome Browser website has been created containing visualizations of 7 ancestral genomes for the lineage leading to human (from the Eutherian ancestor to the ancestor of human and chimp). |
Type Of Material | Database/Collection of data |
Year Produced | 2016 |
Provided To Others? | Yes |
Impact | This database has been used for a publication from our group in PNAS. |
URL | http://eh-demo.ncsa.uiuc.edu/ancestors/#/SynBlocks |
Description | Camel chromosome level genome assemblies |
Organisation | University of Veterinary Medicine Vienna |
Department | Department of Pathobiology |
Country | Austria |
Sector | Academic/University |
PI Contribution | Dr. Larkin served as a consultant at an IAEA meeting in Vienna dedicated to the construction of radiation hybrid chromosomal maps for camel species. As a result of this meeting, it was decided that Dr. Larkin's group would be responsible for constructing reference-assisted assemblies of the dromedary and Bactrian camels. Dr. Larkin appointed an RVC master's student, who is currently carrying out this work. In addition, Dr. Larkin is now a partner on the ongoing FWF-RSF application to study chromosome evolution and selection in camel breeds in Central Asia. In 2018 the reference-assisted assembly of the dromedary camel was finished and published. |
Collaborator Contribution | Dr. Pamela Burger from the University of Veterinary Medicine in Vienna has provided us with the genome assemblies and raw read data needed to perform reference-assisted chromosome assemblies of the dromedary camel. She also provided the DNA samples required to verify the RACA camel assemblies. Dr. Polina Perelman from the Institute of Molecular Biology, Novosibirsk, Russia has provided us with BAC maps of the alpaca genome to facilitate RACA chromosome-level assemblies of camel genomes. |
Impact | Denis Larkin's group performed an initial reference-assisted chromosome assembly of the dromedary camel genome (Fitak et al. 2016) using RACA. The tool assembled 1,797 scaffolds (10 Kb minimum size) into 154 predicted chromosome fragments (PCFs), of which one was homologous to a complete cattle chromosome (chromosome 25). The longest PCF was 112 Mb long and contained 97 scaffolds; the shortest was 117 Kb long and contained two scaffolds. The N50 of the initial RACA assembly was 31.2 Mb, which is 21 times higher than the N50 of 1.48 Mb for the original assembly. The total length of the assembled PCFs was 1,886 Mb, or 94% of the original dromedary scaffold-based assembly. RACA split 44 (2%) scaffolds as potentially "chimeric". All split scaffolds are currently being verified by PCR prior to running the second (final) round of RACA, in which all scaffolds with confirmed structure will be kept intact. The paper describing this work, with our master's student as first author, has been published in Frontiers in Genetics in a special collection dedicated to camel genomics (PMID: 30804979). |
Start Year | 2016 |
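The N50 comparison reported for the camel RACA assembly can be illustrated with a minimal sketch. N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the total assembly length. The function name `n50` and the toy length lists below are assumptions for illustration only; they are not part of RACA and are not the camel data. Only the final ratio uses the two N50 values quoted in the report.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches at least
    half of the summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy values (hypothetical, arbitrary units): seven short scaffolds that
# are joined into two longer fragments with the same total length.
toy_scaffolds = [1, 1, 2, 2, 3, 3, 4]
toy_fragments = [7, 9]

scaffold_n50 = n50(toy_scaffolds)
fragment_n50 = n50(toy_fragments)
toy_fold_change = fragment_n50 / scaffold_n50

# Ratio of the two N50 values quoted in the report (31.2 Mb and 1.48 Mb).
reported_fold_change = 31.2 / 1.48

print(scaffold_n50, fragment_n50, toy_fold_change)
print(round(reported_fold_change))
```

Joining scaffolds leaves the total length unchanged but raises the N50, which is why the statistic is a convenient summary of how much contiguity an assembly step has added.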
Description | Fox RACA genome |
Organisation | University of Illinois at Urbana-Champaign |
Department | Department of Animal Sciences |
Country | United States |
Sector | Academic/University |
PI Contribution | We worked with the group at UIUC to build a RACA assembly of the fox genome. We provided our pipelines and support in using them. |
Collaborator Contribution | The group at UIUC used our tools to build the silver fox RACA genome assembly. |
Impact | Reference-assisted assembly of the fox genome has been built. The manuscript describing the results was written and published in the journal Genes (PMID: 29925783). |
Start Year | 2015 |
Description | Mammalian ancestral genome reconstructions |
Organisation | University of California, Davis |
Department | Department of Evolution and Ecology |
Country | United States |
Sector | Academic/University |
PI Contribution | In collaboration with the group of Prof. Harris Lewin at UCD, we have designed a novel algorithm to reconstruct the structures of ancestral animal chromosomes. Dr. Larkin is a co-corresponding author on the paper published in PNAS in 2017. He and members of his group contributed directly to the design of the algorithm and its application to 19 mammalian genomes to reconstruct ancestral genomes in the lineage leading to human. Dr. Larkin's group has also applied this algorithm to ruminant genomes, with the manuscript published in Genome Research in 2019 (D. Larkin is a corresponding author), and works with the UCD group and the company Dovetail to improve the quality of the mammalian genome assemblies upgraded by Prof. Lewin to Dovetail scaffolds. The latter is done using a combination of FISH, our reference-assisted assembly algorithm (RACA), and Hi-C. |
Collaborator Contribution | Prof. Lewin is paying for the upgrades of mammalian genomes to Dovetail superscaffolds (~10K USD per genome) and coordinates the collaborative project. Prof. Lewin contributed to the interpretation of the data produced during our ruminant genome analysis. Prof. Graphodatsky was involved in the fluorescence in situ hybridisation of cattle BACs on several ruminant genomes. |
Impact | A new assembly algorithm has been designed and applied to reconstruct the chromosomal structures of several ancestral genomes in the lineage leading to human. The paper describing this approach and its results has been published in PNAS (PMID: 28630326). Visualizations of the reconstructed assemblies are available from our Evolution Highway comparative chromosome browser. Reconstructed ruminant genomes were published in Genome Research (PMID: 30760546). This work was multidisciplinary, as it involved bioinformatic analysis of sequenced mammalian genomes and fluorescence in situ hybridisation to verify the reconstructions and to infer ancestral genomes that were not available at the sequence level. |
Start Year | 2015 |
Description | Muntjac rearrangements |
Organisation | 3i Consortium |
Country | United Kingdom |
Sector | Multiple |
PI Contribution | The group of Dr. Helder Maiato specialises in the study of mitosis using a combination of cytogenetic and molecular biology techniques. They use the Indian muntjac as a model to study the mechanisms of mitosis. Dr. Larkin's group holds genome assemblies of two muntjac species. These were used to design anti-sense RNA sequences to knock out muntjac genes related to mitosis. Dr. Larkin's group identified the muntjac sequences for 200+ mitotic genes and passed that information to Dr. Maiato's group. |
Collaborator Contribution | Dr. Maiato's group helps Dr. Larkin to identify genes that have contributed to the extremely high level of chromosomal rearrangements found in the Indian muntjac (2n=6). They are knocking out, in muntjac cell lines, the genes identified as top candidates for this process in Dr. Larkin's lab from RACA-assisted chromosome assemblies of muntjac genomes. |
Impact | This is a multidisciplinary collaboration bringing together cell molecular geneticists and bioinformaticians. So far 200+ mitotic genes have been knocked out and tested in Indian muntjac cell lines to reveal their effect on mitosis. The paper was published in Current Biology (PMID: 29706521). In addition, five top candidate genes are currently being investigated for their contribution to spontaneous chromosome fusions in muntjac cell lines. |
Start Year | 2016 |
Description | Cat genomes |
Organisation | Saint Petersburg State University |
Country | Russian Federation |
Sector | Academic/University |
PI Contribution | We are using our pipelines (including RACA) and research tools to improve the genome assemblies of large cats as part of a collaboration with Steve O'Brien's research group. |
Collaborator Contribution | Dr. O'Brien's research group contributes to the analysis of ruminant genomes that our group is performing. They provide their expertise in gene annotation, population demography and segmental duplication identification, and use their resources to do this work for us. |
Impact | We built the cat genome Evolution Highway to visualise chromosome structure in the cat and other feline species. |
Start Year | 2013 |
Title | DESCRAMBLER |
Description | A novel algorithm to reconstruct ancestral chromosome structures from both completely assembled and fragmented animal genomes has been developed. |
Type Of Technology | Software |
Year Produced | 2017 |
Impact | Complete chromosome structures of seven mammalian ancestors for the lineage leading to human have been reconstructed. The paper is under review at PNAS. |
Title | G-Anchor |
Description | G-Anchor is a new software tool designed to quickly align complete mammalian genomes using conserved non-coding elements. |
Type Of Technology | Software |
Year Produced | 2017 |
Impact | A paper describing the tool was accepted by GigaScience. |
URL | http://gigadb.org/dataset/view/id/100415/token/jod2SJaptAFTEPZw |
Description | ConGen2016 |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Dr. Larkin taught at ConGen 2016: Recent Advances in Conservation Genetics, held in May 2016 at the Balaton Limnological Institute, Lake Balaton, Hungary. 35 postgraduate and undergraduate students attended the school, and many of them were interested in starting to use the reference-assisted assembly algorithm in their research. |
Year(s) Of Engagement Activity | 2016 |
URL | http://congen2016.com/ |
Description | Night at the RVC |
Form Of Engagement Activity | Participation in an open day or visit at my research institution |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | About 300 people attended our booth at the Night at the RVC event. We presented our software tool that teaches the general public the principles of chromosome evolution. |
Year(s) Of Engagement Activity | 2015 |
Description | School for young scientists 2016 |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | More than 100 participants from the countries of the former Soviet Union attended the School for young scientists held in Zvenigorod, Russia in 2016. Dr. Larkin gave an invited lecture on the current status of animal genome studies, which prompted many questions from the audience and follow-up discussions. The organising committee has requested that a review paper based on the lecture be written and published. |
Year(s) Of Engagement Activity | 2016 |