Rapid construction of reference chromosome-level mammalian genome assemblies and insights into the mechanisms of gross genomic rearrangement

Lead Research Organisation: Royal Veterinary College
Department Name: Comparative Biomedical Sciences CBS

Abstract

We live in an era in which the genomes of new species are being sequenced all the time. The most modern ways to sequence DNA have many advantages over older approaches (the prominent one being a vastly reduced cost) but a problem that arises each time the genome of a new species is sequenced is that assigning large blocks of sequence to an overall genomic "map" can be problematic and/or very expensive. It's a little like finding your location on Google Maps but not being able to "zoom out" to establish where that position is in relation to the whole country. In essence the aim of this project is to rectify this problem at one fifth of the current cost. Using our experience with birds we have developed a high-throughput approach and the tools for assigning the sequences to their proper positions in chromosomes. This involves our own adaptations to a technique called "FISH" that can take the data from sequenced genomes and visualize directly blocks of DNA sequence as they appear in their rightful place in the genome. In this study we will focus our attention on 25 newly sequenced mammal species. More importantly however we will provide the means through which this can be achieved for any of the 5,000 living mammalian species. Mammals are important to our lives in that many are models for human disease and development and are critical to agriculture (both meat and milk). Others are threatened or endangered and, with impending global warming, molecular tools for the study of their ecology and conservation are essential. Our combined efforts have also developed computer-based browser methods to compare the overall structure of one genome with another, directly visualizing the similarities and differences between the genomes of several animals at a time, something we can share widely amongst the scientific community and general public via the world wide web. The differences between mammalian genomes arose through changes that happened during evolution. 
One of the main aims of this project is to find out how this occurred and what the implications of these changes are. We have a number of ideas; for example, we think there may be different "signatures" that explain why blocks of genes tend to stay together during evolution. Armed with this information, we fully intend to take it out into the world. The devices that we will develop can be adapted for the screening of individual animals for genomic rearrangements that may cause, e.g., breeding problems. Moreover, the resources we will develop provide a source for public information and student learning through a dedicated, outward-facing web site. We have received overwhelming support from numerous laboratories all over the world who are interested in using the resources that we will develop to ask biological questions of their own. For this reason, we feel that this project will help us understand evolution in mammals and contribute to establishing the UK as a central international hub of mammalian genomics.

Technical Summary

Unless a whole genome sequence is assembled to the level of one "(super)scaffold" per chromosome, the resultant assembly can be studied for gene structure and function but cannot be used effectively to address biological questions pertaining to critical aspects of evolutionary and applied biology. Multiple letters of support for this application attest to this. Contemporary genome sequencing projects, however, usually fall short of this "chromosome level" assembly unless supported by extensive funding resources (~$100,000/genome). In reality, with the genomes of more animals being sequenced but with limited resources, this problem will only increase unless lower cost solutions can be found. Recently we have, in birds, developed means of taking sub-chromosomal sized scaffold-based assemblies (e.g. enhanced by Dovetail or bioinformatically by RACA) and "upgrading" them to chromosome level at a fraction (~20%) of the cost. This approach involves a novel method of selecting BAC clones that will hybridise to any mammalian metaphase, followed by multiplex adaptations of FISH approaches. Mammals are the most studied phylogenetic Class; however, only ~25/5000 species have sequenced genomes assembled to chromosome level. Indeed, most recent de novo sequencing projects typically produce assemblies of several super-scaffolds per chromosome. Our approach will upgrade 25 further genomes and provide both proof of principle and the practical means through which many hundreds more can be mapped and compared. Our approach will allow easy comparative visualization of multiple genome assemblies and testing of fundamental hypotheses pertaining to the importance of overall chromosome structure in the formation of lineage-specific and ancestral phenotypes and the conservation of blocks of homologous synteny whose functional and sequence features define phenotypic traits with medical, veterinary or agricultural relevance.

Planned Impact

At the core of this application is a commitment to high impact activity, specifically benefitting industry (UK plc), academia, the third sector and the general public (academic beneficiaries are dealt with in another section). The primary industrial supporter (and beneficiary) of this research is Cytocell Ltd who specialize in the development of multiple hybridization FISH probes. Building on a long-standing collaboration initiated by a Knowledge Transfer Partnership for the development of non-human probes, the company is very interested in our approach as it will lead to new product development and maximize the potential of the human BAC collection present in the company. After extensive market research we have collectively identified "chromosome evolution Multiprobe devices" and a range of individual animal translocation screening devices. Cytocell's generous in-kind contribution is outlined in the application and, as clearly stated, represents a genuine partnership incorporating real cash-equivalent contributions designed to maximize our collective skills to bring cross species hybridization probes to market and thus ultimately to the scientific community. Going into partnership with a company in this way means that the highest possible quality product can reach the widest market worldwide.

Digital Scientific UK have identified considerable benefit in collaborating on this project through the development of its new animal karyotyping software suites as a contribution to this project. They have generously agreed to provide these free of charge. Their new "Batch Capture" protocols, which integrate microscope hardware with their in-house algorithms for multiple FISH capture, are normally charged to customers at market price, but the company have kindly donated unlimited-use software to this project. Both these companies also see this project as a means of working together with one another more closely, adding to their R&D portfolio and thereby increasing their share value and the value of UK plc. Finally, Dovetail see value to their company and efforts to generate contiguous chromosome assemblies, seeing our approach as entirely complementary to theirs.

A gap in perception exists in understanding the role of gross chromosomal evolution in academia and industry. While in academia it is accepted that chromosome structures play an important role in gene regulation, industry application is still focused mostly on protein changes and ignores many other features of the genome. Our project will aim to start changing this perception by providing popular resources and outreach activities for non-scientists. These resources and events will hopefully have influence on the general public including the future policy makers (see Pathways to impact for details). Therefore, we expect to have an impact on future policies in animal sciences.

The third sector (museums) will benefit from our project through the inclusion of mammalian chromosome evolution histories in interactive tools aimed at student education and popular science exhibitions in museums. One such tool, which we recently built with ESEB, is called 'Evolution Factory' and teaches schoolchildren the principles of chromosome and genome evolution. A more advanced version of the tool is an interactive screen that we are developing with a group from the University of California at Davis to be displayed in the San Francisco Exploratorium. After the tool is developed and tested we will also approach the London Science Museum to investigate their interest in using this and other interactive games we develop for their exhibitions.
 
Description During the three years of the project we built alignments of over 20 mammalian species against the human and cattle genomes and identified conserved sequence elements. These elements were used in two ways: 1. We evaluated the results of fluorescence in situ hybridisation of human and cattle BAC clones obtained from our grant partners and made a selection of BAC clones for them to hybridise on chromosomes of multiple mammalian species. Based on the results of these hybridisations, conclusions were drawn regarding the genomic features of the universal probes which should be suitable for all mammals. 2. These conserved sequences were utilised in our analysis of chromosome evolution in ruminant genomes in order to understand whether chromosome breakage in evolution is related to changes in gene regulation used by natural selection to produce adaptive phenotypes and, eventually, new species. This work also utilised some of the genome alignments we built. In addition, we looked at the patterns of regulatory sequence (enhancer) changes near evolutionary breakpoints in ruminants and other species. Our findings demonstrated that near evolutionary breakpoints gene regulation is significantly different between species due to changes in enhancer and conserved element profiles caused by the insertion of transposable elements. Our paper on this subject was published in Genome Research in 2019 (Farre et al., 2019a).
We then upgraded three fragmented mammalian assemblies to chromosome level. The first was the genome of the gemsbok, a species adapted to survive the very hot climates of Africa. Its genome will help us and others to reveal adaptations to hot climates (Farre et al., 2019b). The second is the genome of the Dromedary camel (Ruvinsky et al., 2019). This is an economically important species. Its chromosome level assembly could be used to improve camel breeds, look for milk-production QTLs and to understand camel adaptations. The genomes of the gemsbok and camel were assembled using our Reference Assisted Chromosome Assembly (RACA) algorithm combined with PCR verification of the chromosome assemblies. For the giraffe we also utilised our preliminary panel of 140 cattle "conserved" BAC clones in addition to RACA and PCR-based verification of the reconstructed chromosome structures (Farre et al., 2019c). This genome will be utilised to understand the biology of giraffes, their adaptations and unique features. In collaboration with the Broad Institute and Prof. Harris Lewin we worked on assembling the chevrotain genome. This species is a primitive ruminant. Its genome contains clues to the formation of the unique ruminant features which made ruminants such popular livestock. The Illumina Discovar fragmented assembly produced by the Broad Institute was upgraded using the Dovetail Chicago method. In addition, we placed over 200 BAC clones on chevrotain chromosomes to produce a chromosome level, reference quality assembly for this species. We also made a HiC assembly for chevrotain using two different algorithms, the Dovetail HiRise and the JuiceToolBox assembler. This allowed us to compare two independently built chromosome assemblies for chevrotain and spot issues with the HiRise assembler. The discrepancies between the assemblies were checked in the laboratory of our collaborator Prof. Graphodatsky using a set of our evolutionarily conserved BAC clones. 
We found that the HiRise assembler was too conservative in placing scaffolds at the ends of chevrotain chromosomes. This resulted in many of them being missed. Using our BAC clones and synteny comparison we identified the missed scaffolds and their positions at the ends of chevrotain chromosomes. The JuiceToolBox assembler, on the other hand, made one wrong join connecting autosomes to chromosome X in chevrotain. Using BACs we corrected this error. As a result, we built a high-quality consensus chevrotain chromosome-level genome assembly. This assembly was used: 1) to reconstruct the chromosome structure of the ancestral ruminant and pecoran genomes; 2) to understand the biology of a chevrotain-specific phenotype (the smallest red blood cells among all mammals); 3) to investigate the activities of the microbiome in chevrotain in comparison to other ruminant and non-ruminant species. We found that the ruminant ancestor had an additional cattle chromosome merge which had never been reported before. The reason is that this merge was present in the chevrotain and outgroup species only and, therefore, was not reported in studies lacking a good chevrotain genome assembly. We found that genes controlling the red blood cell cytoskeleton were under strong positive selection in chevrotain compared to other mammals. In addition, genes involved in blood pumping by the heart and the cardiovascular system were under selection, probably in response to the viscous blood in chevrotain due to the presence of many tiny red blood cells. Interestingly, the activity of the microbiome, based on the composition of microbe species in the chevrotain stomach, was intermediate between the other ruminants and non-ruminants, in line with its intermediate phylogenetic position and diet (Poppleton et al., submitted). 
Overall, our work has demonstrated the need for conserved BAC probes in correcting genome assemblies made with contemporary methods, and how these genome assemblies could decode the genetic background of lineage- and species-specific phenotypes. In addition, we generated a chromosome level assembly for the Indian Muntjac utilising the HiC method and our conserved BAC clones. We generated 10x Genomics de novo genome assemblies for the Tufted deer, Fea's muntjac and Black muntjac. These assemblies were built with N50s suitable for further improvement with the HiC method and our BAC clone panel. These species are important for studying the patterns of chromosome evolution in mammals and chromosomal fusions in muntjacs and deer. In 2019 we established a new collaboration with Prof. Juha Kantanen (Finland) to upgrade the reindeer genome to chromosome level using our conserved BAC panel. In 2019 we also modified how the conserved BAC clones are selected for mammalian FISH experiments. We added the nucleotide binding ability (estimated from genome alignments to a range of mammals) to the >30 criteria used for the initial BAC selection. Fifty new BAC clones were selected using this additional criterion and tested in the lab of Prof. Griffin on several mammalian species. Based on these results we identified BACs that hybridise well on mammalian chromosomes and applied a decision tree approach to identify the criteria which should be used. The results indicated that for successful mapping on a range of mammals the BACs should be within the 130-170 Kbp size range, have >190 conserved elements and bind well to the Hyrax genome. Using these selection criteria, an additional set of 160 BAC clones has been selected to close the remaining gaps on the mammalian chromosomes in our panel. These BACs were used to build the final panel of clones (over 250) to be used by Prof. Griffin's lab and our collaborators (Prof. Graphodatsky) for mapping on a range of mammalian species. All our upgraded genomes are either available from the NCBI and/or the Evolution Highway Chromosome browser, or will be upon acceptance of the corresponding publication(s).
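The three selection criteria reported above (130-170 Kbp size, >190 conserved elements, good binding to the Hyrax genome) can be illustrated as a simple filter. This is a minimal sketch, not the project's actual decision tree: the record does not say how "bind well" was quantified, so the `hyrax_binding` score, its 0-1 scale and the 0.8 cut-off are hypothetical, as are the class and function names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BacClone:
    name: str                # hypothetical clone identifier
    size_kbp: float          # insert size in Kbp
    conserved_elements: int  # number of conserved elements within the clone
    hyrax_binding: float     # ASSUMED 0-1 score for predicted binding to the Hyrax genome


def passes_selection(bac: BacClone, min_hyrax_binding: float = 0.8) -> bool:
    """Apply the three reported criteria; the binding cut-off is an assumption."""
    return (
        130 <= bac.size_kbp <= 170
        and bac.conserved_elements > 190
        and bac.hyrax_binding >= min_hyrax_binding
    )


def select_bacs(bacs: List[BacClone], min_hyrax_binding: float = 0.8) -> List[BacClone]:
    """Return the clones that satisfy all criteria, preserving input order."""
    return [b for b in bacs if passes_selection(b, min_hyrax_binding)]
```

For example, a 150 Kbp clone with 220 conserved elements and a binding score of 0.9 would be kept, whereas a 180 Kbp clone would be rejected regardless of its other values.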
Exploitation Route Our grant partners are using our recommendations as a guide for selecting BAC clones (or to use our panel of BAC clones) to be tested for multi-species hybridisation experiments.

The chromosome level assemblies produced by us in the course of this grant could be utilised to study evolution, adaptations and specific traits in a number of mammalian species.

Our ruminant Evolution Highway website and the UCSC genome hub are excellent resources for students to study chromosome evolution.
Sectors Agriculture, Food and Drink, Education

 
Description As part of this project, the first chromosome level assembly of the dromedary camel has been developed and made publicly available. It has been used to detect genetic variants associated with economically important traits in camels, thereby contributing to the camel breeding industry.
First Year Of Impact 2019
Sector Agriculture, Food and Drink
Impact Types Economic

 
Description Resequencing of Russian cattle and sheep breeds adapted to cold climates
Amount 24,000,000 ₽ (RUB)
Organisation Russian Science Foundation 
Sector Public
Country Russian Federation
Start 04/2019 
End 03/2023
 
Title Chevrotain chromosome-level assembly 
Description We generated a chromosome-level genome assembly for Chevrotain using the pre-existing short-read Illumina assembly and our Dovetail Chicago and HiC data, enhanced by placements of our universal mammalian BAC clones. 
Type Of Material Biological samples 
Year Produced 2020 
Provided To Others? Yes  
Impact We are using this assembly together with our collaborators to understand the early evolution of ruminants 
 
Title Chromosome level assembly of Dromedary Camel 
Description A near chromosome level assembly of the Dromedary camel has been produced using a combination of the Reference Assisted Chromosome Assembly, PCR verification of the reconstructed chromosomes, and comparison with FISH and physical maps of camel and alpaca. 
Type Of Material Biological samples 
Year Produced 2019 
Provided To Others? Yes  
Impact A paper describing the dromedary camel genome was published in Frontiers in Genetics. 
 
Title Chromosome level assembly of the gemsbok genome 
Description A near chromosome level assembly of the gemsbok genome was constructed using a combination of the Reference-Assisted Chromosome Assembly tool and PCR verification of reconstructed chromosomes. 
Type Of Material Biological samples 
Year Produced 2018 
Provided To Others? Yes  
Impact A chromosome level assembly of the gemsbok, a ruminant highly adapted to desert conditions, has been published in GigaScience. 
URL http://eh-demo.ncsa.uiuc.edu/ruminants
 
Title Genome alignments of mammalian genomes against the cattle genome on the UCSC genome browser 
Description Alignments of 15 mammalian genomes against the cattle genome visualised on the UCSC Genome Browser. The alignments were obtained using the lastz aligner and parsed with the Kent utility tools 
Type Of Material Biological samples 
Year Produced 2019 
Provided To Others? Yes  
Impact A paper in Genome Research was published demonstrating effects of evolutionary rearrangements on the expression of nearby genes. 
URL http://sftp.rvc.ac.uk/rvcpaper/ruminantsHUB/hub.txt
 
Title Genome alignments of mammalian genomes and reconstructed ancestors on Evolution Highway 
Description 1. Visualisation of homologous synteny between the cattle genome and 12 additional mammalian species on our Evolution Highway Comparative Chromosome Browser was achieved by parsing lastz genome alignments with the Kent utilities and the maf2synteny tool to build comparative synteny blocks. 2. Visualisation of homologous synteny between reconstructed cetartiodactyl, ruminant, and pecoran ancestors (reconstructed with DESCHRAMBLER software) and the extant mammalian genomes and other reconstructed ancestors. 
Type Of Material Biological samples 
Year Produced 2019 
Provided To Others? Yes  
Impact A paper was published in Genome Research demonstrating that chromosome rearrangements in ruminants have a functional effect on the expression of nearby genes. 
URL http://eh-demo.ncsa.uiuc.edu/ruminants/
 
Title Giraffe near chromosome level genome assembly 
Description We generated a near chromosome-level giraffe genome assembly using a combination of Illumina short read sequencing, Dovetail Chicago, RACA, and our conserved FISH BAC clones. The clones were used to check the assembly quality and assign our chromosomal fragments to chromosomes. 
Type Of Material Biological samples 
Year Produced 2019 
Provided To Others? Yes  
Impact The assembly is used by other research groups to understand giraffe biology and adaptation. 
 
Title Indian muntjac chromosome-level assembly 
Description We used short-read Illumina technology, Dovetail Chicago and Dovetail HiC methods and our universal mammalian BAC clones to generate a chromosome level Indian muntjac genome assembly. 
Type Of Material Biological samples 
Year Produced 2019 
Provided To Others? Yes  
Impact The assembly is currently being used by our collaborators (Dr. Farre and Prof. Lewin) to understand muntjac biology. 
 
Title Mammalian genomes Evolution Highway Comparative Chromosome Browser 
Description We built a database containing visualization of homologous synteny for mammalian genomes assembled with Illumina scaffolding, Dovetail Chicago and Dovetail HiC methods. This database contains over 20 genomes aligned to the cattle, goat and human genomes. The data is utilised to assemble these genomes to chromosome level and to verify assemblies. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? No  
Impact A subset of this database containing the ruminant genomes was utilised in our recent publication on chromosome evolution in ruminants (Farre et al., Genome Research 2019) 
 
Title Supporting data for "A Near-Chromosome Scale Genome Assembly of the Gemsbok (Oryx gazella): An Iconic Antelope of the Kalahari Desert" 
Description The gemsbok (Oryx gazella) is one of the largest antelopes in Africa. Gemsbok are heterothermic and thus highly adapted to live in the desert, changing their feeding behavior when faced with extreme drought and heat. A high-quality genome sequence of this species will assist efforts to elucidate these and other important traits of gemsbok and facilitate research on conservation efforts. Using 180 Gbp of Illumina paired-end and mate-pair reads, a 2.9 Gbp assembly with scaffold N50 of 1.48 Mbp was generated using SOAPdenovo. Scaffolds were extended using Chicago library sequencing, which yielded an additional 114.7 Gbp of DNA sequence. The HiRise assembly using SOAPdenovo + Chicago library sequencing produced a scaffold N50 of 47 Mbp and a final genome size of 2.9 Gbp, representing 90.6% of the estimated genome size and including 93.2% of expected genes according to BUSCO analysis. The Reference-Assisted Chromosome Assembly tool (RACA) was used to generate a final set of 47 predicted chromosome fragments with N50 of 86.25 Mbp and containing 93.8% of expected genes. A total of 23,125 protein-coding genes and 1.14 Gbp of repetitive sequences were annotated using de novo and homology-based predictions. Our results provide the first high-quality, chromosome-scale genome sequence assembly for gemsbok, which will be a valuable resource for studying adaptive evolution of this species and other ruminants. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
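The scaffold N50 values quoted in the gemsbok record above (1.48 Mbp before and 47 Mbp after Chicago scaffolding) follow the usual definition: the length of the scaffold at which the cumulative length, counted from the longest scaffold down, first reaches half of the total assembly length. A minimal sketch of that calculation is given below; the function name and the toy lengths in the usage note are illustrative only and are not taken from the project's data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one scaffold length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first scaffold that brings the cumulative
        # length to at least half of the total assembly length.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")
```

With toy scaffold lengths of 5, 4, 3, 2 and 1 (total 15), the two longest scaffolds together cover 9, which is more than half of the total, so `n50([5, 4, 3, 2, 1])` returns 4.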
 
Title Supporting data for "An integrated chromosome-scale genome assembly of the Masai Giraffe (Giraffa camelopardalis tippelskirchi)" 
Description The Masai giraffe (Giraffa camelopardalis tippelskirchi) is the largest-bodied giraffe and the world's tallest terrestrial animal. With its extreme size and height, the giraffe's unique anatomical and physiological adaptations have long been of interest to diverse research fields. Giraffes are also critical to ecosystems of sub-Saharan Africa, with their long neck serving as a conduit to food sources not shared by other herbivores. Although the genome of a Masai giraffe has been sequenced, the assembly was highly fragmented and unsuitable for the analysis of chromosome evolution. Herein we report an improved giraffe genome assembly to facilitate evolutionary analysis of the giraffe and other ruminant genomes.
Using SOAPdenovo2 and 170 Gbp of Illumina paired-end and mate-pair reads we generated a 2.6 Gbp female Masai giraffe genome assembly, with a scaffold N50 of 3 Mbp. The incorporation of 114.6 Gbp of Chicago library sequencing data resulted in a HiRise SOAPdenovo + Chicago assembly with an N50 of 48 Mbp and containing 95% of expected genes according to BUSCO analysis. Using the Reference-Assisted Chromosome Assembly tool, we were able to order and orient scaffolds into 42 predicted chromosome fragments (PCFs). Using fluorescence in situ hybridization we hybridized 153 cattle BACs onto giraffe metaphase spreads to assess and place the PCFs on 14 giraffe autosome and 10 sex chromosome fragments. In this assembly, 21,621 protein-coding genes were identified and 1 Gbp of repetitive sequences annotated using both de novo and homology-based predictions.
Our results provide the first chromosome-scale genome assembly for the Masai giraffe. This assembly will be a valuable tool to elucidate the evolution and molecular basis of adaptive traits of the Giraffidae, and will provide a new genomic resource to assist conservation efforts. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
 
Description Camel chromosome level genome assemblies 
Organisation University of Veterinary Medicine Vienna
Department Department of Pathobiology
Country Austria 
Sector Academic/University 
PI Contribution Dr. Larkin served as a consultant at an IAEA meeting in Vienna dedicated to the construction of radiation hybrid chromosomal maps for camel species. As a result of this meeting it was decided that Dr. Larkin's group would be responsible for constructing reference-assisted assemblies of the dromedary and Bactrian camels. Dr. Larkin appointed an RVC master's student, who is now carrying out this project. In addition, Dr. Larkin is now a partner on the ongoing FWF-RSF application to study chromosome evolution and selection in camel breeds in Central Asia. In 2018 the reference-assisted assembly of the Dromedary camel was finished and published.
Collaborator Contribution Dr. Pamela Burger from the University of Veterinary Medicine in Vienna has provided us with the genome assemblies and raw read data to perform reference-assisted chromosome assemblies of the dromedary camel. She also provided us with the DNA samples required to verify the RACA camel assemblies. Dr. Polina Perelman from the Institute of Molecular Biology, Novosibirsk, Russia has provided us with BAC maps of the alpaca genome to facilitate RACA chromosome-level assemblies of camel genomes.
Impact Denis Larkin's group performed an initial reference-assisted chromosome assembly using RACA for the dromedary camel genome (Fitak et al. 2016). The tool assembled 1,797 scaffolds (10 Kb minimum size) into 154 predicted chromosome fragments (PCFs), of which one was homologous to a complete cattle chromosome (chromosome 25). The longest PCF was 112 Mb long, containing 97 scaffolds, and the shortest PCF was 117 Kb long, containing two scaffolds. The N50 of the initial RACA assembly was 31.2 Mb, which is 21 times higher than the N50 of 1.48 Mb of the original assembly. The total length of the assembled PCFs was 1,886 Mb, or 94% of the original dromedary scaffold-based assembly. RACA split 44 (2%) scaffolds as potentially "chimeric". All split scaffolds are currently being verified by PCR prior to running the second (final) round of RACA, where all scaffolds with confirmed structure will be kept intact. The paper describing this, with our master's student as first author, has been published in Frontiers in Genetics in a special collection dedicated to camel genomics (PMID: 30804979).
Start Year 2016
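The summary figures in the camel record above (a 21-fold N50 improvement and 2% of scaffolds split) can be reproduced from the reported numbers with simple arithmetic. The sketch below only restates values given in that record; the helper names are hypothetical, and the input total implied by the 94% figure is a derived estimate, not a value reported by the authors.

```python
def fold_change(new_value: float, old_value: float) -> float:
    """Ratio of an improved metric to its original value."""
    return new_value / old_value


def percent(part: float, whole: float) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


# Values reported for the initial RACA dromedary assembly.
n50_improvement = fold_change(31.2, 1.48)   # RACA N50 vs original N50, both in Mb
split_scaffolds_pct = percent(44, 1797)     # scaffolds flagged as potentially chimeric
mean_scaffolds_per_pcf = 1797 / 154         # average number of scaffolds per PCF

# Derived estimate only: total input length implied by "1,886 Mb or 94%".
implied_input_total_mb = 1886 / 0.94

print(f"N50 improvement: {n50_improvement:.1f}x")
print(f"Split scaffolds: {split_scaffolds_pct:.1f}%")
print(f"Mean scaffolds per PCF: {mean_scaffolds_per_pcf:.1f}")
print(f"Implied input assembly length: {implied_input_total_mb:.0f} Mb")
```

Rounded to whole numbers, the first two results match the 21-fold and 2% figures quoted in the record.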
 
Description Chevrotain genome assembly 
Organisation Russian Academy of Sciences
Department Institute of Cytology of the Russian Academy of Sciences
Country Russian Federation 
Sector Public 
PI Contribution Generating Hi-C data for the Chevrotain genome assembly
Collaborator Contribution Using JuiceBox tools to assemble Illumina scaffolds to chromosomes with our HiC data
Impact Chevrotain chromosome-level genome assembly
Start Year 2019
 
Description Chevrotain genome assembly and annotation 
Organisation University of Kent
Country United Kingdom 
Sector Academic/University 
PI Contribution Assembling and analysing the Chevrotain genome; making genome alignments to be run by Dr. Farre's group on our DESCHRAMBLER tool.
Collaborator Contribution Running the DESCHRAMBLER tool and analysing the reconstructed ancestral ruminant genomes
Impact Not yet.
Start Year 2020
 
Description CytoCell 
Organisation Cytocell Ltd
Country United Kingdom 
Sector Private 
PI Contribution We established a formal collaboration with CytoCell, who are providing an in-kind contribution to support this project
Collaborator Contribution CytoCell is providing us with BAC clones to be hybridised on mammalian chromosomes to identify conserved probes to be used for genome mapping
Impact A set of 200 BAC probes has been transferred to our partner's laboratory (Darren Griffin, University of Kent).
Start Year 2017
 
Description Mammalian ancestral genome reconstructions 
Organisation University of California, Davis
Department Department of Evolution and Ecology
Country United States 
Sector Academic/University 
PI Contribution In collaboration with the group of Prof. Harris Lewin at UCD we have designed a novel algorithm to reconstruct the structures of animal ancestral chromosomes. Dr. Larkin is a co-corresponding author on the paper published in PNAS in 2017. He and members of his group directly contributed to the design of the algorithm and its application to 19 mammalian genomes to reconstruct ancestral genomes in the lineage leading to human. Dr. Larkin's group has also applied this algorithm to the ruminant genomes, with the manuscript published in Genome Research in 2019 (D. Larkin is a corresponding author), and works with the UCD group and the company Dovetail to improve the quality of Dovetail assemblies for mammalian genomes upgraded by Prof. Lewin to Dovetail scaffolds. The latter is done using a combination of the FISH technique, our reference-assisted assembly algorithm (RACA) and the HiC approach.
Collaborator Contribution Prof. Lewin is paying for the upgrades of mammalian genomes to Dovetail superscaffolds (~10K USD per genome) and coordinates the collaborative project. Prof. Lewin contributed to interpretation of the data produced during our Ruminant genome analysis. Prof Graphodatsky was involved in fluorescence hybridization of cattle BACs on several ruminant genomes.
Impact A new assembly algorithm has been designed and applied to reconstruct chromosomal structures of several ancestral genomes in the lineage leading to human. The paper describing this approach and its results has been published in PNAS (PMID: 28630326). The visualizations of the reconstructed assemblies are available from our Evolution Highway comparative chromosome browser. Reconstructed ruminant genomes were published in Genome Research (PMID: 30760546). This work was multidisciplinary as it involved bioinformatic analysis of sequenced mammalian genomes and fluorescence in situ hybridisation to verify the reconstructions and to infer ancestral genomes which were not available at sequence level.
Start Year 2015
 
Description FutureBiotech school for young scientists 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The FutureBiotech school for young scientists provided me with the opportunity to tell students about our studies, discuss how to build genome assemblies, share my career experience and convince some students to work with us
Year(s) Of Engagement Activity 2020
 
Description Night at the RVC 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact About 60 people attended our stand on mammalian genome evolution and adaptation during the Night at the RVC event. People asked questions about evolution and the mechanisms of adaptation. Our stand was reported as one of the best at the event.
Year(s) Of Engagement Activity 2019
 
Description School of young scientists 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact More than 100 participants from the countries of the former Soviet Union attended the School for young scientists held in Zvenigorod, Russia in 2018. Dr Larkin gave an invited lecture on the current status of animal genome studies, which prompted many questions from the audience and follow-up discussions. The organising committee has requested that a review paper based on the lecture be written and published.
Year(s) Of Engagement Activity 2018