BioSolr: addressing the challenges in making biomedical data easily accessible using the world-leading Apache-Solr search-engine framework
Lead Research Organisation:
EMBL - European Bioinformatics Institute
Department Name: Protein Data Bank in Europe
Publications

Adams PD
(2019)
Announcing mandatory submission of PDBx/mmCIF format files for crystallographic depositions to the Protein Data Bank (PDB).
in Acta crystallographica. Section D, Structural biology

Armstrong DR
(2020)
PDBe: improved findability of macromolecular structure data in the PDB.
in Nucleic acids research

Berman HM
(2014)
The Protein Data Bank archive as an open data resource.
in Journal of computer-aided molecular design

Berman HM
(2016)
The archiving and dissemination of biological structure data.
in Current opinion in structural biology

Velankar S
(2021)
Structural Proteomics - High-Throughput Methods

Velankar S
(2016)
PDBe: improved accessibility of macromolecular structure data from PDB and EMDB.
in Nucleic acids research

Westbrook JD
(2015)
The chemical component dictionary: complete descriptions of constituent molecules in experimentally determined 3D macromolecules in the Protein Data Bank.
in Bioinformatics (Oxford, England)

WwPDB Consortium
(2019)
Protein Data Bank: the single global archive for 3D macromolecular structure data.
in Nucleic acids research

Young JY
(2017)
OneDep: Unified wwPDB System for Deposition, Biocuration, and Validation of Macromolecular Structures in the PDB Archive.
in Structure (London, England : 1993)
Description | The major objectives of this grant were threefold: 1. To ensure that the interns/interchangers from the software industry working at EMBL-EBI gained background and domain knowledge in life sciences as part of the interchange program; 2. To ensure that their software skills were used to develop new functionality relevant to the life sciences; 3. To bring together the bioinformatics community to exchange expertise and experience of using the various search engine technologies available for developing life science-related query mechanisms. All these objectives were met, as the interchange resulted in knowledge transfer to software industry experts. The interchangers were embedded within the EMBL-EBI teams for the period of the project and were exposed to life sciences data and practices. Once they understood the requirements, the interchangers also helped the teams to assess the use of search engine technologies, and they developed new functionality by contributing to the open source Apache Lucene Solr project. The new functionality is in use in production systems at EMBL-EBI. The other major outcome of the grant was bringing together Apache Lucene Solr "experts" and "non-experts" at workshops to discuss requirements and exchange knowledge and experience. Bioinformaticians from UK, European and US universities and institutes attended the workshops alongside software industry experts, resulting in an exchange of knowledge and expertise. As a result of the project, better query systems are now available for life science data. |
Exploitation Route | The interactions established during the BioSolr project have been beneficial in improving the query mechanisms for life science data. Because the software was developed as open source, some of the developments were accepted by the main Apache Lucene Solr software committee and integrated into the main distribution. As these developments are part of the main Apache Lucene Solr distribution, anyone using the search engine has access to the new functionality. The remaining BioSolr software that did not become part of the main distribution is available as a plugin via the Apache Lucene Solr site and GitHub. The software is part of the query systems at EMBL-EBI. |
Sectors | Digital/Communication/Information Technologies (including Software),Other |
Title | BioSolr software repository |
Description | Repository of all the code developed in the BioSolr project. Contains enhancements to Apache Solr. One of the patches (facet-contains) is integrated in the official Apache Solr distribution. The remaining patches are linked to JIRA issues listed on the Apache Solr site. These patches are used in production services by the SPOT and PDBe teams. |
Type Of Technology | Software |
Year Produced | 2015 |
Open Source License? | Yes |
Impact | The new developments have enhanced the Apache Solr functionality that was deemed essential for the improvements in the services provided by the SPOT and PDBe teams. The code is also evaluated by the NCBI teams in the USA. |
URL | https://github.com/flaxsearch/BioSolr/ |
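As an illustration of the facet-contains enhancement mentioned in the repository description above, the following is a minimal sketch of how a client might build a Solr request that restricts facet values to those containing a substring. It relies on Solr's `facet.contains` and `facet.contains.ignoreCase` request parameters; the host, core name (`example_core`) and field name (`molecule_name`) are hypothetical placeholders and are not taken from the BioSolr or PDBe configuration. The sketch only builds the URL and does not contact a server.

```python
from urllib.parse import urlencode


def build_facet_contains_url(base_url, field, substring, ignore_case=True):
    """Build a Solr select URL that facets on `field`, keeping only
    facet values that contain `substring`.

    base_url is the URL of a Solr core, e.g. http://host:8983/solr/core
    (hypothetical here; substitute a real core).
    """
    params = [
        ("q", "*:*"),              # match all documents
        ("rows", 0),               # return facet counts only, no documents
        ("facet", "true"),         # switch faceting on
        ("facet.field", field),    # field whose values are counted
        ("facet.contains", substring),  # keep values containing this text
        ("facet.contains.ignoreCase", "true" if ignore_case else "false"),
        ("wt", "json"),            # ask for a JSON response
    ]
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical core and field names, for illustration only.
    print(build_facet_contains_url(
        "http://localhost:8983/solr/example_core", "molecule_name", "kinase"
    ))
```

A request of this shape is the kind typically used for type-ahead suggestions over facet values, since the server filters the values rather than the client.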
Description | July 10-11, 2015 : BOSC Dublin |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation at the Bioinformatics Open Source Conference (BOSC) Special Interest Group meeting in Dublin, held as part of the 2015 ISMB/ECCB conference. This made the community aware of the BioSolr developments, resulting in more requests for information. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.flax.co.uk/blog/2015/07/13/biosolr-at-bosc-2015-open-source-search-for-bioinformatics/ |
Description | October 13-16, 2015 : Lucene Revolution, Austin, Texas |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Matt Pearce presented the BioSolr project at the Lucene Revolution meeting, which is a meeting of Solr professionals. The presentation was accepted on the strength of the number of votes it received as a topic of interest to a large number of conference attendees. It resulted in discussions and further interaction with search engine professionals. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.flax.co.uk/blog/2015/10/16/lucenesolr-revolution-2015-biosolr-searching-the-stuff-of-life... |
Description | Apr 21 2015 : London Solr Meetup |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | The search experts in London were informed of the BioSolr project and the planned developments. There was considerable interest from this community, and they have provided input into the development of the BioSolr plugins. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.meetup.com/Apache-Lucene-Solr-London-User-Group/events/220603505/ |
Description | Better search for life sciences at the BioSolr Workshop, day 1 - Apache Lucene/Solr |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Blog post describing the activities on the first day of the "Open source search for bioinformatics" workshop. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.flax.co.uk/blog/2016/02/10/better-search-life-sciences-biosolr-workshop-day-1-apache-luce... |
Description | Better search for life sciences at the BioSolr Workshop, day 2 - Elasticsearch & others |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The blog describes the activities on the second day of the "Open source search for bioinformatics" workshop. The blog was advertised on the PDBe Twitter and Facebook accounts. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.flax.co.uk/blog/2016/02/15/better-search-life-sciences-biosolr-workshop-day-2-elasticsear... |
Description | BioSolr workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The initial BioSolr workshop was designed to bring together Solr users from the Cambridge area. The workshop was attended by many teams on campus and by teams from NCBI in the USA. Industry was also represented, by companies from the Cambridge area and by Siren Solutions from Ireland. Everyone who attended the initial workshop has remained involved, which has led to better interactions between the different teams. |
Year(s) Of Engagement Activity | 2014 |
URL | http://www.flax.co.uk/blog/2014/10/02/biosolr-begins-with-a-workshop-day/ |
Description | ECCB 2018 - PDBe/UniProt workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This international workshop was conducted jointly by PDBe and UniProt teams. |
Year(s) Of Engagement Activity | 2018 |
Description | EMBL training course "Structural bioinformatics (Virtual)" |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This course explored bioinformatics data resources and tools for the investigation, analysis, and interpretation of biomacromolecular structures. It focused on how best to analyse and interpret available structural data to gain useful information given specific research contexts. The course content also covered predicting protein structure and function, and exploring interactions with other macromolecules as well as with low-MW compounds. Workshops were presented on PDBe search, pages and tools, as well as PDBe-KB pages. This course was a virtual event delivered via a mixture of live-streamed sessions, pre-recorded lectures, and tutorials with live support. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.ebi.ac.uk/training/events/structural-bioinformatics-virtual/ |
Description | EMBL-EBI training course "Summer school in bioinformatics (Virtual)" |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This course provided an introduction to the use of bioinformatics in biological research, giving participants guidance for using bioinformatics in their work whilst also providing hands-on training in tools and resources appropriate to their research. Participants were initially introduced to bioinformatics theory and practice, including best practices for undertaking bioinformatics analysis, data management and reproducibility. To enable specific exploration of resources in their particular field of interest, participants were divided into focused groups to work on a small project set by EMBL-EBI resource and research staff, ending in a presentation from each group on the final day of the course to bring together learnings from all participants. The course included training and mentoring by experts from EMBL-EBI and external institutes. PDBe supervised the group project for independent exploration and analysis of PDBe-KB data. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.ebi.ac.uk/training/events/summer-school-bioinformatics-virtual/ |
Description | Indian Biophysical Society-PDBe workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This workshop was conducted as part of the Indian Biophysical Society meeting at the Indian Institute of Science Education and Research (IISER), India. |
Year(s) Of Engagement Activity | 2018 |
Description | Open source search for bioinformatics workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | More than 40 bioinformatics and search engine professionals attended the workshop. The attendees included professionals working in biological data management at major bioinformatics centres (EBI and NCBI) as well as researchers from UK universities. The participants also included search engine technology experts from computer science and bioinformatics communities and companies. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.ebi.ac.uk/pdbe/about/events/open-source-search-bioinformatics |
Description | PDBe API webinar series "Creating complex PDBe API queries" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This webinar was part of a 6-part PDBe API webinar series, introducing different levels of programmatic access at PDBe. The series ranged from basic data retrieval and search using the PDBe API to more advanced features, including access and reuse of PDBe data visualisation components. This webinar demonstrated how to create more complex queries by combining the PDBe search API with numerous other calls. By introducing specific case studies, we highlighted the scope of PDBe programmatic access. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.ebi.ac.uk/training/events/creating-complex-pdbe-api-queries/ |
Description | PDBe API webinar series "Introduction to PDBe programmatic access" |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This webinar was part of a 6-part PDBe API webinar series, introducing different levels of programmatic access at PDBe. The series ranged from basic data retrieval and search using the PDBe API to more advanced features, including access and reuse of PDBe data visualisation components. This webinar gave an introduction to programmatic access at PDBe, highlighting the type of data that is available and how this can be utilised. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.ebi.ac.uk/training/events/introduction-pdbe-programmatic-access/ |
Description | PDBe workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This workshop involved 20 participants at the Max F. Perutz Laboratories (MFPL) in Vienna. |
Year(s) Of Engagement Activity | 2018 |
Description | PDBe/EMPIAR HALOS consortium virtual workshop Hamburg |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This virtual workshop was organised by PDBe and HALOS, and also involved EMDB and EMPIAR. It provided an introduction to the Protein Data Bank and associated databases. It took the form of three afternoon online workshop sessions, combined with webinars that had to be watched before the online sessions. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.halos.lu.se/calendar/pdbeempiar-workshop-hamburg |
Description | PDBe/UniProt API workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | This workshop was conducted at the National Institute of Immunology in India and involved 50 international participants. |
Year(s) Of Engagement Activity | 2018 |
Description | Presentation at Diamond Light Source |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Presentation on PDBe activities, including the SIFTS resource. The presentation described the new developments at PDBe, including the web components and web-based 3D viewers that display annotations using the SIFTS API. The SIFTS resource was described as a way to obtain value-added annotation by linking sequence-based and structure-based annotations from different data resources. The new query system and the search API at PDBe, which are based on BioSolr developments, were also described. |
Year(s) Of Engagement Activity | 2016 |
Description | Presentation at NII Shonan meeting in Japan |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The NII Shonan meeting was organised to discuss visualisation of biological information. The presentation concentrated both on the visualisation of data and on the sources of annotation information, with SIFTS data central to linking structure and sequence information. There were further inquiries from participants about the SIFTS data and the REST API that makes these data accessible. One of the work groups also discussed how to query information in the most efficient way, including some of the developments at PDBe that have come about due to the BioSolr project. |
Year(s) Of Engagement Activity | 2016 |
URL | http://shonan.nii.ac.jp/shonan/blog/2015/10/30/web-%E2%80%90based-molecular-graphics/ |
Description | Presentation at Unité de glycobiologie structurale et fonctionnelle, Université de Lille |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | The presentation, entitled "PDBe - Bringing structure to biology", described PDBe developments including the SIFTS resource and the new query system. The presentation also described new developments on the REST API and planned developments for the SIFTS resource. |
Year(s) Of Engagement Activity | 2017 |
URL | http://ugsf-umr-glycobiologie.univ-lille1.fr/Seminar-Friday-10th-February-Sameer-Velankar-PDBe-leade... |
Description | SWAT4LS workshop - A new Ontology Lookup Service at EMBL-EBI |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | The workshop presented how BioSolr has helped implement the new ontology lookup service. This created a lot of interest in the new developments from the participants. |
Year(s) Of Engagement Activity | 2015 |
URL | http://ceur-ws.org/Vol-1546/ |
Description | Structural bioinformatics course |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | 26 participants from an international background participated in this onsite workshop. |
Year(s) Of Engagement Activity | 2018 |
Description | Talk and online training course on PDBe at Warwick University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Undergraduate students |
Results and Impact | Workshop and talk for undergraduate chemists at Warwick University, focusing on the PDB and accessing protein structure data. |
Year(s) Of Engagement Activity | 2020 |
Description | Talk at IISER (Pune) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation on PDBe, the pdbe.org website and the infrastructure behind it, including SIFTS. |
Year(s) Of Engagement Activity | 2017 |
Description | Talk at MBU (Bengaluru) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | A presentation on PDBe, the pdbe.org website and the infrastructure behind it, including SIFTS, the API and search functionality. The talk was attended by over 50 people from the Molecular Biophysics Unit and other departments of the IISc in Bengaluru, India. |
Year(s) Of Engagement Activity | 2017 |
Description | Talk at NII (New Delhi) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation on PDBe, the pdbe.org website and the infrastructure behind it, including SIFTS. |
Year(s) Of Engagement Activity | 2017 |
Description | Talk at Pune University (Pune) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | A presentation on PDBe, the pdbe.org website and the infrastructure behind it, including SIFTS. |
Year(s) Of Engagement Activity | 2017 |
Description | The fun and frustration of writing a plugin for Elasticsearch for ontology indexing |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The blog describes the work on the ontology indexer carried out as part of the BioSolr project. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.flax.co.uk/blog/2016/01/27/fun-frustration-writing-plugin-elasticsearch-ontology-indexing... |
Description | Webinar: Finding macromolecular structures more easily at PDBe |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This webinar was conducted as part of the online training program. |
Year(s) Of Engagement Activity | 2018 |
Description | XJoin for Solr, part 1: filtering using price discount data |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The blog describes the development work carried out in BioSolr and its applications outside the bioinformatics field. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.flax.co.uk/blog/2016/01/25/xjoin-solr-part-1-filtering-using-price-discount-data/ |
Description | XJoin for Solr, part 2: a click-through example |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The blog describes the application of the work carried out in the BioSolr project in the fields of e-commerce and search engine development. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.flax.co.uk/blog/2016/01/29/xjoin-solr-part-2-click-example/ |