Driving Innovation with the Allotrope Framework

Pharmaceutical and biotechnology company laboratories generate an enormous amount of experimental data from a variety of sources – instruments, software and human input. This R&D data plays a critical role in decision-making and the development of new insights that drive the drug development process responsible for bringing new medicines to patients.
Unfortunately, the sheer volume and complexity of this data make managing this critical asset challenging. For a typical modern pharma researcher, moving scientific data through its lifecycle is time-consuming and tedious, and it distracts from the important work of scientific discovery and innovation. The reality is that much of a researcher’s time is spent on data management issues – acquiring, formatting, analyzing, exchanging, storing, retrieving and possibly retiring the data.
The root causes of the industry’s data management challenges are threefold: incompatible and/or outdated software, proprietary file formats and inconsistent contextual metadata. In most companies, raw data is stored in proprietary file formats that often change with software versions. These proprietary formats pose a significant challenge for companies that seek to share data between business units or with external collaborators and partners (e.g., CROs), especially if each business site or partner uses different hardware and/or software in its workflows. Additionally, in many instances, metadata is not captured automatically and is frequently incomplete and/or inaccurate. This makes it very difficult to access and search relevant stored data, often resulting in a situation where it is easier to repeat an experiment than to locate historical data of interest.
Ideally, all R&D data should be collected along with metadata describing experimental details and context, and this data should be stored in a non-proprietary format that can be read by any appropriate software application. Due to lack of investment in data management tools and practices, however, many pharmaceutical companies struggle to extract maximum value from their analytical chemistry data. Incompatible data formats, incomplete metadata and difficulty searching historical data result in increased costs, inefficiencies and reduced innovation in modern pharmaceutical laboratories.
To address this issue, a number of pharmaceutical and biotechnology companies came together in 2012 to form the Allotrope Foundation, pooling their collective expertise and resources to develop a solution to commonly experienced data management issues. The Allotrope Foundation aims to address the root causes of data management shortcomings through the development of common standards for the analytical data and metadata generated in pharmaceutical enterprise laboratories. Let’s take a look at some of the key details of their approach.
Allotrope Framework Overview
The Allotrope Foundation’s ultimate goal is the development of an advanced data architecture that will harmonize the collection, exchange, and management of laboratory data over its complete lifecycle. To this end, the Foundation has created the Allotrope Framework, a suite of software tools that allow software developers to implement a consistent set of data standards in the software that laboratories use to manage their workflows and data. The Framework has three components:
The Allotrope Data Format (ADF) is a family of vendor- and platform-agnostic specifications designed to standardize the collection, exchange and storage of the analytical data captured in laboratory workflows. Class libraries provide reusable software components that can be used to adapt existing applications or create new software solutions. The Foundation also provides a free ADF explorer – an application that can open any ADF file to view the data (data description, data cubes, data package) stored within. An ADF file can tell you:
- Why the data was gathered (sample, study, purpose)
- How the data was generated (instrument, method)
- How the data was processed (analysis method)
- The shape of the data (dimensions, measures, structure)
The ADF is intended to facilitate speedy real-time access to, and long-term stability of, archived analytical data. It has been designed to meet the performance requirements of modern instrumentation and to be extensible, allowing new techniques and technologies to be incorporated while maintaining backwards compatibility with previous versions. The end result is a portable data format that is independent of the instrument that created the data, allowing easy file transfer and use across operating systems and vendor platforms.
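To make the three parts of an ADF file (data description, data cubes, data package) more concrete, here is a minimal Python sketch of how their contents might be pictured in memory. It is illustrative only: the class and field names are hypothetical and are not the Allotrope class library API, the sample values are made up, and treating the data package as a store for ancillary files is an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical in-memory picture of the three ADF layers.
# None of these names come from the Allotrope class libraries.


@dataclass
class DataCube:
    """Numeric results plus the 'shape of the data' (dimensions, measures)."""

    dimensions: list[str]
    measures: list[str]
    values: list[tuple[float, ...]]  # one row per point: dimensions, then measures

    def shape(self) -> tuple[int, int]:
        """Return (number of rows, number of columns)."""
        return (len(self.values), len(self.dimensions) + len(self.measures))


@dataclass
class AdfFileSketch:
    # Data description: contextual metadata as (subject, predicate, object) triples.
    data_description: list[tuple[str, str, str]] = field(default_factory=list)
    # Data cubes: named numeric datasets.
    data_cubes: dict[str, DataCube] = field(default_factory=dict)
    # Data package: assumed here to hold ancillary files as raw bytes.
    data_package: dict[str, bytes] = field(default_factory=dict)

    def describe(self, subject: str) -> dict[str, str]:
        """Collect the predicate/object pairs recorded for one subject."""
        return {p: o for s, p, o in self.data_description if s == subject}


# Made-up example content, covering the "why", "how" and "shape" questions.
adf = AdfFileSketch()
adf.data_description += [
    ("run-1", "sample", "example sample"),
    ("run-1", "instrument", "HPLC-UV"),
    ("run-1", "analysis method", "peak integration"),
]
adf.data_cubes["chromatogram"] = DataCube(
    dimensions=["retention time (min)"],
    measures=["absorbance (mAU)"],
    values=[(0.0, 1.2), (0.5, 3.4), (1.0, 2.1)],
)

print(adf.describe("run-1"))
print(adf.data_cubes["chromatogram"].shape())
```

Keeping the context (triples) separate from the numbers (cubes) is the design point of the sketch: the metadata can be searched without loading the measurements.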
The Allotrope Foundation Ontologies (AFO), a set of taxonomies and ontologies, provide a controlled vocabulary and semantic model for the contextual metadata that is needed to represent laboratory analytical processes (tests and measurements) and to interpret the resulting data. The domains modeled include Equipment, Material, Process and Results. The standard language is being developed to cover a broad range of analytical techniques and instruments.
Allotrope Data Models (ADM) use the Shapes Constraint Language (SHACL) to define data structures (schemas, templates) that describe how to use the ontologies in a standardized (i.e., reproducible, predictable, verifiable) manner.
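The idea of a data model that constrains how vocabulary terms may be used can be sketched in plain Python. The example below is not SHACL and is not an Allotrope data model; it only mimics the kind of cardinality check that SHACL expresses with sh:minCount and sh:maxCount. The shape, the predicate names and the sample triples are all hypothetical.

```python
# Hypothetical shape: a measurement node must have exactly one instrument
# and at least one result. Each entry is predicate -> (min_count, max_count),
# where None means "no upper bound".
MEASUREMENT_SHAPE = {
    "instrument": (1, 1),
    "result": (1, None),
}


def validate(triples, node, shape=MEASUREMENT_SHAPE):
    """Return a list of human-readable violations for one node."""
    violations = []
    for predicate, (min_count, max_count) in shape.items():
        count = sum(1 for s, p, _ in triples if s == node and p == predicate)
        if count < min_count:
            violations.append(
                f"{predicate}: expected at least {min_count}, found {count}"
            )
        elif max_count is not None and count > max_count:
            violations.append(
                f"{predicate}: expected at most {max_count}, found {count}"
            )
    return violations


# Made-up metadata: m1 conforms to the shape, m2 does not.
triples = [
    ("m1", "instrument", "HPLC-UV"),
    ("m1", "result", "98.7"),
    ("m2", "instrument", "balance A"),
    ("m2", "instrument", "balance B"),
]

print(validate(triples, "m1"))
print(validate(triples, "m2"))
```

A real implementation would state the constraints in SHACL and run them through a SHACL validator; the sketch only shows why such checks make metadata predictable and verifiable.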
The Allotrope Framework is essentially a software development kit that allows manufacturers of analytical equipment to render their machine output in the ADF, and software developers to embed the Allotrope data format and terminologies into software and interact via standardized application programming interfaces.
Benefits of Using the Allotrope Framework
The Allotrope Foundation plans to have all Allotrope Framework components (ADF, AFO and ADM) publicly available by the end of 2017 under a Membership & Licensing access model available for both commercial and non-profit use cases. In collaboration with vendor partners, member companies have already begun to implement ADF-compatible solutions in their laboratories to address their data management challenges. Benefits of utilizing the Allotrope Framework for data management include:
- Data Accessibility – The ADF eliminates the need for vendor-to-vendor technology integration by creating an extensible data representation that facilitates easy access and sharing of the data output from any vendor’s software or laboratory equipment.
- Data Integration – The Allotrope Framework’s standard format for data and metadata enables compatibility within laboratory infrastructure that will lower the effort required to integrate applications and workflows.
- Data Integrity – Laboratory infrastructure compatibility significantly reduces the possibility of human error in data collection or transfer by eliminating the need for manual transcription or conversion of data between incompatible formats or software. The Allotrope Framework improves overall data integrity by increasing automation in laboratory data flow.
- Regulatory Compliance – Interoperability within laboratory infrastructure allows linked Quality Control (QC) data and complete traceability of data over its full lifecycle. Adoption of the Allotrope Framework results in data that is easily read, searched and shared, effectively addressing data integrity and regulatory compliance issues.
- Scientific Reproducibility – The Allotrope Framework enables a complete and accurate representation of the critical metadata needed to document experiments (methods, materials, conditions, results, algorithms).
- Improved Data Analytics – The Allotrope Framework dramatically improves the quality and completeness of metadata and reduces the time it takes to interconvert data between data sources, both of which are important keys to a successful big data and analytics strategy. Additionally, the data description layer of the ADF uses an RDF data model that provides the capability to build in business rules and other analytics on top of the standardized vocabularies.
- Reduced Costs – Ease of integration between laboratory equipment and software systems will serve to reduce IT expense by eliminating the need for customized solutions and software patches. Software and instrument interoperability will also lower effort and expense for support and maintenance. Additionally, adoption of the Allotrope Framework allows more laboratory automation, which will improve overall operational efficiency, leading to even more cost savings.
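As an illustration of the analytics point above, metadata held as subject/predicate/object triples can be queried with simple patterns, and business rules can be layered on top. The sketch below uses plain Python tuples rather than an RDF library, and the predicate names, run identifiers and rule are hypothetical.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def runs_missing(triples, required_predicate):
    """Business-rule sketch: list runs that lack a required piece of metadata."""
    runs = sorted({s for s, _, _ in match(triples, p="type", o="run")})
    return [r for r in runs if not match(triples, s=r, p=required_predicate)]


# Made-up metadata for two runs; run-2 has no method recorded.
triples = [
    ("run-1", "type", "run"),
    ("run-1", "method", "method A"),
    ("run-2", "type", "run"),
]

print(match(triples, p="method"))
print(runs_missing(triples, "method"))
```

Because every run is described with the same vocabulary, one rule applies to data from any instrument; in practice the same pattern would more likely be written as a SPARQL query over the RDF graph.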
Why Astrix
To realize the benefits of the Allotrope Framework, organizations will need to:
- Understand what the ADF, AFO, ADM are and how they are intended to be used
- Decide how the Allotrope Framework is going to fit into their project
- Work with their subject matter experts to define the desired shape of their data (data description, data format, raw data)
- Work with Allotrope to understand how the Allotrope ontologies map to their datasets
- Understand the current state/format of their data, regarding:
  - Instruments
  - Methods
  - Analyses
  - Software
- Craft a staged project plan to move from their current state to the desired state
- Define the processes and tools that they will use to convert data from its present state into Allotrope-compliant files
- Train their in-house resources
- Support their in-house resources
- Plan for downstream uses of the data, including:
  - Regulatory compliance
  - Data archiving
- Maintain and evolve the system as needs change
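One way to picture the conversion step in the list above is as a mapping from vendor-specific field names to controlled terms, with anything unmapped set aside for review. The following Python sketch assumes a hypothetical vendor record and placeholder terms; the terms are not real AFO identifiers, and the output is a list of triples rather than an actual ADF file.

```python
# Hypothetical mapping from vendor-specific field names to controlled terms.
# The terms on the right are placeholders, not real AFO identifiers.
FIELD_MAP = {
    "Inst": "instrument",
    "SampleID": "sample",
    "Meth": "method",
}


def to_triples(record_id, vendor_record, field_map=FIELD_MAP):
    """Convert one vendor record into triples and report unmapped fields."""
    triples, unmapped = [], []
    for key, value in vendor_record.items():
        term = field_map.get(key)
        if term is None:
            # No agreed term yet: leave this for a subject matter expert.
            unmapped.append(key)
        else:
            triples.append((record_id, term, str(value)))
    return triples, unmapped


# Made-up vendor record with one field that has no mapping.
record = {"Inst": "HPLC-UV", "SampleID": 42, "Operator": "jd"}
triples, unmapped = to_triples("run-1", record)

print(triples)
print(unmapped)
```

Reporting unmapped fields instead of silently dropping them keeps the mapping work visible, which is where subject matter experts and the ontology mapping come into the project plan.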
As a member of the Allotrope Partner Network, Astrix Technology Group is uniquely positioned to assist your organization in developing and implementing an effective Allotrope Framework architecture and strategy. Laboratory automation and informatics are the central focus of our organization. Leveraging our scientific domain knowledge, technology expertise, industry experience and extensive partner network, our team can help you with each one of the steps described above.
In addition, Astrix provides nearshoring options to support competitive pricing strategies for all its services. Astrix nearshoring offices provide expert informatics consulting across the full range of scientific informatics professional services such as project management, business analysis, managed services, software development, implementation, integration, QA, vendor selection and more. These offices are located in time zones similar to the U.S. mainland, allowing our nearshoring teams to easily attend conference calls during local business hours, or travel on-site as needed to work with clients.
Conclusion
Through the release of the Allotrope Framework, the Allotrope Foundation has taken an important step forward in its mission to make the “smart” analytical laboratory a reality. The ultimate vision of the Foundation is to create a laboratory environment where data, analytical methods and hardware components are effortlessly shared amongst different platforms, one-click reports can be generated from data produced by any source(s), and data integrity and regulatory compliance are built into the system. The Foundation aims to create an automated laboratory environment that drives better data analytics, scientific discovery and innovation, ultimately providing better medicines to patients faster.
The data management challenges that the Allotrope Framework is designed to address are certainly not unique to the life science industry, however. The overall methodology implemented in the Allotrope Framework solution is very much applicable to other industries. The Allotrope Foundation has, in fact, hosted a number of workshops designed to introduce the concepts and approaches of the Allotrope Framework to representatives from other sectors. As our world becomes more and more connected through technology, companies in many industries are beginning to think about standardization and interoperability of technologies as a key factor in the push to optimize innovation and maintain a competitive edge.