Data Mining in Clinical IT

A real-life industrialised model

Michele Pontinen, Senior Manager, Capgemini US, LLC.

Data mining has always been important to our industry, and the ability to mine across the entire product lifecycle has always been the goal. To date, the lack of interoperability between data and systems has been one of the major challenges to this effort. This case study describes how data mining capability was first enabled by Clinical IT. This work set the foundation for data mining across R&D and positioned the enterprise to realise their new product development strategy: connecting the dots between molecular discoveries and human health and disease.

The absence of data interoperability and standardisation is just one of the challenges faced by the pharma industry today. Data are expensive and difficult to collect with limited ROI after product registration. Increasingly, health authorities are moving to require a clearer understanding of product safety and efficacy—not only earlier in the development process but throughout the product lifecycle.

Pharmacy benefits managers are pressuring industry to provide evidence of efficacy and a good safety profile for favourable reimbursement rates and in some cases formulary placement.

Patients and their advocacy groups are becoming more vocal in demanding that centralised trial information be easily accessible to clinicians. They also call for product contraindications and side-effect information to be presented in meaningful, understandable language, not just as an aggregation of statistics on the product insert. At the same time, the industry faces severe pressure to innovate: to move to adaptive trials and translational science, and to focus new products and diagnostics on targeted treatment groups. Most important for its competitive edge in today’s environment, a company needs to ‘kill early and cheap.’

These challenges all share a common problem—the inability to access, analyse, share and mine interoperable data. If an organisation is able to make their product data both syntactically and semantically interoperable, they can transform data into information and finally into knowledge—creating a unique competitive advantage.
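The difference between the two kinds of interoperability can be illustrated with a minimal Python sketch. The field names, site records and sex-code lookup below are hypothetical, invented purely for illustration; they are not taken from the company's systems or from any published standard.

```python
# Hypothetical records from two trial sites. Both share the same structure
# (syntactic interoperability) but encode sex differently, so they are not
# yet semantically interoperable.
REQUIRED_FIELDS = {"subject_id": str, "sex": str, "age_years": int}

# Assumed lookup that maps local codes onto one shared vocabulary.
SEX_CODES = {"m": "M", "male": "M", "1": "M", "f": "F", "female": "F", "2": "F"}


def is_syntactically_valid(record: dict) -> bool:
    """True when the record has exactly the agreed fields with the agreed types."""
    return (
        record.keys() == REQUIRED_FIELDS.keys()
        and all(isinstance(record[k], t) for k, t in REQUIRED_FIELDS.items())
    )


def harmonise(record: dict) -> dict:
    """Return a copy whose 'sex' value uses the shared vocabulary."""
    code = SEX_CODES.get(record["sex"].strip().lower())
    if code is None:
        raise ValueError(f"Unmapped sex code: {record['sex']!r}")
    return {**record, "sex": code}


site_a = {"subject_id": "A-001", "sex": "Male", "age_years": 54}
site_b = {"subject_id": "B-017", "sex": "1", "age_years": 61}

# Both records pass the structural check, yet a naive comparison of the raw
# values treats the two subjects as having different sexes.
both_valid = is_syntactically_valid(site_a) and is_syntactically_valid(site_b)
same_before = site_a["sex"] == site_b["sex"]
same_after = harmonise(site_a)["sex"] == harmonise(site_b)["sex"]
```

In this sketch the structural check alone would let both records into a shared repository, but only the harmonised values can be pooled and mined reliably, which is the step that turns data into comparable information.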

This article describes how a transformation initiative of the business and clinical IT delivered data mining capabilities to the enterprise. Enabling data mining initially in full development (trial Phase IIB through product registration), the initiative built the foundation to effectively launch a new strategy for product development—“to connect the dots between molecular discoveries and human health and disease.”

A definition of data interoperability
For some, the term “interoperable” may be new, or it may mean different things to different people. In the context of this article, the term is defined as:

Transforming the enterprise – Doing more with less
This company began a full transformation of their organisation by implementing their strategy across a number of different functions over time. This meant changing the way they did business. They needed to do more with less, and deliver on their vision for the future.
By manipulating the key transformational levers (people, process and technology), the company laid the groundwork for its new operational model:
• Interoperable data across all business functions
• Both syntactic and semantic interoperability
• Standard data representation
• Evolving industry standards (e.g. CDISC, HL7, MIGS (Minimum Information about a Genome Sequence), MIMS (metagenomics))
• Single repository for all regulated data (JANUS-like architecture)
• Data mining capability
• Discovery through the end of the product life cycle
• Future integration with patient Electronic Health Records (EHRs)
• Enhanced pharmacovigilance activities
• Support for Modelling and Simulation (M&S) in full development
• Identify additional indications to enable diagnostic and combinational product development
• Support genomics / biomarker use in preclinical and clinical studies
• Speed dossier compilation
• Real-time research and clinical trial data capture and processing (reducing errors in data collection and speeding analysis and reporting)
• Compliance maintained throughout the life cycle of the product.

Transforming the enterprise in this manner enables the company to identify and retain critical, core capabilities in-house. They improve their decision-making process by eliminating the “white spaces” between current operations and “best-in-class.” They eliminate all legacy “silos” of information and data about their products while adopting new technologies that will give their research scientists the tools necessary to increase their understanding of disease.

Corporate information governance
The company established Information Governance at the corporate level. This move not only ensured that “isolated silos of data” would not re-form after the transformation, but also positioned them to realise their long-term goal of accessing and mining external repositories, and eventually of interfacing with patient Electronic Medical Records (EMRs). Their governance structure comprises three separate areas of Information Stewardship:

1. Business stewards: Accountable for consistent implementation of standard business rules for all future state information systems. Looking across initiatives, business stewards ensure business rules for data are defined in the context of requirements for the entire programme—for today and in the future.

2. Metadata stewards: Assure the enterprise has a single “source of truth” for metadata definitions and business rules. These artifacts are managed and maintained in a central repository at the corporate level.

3. Integration stewards: Identify and maintain the conversion, interface, data analysis, data standardisation and data architecture work done by individual projects that can be leveraged across the enterprise. The stewards also support projects by providing standard toolsets, templates and usage tips that can be leveraged by multiple teams, in multiple functional areas.


Several factors drove the decision to adopt a more centralised model. The model provides a formal command and control structure that assures all necessary processes and standards are in place to enable data interoperability and standardisation. A single group or function controls master metadata and data standardisation—providing a good leadership framework for delivering the vision, both near-term and future. The model also fits well with their approach to change management, critical for driving change across the enterprise. Finally, although governance is maintained at the corporate level, ownership is cascaded down into each discipline through the Stewards.

Data interoperability
Product data from all current and legacy clinical research trials (Phase III, IV) are being converted into CDISC SDTM for loading into their central repository. The decision to convert to CDISC met metadata and data interoperability acceptance criteria for eCTD submissions to the US health authority (FDA). Other CDISC standards (ADaM, LAB, ODM and BRIDG) are now being implemented to enable full electronic submission and to bridge to HL7. The company plans on adopting SEND for their preclinical data once FDA reviews and advises on findings from the pilot. As standards evolve for new technologies, the company is positioned to generate, store and analyse enormous quantities of interoperable epidemiologic, genotypic and phenotypic data.
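To make the conversion step more concrete, the sketch below maps one legacy demographics row onto a handful of SDTM Demographics (DM) variables. The legacy column names, the sample values and the way USUBJID is assembled are assumptions made for illustration only; a real mapping covers many more variables and is governed by the SDTM Implementation Guide and CDISC controlled terminology, not by this simplified lookup.

```python
# Hypothetical legacy demographics row; the column names are invented.
legacy_row = {
    "protocol": "ABC-123",
    "site": "0042",
    "patient": "0007",
    "gender": "female",
    "age": "58",
}

# Simplified lookup for illustration; unmapped values fall back to "U".
SEX_TERMS = {"male": "M", "female": "F"}


def to_sdtm_dm(row: dict) -> dict:
    """Map one legacy row onto a small subset of SDTM DM variables."""
    study = row["protocol"]
    return {
        "STUDYID": study,
        "DOMAIN": "DM",
        # Assumed convention: study, site and subject joined with hyphens.
        "USUBJID": f"{study}-{row['site']}-{row['patient']}",
        "SUBJID": row["patient"],
        "SITEID": row["site"],
        "AGE": int(row["age"]),
        "AGEU": "YEARS",
        "SEX": SEX_TERMS.get(row["gender"].strip().lower(), "U"),
    }


dm_record = to_sdtm_dm(legacy_row)
```

Once every legacy source has a mapping function of this kind, rows from different applications land in the repository with the same variable names and coded values, which is what makes them comparable across trials.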

A JANUS-like data repository houses all of their product data in standard, interoperable format. This provides a single, centralised solution for not only preserving the knowledge (data) but also providing controlled access to those who wish to and are entitled to use it. This solution not only reduces the time and effort to locate and access meaningful data but also provides users the opportunity to access and mine all legacy product data.

Organisational change management
Managing change is a critical component of any business transformation. An effective change management programme delivers results by building sponsorship from the top, creating leaders who will act as change agents, and changing behaviours at both the team and individual employee levels. For our initiative, the company implemented a robust change management programme and communication plan across multiple functions and at multiple levels, addressing every component of the transformation.

Delivering the correct approach for every functional area and driving change across the enterprise required balancing and instantiating a hierarchy of information (in multiple formats, each with different, directed content) without contradiction. The approach to change management addressed enterprise-level themes, operational-level messages and individual-level information. The message was always consistent, and ownership of the new ways of working was internalised with the aid of a variety of change management tools.

Industrialising the solution
The challenge of converting all clinical research data into industry standards was great, given the company’s limited resources. The amount of clinical research data, its location (found in many current and legacy applications) and format meant the cost and resource requirements were going to be high. To bring down the cost and reduce the burden on internal resources, we developed an “industrialised” or “factory model”. As a partner in the transformation, we moved SDTM mapping to facilities offshore and set up a factory solution that reduced the cost and produced high quality SDTM maps and ADaM data sets.

Beyond data mining capabilities
Traditionally, data mining activities have centred on pharmacovigilance. Historically, most efforts have had limited success, either because trending is difficult with a small sample size or because submission timelines are so tight that scarce resources must be put on meeting the submission deadline. These challenges and frustrations led the company to realise they could no longer conduct business as usual. The transformation needed to give them more than just data mining capability. Today, the company is realising the near-term goal of mining clinical data with the capability and capacity to deliver in three critical areas: realise an immediate ROI on all of their product data; reduce their regulatory exposure; and deliver on the pressures to innovate by realising their competitive edge and building on it.


Realising an immediate ROI on the data
With the transformation of people, process and technology platform, the company immediately began to see ROI on their product data. This included the ability to mine across compounds and actually support intense pharmacovigilance activities. “One version of the truth” is replacing and eliminating duplicate product data. In the area of clinical trials, clinicians and data management no longer need to re-invent the wheel each time a new trial is designed and rolled out into production. Industry standards are reducing trial set-up costs.
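As a rough illustration of what mining across compounds can look like once adverse-event data share one layout, the sketch below counts distinct subjects reporting each event term per compound. The rows, compound codes and field names are hypothetical and are not drawn from the company's repository.

```python
from collections import Counter

# Hypothetical adverse-event rows, already in one shared layout.
ae_rows = [
    {"compound": "CMP-A", "study": "S1", "subject": "S1-001", "term": "HEADACHE"},
    {"compound": "CMP-A", "study": "S1", "subject": "S1-001", "term": "HEADACHE"},
    {"compound": "CMP-A", "study": "S2", "subject": "S2-004", "term": "HEADACHE"},
    {"compound": "CMP-B", "study": "S3", "subject": "S3-010", "term": "HEADACHE"},
    {"compound": "CMP-B", "study": "S3", "subject": "S3-011", "term": "NAUSEA"},
]


def subjects_per_term(rows: list) -> Counter:
    """Count distinct subjects reporting each term, keyed by (compound, term)."""
    # The set removes repeat events reported by the same subject.
    seen = {(r["compound"], r["term"], r["subject"]) for r in rows}
    return Counter((compound, term) for compound, term, _ in seen)


counts = subjects_per_term(ae_rows)

# Pooling one term across every compound is a single pass over the counts.
headache_total = sum(n for (_, term), n in counts.items() if term == "HEADACHE")
```

The point of the sketch is that no per-study reformatting is needed before the query: because every row uses the same fields and coded terms, pooling across studies and compounds reduces to a single pass over the data.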

The adoption of industry standards is reducing the timeline for data integration, analysis and reporting. With all ongoing clinical trials moved to their EDC platform and CDISC standards adopted, trials are set up, data are captured and accessible in real time, and data move nightly (in SDTM format) into the repository. Data sets are pushed out to their analytical application (SDD) in record time. Real-time access to data, coupled with the shorter time needed to conduct and report on a trial, is improving patient safety monitoring and significantly reducing the time needed to prepare a submission.

From clinical development through product launch and marketing, the barriers to collaboration with co-development partners and external vendors (e.g. CROs) are being eliminated. Data suitable for mining no longer need to be located, reformatted and/or reprocessed when the company receives a request from health authorities. The adoption of industry standards is also changing how the company contracts clinical trial services. CROs are now required to return trial data in CDISC format.

Reducing regulatory exposure
By mapping all regulated trial data to a single repository, the company is now able to retire their legacy systems and reduce their risk of non-compliance with electronic record and electronic signature regulations (21 CFR Part 11). As noted earlier, the repository also lets the company respond quickly to questions from regulatory authorities, as end-users no longer have to gain access to, locate and reformat data in order to respond. In addition, data quality, a critical component of the transformation, has been built into the new way of working, and their regulated product information meets data veracity, authenticity and non-refutability requirements throughout the life of the record.

Transforming data into knowledge
With data interoperability, the company is now realising their competitive advantage and building on it. With interoperable data, they have turned data into information and information into knowledge. This advantage positions them to revisit earlier work using new technologies such as biomarkers: they can re-evaluate discovery efforts that failed target nomination, candidate optimisation or candidate readiness milestones, and explore combinational product opportunities. The company now fully supports modelling and simulation in clinical development and is beginning to move to adaptive trials and translational science capabilities.

They now possess the capability and capacity to generate, store and analyse enormous quantities of epidemiologic, genotypic and phenotypic data. They are moving forward with gene-based clinical trials that hopefully will increase their knowledge of the biology and phenotypes of disease. All of these capabilities increase their understanding of disease and move them closer to targeted treatment products and innovative diagnostics. The company is now positioned to adopt and leverage all new industry standards as they evolve.

Author Bio

Michele Pontinen

Michele Pontinen is a Senior Manager in the Life Sciences Transformation Consulting Practice at Capgemini US, LLC. Michele leads the R&D consulting practice. She has over 25 years' experience in the biopharmaceutical industry, in both commercial and government research development.
