Getting to the Essence of Data Integrity: APIC's Practical Approach

Added Value of this Guidance

Data integrity has been an integral concept since the first GMPs were published many years ago. Recently we've seen many professional/industry organizations and authorities publishing data integrity articles, reports and guidelines. Organizations and governments worldwide feel the need to emphasize the importance of these principles as a result of inspection findings.

If you have somehow missed all of this; “data integrity is the degree to which data are complete, consistent, accurate, trustworthy and reliable and that these characteristics of the data are maintained throughout the data life cycle. The data should be collected and maintained in a secure manner, such that they are attributable, legible, contemporaneously recorded, original or a true copy and accurate. Assuring data integrity requires appropriate quality and risk management systems, including adherence to sound scientific principles and good documentation practices.” – WHO TRS 996 Annex 5.

The result of all these articles, reports and guidelines can be summarized as confusion. Which guidance applies to me? Is the MHRA's 'GXP' Data Integrity Guidance and Definitions in its first, second or third revision? Is the PIC/S Guidance on Data Integrity better than the MHRA's? Which provides the most clarity, and should I take the WHO TRS 996 Annex 5 into account?

And now, to add to the confusion, the APIC has published its own vision on data integrity.

For those not familiar: the Active Pharmaceutical Ingredients Committee (APIC) is a Sector Group within Cefic (the European Chemical Industry Council). It is organized in sub-committees and hosted by the Cefic secretariat.

You might be wondering why we “need” another guidance on data integrity from yet another organization that is NOT a (semi-) government institution.

I remember a speaker at a congress at the end of 2016 who almost begged the authorities (WHO, PIC/S, MHRA, US-FDA, EMA) to stop publishing guidelines, given the peak in publications on data integrity. I couldn't agree more, so whenever new papers are published by others, the question I have is: what is the added value?

My primary reaction when the APIC guide was published was despair: "oh no, not one more".

But after going through it, I liked the approach and, on top of that: it's free!

Business Work Flows

The APIC data integrity guideline starts with an almost forgotten art: creating flow schemes, starting with the business processes and followed by understanding the GMP risks of the data flow.

Why is mapping the flows before defining GMP risks so important? Imagine having to assess the risk of driving from Newark to New York. How could you accurately assess the risks without knowing which intersections you’ll be passing through, which highways to take and which congestion-prone areas to avoid?

Data mapping is no different. Which data goes where? Which calculations are performed along the way? Are there potential bottlenecks in the data-process? Only by making these assessments and mapping them will you be able to accurately define GMP risks.
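To make this concrete, a mapped data flow can be as simple as an ordered list of steps, each recording which system handles which data. The sketch below is a hypothetical example (the step names, systems and data items are invented for illustration, not taken from the APIC guide):

```python
# Hypothetical data-flow map for a lab result, from acquisition to release.
# Step names, systems and data items are illustrative assumptions only.
data_flow = [
    {"step": "HPLC acquisition",  "system": "CDS",          "data": "raw chromatogram"},
    {"step": "Peak integration",  "system": "CDS",          "data": "calculated assay value"},
    {"step": "Result transfer",   "system": "LIMS",         "data": "reportable result"},
    {"step": "Batch disposition", "system": "Paper record", "data": "QA release decision"},
]

def print_flow(flow):
    """Render the mapped flow so hand-offs and potential bottlenecks become visible."""
    for i, node in enumerate(flow, start=1):
        print(f"{i}. {node['step']} [{node['system']}] -> {node['data']}")

print_flow(data_flow)
```

Once the flow is written down like this, each hand-off (CDS to LIMS, LIMS to paper) becomes an explicit point at which to ask the GMP-risk questions above.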

Process Mapping & Computerized Systems Validation

And although it is not described in the APIC guideline, this approach is highly useful for maintaining and validating a computerized system under GMP/GDP conditions.

I have written quite a few posts on this topic and got into some good discussions with industry professionals from around the world. The main theme shared by all is that when you talk about computerized systems, you should really focus on the data flow rather than on the hardware and software of the computerized system itself. One reaction I loved: if we all design computerized systems and data flows properly, there will be little work left for computerized systems validation engineers.

Going to the Essence of Data Integrity

The APIC data integrity guidance was already growing on me: it makes a strong start by pointing out that companies should develop their business flows first in order to understand the criticality of the data, avoiding the dreaded "do-all" approach.

This is very helpful.

I remember being at a facility with nine (!) cabinets full of Computerized Systems Validation documentation. Sadly, when I asked a simple question like "so, in general, what was the approach for your validation?", no one could provide an answer.

By now you can probably imagine why no one at that audit was able to answer my simple question: they did not understand the risks or the validation approach. Why? Because they never took the time to understand what they were validating; they were just performing the script, not living it.

By creating an overarching Business Flow, you create an infrastructure for your approach. By the way, this approach is not only valuable for data integrity, but for other disciplines as well (e.g. finance). Moreover, the team gains insight into why they are doing these things. Knowledge empowers ownership and a sense of responsibility amongst staff.

The guide then goes one step deeper: working out the business flows.

To provide you with an overview of this guide’s approach:

1. Business Process Mapping (high level).
2. Work out the individual steps per Business Process (lower level).
3. Define the computerized/manual systems in place to collect/process/transfer data, per individual step.
4. Rank the risk of the data by the following qualifiers:
   • Position in the process (the earlier, the less risky),
   • Criticality of the data itself.
5. Categorize the system according to its composition (e.g. whether data is stored electronically).
6. Apply the general requirements per category (of point 5):
   1. GDocP,
   2. Access Control,
   3. User Level,
   4. Audit Trail,
   5. Audit Trail Review,
   6. Back-up and Recovery.
7. Perform a GAP analysis of the system for the following main points of interest:
   1. Administrator Roles & Responsibilities,
   2. Security/User Access Control,
   3. Signatures,
   4. Data review,
   5. Audit trail,
   6. Data lifecycle management,
   7. System lifecycle management,
   8. Time Stamps.

   Each of these points is subdivided into sub-topics, and acceptance criteria per sub-topic are provided in the guideline.

8. For the gaps identified under point 7, an FMEA can be worked out.


The stages are to be followed in chronological order from 1 to 8.

Some stages are connected, such as stages 5 and 6: first you define which category your system falls into, then you use stage 6 to determine the requirements that exist for that category.

For example: there are no back-up requirements (point 6 in stage 6) for a category 1 data-flow system as defined in stage 5.
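The stage 5/6 linkage boils down to a lookup: category in, requirement set out. The sketch below illustrates that logic; the category definitions and requirement sets are invented assumptions for the example and are not copied from the APIC guide's actual tables:

```python
# Illustrative stage 5 -> stage 6 linkage: general requirements per system
# category. Categories and requirement sets here are assumed, not the
# APIC guide's real definitions.
REQUIREMENTS_BY_CATEGORY = {
    1: {"GDocP", "Access Control"},                      # e.g. no electronic data storage
    2: {"GDocP", "Access Control", "User Level", "Audit Trail"},
    3: {"GDocP", "Access Control", "User Level", "Audit Trail",
        "Audit Trail Review", "Back-up and Recovery"},   # full electronic data lifecycle
}

def requirements_for(category: int) -> set:
    """Stage 6: return the general requirements for a stage-5 category."""
    return REQUIREMENTS_BY_CATEGORY[category]

# Mirroring the example above: a category 1 system carries no back-up requirement.
assert "Back-up and Recovery" not in requirements_for(1)
assert "Back-up and Recovery" in requirements_for(3)
```

Encoding the tables this way also makes the gap analysis of stage 7 mechanical: you compare what the category requires against what the system actually has.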

It took me some time to understand the logic of this exercise, but it is needed to perform the next one: the GAP analysis of stage 7.

During stage 7, you challenge each topic with questions and acceptance criteria. The objective is to find data integrity gaps. If you find any, perform an FMEA (failure mode and effects analysis). This approach avoids the "do-all" approach we at PCS observe far too often, which is not only quite boring, to be honest, but time-consuming and illogical as well. You risk missing important issues due to the overwhelming number of rows in your FMEA sheet.
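A gap-driven FMEA stays small because only confirmed gaps become rows. The sketch below uses the standard risk priority number (RPN = severity × occurrence × detectability, each scored 1-10); the example gaps and their scores are invented for illustration:

```python
# Minimal FMEA sketch for gaps found in stage 7. RPN = severity x
# occurrence x detectability is the conventional FMEA ranking; the gaps
# and 1-10 scores below are illustrative assumptions only.
def rpn(severity: int, occurrence: int, detectability: int) -> int:
    """Risk priority number: higher means the gap needs attention sooner."""
    return severity * occurrence * detectability

gaps = [
    ("Shared admin account",     9, 6, 7),
    ("Audit trail not reviewed", 7, 5, 8),
    ("No periodic back-up test", 6, 3, 4),
]

# Rank only the identified gaps, instead of "do-all" rows for every topic.
for name, s, o, d in sorted(gaps, key=lambda g: rpn(*g[1:]), reverse=True):
    print(f"RPN {rpn(s, o, d):3d}  {name}")
```

With a handful of rows instead of hundreds, the important issues stay visible rather than drowning in the sheet.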


I liked the approach of going top-down by analyzing the processes first. The added benefit is that you avoid the dreaded do-all approach to data integrity, preventing you from having to create and analyze a massive number of FMEAs.

Always start from the actual situation and think smartly about how to approach this hot topic. In that sense, this guideline is a nice contribution to the data integrity discussions in the field.

And in my personal opinion, it could quite well contribute to your computerized systems approach.

How could this approach support you in your day to day activities?

This APIC guidance is included in both our Data Integrity and Computerized Systems Validation (Computersysteemvalidatie) trainings (both in Dutch). In the Computerized Systems Validation training we specifically focus on the practical aspects of data integrity in computerized systems.

Are you experiencing data integrity issues with your computerized system while not being able to adjust it yourself? Have a look at PCS Intelligence. PCS Intelligence eQMS Software is, in essence, a workflow management system that is robust, flexible and has been in development since 1993. It contains easily adaptable elements such as centralized procedure distribution, review and approval of procedures, roles & authorizations, training records, and procedures such as Change Control, Deviations and Out-of-Specification (OOS), developed taking into account the latest data integrity guidances.
