Case Study

Validated Cloud-Based Bioinformatics Pipeline

The client was a gene editing therapeutics company developing transformative gene-based medicines for patients with serious diseases. It had built a preclinical bioinformatics application for use with a CRO partner. The custom pipeline, which analyzed NGS datasets, was to be used to predict efficacy and identify trial suspension points in an upcoming clinical trial. Because the application was intended for use as a secondary digital endpoint, it had to be validated for 21 CFR Part 11 and GxP compliance under GAMP 5 validation standards. The client’s team had previously developed software only for non-clinical use and had no experience with computer software validation.


Arrayo provided development and validation expertise for the homegrown bioinformatics application and systems central to our client’s business, research, and clinical goals. Arrayo developed the web application in a secure AWS environment.

Phased approach:

First, we delivered a Validation Plan. We reviewed documentation and addressed environment availability and GxP / non-GxP segregation. We also completed a comprehensive regulatory and security procedure. Next, we selected the architecture and technology stack, followed by an agile development effort. We provided audit-level documentation for regulatory compliance, as well as architecture documentation that included written descriptions and diagrams of all services and API endpoints.

Subsequently, we implemented tokenization and authorization and integrated the Atlassian stack with the automated testing infrastructure. We associated test results with changes and with links to task documentation (JIRA), captured and documented build numbers with QA approval, and built infrastructure to classify automated test results as pass, fail, or warn. We also executed manual feature testing and supported automated regression testing.
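The pass / fail / warn classification and the link between a build, its test results, and its task tickets can be illustrated with a minimal Python sketch. This is not the project's actual implementation: the names `AutomatedResult`, `classify`, and `build_record`, the notion of a "blocking" test, and the rule that QA approval cannot apply to a failed build are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PASS = "pass"
    WARN = "warn"
    FAIL = "fail"


@dataclass(frozen=True)
class AutomatedResult:
    test_id: str    # hypothetical automated test identifier
    ticket: str     # hypothetical JIRA issue key tied to the change
    passed: bool
    blocking: bool  # assumption: only blocking failures fail the build


def classify(results):
    """Return one overall verdict for a build's automated test results."""
    if any(not r.passed and r.blocking for r in results):
        return Verdict.FAIL
    if any(not r.passed for r in results):
        return Verdict.WARN
    return Verdict.PASS


def build_record(build_number, results, qa_approver=None):
    """Assemble a record tying a build number to its verdict and tickets."""
    verdict = classify(results)
    return {
        "build_number": build_number,
        "verdict": verdict.value,
        "tickets": sorted({r.ticket for r in results}),
        # Assumption: a failed build cannot carry QA approval.
        "qa_approved": qa_approver is not None and verdict is not Verdict.FAIL,
    }
```

A record of this kind could be stored alongside the build so that an auditor can move from a build number to its test outcome and the tickets that motivated the change.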

We then secured databases and other sources of data persistence by placing them in a private, secured VPC. Data was accessed through RESTful APIs that were potentially exposed to public sources, so we applied OAuth 2.0 based security principles. The claims issued to each user were role-based and restricted users to specific endpoints hosted in the RESTful APIs.
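As an illustration of role-based claims restricting access to endpoints, here is a minimal Python sketch. It assumes the OAuth 2.0 access token has already been validated and decoded into a dictionary of claims; the `roles` claim name, the endpoint paths, and the role names are hypothetical and do not describe the actual system.

```python
# Hypothetical mapping from (HTTP method, path) to the roles allowed to call it.
ENDPOINT_ROLES = {
    ("GET", "/runs"): {"analyst", "admin"},
    ("POST", "/runs"): {"analyst", "admin"},
    ("GET", "/audit-log"): {"qa", "admin"},
}


def is_allowed(claims, method, path):
    """Return True if any role in the token claims may call the endpoint."""
    required = ENDPOINT_ROLES.get((method.upper(), path))
    if required is None:
        # Deny by default: endpoints without a rule are not reachable.
        return False
    roles = set(claims.get("roles", []))
    return bool(roles & required)


# Example: an analyst may list runs but may not read the audit log.
analyst_claims = {"sub": "user-123", "roles": ["analyst"]}
print(is_allowed(analyst_claims, "GET", "/runs"))
print(is_allowed(analyst_claims, "GET", "/audit-log"))
```

Denying unknown endpoints by default is a design choice in this sketch: a newly added endpoint stays closed until a role rule is written for it.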

We used standard data models such as Fast Healthcare Interoperability Resources (FHIR) and OMOP. FHIR has an open-source, OAuth 2.0 compliant authorization scheme (SMART) that is heavily geared toward the security required for PII and HIPAA-regulated data, as dictated by regulatory authorities.

Finally, execution included the completion, review, and approval of the IQ, Traceability Matrix, OQ, and PQ. With all approvals and environment setup completed, we released the environment for validation. We then executed the OQ and PQ and completed the associated documentation. We subsequently worked with the administrator to verify the installation and document the outcomes. After smoke testing, we released the environment for production use with the required release notes.

We produced deliverables across the phases, including a Validation Plan with detailed strategy, deliverables, and execution approach; a 21 CFR Part 11 Assessment per the intended GxP use; the IQ Protocol for the infrastructure and software components; the OQ Protocol for functional testing and validation of the software; and the PQ Protocol for performance qualification per the intended business process, in alignment with the business SOPs.

In addition, we provided a Traceability Matrix for the URS, FRS, DS, IQ, OQ, PQ and Business process SOPs for the intended use of the system.
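One way to picture what a traceability matrix verifies is a coverage check from requirements to test cases. The Python sketch below is illustrative only: the identifiers (`URS-001`, `OQ-TC-01`, and so on) and the helper `untraced_requirements` are hypothetical and are not taken from the project's documents.

```python
def untraced_requirements(requirements, trace):
    """Return requirement IDs that have no linked test case, in sorted order."""
    return sorted(req for req in requirements if not trace.get(req))


# Hypothetical requirement IDs and their linked OQ / PQ test cases.
requirements = ["URS-001", "URS-002", "URS-003"]
trace = {
    "URS-001": ["OQ-TC-01", "PQ-TC-01"],
    "URS-002": ["OQ-TC-02"],
    "URS-003": [],  # not yet covered by any test case
}

print(untraced_requirements(requirements, trace))
```

In this sketch, a requirement that is missing from the mapping is treated the same as one with an empty list of test cases, so gaps in the matrix surface rather than pass silently.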

Last, we delivered change management procedures for use after validation. The software was now ready for clinical use. A software change control mechanism was defined with appropriate roles in the validation and production environments, and team members were trained to follow standard operating procedures. Appropriate logging mechanisms were enabled in AWS to record access and run activity and to feed audit reports during inspections. A process to continuously keep the environment in a validated state was defined. Test cases were documented and executed to verify that the process worked as expected before production implementation and before certifying the software for use in FDA-regulated processes.


To conclude, Arrayo delivered the validated cloud-based bioinformatics pipeline and system for production use to predict efficacy and identify trial suspension points in a clinical trial. This included quality documentation for the system, operating requirements, and access controls.