Case Study

Validated cloud-based bioinformatics pipeline


The client is a gene editing therapeutics company that develops transformative gene-based medicines for patients with serious diseases. A homegrown preclinical bioinformatics application had been developed for use with a CRO partner. Its custom bioinformatics pipeline, which analyzes NGS datasets, was to be used to predict efficacy and identify trial suspension points in an upcoming clinical trial. Because the application was intended for use as a secondary digital endpoint, it had to be validated for 21 CFR Part 11 and GxP compliance under GAMP 5 validation standards. The client team had previously developed software only for non-clinical use and had no experience with computer software validation.

The Project

Arrayo provided development and validation expertise for the homegrown bioinformatics application and the systems central to our client's business, research, and clinical goals. Arrayo developed the web application in a secure AWS environment. The team included a validation subject matter expert, a validation lead, a validation engineer, and a project manager.


Kick-off Phase

  • The purpose of Phase 1 was to review documentation for completeness and deliver a Validation Plan. We performed a documentation review and addressed environment availability and the segregation of GxP and non-GxP components. A comprehensive regulatory and security review was also completed, followed by recommendations.


Development Phase

  • The architecture and technology stack selected was a Java Spring Boot middleware API that interfaces with existing SQL (Oracle) databases. The client/UI side was implemented as a React.js application using open source libraries and styled with the Bootstrap open source styling library
  • Agile development
  • POC for iteration delivered within one week
  • Provided audit-level documentation for regulatory compliance
  • Provided architecture documentation including written descriptions and diagrammatic portfolios of all services and API endpoints
  • Tokenization / Authorization
  • Integrated Atlassian stack with automated testing infrastructure
  • Associated test results with changes and linked them to task documentation (Jira)
  • Captured and documented build numbers with QA approval
  • Created infrastructure to determine pass / fail / warn for automated test results
  • Manual testing of features & support of automated regression testing
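
As an illustration of the pass / fail / warn infrastructure listed above, the following is a minimal sketch and not the project's actual tooling. The names `CaseResult`, `classify`, and `summarize_build`, the Jira issue keys, and the rule that a slow but passing test yields a warning are all assumptions made for the example; the case study does not state how warnings were defined.

```python
from dataclasses import dataclass
from enum import IntEnum


class Verdict(IntEnum):
    # Ordered by severity so max() selects the worst outcome.
    PASS = 0
    WARN = 1
    FAIL = 2


@dataclass(frozen=True)
class CaseResult:
    name: str
    passed: bool
    duration_s: float
    jira_key: str  # hypothetical Jira issue key tying the test to a task


def classify(result: CaseResult, slow_threshold_s: float = 30.0) -> Verdict:
    """Assumed rule: failures fail, slow but passing tests warn."""
    if not result.passed:
        return Verdict.FAIL
    if result.duration_s > slow_threshold_s:
        return Verdict.WARN
    return Verdict.PASS


def summarize_build(build_number: str, results: list[CaseResult]) -> dict:
    """Roll per-test verdicts up into one record for a build number."""
    verdicts = {r.name: classify(r) for r in results}
    # Assumption: a build with no test evidence is flagged, not passed.
    overall = max(verdicts.values(), default=Verdict.WARN)
    return {
        "build": build_number,
        "overall": overall.name,
        "tests": {name: v.name for name, v in verdicts.items()},
        "jira_keys": sorted({r.jira_key for r in results}),
    }


if __name__ == "__main__":
    demo = [
        CaseResult("align_reads", True, 4.2, "BIO-101"),
        CaseResult("call_variants", True, 41.0, "BIO-102"),
    ]
    print(summarize_build("build-17", demo))
```

A record like this can be attached to the build for QA approval, so that the build number, the overall verdict, and the related tasks are reviewed together.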


Security and Compliance

  • Databases and other sources of data persistence were secured by placing them in a private, secured VPC. Data was accessed through RESTful APIs that were potentially reachable from public sources, so we applied OAuth 2.0 based security principles to those APIs. The claims carried for each user were role-based and allowed users to be restricted to specific endpoints hosted in the RESTful APIs. These principles could also extend down to the database schemas, where sensitive fields on individual tables could be obfuscated for less-privileged users.


  • We used standard data models such as Fast Healthcare Interoperability Resources (FHIR) and OMOP. FHIR has an open source, OAuth 2.0 compliant authorization framework (SMART) that is heavily geared toward the security required for PII and HIPAA-regulated data, as dictated by regulatory authorities.
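
The role-based restrictions described in this section can be sketched as follows. This is a simplified illustration and not the project's code: it assumes the claims have already been extracted from a validated OAuth 2.0 access token, and the `roles` claim name, the endpoint paths, the role names, and the field names are all hypothetical.

```python
# Hypothetical mapping of API endpoints to the roles allowed to call them.
ENDPOINT_ROLES = {
    "/samples": {"analyst", "clinician"},
    "/patients": {"clinician"},
}

# Hypothetical sensitive columns and the roles allowed to see them in clear.
SENSITIVE_FIELDS = {"patient_name", "date_of_birth"}
PRIVILEGED_ROLES = {"clinician"}


def is_allowed(claims: dict, endpoint: str) -> bool:
    """Allow a call only if one of the user's roles is mapped to the endpoint."""
    roles = set(claims.get("roles", []))
    # Unknown endpoints map to an empty role set, so they are denied.
    return bool(roles & ENDPOINT_ROLES.get(endpoint, set()))


def redact(record: dict, claims: dict) -> dict:
    """Return a copy of the record with sensitive fields masked for
    users who hold no privileged role."""
    roles = set(claims.get("roles", []))
    if roles & PRIVILEGED_ROLES:
        return dict(record)
    return {
        key: ("***" if key in SENSITIVE_FIELDS else value)
        for key, value in record.items()
    }


if __name__ == "__main__":
    analyst = {"sub": "user-1", "roles": ["analyst"]}
    row = {"sample_id": "S-001", "patient_name": "Jane Doe"}
    print(is_allowed(analyst, "/patients"))
    print(redact(row, analyst))
```

Denying endpoints that have no mapping keeps the default closed, which fits the goal of exposing only what a role explicitly needs.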


Validation Phase

  • The execution included the completion, review, and approval of the IQ, Traceability Matrix, OQ, and PQ. With all approvals and environment setup completed, we released the environment for validation. We then executed the OQ and PQ and completed the associated documentation. We subsequently worked with the administrator to verify the installation and document the outcomes. Finally, we completed smoke testing and released the environment for production use with the required release notes.
  • We produced deliverables across the phases, including: a Validation Plan with a detailed strategy, deliverables, and execution approach; a 21 CFR Part 11 Assessment per the intended GxP use; the IQ Protocol for the infrastructure and software components; the OQ Protocol for functional testing and validation of the software; and the PQ Protocol for performance qualification per the intended business process, in alignment with the business SOPs.
  • In addition, we provided a Traceability Matrix covering the URS, FRS, DS, IQ, OQ, PQ, and business process SOPs for the intended use of the system.
  • Last, we delivered change management procedures to maintain the system in a validated state after validation. The software was now ready for clinical use. A software change control mechanism was defined with appropriate roles in the validation and production environments, and team members were trained to follow standard operating procedures in day-to-day operations. Appropriate logging mechanisms were enabled in AWS to record access and run activity and to feed audit reports during inspections. A process to continuously keep the environment in a validated state was defined. Test cases were documented and executed to verify that the process works as expected before production implementation and before certifying the software for use in FDA-regulated processes.


  • Arrayo delivered the validated cloud-based bioinformatics pipeline and system for production use to predict efficacy and identify trial suspension points in the clinical trial. This included quality documentation for the system, operating requirements, and access controls.
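
The Traceability Matrix mentioned in this phase links each requirement to the protocol steps that verify it. The sketch below shows one way such coverage could be checked automatically. It is an illustration under assumptions and not the delivered matrix: the requirement and test case IDs, the `verifies` field, and both function names are hypothetical.

```python
def untraced_requirements(requirements: list[str], test_cases: list[dict]) -> list[str]:
    """Return requirement IDs that no test case claims to verify."""
    covered = {req for case in test_cases for req in case["verifies"]}
    return sorted(set(requirements) - covered)


def orphan_test_cases(requirements: list[str], test_cases: list[dict]) -> list[str]:
    """Return IDs of test cases that reference an unknown requirement."""
    known = set(requirements)
    return sorted(
        case["id"]
        for case in test_cases
        if any(req not in known for req in case["verifies"])
    )


if __name__ == "__main__":
    # Hypothetical requirement IDs from the URS and FRS.
    requirements = ["URS-001", "URS-002", "FRS-010"]
    # Hypothetical OQ and PQ test cases and the requirements they verify.
    test_cases = [
        {"id": "OQ-01", "verifies": ["URS-001", "FRS-010"]},
        {"id": "PQ-01", "verifies": ["URS-003"]},
    ]
    print(untraced_requirements(requirements, test_cases))
    print(orphan_test_cases(requirements, test_cases))
```

Running both checks before approval surfaces gaps in either direction: requirements with no verifying test, and tests that point at a requirement that does not exist.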