Case Study

Scalable NGS Pipelines for Bioinformatics Groups

A company needed to run bioinformatics pipelines to quickly generate insights from experimental results. They also found it difficult to determine why NGS pipelines had failed and struggled to interpret the pipelines’ results. We were asked to move their NGS pipelines beyond standalone shell scripts and to enable the lab team to run them on demand, rather than having to ask the bioinformatics team to run them.

Delivery

For the initial phase of this solution, we evaluated Prefect and Nextflow. Based on that evaluation, we agreed with the relevant stakeholders to rewrite the most heavily used pipelines in Nextflow and to execute them as containerized batch jobs.

With Nextflow, pipelines could be resumed partway through and called inside the existing containers. The conversion was also easier because Nextflow is similar to bash scripting. We created an API with an associated front end, which gave the company self-service NGS pipelines. We also added an API request that let the team ping every pipeline that had been converted to Nextflow. We built a form for each pipeline so that a JSON payload could be sent to the API; the pipeline parsed this payload and was then kicked off. Finally, we added notifications to tell users when their pipeline had run.
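The flow from form submission to pipeline launch can be sketched roughly as follows. This is a minimal illustration and not the API that was delivered: the pipeline registry, the payload shape (`pipeline` and `params` keys), and the function names are all assumptions made for the example. The only Nextflow specifics it relies on are the `nextflow run` command and its `-params-file` and `-resume` options.

```python
import json
from pathlib import Path

# Hypothetical registry of converted pipelines: name -> Nextflow script path.
PIPELINES = {
    "rnaseq": "pipelines/rnaseq/main.nf",
    "variant-calling": "pipelines/variant_calling/main.nf",
}


def list_pipelines():
    """Return the names of the pipelines available for self-service runs."""
    return sorted(PIPELINES)


def build_nextflow_command(payload_json, work_dir, resume=False):
    """Parse a form submission and build a `nextflow run` argument list.

    The payload shape is an assumption for this sketch:
    {"pipeline": "<name>", "params": {...}}
    """
    payload = json.loads(payload_json)
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")

    name = payload.get("pipeline")
    if name not in PIPELINES:
        raise ValueError(f"unknown pipeline: {name!r}")

    params = payload.get("params", {})
    if not isinstance(params, dict):
        raise ValueError("'params' must be a JSON object")

    # Write the parameters to a file so Nextflow can read them directly.
    params_file = Path(work_dir) / f"{name}-params.json"
    params_file.write_text(json.dumps(params, indent=2))

    command = ["nextflow", "run", PIPELINES[name], "-params-file", str(params_file)]
    if resume:
        # -resume reuses cached results so a run can pick up partway through.
        command.append("-resume")
    return command
```

The function only builds the command; the returned list could then be handed to whatever submits the containerized batch job, which keeps request validation separate from execution.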

Value

The front end enabled the lab team to run these jobs on demand, without needing support from the bioinformatics team.
