By Scott Scheirey
AWS re:Invent 2022 drew over 50,000 attendees eager to see the latest releases that will drive cloud innovation in the years to come. Amazon releases thousands of ‘significant services and features’ every year, which transform the cloud landscape for many industries, including life sciences.
Among all of the excitement at AWS re:Invent 2022, I found the release of Amazon Omics to be the shining star for the life sciences industry, and here’s why:
Amazon Omics builds on existing AWS storage, compute, management, and modeling tools to supercharge multi-omics research. With sequence stores, bioinformatics workflows (and managed integrations with popular tools like Nextflow), and the ability to optimize variant and annotation data, Amazon Omics enables multimodal and multi-omic analysis. It even supports clinical and medical imaging data.
Why is the Amazon heritage so important, you might ask? Amazon Omics not only supports the many data types used in multi-omics research, like BAM, FASTQ, or VCF files, but also lets users take advantage of the many technology solutions that AWS has to offer. Customers can take data straight from their instruments or from storage in S3 buckets, then structure, query, and analyze it in Amazon Omics. AWS architectures can host most third-party tools in their cloud environment, and Amazon Omics is a great workhorse for researchers who want to build sustainable, scalable multi-omic workflows with cost savings and complete control over the environment.
I think this is crucial for life science companies. Although there are numerous platforms that allow for data storage, querying, and analysis, they often come with larger storage and compute costs than using AWS technology solutions directly. Many companies that offer ‘cloud hosting or storage’ will mark up the underlying AWS costs, which makes them much more expensive than integrating their solution with your own AWS architecture.
Customers are also constantly sorting through the regulatory and compliance requirements of different platform vendors in order to adhere to their specific industry standards. Why not keep your architecture under your own roof, have full control over your compliance, security, and governance, and use all the tools you’re already using?
Having an Amazon multi-omic environment allows customers to store data in S3, query across it using Athena, and build ML models using SageMaker. Most life sciences companies already have an AWS architecture, which facilitates sharing data and making it available across locations, scaling scientific discovery by provisioning compute infrastructure, and keeping full control over internal access using IAM. Users can even use AWS Lake Formation to control access and to support querying by different teams in each multi-omic data lake.
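To make the "query across data using Athena" step a little more concrete, here is a minimal Python sketch of what submitting a variant query might look like. Everything specific in it is an assumption for illustration only: the database `omics_db`, the table `variants`, its column names, and the results bucket are hypothetical and are not taken from Amazon Omics itself. The only AWS call used is the Athena `start_query_execution` operation from boto3, which is imported lazily so the request-building part runs without AWS credentials.

```python
def build_athena_request(database, table, gene, output_location, limit=10):
    """Build keyword arguments for Athena's StartQueryExecution call.

    The table and column names are hypothetical placeholders; adjust them
    to match however your variant data is actually catalogued.
    """
    # Only allow simple identifiers for the table name.
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # Escape single quotes in the gene symbol before embedding it in SQL.
    safe_gene = gene.replace("'", "''")
    query = (
        "SELECT sample_id, chrom, pos, ref, alt "
        f"FROM {table} WHERE gene = '{safe_gene}' LIMIT {int(limit)}"
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the request to Athena and return the query execution ID."""
    import boto3  # imported here so building a request needs no AWS setup

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    # Hypothetical names, shown only to illustrate the request shape.
    request = build_athena_request(
        database="omics_db",
        table="variants",
        gene="BRCA1",
        output_location="s3://example-bucket/athena-results/",
    )
    print(request["QueryString"])
```

Keeping the request builder separate from the AWS call is just one possible design; it lets a team review or test the SQL before anything is billed, while IAM and Lake Formation permissions still decide who can actually run it.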
From my perspective, this is so important for research because there is no ‘one solution fits all’ way to establish multi-omic workflows. Although there is software that people use for specific kinds of analysis, genomics for example, customers often integrate their favorite tools to build an effective pipeline (Cell Ranger, STAR aligner, Seurat, and Picard, to name a few). Having a managed Amazon sandbox to play in will allow researchers to take advantage of nearly limitless resources to drive their research without having to focus on backend configuration.
This is exciting for us at PTP, as it will allow us to better support our research-driven clients who want to spend more time focusing on the data and on cutting-edge results. An Amazon multi-omic environment gives researchers a tremendous amount of leeway to level up their discovery process, further FAIRify their data, and keep their workflows all under one roof with complete oversight.
Why Amazon Omics?