
Data Platforms & Storage Infrastructure

Optimize Data Storage Solutions for Peak Speed, Performance, and Cost Efficiency

April 2 - 4, 2025 ALL TIMES EDT

Are your data management challenges growing more complex with each passing day? Do you have the right framework in place to efficiently store, process, and secure your expanding data volumes while ensuring compliance with organizational standards? As data availability and interoperability become increasingly critical, are you struggling to keep up with scalable solutions for distributed and federated analytics? How do you manage the trade-offs between speed, performance, and cost without compromising on quality? With so many vendors and technologies available, how can you confidently choose the most effective storage methods and platforms to meet your unique needs? Are your data storage practices sustainable? The Data Platforms & Storage Infrastructure track delves into these critical questions, providing insights into the latest trends such as cloud-based solutions, high-performance computing, and the integration of AI and machine learning. This track showcases how leading organizations are pioneering advancements in large-scale data management, including innovative storage platforms, integration and migration strategies, and governance frameworks, to meet the demands of the life sciences industry. Join us to explore best practices and cutting-edge approaches that are shaping the future of data infrastructure.

Wednesday, April 2

8:00 am Registration Open and Morning Coffee

9:00 am Recommended Pre-Conference Workshops and Symposia*

On Wednesday, April 2, 2025, Cambridge Healthtech Institute is pleased to offer five pre-conference Workshops scheduled across two time slots (9:00 am–12:00 pm and 1:15–4:15 pm) and three Symposia from 9:00 am–4:20 pm. All are designed to be instructional and interactive and to provide in-depth information on a specific topic. They allow for one-on-one interaction and are a good setting for covering technical details that the main conference tracks on Thursday and Friday do not address.

*Separate registration required for the Symposia and the Workshops.

4:40 pm

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

4:45 pm PLENARY KEYNOTE INTRODUCTION: Explainable AI in Drug Discovery

Kshitij Kumar, CEO & Founder, CLOVERTEX

4:55 pm PLENARY KEYNOTE PANEL DISCUSSION:

From Bytes to Breakthroughs: Next-Generation AI Driving the Future of Life Sciences and Healthcare

PANEL MODERATOR:

Abbie Celniker, PhD, Partner, Third Rock Ventures LLC

Next-Generation AI has the potential to revolutionize life sciences by delivering unprecedented insights, automation, and efficiency. But what will those industry transformations look like? This keynote panel convenes leaders from biopharma, healthcare, and emerging tech who are applying AI—generative models and beyond—to accelerate drug discovery, diagnostics, and patient care. Panelists will share real-world case studies, discuss overcoming both technical and organizational challenges, and explore how AI is evolving from predictive tools to autonomous, decision-making systems. Look beyond the hype to uncover where AI is making a tangible impact today and where the next frontiers of innovation lie.

PANELISTS:

Tala Fakhouri, PhD, MPH, Associate Director for Data Science and AI Policy, FDA (participating virtually)

Per Greisen, PhD, President, BioMap

Sofia Guerra, Vice President, Bessemer Venture Partners

Subha Madhavan, PhD, Vice President and Head, AI/ML, Quantitative and Digital Sciences, Pfizer Inc.

Sonya Makhni, MD, Medical Director, Mayo Clinic Platform

6:10 pm Welcome Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

The Bio-IT Kickoff Reception is a reunion: reconnect with friends, explore cutting-edge research, and celebrate innovation! Enjoy poster presentations and networking, and vote for the Best of Show and Poster awards.

7:25 pm Close of Day

Thursday, April 3

7:00 am Registration and Morning Coffee

8:00 am

Organizer's Remarks

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

8:05 am PLENARY KEYNOTE INTRODUCTION: Build for Now & the Future: 8 Critical Pillars for Your Enterprise AI Strategy

Jesse Cugliotta, Global Industry GTM Lead, Healthcare & Life Sciences, Snowflake, Inc.

HARNESSING AI FOR DRUG DISCOVERY: FROM INFRASTRUCTURE TO IMPLEMENTATION

8:15 am PLENARY KEYNOTE PRESENTATION:

Data and Computing Infrastructure for the Life Sciences: Best Practices, Observations, and Lessons Learned

Chris Dwan, Independent Consultant, Dwan, LLC

This talk will provide practical, real-world advice based on Dwan's quarter century of experience designing and implementing high-performance computing and large-scale data systems for health care and the life sciences. Topics will include network architectures, cloud vs. "terrestrial" infrastructure, practical data strategies, information security, quality and compliance from R&D to the clinic, differentiated computing platforms, human and organizational factors, and of course AI.

8:45 am PLENARY KEYNOTE PRESENTATION:

Generative AI, Aging Research and Robotics as a Platform for Drug Discovery: From Hype to Clinical Efficacy

Alex Zhavoronkov, PhD, Founder & CEO, Insilico Medicine

9:15 am Session Q&A

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Start your morning with coffee, connections, and cutting-edge research! Enjoy poster presentations, network in the Exhibit Hall, vote for awards, and get a chance at a fabulous raffle prize!

10:15 am Organizer's Welcome Remarks

FOUNDATIONS OF MODERN DATA PLATFORMS

10:20 am Chairperson's Remarks

Michael Riener, President & CEO, RCH Solutions

10:25 am

Building a Unified Analytics Ecosystem: From Concept to Reality

Anand Ganesan, Product Lead, GD-IT, Regeneron Pharmaceuticals, Inc.

Sriram Krishnamurthy, Director, GD-IT, Regeneron Pharmaceuticals, Inc.

We will present our journey in unifying disparate analytics environments, including SAS, open-source platforms, and Lakehouse platforms like Databricks. Our evolution from a SAS-centric setup to a multi-platform, multi-language environment incorporating Python and R will be discussed. We will outline our strategy for creating a unified, connected ecosystem that empowers business users to choose their preferred platform and collaborate seamlessly across different environments.

10:55 am

Aligning Culture, Data Architecture, and Process to Build a Field-Leading Lab-in-the-Loop Data Platform for ML-Guided Design

Stephen Northup, Staff Software Engineer, Dyno Therapeutics

In today’s market, every company wants to apply ML to its R&D pipeline. Doing this effectively requires a data platform that is dramatically different from what was common in the recent past. To empower ML innovation, companies require a ‘lab in the loop’, where data and insights can flow freely across team and departmental boundaries. This talk will equip engineering leaders to understand strategic trade-offs as well as practical, on-the-ground tactical decisions, based on hard-learned lessons from getting a startup off the ground and establishing it as a leader in ML-guided design.

11:25 am

Data and Computing Infrastructure for the Life Sciences: Best Practices, Observations, and Lessons Learned

Chris Dwan, Independent Consultant, Dwan, LLC

This talk will provide practical, real-world advice based on Dwan's quarter century of experience designing and implementing high-performance computing and large-scale data systems for health care and the life sciences. Topics will include network architectures, cloud vs. "terrestrial" infrastructure, practical data strategies, information security, quality and compliance from R&D to the clinic, differentiated computing platforms, human and organizational factors, and of course AI.

11:55 am A TCO Analysis of Public and Private Storage Clouds: Controlling Costs of Forever, and Forever Growing, Data Sets

Tim Sherbak, Life Science Solutions, Quantum Corp.

As data volumes continue to grow exponentially, organizations face challenges in managing long-term storage costs. This session presents a Total Cost of Ownership (TCO) analysis comparing public and private storage cloud models, focusing on cost control and new technologies for cost-effectively storing, protecting, and accessing data for years and decades.

12:10 pm Challenges in Capturing High-Quality Structured Data in the Biosciences: Gaps and Emerging Solutions

Joseph Mann, Head of Technical Implementations, Uncountable

The biosciences generate vast amounts of data, yet much remains trapped in unstructured or disconnected formats, limiting its usability for analysis, automation, and AI-driven insights. Variability in assay protocols, fragmented data entry, and the lack of standardized ontologies create inconsistencies that hinder integration, collaboration, reproducibility, and research timelines. This talk will explore these challenges through a case study, highlighting the difficulties of structuring experimental and analytical data across diverse platforms. We’ll discuss how modern data technologies like Uncountable bridge these gaps by leveraging FAIR data principles and automated data capture to create more efficient, scalable, and reliable workflows. By understanding and embracing these solutions, researchers and organizations can build data ecosystems that drive innovation, improve decision-making, and accelerate discovery in the biosciences.

12:25 pm Painting the Vision of Lab of the Future

Raveen Sharma, Managing Director, Life Sciences & Healthcare, ConvergeHEALTH by Deloitte

Mileidy Giraldo, Worldwide Lead, Life Sciences Strategy & Lab of the Future, Amazon Web Services

Lab of the Future has the potential to revolutionize laboratory operations through data, AI, and automation. But what will these transformations look like? This session explores how to harness these technologies to enhance research efficiency, improve decision-making, and maintain a competitive edge. Attendees will gain insights into real-world applications, discuss overcoming both technical and organizational challenges, and explore how labs are evolving from manual processes to automated, data-driven environments. Look beyond the current practices to uncover where innovation is making a tangible impact today and where the next frontiers of laboratory modernization lie.

12:55 pm Session Break and Transition to Lunch

1:05 pm LUNCHEON PRESENTATION: Bioinformatics AI—Faster, Smarter, Inevitable

Jamie Littlejohns, CEO, Velsera

AI has revolutionized industries from finance to engineering—so why not life sciences? The answer lies in infrastructure. Drug discovery requires secure, scalable, and interoperable data platforms to power new AI-driven breakthroughs. In this talk, we unveil the next wave for bioinformatics where AI doesn’t just speed up analysis but fundamentally redefines how researchers and clinicians find, integrate, and interpret data. With real-world examples including AI-powered knowledge curation, data harmonization, and federated data access, we will explore where the field is heading and how AI is shifting bioinformatics from art to optimized science.

1:35 pm Refreshment Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Bio-IT's hall is bigger than ever—one break won’t cut it! Enjoy dessert and coffee after lunch, explore booths and posters, vote for awards, and participate in our raffle for a chance to win a prize!

MODERNIZING FOR THE NEW ERA OF LIFE-SCIENCE INNOVATIONS

2:25 pm Chairperson's Remarks

Bill Lynch, Life Sciences Strategic Alliances Manager, Healthcare, Pure Storage, Inc.

2:30 pm

A Data Platform for Data Productization in a Data Mesh: The Good, the Bad, and the Ugly

Pierre Alexander Fischer, PhD, Product Line Lead, Data Integrations Generating Insights (DIGI), Roche

This session provides key lessons from our journey implementing a data mesh strategy to make FAIR data available for enterprise use, moving from a centralized monolithic platform to a cloud-based self-service system with federated governance. We'll discuss successes, challenges, and failures encountered, offering insights on the technology, process, and people aspects of this transformation. Our goal is to humbly help others avoid the pitfalls we experienced modernizing our data platform.

3:00 pm

Building a Serverless Data Platform for Biotech Companies

Patrick O'Mara, Associate Director, Research Informatics, Photys Therapeutics

This presentation showcases the construction of a serverless data pipeline for biotech start-ups. Using Apache Airflow on Google Cloud Composer, it automates data transfers between Egnyte, CDD Vault, and BigQuery. Highlighting CDD Vault's robust API and data model, the talk demonstrates how this solution enhances data management, accelerates insights, and supports innovation, offering a scalable and cost-efficient approach for biotech organizations. 

3:30 pm

Modeling Outcomes Using Surveillance Data and Scalable AI for Cancer (MOSSAIC): Overview

Fernanda Foertter, MSc, Oak Ridge National Lab

MOSSAIC is a project that aims to reduce the time it takes to process reports sent to cancer registries and to make that data broadly available to policymakers. It uses NLP to autocode cancer pathology reports that are submitted to Surveillance, Epidemiology, and End Results (SEER) registries across the country. This talk will give an update on the nearly decade-long project by highlighting some of the tools and methods used.

4:00 pm Talk Title to be Announced

John Capello, CTO, Product Management, Nasuni

4:15 pm The Current State of GPUs and What They Mean to AI in the Life Sciences

Keith Pijanowski, AI Solutions Engineer, MinIO

Nvidia, Intel, and AMD’s GPU advances are both thrilling and daunting. We’ve reached petaFLOP-scale performance, but such power can overwhelm your data infrastructure. What happens when GPUs outpace your network or storage? This short talk explores their progress and how object storage and GPUs can drive innovation in life sciences.

4:30 pm Best of Show Awards Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

Unwind with colleagues at our lively reception! Explore posters, vote for the best, network with exhibitors, enjoy a drink, and try to win a raffle prize. Celebrate Best of Show winners!

5:45 pm Close of Day

Friday, April 4

7:00 am Registration Open and Morning Coffee

7:00 am Quick Bytes & Networking Breakfast, Lifted Rooftop Restaurant & Bar (Sponsorship Opportunity Available)

Start your morning with ‘Quick Bytes & Networking’! Enjoy a cozy restaurant-style setting, quick bites, and speed networking. Connect, converse, and energize your Bio-IT experience before the plenary keynote!

8:00 am

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

8:05 am

Innovative Practices Awards: Excellence in Technological Innovation

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

Since 2003, Bio-IT World has hosted an elite awards program with the goal of highlighting outstanding examples of how technology innovations and strategic initiatives are being applied to advance life sciences research. The 2025 Innovative Practices Awards winners represent excellence in innovation in the areas of informatics, pre-competitive collaboration, clinical and health IT, and genomics. Companies driving the winning entries include Genmab, Genedata, NHS England, IQVIA, Pistoia Alliance, Regeneron, and Quris-AI. For more details about the Awards, visit www.bioitworldexpo.com/innovativepractices.

8:20 am PLENARY KEYNOTE PRESENTATION:

The Longitude Prize on ALS: A Groundbreaking Global Prize Harnessing the Power of AI to Drive Treatment for ALS

Tris Dyson, Founder, Challenge Works

Jeffrey D. Rothstein, MD, PhD, Professor, Neurology and Neuroscience; Director, Brain Science Institute, Johns Hopkins University

The Longitude Prize series brings together the brightest minds to solve the world's most challenging innovation problems. The Longitude Prize on ALS, launching in June 2025, will bring together computational biologists, neurodegenerative researchers and AI-driven biotech globally to uncover novel therapeutic targets for ALS. 

ADVANCING DRUG DISCOVERY AND HEALTHCARE THROUGH DATA-DRIVEN INNOVATION: FROM GENOMICS TO THERAPEUTICS

8:35 am PLENARY KEYNOTE INTRODUCTION: Shaping the Next Era of Precision Health with Multiomics and AI-Driven Predictive Insights

Rami Mehio, Vice President, Head of Global Software and Informatics, Illumina, Inc.

8:45 am PLENARY KEYNOTE PRESENTATION:

Scaling Genomic Medicine: Transforming Newborn Screening through Informatics and Innovation

Robert C. Green, MD, MPH, Professor and Director of Genomes2People Research, Mass General Brigham, Broad Institute, Ariadne Labs, and Harvard Medical School

The BabySeq Project has pioneered the integration of genomic sequencing into newborn and childhood screening, uncovering unexpected risk variants and transforming healthcare delivery. This keynote explores the groundbreaking progress in genomic medicine, featuring real-world stories of families impacted by these discoveries. Learn about the informatics challenges and innovative solutions required to scale genomic screening for national and global implementation, reshaping the future of precision medicine.

9:15 am PLENARY KEYNOTE PRESENTATION:

Unlocking the Power of Machine Learning and Data-at-Scale to Deliver with Speed the Best Therapeutic Candidates

Justin M. Scheer, PhD, Vice President In Silico Discovery & Head, Molecular Computational Team, Johnson & Johnson Innovative Medicine

The challenges of high costs, lengthy timelines, and significant attrition have prompted our industry to integrate AI/ML into all aspects of the business. This presentation highlights J&J's strategic investments in AI/ML technologies to enhance drug discovery processes, including molecule design and optimization. By investing in these technologies with a modality-agnostic approach, J&J aims to tackle the hardest targets in drug discovery, ultimately increasing the success rate of delivering better molecules faster.

9:45 am Coffee Break in the Exhibit Hall with Poster Competition Winners Announced (Sponsorship Opportunity Available)

Bio-IT is all about connections! Explore booths, award-winning posters, and network with clients, colleagues, and exhibitors. Grab coffee, build relationships, and stay for a chance to win a raffle prize!

10:30 am Organizer's Remarks

INNOVATIONS IN DATA INFRASTRUCTURE

10:35 am Chairperson's Remarks

Parker Happ, RSD, Boston, Sales, Komprise

10:40 am

The Power of Mimicry: Integrating AI/ML with Molecular Function Optimization

Adam Kraut, Director, Research Informatics and Data Architecture, Metaphore Bio

Discover how Metaphore leverages molecular mimicry alongside cutting-edge AI and ML algorithms to optimize molecular functions for drug development. This presentation will explore our function-first approach, integrating high-throughput experimental data with scalable data platforms and storage infrastructure. We will discuss how unifying data management and employing advanced analytics enable us to map, mimic, and optimize millions of molecules efficiently. Learn how our innovative integration of AI/ML with robust data solutions accelerates the discovery of next-generation therapeutics, transforming the landscape of drug development.

11:10 am

Data as a Strategic Asset: Empowering Users and Systems with Generate’s Next-Gen Data Lake

Alok Saldanha, PhD, Principal Software Engineer, Informatics, Generate Biomedicines

Come explore the best practices that have driven the success of Generate’s Next-Gen Data Lake. Learn how we leveraged a central repository of entities and vocabularies, reimagined lab processes to capture experimental context, and enhanced our computational infrastructure to be implicitly traceable. By building a unified data exchange, we empower scientists to confidently access and interpret data across the platform.

11:40 am

Intuence: A Next-Generation Data-Analysis Platform

Vimala Selvaraj, Senior Principal Scientific Product Operational Manager, Novartis Biomedical Research

Integrating various tools, workflows, and data products into a connected suite helps scientists make faster and better decisions. Intuence Discovery (ID) supports early drug discovery, focusing on lead optimization and technologies like protein degraders and cyclic peptides. ID helps with data annotation, gathering, analysis, and decision-making, covering the DMTA cycle from data querying to compound selection and analysis. ID is a fast, single webpage that enhances data-driven discussion.

12:10 pm Supercharge Computational Drug Discovery with AI-Powered Serverless High-Performance Computing (HPC)

Fengbo Ren, CEO, Computer Science & Engineering, Fovus Corp.

Fovus is an AI-powered, serverless high-performance computing (HPC) platform that puts intelligent, scalable, and cost-efficient supercomputing power at computational scientists' fingertips. Fovus uses AI to optimize HPC strategies and orchestrate cloud logistics, making cloud HPC a no-brainer and ensuring sustained time-cost optimality for computational drug discovery amid quickly evolving cloud infrastructure. By accelerating time-to-insights and optimizing cloud costs, Fovus helps biotech clients accelerate Design-Make-Test-Analyze (DMTA) cycles and discover more with less. Join this talk to learn how Fovus can supercharge your computational drug discovery, with case studies and GROMACS/AlphaFold 3 benchmarking results.

12:25 pm

Powering AI/ML at Scale: Building a Cloud-Native Infrastructure for Biopharma Innovation

Anand Murthy, Director, AI and Data Platform, Moderna

As AI and machine learning transform biopharma R&D, building a scalable, cost-efficient, and compliant cloud infrastructure is essential for accelerating innovation. We have embraced a fully cloud-native approach to power AI/ML workloads, enabling seamless access to data, high-performance compute, and secure collaboration. This session will explore key architectural decisions, trade-offs considered, and best practices for optimizing cloud environments to support AI/ML at scale. Attendees will gain insights into leveraging cloud technologies to drive scientific breakthroughs while maintaining flexibility, security, and cost efficiency.

12:40 pm Harnessing Agentic AI in R&D Cloud Ecosystems: Accelerating Clinical Innovation

Shakthi Kumar, Chief Strategy and Business Officer, EDETEK Inc

Imagine a world where clinical development is faster, smarter, and more efficient. The fusion of agentic AI with R&D cloud ecosystems is making this vision a reality. Join us to explore how this cutting-edge technology is revolutionizing clinical data management and analytics. (Spoiler: It's a game-changer!) Learn about: the transformative power of R&D Cloud Ecosystems in delivering next-gen digital data pathways; the innovative impact of agentic AI on clinical workflows; and real-world case studies showcasing the benefits of this integration. (Just a preview!)

1:10 pm Session Break and Transition to Lunch

1:20 pm Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own

1:50 pm Refreshment Break in the Exhibit Hall with Last Chance for Poster Viewing (Sponsorship Opportunity Available)

Feeling tired? Recharge during the final Networking Exhibit Hall break! Visit booths, explore posters, connect with peers, and turn in your Game Cards for a chance to win a raffle prize.

TRENDS FROM THE TRENCHES: BRIDGING TRADITIONAL INSIGHTS WITH INNOVATIVE ADVANCEMENTS

2:30 pm

Chairperson's Remarks

Dirk Petersen, Director of Supercomputing Center, Oregon State University

2:35 pm

Trends from the Trenches

Ari E. Berman, PhD, CEO, BioTeam, LLC

Since 2010, “Trends from the Trenches” has been a cornerstone of the Bio-IT program, delivering candid and occasionally blunt assessments of the most impactful and overhyped IT technologies in life sciences. This talk will provide a deep dive into computing, storage, cloud, data science, machine learning, and more, with a focus on supporting data-intensive science. Looking ahead, this talk will share forward-thinking predictions about emerging technologies and trends poised to shape the future of life sciences innovation, offering actionable insights for navigating the next wave of IT evolution.

3:05 pm

In the Trenches with AI Supercomputing: Driving Innovation in Life Sciences and Quantum Simulations

Dirk Petersen, Director of Supercomputing Center, Oregon State University

Launching in 2026, a new AI supercomputer powered by Nvidia’s latest Rubin-generation GPUs will transform research at Oregon State University’s Huang Collaborative Innovation Complex. This mini talk highlights its capabilities, from accelerating protein structure prediction to advancing quantum simulations to something completely new and different. Learn how you can get access to this cutting-edge resource, drive innovation in life sciences and quantum computing simulations, and discover opportunities to collaborate.

3:15 pm

Transforming Big Data into Actionable Insights: Leveraging the Sequence Read Archive (SRA) for Life Sciences and Public Health

J. Rodney Brister, PhD, Acting Program Head, Sequence Read Archive, NCBI, NLM, NIH

As the world's largest publicly available repository of raw sequence data, the Sequence Read Archive (SRA) plays a pivotal role in advancing public health and life sciences research. This presentation highlights state-of-the-art tools and strategies for managing and analyzing the SRA’s massive datasets, showcasing its impact on infectious disease surveillance, genomic epidemiology, and precision medicine. Discover how innovative informatics solutions are transforming raw data into actionable insights for global health challenges.

3:30 pm

The Biologist Explores Learning: Insights on LLMs, Deep Learning, and Personal Discoveries

Brian Osborne, PhD, Senior Principal Consultant, BioTeam, LLC

Many biologists who have spent years coding and thinking in terms of bioinformatics (protein and DNA sequence, genomics) are now engaging with machine learning, NLP, and LLMs. In this talk, a bioinformaticist will discuss the many lessons learned and the twists and turns encountered in these new fields. Topics will include new ways of thinking about computing with CPUs and GPUs, re-representing data, training, iteration and validation, version control and environments, new definitions of “pipeline,” and coming face-to-face with prediction and statistics.

3:45 pm Session Q&A

4:05 pm Close of Conference

