Data Science and Analytics Technologies

Tools and Methods for Extracting Insights and Value from Data to Advance Biomedical Research

April 16 - 17, 2024 ALL TIMES EST

The Data Science and Analytics Technologies track explores the tools, technologies, and programming languages employed by data scientists to extract additional insights and value from data. During presentations, we will discuss the significance of scalable platforms versus personalized data science assistance, the transformation into a data-driven organization, inventive strategies for data management and analytics, the art of identifying the genuine questions that require answers, the practical impact of data science, and the practical application of data science tools.

Monday, April 15

8:00 am Recommended Pre-Conference Workshops and Symposia*

On Monday, April 15, 2024, Cambridge Healthtech Institute is pleased to offer eight pre-conference Workshops scheduled across three time slots (8:00–10:00 am, 10:30 am–12:30 pm, and 2:00–4:00 pm) and six Symposia from 8:00 am–4:20 pm. All are designed to be instructional and interactive, and to provide in-depth information on a specific topic. They allow for one-on-one interaction and are a good way to cover more technical aspects that would otherwise not be addressed during the main conference tracks, which take place Tuesday–Wednesday.

*Separate registration required.

PLENARY KEYNOTE PROGRAM

4:30 pm

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

4:35 pm Plenary Keynote Introduction

Greg Mazzu, Regional Sales Manager, WEKA

4:45 pm PLENARY KEYNOTE PRESENTATION:

Unleashing the Power of Advanced Computing in Biomedical Informatics: A Vision for Transformative Collaboration

Daniel Stanzione, PhD, Executive Director, Texas Advanced Computing Center (TACC)

In the dynamic intersection of life science and computing, our mission at the Texas Advanced Computing Center (TACC) is to propel biomedical informatics into a new era of discovery and innovation. As computational leaders, we are dedicated to harnessing the potential of high-performance computing (HPC), machine learning (ML), and data analytics to revolutionize medicine. In this visionary pursuit, we prioritize the development of user-friendly interfaces and intuitive platforms. This approach ensures accessibility for executives and leaders in the life sciences industry, promoting seamless interaction with computational tools and fostering an environment where scientific and technological advancements coalesce. This presentation shares our vision for shaping the future of biomedical informatics where innovation, collaboration, and cutting-edge technologies converge to redefine the boundaries of what is possible in the realm of medicine.

6:00 pm Welcome Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

7:15 pm Close of Day

Tuesday, April 16

7:00 am Registration and Morning Coffee

PLENARY KEYNOTE PROGRAM

8:00 am

Organizer's Remarks

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

8:05 am Plenary Keynote Introduction

Josh Bond, Head of Product Management, Product Management, Revvity Signals

8:15 am PLENARY KEYNOTE PRESENTATION:

Unveiling Tomorrow's Possibilities: Embrace the Power of Digital Twins in Cancer Care and Research

Caroline Chung, MD, MSc, FRCPC, CIP, Vice President, Chief Data Officer, Director of Data Science Development & Implementation, Institute for Data Science in Oncology, MD Anderson Cancer Center

Explore the transformative potential of digital twins in revolutionizing cancer care and research. Gain insights into how digital twins can help deepen biological understanding, accelerate drug discovery, and personalize therapeutic strategies to optimize treatment outcomes for every individual. Amidst the exciting opportunities are the challenges that must be tackled to harness the power of digital twins to advance precision oncology, empower researchers and clinicians with unprecedented insights, and improve patient outcomes.

9:30 am Coffee Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

10:15 am Organizer's Welcome Remarks

PLATFORM, TOOL, AND SOFTWARE SOLUTIONS TO IMPROVE DATA ANALYSIS, SCALABILITY, INTEGRATION, AND WORKFLOWS

10:20 am Chairperson's Remarks

Joyce Wang, PhD, CEO, Ontologic

10:25 am

Integrative Multiomics Data Analysis in Biopharma: Navigating Scalability, Complexity, and Integration in the Era of Big Data

Michael A. Freitas, PhD, Professor, Biological Chemistry and Pharmacology, Ohio State University

In an era where biopharma is inundated with volumes of multiomics data, the challenge extends beyond data collection to robust analysis, scalability, and effective integration. This presentation offers an overview of software solutions that merge cloud computing, data engineering, and custom workflows. Rooted in real-world challenges, we’ll explore the trajectory from raw data to actionable insights, ensuring data security, shareability, and enterprise adaptability along with insights gained from customer-led software development.

10:55 am

Unlocking Data Science Potential: Leveraging LLMs and Software Skills for Accelerated Workflows

Eric Ma, PhD, Principal Data Scientist, Moderna, Inc.

This talk explores the potential of data science through the strategic integration of Large Language Models (LLMs) and software skills. Explore how this symbiotic relationship accelerates workflows, facilitating efficient data analysis and interpretation. Join us to grasp the transformative impact of synergizing LLMs and software expertise, offering a gateway to enhanced productivity and innovation in the dynamic realm of data science.

11:25 am

Privacy-Preserving Federated Learning-as-a-Service: Building Trustworthy AI Models and Biomedical Insights

Ravi K. Madduri, Scientist, Computation Institute, University of Chicago

Federated Learning (FL) enables the creation of more robust models without exposing local datasets. However, FL alone does not guarantee data privacy, because the information shared during training can be extracted and used to infer the private local data. We developed the Advanced Privacy-Preserving Federated Learning (APPFL) framework, with advances in differential privacy, to enable Privacy-Preserving Federated Learning (PPFL). We enabled training of AI models in a distributed setting across the multiple institutions where sensitive data are located, with the ability to scale, to help create robust, trustworthy AI models for biomedical applications where data privacy is essential.

11:55 am Swamps, Hallucinations and How to Avoid Them: Why Data Quality, Technology, and Expertise Matter

Frederik van den Broek, PhD, Senior Director, Professional Services and Consulting, Corporate R&D, Elsevier

Many believe Artificial Intelligence and Large Language Models will revolutionize life sciences and healthcare. With great expectations also come great risks of overlooking the foundations on which models are built. This increases the risk that, as in the past, large life science AI initiatives will fail to deliver on expectations. This talk focuses on the three main pillars that serve as foundations for good applications and models: data quality, technology, and expertise.

12:25 pm Artificial Intelligence: PrimateAI-3D and SpliceAI for Precision Medicine and Drug Discovery

Kyle Kai-How Farh, MD, PhD, Vice President & Distinguished Scientist, Illumina Artificial Intelligence Lab, Illumina

PrimateAI-3D and SpliceAI, leading algorithms from the Illumina Artificial Intelligence Laboratory, are disrupting how we approach drug discovery and precision medicine. In this session, we will review the data illustrating the impact that these two algorithms have on our understanding of variants of unknown significance (VUS) and of the non-coding region of the genome, which can no longer be thought of as “junk.”

12:55 pm Session Break & Transition to Lunch

1:05 pm LUNCHEON PRESENTATION: Coding Against a Rare Pediatric Cancer: xCures & Slalom's IT Mastery in the Moonshot

Mika Newton, CEO, xCures Inc.

Jeff Pierce, Managing Director, Life Sciences, Slalom Inc.

This talk details the work to create the DIPG-Onelink website and a health-data platform designed to help rare cancers as a part of President Biden’s Cancer Moonshot Initiative, which aims to reduce cancer death rates by 50% over the next 25 years. Learn about the website and platform's impact in the DIPG community and how they exemplify the tangible benefits of collaborative efforts in healthcare technology.

1:35 pm Refreshment Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

PLATFORM, TOOL, AND SOFTWARE SOLUTIONS TO IMPROVE DATA ANALYSIS, SCALABILITY, INTEGRATION, AND WORKFLOWS

2:25 pm Chairperson's Remarks (Sponsorship Opportunity Available)

2:30 pm

A FAIR-Compliant Cloud Computing Solution for Pharmaceutical R&D

Hugh Salamon, PhD, Bioinformatics and Data Science Strategist

Nebulaworks cloud engineers and a pharma client collaboratively architected and implemented a flexible platform for delivering bioinformatics services that ensures data analysis results and metadata are preserved. A code-first, easily deployable, containerized design provides data services, analysis services, and AI/ML-readiness. The architecture enables bioinformatics staff to focus on data integration and analysis that support asset evaluation from discovery through to clinical development.

3:00 pm

Advanced Data Science and Visualization Platform to Expedite Clinical Development

Kanishk Singh, Product Manager, Drug Development Data & Analytics IT, Bristol Myers Squibb Co.

The field of Data Science is characterized by its rapid evolution, marked by daily developments that hold the potential to significantly speed up the conduct of clinical trials. In this context, a dynamic data science platform built upon a polycloud architecture becomes indispensable, as it empowers data scientists to harness the cutting-edge capabilities offered by premier cloud service providers. Furthermore, the integration of advanced data visualization capabilities into this platform fosters seamless collaboration among researchers. Gain insights from our experience in implementing these platforms within a large pharmaceutical company, where they are used by hundreds of individuals. Our presentation will shed light on how we've successfully deployed this dynamic platform to cater to our users' evolving requirements amidst a rapidly changing technological landscape. We will demonstrate our approach to fulfilling diverse experimentation use cases using a Multiverse/Polycloud Data Science Platform.

3:30 pm

New Methods in Enzyme and Drug Characterization

Ryan Walsh, PhD, Research Scholar, Biochemistry, Ronin Institute

The biological sciences have a reproducibility problem related to the bottleneck created by the oversimplification of biological modeling. This talk discusses methods developed to improve the analysis of biological interactions so that a more nuanced understanding of physiological interactions can be produced. Learn about problems that have been caused by an incomplete understanding of drug interactions, specifically in Alzheimer's disease drug development. This will be illustrated by describing enzymatic mechanism-of-action studies and how these studies can benefit from automation of the analytical process.

4:00 pm Addressing Antibody Property Prediction at Scale

Thomas Blarre, PhD, Product Manager, Discngine

Recent advances in high-throughput antibody discovery from Next Generation Sequencing and Phage Display allow a larger number of candidates to be tested, proportionally increasing the potential for late-stage failures. Identifying liabilities, developability issues, and immunogenicity early therefore becomes essential. Here, we present an automatable, high-throughput, state-of-the-art, physics-based SaaS solution that predicts reliable 3D structures of various antibody types and derives 3D-sensitive predictors for liabilities and developability.

4:15 pm Volume, Validity, and Value: Considerations When Using Full-Text XML in Text and Data Mining

Adam Churchill, Product Solutions Manager, Corporate Solutions, CCC (Copyright Clearance Center)

Text and data mining remains a hugely important part of research and development workflows. It’s important to consider the provenance of data, the type of data and what kind of outcome we are trying to achieve when choosing the XML content that will feed our workflows. In this talk, Adam Churchill will explore the use of full-text scientific articles in TDM workflows and some of the challenges that come with it.

4:30 pm Best of Show Awards Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

5:45 pm Close of Day

Wednesday, April 17

7:30 am Registration and Morning Coffee

PLENARY KEYNOTE PROGRAM

8:00 am

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

8:05 am

Innovative Practices Awards

Joseph Cerro, Independent Consultant

John Conway, Chief Visioneer Officer, 20/15 Visioneers

Chris Dwan, Independent Consultant, Dwan, LLC

Allison Proffitt, Editorial Director, Bio-IT World and Clinical Research News

Since 2003, Bio-IT World has hosted an elite awards program with the goal of highlighting outstanding examples of how technology innovations and strategic initiatives are being applied to advance life sciences research. The 2024 Innovative Practices Awards winners represent excellence in innovation in the areas of informatics, pre-competitive collaboration, clinical and health IT, and genomics. Companies driving the winning entries include AstraZeneca, DNAnexus, Pistoia Alliance, Regeneron, Tempus, and UK Biobank.

8:20 am Plenary Keynote Introduction

Kshitij Kumar, Founder and CEO, Clovertex

8:30 am PLENARY KEYNOTE PRESENTATION:

Lights, Camera, Science: Film and Social Media Influence on Real-World Scientific Progress and Innovation

David Hewlett, Actor/Writer/Director; Creator, The Tech Bandits

Now, more than ever, life sciences are subject to misinterpretation, reduction, and inaccuracies at the hands of social media and Hollywood. And while it might be tempting to ignore the fake science streaming on YouTube and TikTok, there’s a generation of would-be investigators for whom those platforms might be their primary introduction to research and discovery. David Hewlett has had his share of big screen roles representing science—and science fiction—and he believes it’s imperative that the scientific and technology communities take back the narrative, filling gaps between what’s real and what could be real soon! He’s meeting this future generation where they are in schools, on YouTube, and on Twitch, championing real science in all its iterative, messy, exploratory glory, to recruit bright, diverse minds to lead the next generation of real scientists. He’s got our report from the front lines.

9:45 am Coffee Break in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available)

10:30 am Organizer's Remarks

DRIVING DISCOVERY: DATA SCIENCE AT THE FOREFRONT OF BIOMEDICAL ADVANCEMENT

10:35 am Chairperson's Remarks (Sponsorship Opportunity Available)

10:40 am CO-PRESENTATION:

Enabling Early Research through Industry-University Collaboration

Brian Martin, Head of AI, R&D Information Research; Research Fellow, AbbVie, Inc.

Kayvan Najarian, PhD, Professor, Computational Medicine & Bioinformatics, University of Michigan

The innovation and research performed globally at universities and research institutions are critical to the continued growth and success of the life sciences and healthcare industry as a whole. Join us for a discussion of an example collaboration that is part of the Center for Data-Driven Drug Development and Treatment Assessment (DATA) at the University of Michigan, supported by the National Science Foundation. We will describe a powerful model of collaboration that enables high-risk/high-potential early research while addressing most intellectual property concerns and encouraging industry- and academia-wide collaboration.

11:10 am

Integrating Human and Mammalian Model Data for Improved Human Health

Paul Flicek, DSc, Chief Data Science Officer, The Jackson Laboratory

The massive expansion of biological data collections in the last 20 years alongside dramatic developments in artificial intelligence has created an unprecedented opportunity for using data to integrate human and mammalian model research, unlock biological insights, and empower the biomedical community worldwide. To meet this challenge, Data Science at The Jackson Laboratory is leveraging our unique expertise in mammalian genetics to unlock and integrate robust, but currently unconnected or unavailable, life sciences datasets for data-driven discovery. This talk will discuss the data science opportunities for transforming how we incorporate model data into biomedical discovery.

11:40 am Presentation to be Announced

12:10 pm Leverage Genomic AI to Deliver a More Accurate and Comprehensive Genome

Min Li, PhD, Senior Director of Product Management, Illumina

Illumina is leading Genomic AI to understand the human genome and derive insights from large-scale human genomic data. Cutting-edge AI algorithms developed by the Illumina Artificial Intelligence Laboratory have been integrated into the entire genomic workflow, resulting in increased accuracy and efficiency in data analysis and interpretation. This integration holds immense potential to bolster drug discovery efficiency, provide deeper clinical insights, and enable more accurate genetic risk prediction.

12:40 pm Leveraging GenomOncology's Clinical Omics Platform: Seamless Integration, In-Depth Analysis, Strategic Data Utilization

Ian Maurer, Chief Technology Officer, GenomOncology

GenomOncology's dockerized, customizable, and scalable solutions streamline precision medicine data integration, analytics, and utilization across the healthcare ecosystem. In the dynamic landscape of precision medicine, evolving data presents challenges, and organizations lacking software that integrates, analyzes, and incorporates this data are disadvantaged. GenomOncology’s Precision Oncology Platform and suite of applications, backed by an extensive knowledge base and robust data framework, offer comprehensive clinical omics support, optimizing workflows and elevating patient care.

1:10 pm Session Break & Transition to Lunch

1:20 pm LUNCHEON PRESENTATION: How Data Management Informs Data Strategy: Perspectives Using OMERO Plus

Erin Diel, PhD, Head of Product, Glencoe Software

OMERO Plus is an image data management platform designed for storage and analysis of bioimaging data. Backed by Bio-Formats, OMERO Plus natively stores and retrieves image formats from various domains, including High Content Screening and Digital Pathology. Because complex image datasets and their associated sequencing, segmentation, or other analytical results can be stored and retrieved remotely, OMERO Plus is the data engine of choice for data analytics and AI in bioimaging.

1:50 pm Refreshment Break in the Exhibit Hall with Last Chance Poster Viewing (Sponsorship Opportunity Available)

TRENDS FROM THE TRENCHES

2:30 pm Chairperson's Remarks (Sponsorship Opportunity Available)

2:35 pm

Trends from the Trenches

Ari E. Berman, PhD, CEO, BioTeam, Inc.

Laura Boykin Okalebo, PhD, Senior Scientific Consultant, BioTeam, Inc.

Since 2010, “Trends from the Trenches” has been one of the most popular annual traditions in the Bio-IT program. The intent of the talk is to deliver a candid (and occasionally blunt) assessment of the best, the most worthwhile, and the most overhyped information technologies (IT) for life sciences. Learn about computing, storage, data transfer, networks, cloud, data science, machine learning, and more that are involved in supporting data-intensive science.

4:05 pm Close of Conference





