Is the burden of managing your data growing larger every day? Do you have a scalable and robust data management infrastructure in place to process, analyze, and store vast quantities of data according to your organization's policies? Is your organization
using new tools and analytical processes such as AI and deep learning that stress your supporting IT infrastructure beyond the expectations of system designers? Managing data has become a prevalent issue in the life sciences industry. Organizations
are spending millions on systems and platforms to manage and store many types of data (e.g., experimental, operational, clinical) from many disparate sources. The role of data engineering is critical in orchestrating, configuring, managing, and scaling
solutions to manage the data bloat problem. The Data & Storage Management track presents in-depth case studies from leading life sciences organizations that are implementing solutions to address these data issues. Presentations
will focus on people, process and technology issues related to storage platforms, architectures, integration and migration plans, governance, collaboration, scalability and cost efficiencies.
Final Agenda
Tuesday, April 16
7:00 am Workshop Registration Open and Morning Coffee
8:00 – 11:30 Recommended Morning Pre-Conference Workshops*
W5. Managing Sensitive and HIPAA-Controlled Data with Globus
12:30 – 4:00 pm Recommended Afternoon Pre-Conference Workshops*
W12. Data Science Driving Better Informed Decisions
* Separate registration required.
2:00 – 6:30 Main Conference Registration Open
4:00 PLENARY KEYNOTE SESSION
Amphitheater
5:00 – 7:00 Welcome Reception in the Exhibit Hall with Poster Viewing and Meet the Experts: Plenary Keynote Speaker
Wednesday, April 17
7:30 am Registration Open and Morning Coffee
8:00 PLENARY KEYNOTE SESSION
Cambridge Complex
9:45 Coffee Break in the Exhibit Hall with Poster Viewing
10:50 Chairperson's Remarks
Hongliang Tang, Senior Director and Chief Architect, Huawei American Storage Research Lab, Futurewei Technologies, Inc.
11:00 DB4Sci, Open Source Database as a Service (DBaaS) for On-Prem and Cloud
John Dey, HPC Systems Engineer, Scientific Computing, Fred Hutchinson Cancer Research Center
Cloud-based databases as a service (DBaaS) have greatly simplified database management. We can create database instances using best-practice configurations, including backup and DR plans, with a single push of a button. However, databases are sensitive
to latency, and cloud-based databases cannot be used effectively from on-prem. Supporting Postgres, MongoDB, MariaDB/MySQL and Neo4j graph databases, DB4Sci is the ideal DBaaS solution for on-premises and multi-cloud deployments that supports high-performance
backup to cloud storage. The audience will learn how to deploy a very robust and fast database service with a simple architecture. At its core, DB4Sci is a rather simple Python-Flask app that uses Docker commands to manage database instances in
containers. The simple architecture is intentionally not designed around enterprise features such as high availability (HA) and business continuity. Instead, we focus on our ability to recover from disasters (DR). Data is backed up to cloud storage
at regular intervals and can be restored by an administrator or by the end user, for example on a server in a different cloud. We can demonstrate that users can be back in business within a few minutes after a major failure.
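As a rough illustration of the "Python app that uses Docker commands" idea described in this abstract, the sketch below builds a `docker run` command line for a single containerized Postgres instance. It is not DB4Sci's actual code: the function names, the instance name `lab_db`, the port, the volume naming scheme, and the `postgres:16` image tag are all assumptions made for the example.

```python
import re
import shlex
import subprocess

# Docker container names must start with an alphanumeric character.
_NAME_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9_.-]*")


def build_run_command(name, password, host_port, image="postgres:16"):
    """Return the argv list that would start one Postgres container.

    Hypothetical helper: the naming scheme and defaults are assumptions.
    """
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"unsafe instance name: {name!r}")
    return [
        "docker", "run", "--detach",
        "--name", name,
        # The official postgres image reads the superuser password from here.
        # Note: values passed this way are visible in the host process list.
        "--env", f"POSTGRES_PASSWORD={password}",
        # Map a host port to the container's default Postgres port.
        "--publish", f"{host_port}:5432",
        # A named volume keeps the data when the container is replaced.
        "--volume", f"{name}-data:/var/lib/postgresql/data",
        image,
    ]


def start_instance(name, password, host_port):
    """Execute the command; this requires a running local Docker daemon."""
    return subprocess.run(build_run_command(name, password, host_port), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch works without Docker.
    print(shlex.join(build_run_command("lab_db", "change-me", 5433)))
```

Passing the command as an argument list rather than a shell string avoids shell quoting problems when instance names come from a web form.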
11:30 Data Centralization for Any Lab, Any Equipment, Any Software
Charles Fracchia, Founder and CEO, BioBright
Jarrod Medeiros, Director of Informatics and IT, Casma Therapeutics
It’s all too easy to end up with cloud infrastructures that mirror the shortcomings of local data management. In this talk, we will present how carefully designed software can make data available seamlessly, removing the need for scientists to dig
through disparate systems to find what they need and analyze it. We will present a new model that allows data centralization and cloud-based data analysis while minimizing the burden on the scientist. We will share concrete use cases for how to effectively
migrate to a data-centric workflow that takes advantage of cloud storage and analytics. Attendees will leave with five steps to help plan and evaluate their approach to cloud data management: 1) examining the current flow of data, 2) finding out the
scientific needs for data, 3) calculating storage needs for seamless scale up, 4) connecting the dots between storage and analysis, and 5) designing for future integrations/growth.
12:00 pm Architecting for Success with Machine Learning Data Platforms for Image Analysis and Precision Medicine
William Beaudin, Director of Solutions Engineering, DDN Storage
Aspects of precision medicine, including automated image analysis or mining patient data to better target therapies, leverage AI and deep learning. While early training data fits in-node, successful approaches attract more data. Forward-thinking organizations
adopt scalable architectures; the unprepared fall behind. We review key considerations for machine-learning platforms that ensure effortless scaling, deeper insights, and a shorter path to value.
12:15 Internet2: Leveraging Distributed Resources to Speed Discovery
Dan Taylor, Director, Business Development, Internet2
Few life sciences organizations take advantage of the vast resources available to R&D organizations for continuous innovation and keeping pace with big data. This session will discuss the infrastructure underlying collaborations that use private,
academic and public resources – including commercial cloud and supercomputing center storage and processing – to maximize options and speed discovery.
12:30 Session Break
12:40 NEW: Luncheon Co-Presentation I: Accelerating Life Sciences Workflows Using Software Defined Storage
David Hiatt, Director, Product Marketing and Business Development, WekaIO
Chris Dagdigian, Co-Founder and Senior Director, Infrastructure, BioTeam, Inc.
In this presentation we will compare the results of Cryo-EM and genomic pipelines run on a traditional storage architecture to those run on a modern scale-out storage system. See how the modern scale-out system can meet the mixed workload challenges of
life sciences and outperform the storage system for the largest supercomputer in the world.
1:10 Luncheon Presentation II: Accelerate Precision Medicine with High Performance Data and AI
Frank Lee, PhD, Global Industry Leader for Healthcare and Life Sciences - IBM Systems
Get your data and apps ready for precision medicine and research in the multicloud era, to derive faster insights with high performance data and AI architecture. Join Frank Lee, PhD, Global Industry Leader for Healthcare and Life Sciences, as he presents
real-life use cases and best practices for high performance genomics and imaging with deep learning that will help you deliver new records for speed and scale, cost efficiencies, collaboration and ease of use.
1:40 Session Break
1:50 Chairperson’s Remarks
Brigitte Raumann, Product Manager, Globus, University of Chicago
1:55 Achieving Compliant Collaboration: Securely Managing Protected Data to Accelerate Discovery
Brigitte Raumann, Product Manager, Globus, University of Chicago
Researchers working with protected data, such as HIPAA-regulated data and controlled unclassified information, face many challenges in managing this data and sharing it with colleagues. Meeting compliance requirements is complicated, and investigators
must often either slow their process to address this burden, or resort to using distilled, de-identified data instead. With higher assurance levels provided by Globus, the leading research data management service, users can optimize their protected
data environments by integrating secure, scalable data management capabilities into existing workflows and applications. Attendees will learn about features and enhancements to the Globus service that make it possible to manage protected data in a
compliant manner, and will gain an understanding of how their organization can benefit from these features.
2:25 Research, Privacy and Security
Kris Torgerson, Chief Information and Privacy Officer, Oak Ridge National Laboratory
Research, Privacy, and Security… can they coexist? How to enable research, influence outcomes, and protect the mission responsibly. In a world where well-funded bad actors are actively working to own your data, what are strategies to minimize risk?
2:55 Solving Genomic Data Privacy in the Age of AI
Esteban Rubens, Global Principal, Enterprise Imaging Healthcare, Pure Storage
Health data protection is of paramount importance, with all stakeholders in the healthcare industry looking to adopt AI to improve patient care. We will provide examples of an API-driven Data Hub solution that enables life sciences and healthcare organizations
to leverage the advancements of AI to help improve diagnoses, find better treatments, and discover new drugs while protecting confidential patient information.
3:25 Refreshment Break in the Exhibit Hall with Poster Viewing, Meet the Experts: Bio-IT World Editorial Team, and Book Signing with Joseph Kvedar, MD, Author, The Internet of Healthy Things℠ (Book will be available
for purchase onsite)
4:00 Infrastructure Automation: Real Examples
Karl Gutwin, Senior Scientific Consultant, BioTeam, Inc.
A wide variety of infrastructure automation tooling has existed in many forms for many years; however, it has yet to achieve consistent presence within our day-to-day systems and processes. Automation, when successful, has the potential to measurably
improve clarity, reliability and capacity for engineering and operations teams. This talk will walk through the most prevalent automation tools that I have seen in practice, and give real, working examples that you can take back to your office and
try yourself. We’ll cover the why, the how and what could possibly go wrong when using automation for your IT infrastructure.
4:30 A Novel Psychiatric Registry System and Its Utilization for Clinical and Pharmaceutical Research
András London, PhD, Assistant Professor, University of Szeged
The development of medical IT systems has opened new opportunities and challenges in many fields of healthcare. One goal is to create a "learning health system" that incorporates data from patients, clinicians, laboratories, and many other information
sources to translate information into knowledge. There has been a continuously growing demand to create patient registries where the collected data is readily applicable for statistical analysis using both standard and advanced methods, such as machine
learning. Despite the wide-ranging applicability of registry databases, their development and spread remain highly limited because of the significant additional effort they require on top of daily patient care and other administrative
obligations. A possible solution to the problem is the integration of patient registries with the standard EHR patient administration systems. In this talk, we present our experiences from the development of a psychiatric registry, its integration
with patient administration systems, and data mining to investigate the effects of negative symptoms of schizophrenia.
5:00 Managing Genomic Data with Regional Encryption for Efficient Storage, Regulated Access and Proven Compliance
Dan Greenfield, PhD, Co-founder & CEO, PetaGene
PetaGene has added encryption and data management to its award-winning compression. This enables organizations to manage access to their genomic data by internal and external teams, secured with fine-grained regional encryption and deep auditing of data
usage. Moreover, this is done in a manner transparent to existing tools and pipelines and integrates with existing on-premises and cloud storage infrastructure.
5:30 Best of Show Awards Reception in the Exhibit Hall with Poster Viewing
Thursday, April 18
7:30 am Registration Open and Morning Coffee
8:00 PLENARY KEYNOTE SESSION & AWARDS PROGRAM
Amphitheater
9:45 Coffee Break in the Exhibit Hall and Poster Competition Winners Announced
10:30 Chairperson’s Remarks
Bill Fox, Vice President, Vertical Strategy & Chief Strategist, Healthcare & Life Sciences, MarkLogic
10:40 The Evolution of DNA Encoded Library Data Management: Lessons Learned Along the Way
Neil Carlson, Investigator, Medicinal Science & Technology, GSK Cambridge R&D
GSK’s DNA Encoded Library Technology (ELT) generates hundreds of millions of sequences each week as the primary readout for analysis. We have developed a robust data management, tracking and delivery platform to meet the often-changing needs of
our diverse user base. This talk will review the 12-year evolution of our informatics platform, focusing on the specific challenges we've faced and how we've chosen to address them. Attendees will benefit from the lessons we've learned
and will be able to apply these learnings when designing their own storage and delivery solutions.
11:10 The Usage of DNA Encoded Libraries to Predict Target Tractability: Application of the Informatics Platform
Ken Lind, PhD, Computational Chemist, GSK Cambridge R&D
Recent advances in both genome-wide screening and genome-wide analyses have enabled the identification of numerous putative therapeutically relevant targets for hit identification programs. Pursuing all of these targets in small molecule hit ID programs
is neither feasible nor warranted. We have developed and deployed Encoded Library Technology (ELT) protocols that rapidly predict the small molecule tractability of novel targets. Attendees will learn how we leverage our analysis platform to quickly
prioritize novel therapeutically relevant targets and focus small molecule hit identification efforts on those that are most likely to succeed.
11:40 "Data Wars" What R&D Organizations Need to Do In Order to Survive The Near Future
John F. Conway, Global Head of R&D&C IT, Science and Enabling Units IT, AstraZeneca
R&D organizations, from startup to mature, need to quickly build a culture that treats data, information, and knowledge as assets and emulate a data company. R&D organizations need improved stringency from data capture to contextualization to
reuse. The FAIR principles are criteria to measure success in the journey, but it starts with a written scientific data strategy that outlines the what, the who and the how from a change management and cadence perspective. Simply put, we have to stop
treating our data like trash and instead treat it as another form of currency that has immense value.
12:10 pm Session Break
12:20 Luncheon Co-Presentation: Building a Modern Research Data Hub
Bill Fox, Vice President, Vertical Strategy & Chief Strategist, Healthcare & Life Sciences, MarkLogic
Imran Chaudhri, Chief Architect, Healthcare & Life Sciences, MarkLogic
One of the primary challenges in transforming real world data into valuable real world evidence lies in its diverse format and structure. The need to extract greater value from this multi-structured data is compelling pharmaceutical companies to move
away from outdated and siloed IT infrastructures in favor of more agile, modern data management solutions. In this discussion, you will learn how the MarkLogic data hub framework is empowering many of the top global pharmaceutical companies to quickly
break through data silos and build innovative applications in less time and at lower cost. To illustrate, the talk will also include a demo of a pharmacovigilance application built for a Top 10 pharma by just two people in four weeks.
12:50 Session Break
1:20 Dessert Refreshment Break in the Exhibit Hall with Poster Viewing
1:55 Chairperson’s Remarks
Chris Dwan, Senior Technologist and Independent Life Sciences Consultant
2:00 PANEL DISCUSSION: High Performance Consultancies
Moderator:
Chris Dwan, Senior Technologist and Independent Life Sciences Consultant
Panelists:
Tanya Cashorali, CEO, Founder, TCB Analytics
Aaron Gardner, Director of Technology, BioTeam, Inc.
Eleanor Howe, PhD, Founder and CEO, Diamond Age Data Science
An organization must learn and understand the value of why, when and how to use a consultancy. Highly trained and skilled professional experts gather to discuss their role in leading and managing projects for organizations to help them achieve goals.
They will discuss a variety of themes including the best kinds of projects to hire a consultancy for, the timeline of when an organization should hire a consultant vs. full time staff, and big challenges on the horizon. The session will feature short
podium presentations, followed by a moderated Q&A panel with attendees. The topic of hiring a consulting company came up in the data science plenary keynote at Bio-IT 2018. We want to spend time at Bio-IT 2019 exploring this topic in finer detail.
3:20 KEYNOTE PRESENTATION: Trends from the Trenches 2019
Chris Dagdigian, Co-Founder and Senior Director, Infrastructure, BioTeam, Inc.
“Trends from the Trenches” returns to Bio-IT in its original “state of the state address” format! Since 2010, the “Trends from the Trenches” presentation, given by Chris Dagdigian, has been one of the most popular
annual traditions on the Bio-IT Program. The intent of the talk is to deliver a candid (and occasionally blunt) assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. The presentation has
helped scientists, leadership, and IT professionals understand the basic topics related to computing, storage, data transfer, networks, and cloud that are involved in supporting data intensive science.
4:00 Conference Adjourns