2024 ARCHIVES

FAIR Data

Accelerate Biomedical Discovery with FAIR Data Resources and Best Practices

April 15, 2024 ALL TIMES EST

The FAIR Data symposium is expected to shed light on the FAIR landscape of data repositories and knowledgebases that form the foundations of biomedical, behavioral, and health-related research, facilitating data sharing, analysis, and knowledge generation. Participants can expect to gain an understanding of the various resources, their unique value in the larger universe of data resources, the services they provide, and their collective impact on accelerating scientific discovery to improve healthcare outcomes and on empowering researchers and practitioners across domains. Additionally, the symposium provides an opportunity to network and identify collaborations to develop common capabilities and infrastructure among data resources.

Sunday, April 14

Registration Open 5:00 pm

Monday, April 15

Registration and Morning Coffee 7:00 am

Organizer's Remarks 8:00 am

8:05 am

Chairperson's Remarks

Ishwar Chandramouliswaran, Program Director, Office of Data Science Strategy, NIH

8:10 am

How FAIR is FAIR Enough?

Vinay C. Desai, PhD, MBA, Senior Director Regeneron IT, Regeneron Pharmaceuticals, Inc.

Michael Georgiadis, Principal Scientific Business Analyst, Research IT, Regeneron Pharmaceuticals, Inc.

Michael Livstone, PhD, Scientific Data Curation Lead, Regeneron Pharmaceuticals, Inc.

John McLoughlin, Associate Director IT, Regeneron Pharmaceuticals, Inc.

We have begun an effort to make our data FAIR, building from the principle that metadata should be collected once ("born FAIR") and then transmitted wherever it is needed. We weighed the pros and cons of several approaches against business needs and adopted an Agile approach to building and evolving FAIR systems for instrument files. One major consideration was how much FAIRness is required to achieve adequate, scalable usability.

8:30 am

Data Repository Attributes—FAIR Repositories

Michael Witt, Head, Distributed Data Curation Center, Purdue University

One important step towards achieving FAIR data is developing and improving data repositories so that they are themselves findable, accessible, and interoperable. The recent recommendation from the Research Data Alliance, Common Descriptive Attributes of Research Data Repositories, provides guidance to enable well-described repositories to support researchers, funders, publishers, repository developers and managers, registries, and other stakeholders, including both users and user agents.

8:50 am

Putting FAIR into Practice: It Takes a Village

Susanna-Assunta Sansone, PhD, Professor of Data Readiness, Department of Engineering Science; Academic Lead for Research Practice, University of Oxford

The FAIR Principles have succeeded in uniting stakeholders worldwide behind a common concept: good data management under common standards. However, FAIR is aspirational, and the narrative principles are insufficient to circumscribe the valid mechanisms to achieve the behaviours they describe. This presentation provides an overview of a community-driven resource and a community task force, showing how they contribute to enabling FAIR compliance and turning FAIR into reality.

9:10 am

Systematic Annotation of Bioassay Protocols Using Ontologies

Alex Clark, PhD, Research Scientist, Research Informatics, Collaborative Drug Discovery

We have previously demonstrated retroactive markup of assay protocols with the BioAssay Express and DataFAIRy projects using a hybrid machine learning/expert curation strategy. Going forward, it is preferable to generate machine-readable data at the point of data creation, and we have integrated this functionality into the CDD Vault software-as-a-service product. We will discuss how user interface improvements can make metadata easier to create than raw text.

Networking Coffee Break 9:40 am

10:00 am

The ELIXIR FAIR Cookbook: Turning FAIR into Practice

Philippe Rocca-Serra, PhD, Senior Director FAIR Collaborations R&D, AstraZeneca, Cambridge UK; Associate Member of Faculty, Oxford e-Research Centre, University of Oxford

Created by data managers, professionals in academia, (bio)pharmaceutical companies, and information service industries, the FAIR Cookbook is an online resource of hands-on recipes that guides researchers and data stewards in their FAIRification journey. It also provides policy-makers and trainers with practical examples to recommend in their guidance and use in their educational material. Part of the ELIXIR ecosystem, this resource is open to contributions of new recipes.

10:20 am

FAIR for Machine Learning: Building on the Lessons from FAIR Software

Fotis Psomopoulos, PhD, Senior Researcher, INAB|CERTH

Ensuring that data are FAIR is nowadays a clear expectation across all science domains, as a result of many years of global efforts. Research software has only recently started to receive the same level of attention, with targeted actions towards the definition of the FAIR principles as applied to research software, as well as concerted efforts around reproducibility, quality, and sustainability. Given the rapid rise of ML as a key technology across all science domains, it is important to build on our collective experience and start addressing the challenges ahead of us now, towards making ML FAIR.

10:40 am

Implementing FAIR Biomedical Research Software

Bhavesh Patel, PhD, Associate Research Professor, FAIR Data Innovations Hub, California Medical Innovations Institute

Research software, such as data analysis tools and AI models, has become an essential part of biomedical research. Making it FAIR is therefore critical to enable the reproducibility of research results, prevent duplicate efforts, and ultimately increase the pace of discoveries. In this talk, we discuss what it means to make software FAIR and present the FAIR Biomedical Research Software (FAIR-BioRS) guidelines, which offer actionable steps for making software FAIR.

11:00 am

FAIR: Alliance of Genome Resources

Paul Sternberg, PhD, Bren Professor of Biology, Biology & Biological Engineering, California Institute of Technology

The Alliance of Genome Resources is a consortium of model organism knowledgebases (MODs) and the Gene Ontology Consortium whose goals are to use biocuration coupled with AI/ML to make information computable and FAIR, to support comparative genomics, and to promote sustainability of core community data resources. We develop widely used ontologies, make shared instances of common tools and processes, and present both integrative and organism-centric views of genetic and genomic data. Our progress and challenges in making the complete data lifecycle FAIR will be discussed.

11:20 am

Developing a US National PID Strategy

Todd Carpenter, Executive Director, National Information Standards Organization (NISO)

John Chodacki, Director, University of California Curation Center (UC3)

This presentation outlines a national strategy for integrating persistent identifiers (PIDs) and metadata into research following OSTP guidelines for research integrity. In this session, co-led by Todd Carpenter (NISO) and John Chodacki (CDL/RDA-US), we'll discuss consensus-building for PIDs, promoting best practices, and recommending PIDs for specific uses. The goal is to enhance metadata quality, ensure seamless data flow across platforms, and potentially set a National Standard, benefiting a broad spectrum of research stakeholders.

Luncheon Presentation (Sponsorship Opportunity Available) or Enjoy Lunch on Your Own 11:40 am

Session Break 12:40 pm

12:55 pm

Chairperson's Remarks

Ishwar Chandramouliswaran, Program Director, Office of Data Science Strategy, NIH

1:00 pm

Figshare FAIR Best Practices

Dan Valen, Head of Strategic Development, Figshare

Figshare is a flexible generalist repository that allows researchers to FAIR-ly share any research output in a trusted repository so it is discoverable and reusable. As part of the NIH Generalist Repository Ecosystem Initiative (GREI), Figshare has been working together with other repositories to enhance its interoperable metadata, use of persistent identifiers, user interface, search capabilities, and metrics reporting to support both the sharing and discovery of FAIR open data.

1:20 pm

FAIR and Compliant: A Blueprint for Collaborating on Protected Data

Rachana Ananthakrishnan, Executive Director, University of Chicago

The explosion in the amount of data coming off instruments, new research data sharing policy requirements for publication of scientific data, and the availability of a wide diversity of storage systems all contribute to the increased demands on system administrators in research computing. Globus (globus.org) is a comprehensive platform for research IT that includes data description and discovery, protected data management, and automation. The platform balances findability and privacy, and improves the user experience through service offerings that abstract away system complexities and reduce obstacles in data management, while enabling access to remote computing.

1:40 pm

Building on a FAIRly Strong Foundation to Connect Academic Research to Translational Impact

Jack DiGiovanna, PhD, CSO, Velsera

Making data and analytics FAIR has transformative potential within organizations to build on existing knowledge. FAIR resources also democratize access to information and tools in underserved communities. Global standards and analysis platforms provide strong foundational elements. However, FAIRness across time and different sectors of the biomedical workforce presents challenges. Here we summarize how platforms make data and analysis FAIR today and what we see as key areas of future focus.

Networking Refreshment Break 2:00 pm

2:20 pm

RDMkit Alliance

Munazah Andrabi, PhD, Data & Community Manager, The University of Manchester

As the scientific community strives for FAIR data, good research data management (RDM) is becoming crucial. The ELIXIR RDMkit is a framework of best practices for RDM, acting as a hub of RDM information. It offers links to registries for tools, training materials, standards, and databases, as well as to services offering deeper knowledge for FAIRification practices. Guided by a community-driven model, RDMkit has contributions from data stewards, researchers, and RDM experts from all areas of biomedical sciences. In this talk, we will present the RDMkit: its aims, content, community, and potential prospects for the RDMkit Alliance, a global cooperation.

2:40 pm

Why Industry Should Care: Boosting Research Efficiency with FAIR Data

Juergen Harter, PhD, CEO, The Cambridge Crystallographic Data Centre (CCDC)

The Cambridge Crystallographic Data Centre (CCDC) maintains the Cambridge Structural Database (CSD), a trusted repository of 1.2M+ experimental 3D structures used by academics and researchers in pharmaceutical, agrochemical, and fine chemical industries. FAIR data concepts have driven our approach to sustaining this critical global resource. We will share our FAIR journey and reflect on the value and importance of the FAIR data principles to the life sciences industry at large.

3:00 pm PANEL DISCUSSION WITH SYMPOSIUM SPEAKERS FROM THE MORNING:

FAIR Resources

PANEL MODERATORS:

Ishwar Chandramouliswaran, Program Director, Office of Data Science Strategy, NIH

Nick Lynch, PhD, Founder & CTO, Curlew Research; Member, FAIRplus Consortium

PANELISTS:

Paul Sternberg, PhD, Bren Professor of Biology, Biology & Biological Engineering, California Institute of Technology

Bhavesh Patel, PhD, Associate Research Professor, FAIR Data Innovations Hub, California Medical Innovations Institute

Fotis Psomopoulos, PhD, Senior Researcher, INAB|CERTH

Susanna-Assunta Sansone, PhD, Professor of Data Readiness, Department of Engineering Science; Academic Lead for Research Practice, University of Oxford

Philippe Rocca-Serra, PhD, Senior Director FAIR Collaborations R&D, AstraZeneca, Cambridge UK; Associate Member of Faculty, Oxford e-Research Centre, University of Oxford

Michael Witt, Head, Distributed Data Curation Center, Purdue University

3:40 pm PANEL DISCUSSION WITH SYMPOSIUM SPEAKERS FROM THE AFTERNOON:

FAIR Platform

PANEL MODERATORS:

Ishwar Chandramouliswaran, Program Director, Office of Data Science Strategy, NIH

Nick Lynch, PhD, Founder & CTO, Curlew Research; Member, FAIRplus Consortium

PANELISTS:

Munazah Andrabi, PhD, Data & Community Manager, The University of Manchester

Jack DiGiovanna, PhD, CSO, Velsera

Juergen Harter, PhD, CEO, The Cambridge Crystallographic Data Centre (CCDC)

Rachana Ananthakrishnan, Executive Director, University of Chicago

Dan Valen, Head of Strategic Development, Figshare

Close of Symposium 4:20 pm

Transition to Plenary Keynote 4:20 pm

4:30 pm

Organizer's Remarks

Cindy Crowninshield, Executive Event Director, Cambridge Healthtech Institute

4:35 pm

Plenary Keynote Introduction

Greg Mazzu, Regional Sales Manager, WEKA

4:45 pm PLENARY KEYNOTE PRESENTATION:

Unleashing the Power of Advanced Computing in Biomedical Informatics: A Vision for Transformative Collaboration

Daniel Stanzione, PhD, Executive Director, Texas Advanced Computing Center (TACC)

In the dynamic intersection of life science and computing, our mission at the Texas Advanced Computing Center (TACC) is to propel biomedical informatics into a new era of discovery and innovation. As computational leaders, we are dedicated to harnessing the potential of high-performance computing (HPC), machine learning (ML), and data analytics to revolutionize medicine. In this visionary pursuit, we prioritize the development of user-friendly interfaces and intuitive platforms. This approach ensures accessibility for executives and leaders in the life sciences industry, promoting seamless interaction with computational tools and fostering an environment where scientific and technological advancements coalesce. This presentation shares our vision for shaping the future of biomedical informatics where innovation, collaboration, and cutting-edge technologies converge to redefine the boundaries of what is possible in the realm of medicine.

Welcome Reception in the Exhibit Hall with Poster Viewing (Sponsorship Opportunity Available) 6:00 pm

Close of Day 7:15 pm