Valerio Pascucci
University of Utah

Time: Friday, March 20, 12:30 PM - 1:30 PM      Location: MKB 622

The National Science Data Fabric and its evolution into a federated omnicloud for AI development

Abstract:

Modern science, with its explosion of data from instruments, sensors, and simulations, urgently requires a national cyberinfrastructure to break down barriers to discovery and ensure broad access. This presentation introduces the National Science Data Fabric (NSDF), a production-grade, national-scale cyberinfrastructure. NSDF is designed to deliver peta-scalable FAIR data services and to integrate them seamlessly with computing resources from both public clouds and High-Performance Computing (HPC) environments. NSDF has successfully supported a diverse range of scientific endeavors, including experiments at the Cornell Synchrotron Light Source, dark matter studies using SLAC data, weather and climate research at NASA, support for autonomous laboratories at DOE national labs, and the development of AI tools for advanced precision surgery. The talk will also discuss how the NSDF tools and experience enabled the introduction of "ATLAS: Advanced Training and Learning at Scale," a national omnicloud for complete AI development in medicine. ATLAS, which involves over 30 institutions, features a comprehensive cyberinfrastructure offering the following services:

  • Data Marketplace: A complete representation of the US population's data.
  • Computing and Storage: Virtually unlimited growth potential in computing and storage, leveraging hyperscalers like Google and leadership computing environments such as TACC and ALCF.
  • AI Tools and Services: A full suite of tools for data searching, data deduplication and harmonization, annotation services for AI training, and clinical and regulatory consulting.
  • Federated Learning Framework: The ATLAS system enables the use of images from multiple hospitals and secure enclaves without requiring any data to be moved outside the institution of origin.

The talk will close with a few considerations on how to reconnect ATLAS to the original NSDF design and make most of its services domain-agnostic so they can be applied to other applications with societal impact, such as manufacturing and materials science.

Bio:

Valerio Pascucci is the founding Director of the Center for Extreme Data Management, Analysis, and Visualization (CEDMAV), a faculty member of the Scientific Computing and Imaging Institute (Inaugural John R. Parks Endowed Chair), a Professor in the School of Computing at the University of Utah, and a DOE Laboratory Fellow at PNNL. Valerio received the 2022 IEEE VGTC Visualization Technical Achievement Award and the 2022-2023 Distinguished Research Award (DRA) from the University of Utah, and was inducted into the IEEE VGTC Visualization Academy in 2022. Valerio is also the President of ViSOAR LLC, a University of Utah spin-off, and the founder of Data Intensive Science, a 501(c) nonprofit providing outreach and training to promote the use of advanced technologies for science and engineering. Valerio's research interests include Big Data management and analytics, progressive multi-resolution techniques in scientific visualization, discrete topology, and compression. Valerio is the coauthor of more than three hundred refereed journal and conference papers and was an Associate Editor of the IEEE Transactions on Visualization and Computer Graphics.