Research Projects

Projects

ANACIN-X: Analysis and modeling of Non-determinism and Associated Costs in eXtreme scale applications
Source of Support:
National Science Foundation (NSF): CCF
Project Period:
Jun 1, 2019 – May 31, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project advances the study of nondeterministic HPC applications by studying the recording costs of record-and-replay (R&R) tools and by defining strategies so that these tools can scale to the exascale domain.
  • NSF REU Supplement, $16,000, single PI, 2019
Web Page:
Anacin-x
BIGDATA: IA: Collaborative Research: In Situ Data Analytics for Next Generation Molecular Dynamics Workflows
Source of Support:
National Science Foundation (NSF): IIS and Advanced Cyberinfrastructure (OAC)
Project Period:
Oct 1, 2017 – Sep 30, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
This interdisciplinary project tackles the challenge of analyzing data from molecular dynamics simulations on next-generation supercomputers. Specifically, this effort combines machine learning and data analytics approaches, workflow management methods, and high-performance computing techniques to analyze molecular dynamics data as it is generated.
Web Page:
Analytics4MD
CIF21 DIBBs: PD: Cyberinfrastructure Tools for Precision Agriculture in the 21st Century
Source of Support:
National Science Foundation (NSF): Advanced Cyberinfrastructure (OAC)
Project Period:
Jul 1, 2017 – Jun 30, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
This interdisciplinary project applies computer science approaches and computational resources to large multidimensional environmental datasets, and synthesizes this information into ecoinformatics, a branch of informatics that analyzes ecological and environmental science variables such as information on landscapes, soils, climate, organisms, and ecosystems.
Web Page:
SOMOSPIE
Collaborative Research: PPoSS: Planning: Performance Scalability, Trust, and Reproducibility: A Community Roadmap to Robust Science in High-throughput Applications
Source of Support:
National Science Foundation (NSF)
Project Period:
Oct 1, 2020 – Sep 30, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
The project recruits a cross-disciplinary community working together in three virtual mini-workshops called virtual world cafes to define, design, implement, and use a set of solutions for robust science.
Web Page:
RobustScience
Augmenting Hatchet to support scalability and replicability solutions for HPC applications
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Aug 1, 2020 – Jul 31, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project uses Hatchet and its features (e.g., its query language) to study scalability and replicability problems at large scale in applications of interest to LLNL, and develops tooling to support the analysis of such problems and identify their sources.
Collaborative Research: EAGER: Advancing Reproducibility in Multi-Messenger Astrophysics
Source of Support:
National Science Foundation (NSF)
Project Period:
Aug 1, 2020 – Jul 31, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
The project provides the astrophysics community with a transformative building block toward a roadmap for reproducible open science. Findings about the reproducibility process of the EHT and NICER results are captured and disseminated through documentation, data products, and methods used.
Leverage Containerized Environments for Reproducibility and Traceability of Scientific Workflows - the case study of Analytics for Neural Network Workflows
Source of Support:
Sandia National Laboratories
Project Period:
Jul 15, 2020 – Jul 14, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project builds a prototype of a containerized environment that encapsulates each component of a scientific workflow (i.e., data and applications) in an individual container environment, enabling transparent and automatic metadata collection and access, an easy-to-read record trail, and tight connections between data and metadata.
Study Performance Portability of the Vector Particle-In-Cell Project (VPIC) across architectures
Source of Support:
Los Alamos National Laboratory
Project Period:
May 1, 2020 – Apr 30, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
This project studies aspects of performance portability associated with the Vector Particle-In-Cell (VPIC) code across platforms by addressing questions such as “Is the execution of VPIC sensitive to new architectures on which it runs? How do we continue to extract as much performance as possible despite differences in hardware? What performance is lost when using a performance portability framework?”
Flux Scheduler Specializations: Improving Workflow Performance with Scheduler Structure and Policy Tuning
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Apr 1, 2020 – Mar 31, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project studies how features of a user-level, highly configurable scheduler like Flux can best be leveraged to maximize workflow performance. The project aims to answer this question by developing a model that tunes scheduler settings to maximize workflow performance even under conditions of system stress such as scheduler fragmentation and resource drains.
EAGER: Reproducibility in Computational and Data-Enabled Science-Paradigms, Practices, and Infrastructure
Source of Support:
National Science Foundation (NSF)
Project Period:
Aug 16, 2019 – Aug 15, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
This project seeks to improve understanding of how the scientific community can adapt to the increasing use of computing and large-scale data resources. One challenge is ensuring that computational results, such as those from simulations, are "reproducible"; that is, the same results are obtained when one re-uses the same input data, methods, software, and analysis conditions. In 2019, the National Academies of Sciences, Engineering, and Medicine (NASEM) issued a report on "Reproducibility and Replicability in Science" with a series of recommendations. The project will assess the implications of these recommendations on the scientific discovery process for computationally and data-enabled research.
JDRD: Empowering Training and Validation Stages in AI-Orchestrated Workflows
Source of Support:
Science Alliance - University of Tennessee, Knoxville
Project Period:
Oct 1, 2019 – Sep 30, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
This project studies AI-orchestrated workflows, that is, workflows that include experimental, computational, and data manipulation steps in one or multiple domains and in which one or more neural networks (NN) used for searching or decision making are an important component. The project aims to transform the process of training the NN in AI-orchestrated workflows, moving from training on simulated data (clean, non-adversarial data) to deployment on real data (noisy, adversarial data) through the integration of mitigating methods.
Study of Data-intensive Workflows on Next-generation Systems with Emphasis on Memory Access
Source of Support:
Sandia National Laboratories
Project Period:
Aug 1, 2019 – Jul 31, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
The project designs and implements a C++ suite of data-intensive mini-applications to study data management costs with emphasis on memory access times and use, power consumption, and replicability.
Moving towards self-adjusting scheduling policies for high performance workflows with Flux’s fully hierarchical scheduling
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Feb 8, 2019 – Jan 31, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
The project tackles scheduler specializations by systematically studying fully hierarchical scheduling models with Flux and defining models supporting a given workflow to employ the best scheduler specialization strategy at runtime.
Driving Next-Generation Schedulers with Machine Learning-Based Application Patterns
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Aug 1, 2018 – Jul 31, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
This project develops methods to identify and understand irregular HPC job patterns and integrates knowledge of these irregular HPC patterns into multi-objective schedulers. The work leverages results of a previous award from Lawrence Livermore National Laboratory.
Collaborative: EAGER: Exploring and Advancing the State of the Art in Robust Science in Gravitational Wave Physics
Source of Support:
National Science Foundation (NSF): Advanced Cyberinfrastructure (OAC) #1823372
Project Period:
May 31, 2018 – Apr 30, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
The project develops and uses a survey to collect information about LIGO workflows that are composed of a series of experimental, computational, and data manipulation steps.
Building a “miniature” version of ORNL’s Summit supercomputer for Computational Science Research at UTK
Source of Support:
2019 IBM Global University Program Shared University Research Award
Project Period:
Jun 21, 2019
Location of Project:
University of Tennessee, Knoxville
Description:
The award enabled the purchase of a supercomputer for computational science applications at UTK.