Research Projects

Projects

SHF: Small: Methods, Workflows, and Data Commons for Reducing Training Costs in Neural Architecture Search on High-Performance Computing Platforms
Source of Support:
National Science Foundation (NSF)
Project Period:
Oct 01, 2022 - Sep 30, 2025
Award Number:
2223704
Location of Project:
University of Tennessee, Knoxville
Description:
This project addresses the urgent need to reduce the high-performance computing resources used to train neural networks while ensuring that the resulting networks are explainable, reproducible, and near-optimal.
Web Page:
A4NN
OAC: Piloting the National Science Data Fabric: A Platform Agnostic Testbed for Democratizing Data Delivery
Source of Support:
National Science Foundation (NSF)
Project Period:
Oct 01, 2021 - Sep 30, 2024
Award Number:
2138811
Location of Project:
University of Tennessee, Knoxville
Description:
This project aims to build a National Science Data Fabric (NSDF), a testbed experimenting with critical technology needed to democratize data-driven sciences by constructing a CI platform designed for equitable access. NSDF connects storage, compute, and networking components with a software stack that empowers end-users with scalable tools that are easy to use, integrate and scale. Community-driven education and outreach will guarantee equitable access to all resources and engage an open network of universities, including minority-serving institutions in a federated data fabric configurable for individual and shared scientific use.
Web Page:
NSDF
SENSORY: Software Ecosystem for Knowledge Discovery - a Data-Driven Framework for Soil Moisture Applications
EAGER: A Comprehensive Approach for Generating, Sharing, Searching, and Using High-Resolution Terrain Parameters
Source of Support:
National Science Foundation (NSF)
Project Period:
Jun 01, 2021 - May 31, 2024
Nov 01, 2023 - Sep 30, 2025
Award Number:
2103845, 2334945
Location of Project:
University of Tennessee, Knoxville
Description:
This project connects multi-disciplinary advances across the scientific community (such as generating datasets at scale and supporting cloud-based cyber-infrastructures) to develop a data-driven software ecosystem for analyzing, visualizing, and extracting knowledge from the growing data collections (from fine-grained, in situ soil sensor information to coarse-grained, global satellite measurements) and releasing this knowledge to applications in environmental sciences.
Web Page:
SOMOSPIE
ANACIN-X: Analysis and Modeling of Non-determinism and Associated Costs in eXtreme Scale Applications
Source of Support:
National Science Foundation (NSF): CCF
Project Period:
Aug 1, 2019 - Jul 31, 2022
Award Number:
1900888
Location of Project:
University of Tennessee, Knoxville
Description:
This project advances the reproducibility study of HPC applications by proposing an open-source modular framework for automatic measurement, analysis, and visualization of non-determinism and root causes of non-determinism in MPI applications.
Web Page:
Anacin-x
Analytics for Molecular Dynamics (A4MD)
Source of Support:
National Science Foundation (NSF): IIS and Advanced Cyberinfrastructure (OAC)
Project Period:
Jun 1, 2018 – Sep 30, 2022
Award Number:
1841758
Location of Project:
University of Tennessee, Knoxville
Description:
This interdisciplinary project tackles the challenge of analyzing data from molecular dynamics simulations on next-generation supercomputers. Specifically, this effort combines machine learning and data analytics approaches, workflow management methods, and high-performance computing techniques to analyze molecular dynamics data as it is generated.
Web Page:
Analytics4MD
Leveraging Kokkos Abstractions to Automate Checkpointing
Source of Support:
Argonne National Laboratory (ANL)
Project Period:
May 01, 2021 - Apr 30, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project focuses on what patterns the memory abstractions used by Kokkos form and how they can be efficiently captured and persisted with VELOC.
CIF21 DIBBs: PD: Cyberinfrastructure Tools for Precision Agriculture in the 21st Century
Source of Support:
National Science Foundation (NSF): Advanced Cyberinfrastructure (OAC)
Project Period:
Jul 1, 2017 – Oct 2018
Award Number:
1724843
Location of Project:
University of Tennessee, Knoxville
Description:
This interdisciplinary project applies computer science approaches and computational resources to large multidimensional environmental datasets, and synthesizes this information into ecoinformatics, a branch of informatics that analyzes ecological and environmental science variables such as information on landscapes, soils, climate, organisms, and ecosystems.
Web Page:
SOMOSPIE
Collaborative Research: PPoSS: Planning: Performance Scalability, Trust, and Reproducibility: A Community Roadmap to Robust Science in High-throughput Applications
Source of Support:
National Science Foundation (NSF)
Project Period:
Oct 1, 2020 – Sep 30, 2022
Award Number:
2028923
Location of Project:
University of Tennessee, Knoxville
Description:
The project recruits a cross-disciplinary community working together in three virtual mini-workshops called virtual world cafes to define, design, implement, and use a set of solutions for robust science.
Web Page:
RobustScience
Augmenting Hatchet to support scalability and replicability solutions for HPC applications
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Aug 1, 2020 – Jul 31, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project uses Hatchet and its features (e.g., its query language) to study scalability and replicability problems at large scale in applications of interest to LLNL, and develops tooling to analyze such problems and identify their sources.
Collaborative Research: EAGER: Advancing Reproducibility in Multi-Messenger Astrophysics
Source of Support:
National Science Foundation (NSF)
Project Period:
Aug 1, 2020 – Jul 31, 2022
Award Number:
2041977
Location of Project:
University of Tennessee, Knoxville
Description:
The project provides the astrophysics community with a transformative building block to a roadmap for reproducible open science. Findings about the reproducibility process of the EHT and NICER results are captured and disseminated through documentation, data products, and methods used.
Leverage Containerized Environments for Reproducibility and Traceability of Scientific Workflows - the case study of Analytics for Neural Network Workflows
Source of Support:
Sandia National Laboratories
Project Period:
Jul 15, 2020 – Jul 14, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project builds a prototype of a containerized environment that encapsulates each component of a scientific workflow (i.e., data and applications) in an individual container environment for transparent and automatic metadata collection and access, an easy-to-read record trail, and tight connections between data and metadata.
Study Performance Portability of the Vector Particle-In-Cell Project (VPIC) across architectures
Source of Support:
Los Alamos National Laboratory
Project Period:
May 1, 2020 – Apr 30, 2023
Location of Project:
University of Tennessee, Knoxville
Description:
This project studies aspects of performance portability associated with the Vector Particle-In-Cell Project (VPIC) code across platforms by addressing questions such as “Is the execution of VPIC sensitive to new architectures on which it runs? How do we continue to extract as much performance as possible despite differences in hardware? What performance is lost when using a performance portability framework?”
Flux Scheduler Specializations: Improving Workflow Performance with Scheduler Structure and Policy Tuning
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Apr 1, 2020 – Mar 31, 2022
Location of Project:
University of Tennessee, Knoxville
Description:
This project studies how features of a user-level, highly-configurable scheduler like Flux can best be leveraged to maximize workflow performance. The project aims to answer this question through the development of a model that tunes scheduler settings to maximize workflow performance even under conditions of system stress such as scheduler fragmentation and resource drains.
EAGER: Reproducibility in Computational and Data-Enabled Science-Paradigms, Practices, and Infrastructure
Source of Support:
National Science Foundation (NSF)
Project Period:
Aug 16, 2019 – Aug 15, 2022
Award Number:
1941443
Location of Project:
University of Tennessee, Knoxville
Description:
This project seeks to improve understanding of how the scientific community can adapt to the increasing use of computing and large-scale data resources. One challenge is ensuring that computational results, such as those from simulations, are "reproducible"; that is, the same results are obtained when one re-uses the same input data, methods, software, and analysis conditions. In 2019, the National Academies of Sciences, Engineering, and Medicine (NASEM) issued a report on "Reproducibility and Replication in Science" with a series of recommendations. The project will assess the implications of these recommendations on the scientific discovery process for computationally and data-enabled research.
JDRD: Empowering Training and Validation Stages in AI-Orchestrated Workflows
Source of Support:
Science Alliance - University of Tennessee, Knoxville
Project Period:
Oct 1, 2019 – Sep 30, 2021
Location of Project:
University of Tennessee, Knoxville
Description:
This project studies AI-orchestrated workflows: workflows that include experimental, computational, and data manipulation steps in one or multiple domains, and in which one or more neural networks (NNs) used for searching or decision making are an important component. The project aims to transform the process of taking the NN in AI-orchestrated workflows from training on simulated data (clean, non-adversarial data) to deployment on real data (noisy, adversarial data) by integrating mitigation methods.
Study of Data-intensive Workflows on Next-generation Systems with Emphasis on Memory Access
Source of Support:
Sandia National Laboratories
Project Period:
Aug 1, 2019 – Jul 31, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
The project designs and implements a C++ suite of data-intensive mini-applications to study data management costs with emphasis on memory access times and use, power consumption, and replicability.
Moving towards self-adjusting scheduling policies for high performance workflows with Flux’s fully hierarchical scheduling
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Feb 8, 2019 – Jan 31, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
The project tackles scheduler specializations by systematically studying fully hierarchical scheduling models with Flux and defining models supporting a given workflow to employ the best scheduler specialization strategy at runtime.
Driving Next-Generation Schedulers with Machine Learning-Based Application Patterns
Source of Support:
Lawrence Livermore National Laboratory
Project Period:
Aug 1, 2018 – Jul 31, 2020
Location of Project:
University of Tennessee, Knoxville
Description:
This project develops methods to identify and understand irregular HPC job patterns and integrates knowledge of these irregular HPC patterns into multi-objective schedulers. The work leverages results of a previous award from Lawrence Livermore National Laboratory.
Collaborative: EAGER: Exploring and Advancing the State of the Art in Robust Science in Gravitational Wave Physics
Source of Support:
National Science Foundation (NSF): Advanced Cyberinfrastructure (OAC) #1823372
Project Period:
May 31, 2018 – Apr 30, 2020
Award Number:
1841399
Location of Project:
University of Tennessee, Knoxville
Description:
The project develops and uses a survey to collect information about LIGO workflows that are composed of a series of experimental, computational, and data manipulation steps.
Building a “Miniature” Version of ORNL’s Summit supercomputer for Computational Science Research at UTK
Source of Support:
2019 IBM Global University Program Shared University Research Award
Project Period:
Jun 21, 2019 - Jun 20, 2024
Location of Project:
University of Tennessee, Knoxville
Description:
The award enabled the purchase of a supercomputer for computational science applications at UTK.