## Journal Papers

### Gap-Free Annual Soil Moisture Global across 15km Grids: 1991-2016. *(In Review)*

**M. Taufer**, and R. Vargas.

### Flux: Overcoming Scheduling Challenges for Exascale Workflows

**M. Taufer**.

### Spatial Gap-Filling of ESA CCI Satellite-Derived Soil Moisture Based on Geostatistical Techniques and Multiple Regression

**M. Taufer**, and R. Vargas.

@article{llamas2020spatial,
  title={Spatial gap-filling of ESA CCI satellite-derived soil moisture based on geostatistical techniques and multiple regression},
  author={Llamas, Ricardo M and Guevara, Mario and Rorabaugh, Danny and Taufer, Michela and Vargas, Rodrigo},
  journal={Remote Sensing},
  volume={12},
  number={4},
  pages={665},
  year={2020},
  publisher={Multidisciplinary Digital Publishing Institute}
}

### Memory-Efficient and Skew-Tolerant MapReduce over MPI for Supercomputing Systems

**M. Taufer**.

### A Graphic Encoding Method for Quantitative Classification of Protein Structure and Representation of Conformational Changes

**M. Taufer**, and T. Estrada.

### Building a Vision for Reproducibility in the Cyberinfrastructure Ecosystem: Leveraging Community Efforts

**M. Taufer**.

### A Survey of Algorithms for Transforming Molecular Dynamics Data into Metadata for In Situ Analytics based on Machine Learning Methods

**M. Taufer**, T. Estrada, and T. Johnston.

### A Three-phase Workflow for General and Expressive Representations of Nondeterminism in HPC Applications

**M. Taufer**.

### Creating a Portable, High-Level Graph Analytics Framework for Compute and Data-Intensive Applications

**M. Taufer**, and S. Chandrasekaran.

### Spatial Gap-Filling of ESA CCI Satellite-Derived Soil Moisture based on Linear Geostatistics

**M. Taufer**, and R. Vargas.

### The Future of Scientific Workflows

**M. Taufer**, and J. Vetter.

### Record-and-Replay Techniques for HPC Systems: A Survey

**M. Taufer**.

### In-Situ Data Analytics and Indexing of Protein Trajectories

**M. Taufer**.

### Enabling Scalable and Accurate Clustering of Distributed Ligand Geometries on Supercomputers

**M. Taufer**.

### Enhancing Reproducibility for Computational Methods - Data, code and workflows should be available and cited

**M. Taufer**.

### Performance Characterization of Irregular I/O at the Extreme Scale

**M. Taufer**.

### In-Situ Data Analysis of Protein Folding Trajectories

**M. Taufer**.

### Scheduling DAG-based Workflows on Single Cloud Instances: High-performance and Cost Effectiveness with a Static Scheduler

**M. Taufer** and A. L. Rosenberg.

### Free Energetics of Carbon Nanotube Association in Aqueous Inorganic NaI Salt Solutions: Temperature Effects using All-Atom Molecular Dynamics Simulations and High-Performance Graphical Processing Unit Based Resources

**M. Taufer**, and S. Patel.

### Pursuing Resource Utilization and Coordinated Progression in GPU-enabled Molecular Simulations

**M. Taufer**.

### MEMS Accelerometers and Distributed Sensing for Rapid Earthquake Characterization

**M. Taufer**.

### On the Powerful Use of Simulations in the Quake-Catcher Network to Efficiently Position Low-cost Earthquake Sensors

**M. Taufer**, E. Cochran, and J. Lawrence.

### Enhancement of Accuracy and Efficiency for RNA Secondary Structure Prediction by Sequence Segmentation and MapReduce

**M. Taufer**.

### GPU enabled Macromolecular Simulation: Challenges and Opportunities

**M. Taufer**, N. Ganesan, and S. Patel.

### A Scalable and Accurate Method for Classifying Protein-Ligand Binding Geometries using a MapReduce Approach

**M. Taufer**.

### Hierarchical Fractional-step Approximations and Parallel Kinetic Monte Carlo Algorithms

**M. Taufer**, and L. Xu.

### Structural, Dynamic, and Electrostatic Properties of Fully Hydrated DMPC Bilayers from Molecular Dynamics Simulations Accelerated with Graphical Processing Units (GPUs)

**M. Taufer**.

### Evaluation of Several Two-Step Scoring Functions Based on Linear Interaction Energy, Effective Ligand Size, and Empirical Pair Potentials for Prediction of Protein-Ligand Binding Geometry and Free Energy

**M. Taufer**, C. L. Brooks III, R.S. Armen.

### Molecular Dynamics Simulations of Aqueous Ions at the Liquid-Vapor Interface Accelerated Using Graphics Processors

**M. Taufer**, and S. Patel.

### A 3′-Terminal Stem-loop Structure in Nodamura Virus RNA2 Forms an Essential Cis-acting Signal for RNA Replication

**M. Taufer**, and K. L. Johnson.

### Performance Prediction and Analysis of BOINC Projects: An Empirical Study with EmBOINC

**M. Taufer**, and D. Anderson.

### Computational Multi-Scale Modeling in Protein-Ligand Docking

**M. Taufer**, R.S. Armen, J. Chen, P.J. Teller, and C.L. Brooks III.

### PseudoBase++: An Extension of PseudoBase for Easy Searching, Formatting, and Visualization of Pseudoknots

**M. Taufer**, A. Licon, R. Araiza, D. Mireles, A. Gultyaev, F.H.D. Van Batenburg, M-Y Leung.

### RNAVLab: A Virtual Laboratory for Studying RNA Secondary Structures based on Grid Computing Technology. Journal of Parallel Computing

**M. Taufer**, M-Y. Leung, T. Solorio, A. Licon, D. Mireles, R. Araiza, K.L. Johnson.

### A Distributed Evolutionary Method to Design Scheduling Policies for Volunteer Computing

**M. Taufer**.

### Integrate GridFTP into Firefox - Build grid protocols into Mozilla-based tools

**M. Taufer**, B. Stearn, R. Zamudio, D. Catarino.

### Predictor@Home: A Protein Structure Prediction Supercomputer Based on Global Computing

**M. Taufer**, C. An, A. Kerstens, and C.L. Brooks III.

### Study of an Accurate and Fast Protein-Ligand Docking Algorithm based on Molecular Dynamics

**M. Taufer**, M. Crowley, D. Price, A.A. Chien, and C.L. Brooks III.

### DGMonitor: a Performance Monitoring Tool for Sandbox-based Desktop Grid Platforms

**M. Taufer**, and A.A. Chien.

### The Computational Chemistry Prototyping Environment

**M. Taufer**.

## Book Chapters

### Data Movement in Data-Intensive High Performance Computing

**M. Taufer**, J. H. Rogers, H. Abbasi, J. Hill, and L. Carrington.

### Scheduling on Large Scale Volatile Desktop Grids, from Greedy and Naive to Intelligent and Adaptive Policies

**M. Taufer**.

### Protein Docking

**M. Taufer**.

### Predictor@Home: A Protein Structure Prediction Supercomputer Based on Volunteer Computing

**M. Taufer** and C.L. Brooks III.

## Research Papers in Refereed Conferences, Symposiums, and Workshops

### A Novel Metric to Evaluate In Situ Workflows

**M. Taufer**, and E. Deelman.

### CanarIO: Sounding the Alarm on IO-Related Performance Degradation

M.R. Wyatt II, S. Herbein, K. Shoga, T. Gamblin, and **M. Taufer**. *(Acceptance Rate: 45/110, 24.7%)*

### Characterization of In Situ and In Transit Analytics of Molecular Dynamics Simulations for Next-generation Supercomputers

**M. Taufer**.

### SOMOSPIE: A Modular SOil MOisture SPatial Inference Engine based on Data Driven Decisions

**M. Taufer**.

### Characterization of Power Usage and Performance in Data-Intensive Applications using MapReduce over MPI

**M. Taufer**.

### Initial Thoughts on Cybersecurity and Reproducibility

**M. Taufer**.

### Applicability study of the PRIMAD model to LIGO gravitational wave search workflows

**M. Taufer**.

### On the Power of Combiner Optimizations in MapReduce over MPI Workflows

**M. Taufer**.

### Flux: Overcoming Scheduling Challenges for Exascale Workflows

**M. Taufer**.

### Graphic Encoding of Proteins for Efficient High-Throughput Analysis

**M. Taufer**.

### Leveraging Neural Networks for Resource Prediction and IO-Aware Scheduling

**M. Taufer**.

### KeyBin2: Distributed Clustering for Scalable and In-Situ Analysis

**M. Taufer**, and T. Estrada.

### Leveraging In Situ Data Analysis to Enable Computational Steering of Brain’s Neocortex Simulations with GENESIS

**M. Taufer**.

### Bloomfish: A Highly Scalable Distributed K-mer Counting Framework

**M. Taufer**.

### Mimir: Memory-Efficient and Scalable MapReduce for Large Supercomputing Systems

**M. Taufer**.

### HYPPO: A Hybrid, Piecewise Polynomial Modeling Technique for Non-Smooth Surfaces

**M. Taufer**.

### Scheduling Matters: Area-oriented Heuristics for Resource Management

**M. Taufer**.

### Development of a Scalable Method for Creating Food Groups Using the NHANES Dataset and MapReduce

**M. Taufer**.

### Machine Learning Predictions of Runtime and IO Traffic on High-end Clusters

**M. Taufer**.

### Scalable I/O-Aware Job Scheduling for Burst Buffer Enabled HPC Clusters

**M. Taufer**.

### Resource Management for Running HPC Applications in Container Clouds

**M. Taufer**.

### A Genetic Programming Approach to Design Resource Allocation Policies for Heterogeneous Workflows in the Cloud

**M. Taufer**.

### Bandwidth Modeling in Large Distributed Systems for Big Data Applications

**M. Taufer**.

### Using Surrogate-based Modeling to Predict Optimal I/O Parameters of Applications at the Extreme Scale

**M. Taufer**.

### Applying Frequency Analysis Techniques to DAG-based Workflows to Benchmark and Predict Resource Behavior on Non-Dedicated Clusters

**M. Taufer**.

### Benchmarking the Performance of Scientific Applications with Irregular I/O at the Extreme Scale

**M. Taufer**.

### Study the Network Impact on Earthquake Early Warning in the Quake-Catcher Network Project

**M. Taufer**.

### Enabling In-situ Data Analysis for Large Protein Folding Trajectory Datasets

**M. Taufer**.

### Performance Dissection of a MD Code across CUDA and GPU Generations

**M. Taufer**.

### Secondary Structure Predictions for Long RNA Sequences based on Inversion Excursions and MapReduce

**M. Taufer**, and M.-Y. Leung.

### Performance Impact of Dynamic Parallelism on Different Clustering Algorithms and the New GPU Architecture

**M. Taufer**.

### ExSciTecH: Expanding Volunteer Computing to Explore Science, Technology, and Health

**M. Taufer**.

### A modularized MapReduce framework to support RNA secondary structure prediction and analysis workflows

**M. Taufer**.

### Reengineering high-throughput molecular datasets for scalable clustering using MapReduce

**M. Taufer**.

### Dealing with performance/portability and performance/accuracy trade-offs in heterogeneous computing systems: A case study with matrix multiplication modulo primes

**M. Taufer**.

### On the Powerful Use of Simulations in the Quake-Catcher Network to Efficiently Position Low-cost Earthquake Sensors

**M. Taufer**, E. Cochran, and J. Lawrence.

### Simulating Application Resilience at Exascale

**M. Taufer** and A. Rodrigues.

### Providing Application-Level QoS in Volunteer Computing

**M. Taufer**.

### FEN ZI: GPU Enabled Molecular Dynamics Simulation of Large Membrane Regions Based on CHARMM Force Field and PME

**M. Taufer**.

### Rolling Partial Prefix-Sums to Speedup Uniform and Affine Recurrence Equations

**M. Taufer**.

### Automatic Selection of Near-Native Protein-Ligand Conformations using a Hierarchical Clustering and Volunteer Computing

**M. Taufer**.

### Accelerating HMMER on GPUs by Implementing Hybrid Data and Task Parallelism

**M. Taufer**.

### Parallelization of Tau-Leap Coarse-Grained Monte Carlo Simulations on GPUs

**M. Taufer**, S. Collins, and D. G. Vlachos.

### Improving Numerical Reproducibility and Stability in Large-Scale Numerical Simulations on GPUs

**M. Taufer**, O. Padron, P. Saponaro, S. Patel.

### A Dynamic Programming Algorithm for Finding the Optimal Segmentation of an RNA Secondary Structure Prediction

**M. Taufer**, M.-Y. Leung, and K.L. Johnson.

### MNEOMIC: Network Environment for Measurement and Observation for Network Interaction and Control

**M. Taufer**.

### Applying Organizational Self-Design to a Real-world Volunteer Computing System

**M. Taufer**, and K. S. Decker.

### Balancing Scientist Needs and Volunteer Preferences in Volunteer Computing using Constraint Optimization

**M. Taufer**.

### EmBOINC: An Emulator for Performance Analysis of BOINC Projects

**M. Taufer**.

### Modeling Job Lifespan Delays in Volunteer Computing Projects

**M. Taufer**.

### Towards Large-Scale Molecular Dynamics Simulations on Graphics Processors

**M. Taufer**.

### A Distributed Evolutionary Method to Design Scheduling Policies for Volunteer Computing

**M. Taufer**.

### On the Effectiveness of Rebuilding RNA Secondary Structures from Sequence Chunks

**M. Taufer**, T. Solorio, A. Licon, D. Mireles, and M.-Y. Leung.

### Evaluation of IEEE 754 Floating-Point Arithmetic Compliance Across a Wide Range of Heterogeneous Computers

**M. Taufer**, and P.J. Teller.

### Towards Optimal Scheduling for Global Computing Under Probabilistic, Interval, and Fuzzy Uncertainty, with Potential Applications to Bioinformatics

**M. Taufer**, and M.-Y. Leung.

### SimBA: a Discrete Event Simulator for Performance Prediction of Volunteer Computing Projects

**M. Taufer**, A. Kerstens, T. Estrada, D.A. Flores, and P.J. Teller.

### RNAVLab: A unified environment for computational RNA structure analysis based on grid computing technology

**M. Taufer**, M.-Y. Leung, K. L. Johnson, A. Licon.

### Moving Volunteer Computing towards Knowledge-Constructed, Dynamically-Adaptive Modeling and Scheduling

**M. Taufer**, A. Kerstens, T. Estrada, D.A. Flores, R. Zamudio, P.J. Teller, R. Armen, and C.L. Brooks III.

### Topaz: Extending Firefox to Accommodate the GridFTP Protocol

**M. Taufer**, K. Bhatia, and B. Stearn.

### The Effectiveness of Threshold-based Scheduling Policies in BOINC Projects

**M. Taufer**, P. Teller, A. Kerstens, and D. Anderson.

### CompPknots: a Framework for Parallel Prediction and Comparison of RNA Secondary Structures with Pseudoknots

**M. Taufer**.

### Extending Grid Protocols onto the Desktop using the Mozilla Framework

**M. Taufer**, R. Zamudio, and D. Catarino.

### A Systematic Multi-step Methodology for Performance Analysis of Communication Traces of Distributed Applications based on Hierarchical Clustering

**M. Taufer**, and F. Wolf.

### Metrics for Effective Resource Management in Global Computing Environments

**M. Taufer**, P.J. Teller, D.P. Anderson, and C.L. Brooks III.

### Predictor@Home: A "Protein Prediction Supercomputer" Based on Public-Resource Computing

**M. Taufer**, C. An, A. Kerstens, and C.L. Brooks III.

### Homogeneous Technique to Ensure Integrity of Molecular Simulation Results Using Public Resources

**M. Taufer**, D.P. Anderson, P. Cicotti, and C.L. Brooks III.

### Study of an Accurate and Fast Protein-Ligand Docking Algorithm based on Molecular Dynamics

**M. Taufer**, M. Crowley, D. Price, A.A. Chien, and C.L. Brooks III.

### Characterizing and Evaluating Desktop Grids: An Empirical Study

**M. Taufer**, C.L. Brooks III, H. Casanova, and A.A. Chien.

### DGMonitor: a Performance Monitoring Tool for Sandbox-based Desktop Grid Platforms

**M. Taufer**, and A.A. Chien.

### A Performance Monitor based on Virtual Global Time for Clusters of PCs

**M. Taufer**, and T. Stricker.

### Combining Task- and Data Parallelism to Speed up Protein Folding on a Desktop Grid Platform - Is efficient protein folding possible with CHARMM on the United Devices MetaProcessor?

**M. Taufer**, T. Stricker, G. Settanni, A. Cavalli, and A. Caflisch.

### Implementation and Characterization of Protein Folding on a Desktop Computational Grid - Is CHARMM a suitable candidate for the United Devices MetaProcessor?

**M. Taufer**, T. Stricker, G. Settanni, and A. Cavalli.

### Scalability and Resource Usage of an OLAP Benchmark on Clusters of PCs

**M. Taufer**, T. Stricker, and R. Weber.

### On the Migration of the Scientific Code DYANA from SMPs to Clusters of PCs and on to the Grid

**M. Taufer**, T. Stricker, G. Roos, P. Guentert.

### Performance Characterization of a Molecular Dynamics Code on PC Clusters - Is there any easy parallelism in CHARMM?

**M. Taufer**, E. Perathoner, A. Cavalli, A. Caflisch, and T. Stricker.

### Accurate Performance Evaluation, Modeling and Prediction of a Message Passing Simulation Code based on Middleware

**M. Taufer**, and T. Stricker.

### Molecular Dynamics Simulations on Cray Clusters using the Sciddle-PVM Environment

**M. Taufer**, and U. von Matt.

## Educational Papers

### Collaborative Research Tools for Students, Staff, and Faculty

**M. Taufer**, P.J. Teller, A. Kerstens, R. Romero.

## Posters

### Modeling Record-and-Replay for Nondeterministic Applications on Exascale Systems

**M. Taufer**.

### Large-Scale Soil Moisture Modeling Based on Linear Geostatistics and Remotely Sensed Data

**M. Taufer**, and R. Vargas.

### Creating a Portable, High-Level Graph Analytics Paradigm For Compute and Data-Intensive Applications

**M. Taufer** and S. Chandrasekaran.

### Data Analytics for Modeling Soil Moisture Patterns across United States Ecoclimatic Domains

**M. Taufer**.

### Fully Hierarchical Scheduling: Paving the Way to Exascale Workloads

**M. Taufer**.

### Network Quality of Service in Docker Containers

**M. Taufer**.

### A Two-Tiered Approach to I/O Quality of Service in Linux

**M. Taufer**.

### Resource Management Layers for Dynamic CPU Resource Allocation in Containerized Cloud Environments

**M. Taufer**.

### Predictions of Large-scale QMCPack I/Os on Titan using Skel

**M. Taufer**.

### On the Cost of a General GPU Framework - The Strange Case of CUDA 4.0 vs. CUDA 5.0

**M. Taufer**.

### Secondary Structure Predictions for Long RNA Sequences Based on Inversion Excursions

**M. Taufer** and M.-Y. Leung.

### Benchmarking Gender Differences in Voluntary Computer Projects

**M. Taufer**.

### Study of Protein-ligand Binding Geometries using a Scalable and Accurate Octree-based Algorithm in MapReduce

**M. Taufer**.

### Simulations of Large Membrane Regions using GPU-enabled Computations - Preliminary Results

### Parallelization of Tau-Leaping Coarse-Grained Monte Carlo Method for Efficient and Accurate Simulations on GPUs

**M. Taufer**, and D.G. Vlachos.

### Improving Reproducibility and Stability of Numerically Intensive Applications on Graphics Processing Units

**M. Taufer**, P. Saponaro, and O. Padron.

### Docking@Home: Searching for New Drugs using Volunteer’s Computers

**M. Taufer**.

### Role of RNA secondary structure in replication of Nodamura virus RNA2

### Performance Analysis of Volunteer Computing Traces

**M. Taufer**, and K. Reed.

### SimBA: a Discrete Event Simulator for Performance Prediction of Volunteer Computing Projects

**M. Taufer**, P. Teller, and A. Kerstens.

### Predictor@home: A Multiscale, Distributed Approach for Protein Structure Prediction

**M. Taufer**, and C.L. Brooks III.

## Technical Reports

### Parallelization of Tau-Leap Coarse-Grained Monte Carlo Simulations on GPUs

### Improving Numerical Reproducibility and Stability in Large-Scale Numerical Simulations on GPUs

### Topaz: a Firefox Protocol Extension for GridFTP based on Data Flow Diagrams

**M. Taufer**, B. Stearn, and K. Bhatia.

### Characterizing and Evaluating Desktop Grids: An Empirical Study

**M. Taufer**, J. Karanicolas, C. L. Brooks III, H. Casanova, and A. Chien.

### DGMonitor: a Performance Monitoring Tool for Sandbox-based Desktop Grid Platforms

**M. Taufer**, and A. Chien.

### Combining Task- and Data Parallelism to Speed up Protein Folding on a Desktop Grid Platform - Is efficient protein folding possible with CHARMM on the United Devices MetaProcessor?

**M. Taufer**, T. Stricker, G. Settanni, A. Cavalli, and A. Caflisch.

### Implementation and Characterization of Protein Folding on a Desktop Computational Grid - Is CHARMM a suitable candidate for the United Devices MetaProcessor?

**M. Taufer**, T. Stricker, G. Settanni, and A. Cavalli.

### Inverting Middleware Framework: Framework for Performance Analysis of Distributed OLAP Benchmarks on Clusters of PCs by Filtering and Abstracting Low Level Resource Usage

**M. Taufer**, T. Stricker, and R. Weber.

### Performance Characterization and Modeling of the Molecular Simulation Code OPAL

**M. Taufer**, and U. von Matt.

## Thesis

### Inverting Middleware: Performance Analysis of Layered Application Codes in High Performance Distributed Computing

**M. Taufer**.

### Development of the Parallelization of the Software Package OPAL for the Simulation of Dynamic Molecules on Supercomputers

**M. Taufer**.