SOMOSPIE: A modular SOil MOisture SPatial Inference Engine based on data-driven decisions

Funded by the National Science Foundation (NSF) under grant numbers 1724843, 2103845, 2103836, 2138811, and 2334945

About SOMOSPIE

SOMOSPIE is a modular SOil MOisture SPatial Inference Engine that allows Earth scientists to address the coarse-grained resolution and spatial information gaps associated with satellite data. SOMOSPIE consists of the following modular components:
Input of available satellite data at its native spatial resolution.
Selection of a geographic region of interest.
Prediction of missing values across the entire region of interest (i.e., gap-filling) and at finer-grained resolution.
Analysis and visualization of generated predictions.
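
The prediction step listed above can be illustrated with a minimal Python sketch. It is not SOMOSPIE's own model: inverse-distance weighting is swapped in here only because it is short, and the function name `idw_gap_fill`, the `power` parameter, and the toy grid are all hypothetical. The sketch assumes a small 2D grid in which missing satellite values are stored as NaN.

```python
import numpy as np


def idw_gap_fill(grid, power=2.0):
    """Fill NaN cells with an inverse-distance-weighted mean of observed cells.

    Illustrative stand-in for a gap-filling model; not SOMOSPIE's method.
    """
    grid = np.asarray(grid, dtype=float)
    filled = grid.copy()
    known = ~np.isnan(grid)
    if not known.any():
        raise ValueError("grid has no observed cells to interpolate from")

    # Row/column coordinates and values of observed cells (same row-major order).
    known_rc = np.argwhere(known).astype(float)
    known_vals = grid[known]

    for r, c in np.argwhere(~known):
        # Euclidean distance from the missing cell to every observed cell.
        dist = np.hypot(known_rc[:, 0] - r, known_rc[:, 1] - c)
        weights = 1.0 / dist ** power
        filled[r, c] = np.sum(weights * known_vals) / np.sum(weights)
    return filled


if __name__ == "__main__":
    # Hypothetical 2x2 soil moisture grid with one gap.
    coarse = np.array([[0.10, np.nan],
                       [0.30, 0.20]])
    print(idw_gap_fill(coarse))
```

Observed cells are left unchanged and only the gaps are predicted. A real workflow would replace the weighting scheme with a trained model and use terrain or other covariates to predict at finer resolution.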

SOMOSPIE and FAIR

SOMOSPIE supports reproducibility, explainability, and portability of results. Its new features allow users to:
Deploy container technology on cloud platforms to perform rapid data movement and achieve portability.
Collect record trails of workflow executions to enable data traceability and results explainability.

Run the SOMOSPIE Software

SOMOSPIE can be obtained from the following sources:
  1. GitHub: https://github.com/TauferLab/SOMOSPIE
  2. Docker: https://hub.docker.com/r/globalcomputinglab/somospie
SOMOSPIE can be installed on different operating systems.
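
As a rough sketch of how the two sources above might be fetched, the Python helper below only assembles the standard `git clone` and `docker pull` commands and prints them. The helper name `fetch_commands` and the target directory are hypothetical, and the sketch assumes the Docker image is published under a default tag; check the repository's own instructions for the supported procedure.

```python
import shlex

# URLs taken from the list above.
GITHUB_URL = "https://github.com/TauferLab/SOMOSPIE"
DOCKER_IMAGE = "globalcomputinglab/somospie"


def fetch_commands(target_dir="SOMOSPIE"):
    """Return the commands for cloning the source and pulling the image.

    Assumption: no image tag is given, so Docker falls back to its default tag.
    """
    return [
        ["git", "clone", GITHUB_URL, target_dir],
        ["docker", "pull", DOCKER_IMAGE],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is downloaded here.
    for cmd in fetch_commands():
        print(shlex.join(cmd))
```

Each printed line can be pasted into a terminal, or each list can be passed to `subprocess.run(cmd, check=True)` to execute it.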

Publications

Gabriel Laboy, Ian Lumsden, Jack Marquez, Kin Wai NG Lugo, Rodrigo Vargas, and Michela Taufer. A Modular, Cross-Platform Toolkit for High-Resolution Terrain Parameter Analysis. In Proceedings of the 21st IEEE International Conference on eScience (eScience), Chicago, IL, USA, September 2025. IEEE Computer Society. (Acceptance Rate: 33/98, 33.6%).
Gabriel Laboy, Paula Olaya, Jack Marquez, Michael Sutherlin, Rodrigo Vargas, and Michela Taufer. Advancing the GEOtiled Framework Through Scalable Terrain Parameter Computation. In Proceedings of the 34th International Symposium on High-Performance Parallel and Distributed Computing (HPDC), pages 1–2, Notre Dame, IN, USA, July 20–23 2025. ACM. (Short Paper).
Befikir Bogale, Ian Lumsden, Dalal Sukkari, Dewi Yokelson, Stephanie Brink, Olga Pearce, and Michela Taufer. Surrogate Models for Analyzing Performance Behavior of HPC Applications Using the RAJA Performance Suite. In Proceedings of the International Conference on Computational Science (ICCS), pages 327–335, Singapore, July 7–9 2025. Springer. [link]

@InProceedings{10.1007/978-3-031-97635-3_39,
author="Bogale, Befikir and Lumsden, Ian and Sukkari, Dalal and Yokelson, Dewi and Brink, Stephanie and Pearce, Olga and Taufer, Michela",
editor="Lees, Michael H. and Cai, Wentong and Cheong, Siew Ann and Su, Yi and Abramson, David and Dongarra, Jack J. and Sloot, Peter M. A.",
title="Surrogate Models for Analyzing Performance Behavior of HPC Applications Using the RAJA Performance Suite",
booktitle="Computational Science -- ICCS 2025",
year="2025",
publisher="Springer Nature Switzerland",
address="Cham",
pages="327--335",
isbn="978-3-031-97635-3"
}

Paula Olaya, Sophia Wen, Jay Lofstead, and Michela Taufer. PerSSD: Persistent, Shared, and Scalable Data with Node-Local Storage for Scientific Workflows in Cloud Infrastructure. In Proceedings of the 2024 IEEE International Conference on Big Data, Washington DC, US, December 2024. IEEE Computer Society. (Acceptance Rate: 600/124, 18.8%). [link]

@INPROCEEDINGS{10826021,
author={Olaya, Paula and Wen, Sophia and Lofstead, Jay and Taufer, Michela},
booktitle={2024 IEEE International Conference on Big Data (BigData)},
title={PerSSD: Persistent, Shared, and Scalable Data with Node-Local Storage for Scientific Workflows in Cloud Infrastructure},
year={2024},
volume={},
number={},
pages={272-281},
keywords={Automation;Software architecture;File systems;Pipelines;Information sharing;Geoscience;Manuals;Performance gain;Reproducibility of results;Noise measurement},
doi={10.1109/BigData62323.2024.10826021}}

Michela Taufer, Heberth Martinez, Aashish Panta, Paula Olaya, Jack Marquez, Amy Gooch, Giorgio Scorzelli, and Valerio Pascucci. Leveraging National Science Data Fabric Services to Train Data Scientists. In Proceedings of the 2024 Workshop on Education for High-Performance Computing (EduHPC), Workshops of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC24), Atlanta, GA, USA, November 2024. IEEE Computer Society. [link]

@INPROCEEDINGS{10820725,
author={Taufer, Michela and Martinez, Heberth and Panta, Aashish and Olaya, Paula and Marquez, Jack and Gooch, Amy and Scorzelli, Giorgio and Pascucci, Valerio},
booktitle={SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis},
title={Leveraging National Science Data Fabric Services to Train Data Scientists},
year={2024},
volume={},
number={},
pages={355-362},
keywords={Surveys;Data analysis;Pain;Visual analytics;High performance computing;Data visualization;Memory;Tutorials;Geoscience;Fabrics;Data analysis;data visualization;workforce development},
doi={10.1109/SCW63240.2024.00053}}

Michela Taufer, Daniel Milroy, Todd Gamblin, Andrew Jones, Bill Magro, Heidi Poxon, and Seetharami Seelam. HPC and Cloud Convergence Beyond Technical Boundaries: Strategies for Economic Sustainability, Standardization, and Data Accessibility. IEEE Computer, 2024. [link]

@ARTICLE{10547086,
author={Taufer, Michela and Milroy, Daniel and Gamblin, Todd and Jones, Andrew and Magro, Bill and Poxon, Heidi and Seelam, Seetharami},
journal={Computer},
title={HPC and Cloud Convergence Beyond Technical Boundaries: Strategies for Economic Sustainability, Standardization, and Data Accessibility},
year={2024},
volume={57},
number={6},
pages={128-136},
keywords={},
doi={10.1109/MC.2024.3387013}}

Camila Roa, Mats Rynge, Paula Olaya, Karan Vahi, Todd Miller, John Goodhue, James Griffioen, David Hudak, Shelley Knuth, Ricardo Llamas, Rodrigo Vargas, Miron Livny, Ewa Deelman, and Michela Taufer. End-to-end Integration of Scientific Workflows on Distributed Cyberinfrastructures: Challenges and Lessons Learned with an Earth Science Application. In Proceedings of the 16th IEEE/ACM International Conference on Utility and Cloud Computing (UCC), pages 1–10, Taormina (Messina), Italy, December 2023. IEEE Computer Society. (Acceptance Rate: 20/50, 40%). [link]

@inproceedings{10.1145/3603166.3632142,
author = {Roa, Camila and Rynge, Mats and Olaya, Paula and Vahi, Karan and Miller, Todd and Griffioen, James and Knuth, Shelley and Goodhue, John and Hudak, David and Romanella, Alana and Llamas, Ricardo and Vargas, Rodrigo and Livny, Miron and Deelman, Ewa and Taufer, Michela},
title = {End-to-end Integration of Scientific Workflows on Distributed Cyberinfrastructures: Challenges and Lessons Learned with an Earth Science Application},
year = {2024},
isbn = {9798400702341},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3603166.3632142},
doi = {10.1145/3603166.3632142},
booktitle = {Proceedings of the IEEE/ACM 16th International Conference on Utility and Cloud Computing},
articleno = {17},
numpages = {9},
keywords = {workflows, machine learning, soil moisture, high throughput computing, containers},
location = {Taormina (Messina), Italy},
series = {UCC '23}
}

Camila Roa. GEOtiled: A Scalable Workflow for Generating Large Datasets of High-Resolution Terrain Parameters. Poster presented at the 32nd International ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC), pages 1–2, Orlando, Florida, USA, Jun 2023. [link]

@article{https://doi.org/10.1145/3588195.3595941,
author = {Roa, Camila and Olaya, Paula and Llamas, Ricardo and Vargas, Rodrigo and Taufer, Michela },
title = {{GEOtiled: A Scalable Workflow for Generating Large Datasets of High-Resolution Terrain Parameters}},
journal = {Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing},
pages = {311--312},
year = {2023}
}

Olaya, Paula and Kennedy, Dominic and Llamas, Ricardo and Valera, Leobardo and Vargas, Rodrigo and Lofstead, Jay and Taufer, Michela. Building Trust in Earth Science Findings through Data Traceability and Results Explainability. IEEE Transactions on Parallel and Distributed Systems 34(2) (2023). [link]

@article{9942337,
author={Olaya, Paula and Kennedy, Dominic and Llamas, Ricardo and Valera, Leobardo and Vargas, Rodrigo and Lofstead, Jay and Taufer, Michela},
journal={IEEE Transactions on Parallel and Distributed Systems},
title={Building Trust in Earth Science Findings through Data Traceability and Results Explainability},
year={2023},
volume={34},
number={2},
pages={704-717},
doi={10.1109/TPDS.2022.3220539}
}

Olaya, Paula and Luettgau, Jakob and Roa, Camila and Llamas, Ricardo and Vargas, Rodrigo and Wen, Sophia and Chung, I-Hsin and Seelam, Seetharami and Park, Yoonho and Lofstead, Jay and Taufer, Michela. Enabling Scalability in the Cloud for Scientific Workflows: An Earth Science Use Case. IEEE International Conference on Cloud Computing (2023). [link]

@INPROCEEDINGS{10255013,
author={Olaya, Paula and Luettgau, Jakob and Roa, Camila and Llamas, Ricardo and Vargas, Rodrigo and Wen, Sophia and Chung, I-Hsin and Seelam, Seetharami and Park, Yoonho and Lofstead, Jay and Taufer, Michela},
booktitle={2023 IEEE 16th International Conference on Cloud Computing (CLOUD)},
title={Enabling Scalability in the Cloud for Scientific Workflows: An Earth Science Use Case},
year={2023},
pages={383-393},
doi={10.1109/CLOUD60044.2023.00052}}

Kennedy, Dominic and Olaya, Paula and Lofstead, Jay and Vargas, Rodrigo and Taufer, Michela. Augmenting Singularity to Generate Fine-grained Workflows, Record Trails, and Data Provenance. 2022 IEEE 18th International Conference on e-Science (e-Science) (2022). [link]

@inproceedings{9973642,
author={Kennedy, Dominic and Olaya, Paula and Lofstead, Jay and Vargas, Rodrigo and Taufer, Michela},
booktitle={2022 IEEE 18th International Conference on e-Science (e-Science)},
title={Augmenting Singularity to Generate Fine-grained Workflows, Record Trails, and Data Provenance},
year={2022},
pages={403-404},
doi={10.1109/eScience55777.2022.00059}
}

Dwivedi, D., Santos, A. L. D., Barnard, M. A., Crimmins, T. M., Malhotra, A., Rod, K. A., et al. Biogeosciences perspectives on Integrated, Coordinated, Open, Networked (ICON) science. Earth and Space Science (2022). [link]

@article{https://doi.org/10.1029/2021EA002119,
author = {Dwivedi, D. and Santos, A. L. D. and Barnard, M. A. and Crimmins, T. M. and Malhotra, A. and Rod, K. A. and Aho, K. S. and Bell, S. M. and Bomfim, B. and Brearley, F. Q. and Cadillo-Quiroz, H. and Chen, J. and Gough, C. M. and Graham, E. B. and Hakkenberg, C. R. and Haygood, L. and Koren, G. and Lilleskov, E. A. and Meredith, L. K. and Naeher, S. and Nickerson, Z. L. and Pourret, O. and Song, H.-S. and Stahl, M. and Taş, N. and Vargas, R. and Weintraub-Leff, S.},
title = {{Biogeosciences Perspectives on Integrated, Coordinated, Open, Networked (ICON) Science}},
journal = {Earth and Space Science},
volume = {9},
number = {3},
pages = {e2021EA002119},
year = {2022}
}

Loescher, H. W., Vargas, R., Mirtl, M., Morris, B., Pauw, J., Yu, X., et al. Building a Global Ecosystem Research Infrastructure to address global grand challenges for macrosystem ecology. Earth's Future (2022). [link]

@article{https://doi.org/10.1029/2020EF001696,
author = {Loescher, Henry W. and Vargas, Rodrigo and Mirtl, Michael and Morris, Beryl and Pauw, Johan and Yu, Xiubo and Kutsch, Werner and Mabee, Paula and Tang, Jianwu and Ruddell, Benjamin L. and Pulsifer, Peter and Bäck, Jaana and Zacharias, Steffen and Grant, Mark and Feig, Gregor and Zhang, Leiming and Waldmann, Christoph and Genazzio, Melissa A.},
title = {{Building a Global Ecosystem Research Infrastructure to Address Global Grand Challenges for Macrosystem Ecology}},
journal = {Earth's Future},
volume = {10},
number = {5},
pages = {e2020EF001696},
year = {2022}
}

Llamas, Ricardo M. and Valera, Leobardo and Olaya, Paula and Taufer, Michela and Vargas, Rodrigo. Downscaling Satellite Soil Moisture Using a Modular Spatial Inference Framework. Remote Sensing (2022). [link]

@Article{rs14133137,
AUTHOR = {Llamas, Ricardo M. and Valera, Leobardo and Olaya, Paula and Taufer, Michela and Vargas, Rodrigo},
TITLE = {Downscaling Satellite Soil Moisture Using a Modular Spatial Inference Framework},
JOURNAL = {Remote Sensing},
VOLUME = {14},
YEAR = {2022},
NUMBER = {13},
ARTICLE-NUMBER = {3137},
URL = {https://www.mdpi.com/2072-4292/14/13/3137},
ISSN = {2072-4292},
DOI = {10.3390/rs14133137}
}

Shane M. Franklin, Alexandra N. Kravchenko, Rodrigo Vargas, Bruce Vasilas, Jeffry J. Fuhrmann, and Yan Jin. The unexplored role of preferential flow in soil carbon dynamics. Soil Biology and Biochemistry (2021). [link]

@article{FRANKLIN2021108398,
author = {Shane M. Franklin and Alexandra N. Kravchenko and Rodrigo Vargas and Bruce Vasilas and Jeffry J. Fuhrmann and Yan Jin},
title = {{The unexplored role of preferential flow in soil carbon dynamics}},
journal = {Soil Biology and Biochemistry},
volume = {161},
pages = {108398},
year = {2021},
}

Daniel L. Warner, Mario Guevara, John Callahan, Rodrigo Vargas. Downscaling satellite soil moisture for landscape applications: A case study in Delaware, USA. Journal of Hydrology: Regional Studies (2021). [link]

@article{WARNER2021100946,
author = {Daniel L. Warner and Mario Guevara and John Callahan and Rodrigo Vargas},
title = {{Downscaling satellite soil moisture for landscape applications: A case study in Delaware, USA}},
journal = {Journal of Hydrology: Regional Studies},
volume = {38},
pages = {100946},
year = {2021},
}

Sassan Saatchi, Marcos Longo, Liang Xu, Yan Yang, Hitofumi Abe, et al. Detecting vulnerability of humid tropical forests to multiple stressors. One Earth (2021). [link]

@article{SAATCHI2021988,
author = {Sassan Saatchi and Marcos Longo and Liang Xu and Yan Yang and Hitofumi Abe and Michel André and Juliann E. Aukema and Nuno Carvalhais and Hinsby Cadillo-Quiroz and Gillian Ann Cerbu and Janet M. Chernela and Kristofer Covey and Lina María Sánchez-Clavijo and Isai V. Cubillos and Stuart J. Davies and Veronique {De Sy} and Francois {De Vleeschouwer} and Alvaro Duque and Alice Marie {Sybille Durieux} and Kátia {De Avila Fernandes} and Luis E. Fernandez and Victoria Gammino and Dennis P. Garrity and David A. Gibbs and Lucy Gibbon and Gae Yansom Gowae and Matthew Hansen and Nancy {Lee Harris} and Sean P. Healey and Robert G. Hilton and Christine May Johnson and Richard Sufo Kankeu and Nadine Therese Laporte-Goetz and Hyongki Lee and Thomas Lovejoy and Margaret Lowman and Raymond Lumbuenamo and Yadvinder Malhi and Jean-Michel M. {Albert Martinez} and Carlos Nobre and Adam Pellegrini and Jeremy Radachowsky and Francisco Román and Diane Russell and Douglas Sheil and Thomas B. Smith and Robert G.M. Spencer and Fred Stolle and Hesti Lestari Tata and Dennis del Castillo Torres and Raphael Muamba Tshimanga and Rodrigo Vargas and Michelle Venter and Joshua West and Atiek Widayati and Sylvia N. Wilson and Steven Brumby and Aurora C. Elmore},
title = {Detecting vulnerability of humid tropical forests to multiple stressors},
journal = {One Earth},
volume = {4},
number = {7},
pages = {988-1003},
year = {2021},
issn = {2590-3322},
}

M. Guevara, M. Taufer, and R. Vargas. Gap-free global annual soil moisture: 15 km grids for 1991–2018. Earth Syst. Sci. Data (2021). [link]

@Article{essd-13-1711-2021,
author = {Mario Guevara and Michela Taufer and Rodrigo Vargas},
title = {{Gap-free Global Annual Soil Moisture: 15\,km grids for 1991--2018}},
journal = {Earth System Science Data},
volume = {13},
year = {2021},
number = {4},
pages = {1711--1735},
}

R. Llamas, M. Guevara, D. Rorabaugh, M. Taufer, and R. Vargas. Spatial Gap-Filling of ESA CCI Satellite-Derived Soil Moisture Based on Geostatistical Techniques and Multiple Regression. Remote Sensing 12(4):665 (2020). 10.3390/rs12040665 [link]

@article{llamas2020spatial,
author = {Llamas, Ricardo M and Guevara, Mario and Rorabaugh, Danny and Taufer, Michela and Vargas, Rodrigo},
title = {{Spatial gap-filling of ESA CCI satellite-derived soil moisture based on geostatistical techniques and multiple regression}},
journal = {Remote Sensing},
volume = {12},
number = {4},
pages = {665},
year = {2020},
publisher={Multidisciplinary Digital Publishing Institute}
}

D. Rorabaugh, M. Guevara, R. Llamas, J. Kitson, R. Vargas, and M. Taufer. SOMOSPIE: A modular SOil MOisture SPatial Inference Engine based on data-driven decisions. In Proceedings of the 2019 15th International Conference on eScience (eScience) (2019). [link]

@inproceedings{rorabaugh2019somospie,
author = {Rorabaugh, Danny and Guevara, Mario and Llamas, Ricardo and Kitson, Joy and Vargas, Rodrigo and Taufer, Michela},
title = {{SOMOSPIE: A modular SOil MOisture SPatial Inference Engine based on data-driven decisions}},
booktitle = {Proceedings of the 2019 15th International Conference on eScience (eScience)},
pages = {1--10},
year = {2019},
organization = {IEEE Computer Society}
}

E. Stell, M. Guevara, and R. Vargas. Soil swelling potential across Colorado: A digital soil mapping assessment. Landscape and Urban Planning, v.190 (2019). [link]

@article{stell2019soil,
author = {Stell, Emma and Guevara, Mario and Vargas, Rodrigo},
title = {{Soil swelling potential across Colorado: A digital soil mapping assessment}},
journal = {Landscape and Urban Planning},
volume = {190},
pages = {103599},
year = {2019},
publisher = {Elsevier}
}

M. Guevara and R. Vargas. Downscaling satellite soil moisture using geomorphometry and machine learning. PLOS ONE, v.14 (2019). [link]

@article{guevara2019downscaling,
author = {Guevara, Mario and Vargas, Rodrigo},
title = {{Downscaling satellite soil moisture using geomorphometry and machine learning}},
journal = {PLOS ONE},
volume = {14},
number = {9},
pages = {e0219639},
year = {2019},
publisher = {Public Library of Science San Francisco, CA USA}
}

D. Warner, M. Guevara, S. Inamdar, and R. Vargas. Upscaling soil-atmosphere CO2 and CH4 fluxes across a topographically complex forested landscape. Agricultural and Forest Meteorology, v.264 (2019). [link]

@article{warner2019upscaling,
author = {Warner, Daniel L and Guevara, Mario and Inamdar, Shreeram and Vargas, Rodrigo},
title = {{Upscaling soil-atmosphere CO2 and CH4 fluxes across a topographically complex forested landscape}},
journal = {Agricultural and Forest Meteorology},
volume = {264},
pages = {80--91},
year = {2019},
publisher = {Elsevier}
}

T. Kitson, P. Olaya, E. Racca, M. Wyatt, M. Guevara, R. Vargas, and M. Taufer. Data Analytics for Modeling Soil Moisture Patterns across United States Ecoclimatic Domains. In Proceedings of the 2017 IEEE International Conference on Big Data (Big Data) (2017). [link]

@inproceedings{kitson2017data,
author = {Kitson, Thomas and Olaya, Paula and Racca, Elizabeth and Wyatt, Michael R and Guevara, Mario and Vargas, Rodrigo and Taufer, Michela},
title = {{Data analytics for modeling soil moisture patterns across United States ecoclimatic domains}},
booktitle = {Proceedings of the 2017 IEEE International Conference on Big Data (Big Data)},
pages = {4768--4770},
year = {2017},
organization = {IEEE Computer Society}
}

R. McKenna, V. Pallipuram, R. Vargas, and M. Taufer. From HPC Performance to Climate Modeling: Transforming Methods for HPC Predictions into Models of Extreme Climate Conditions. In Proceedings of the 11th IEEE International Conference on e-Science (eScience), pp. 108–117. Munich, Germany. August 31 – September 4, 2015.

@inproceedings{mckinney2015hpc,
author = {McKenna, Ryan and Pallipuram, Vivek K and Vargas, Rodrigo and Taufer, Michela},
title = {{From HPC performance to climate modeling: Transforming methods for HPC predictions into models of extreme climate conditions}},
booktitle = {Proceedings of 2015 IEEE 11th International Conference on e-Science},
pages = {108--117},
year = {2015},
organization = {IEEE Computer Society}
}