A handful of hex logos for the software I am working on

Software

I am deeply committed to open-source software and have been building professional-grade tools for over 15 years, going back to the days of Google Code (see here). My work combines rigorous statistical methods with modern software engineering practices, including version control (Git/GitHub), continuous integration, agentic AI, container technologies, and agile methodologies. I have led scientific software development teams that deliver high-performance, reliable, and user-friendly tools widely adopted by researchers and practitioners. Most of my work is built in C++ with wrappers for R and Python and is optimized for high-performance computing environments. Over the last couple of years, I have also incorporated AI into my work, where it has influenced both development and innovation (see here for a few examples). Across all projects, I design with performance and usability in mind so that complex methods are both scalable and accessible.

A few projects to highlight:

The epiworld framework is an advanced agent-based modeling framework written in C++, designed for rapid prototyping of epidemiological simulation models. The library is available from R and Python, as well as through a Shiny package.
The most notable application of epiworld is the modeling of measles during the recent US outbreaks. You can see a version of the Shiny app that was developed here.

The rgexf package. Create, read, and write ‘GEXF’ (Graph Exchange ‘XML’ Format) graph files (used in ‘Gephi’ and others). Using the ‘XML’ package, rgexf allows reading and writing GEXF files, including attributes, ‘GEXF’ visual attributes (such as color, size, and position), network dynamics (for both edges and nodes), and edges’ weights. Users can build/handle graphs element-by-element or massively through data frames, visualize the graph on a web browser through ‘gexf-js’ (a ‘javascript’ library), and interact with the ‘igraph’ package.
You can see a live version of the gexf-js library in action here.

The ergmito R package. Simulation and estimation of Exponential Random Graph Models (ERGMs) for small networks using exact statistics as shown in Vega Yon et al. (2020) https://doi.org/10.1016/j.socnet.2020.07.005. Unlike the ‘ergm’ package, ‘ergmito’ circumvents using Markov-Chain Maximum Likelihood Estimator (MC-MLE) and instead uses Maximum Likelihood Estimator (MLE) to fit ERGMs for small networks. Because exhaustive enumeration is computationally feasible for small networks, this R package provides tools for calculating likelihood functions, and other relevant functions, directly. As a result, in many cases both estimation and simulation of ERGMs for small networks can be faster and more accurate than with simulation-based algorithms.
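
To illustrate what "exact" means here, the following is a minimal Python sketch of an ERGM log-likelihood computed by exhaustive enumeration. It is not ergmito's implementation or API: the function names, the choice of sufficient statistics (edge and triangle counts on an undirected network), the example network, and the parameter values are all assumptions made for illustration.

```python
import itertools
import math


def sufficient_stats(edges, n):
    """Return (edge count, triangle count) for an undirected graph.

    `edges` holds pairs (i, j) with i < j; `n` is the number of nodes.
    """
    edge_set = set(edges)
    n_triangles = sum(
        1
        for i, j, k in itertools.combinations(range(n), 3)
        if (i, j) in edge_set and (i, k) in edge_set and (j, k) in edge_set
    )
    return (len(edge_set), n_triangles)


def all_graphs(n):
    """Yield every undirected graph on n nodes as a tuple of edges (i, j), i < j."""
    dyads = list(itertools.combinations(range(n), 2))
    for mask in itertools.product((0, 1), repeat=len(dyads)):
        yield tuple(d for d, keep in zip(dyads, mask) if keep)


def exact_log_likelihood(theta, observed_edges, n):
    """Exact log-likelihood: theta . s(y) - log(sum over all y' of exp(theta . s(y')))."""

    def weight(edges):
        return sum(t * s for t, s in zip(theta, sufficient_stats(edges, n)))

    # The normalizing constant sums over all 2^(n choose 2) graphs,
    # which is only feasible when n is small.
    log_norm = math.log(sum(math.exp(weight(g)) for g in all_graphs(n)))
    return weight(observed_edges) - log_norm


# Hypothetical 4-node network (one triangle) and parameters (edges, triangles).
observed = ((0, 1), (0, 2), (1, 2))
theta = (-0.5, 0.25)
loglik = exact_log_likelihood(theta, observed, n=4)
print(loglik)
```

With four nodes the normalizing constant sums over only 64 graphs, so no simulation is needed. The number of graphs doubles with every additional dyad, which is why this direct approach is limited to small networks.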

The following is an exhaustive list of the software packages I have either built or contributed to. You can take a look at my most recent contributions and ongoing open-source projects on my GitHub.

Bayer, D., Morris, D. H., Vega Yon, G. G., Martin, T., & Bidari, S. (2024). PyRenew: A package for Bayesian renewal modeling with JAX and NumPyro. https://github.com/CDCgov/PyRenew
Johnson, K., Morris, D., Abbott, S., Bernal Zelaya, C., Vega Yon, G. G., Bayer, D., Magee, A., & Olesen, S. (2024). Wwinference: Jointly infers infection dynamics from wastewater data and epidemiological indicators. https://github.com/cdcgov/ww-inference-model/
Meyer, D., & Vega Yon, G. G. (2023). epiworldR: Fast agent-based epi models. https://cran.r-project.org/package=epiworldR
Meyer, D., & Vega Yon, G. G. (2023). epiworldRShiny: Shiny interface for epiworldR. https://cran.r-project.org/package=epiworldRShiny
Vega Yon, G. G. (2023). Defm: Estimation and simulation of multi-binary response models. https://cran.r-project.org/package=defm
Vega Yon, G. G. (2022). Aphylo: Statistical inference of annotated phylogenetic trees. https://cran.r-project.org/package=aphylo
Vega Yon, G. G. (2022). epiworld: A flexible and general agent based model engine. https://github.com/UofUEpiBio/epiworld
Vega Yon, G. G. (2021). Netplot: Beautiful graph drawing. https://cran.r-project.org/package=netplot
Vega Yon, G. G., & de la Haye, K. (2020). Ergmito: Exponential random graph models for small networks. https://cran.r-project.org/package=ergmito
Vega Yon, G. G. (2020). Barry: Your to-go motif accountant. https://github.com/USCbiostats/barry
Vega Yon, G. G. (2020). Fmcmc: A friendly MCMC framework. https://CRAN.R-project.org/package=fmcmc
Vega Yon, G. G. (2020). Pruner: Implementing Felsenstein’s tree pruning algorithm. https://github.com/USCbiostats/pruner
Vega Yon, G. G. (2020). Rgexf: Build, import and export GEXF graph files. https://CRAN.R-project.org/package=rgexf
Vega Yon, G. G. (2020). slurmR: A lightweight wrapper for ’slurm’. https://CRAN.R-project.org/package=slurmR
Vega Yon, G. G., & Valente, T. (2020). netdiffuseR: Analysis of Diffusion and Contagion Processes on Networks. https://doi.org/10.5281/zenodo.1039317
Vega Yon, G. G., & Quistorff, B. (2019). Parallel: Stata module for parallel computing. https://github.com/gvegayon/parallel
Vega Yon, G. G. (2017). googlePublicData: Working with Google’s ’public data explorer’ DSPL metadata files. https://CRAN.R-project.org/package=googlePublicData
Vega Yon, G. G., & Muñoz, E. (2017). ABCoptim: Implementation of artificial bee colony (ABC) optimization. https://CRAN.R-project.org/package=ABCoptim
Vega Yon, G. G. (2016). twitterreport: Out-of-the-box analysis and reporting tools for twitter (Version v0.16) [Computer software]. Zenodo. https://doi.org/10.5281/zenodo.44528