Decoding Viral Diffusion: High-Resolution Modeling of COVID-19’s First Waves of Expansion

Bryan White
Mar 6

Introduction

The emergence and rapid dissemination of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) presented the global scientific community with an unprecedented challenge: tracking a highly transmissible, rapidly mutating pathogen across vast, heterogeneous geographic landscapes. While traditional epidemiological surveillance has historically relied on temporal epidemic curves—plotting the raw number of new cases against time—this unidimensional approach fails to capture the physical movement of the disease across a map. An epidemic is not a stationary phenomenon that uniformly intensifies and subsides; rather, it behaves as a physical wave, originating from distinct epicenters and propagating outward through susceptible populations. Understanding the exact spatiotemporal dynamics of these waves is critical for evaluating the effectiveness of public health interventions, predicting future local surges, and optimizing the deployment of medical resources.

Recent advancements in computational epidemiology have provided the mathematical frameworks necessary to transition from temporal tracking to high-resolution spatiotemporal modeling. Specifically, researchers have successfully quantified the precise wavefront speed and areal expansion rates of the first two major epidemic waves of SARS-CoV-2 infections across the contiguous United States.¹ By synthesizing multiple streams of public health data through advanced Bayesian nowcasting models and projecting these estimates onto a standardized hexagonal grid, it is possible to measure the daily geographic expansion of the virus in square kilometers and its forward velocity in kilometers per day.¹

The resulting analysis reveals a complex narrative of viral diffusion. It highlights how the inherent biological characteristics of differing viral variants, the decentralized and often contradictory nature of public health policies, and the predictable patterns of human mobility combined to dictate the physical speed of the pandemic.¹ This report provides an exhaustive, advanced analysis of these spatiotemporal dynamics, detailing the computational methodologies utilized, the biological context of the viral variants, and the profound regional heterogeneities that defined the first two years of the American COVID-19 experience.

The Biological and Epidemiological Context of the Initial Outbreaks

To accurately interpret the spatial movement of the SARS-CoV-2 virus, one must first establish the biological and immunological landscapes in which these waves propagated. The first two major waves analyzed in these spatial models were not caused by identical pathogens operating in identical environments. They were defined by the emergence of distinct viral variants of concern, acting upon populations with vastly different levels of baseline immunity and operating under different sets of non-pharmaceutical interventions.

The First Wave: A Naive Population and Wild-Type Transmission

The surge designated as Wave 1 in contemporary spatial analyses began its ascent in mid-September 2020 and maintained a high level of geographic expansion through February 2021.¹ From an immunological standpoint, this period represented the most vulnerable phase of the pandemic. The American population was still largely naive to the virus; vaccines would not become available to frontline healthcare workers until late December 2020 and were not widely available to the general public until the spring of 2021.¹ Consequently, population immunity was low, derived exclusively from individuals who had survived prior infections with the initial ancestral strains of the virus.¹

During the initial stages of Wave 1, the predominant pathogens in circulation were the original wild-type SARS-CoV-2 lineages and early mutated offshoots.¹ As the wave progressed into late 2020 and early 2021, the epidemiological landscape began to shift with the introduction and subsequent spread of the Alpha variant, scientifically designated as lineage B.1.1.7.¹ The Alpha variant contained several consequential mutations, particularly in the spike protein, which afforded it a distinct fitness advantage and increased transmissibility compared to the ancestral strains.⁷ However, genomic surveillance indicates that other lineages, such as B.1.1.420, were also circulating and competing during this timeframe, exhibiting positively selected mutations at key positions (such as L452 and P681) that would later become hallmarks of even more infectious variants.⁷

Despite the lack of pharmaceutical defenses, the physical spread of Wave 1 was actively contested by widespread, albeit decentralized, public health interventions. Throughout the fall and winter of 2020, many jurisdictions maintained strict restrictions on human mobility. These non-pharmaceutical interventions included travel advisories, the closure of indoor dining and entertainment venues, mandatory quarantine periods for interstate travelers, and widespread masking mandates.¹ Consequently, the virus's ability to traverse the geographic landscape was heavily dependent on localized breakdowns in these protocols and specific behavioral vectors that circumvented the established mitigation strategies.

The Second Wave: Viral Evolution and the Delta Variant

The epidemiological environment of Wave 2, which emerged in early July 2021 and peaked in early September 2021, was drastically different.¹ By the summer of 2021, the United States had executed a massive national vaccination campaign. Millions of citizens had received mRNA or viral vector vaccines, theoretically establishing a robust wall of population immunity that standard epidemiological models predicted would severely dampen the basic reproduction number of the virus and halt its spatial expansion.¹

However, this anticipated suppression was entirely neutralized by the evolutionary trajectory of the virus. Wave 2 was driven almost exclusively by the Delta variant, or lineage B.1.617.2.¹ First identified in late 2020, the Delta variant possessed profound evolutionary advantages over both the wild-type strains and the Alpha variant. Epidemiological estimates indicate that Delta was highly contagious, boasting a basic reproduction number (R0) estimated at 5.08, placing its innate transmissibility far higher than previous variants and higher than many other historic viral infections.⁹
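To make the stakes of an R0 near 5 concrete, the textbook relationship between the basic reproduction number and the immune fraction can be written out. This is a sketch under assumptions that are not from the source: homogeneous mixing and fully protective immunity, with p denoting the immune fraction of the population.

```latex
R_{\mathrm{eff}} = R_0\,(1 - p), \qquad
R_{\mathrm{eff}} < 1 \iff p > 1 - \frac{1}{R_0}, \qquad
1 - \frac{1}{5.08} \approx 0.80
```

Under these idealized assumptions, roughly four in five people would need effective immunity before a variant with R0 = 5.08 stopped growing, which helps explain why partial vaccination coverage did not halt spatial expansion.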

The primary driver of this extreme transmissibility was a massive increase in viral load. Quantitative genomic analyses suggest that individuals infected with the Delta variant generated, on average, 6.2 times more viral RNA copies per milliliter of sample than those infected with the Alpha variant during their respective emergence periods.¹⁰ A separate analysis of the ORF1ab target gene reported that the concentration of the Delta variant was ten times higher than that of the wild type and twice as high as the Alpha variant.¹¹ This elevated viral load directly influenced the levels of transmission and infectivity, extending the duration of viral shedding and drastically increasing the probability of transmission during human contact.⁹

Furthermore, research utilizing high-resolution datasets on human social interactions demonstrated that the enhanced viral load of the Delta variant influenced the correlation of viral loads within transmission chains.¹² In a transmission pair, the viral load of the infector at the exact time of transmission directly influences the subsequent infectiousness of the newly infected individual.¹³ The high baseline shedding of the Delta variant ensured that initial infectious doses were massive, leading to shorter generation intervals and a rapid acceleration of localized outbreaks.¹²

This massive increase in transmissibility manifested in state-level epidemiological metrics. For example, analyses estimating the variant-specific effective reproductive numbers (Rt) for states in New England found that the multiplicative increase in Rt for Delta was substantially greater than that for Alpha across all jurisdictions.¹⁰ This viral advantage, however, exhibited regional heterogeneity; the multiplicative increase in the effective reproductive number for Delta was estimated to be highest in Maine (a 1.99-fold increase) and lowest in Massachusetts (a 1.45-fold increase), suggesting that underlying population density, existing immunity, and local behaviors heavily modulated the pathogen's biological potential.¹⁰

Compounding the biological threat of the Delta variant was a massive shift in sociological behavior. Driven by widespread pandemic fatigue and an overreliance on the newly available vaccines, local and state governments across the United States rapidly rolled back non-pharmaceutical interventions throughout the spring and early summer of 2021.¹ Masking policies were largely abandoned, capacity limits on indoor venues were lifted, and domestic travel returned to near pre-pandemic levels.¹ This explosive combination—a highly transmissible, high-viral-load variant unleashed into a highly mobile, socially unconstrained population—created the perfect environment for unprecedented spatiotemporal expansion.

Surmounting Data Limitations: The Shift to Nowcasting Models

Attempting to measure the precise geographic speed of an epidemic requires pristine, continuous data. Unfortunately, the raw notification data generated by the American public health apparatus—the daily counts of confirmed cases reported by county and state health departments—is inherently flawed and largely unsuitable for high-resolution spatial modeling.

The Flaws of Raw Notification Data

Raw case reporting is subject to a multitude of systemic biases that distort the true progression of an outbreak. First, raw data is heavily influenced by the administrative rhythms of the healthcare system; testing centers frequently close on weekends, resulting in artificial dips in case counts on Sundays and Mondays, followed by artificial surges on Tuesdays and Wednesdays as backlogs are processed.¹⁵ Second, raw case counts are entirely dependent on local testing capacity and the varying test-seeking behaviors of different demographic groups.⁸ Finally, reporting delays create a persistent downward bias in the most recent data; a decline in reported cases over a given week may not indicate a true epidemiological decline, but merely a delay in laboratory processing or bureaucratic reporting.¹⁵

If spatial epidemiologists were to use this raw notification data to calculate the speed of a viral wavefront, their results would measure the speed of bureaucratic reporting rather than the true biological diffusion of the virus.

Bayesian Evidence Synthesis and the COVIDestim Framework

To overcome these profound data limitations, researchers utilize advanced mathematical techniques known as "nowcasting".¹ Unlike forecasting, which attempts to predict the future, nowcasting utilizes complex statistical models to adjust incomplete, real-time data based on historical reporting patterns and delays, thereby generating a highly accurate estimate of the true, current state of disease transmission.¹⁵
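The core idea of adjusting recent counts for reporting delay can be illustrated with a very small sketch. This is not the Bayesian model described in this article; it is a simple multiplicative correction, and the `completeness` values (the assumed fraction of events already reported k days after they occur) are hypothetical numbers chosen for illustration.

```python
import numpy as np

def nowcast_counts(observed, completeness):
    """Scale up the most recent daily counts by their expected completeness.

    observed:      counts by event date, oldest first.
    completeness:  completeness[k] is the ASSUMED fraction of events already
                   reported k days after they occurred (k = 0 is the latest day).
                   Days older than len(completeness) are treated as complete.
    """
    observed = np.asarray(observed, dtype=float)
    adjusted = observed.copy()
    n = len(observed)
    for k, frac in enumerate(completeness):
        idx = n - 1 - k          # k days back from the most recent day
        if idx < 0:
            break
        adjusted[idx] = observed[idx] / frac
    return adjusted

# Hypothetical data: a flat epidemic that looks like it is declining
# only because the last three days are incompletely reported.
observed = [100, 100, 90, 60, 30]
completeness = [0.3, 0.6, 0.9]   # hypothetical reporting fractions
adjusted = nowcast_counts(observed, completeness)
```

In this toy case the apparent decline in the last three days disappears once each count is divided by its assumed completeness, which is the artifact that the paragraph above warns about.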

For the exhaustive analysis of the initial U.S. infection waves, researchers relied on temporally and spatially resolved estimates of SARS-CoV-2 infections generated by the COVIDestim project.¹ The COVIDestim package is a robust Bayesian statistical model that does not rely solely on confirmed case counts. Instead, it synthesizes multiple streams of epidemiological evidence, jointly analyzing reported COVID-19 cases, hospital admissions, and mortality reports.¹

By anchoring the model with severe outcomes like hospitalizations and deaths—which are far less susceptible to ascertainment bias and test-seeking behavioral changes than mild cases—the COVIDestim framework can mathematically back-calculate the true number of infections occurring in a community.¹⁸ The model rigorously accounts for the natural history delays inherent in disease progression (the time from initial infection to symptom onset, to hospitalization, and to death) as well as the administrative reporting delays inherent in the surveillance system.¹⁸

The application of this Bayesian synthesis reveals the staggering extent of under-ascertainment during the early pandemic. Estimates derived from the model indicate that from the introduction of SARS-CoV-2 in the United States until January 1, 2021, only 22.4 percent of actual infections were identified and reported by the public health system.¹⁸ By utilizing these smoothed, highly accurate estimates of true daily infections, researchers ensure that their subsequent spatial calculations are based on the biological reality of the pathogen's spread, free from the artifacts of human administrative delays.
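As a rough worked example of what a 22.4 percent ascertainment fraction implies, the average multiplier between reported cases and estimated infections follows directly. The figure of 1,000 reported cases is a hypothetical input, and the multiplier is an average over the whole period; it would vary by place and time.

```latex
\text{infections per reported case} \approx \frac{1}{0.224} \approx 4.46,
\qquad
1{,}000 \text{ reported cases} \;\Rightarrow\; \frac{1{,}000}{0.224} \approx 4{,}464 \text{ infections}
```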

Geometric Standardization: The Transition to Hexagonal Grids

With accurate, smoothed estimates of daily infections secured, researchers must project this data onto a geographic map. Historically, spatial epidemiology in the United States has relied on the county as the fundamental unit of analysis. However, utilizing counties for calculating the physical velocity of a moving wavefront introduces severe geometric distortions.¹

The United States is composed of over 3,000 counties, representing highly irregular polygons that vary massively in physical size, shape, and population density. A single county in states like Nevada or Wyoming may encompass a geographic area larger than several entire states in the Northeast, yet contain only a fraction of their population.¹ Attempting to measure the distance a virus travels per day across a map composed of wildly divergent shapes leads to mathematically unstable calculations. Furthermore, the extreme variance in population sizes creates severe statistical instability; as noted in public health literature, a raw death rate or infection rate in a county with only 800 residents can appear artificially high based on a mere two or three independent cases.¹⁹

Ultra-High-Resolution Population Mapping

To solve this spatial challenge, researchers abandoned the irregular political boundaries of counties in favor of a standardized, mathematically uniform canvas: a continuous hexagonal grid spanning the contiguous United States.¹

To populate this grid with accurate infection data, epidemiologists first utilized the COVIDestim daily infection rates calculated at the county level.¹ They then utilized 2019 United States population estimates, incorporating data from the U.S. Census Bureau covering all Census Block Groups, alongside ultra-high-resolution population size estimates provided by Meta, which map human population density down to a resolution of approximately 30 meters.¹

By overlaying the uniform hexagonal grid onto this high-resolution population map, researchers mathematically distributed the county-level infection estimates into the individual hexagons. This distribution was calculated based on the exact fraction of a county's population contained within the boundaries of each specific hexagon, operating under the assumption that the estimated infection rates per 100,000 persons were distributed equally across the populated areas within a given county.¹
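The population-weighted allocation described above can be sketched in a few lines. The function name, the dictionary layout, and the example counties and hexagons are all hypothetical; the only idea taken from the text is that each hexagon receives a share of a county's infections equal to the share of that county's population living inside it.

```python
from collections import defaultdict

def distribute_to_hexagons(county_infections, overlap_population):
    """Allocate county-level infections to hexagons by population share.

    county_infections:   {county_id: estimated infections on a given day}
    overlap_population:  {(county_id, hex_id): people living in that overlap}
    Returns {hex_id: infections}. Assumes a uniform infection rate within
    each county's populated area, as stated in the text.
    """
    # Total population of each county, summed over all of its overlaps.
    county_pop = defaultdict(float)
    for (county, _hex_id), pop in overlap_population.items():
        county_pop[county] += pop

    hex_infections = defaultdict(float)
    for (county, hex_id), pop in overlap_population.items():
        if county_pop[county] == 0:
            continue  # unpopulated county: nothing to allocate
        share = pop / county_pop[county]
        hex_infections[hex_id] += county_infections.get(county, 0.0) * share
    return dict(hex_infections)

# Hypothetical example: county A straddles two hexagons, county B sits in one.
county_infections = {"A": 200.0, "B": 50.0}
overlap_population = {("A", "h1"): 7500, ("A", "h2"): 2500, ("B", "h2"): 1000}
hex_infections = distribute_to_hexagons(county_infections, overlap_population)
```

Because every county's shares sum to one, the total number of infections is preserved by the reallocation; only their spatial arrangement changes.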

This critical transformation standardizes the spatial geometry of the entire country. Regardless of whether a calculation is tracking the virus through the dense urban corridors of the Northeast or the sparse rural expanses of the Great Plains, the distance between the center points of adjacent geometries remains perfectly uniform, allowing for precise, consistent calculations of wavefront velocity and areal expansion.¹

Advanced Mathematical Modeling: The BYM2 Spatial Framework

Even with accurately nowcasted infection estimates distributed across a standardized hexagonal grid, the daily data remains subject to localized statistical noise. To isolate the true signal of the epidemic's physical expansion, the data must be smoothed across both space and time. Standard statistical methods, such as basic Difference-in-Differences (DID) approaches commonly used to estimate effects from observational panel data, are inadequate for this task because they typically ignore the vital element of spatial dependence—the reality that an outbreak in one hexagon fundamentally influences the risk in adjacent hexagons.²¹

To accurately capture this geographic contagion, researchers applied a highly sophisticated Bayesian hierarchical spatial model: a reparameterized version of the Besag, York, and Mollié (BYM) model, commonly referred to as BYM2.¹

The Architecture of the BYM Model

The original BYM model, developed in the early 1990s, is a foundational tool in disease risk mapping. It is fundamentally a lognormal Poisson model that estimates the unknown logarithmic relative risk for a specific spatial zone given the observed number of cases in that zone.²³

The model calculates this risk by combining an overall baseline risk level (a fixed global intercept) with explanatory spatial covariates, and crucially, two distinct types of random effects.²³

The Spatially Structured Component: This component is modeled using an Intrinsic Conditional Auto-Regressive (ICAR) structure. It mathematically enforces spatial dependence, operating on the epidemiological assumption that neighboring geographic units will exhibit similar disease risk profiles. If an outbreak is raging in one hexagon, the ICAR component inherently raises the expected risk in all adjacent hexagons, allowing the model to "borrow strength" across the map and smooth out isolated spatial anomalies.²³
The Unstructured Random Effects Component: This is an ordinary random effect, often modeled as independent and identically distributed Gaussian noise. It accounts for non-spatial heterogeneity and over-dispersion in the data—essentially, it captures the highly localized, independent events or localized vulnerabilities that cause a single hexagon to spike independently of its neighbors.²³
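In symbols, the structure described above is usually written as follows. The notation is an assumption for illustration and is not taken from the source: y_i and E_i are observed and expected counts in zone i, θ_i is the relative risk, x_i are covariates, u_i is the ICAR component, v_i is the unstructured component, and n_i is the number of neighbors of zone i.

```latex
\begin{aligned}
y_i &\sim \mathrm{Poisson}(E_i\,\theta_i), \\
\log \theta_i &= \alpha + x_i^{\top}\beta + u_i + v_i, \\
u_i \mid u_{-i} &\sim \mathcal{N}\!\left(\frac{1}{n_i}\sum_{j \sim i} u_j,\; \frac{\sigma_u^{2}}{n_i}\right), \\
v_i &\sim \mathcal{N}(0,\; \sigma_v^{2}).
\end{aligned}
```

The conditional mean of u_i is the average of its neighbors, which is the formal version of "borrowing strength" across adjacent zones.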

The BYM2 Reparameterization and Penalized Complexity Priors

While the original BYM model is powerful, it suffers from a fundamental structural issue: the spatially structured component (the ICAR) and the unstructured component cannot be viewed independently, leading to severe confounding and making it exceptionally difficult to define prior distributions for the model's hyperparameters.²²

To resolve this, the spatiotemporal analysis of the U.S. infection waves utilized the BYM2 reparameterization, implemented with the R-INLA package, which fits Bayesian models using the Integrated Nested Laplace Approximation (INLA).¹ The BYM2 modification scales the spatial component and addresses the confounding issue by specifying the total variance of the combined spatial and non-spatial random effects, along with the proportion of that total variance that is strictly spatial.¹⁹
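The reparameterization is commonly written as a single combined random effect. The symbols here are assumptions for illustration: τ is the precision of the combined effect, φ is the mixing parameter between 0 and 1, v is standard Gaussian noise, and u* is the ICAR component scaled so that its typical variance is 1.

```latex
b = \frac{1}{\sqrt{\tau}}\left(\sqrt{1-\phi}\; v \;+\; \sqrt{\phi}\; u^{*}\right),
\qquad v \sim \mathcal{N}(0, I), \quad \phi \in [0, 1]
```

With this form, 1/τ controls the total variance and φ gives the share attributable to spatial structure, so φ = 0 is pure unstructured noise and φ = 1 is a pure spatial model.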

This separation allows for vastly improved parameter control and the assignment of interpretable, robust hyperpriors.²² Specifically, the researchers utilized Penalized Complexity (PC) priors for the BYM2 random effects.¹⁹ PC priors are a mathematically rigorous method of preventing the model from overfitting; they penalize the model for unnecessary complexity based on the information-theoretic distance from a simpler, base model, ensuring that the estimated spatial smoothing is driven entirely by the genuine signal in the data rather than statistical artifacts.¹⁹

The model can be descriptively summarized as calculating the daily SARS-CoV-2 infections per 100,000 persons in a specific hexagon as the sum of a global intercept, the random effects assigned via the BYM2 prior distribution, and an independent residual error term.¹ By fitting this hierarchical model separately for every single day from March 2020 through December 2021, the researchers generated a seamless, mathematically stabilized sequence of daily spatial distributions across the contiguous United States, providing the canvas required to calculate the speed of the advancing viral wavefronts.¹
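Written out for hexagon i on day d, that verbal summary corresponds to an additive model of the following shape. The notation is hypothetical, and the Gaussian form of the residual is an assumption, since the text only says the error term is independent.

```latex
y_{i,d} = \alpha_d + b_{i,d} + \varepsilon_{i,d},
\qquad \varepsilon_{i,d} \sim \mathcal{N}(0,\; \sigma_d^{2})
```

Here y is infections per 100,000 persons, α_d is the day-specific global intercept, and b is the BYM2 random effect for that hexagon. The subscript d on every term reflects that the model is refit separately for each day.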

Defining and Quantifying Wavefront Dynamics

With the heavily processed, spatially smoothed infection data projected onto the hexagonal grid, the final methodological step is to definitively establish the boundaries of the epidemic waves and mathematically calculate their expansion.

To define a region as actively encompassed by a wave, the researchers applied a strict quantitative threshold. Rather than relying on the very first reported case, which is highly unstable and subject to extreme outlier behavior, they established a wavefront boundary corresponding to an incidence of greater than 190 infections per 100,000 persons per day.¹ While this threshold produces a calculated wavefront that slightly lags behind the absolute leading edge of the earliest index cases, it provides a far more stable, epidemiologically meaningful measure of sustained community transmission.⁵ To test the robustness of their findings, the researchers also conducted sensitivity analyses, rerunning the entire spatial calculation using alternative thresholds of 85 and 300 infections per 100,000 persons per day, which yielded highly consistent overall results regarding wave behavior.³

With the boundaries of the waves defined for every single day of the analysis, two distinct spatial metrics were calculated to quantify the spatiotemporal dynamics:

Areal Expansion Rate: This metric quantifies how much geographic territory the advancing epidemic gains each day. It is defined as the absolute daily change in the total area encompassed by the wave, calculated by determining the total square kilometers of new hexagons recruited into the high-transmission boundary each day.¹
Wavefront Speed: This metric calculates the direct linear velocity of the virus's advance across the map. For every single vertex along the jagged boundary of the hexagonal wavefront on a given day (day d), the researchers calculated the shortest orthogonal distance to the nearest point on the newly expanded boundary line on the subsequent day (day d+1).¹ This distance, measured in kilometers per day, provides a high-resolution measure of how fast the virus is marching forward. By averaging these thousands of individual vertex speeds, they derived the daily mean speed of the entire advancing wave.⁵
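Both metrics can be approximated with a short NumPy sketch. Several things here are assumptions rather than the researchers' implementation: the function names, the hexagon area of 550 km², the toy incidence values, and the simplification of measuring distance to the nearest boundary vertex on the next day instead of the nearest point on the boundary line. Coordinates are assumed to be projected, in kilometers.

```python
import numpy as np

THRESHOLD = 190.0  # infections per 100,000 persons per day (from the text)

def in_wave(incidence, threshold=THRESHOLD):
    """Boolean mask of hexagons whose incidence exceeds the wave threshold."""
    return np.asarray(incidence, dtype=float) > threshold

def areal_expansion(incidence_by_day, hex_area_km2):
    """Day-over-day change in total area inside the wave, in km^2/day.

    incidence_by_day has shape (days, hexagons). The result is signed:
    positive when the wave gains territory, negative when it recedes.
    """
    area = in_wave(incidence_by_day).sum(axis=1) * hex_area_km2
    return np.diff(area)

def mean_wavefront_speed(boundary_d, boundary_next):
    """Mean distance (km/day) from each boundary vertex on day d to the
    nearest boundary vertex on day d+1 (a vertex-to-vertex approximation)."""
    a = np.asarray(boundary_d, dtype=float)
    b = np.asarray(boundary_next, dtype=float)
    # Pairwise distances, shape (len(a), len(b)), via broadcasting.
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return dists.min(axis=1).mean()

# Toy data: 3 days x 4 hexagons, one new hexagon crosses the threshold each day.
incidence = [[200, 100,  50, 10],
             [250, 195,  80, 20],
             [300, 260, 191, 40]]
expansion = areal_expansion(incidence, hex_area_km2=550.0)  # hypothetical area

# Toy boundary that shifts 20 km east in one day.
speed = mean_wavefront_speed([(0, 0), (0, 10)], [(20, 0), (20, 10)])
```

In the toy data one hexagon joins the wave per day, so the area grows by one hexagon's worth each day, and the straight boundary that shifts 20 km yields a mean speed of 20 km/day. A production version would need true point-to-segment distances and a spatial index to handle thousands of vertices efficiently.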

Spatiotemporal Patterns: The First Wave and Contagion Externalities

Applying this exhaustive computational framework to the data from the fall of 2020 reveals the striking spatial origins and subsequent diffusion of Wave 1. The data decisively illustrates that major national epidemics do not originate spontaneously across all geographies; they are ignited in specific locales by distinct human behavioral events.

The Sturgis Motorcycle Rally: A Spatial Catalyst

The spatial tracking isolated the definitive origin of the Wave 1 epicenter to a specific cluster spanning parts of central South Dakota, eastern North Dakota, and northeastern Montana in September 2020.⁵ The emergence of this massive geographic epicenter perfectly coincides in both time and space with one of the most significant superspreading events of the entire pandemic: the Sturgis Motorcycle Rally.⁵

Held over ten days in August 2020 in the small municipality of Sturgis, South Dakota, the event attracted an estimated 462,000 attendees.⁶ Crucially, genomic and spatial tracking indicated that these attendees traveled from an astonishing 61 percent of all counties across the United States.²⁶ The epidemiological environment of the rally was optimized for viral transmission; attendees participated in densely packed indoor and outdoor activities in an environment where neither the city, the county, nor the state had instituted any face covering requirements or substantive business restrictions, and physical distancing recommendations were broadly ignored.²⁶

The immediate local impact was severe. Between August 1 and September 15, the 14-day coronavirus testing volume in Meade County, South Dakota, spiked by 199 percent, and the local test positivity rate rapidly escalated from 5 percent to 8 percent.²⁶ The demographic profile of the primary identified patients heavily skewed toward individuals aged 40 to 59 years old (over half of the infections), with a majority being male (60 percent) and predominantly White (84 percent).²⁶

However, the greatest impact of the Sturgis rally was not local but regional and national. In epidemiological terms, the event generated a massive "contagion externality".²⁷ The hundreds of thousands of attendees, an unknown share of them infected, subsequently dispersed across the country, carrying the virus back to their home communities. Intensive contact tracing revealed that South Dakota and its immediate bordering states (Minnesota, Montana, North Dakota, Nebraska, and Wyoming) accounted for 56 percent of all early rally-related infections.²⁶ In Minnesota alone, public health authorities traced secondary and tertiary COVID-19 infections linked to the rally to one-third of all counties in the state.²⁶

The Sturgis rally acted as a massive spatial catalyst, seeding localized outbreaks across a massive geographic radius. These independent secondary epicenters rapidly expanded, merging together to form the massive, contiguous wavefront of Wave 1 that the BYM2 models tracked rolling outward from the Midwest to eventually consume the rest of the contiguous United States. By mathematically defining the boundaries of this expanding wave, the analysis categorized Wave 1 as actively propagating from September 17, 2020, through February 19, 2021.¹

Spatiotemporal Patterns: The Second Wave and Fragmented Emergence

The spatial progression of Wave 2 presented a radically different topological pattern. While Wave 1 expanded relatively contiguously from a massive midwestern epicenter, the Delta-driven second wave exhibited a highly decentralized, chaotic emergence pattern, reflecting the complex interaction between a hyper-transmissible variant and a highly mobile population operating under relaxed public health protocols.

The Ozarks Epicenter and the Impact of Relaxed Mandates

The spatial modeling identified the initial origin point of Wave 2 in the Ozarks region, primarily centered in southern Missouri, in early July 2021.⁵ The selection of this specific region as the beachhead for the Delta variant was not biologically random; it was a direct consequence of local socio-political environments.

During this period, municipalities in the Ozarks—most notably high-traffic tourist destinations like Branson, Missouri—were aggressively pursuing economic reopening strategies.¹ Indoor music venues and entertainment theaters were reopening to full-capacity crowds, and local government officials, including mayors characterized in subsequent analyses as actively "anti-mask," had aggressively relaxed or entirely rescinded local masking policies.¹ The introduction of the highly contagious Delta variant into this specific environment—characterized by high throughput of regional tourists and an absence of physical mitigation—allowed the virus to rapidly establish a dominant foothold.¹

Rapid Secondary Seeding via Domestic Travel

However, the spatial spread of Wave 2 did not rely solely on contiguous, outward expansion from the Ozarks. Shortly following the intense outbreak in Missouri, the BYM2 spatial models detected the near-simultaneous appearance of massive secondary epicenters of elevated SARS-CoV-2 infections thousands of miles away in the Pacific Northwest.⁵

This fractured, multi-focal origin pattern indicates that the spatiotemporal dynamics of the Delta variant were heavily accelerated by modern human mobility. Previous epidemiological investigations have consistently implicated both domestic and international travel networks in contributing to the transmission of SARS-CoV-2 across the United States.⁵ Because the Delta variant generated viral loads roughly six times higher than the Alpha variant, leading to substantially shorter generation intervals, infected individuals traveling through aviation hubs were highly efficient vectors.¹⁰ A tourist infected in the unmasked indoor venues of the Ozarks could easily board a flight and seed a completely independent cluster in the Pacific Northwest before the original infection was ever detected.⁵

The decentralized emergence of multiple, high-intensity epicenters during the initial days of Wave 2 (categorized by the spatial models as actively propagating from July 3, 2021, to September 4, 2021) set the stage for a phenomenally rapid geographic expansion.¹

Comparative Analysis: The Velocity of Viral Expansion

By applying the rigorous definitions of areal expansion and wavefront speed to the smoothed daily spatial data, researchers generated definitive quantitative comparisons between the two waves. The resulting data clearly demonstrates that the Delta-driven Wave 2 was a significantly faster and more geographically aggressive biological event than the wild-type/Alpha wave that preceded it.¹

Territorial Acquisition: Areal Expansion Rates

The rate at which an epidemic consumes geographic territory is a critical metric for understanding its impact on regional healthcare systems. The areal expansion rate calculations reveal the explosive nature of the second wave.

Table 1: Comparative Areal Expansion Rates of SARS-CoV-2 Epidemic Waves

Spatial Metric	Wave 1 (Fall/Winter 2020)	Wave 2 (Summer 2021)
Predominant Viral Variant	Wild-type, transitioning to Alpha	Delta (B.1.617.2)
Initial Area Affected	134,200 square kilometers	23,100 square kilometers
Maximum Area Encompassed	6,515,300 square kilometers	7,573,500 square kilometers
Mean Daily Areal Expansion	101,287 km²/day	119,848 km²/day
Median Daily Areal Expansion	64,900 km²/day	69,300 km²/day
Interquartile Range (IQR)	20,350 to 199,650 km²/day	29,700 to 192,500 km²/day

Data reflects territorial acquisition based on the >190 infections per 100,000 threshold across the standardized hexagonal grid.¹

As demonstrated in Table 1, while both waves eventually consumed millions of square kilometers of the contiguous United States, Wave 2 acquired territory at a significantly faster pace. The mean areal expansion rate of Wave 2 exceeded that of Wave 1 by over 18,500 square kilometers every single day.⁴ This accelerated territorial acquisition is a direct mathematical consequence of the multiple seeding events described earlier. An epidemic expanding contiguously from a single central point is limited by the circumference of that singular expanding circle. In contrast, when the Delta variant seeded multiple independent epicenters in the Ozarks, the Pacific Northwest, and elsewhere, it effectively created multiple expanding perimeters simultaneously. The aggregate addition of these multiple expanding boundaries resulted in a vastly larger daily increase in total high-transmission area.
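The geometric argument can be made explicit with an idealized calculation. The assumptions are not from the source: each focus is a circle expanding at a constant, isotropic wavefront speed v, and the foci do not overlap.

```latex
\begin{aligned}
\text{single focus:}\quad & A = \pi r^{2}, \qquad \frac{dA}{dt} = 2\pi r\, v, \\
\text{$n$ foci:}\quad & \frac{dA}{dt} = 2\pi v \sum_{k=1}^{n} r_k, \\
\text{same total area $A$:}\quad & P_{1} = 2\sqrt{\pi A}, \qquad P_{n} = 2\sqrt{n\,\pi A} = \sqrt{n}\; P_{1}.
\end{aligned}
```

In this idealization, splitting the same affected area across n equal foci multiplies the total perimeter, and therefore the areal expansion rate at a fixed wavefront speed, by the square root of n. Real wavefronts are irregular and eventually merge, so the factor is only indicative.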

Linear Velocity: Measuring Wavefront Speed

While areal expansion measures how much territory the outbreak covers, wavefront speed measures the linear velocity of the virus's advance toward uninfected communities. Here again, the spatial data indicate that the Delta-driven wave moved faster on average.

Table 2: Overall Mean Wavefront Velocities

Velocity Metric	Wave 1	Wave 2
Overall Mean Wavefront Speed	20.3 km/day	25.9 km/day
Overall Median Wavefront Speed	20.6 km/day	20.6 km/day
Overall Interquartile Range (IQR)	0 to 35.6 km/day	0 to 35.6 km/day
IQR of Daily Mean Speeds	4.4 to 33.4 km/day	7.0 to 32.9 km/day
Max Length of High-Speed Wavefront	7,017 kilometers	9,157 kilometers

Data represents the calculated orthogonal advance of the high-transmission boundary across all spatial geometries during the analysis periods.¹

Across all tracked geographies and timeframes, Wave 2 advanced at an overall mean speed of 25.9 kilometers per day, significantly outpacing the 20.3 kilometers per day calculated for Wave 1.¹ Furthermore, the sheer physical size of the high-velocity portion of the wave was far larger during the second surge. To quantify the intensity of the advancing edge, researchers calculated the total length of the wavefront that was physically moving at a rate greater than or equal to the median of the daily maximum estimated speeds. At its most aggressive point—18 days before its national infection peak—Wave 1 featured a contiguous high-speed wavefront stretching 7,017 kilometers.¹ In stark contrast, 9 days before its national peak, the Delta-driven Wave 2 maintained a high-speed boundary spanning an astonishing 9,157 kilometers across the continent.¹
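The high-speed wavefront length described above can be sketched as follows. This is an illustration under stated assumptions, not the study's code: it assumes each day's wavefront has already been split into segments with a known length and speed, and the function name, data layout, and toy values are hypothetical.

```python
from statistics import median

def high_speed_front_length(daily_segments):
    """Total length of fast-moving wavefront per day.

    daily_segments: {day: [(length_km, speed_km_per_day), ...]}
    The threshold is the median, across days, of each day's maximum
    segment speed. Returns (threshold, {day: length of segments whose
    speed is >= threshold}).
    """
    daily_max = [
        max(speed for _, speed in segments)
        for segments in daily_segments.values()
        if segments
    ]
    threshold = median(daily_max)
    lengths = {
        day: sum((length for length, speed in segments if speed >= threshold), 0.0)
        for day, segments in daily_segments.items()
    }
    return threshold, lengths

# Toy data (hypothetical): three days, a few segments each.
toy = {
    "day1": [(100.0, 5.0), (50.0, 30.0)],
    "day2": [(120.0, 40.0), (80.0, 45.0), (60.0, 10.0)],
    "day3": [(200.0, 60.0), (150.0, 20.0)],
}
threshold, lengths = high_speed_front_length(toy)
print(threshold, lengths)
```

The reported figures of 7,017 and 9,157 kilometers would correspond to the largest daily value of such a per-day length series for each wave.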

The Acceleration Profile of Epidemic Surges

Epidemic waves do not travel at a constant velocity; they gather momentum as the sheer number of infectious hosts increases, providing exponentially more vectors for forward transmission. To track this acceleration, researchers partitioned the timeline leading up to the absolute peak of each respective wave into distinct 21-day chronological intervals.
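A minimal sketch of this partitioning step is shown below. The interval bounds follow those listed in Table 3; the function name, the input layout (one mean speed per day), and the synthetic series are assumptions made for illustration only.

```python
from collections import defaultdict

# Interval bounds in days before the peak (inclusive), as listed in Table 3.
INTERVALS = [(63, 43), (42, 22), (21, 0)]

def mean_speed_by_interval(daily_speeds, peak_day):
    """Average daily wavefront speeds within each pre-peak interval.

    daily_speeds: {day_index: mean wavefront speed in km/day}
    peak_day: day index of the wave's infection peak
    Returns {(start, end): mean speed} for intervals that contain data.
    """
    buckets = defaultdict(list)
    for day, speed in daily_speeds.items():
        days_before_peak = peak_day - day
        for start, end in INTERVALS:
            if end <= days_before_peak <= start:
                buckets[(start, end)].append(speed)
                break
    return {interval: sum(v) / len(v) for interval, v in buckets.items()}

# Synthetic (hypothetical) series: speed rises linearly toward a peak on day 63.
synthetic = {day: 0.5 * day for day in range(0, 64)}
result = mean_speed_by_interval(synthetic, peak_day=63)
for interval in INTERVALS:
    print(interval, result[interval])
```

Days after the peak, or more than 63 days before it, fall outside every interval and are ignored.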

Table 3: Wavefront Acceleration Approaching the Infection Peak

Chronological Interval	Wave 1 Mean Speed (km/day)	Wave 2 Mean Speed (km/day)
63 to 43 Days Prior to Peak	3.8	6.8
42 to 22 Days Prior to Peak	17.0	19.8
21 to 0 Days Prior to Peak	33.9	40.8

Data reflects the sequential increase in mean daily velocity as the outbreaks matured toward their maximum incidence levels.¹

Table 3 shows steep acceleration in both epidemics: between the earliest and latest intervals, mean wavefront speed rose roughly ninefold for Wave 1 (3.8 to 33.9 km/day) and sixfold for Wave 2 (6.8 to 40.8 km/day). It also shows that during every interval leading up to the apex of the infection curve, Wave 2 was moving faster across the geographic landscape than Wave 1.¹ In the final three weeks before the national crest, the Wave 2 wavefront was advancing at nearly 41 kilometers per day.⁵ This represents an aggressive, unrelenting diffusion through American communities, demonstrating that a pathogen optimized for high viral shedding can physically outpace earlier iterations of the same virus, even when facing a population with a much higher degree of baseline immunity.
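The comparison in Table 3 can be reproduced with a few lines of arithmetic. The speeds below are copied from the table; the helper name is an illustrative choice.

```python
# Mean wavefront speeds (km/day) from Table 3, earliest to latest interval.
wave1 = [3.8, 17.0, 33.9]
wave2 = [6.8, 19.8, 40.8]

def step_changes(speeds):
    """Absolute (km/day) and relative changes between consecutive intervals."""
    pairs = list(zip(speeds, speeds[1:]))
    diffs = [later - earlier for earlier, later in pairs]
    ratios = [later / earlier for earlier, later in pairs]
    return diffs, ratios

for name, speeds in (("Wave 1", wave1), ("Wave 2", wave2)):
    diffs, ratios = step_changes(speeds)
    print(name, [round(d, 1) for d in diffs], [round(r, 2) for r in ratios])

# Ratio of Wave 2 to Wave 1 speed within each interval.
w2_over_w1 = [round(b / a, 2) for a, b in zip(wave1, wave2)]
print("Wave 2 / Wave 1:", w2_over_w1)
```

Wave 2 exceeds Wave 1 in all three intervals, by a factor of about 1.2 to 1.8.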

Methodological Limitations and Alternative Surveillance Frameworks

While the application of the BYM2 spatial smoothing model to the COVIDestim nowcast data provides unprecedented insights into the physical movement of the SARS-CoV-2 virus, the methodology is not without limitations. Rigorous academic peer review of these spatial modeling techniques has identified several areas where the mathematical definitions may obscure underlying complexities.

Criticisms of Global Speed Metrics and Orthogonal Assumptions

One primary limitation highlighted in scientific discourse is that calculating a single, global "mean daily wavefront speed" can be overly simplistic and may fail to fully capture the extreme regional heterogeneity of the epidemic.⁶ The United States is a vast, geographically diverse nation; a global mean speed calculation mathematically averages the rapid spread occurring in densely populated urban corridors with the significantly slower spread occurring across natural barriers or sparse rural counties. While global metrics are useful for broad comparisons between variants, they do not always provide actionable intelligence regarding localized events.²⁸

Furthermore, the geometric calculation of wavefront speed relies on measuring the straight-line, orthogonal distance between the wavefronts of consecutive days.²⁸ Critics point out that this mathematical definition can occasionally produce inconsistencies when compared to total elapsed time. For instance, if an overall median speed suggests an advance of only a few kilometers per day, this mathematically conflicts with the observed reality that the virus traversed the entire continent in approximately 45 days, suggesting that the orthogonal component measurement may underestimate the rapid jumps in transmission facilitated by long-distance travel networks.²⁸ Finally, peer reviewers have noted that defining the waves entirely through a spatial boundary threshold (e.g., >190 infections per 100,000) while describing them purely in temporal terms (e.g., September 2020 to February 2021) creates a slight semantic disconnect; a truly comprehensive definition should integrate both the spatial and temporal parameters simultaneously.³
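The tension the critics describe can be checked with simple arithmetic. The sketch below assumes an east-west extent of roughly 4,500 km for the contiguous United States, an approximation introduced here rather than a figure from the study, and uses the 45-day crossing time quoted above together with the median speed from Table 2.

```python
def implied_crossing_speed(distance_km: float, days: float) -> float:
    """Average speed (km/day) needed to cover `distance_km` in `days`."""
    return distance_km / days

APPROX_US_WIDTH_KM = 4_500.0   # assumption: rough east-west extent, not from the study
CROSSING_DAYS = 45.0           # crossing time cited in the critique
TABLE2_MEDIAN_KM_PER_DAY = 20.6

implied = implied_crossing_speed(APPROX_US_WIDTH_KM, CROSSING_DAYS)
days_at_median = APPROX_US_WIDTH_KM / TABLE2_MEDIAN_KM_PER_DAY

print(f"implied average speed: {implied:.0f} km/day")
print(f"days to cross at the median front speed: {days_at_median:.0f}")
```

Under these assumptions a 45-day crossing implies an average of about 100 km/day, several times the median front speed, which is consistent with the critics' point that long-distance jumps are not captured by the orthogonal advance of a contiguous front.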

The Future of Spatial Tracking: Wastewater-Based Epidemiology

As traditional public health notification data becomes increasingly unreliable due to the widespread adoption of at-home rapid antigen testing (which is rarely reported to health departments), spatial epidemiologists are increasingly relying on alternative data streams to fuel their models.

Wastewater-based epidemiology has emerged as a powerful complementary tool for tracking spatiotemporal disease dynamics.²⁹ By continuously monitoring municipal wastewater treatment plants, researchers can measure the concentration of SARS-CoV-2 RNA shed within a given catchment area, largely bypassing the biases of human test-seeking behavior and clinical reporting delays.²⁹

Studies evaluating wastewater data across multiple regional epidemic waves have demonstrated that regression models can quantify the relationship between viral RNA concentrations in sewage and the estimated prevalence of the disease in the contributing community.²⁹ For example, comprehensive evaluations in Saxony, Germany, revealed that median viral loads per diagnosed case differed among specific treatment sites. Even so, the inclusion of specific lag and lead times, flow discharge rates, and log-log-transformed data allowed models to achieve a satisfactory goodness-of-fit, enabling back-estimation of disease prevalence.²⁹ Integrating this geographically anchored wastewater data into advanced Bayesian spatial models like BYM2 represents the next frontier in quantifying epidemic wavefronts, providing a continuous stream of spatial data that does not depend on test-seeking behavior or on the reporting delays that affect traditional surveillance.
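A log-log regression with a time lag, of the general kind described above, can be sketched as follows. This is an illustrative ordinary-least-squares fit on synthetic data, not the model used in the Saxony study; the function name, the assumption that viral loads are already flow-normalized, and all numeric values are hypothetical.

```python
import numpy as np

def fit_loglog_with_lag(viral_load, prevalence, lag_days):
    """Fit log10(prevalence[t]) = a + b * log10(viral_load[t - lag_days]).

    A positive lag means the wastewater signal leads prevalence.
    Returns (a, b, r_squared).
    """
    load = np.asarray(viral_load, dtype=float)
    prev = np.asarray(prevalence, dtype=float)
    if lag_days > 0:
        load, prev = load[:-lag_days], prev[lag_days:]
    elif lag_days < 0:
        load, prev = load[-lag_days:], prev[:lag_days]
    x, y = np.log10(load), np.log10(prev)
    b, a = np.polyfit(x, y, 1)          # slope first, then intercept
    residuals = y - (a + b * x)
    r_squared = 1.0 - residuals.var() / y.var()
    return a, b, r_squared

# Synthetic (hypothetical) data: prevalence follows load with a 3-day delay.
rng = np.random.default_rng(0)
load = 10 ** rng.uniform(3, 6, size=60)
prev = np.empty(60)
prev[3:] = 1e-4 * load[:-3] ** 0.8
prev[:3] = prev[3]                      # filler for days without a lagged load

a, b, r2 = fit_loglog_with_lag(load, prev, lag_days=3)
best_lag = max(range(-7, 8), key=lambda k: fit_loglog_with_lag(load, prev, k)[2])
print(round(a, 3), round(b, 3), round(r2, 3), best_lag)
```

Scanning candidate lags and keeping the one with the highest R² is one simple way to choose a lag or lead time per site; a real analysis would also need to handle measurement noise and site-specific differences.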

Synthesis and Implications for Pandemic Preparedness

The exhaustive quantification of the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 in the United States fundamentally advances the science of epidemiology. By abandoning flawed raw case counts in favor of Bayesian nowcasting frameworks, and utilizing standardized hexagonal population grids paired with sophisticated BYM2 spatial smoothing models, researchers have successfully extracted the true physical velocity of a moving pathogen.¹

The resulting data shatters the assumption that population immunity alone dictates the speed of an epidemic. The comparison between Wave 1 and Wave 2 conclusively demonstrates that the physical speed and areal expansion of a virus are determined by a complex interplay between the pathogen's evolving biological characteristics and the sociological environment it exploits.¹ Wave 1, fueled by wild-type and Alpha lineages, demonstrated how singular mass gatherings like the Sturgis Motorcycle Rally can generate massive contagion externalities, acting as spatial catalysts that seed expanding wavefronts across multiple state lines.¹

However, Wave 2 revealed a far more aggressive dynamic. The biological evolution of the Delta variant—which generated an unprecedented 6.2-fold increase in viral shedding compared to Alpha—coincided directly with a decentralized, fragmented rollback of public health mandates across the United States.⁶ This convergence of a hyper-transmissible variant with relaxed masking policies and restored domestic travel networks resulted in a decentralized wave origin, with simultaneous epicenters emerging from the Ozarks to the Pacific Northwest.⁵ The quantitative result was an epidemic surge that acquired geographic territory significantly faster than the first wave, achieving wavefront speeds exceeding 40 kilometers per day as it approached its apex, despite the widespread availability of vaccines.¹

Understanding these spatiotemporal dynamics is not merely a retrospective academic exercise; it represents the foundation of a proactive, predictive public health infrastructure. If epidemiologists can accurately calculate the current speed and trajectory of an advancing viral wavefront, public health responses can transition from reactive, universal lockdowns to highly targeted, spatial interventions. By preemptively deploying medical resources, increasing testing capacity, and instituting temporary mitigation protocols strictly in the geographic zones immediately ahead of the advancing wave, authorities can effectively blunt the areal expansion of future pathogens, minimizing both loss of life and broad societal disruption.¹

Works cited

1. Quantifying the spatiotemporal dynamics of the first two epidemic ..., accessed March 5, 2026, https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1013983
2. Quantifying the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 infections in the United States | medRxiv, accessed March 5, 2026, https://www.medrxiv.org/content/10.1101/2025.01.08.24319433v4
3. Quantifying the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 infections in the United States - medRxiv.org, accessed March 5, 2026, https://www.medrxiv.org/content/10.1101/2025.01.08.24319433v3.full.pdf
4. Quantifying the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 infections in the United States - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC12959703/
5. Quantifying the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 infections in the United States - medRxiv, accessed March 5, 2026, https://www.medrxiv.org/content/10.1101/2025.01.08.24319433v4.full.pdf
6. Article Summary Line: We identified, visualized, and estimated the speed and spatial extent of the two largest US SARS-CoV-2 wav - medRxiv.org, accessed March 5, 2026, https://www.medrxiv.org/content/10.1101/2025.01.08.24319433v1.full.pdf
7. The early SARS-CoV-2 epidemic in Senegal was driven by the local emergence of B.1.416 and the introduction of B.1.1.420 from Europe - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC8971539/
8. COVID-19 pandemic in the United States - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC7451131/
9. The reproductive number of the Delta variant of SARS-CoV-2 is far higher compared to the ancestral SARS-CoV-2 virus | Journal of Travel Medicine | Oxford Academic, accessed March 5, 2026, https://academic.oup.com/jtm/article/28/7/taab124/6346388
10. Comparative transmissibility of SARS-CoV-2 variants Delta and Alpha in New England, USA, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC8913280/
11. Viral shedding patterns of symptomatic SARS-CoV-2 infections by periods of variant predominance and vaccination status in Gyeonggi Province, Korea - Epidemiology and Health, accessed March 5, 2026, https://www.e-epih.org/journal/view.php?doi=10.4178/epih.e2023008
12. Detecting changes in generation and serial intervals under varying pathogen biology, contact patterns and outbreak response - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC10990235/
13. Correlation of viral loads in disease transmission could affect early estimates of the reproduction number - Royal Society Publishing, accessed March 5, 2026, https://royalsocietypublishing.org/rsif/article/20/202/20220827/90395/Correlation-of-viral-loads-in-disease-transmission
14. A Timeline of COVID-19 Developments in 2020 - AJMC, accessed March 5, 2026, https://www.ajmc.com/view/a-timeline-of-covid19-developments-in-2020
15. Behind the Model: Nowcasting | CFA - CDC, accessed March 5, 2026, https://www.cdc.gov/cfa-behind-the-model/php/data-research/nowcasting.html
16. Baseline nowcasting methods for handling delays in epidemiological data., accessed March 5, 2026, https://wellcomeopenresearch.org/articles/10-614
17. covidestim: A Vital Tool for Pandemic Response | Yale School of Public Health, accessed March 5, 2026, https://ysph.yale.edu/research/research-centers-and-initiatives/public-health-modeling-unit/analyses/covidestim-a-vital-tool-for-pandemic-response/
18. Reconstructing the course of the COVID-19 epidemic over 2020 for US states and counties: Results of a Bayesian evidence synthesis model - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC9467347/
19. An introduction to bayesian spatial smoothing methods for disease mapping: modeling county firearm suicide mortality rates - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC12096304/
20. Quantifying the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 infections in the United States | PLOS Computational Biology, accessed March 5, 2026, https://journals.plos.org/ploscompbiol/article/figures?id=10.1371/journal.pcbi.1013983
21. Spatial Difference-in-Differences with Bayesian Disease Mapping Models - ResearchGate, accessed March 5, 2026, https://www.researchgate.net/publication/395403833_Spatial_Difference-in-Differences_with_Bayesian_Disease_Mapping_Models
22. An intuitive Bayesian spatial model for disease mapping that accounts for scaling - Ovid, accessed March 5, 2026, https://www.ovid.com/journals/smmr/fulltext/10.1177/0962280216660421~an-intuitive-bayesian-spatial-model-for-disease-mapping-that
23. Bayesian Hierarchical Spatial Models: Implementing the Besag York Mollié Model in Stan - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC6830524/
24. Full article: Spatial statistical modelling of insurance risk: a spatial epidemiological approach to car insurance - Taylor & Francis, accessed March 5, 2026, https://www.tandfonline.com/doi/full/10.1080/03461238.2019.1576146
25. Week 6 Disease Mapping III: Introduction to Fully Bayesian mapping | EPI 563: Spatial Epidemiology, Fall 2023 - GitHub Pages, accessed March 5, 2026, https://mkram01.github.io/EPI563-SpatialEPI/disease-mapping-iii-introduction-to-fully-bayesian-mapping.html
26. News Scan for Apr 29, 2021 - CIDRAP, accessed March 5, 2026, https://www.cidrap.umn.edu/news-scan-apr-29-2021
27. Emergence of an early SARS-CoV-2 epidemic in the United States - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC8313480/
28. Quantifying the spatiotemporal dynamics of the first two epidemic waves of SARS-CoV-2 infections in the United States | PLOS Computational Biology - Research journals, accessed March 5, 2026, https://journals.plos.org/ploscompbiol/article/peerReview?id=10.1371/journal.pcbi.1013983
29. Regional and temporal differences in the relation between SARS-CoV-2 biomarkers in wastewater and estimated infection prevalence – Insights from long-term surveillance - PMC, accessed March 5, 2026, https://pmc.ncbi.nlm.nih.gov/articles/PMC9554318/