How socalwhales.org collects data and scores conditions
This page explains where the numbers on socalwhales.org come from, how they are collected, and how the "Good Day to Go" score is calculated. Transparency about data sourcing and scoring is especially important on a site with affiliate booking links — you should be able to verify independently that the score is based on objective signals, not commercial arrangements.
Sighting counts come from the public trip-report pages published by Southern California whale-watching operators. Each operator posts a daily or per-trip summary on their website describing which species were spotted and in what numbers. An automated scraper reads these pages once per day and extracts the sighting data.
Operators monitored:
For each operator, the scraper captures: the date of the trip report, the species observed (blue whale, humpback, gray whale, fin whale, common dolphin, bottlenose dolphin, and others as reported), and the sighting count per species. Where an operator lists multiple trips per day, counts are summed across all trips for that date.
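The per-date summing described above can be sketched as follows. This is a minimal illustration, not the site's actual scraper code: the function name `sum_daily_counts` and the shape of the report records (a `date` string plus a `sightings` mapping) are assumptions made for the example, and the sample numbers are invented.

```python
from collections import defaultdict


def sum_daily_counts(trip_reports):
    """Sum sighting counts per (date, species) across all of one operator's trips.

    `trip_reports` is a hypothetical list of dicts shaped like
    {"date": "YYYY-MM-DD", "sightings": {species: count}}.
    """
    totals = defaultdict(int)
    for report in trip_reports:
        for species, count in report["sightings"].items():
            totals[(report["date"], species)] += count
    return dict(totals)


# Invented sample data: two trips on one date, one trip on the next.
reports = [
    {"date": "2024-06-01", "sightings": {"blue whale": 2, "common dolphin": 150}},
    {"date": "2024-06-01", "sightings": {"blue whale": 1, "humpback": 3}},
    {"date": "2024-06-02", "sightings": {"humpback": 4}},
]
daily = sum_daily_counts(reports)
```

Keying the totals on the (date, species) pair keeps trips from different days separate while merging multiple trips on the same day.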
No private data or paid APIs are used. The scraper reads only information that is already publicly visible on each operator's website — the same trip reports a visitor would read manually. If a source page changes its format, the scraper may temporarily produce incomplete data until the extraction logic is updated.
Data is baked in at build time, not fetched live. When the site is rebuilt each day, the latest scraped data is embedded directly into the static HTML. There is no server-side code running on each page load — the numbers you see were current at the time of the most recent build. This means there can be a gap of up to 24 hours between a new trip report being published by an operator and appearing on this site.
Species identification follows the operator's own report language. The scraper maps common variations ("blue whales", "blues", "BW") to a canonical species ID. Where an operator reports "whales seen" without specifying a species, the count is recorded under an unclassified category and included in total sighting counts but not attributed to a specific species.
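A rough sketch of this kind of alias mapping is shown below. The alias table, the canonical IDs such as `blue_whale`, and the function name `canonical_species` are all hypothetical. Sending every unrecognized label to the unclassified category is also an assumption of this example; the text above only describes that treatment for reports that name no species.

```python
import re

# Hypothetical alias table; the real scraper's table is not published here.
SPECIES_ALIASES = {
    "blue whale": "blue_whale",
    "blue whales": "blue_whale",
    "blues": "blue_whale",
    "bw": "blue_whale",
    "humpback": "humpback_whale",
    "humpbacks": "humpback_whale",
    "gray whale": "gray_whale",
    "gray whales": "gray_whale",
}
UNCLASSIFIED = "unclassified"


def canonical_species(label):
    """Map an operator's species label to a canonical ID.

    Labels are lower-cased and whitespace-normalized before lookup.
    Anything not in the alias table (including generic phrases such as
    "whales seen") falls back to the unclassified category.
    """
    key = re.sub(r"\s+", " ", label.strip().lower())
    return SPECIES_ALIASES.get(key, UNCLASSIFIED)


def totals_by_species(raw_counts):
    """Aggregate {label: count} into {canonical_id: count}."""
    totals = {}
    for label, count in raw_counts.items():
        species_id = canonical_species(label)
        totals[species_id] = totals.get(species_id, 0) + count
    return totals


example = totals_by_species({"Blues": 2, "BW": 1, "whales seen": 4})
grand_total = sum(example.values())  # unclassified counts still contribute
```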
Marine and wind forecast data come from Open-Meteo (open-meteo.com): wave data from its marine API and wind data from its standard weather API. Open-Meteo is a free, open-source weather service that provides forecast data derived from the GFS and ICON global forecast models. No account, API key, or subscription is required to use it.
The scraper requests a 7-day hourly forecast for a single fixed grid point at approximately 33.6°N, 118.0°W, offshore of Newport Beach in the Southern California Bight. This point represents typical conditions for whale-watching trips departing from Newport Beach, Dana Point, and Long Beach. Conditions in San Diego harbor may differ from this grid point.
Variables pulled from the forecast: wave height (m), wave period, and wind speed at 10 m above sea level (km/h).
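As an illustration, a request for this grid point might be built like the sketch below. The helper name `build_forecast_url` and the choice of the `timezone` value are assumptions of the example; the endpoint and the `latitude`, `longitude`, `hourly`, and `forecast_days` parameters follow Open-Meteo's public documentation. The code only constructs the URL and makes no network call.

```python
from urllib.parse import urlencode

MARINE_URL = "https://marine-api.open-meteo.com/v1/marine"
LATITUDE = 33.6
LONGITUDE = -118.0  # west longitudes are expressed as negative numbers


def build_forecast_url(base_url, hourly_variables, days=7):
    """Build an Open-Meteo hourly forecast URL for the fixed grid point."""
    params = {
        "latitude": LATITUDE,
        "longitude": LONGITUDE,
        "hourly": ",".join(hourly_variables),
        "forecast_days": days,
        # Assumption: timestamps are requested in local Pacific time.
        "timezone": "America/Los_Angeles",
    }
    return f"{base_url}?{urlencode(params)}"


marine_url = build_forecast_url(MARINE_URL, ["wave_height", "wave_period"])
```

Passing the base URL as an argument lets the same helper serve any other Open-Meteo endpoint that accepts the same parameters.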
The scraper calls two separate Open-Meteo endpoints: the marine API for wave height and wave period, and the standard weather API for wind speed. The hourly results from both calls are merged on the timestamp field to produce a single combined hourly dataset. The daily summary values displayed on the site (e.g. "wave height: 1.2 m") are derived from the midday hours of each forecast day.
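The merge and the midday summary could look roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, the input dicts mimic the `hourly` object in Open-Meteo responses (parallel arrays keyed by variable name), and the 10:00 to 14:00 window with a simple mean is a guess at what "midday hours" means, since the exact window is not stated here. The sample values are invented.

```python
from statistics import mean


def merge_hourly(marine, weather):
    """Join two Open-Meteo style 'hourly' dicts on their 'time' arrays.

    Only timestamps present in both inputs are kept.
    """
    wind_by_time = dict(zip(weather["time"], weather["wind_speed_10m"]))
    rows = []
    for t, height, period in zip(
        marine["time"], marine["wave_height"], marine["wave_period"]
    ):
        if t in wind_by_time:
            rows.append({
                "time": t,
                "wave_height": height,
                "wave_period": period,
                "wind_speed": wind_by_time[t],
            })
    return rows


def midday_summary(rows, date, start_hour=10, end_hour=14):
    """Average the rows for `date` whose hour lies in [start_hour, end_hour].

    The window is an assumption; timestamps look like "2024-06-01T12:00".
    """
    selected = [
        r for r in rows
        if r["time"].startswith(date)
        and start_hour <= int(r["time"][11:13]) <= end_hour
    ]
    if not selected:
        return None
    return {
        "wave_height": round(mean(r["wave_height"] for r in selected), 2),
        "wind_speed": round(mean(r["wind_speed"] for r in selected), 1),
    }


# Invented sample data with partially overlapping timestamps.
marine = {
    "time": ["2024-06-01T09:00", "2024-06-01T10:00",
             "2024-06-01T12:00", "2024-06-01T14:00"],
    "wave_height": [1.0, 1.1, 1.2, 1.3],
    "wave_period": [8.0, 8.0, 9.0, 9.0],
}
weather = {
    "time": ["2024-06-01T10:00", "2024-06-01T12:00",
             "2024-06-01T14:00", "2024-06-01T15:00"],
    "wind_speed_10m": [10.0, 14.0, 18.0, 22.0],
}
rows = merge_hourly(marine, weather)
summary = midday_summary(rows, "2024-06-01")
```

A production version would also need to handle null values, which forecast APIs can return for individual hours.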
Wave period is collected but is not currently used in the Good Day to Go score — it is retained for potential future use as a comfort indicator (short choppy periods are less comfortable than long rolling swells even at the same wave height).
The Good Day to Go indicator is a composite score that combines three objective signals into a single 0–100 number. It is designed to answer one practical question: "Is today — or this coming weekend — a good time to book a whale-watching trip in Southern California?"
1. Recent sightings (last 48 hours) — Higher is better. The total number of whale and dolphin sightings reported by all monitored operators in the 48 hours before the current build date. This is the strongest signal in the score: if multiple operators have reported large numbers of animals in the last two days, the animals are likely still in the area. A long gap with no operator reports is a meaningful negative signal even when the weather is good.
2. Wave height — Lower is better. Wave height in meters at the Newport Beach offshore grid point, forecast for the target date. Whale-watching vessels operate comfortably in seas under 1.5 m. Above 2.0 m, many trips are uncomfortable for passengers; above 3.0 m, some operators cancel. The score applies a penalty that increases non-linearly as wave height rises past a comfort threshold.
3. Wind speed — Lower is better. Wind speed in km/h at 10 m above sea level, forecast for the target date. Calm or light winds (under 20 km/h) have negligible impact on the score. Moderate winds (20–35 km/h) apply a mild penalty. Strong winds above 35 km/h can make spotting difficult due to surface chop and whitecaps, and apply a larger penalty.
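To make the shape of such a score concrete, here is one possible way to combine the three signals. It is an illustration only and not the formula used by the site: every weight, cap, and curve below is a hypothetical choice, including the 200-sighting saturation point, the quadratic wave penalty, and the penalty sizes. Only the 1.5 m comfort level and the 20 and 35 km/h wind bands are taken from the descriptions above.

```python
def sightings_points(count_48h, saturation=200):
    """Base points from recent sightings: linear up to a hypothetical cap."""
    return 100.0 * min(count_48h, saturation) / saturation


def wave_penalty(wave_height_m, comfort_m=1.5, max_penalty=40.0):
    """No penalty below the comfort threshold, then quadratic growth.

    The quadratic curve and the 40-point cap (reached at 3.0 m) are
    assumptions chosen to show a non-linear penalty.
    """
    excess = max(0.0, wave_height_m - comfort_m)
    return min(max_penalty, max_penalty * (excess / 1.5) ** 2)


def wind_penalty(wind_kmh):
    """Step penalty using the 20 and 35 km/h bands; sizes are hypothetical."""
    if wind_kmh < 20:
        return 0.0
    if wind_kmh <= 35:
        return 5.0
    return 20.0


def good_day_score(count_48h, wave_height_m, wind_kmh):
    """Combine the three signals and clamp the result to the 0-100 range."""
    raw = (
        sightings_points(count_48h)
        - wave_penalty(wave_height_m)
        - wind_penalty(wind_kmh)
    )
    return round(max(0.0, min(100.0, raw)))


calm_busy_day = good_day_score(count_48h=200, wave_height_m=1.0, wind_kmh=10)
rough_busy_day = good_day_score(count_48h=200, wave_height_m=3.0, wind_kmh=40)
```

Starting from sightings and subtracting weather penalties reflects the statement that recent sightings are the strongest signal: with no sightings, good weather alone cannot lift this sketch's score above zero.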
The composite score is expressed on a 0–100 scale and mapped to four plain-English tiers:
The score is an estimate based on recent patterns and weather forecasts — it is not a guarantee. Several factors can cause the score to be misleading:
If you notice the score behaving unexpectedly — for example, showing "Great" during a period when no operators are reporting sightings — please get in touch so the scraper logic can be reviewed.