You push a deploy at 2pm on a Tuesday. Everything looks fine. The build passes, the tests are green, and your team moves on. But by Thursday, users are complaining that the checkout page feels sluggish. Your average load time has climbed from 1.1 seconds to 3.4 seconds - and nobody noticed until real people started bouncing.
This is exactly the scenario that website performance monitoring with synthetic tests is designed to prevent. And it's more common than most teams admit.
What Synthetic Monitoring Actually Is
Synthetic monitoring runs scripted, automated tests against your site at regular intervals - before and after deploys, and continuously in the background. Unlike real-user monitoring (RUM), which collects data only when actual visitors arrive, synthetic tests run on a fixed schedule regardless of traffic. That means you get consistent, comparable data every time.
Think of it as a robot that visits your site every 60 seconds, loads specific pages, measures every timing metric, and tells you immediately if anything changes. It doesn't care if it's 3am on a Sunday. It checks anyway.
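A minimal sketch of that robot, assuming Node.js 18 or later (for the global `fetch`). The `probe` and `schedule` helpers and the URL are hypothetical names for illustration. The time until response headers arrive is only a rough stand-in for TTFB; a real synthetic tool drives a full browser and records far more.

```javascript
// Hypothetical helper: times a single request. `fetchImpl` is injectable
// so the probe can be exercised without touching the network.
async function probe(url, fetchImpl = fetch) {
  const start = performance.now();
  const response = await fetchImpl(url);        // resolves once headers arrive
  const headersMs = performance.now() - start;  // rough proxy for TTFB
  const body = await response.arrayBuffer();    // download the full body
  const totalMs = performance.now() - start;
  return {
    url,
    status: response.status,
    headersMs,
    totalMs,
    bytes: body.byteLength,
    checkedAt: new Date().toISOString(),
  };
}

// Runs `probe` on a fixed interval, regardless of traffic.
// Returns a function that stops the schedule.
function schedule(url, intervalMs, onResult, fetchImpl = fetch) {
  const tick = () =>
    probe(url, fetchImpl).then(onResult, (err) =>
      onResult({ url, error: String(err) })
    );
  tick(); // first check immediately, then on the interval
  const timer = setInterval(tick, intervalMs);
  return () => clearInterval(timer);
}

// Usage (placeholder URL, one check every 60 seconds):
// const stop = schedule('https://example.com/checkout/', 60_000, console.log);
```

Storing each result with its `checkedAt` timestamp is what makes the data comparable later, since every sample was taken the same way.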
Synthetic vs. Real-User Monitoring: Both Matter
RUM is great for understanding how diverse real-world conditions - slow mobile connections, old browsers, geographic distance - affect your actual users. But it has a serious blind spot: it can't tell you something is broken until users experience it.
Synthetic monitoring fills that gap. It creates a controlled, repeatable baseline. When a regression happens, you know exactly when it started and - because the test environment is consistent - you can isolate the cause much faster.
A good approach uses both. Synthetic tests catch regressions early. RUM data tells you the real-world impact.
The Metrics Worth Tracking in a Synthetic Test
Not all performance metrics are equally useful for catching regressions. These are the ones that actually signal a problem:
- Time to First Byte (TTFB): If this spikes, the problem is almost always server-side - a slow database query, a new uncached API call, or a misconfigured server directive. A TTFB above 200ms on a cached page is worth investigating immediately.
- Largest Contentful Paint (LCP): This is the Core Web Vital that captures how fast a page feels to load. A change in LCP between two deploys usually points to a new render-blocking resource, a heavier hero image, or a missing preload hint. We covered the relationship between LCP and your hosting setup in detail in How Largest Contentful Paint Scores Expose the Real Bottlenecks in Your Hosting Setup.
- Total page weight: A sudden increase here - even 200KB - can point to an uncompressed asset, a new third-party script, or a misconfigured image pipeline.
- Number of requests: If your request count jumps after a deploy, something new is loading that wasn't there before.
- JavaScript execution time: A heavier JS bundle, a new library, or a poorly written render-blocking script will show up here before it kills your Core Web Vitals scores.
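To make those signals concrete, here is a rough sketch that compares two snapshots of the same page taken before and after a deploy. The field names and sample numbers are hypothetical; only the 200ms TTFB limit and the 200KB weight jump come from the list above. LCP and JavaScript execution time vary from run to run, so this sketch reports their deltas without flagging them automatically.

```javascript
// Thresholds quoted in the list above.
const TTFB_CACHED_LIMIT_MS = 200;
const WEIGHT_JUMP_BYTES = 200 * 1024;

// `before` and `after` are hypothetical snapshots of one page, produced
// by the same synthetic test on either side of a deploy.
function compareSnapshots(before, after) {
  const deltas = {};
  for (const key of Object.keys(before)) {
    deltas[key] = after[key] - before[key];
  }

  const flags = [];
  if (after.ttfbMs > TTFB_CACHED_LIMIT_MS) flags.push('ttfb');
  if (deltas.totalBytes >= WEIGHT_JUMP_BYTES) flags.push('page-weight');
  if (deltas.requestCount > 0) flags.push('request-count');
  return { deltas, flags };
}

// Made-up sample data for illustration.
const beforeDeploy = {
  ttfbMs: 120, lcpMs: 1800, totalBytes: 900000, requestCount: 42, jsExecutionMs: 310,
};
const afterDeploy = {
  ttfbMs: 140, lcpMs: 2600, totalBytes: 1150000, requestCount: 45, jsExecutionMs: 330,
};

const report = compareSnapshots(beforeDeploy, afterDeploy);
console.log(report.flags.join(', ')); // page-weight, request-count
console.log(report.deltas.lcpMs);     // 800
```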
Setting Up a Synthetic Monitoring Baseline
The most common mistake teams make is setting up synthetic monitoring after a problem occurs. By then, you have nothing to compare against.
Run your synthetic tests for at least a week before you treat any result as a regression baseline. This smooths out outliers - a blip from a transient network issue, for example - and gives you a realistic picture of your normal performance range.
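One way to turn a week of samples into a baseline is to summarize them with a median and a high percentile, which a single outlier barely moves. This is a sketch under assumptions: the `buildBaseline` helper, the choice of median and 90th percentile, and the sample values are all illustrative, not something your monitoring tool necessarily does.

```javascript
// Nearest-rank percentile of an unsorted array of numbers.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Summarizes a set of samples for one metric on one page.
function buildBaseline(samples) {
  return {
    count: samples.length,
    median: percentile(samples, 50),
    p90: percentile(samples, 90),
  };
}

// Hypothetical LCP samples in milliseconds; 4900 is a transient network blip.
const lcpSamples = [1100, 1150, 1080, 1120, 4900, 1090, 1130, 1110, 1140, 1100];
const lcpBaseline = buildBaseline(lcpSamples);
console.log(lcpBaseline.median); // 1110
console.log(lcpBaseline.p90);    // 1150
```

A plain mean of the same samples would come out near 1500ms, which is why the blip should not be allowed to define "normal".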
What Your Test Suite Should Cover
Cover more than just your homepage. The pages that matter most for regressions are usually the ones doing the most work:
- Your highest-traffic landing pages
- Any page with a database-heavy query (search, category pages, account dashboards)
- Checkout or conversion flows
- Any page that recently shipped new functionality
For each of these, run tests from at least two geographic locations - one close to your server and one farther away. The delta between those results tells you how well your CDN is actually performing.
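As a small illustration, the delta can be computed per page from the results of two test locations. The result shape, location names, and numbers below are hypothetical. A large gap on one page and a small gap on another may suggest that the first page is not being served from the CDN edge, though that is an inference to verify, not a guarantee.

```javascript
// Groups results by page and returns far-minus-near TTFB for each page
// that has a measurement from both locations.
function ttfbDeltaByPage(results, nearLocation, farLocation) {
  const byPage = new Map();
  for (const { page, location, ttfbMs } of results) {
    if (!byPage.has(page)) byPage.set(page, {});
    byPage.get(page)[location] = ttfbMs;
  }

  const deltas = {};
  for (const [page, timings] of byPage) {
    if (nearLocation in timings && farLocation in timings) {
      deltas[page] = timings[farLocation] - timings[nearLocation];
    }
  }
  return deltas;
}

// Made-up sample results from two locations.
const locationResults = [
  { page: '/', location: 'frankfurt', ttfbMs: 90 },
  { page: '/', location: 'sydney', ttfbMs: 160 },
  { page: '/checkout/', location: 'frankfurt', ttfbMs: 110 },
  { page: '/checkout/', location: 'sydney', ttfbMs: 640 },
];

const geoDeltas = ttfbDeltaByPage(locationResults, 'frankfurt', 'sydney');
console.log(geoDeltas['/']);          // 70
console.log(geoDeltas['/checkout/']); // 530
```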
How to Use Synthetic Data to Pinpoint a Regression
When a test fires an alert, the first thing to do is look at the timestamp and compare it to your deploy history. Most regressions are introduced by code changes, not infrastructure problems.
A Practical Debugging Workflow
Here's a workflow that works well in practice:
- Confirm the regression is consistent. Run the test 3-5 times manually. If it's flaky, you might be chasing a network anomaly rather than a real problem.
- Identify which metric changed. A TTFB spike points to the server. An LCP change points to the frontend or assets. A jump in total requests points to new resources being loaded.
- Check your deploy timeline. Did anything ship in the last 2-4 hours that touches the affected page? Start there.
- Use a waterfall chart. Tools like WebPageTest give you a full request-by-request breakdown. Find the new or changed request that correlates with the regression.
- Test your fix before deploying. Run the synthetic test against your staging environment first. This is the whole point - catching the regression before it ships again.
For teams working with staging environments, this loop becomes much tighter. You can run the same synthetic test suite against staging as you do against production, and compare results side by side before a single line of code goes live.
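The deploy-timeline step in the workflow above is easy to automate. The sketch below assumes a hypothetical deploy log where each entry records an `id`, a `deployedAt` timestamp, and the `paths` it touched; your deploy tooling will have its own shape. It returns the deploys that shipped within the window before the alert and touch the affected page, most recent first.

```javascript
// Returns deploys that landed within `windowHours` before the alert
// and touch `page`, sorted most recent first.
function suspectDeploys(alertTime, page, deploys, windowHours = 4) {
  const alertMs = Date.parse(alertTime);
  const windowMs = windowHours * 60 * 60 * 1000;
  return deploys
    .filter((d) => {
      const deployedMs = Date.parse(d.deployedAt);
      const inWindow = deployedMs <= alertMs && alertMs - deployedMs <= windowMs;
      return inWindow && d.paths.includes(page);
    })
    .sort((a, b) => Date.parse(b.deployedAt) - Date.parse(a.deployedAt));
}

// Made-up deploy history for illustration.
const deployLog = [
  { id: 'd1', deployedAt: '2024-05-14T09:00:00Z', paths: ['/search/'] },
  { id: 'd2', deployedAt: '2024-05-14T12:30:00Z', paths: ['/checkout/'] },
  { id: 'd3', deployedAt: '2024-05-14T13:45:00Z', paths: ['/', '/checkout/'] },
  { id: 'd4', deployedAt: '2024-05-14T13:50:00Z', paths: ['/blog/'] },
];

const suspects = suspectDeploys('2024-05-14T14:10:00Z', '/checkout/', deployLog);
console.log(suspects.map((d) => d.id).join(', ')); // d3, d2
```

Starting with the most recent matching deploy keeps the search short: if reverting or inspecting `d3` does not explain the change, move on to `d2`.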
Tools for Website Performance Monitoring With Synthetic Tests
There are several solid tools worth knowing about:
- WebPageTest: Free, incredibly detailed, and the gold standard for one-off performance analysis. Its scripting capability lets you test multi-step flows like login and checkout.
- Lighthouse CI: Google's open-source tool integrates directly into your CI/CD pipeline. It runs a Lighthouse audit on every pull request and fails the build if scores drop below your thresholds. This is the closest you can get to preventing a regression from ever reaching production.
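- Calibre, SpeedCurve, or Sentry Performance: Paid tools that add historical tracking, alerting, and team dashboards. Calibre and SpeedCurve build these on top of a synthetic test engine, while Sentry Performance works mainly from real-user and tracing data, so it complements synthetic tests rather than replacing them. Worth the cost for larger teams where individual developers aren't checking scores manually.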
- Datadog Synthetics: Enterprise-grade option with strong alerting integrations. Useful if you're already using Datadog for infrastructure monitoring and want everything in one place.
Integrating Synthetic Tests Into Your CI/CD Pipeline
The most powerful configuration runs synthetic tests as a mandatory gate in your deployment pipeline. Here's a minimal Lighthouse CI config that blocks a deploy if performance drops:
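```javascript
// lighthouserc.js
module.exports = {
  ci: {
    collect: {
      // Staging URLs to audit before the deploy goes out
      url: [
        'https://staging.yoursite.com/',
        'https://staging.yoursite.com/checkout/',
      ],
      numberOfRuns: 3, // several runs per URL to reduce noise
    },
    assert: {
      assertions: {
        // 'error' fails the run; 'warn' only reports
        'categories:performance': ['error', { minScore: 0.85 }],
        'first-contentful-paint': ['warn', { maxNumericValue: 2000 }],
        'largest-contentful-paint': ['error', { maxNumericValue: 2500 }],
        'total-blocking-time': ['error', { maxNumericValue: 300 }],
      },
    },
    upload: {
      target: 'temporary-public-storage',
    },
  },
};
```

This runs 3 Lighthouse audits on each staging URL and aggregates the results across runs (not a simple average; the method is configurable through `aggregationMethod`). The run fails with an error if LCP exceeds 2.5 seconds, Total Blocking Time exceeds 300ms, or the performance score drops below 0.85 (85 out of 100). A First Contentful Paint above 2 seconds only produces a warning. As long as your pipeline treats a failed Lighthouse CI run as a blocking step, the deploy doesn't proceed until those thresholds pass.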
Continuous Website Performance Monitoring After Deploy
Even with pre-deploy checks in place, production can behave differently from staging. Real CDN configurations, live database load, and third-party scripts all introduce variables that a staging environment can't perfectly replicate.
That's why continuous post-deploy monitoring matters just as much. Set your synthetic tests to run every 5-10 minutes on your critical pages, and configure alerts to fire if any metric exceeds its baseline by more than 20%. A 20% threshold is tight enough to catch genuine regressions but loose enough to avoid alert fatigue from minor fluctuations.
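A sketch of that alert rule, reading the 20% threshold as "more than 20% above baseline". The `breachedMetrics` helper, the metric names, and the sample values are hypothetical; most monitoring tools let you express the same rule in their own alert configuration.

```javascript
// Returns the names of metrics whose current value is more than
// `tolerance` (as a fraction) above the baseline value.
function breachedMetrics(current, baseline, tolerance = 0.2) {
  return Object.keys(baseline).filter(
    (name) => current[name] > baseline[name] * (1 + tolerance)
  );
}

// Made-up numbers: LCP has climbed from 1.1s to 3.4s, TTFB has barely moved.
const baselineMetrics = { ttfbMs: 120, lcpMs: 1100 };
const latestMetrics = { ttfbMs: 130, lcpMs: 3400 };

const breaches = breachedMetrics(latestMetrics, baselineMetrics);
console.log(breaches.join(', ')); // lcpMs
```

Here TTFB is about 8% over its baseline and stays quiet, while LCP is far past the limit and would fire the alert.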
On the server side, it also helps when your infrastructure is actively watched. We track performance peaks and uptime events at the server level, so spikes that originate from the infrastructure layer - rather than your code - are visible separately from your application-level synthetic results. That separation makes it much easier to know where to look when something goes wrong.
If you want to understand more about how server-level performance signals connect to your broader monitoring strategy, Setting Up Real-Time Performance Alerts Before Your Users Notice a Problem is a good companion read.
The Real Value: Confidence at Deploy Time
Synthetic monitoring doesn't just catch regressions. It changes how your team deploys. When every pull request runs a performance check and every production deploy is watched by automated tests, you stop shipping changes into the dark.
Teams that do this consistently report fewer performance incidents, faster mean time to resolution when something does go wrong, and - perhaps most usefully - a shared language for talking about performance. When everyone can see the same metrics trend over time, performance stops being a vague feeling and starts being a measurable property of the codebase.
That shift in culture is, honestly, worth more than any individual optimization. For the broader landscape and which changes actually make a difference, our post on website speed optimization is worth your time.
Start with a baseline. Pick two or three critical pages. Set a threshold. Let the robots watch so you can sleep.