TestingPod | Level up your career in software testing

How MCP-Enabled Testing Tools Could Let Startups Ship as Fast as Google

Written by Jahdunsin Osho | June 24, 2025

Google performs up to 70 launches per week.

Here's how they define a launch:

"Any new code that introduces an externally visible change to an application."

So shipping isn't just pushing code to production; it's delivering observable value to users.

With roughly 40,000 engineers, 70 launches per week works out to about 1.75 launches per working hour across the entire company (assuming a 40-hour week), but also to only one launch per roughly 570 engineer-weeks. Scale that down to a typical feature team of 5-8 engineers, and matching Google's per-developer pace means one visible change every 70-115 weeks.

That’s achievable, even for a small startup.

While rapid releases at scale depend on several factors, such as capacity planning, team coordination, and infrastructure automation, this article focuses on software testing: one aspect of software delivery that AI is increasingly capable of handling.

Geoffrey Huntley is already demonstrating what's possible, using multi-agent coding workflows to deploy production-ready applications without touching code, generating weeks' worth of work in hours.

Agentic Shift-Left Testing is another piece of this puzzle. It involves moving QA tasks from the CI/CD pipeline into the coding stage and delegating more of the work to developers and AI coding agents. That relieves CI/CD bottlenecks, which means quicker releases.

Model Context Protocol (MCP) is the integration layer that makes this possible. So, we’ll discuss three ways you can ship features quickly by having AI agents:

  1. Write and manage E2E tests
  2. Fix flaky tests
  3. Fix security vulnerabilities

Let's start with the first.

1. E2E Tests With MCP

Many growth startups use end-to-end automation tools with recording features.

Playwright is a popular option that lets you build a feature, launch a browser, record actions, and generate test scripts. But this approach is time-consuming.

You'll need to add data-testid attributes or ARIA labels so selectors don't break when your React app's class names change between builds, and refactor recorded tests into reusable functions or page objects. Scale that cleanup across the team and multiple features, and those "couple of minutes" become hours of delayed releases.

MCP testing tools remove these frictions.

Playwright now has an MCP interface that gives AI coding editors access to the live app, with DOM snapshots enabling accurate test generation without manual cleanup.

So, rather than build, launch a browser, record actions, and generate scripts, the workflow now looks like this:

  1. Write a short instruction: "Given I'm on the login page, when I enter valid credentials, then I should see the dashboard."
  2. AI agent fetches live DOM snapshot using Playwright's MCP server.
  3. AI agent generates and runs the test script in a headless browser using run_suite.
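
The steps above can be sketched as a toy in a few lines: parse the one-line instruction into its Given/When/Then clauses, then emit a test skeleton. This is not Playwright MCP's actual implementation; the parser and the emitted skeleton are assumptions, and a real agent would fill in selectors and actions from the live DOM snapshot rather than leaving comments.

```typescript
// Toy sketch of "instruction in, test script out". Everything below is
// illustrative; a real agent works from a DOM snapshot, not string templates.

type Step = { keyword: "Given" | "When" | "Then"; text: string };

// Split a one-line BDD instruction into its Given/When/Then clauses.
function parseInstruction(instruction: string): Step[] {
  return instruction.split(/,\s*(?=when |then )/i).map((part) => {
    const m = part.trim().match(/^(given|when|then)\s+(.*)$/i);
    if (!m) throw new Error(`Unrecognised clause: ${part}`);
    const keyword = (m[1][0].toUpperCase() +
      m[1].slice(1).toLowerCase()) as Step["keyword"];
    return { keyword, text: m[2] };
  });
}

// Emit a Playwright test skeleton; the agent would replace each comment
// with concrete page interactions derived from the DOM snapshot.
function emitPlaywrightTest(steps: Step[]): string {
  const title = steps.map((s) => s.text).join(", ");
  const body = steps.map((s) => `  // ${s.keyword} ${s.text}`).join("\n");
  return [
    `import { test, expect } from "@playwright/test";`,
    ``,
    `test(${JSON.stringify(title)}, async ({ page }) => {`,
    body,
    `});`,
  ].join("\n");
}

const steps = parseInstruction(
  "Given I'm on the login page, when I enter valid credentials, then I should see the dashboard"
);
console.log(emitPlaywrightTest(steps));
```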

Now every feature ships with its E2E test, catching regressions before production, and saving hours of test management. For a small team, that reclaimed time could mean shipping an extra feature you might have postponed till next sprint.

2. Fix Flaky Tests with MCP

Flaky tests create a compounding delay problem, especially with multiple developers working on the same project.

The typical workflow becomes:

  • Developer pushes feature branch
  • CI runs, test fails
  • Developer spends 15-30 minutes investigating and retrying builds
  • Realises it's flaky, hits "rerun failed tests"
  • Maybe it passes, maybe it fails again
  • Repeat until it passes or the developer forces a merge

GitHub's engineering team found flaky builds disruptive enough to invest in reducing them by 18x. For a startup racing to reach product-market fit, that kind of drag could mean missing critical feature deadlines.

The traditional approach has been to catch flaky tests in the CI pipeline, which encourages the "retry till pass" practice. With CircleCI's recently released MCP server, however, you can move flaky test detection to the coding stage, letting developers fix flakes before committing code.

Here's how it works:

  • AI agent calls the find_flaky_tests tool on CircleCI's MCP server
  • CircleCI returns the flaky tests it has detected
  • AI agent analyses the error patterns and suggests code fixes to resolve them
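
That loop can be sketched as follows. Only the find_flaky_tests tool name comes from CircleCI's MCP server; the response shape and the triage heuristics are invented for illustration, and a real agent would call the tool over an MCP transport rather than a local stub.

```typescript
// Illustrative flaky-test triage loop. The FlakyTest shape and the
// pattern-matching rules are assumptions, not CircleCI's actual schema.

interface FlakyTest {
  name: string;
  failureMessage: string;
}

// Stand-in for the find_flaky_tests MCP tool call.
async function findFlakyTests(): Promise<FlakyTest[]> {
  return [
    {
      name: "checkout.spec.ts > applies coupon",
      failureMessage: "TimeoutError: waiting for selector '.discount'",
    },
    {
      name: "cart.spec.ts > updates total",
      failureMessage: "expected 3, received 2 (state leaked from previous test)",
    },
  ];
}

// Crude error-pattern triage an agent might run before proposing a code fix.
function suggestFix(t: FlakyTest): string {
  if (/TimeoutError|waiting for/i.test(t.failureMessage)) {
    return `${t.name}: replace fixed waits with an assertion that retries on the element`;
  }
  if (/state leaked|previous test/i.test(t.failureMessage)) {
    return `${t.name}: isolate test state (reset fixtures between tests)`;
  }
  return `${t.name}: needs manual investigation`;
}

(async () => {
  for (const t of await findFlakyTests()) {
    console.log(suggestFix(t));
  }
})();
```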

Now, instead of spending 15-30 minutes investigating and retrying builds, you could get a fix in under 2 minutes, plus a more reliable CI/CD pipeline.

3. Fix Security Vulnerabilities with MCP

Developers spend an average of 3.5 hours manually reviewing security scan findings, hours that could be spent actually building and delivering features to users.

For instance, when resolving dependency vulnerabilities, developers lose hours to manual scan reviews, context switching, and secrets detection, often spending more time working out which vulnerabilities are actually exploitable versus merely theoretical risks.

When there's pressure to ship quickly, security gets deferred and filed away as "Fix Later" security debt. Teams are forced to choose: ship quickly, or spend time identifying and fixing vulnerabilities.

Security tools like Snyk address this by integrating vulnerability scanning into the CI/CD pipeline, scanning the build and reporting the issues it finds. The problem is that this still delays releases at the CI/CD stage.

However, Snyk's recently released MCP server changes this. It lets you move security scanning from the CI pipeline into your coding environment, removing the CI/CD pipeline as a release bottleneck.

Here's how it works:

  1. AI coding agent calls the Snyk MCP server to assess vulnerabilities affecting the codebase
  2. AI agent identifies issues and fixes them in the codebase
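
A rough sketch of those two steps, with the scan results and the remediation logic invented for illustration (Snyk's real MCP server has its own schema and its own fix suggestions):

```typescript
// Illustrative "scan, then plan fixes" loop. The Vuln shape, the example
// records, and the prioritisation rule are all assumptions for this sketch.

interface Vuln {
  pkg: string;
  current: string;
  severity: "low" | "medium" | "high" | "critical";
  fixedIn?: string; // earliest patched version, if one exists
}

// Stand-in for the MCP scan call (step 1).
async function scanCodebase(): Promise<Vuln[]> {
  return [
    { pkg: "lodash", current: "4.17.15", severity: "high", fixedIn: "4.17.21" },
    { pkg: "leftpad", current: "1.0.0", severity: "low" },
  ];
}

// Step 2: turn findings into concrete actions, putting issues that actually
// have a patched release available ahead of those that need investigation.
function planFixes(vulns: Vuln[]): string[] {
  return [...vulns]
    .sort(
      (a, b) =>
        Number(b.fixedIn !== undefined) - Number(a.fixedIn !== undefined)
    )
    .map((v) =>
      v.fixedIn
        ? `upgrade ${v.pkg} ${v.current} -> ${v.fixedIn} (${v.severity})`
        : `no patched release for ${v.pkg}; assess exploitability (${v.severity})`
    );
}

(async () => {
  for (const action of planFixes(await scanCodebase())) {
    console.log(action);
  }
})();
```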

This could reduce your security review time by 80-90% while improving security outcomes.

In the last year alone, Snyk customers reported an average time savings of 20,729 hours. Instead of spending 3.5 hours manually reviewing security findings, you could get fixes in about 2 minutes.

Your Tests Could Fix Themselves While You Code

Right now, using Playwright with agentic coding tools lets developers update tests as the UI changes. But imagine tests that update themselves as you change the UI, maintained by a dedicated self-healing test agent.

That might be possible if MagicPod's self-healing function were integrated with MCP, and they already seem to be on their way, having released a beta MCP server.

Agentic software development will only improve as AI capabilities grow and MCP-enabled tools extend what agents can do.

I see a future where developers, testers, and QA engineers don't have to execute every task themselves, but instead guide autonomous agents using their experience and domain knowledge.

Then, even releasing 70 features per week would be possible.

References

  1. Reliable Product Launches at Scale
  2. Let AI Explore Your Site & Write Tests with Playwright MCP
  3. Reducing flaky builds by 18x
  4. CircleCI MCP server
  5. Scanning in the IDE: A Bad IDE(A) for Developers
  6. Snyk MCP experimental | Snyk User Docs
  7. Snyk Customer Value Study Report | Snyk
  8. We're Investing $3.5M in AI Testing So You Can Test Without Writing Code