πŸ“Œ Practice project: VWO Login used as a practice target based on Pramod Dutta's Playwright Automation Mastery 2026 course. No internal systems accessed. All bugs are simulated defects for STLC demonstration only.
I ran a full STLC cycle twice.
Once manually. Once with AI over MCP.
Here's what I built and what it taught me about the future of QA:

I applied all 6 STLC phases to VWO Login Dashboard β€” manually first, then using Playwright MCP + JIRA MCP.

The difference was stark. Manual took ~90 minutes. MCP took ~20 minutes β€” and found 43 elements vs 8 from the PRD.

But here's the thing nobody talks about: you can't use MCP well if you don't understand STLC manually first. AI amplifies your thinking. It doesn't replace it.

What I built is published on GitHub ↓
43
elements via MCP
6
STLC phases
4.5Γ—
faster with MCP
11
commits, live repo
#SDET #Playwright #MCP #QAAutomation #AIAgents #GitHub #TypeScript #STLC
Interactive Concept Explorer
How MCP Works
Model Context Protocol β€” the architecture behind AI-assisted testing
πŸ§‘β€πŸ’»
You
QA Lead
natural language
✦
Claude Desktop
LLM + MCP Client
tool calls
🎭
Playwright MCP
MCP Server
browser control
🌐
VWO Login
Live Page
✦
Claude Desktop
LLM + MCP Client
tool calls
πŸ“‹
JIRA MCP
MCP Server
creates tickets
πŸ›
KAN-1
Bug Ticket
Key insight: MCP is a standardised protocol. The AI does not hardcode API calls β€” it discovers available tools at runtime and decides which to call based on your natural language instruction. This is what makes it an Agent, not just a chatbot.
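The discovery step behind that insight can be sketched as plain JSON-RPC messages. This is a hedged illustration: the message shapes follow the MCP spec's `tools/list` and `tools/call` methods, but the tool entry shown is illustrative, not the exact Playwright MCP output.

```typescript
// Sketch of the MCP discovery handshake. The client (e.g. Claude Desktop)
// asks the server what it can do -- nothing is hardcoded in the AI.
const listToolsRequest = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
};

// The server answers with every tool it exposes, including an input schema
// the LLM can read. (Example entry is illustrative.)
const listToolsResponse = {
  jsonrpc: "2.0",
  id: 1,
  result: {
    tools: [
      {
        name: "browser_navigate",
        description: "Navigate the browser to a URL",
        inputSchema: {
          type: "object",
          properties: { url: { type: "string" } },
          required: ["url"],
        },
      },
    ],
  },
};

// Having discovered the tools at runtime, the AI picks one by name and
// calls it -- this is the "decides which to call" step.
const callToolRequest = {
  jsonrpc: "2.0",
  id: 2,
  method: "tools/call",
  params: {
    name: "browser_navigate",
    arguments: { url: "https://app.vwo.com/#/login" },
  },
};

const discovered = listToolsResponse.result.tools.map((t) => t.name);
console.log(discovered); // tool names the AI learned about at runtime
```

The point of the protocol: the client never needs Playwright-specific code. Swap in the JIRA MCP server and the same two methods discover and invoke `createJiraIssue` instead.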
The 3 MCP Components
Host β€” Claude Desktop, the application running the LLM
MCP Client β€” built into Claude Desktop, manages server connections
MCP Server β€” Playwright or JIRA, exposes tools the AI can call
Tools Available Live
browser_navigate β€” go to any URL
browser_snapshot β€” extract full DOM accessibility tree
browser_click, browser_fill β€” interact with elements
createJiraIssue β€” log bugs directly to JIRA board
LLM vs AI Agent
What changes when you connect an LLM to tools
LLM only text in β†’ text out
Answers questions about Playwright
Generates test case templates
Explains STLC concepts
Cannot navigate a real browser
Cannot create a real JIRA ticket
Cannot read live DOM structure
AI Agent (LLM + MCP) acts in the world
Navigates to app.vwo.com/#/login
Extracts 43 real elements from live DOM
Creates KAN-1 bug ticket in JIRA
Generates locators from actual page structure
Runs STLC phases using real tool calls
Decides which tool to call based on intent
The formula: Agent = LLM + Tools + Decision loop.
MCP is the standard that makes connecting tools to LLMs reliable and scalable. Without MCP, each tool integration required custom code. With MCP, any compatible tool connects through the same protocol.
// Without MCP β€” you write this
const response = await fetch('https://api.atlassian.com/jira/issues', {
  method: 'POST', headers: { Authorization: 'Bearer token' },
  body: JSON.stringify({ fields: { summary: '...' } })
});

// With MCP β€” Claude decides and calls
// You just say: "Create a bug for the password validation issue"
// Claude calls: createJiraIssue({ cloudId, projectKey, summary, ... })
REST API vs MCP
Two ways to connect software β€” fundamentally different philosophies
REST API code-to-service
Your code calls a specific endpoint
You must know the exact URL and parameters
Response format is fixed β€” JSON or XML
You write the integration logic
Each service has different auth patterns
Error handling is your responsibility
MCP AI-to-tool
AI discovers available tools at runtime
AI decides which tool to call from intent
Standardised protocol across all tools
AI writes the integration logic dynamically
Single connection pattern for any MCP server
AI handles sequencing of multiple calls
Analogy: REST API is like calling a specific department by dialling their direct number β€” you need to know the number. MCP is like telling a smart assistant "arrange a meeting" β€” it figures out which departments to call, in what order, and handles the back-and-forth.
JIRA via REST
# Step 1 β€” get project ID
GET /rest/api/3/project

# Step 2 β€” get issue type ID
GET /rest/api/3/issuetype

# Step 3 β€” create issue
POST /rest/api/3/issue
{ fields: { project, issuetype... } }
JIRA via MCP
# You say:
"Create a High priority bug in VWO Login
STLC project β€” password accepts abc"

# Claude calls in sequence:
getAccessibleAtlassianResources()
getVisibleJiraProjects(...)
getJiraProjectIssueTypesMetadata(...)
createJiraIssue(...)
Manual vs MCP β€” Side by Side
Same STLC. Same project. Measured difference.
Manual (Block A)
MCP-Assisted
Req. Analysis
30 min Β· 8 elements
2 min Β· 43 elements
Test Planning
20 min
5 min
Test Case Design
30 min Β· 5 TCs
10 min Β· 8 TCs
Bug Reporting
10 min Β· manual JIRA
1 min Β· JIRA MCP
Total
~90 min
~20 min Β· 4.5Γ— faster
The important caveat: MCP found 43 elements vs 8 in the PRD β€” including 4 hidden forms the documentation never mentioned. But you cannot validate these findings without understanding what good test cases look like. Manual first. MCP second. Always.
STLC β€” 6 Phases Applied to VWO Login
Each phase produces a real artifact. Each artifact is traceable.
PHASE 01
Requirement Analysis
β†’ 43 elements via MCP snapshot
PHASE 02
Test Planning
β†’ Scope, risks, entry/exit criteria
PHASE 03
Test Case Design
β†’ 8 TCs with exact locators
PHASE 04
Test Execution
β†’ POM + 13 Playwright tests
PHASE 05
Defect Reporting
β†’ KAN-1 via JIRA MCP
PHASE 06
Test Closure
β†’ Report + comparison
What makes this different: Block A ran all phases manually using the VWO PRD. The STLC MCP Project ran the same phases using live MCP tools. Both are documented side by side in the GitHub repo β€” making the comparison concrete and verifiable.
# The complete pipeline
PRD Read (Manual)
  β†’ Live DOM Snapshot (Playwright MCP)
    β†’ Test Plan β†’ 8 Test Cases β†’ POM Spec
      β†’ Bug KAN-1 (JIRA MCP)
        β†’ Closure Report β†’ GitHub βœ“
The Portfolio Repository
github.com/somasaic/sdet-stlc-portfolio
Block_A_Manual/ Traditional
01_Requirement_Analysis.md
02_Test_Plan.md
03_Test_Cases.md
04_Bug_Report.md
05_Severity_Priority.md
06_Regression_Retesting.md
docs/Block_B_Automation.md
STLC_MCP_Project/ AI-Assisted
01_Requirement_Analysis/vwo_live_elements.md
02_Test_Plan/test_plan.md
03_Test_Cases/test_cases.md
04_Test_Execution/pages/LoginPage.ts
04_Test_Execution/tests/vwo_login.spec.ts
05_Defect_Reports/BUG_Login_PWD001.md
06_Test_Closure/closure_report.md
⬑
somasaic/sdet-stlc-portfolio
STLC applied to real projects β€” Manual QA + Playwright MCP + JIRA MCP
Playwright
TypeScript
MCP
JIRA
CI/CD
11
commits
13
playwright tests
5
browsers
⬑ GitHub Repo ↗ · LinkedIn ↗ · 🐛 KAN-1 JIRA ↗
5 Approaches β€” Side by Side
Same VWO login. Same 6 STLC phases. Completely different execution. Each approach adds a skill the previous couldn't demonstrate.
APPROACH 1
Block_A_Manual
PRD read β€” 8 elements found
Test cases hand-written
Bug report in Word doc
~90 min total
No CI/CD pipeline
Skill: QA process thinking
APPROACH 2
STLC_MCP_Project
Live DOM β€” 43 elements
AI writes test cases
KAN-1 via JIRA MCP
~20 min Β· 4.5Γ— faster
5-browser CI pipeline
Skill: AI agent orchestration
APPROACH 3
Standard CLI
POM β€” getByRole locators
18/18 β€” 3 browsers
codegen for selectors
GitHub Actions CI green
HTML report artifact
Skill: pure engineering
APPROACH 4
Playwright CLI
UI + API in one project
request fixture β€” no browser
testData.ts β€” typed inputs
20/20 Β· 14Γ— API speed
KAN-2 via JIRA MCP
Skill: framework depth + API
LATEST
APPROACH 5
AI Agents
Planner β†’ Generator β†’ Healer
AI plans + writes tests
Self-healing on failure
3/3 visual regression
seed.spec.ts bootstrap
Skill: autonomous AI testing
Dimension | Manual | MCP | Standard CLI | Playwright CLI | AI Agents
Tool | None | Claude + MCP servers | npx playwright | npx playwright + request | planner + generator + healer
Test types | None | UI | UI | UI + API | UI + Visual Regression
Speed | ~90 min | ~20 min | ~90s CI run | 3.9s API · 54s UI | 48s visual · auto-generated
Bugs logged | Word doc | KAN-1 via JIRA MCP | KAN-1 reference | KAN-2 via JIRA MCP | KAN-3 healer-caught
Who writes tests | You (manually) | You (with AI assist) | You (pure code) | You (framework) | AI agents (autonomous)
New skill added | Process | AI orchestration | POM + CI/CD | API testing + edge cases | Autonomous gen + visual reg
The key insight: The STLC phases never change — Requirement Analysis, Test Planning, Test Design, Execution, Bug Reporting, Closure. What changes is the execution mode. Manual tests your judgment. MCP tests your process. Standard CLI tests your engineering. Playwright CLI tests your framework depth. AI Agents test whether you can let the AI work and know when to intervene. An SDET needs to operate fluently in all five.
API Testing β€” From Zero to SDET Level
What it is, why it matters, and how Playwright handles it natively
UI Test browser required
Playwright opens a real browser (Chromium)
Loads app.vwo.com in that browser
Finds DOM elements, clicks, fills
Asserts on what the user sees
5 to 30 seconds per test
Fragile to CSS/DOM changes
API Test no browser at all
request fixture β€” direct HTTP to server
No browser launched, no page loaded
Sends HTTP request, reads JSON response
Asserts on status code + body + schema
200 to 500ms per test β€” 14Γ— faster
Stable β€” tests API contract not visuals
THREE ASSERTION LEVELS β€” every API test needs all three
LEVEL 1 β€” Status Code (always)
expect(response.status()).toBe(200);
LEVEL 2 β€” Body Fields (always)
const body = await response.json();
expect(body.token).toBeDefined();
LEVEL 3 β€” Schema / Types (2yr level)
expect(typeof body.data.id).toBe('number');
expect(Array.isArray(body.data)).toBe(true);
CRITICAL β€” 204 DELETE RULE
204 No Content = no body. Never call response.json() on DELETE. It throws because the body is empty.
expect(response.status()).toBe(204);
// do NOT call response.json() here
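One way to make that rule unmissable is to guard JSON parsing behind the status code. This is a minimal sketch, not Playwright API: the `ApiResponseLike` interface stubs the two methods we need from Playwright's `APIResponse`.

```typescript
// Stub of the bits of APIResponse we care about here.
interface ApiResponseLike {
  status(): number;
  json(): Promise<unknown>;
}

// Return the parsed body, or null when the status guarantees no body.
async function bodyOrNull(response: ApiResponseLike): Promise<unknown | null> {
  // 204 No Content carries no body -- calling json() would throw.
  if (response.status() === 204) return null;
  return response.json();
}

// Stubbed responses to demonstrate both paths.
const deleted: ApiResponseLike = {
  status: () => 204,
  json: async () => { throw new Error("no body on 204"); },
};
const created: ApiResponseLike = {
  status: () => 201,
  json: async () => ({ id: 7 }),
};

bodyOrNull(deleted).then((b) => console.log(b)); // null -- json() never called
bodyOrNull(created).then((b) => console.log(b)); // { id: 7 }
```

In a real suite you would rarely need the helper: the DELETE test simply asserts `toBe(204)` and stops there.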
STATUS CODE RANGES
2xx β€” success (200 OK, 201 Created, 204 No Content)
4xx β€” client error YOUR fault (400, 401, 403, 404)
5xx β€” server error THEIR fault (500, 502, 503)
404 as PASS β€” negative tests assert 404 intentionally
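The ranges above fit in a tiny classifier. A sketch only — Playwright has no such helper, and in a negative test you still assert the exact code (a 404 can be the expected, passing result):

```typescript
type StatusClass = 'success' | 'client-error' | 'server-error' | 'other';

// Map a status code to its range, mirroring the 2xx/4xx/5xx rule of thumb.
function classifyStatus(code: number): StatusClass {
  if (code >= 200 && code < 300) return 'success';
  if (code >= 400 && code < 500) return 'client-error';
  if (code >= 500 && code < 600) return 'server-error';
  return 'other'; // 1xx informational, 3xx redirects
}

console.log(classifyStatus(204)); // success -- but remember: no body
console.log(classifyStatus(404)); // client-error -- a PASS in a negative test
console.log(classifyStatus(503)); // server-error -- their fault
```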
Why API testing is the market gap: 80% of SDET job descriptions ask for API testing. Most candidates with 1-2 years experience only have UI automation. The request fixture in Week 2 closes this gap entirely β€” same Playwright framework, same TypeScript, same CI pipeline. One project that proves both.
page fixture vs request fixture
The most important Playwright distinction for SDET interviews
πŸ§ͺ
test()
Playwright runner
injects
πŸ“„
{ page }
browser opens
β†’ DOM
🌐
VWO Login
~5-30s per test
β€” OR β€”
πŸ§ͺ
test()
Playwright runner
injects
πŸ“‘
{ request }
NO browser at all
HTTP direct
⚑
reqres.in API
~200-500ms per test
{ page } fixture UI layer
Opens real Chromium/Firefox/WebKit browser
Navigates to URL, waits for DOM ready
Interacts: fill(), click(), hover()
Asserts: toBeVisible(), toHaveText()
Slower β€” 5-30s Β· sensitive to UI changes
test('login smoke', async ({ page }) => {
  await page.goto('/#/login');
  await expect(page.locator('form').first()).toBeVisible();
});
{ request } fixture API layer
No browser. Zero. Nothing launched.
Sends HTTP directly: GET POST PUT DELETE
request.get() Β· request.post({ data })
Asserts: status() Β· json() Β· body fields
14Γ— faster Β· stable Β· environment-agnostic
test('login API', async ({ request }) => {
  const res = await request.post('/api/login',
    { data: apiData.validLogin });
  expect(res.status()).toBe(200);
});
testData.ts β€” why no hardcoded strings in tests
// Week 1 β€” hardcoded, fragile
await loginPage.login('test@wingify.com', 'wrongpass');

// Week 2 β€” testData.ts, typed, DRY
import { apiData, uiData } from '../../data/testData';
await request.post(endpoints.login, { data: apiData.validLogin });
// One file change updates every test that uses this credential
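A sketch of what such a module might look like. Field names and values here are illustrative, not the repo's actual file; the reqres.in credentials shown are the publicly documented demo ones.

```typescript
// testData.ts (sketch) -- one typed module feeds both UI and API suites.
interface LoginPayload {
  email: string;
  password: string;
}

export const endpoints = {
  login: '/api/login',
  register: '/api/register',
} as const;

export const apiData: {
  validLogin: LoginPayload;
  missingPassword: Partial<LoginPayload>;
} = {
  validLogin: { email: 'eve.holt@reqres.in', password: 'cityslicka' },
  missingPassword: { email: 'eve.holt@reqres.in' }, // negative-test input
};

export const uiData: { invalidLogin: LoginPayload } = {
  invalidLogin: { email: 'test@wingify.com', password: 'wrongpass' },
};
```

The `LoginPayload` interface is the quiet win: a typo in a credential key becomes a compile error, not a flaky 400 at runtime.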
Interview answer: "page fixture opens a real browser and tests the DOM layer β€” what users see and interact with. request fixture makes direct HTTP calls with no browser β€” it tests the API contract: status codes, response schemas, and error handling. API tests run 14Γ— faster. I use both in the same project because they test different layers of the same feature."
Three CLI Tools β€” npx playwright vs MCP vs @playwright/cli
Playwright has three distinct execution modes β€” each serves a different purpose
TOOL 1 Β· STANDARD
npx playwright
Ships with @playwright/test
Test runner β€” runs spec files
codegen β€” record interactions
show-report, show-trace
--grep --project --debug --ui
CI/CD focused, one-shot per run
Used in: Week 1b + Week 2
TOOL 2 Β· AI AGENT
Playwright MCP
@playwright/mcp β€” JSON-RPC over stdio
AI calls browser_snapshot, browser_click
Snapshots injected INTO context window
~115K tokens per 30 actions
Per-call browser lifetime
Best for: live interactive exploration
Used in: Week 1a (STLC_MCP_Project)
TOOL 3 Β· AI EFFICIENT Β· NEXT
@playwright/cli
Microsoft's new AI agent CLI
playwright-cli open Β· snapshot Β· click
Snapshots saved to DISK as YAML/PNG
~25K tokens Β· 4.6Γ— MCP savings
Persistent daemon via Unix socket
Best for: complex multi-step AI automation
Planned: Week 3/4 AI_Agentic project
TOKEN USAGE COMPARISON β€” per 30 actions
Playwright MCP
~115,000 tokens (context window)
@playwright/cli
~25,000 tokens (disk snapshots) Β· 4.6Γ— saving
npx playwright
0 tokens β€” traditional test runner, no LLM
Why MCP burns tokens
Every browser_snapshot call injects the full page accessibility tree directly into the LLM context window. After 15+ steps, the context carries 90K+ tokens of stale snapshots from pages the agent already left. The model loses track of what is current.
Why @playwright/cli solves it
Snapshots write to disk as YAML/PNG files. The context window never sees them unless the agent explicitly reads a specific file. The model only loads what it needs right now. Persistent Unix socket sessions mean the browser stays alive between commands β€” no re-launch overhead.
The progression logic: Standard CLI (Week 1b) β†’ MCP (Week 1a) β†’ @playwright/cli (Week 3/4). Each mode has a clear use case. Real SDET teams use all three depending on context: standard CLI for CI/CD, MCP for interactive exploration, @playwright/cli for AI agent automation at scale.
# @playwright/cli β€” AI agent commands
playwright-cli open https://app.vwo.com/#/login
playwright-cli snapshot             # writes YAML to disk, NOT context window
playwright-cli click e15            # element ref from snapshot
playwright-cli fill e22 "test@wingify.com"
playwright-cli screenshot         # saves PNG to disk

# Token cost: ~25K vs ~115K for MCP β€” same task, 4.6Γ— cheaper
LLM β€” What It Can and Cannot Do
Understanding the boundaries is what separates an SDET from someone who just prompts
What LLMs do well text in β†’ text out
Understand natural language instructions precisely
Generate code, test cases, docs from a description
Reason about text β€” compare, summarise, classify
Pattern-match from billions of training examples
Produce structured output (JSON, Markdown, TypeScript)
Chain reasoning steps β€” think before answering
Hard limitations without tools
No memory β€” every conversation starts blank. No state between sessions.
No tools β€” cannot open a browser, read a file, call an API by itself
No real-time data β€” knowledge has a cutoff date, cannot fetch live DOM
No execution β€” can write code but cannot run it and see the output
No persistence β€” cannot save files, write to disk, modify state
Context limit β€” finite window. Too much input = early content dropped
THE MEMORY PROBLEM β€” WHY IT MATTERS IN TESTING
No short-term memory
Within one session the LLM sees everything in the context window. But it cannot "remember" what it clicked 10 steps ago unless that snapshot is still in context.
No long-term memory
Close the session, start again β€” zero memory. The LLM has no idea it already explored VWO login yesterday. Every run starts from scratch.
Solution: external memory
Agents compensate by writing to disk β€” specs/, snapshots, test files. The filesystem becomes the LLM's long-term memory. This is exactly what the planner does.
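That external-memory pattern is plain file I/O. A sketch with illustrative paths (the real agents write under specs/ and tests/; here a temp directory stands in):

```typescript
import { mkdtempSync, writeFileSync, readFileSync } from 'node:fs';
import { join } from 'node:path';
import { tmpdir } from 'node:os';

// Stand-in for the agent's workspace (illustrative path).
const dir = mkdtempSync(join(tmpdir(), 'agent-memory-'));
const planPath = join(dir, 'plan.md');

// "Session 1": the planner persists what it discovered before its
// context window is gone.
writeFileSync(planPath, '# VWO Login Test Plan\n- TC-01: valid login\n');

// "Session 2": a fresh LLM context has zero memory -- but it can reload
// the plan from disk and continue where the last session stopped.
const recalled = readFileSync(planPath, 'utf8');
console.log(recalled.includes('TC-01')); // true
```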
Why this matters for SDET work: An LLM alone is a text transformer. It can describe a test β€” it cannot run one, verify a selector exists, or confirm a button is actually clickable. The moment you add tools (MCP, browser control, file I/O), you convert the LLM from a text generator into an agent that acts on the real world. That gap between "generating test ideas" and "generating verified, runnable tests" is exactly what Playwright AI Agents bridge.
AI Agent Architecture β€” Think, Act, Observe
What makes something an agent rather than just an LLM call
🧠
LLM (Brain)
Receives the prompt + tool results. Reasons about what to do next. Decides which tool to call and with what arguments. Produces the plan or code output.
Claude Sonnet / GPT-4
πŸ”§
Tools (Hands)
Browser control, file read/write, API calls, terminal commands. Tools are the only way the LLM can affect the outside world. Without tools it can only produce text.
MCP servers, browser_*, file I/O
πŸ’Ύ
Memory (State)
Context window (short-term) + file system (long-term). The agent writes its discoveries to disk so later steps can read them. Specs, screenshots, test files are all memory.
specs/, tests/, snapshots/
THE AGENT LOOP β€” THINK β†’ ACT β†’ OBSERVE β†’ REPEAT
STEP 1
Think
LLM reads prompt + context, decides next action
β†’
STEP 2
Call Tool
browser_snapshot(), browser_click(), write_file()
β†’
STEP 3
Observe
Tool result injected into context. LLM reads it.
β†’
STEP 4
Decide
Done? β†’ Output. Not done? β†’ back to Step 1.
Not an agent β€” single call
// Ask Claude to write a test β€” one shot
"Write a Playwright test for VWO login"
// β†’ Claude produces text. Done.
// No browser opened, no selector verified,
// no guarantee it actually works.
Agent β€” tool loop
// Planner agent loop
planner_setup_page() β†’ runs seed.spec.ts
browser_snapshot() β†’ reads live DOM
browser_click("Forgot Password")
browser_snapshot() β†’ reads new state
write_file("specs/plan.md", plan)
// Verified against real page. Saved to disk.
The formula: Agent = LLM + Tools + Memory + Loop. Remove any one of the four and you no longer have an agent β€” you have a text generator. The Playwright AI Agents (planner, generator, healer) implement all four: Claude is the LLM, MCP tools are the hands, specs/ and tests/ are the memory, and the planner β†’ generator β†’ healer sequence is the loop.
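The four-part formula can be sketched as a minimal loop. Everything here is a stub for illustration — a real agent calls Claude and MCP tools, not hardcoded functions:

```typescript
// The "LLM" either picks a tool or declares itself done.
type ToolCall = { tool: string; args: Record<string, string> } | { done: string };

// Tools (hands): stubbed stand-ins for browser_snapshot / write_file.
const tools: Record<string, (args: Record<string, string>) => string> = {
  browser_snapshot: () => 'form: email, password, sign-in button',
  write_file: (args) => `wrote ${args.path}`,
};

// "LLM" (brain): decides the next action from observations so far.
function think(observations: string[]): ToolCall {
  if (observations.length === 0) return { tool: 'browser_snapshot', args: {} };
  if (observations.length === 1)
    return { tool: 'write_file', args: { path: 'specs/plan.md' } };
  return { done: 'plan saved' };
}

function runAgent(): string {
  const memory: string[] = []; // state carried between steps
  for (let step = 0; step < 10; step++) {
    const action = think(memory);                   // THINK
    if ('done' in action) return action.done;       // DECIDE: finished
    const result = tools[action.tool](action.args); // ACT
    memory.push(result);                            // OBSERVE
  }
  return 'step limit reached';
}

console.log(runAgent()); // plan saved
```

Remove the loop and you get the one-shot call on the left; remove `memory` and step 2 can't see what step 1 found. That is the whole argument of the formula in about thirty lines.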
Playwright AI Agents β€” Planner, Generator, Healer
Microsoft's built-in agent system for autonomous test creation and self-healing
AGENT 1 β€” PLANNER
Explores β†’ Plans
Calls planner_setup_page β†’ runs seed.spec.ts
browser_snapshot β†’ reads live DOM structure
Navigates all flows β€” login, errors, edge states
Writes human-readable Markdown test plan
Input: seed.spec.ts + your prompt
Output: specs/vwo_login_plan.md
AGENT 2 β€” GENERATOR
Plan β†’ Code
Reads specs/vwo_login_plan.md
Calls generator_setup_page β†’ opens browser
Verifies every selector live before writing
Writes TypeScript spec files with assertions
Input: specs/vwo_login_plan.md
Output: tests/login/*.spec.ts
AGENT 3 β€” HEALER
Fails β†’ Fixes
Receives failing test name + error output
Replays failing steps in live browser
Inspects current DOM β€” finds correct selector
Patches the spec file and re-runs until green
Input: failing test + error message
Output: patched passing spec file
THE SEED FILE β€” MOST MISUNDERSTOOD CONCEPT
seed.spec.ts is NOT a test β€” it is a browser bootstrap. Before the planner or generator starts exploring, it calls planner_setup_page which runs seed.spec.ts first. This opens a browser, navigates to the target URL, and then calls page.pause() β€” handing the live browser session to the agent.
Without page.pause(), the browser closes as soon as the test ends. The agent has nothing to explore. The pause keeps the session alive and transfers control.
// seed.spec.ts β€” the handshake
test('seed', async ({ page }) => {
  await page.goto('/#/login');
  await page.waitForLoadState('networkidle');

  // confirm page is ready
  await expect(page.locator('form').first()).toBeVisible();

  await page.pause();
  // ↑ agent takes control here
  // browser stays open
  // agent starts exploring
});
Why init-agents? Running npx playwright init-agents --loop=claude writes three Markdown files into .claude/agents/. These are agent definition files β€” they contain the system prompts and tool lists that tell Claude Code how to behave as a planner, generator, or healer. Claude Code reads them automatically when you open the project. You never edit them β€” regenerate when Playwright is updated.
Playwright Agents vs Playwright MCP β€” Why They Are Different
Both use MCP under the hood β€” but they solve completely different problems
Playwright MCP Week 1a β€” exploration
Purpose: Let an AI agent explore a live app interactively
YOU give a natural language instruction per step
Claude Desktop calls browser_snapshot, browser_click
Snapshot injected into context window each call
Output: you read the response and decide next step
~115K tokens per 30 actions β€” context fills fast
No structured output β€” conversational, ad hoc
Used for: Phase 1 requirement extraction, JIRA tickets
Playwright AI Agents Week 3/4 β€” autonomous
Purpose: Autonomously plan, generate, and heal tests
YOU give ONE high-level prompt β€” agent decides all steps
Agent orchestrates planner_setup_page + browser tools
Agent loop: Think β†’ Tool call β†’ Observe β†’ Repeat
Output: structured files β€” specs/*.md + tests/*.spec.ts
Agent definitions in .claude/agents/ guide behaviour
Deterministic β€” same input β†’ same structured output
Used for: all 6 STLC phases, fully automated
THEY BOTH USE MCP β€” SO WHAT'S DIFFERENT?
Playwright MCP is a server β€” it exposes browser control tools (browser_snapshot, browser_click, browser_fill) via the MCP protocol. Any MCP client can use it.

Playwright AI Agents are clients with structured roles. The planner agent calls planner_setup_page which internally uses the same MCP browser tools β€” but wraps them in a deliberate loop with a defined output format (Markdown plan). The generator similarly uses generator_setup_page to produce TypeScript files.

Analogy: MCP is electricity. The agents are appliances. The planner is a camera that uses electricity to take a structured photo. The generator is a printer that uses electricity to produce a document. Both use the same power source β€” but they do completely different jobs.
MCP alone
Interactive, conversational. You drive every step. Flexible but manual. Good for exploration and one-off tasks.
Agents using MCP
Autonomous, structured. Agent drives all steps. Consistent output format. Good for repeatable workflows like STLC.
Both together
Use MCP for interactive exploration (Week 1a), then agents for systematic generation (Week 3/4). Different phases of the same STLC.
The interview answer: "Playwright MCP is a browser control server β€” it exposes tools any AI can call. Playwright AI Agents are structured workflows built on top of MCP. The planner agent uses MCP browser tools internally but wraps them in a deliberate loop that produces a Markdown test plan. The generator converts that plan into verified TypeScript tests by checking every selector live. The healer uses the same tools to replay failures and patch broken locators. They are not alternatives β€” they are layers. MCP is the infrastructure. Agents are the application built on it."
Visual Regression Testing β€” toHaveScreenshot()
The Week 3/4 key addon β€” pixel-level UI verification that no previous approach covers
Functional test what it can't catch
Login button text changed from "Sign in" to "Log in"
Error message colour changed from red to orange
Input field border disappeared in a CSS deploy
Password field moved 20px to the right on mobile
VWO logo replaced with placeholder image
All functional tests still PASS despite these issues
Visual regression what it catches
Pixel-level diff β€” any visual change triggers failure
Baseline PNG stored in repo β€” version controlled
Diff image shows exactly what changed in red
Runs in CI on every push β€” catches regressions before merge
Clips to stable elements β€” excludes dynamic backgrounds
OS + browser tagged β€” chromium-win32.png, chromium-linux.png
TWO PHASES β€” HOW toHaveScreenshot() WORKS
PHASE 1 β€” BASELINE CREATION (first run)
No PNG exists yet. Playwright takes a screenshot and saves it to tests/visual/login_visual.spec.ts-snapshots/. Test "fails" with message "snapshot doesn't exist, writing actual". This is correct β€” run --update-snapshots to promote to baseline.
PHASE 2 β€” COMPARISON (every run after)
Baseline exists. Playwright takes a new screenshot and compares pixel by pixel against the stored PNG. If difference exceeds maxDiffPixels: 200, test FAILS with a diff image showing changed pixels highlighted in red/pink.
THE VWO ANIMATED BACKGROUND PROBLEM β€” AND HOW WE SOLVED IT
THE PROBLEM
VWO login has a CSS animated background that changes every render. Full-page screenshots showed 65,000–69,000 pixel diffs between runs taken seconds apart β€” not because the UI changed, but because the background animation was at a different frame.
THE FIX β€” clip to form bounding box
const form = page.locator('form').first();
const box = await form.boundingBox();
await expect(page).toHaveScreenshot({
  clip: box ?? undefined,
  maxDiffPixels: 200,
});
Result: only the login form is captured. The animated background is outside the clip rectangle β€” it never appears. 3/3 tests now pass stably across runs. This is documented engineering decision-making β€” not just "it works now."
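The thresholds can also be set project-wide rather than per test. A hedged sketch of a playwright.config.ts fragment (values illustrative — the repo may keep these per-test instead). Note that toHaveScreenshot disables CSS animations by default, which alone was not enough for this background, hence the clip:

```typescript
// playwright.config.ts (sketch) -- project-wide screenshot settings so
// individual tests don't repeat maxDiffPixels everywhere.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  expect: {
    toHaveScreenshot: {
      maxDiffPixels: 200,      // tolerate tiny anti-aliasing noise
      animations: 'disabled',  // freeze CSS animations before capture
    },
  },
});
```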
TC-VR-01
Default login state
vwo-login-default-chromium-win32.png
TC-VR-02
Error state after bad login
vwo-login-error-state-chromium-win32.png
TC-VR-03
Email field filled state
vwo-login-email-filled-chromium-win32.png
Why visual regression is the right Week 3/4 addon: Week 2 closed the API testing gap. Week 3/4 closes the visual regression gap. Together: functional UI tests (Weeks 1-3), API contract tests (Week 2), visual regression (Week 3/4). That is a complete test pyramid. No previous approach in this portfolio covers what a pixel-level regression looks like β€” and 80% of SDET job descriptions for product companies mention it.
Portfolio Progression β€” 5 Approaches
WEEK 0 Β· APPROACH 1
Block_A_Manual
Manual STLC β€” all 6 phases on VWO Login PRD. No automation. Pure QA process thinking.
Manual STLC PRD
Done
WEEK 1A Β· APPROACH 2
STLC_MCP_Project
Playwright MCP + JIRA MCP. 43 DOM elements. 13 tests, 5 browsers. KAN-1 via MCP. 4.5Γ— speed.
MCP JIRA AI Agent
Done
WEEK 1B Β· APPROACH 3
STLC_Standard_CLI
Standard Playwright CLI. POM + TypeScript. 18/18 tests, 3 browsers, GitHub Actions CI. KAN-1.
POM TypeScript CI/CD
Done
WEEK 2 Β· APPROACH 4
Playwright_CLI
UI + API testing. request fixture. testData.ts. Dual config. 20/20 tests. KAN-2 via JIRA MCP.
API Testing CRUD KAN-2
Done
LATEST
WEEK 3/4 Β· APPROACH 5
Playwright_AI_Agents
Planner + Generator + Healer agents. Visual regression. seed.spec.ts. 3/3 VR tests passing.
AI Agents Visual Reg Self-Heal
In Progress
Week 1B β€” STLC Standard CLI
APPROACH 3 Β· VWO LOGIN Β· PLAYWRIGHT + TYPESCRIPT
STLC_Standard_CLI/
POM Β· 3 Browsers Β· GitHub Actions Β· 18/18 passing
18/18
tests passed
3
browsers
KAN-1
bug logged
6
STLC phases
PHASE 01
Requirement Analysis
12 REQs from VWO login. RTM. EP + BVA test design coverage mapped.
PHASE 02
Test Planning
Scope, 5 risks, entry/exit criteria. retries:1 for CI flakiness mitigation.
PHASE 03
Test Case Design
6 TC IDs β€” TC-01 to TC-06. EP partitions + BVA boundaries. Full format.
PHASE 04
Test Automation
POM + getByRole locators. fixture injection. beforeEach. 3 browser projects.
PHASE 05
Bug Reporting
KAN-1: password field no visibility toggle. Severity Medium, Priority Low.
PHASE 06
Test Closure
18/18 passing. Chromium + Firefox + WebKit. CI green. RTM 100% traced.
What this proves: You can build a production-grade Playwright framework from a blank folder β€” no generator, no plugin, no AI assistance. Every file written with full understanding of why each line exists. The RTM chain is complete: requirement β†’ test case β†’ automated test β†’ HTML report row β†’ CI green.
⬑ STLC_Standard_CLI/ β†— πŸ› KAN-1 β†—
Week 2 β€” Playwright CLI (UI + API)
APPROACH 4 Β· VWO LOGIN UI + REQRES API Β· PLAYWRIGHT + TYPESCRIPT
Playwright_CLI/
request fixture Β· testData.ts Β· dual config Β· 20/20 passing Β· JIRA KAN-2
20/20
tests passed
3.9s
API suite (10 tests)
14Γ—
API vs UI speed
KAN-2
bug via JIRA MCP
UI SUITE β€” tests/ui/vwo_login.spec.ts
βœ“ TC-UI-01 β€” smoke: all elements visible
βœ“ TC-UI-02 to 04 β€” EP: valid/invalid credentials
βœ“ TC-UI-05 to 06 β€” BVA: empty/partial inputs
βœ“ TC-UI-07 β€” SQL injection input (edge)
βœ“ TC-UI-08 β€” 500-char boundary string
βœ“ TC-UI-09 β€” special chars in password
βœ“ TC-UI-10 β€” whitespace-only inputs
10 tests Β· 53.9s Β· page fixture Β· browser
API SUITE β€” tests/api/ (reqres.in)
βœ“ TC-API-01 β€” POST /login valid β†’ 200 + token
βœ“ TC-API-02 β€” POST /login missing password β†’ 400
βœ“ TC-API-03 β€” POST /login wrong creds β†’ 400
βœ“ TC-API-04 β€” POST /register valid β†’ 200 + id
βœ“ TC-API-05 β€” POST /register missing pw β†’ 400
βœ“ TC-API-06/07 β€” GET users list + single
βœ“ TC-API-08/09/10 β€” 404 Β· PUT Β· DELETE 204
10 tests Β· 3.9s Β· request fixture Β· no browser
NEW CONCEPTS IN WEEK 2 (vs Week 1)
request fixture GET POST PUT DELETE testData.ts interfaces extraHTTPHeaders dual project config dotenv + GitHub Secrets 3-level assertions 204 no-content rule schema validation KAN-2 via JIRA MCP
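The "dual project config" chip above can be sketched as follows. Names, URLs, and the env-var are illustrative: one project runs UI specs in a browser against VWO, the other runs API specs browserless against reqres.in, with `extraHTTPHeaders` showing where a shared header such as an API key would go.

```typescript
// playwright.config.ts (sketch) -- two projects, two layers, one framework.
import { defineConfig } from '@playwright/test';

export default defineConfig({
  projects: [
    {
      name: 'ui',
      testDir: './tests/ui',
      use: { baseURL: 'https://app.vwo.com' }, // page fixture resolves goto('/#/login')
    },
    {
      name: 'api',
      testDir: './tests/api',
      use: {
        baseURL: 'https://reqres.in',
        // Header sent on every request.* call -- e.g. an API key loaded
        // from .env locally and GitHub Secrets in CI (illustrative name).
        extraHTTPHeaders: { 'x-api-key': process.env.REQRES_API_KEY ?? '' },
      },
    },
  ],
});
```

Run them separately with `npx playwright test --project=api` (3.9s) or together in one CI job.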
KAN-2 β€” logged via JIRA MCP: POST /api/register returns 200 instead of 201. Per RFC 7231, resource creation should return 201 Created. The bug test intentionally FAILS β€” that is the correct result. It proves the bug exists. The companion test PASSES and documents actual behaviour. Same JIRA MCP approach used for KAN-1 in Week 1.
// page fixture β€” browser opens, tests DOM
test('TC-UI-01', async ({ page }) => {
  await loginPage.navigate();
  await expect(loginPage.emailInput).toBeVisible();
});

// request fixture β€” NO browser, direct HTTP
test('TC-API-01', async ({ request }) => {
  const response = await request.post('/api/login', {
    data: apiData.validLogin
  });
  expect(response.status()).toBe(200);           // Level 1
  const body = await response.json();
  expect(body.token).toBeDefined();              // Level 2
  expect(typeof body.token).toBe('string');      // Level 3 schema
});
⬑ Playwright_CLI/ β†— πŸ› KAN-2 β†— ⬑ Full Portfolio β†—
Week 3/4 β€” Playwright AI Agents + Visual Regression
APPROACH 5 Β· VWO LOGIN Β· PLANNER + GENERATOR + HEALER + VISUAL REGRESSION
Playwright_AI_Agents/
Built-in AI agents Β· toHaveScreenshot() Β· seed.spec.ts Β· self-healing loop
3/3
visual tests
3
AI agents
PNG
baselines committed
TBD
agent-gen tests
AGENT 1
Planner
Navigates live DOM via planner_setup_page. Runs seed.spec.ts first. Writes Markdown test plan to specs/.
Input: seed + prompt β†’ Output: vwo_login_plan.md
AGENT 2
Generator
Reads plan, opens browser via generator_setup_page. Verifies selectors live. Writes TypeScript spec files.
Input: vwo_login_plan.md β†’ Output: tests/login/*.spec.ts
AGENT 3
Healer
Replays failing steps. Inspects current DOM. Patches locator or assertion. Re-runs until passing. Self-healing automation.
Input: failing test β†’ Output: patched passing test
seed.spec.ts is not a regular test. Before the planner or generator explores the browser, it runs seed.spec.ts via the planner_setup_page and generator_setup_page tools. The seed navigates to the target and calls page.pause() β€” keeping the browser alive and handing the session to the agent to explore. Without pause() the browser closes immediately.
VISUAL REGRESSION β€” toHaveScreenshot() β€” KEY ADDON
TC-VR-01
Default page state β€” form clipped baseline
vwo-login-default-chromium-win32.png
TC-VR-02
Error state after invalid login β€” form baseline
vwo-login-error-state-chromium-win32.png
TC-VR-03
Email field filled β€” form baseline
vwo-login-email-filled-chromium-win32.png
VWO has a dynamic animated background β€” clipping to form bounding box gives stable baselines. 3/3 passing. PNG files committed to repo. CI compares on every push.
# How visual regression works β€” two phases

# Phase 1 β€” create baselines (first run)
npx playwright test tests/visual/ --update-snapshots
# β†’ saves vwo-login-default-chromium-win32.png to snapshots/

# Phase 2 β€” comparison (every run after)
npx playwright test tests/visual/
# β†’ compares pixel-by-pixel against baseline
# β†’ fails with diff image if VWO changes their UI

# seed.spec.ts β€” hands browser to agent
test('seed', async ({ page }) => {
  await page.goto('/#/login');
  await page.waitForLoadState('networkidle');
  await page.pause(); // ← agent takes control here
});
⬡ Playwright_AI_Agents/ ↗ · Playwright Agents Docs ↗