
Written by: Anastasiia Sokolinska, Chief Operating Officer
Posted: 23.06.2025
15 min read
Most test automation projects break down in year two, not year one. At 50 tests, the setup looks manageable. By the time the suite hits 300, auth logic is buried in every beforeEach, environment configs have drifted across three separate files, and CI takes 40 minutes without a clear reason why. The framework wasn't designed badly. It was built reactively, one test at a time, without architectural intent from the start.
This guide covers how to set up a Playwright testing framework the way senior engineers do it: starting with the decisions that determine long-term maintainability, then building each layer — config, structure, POM, fixtures, and CI — so nothing needs to be rebuilt at 200 tests.
Before you run npm init — decisions that lock in your architecture
The npm init playwright@latest command takes 30 seconds. The decisions you make before running it shape whether your framework survives to 300 tests.
Three choices deserve explicit answers before you write a single line of code.
TypeScript or JavaScript? TypeScript, always, for any team with more than one contributor maintaining tests over time. The reason isn't aesthetics. Type safety in fixture composition catches mismatches between fixture definitions and their consumers at compile time, not at 3am when CI fails. IDE support for page object refactoring means renaming a method in BasePage.ts propagates automatically instead of breaking 40 test files silently. For mid-market SaaS teams specifically, the maintenance overhead of untyped fixture chains compounds fast.
Standalone repo or monorepo? The realistic options are a dedicated qa-automation repo or a tests/ directory inside the main application repo. A standalone repo gives clearer ownership, simpler CI wiring, and prevents test engineers from accidentally importing application internals. A tests/ directory inside the app repo makes environment variables accessible without extra configuration and tightens the feedback loop when UI and test code change together. The trade-off is proximity to source versus separation of concerns. For teams where QA and development run on different release cycles, the standalone repo wins. For teams where tests and features ship together, co-location reduces friction.
One playwright.config.ts or many? One, always. Separate config files per environment (staging.config.ts, prod.config.ts) seem like a clean abstraction until they diverge. Someone updates the timeout in staging's config and forgets production. A new projects entry gets added to one file and not the other. A single environment-aware config driven by process.env.TEST_ENV gives you one source of truth and makes every environment switch a one-variable change. The rest of this article builds on that assumption.
If you've just taken over a repository where these decisions were never made explicitly, you'll find the consequences scattered across the codebase: magic strings where base URLs should be, beforeEach blocks handling auth that should live in a fixture, and a CI config that runs playwright test with no environment context. That's the shape of a framework assembled without a plan. Everything that follows is what a plan looks like.
Project structure: what goes where and why
Every framework tutorial shows a folder tree. Few explain what breaks when you ignore it.

Here's the structure that holds up past 300 tests:
playwright-framework/
├── tests/ # Spec files only — no logic, no data setup
├── pages/ # Page Object Model classes, one per route or feature area
├── fixtures/ # Custom fixture definitions: auth, API seeding, page composition
├── utils/ # Pure functions: data generators, date helpers, API clients
├── config/ # Environment maps, base URL resolution, feature flag toggles
├── auth/ # Saved auth state (gitignored — never commit credentials)
├── .env.staging # Environment-specific secrets — never committed
├── playwright.config.ts # Single source of truth for all environments
└── package.json
The separation of concerns principle here is specific: spec files should read like test descriptions, not setup scripts. If a test file imports an API client directly, creates its own data, and handles its own auth, it's doing four jobs at once. That's the pattern you'll find in a suite with auth scattered across every beforeEach — and it's also why adding a new authenticated role to that suite means touching 40 files instead of one fixture.
The anti-patterns that break frameworks at scale follow directly from ignoring this structure. Test data hardcoded in spec files makes it impossible to run the same test against staging and production without manual edits. No fixtures/ directory means auth logic lives in beforeEach hooks that get copied, drifted, and eventually contradict each other. Mixing utils/ with pages/ means a refactoring task that should touch five files touches twenty.
The folder tree is not bureaucracy. It's the scaffolding that makes the fixture composition section below possible.
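As a rough illustration of what belongs in utils/, here is a minimal sketch of a pure data generator that spec files and fixtures could share instead of hardcoding test data. The file name, the TestUser shape, and the buildUser helper are assumptions made for this example, not part of any existing codebase.

```typescript
// utils/userFactory.ts (hypothetical file name)
import { randomUUID } from 'node:crypto';

// Assumed payload shape; adjust to whatever your API actually expects.
export interface TestUser {
  email: string;
  role: 'member' | 'admin';
  displayName: string;
}

// Build a user payload with unique defaults.
// A test overrides only the fields it actually asserts on.
export function buildUser(overrides: Partial<TestUser> = {}): TestUser {
  // A random suffix avoids collisions between parallel workers
  const id = randomUUID().slice(0, 8);
  return {
    email: `seed-${id}@example.com`,
    role: 'member',
    displayName: `Seed User ${id}`,
    ...overrides,
  };
}

const memberUser = buildUser();
const adminUser = buildUser({ role: 'admin' });
console.log(memberUser.role, adminUser.role);
```

Because the function has no side effects and no dependency on the environment, the same spec can run against staging or production without edits to its data.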
playwright.config.ts: the right way to wire it for multiple environments
This is where most frameworks make their first significant mistake: a single hardcoded base URL, no environment switching, and retry settings that are the same locally and in CI.
Here's a production-grade config that handles all three environments without duplication:
// playwright.config.ts
import { defineConfig, devices } from '@playwright/test';
import * as dotenv from 'dotenv';

const ENV = process.env.TEST_ENV ?? 'staging';
dotenv.config({ path: `.env.${ENV}` });

const BASE_URLS: Record<string, string> = {
  dev: 'http://localhost:3000',
  staging: 'https://staging.yourapp.com',
  production: 'https://yourapp.com',
};

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  // Prevent focused tests from being accidentally committed to CI
  forbidOnly: !!process.env.CI,
  // Retries in CI catch genuine flakiness. Locally, they mask real failures.
  retries: process.env.CI ? 2 : 0,
  // workers: 1 in CI for stability in non-sharded runs.
  // See the CI section for when to flip to sharding instead.
  workers: process.env.CI ? 1 : undefined,
  // blob reporter is mandatory for shard merging: do not use html in CI
  reporter: process.env.CI ? 'blob' : 'html',
  // Global setup runs once before all tests to generate auth/user.json
  globalSetup: './global-setup',
  use: {
    baseURL: BASE_URLS[ENV] ?? BASE_URLS['staging'],
    // Default auth state for all tests, overridable per test via fixture
    storageState: 'auth/user.json',
    // Capture trace on first retry so you have Trace Viewer data for failures
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  projects: [
    {
      // Primary CI target
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
    },
    {
      // Regression suite
      name: 'firefox',
      use: { ...devices['Desktop Firefox'] },
    },
    {
      // Optional: add when WebKit coverage is a product requirement
      name: 'webkit',
      use: { ...devices['Desktop Safari'] },
    },
  ],
});
A few decisions worth naming explicitly. Setting retries: 0 locally matters: if a test is flaky in your local environment, you want it to fail, not quietly pass on the second attempt. A flaky rate above 2% signals that CI trust is already eroding — retries in local dev make that threshold invisible until the problem is much larger.
The reporter: 'blob' setting in CI is not optional if you're using sharding. Blob reporter outputs a binary format that the merge-reports command combines across shards. Switch to html in CI and your merge step will fail. Allure is a reasonable alternative reporter for teams that want richer dashboards — but it requires a separate merge pipeline and is worth adding only after the base setup is stable.
Switch environments by changing a single variable: TEST_ENV=production npx playwright test. Nothing else changes.
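One caveat with a lookup that falls back to staging: a typo such as TEST_ENV=prod would quietly run against staging. A possible refinement, sketched here under the assumption that you would rather fail fast, is a small helper that validates the variable before the config uses it. The resolveEnv name and the config/env.ts location are hypothetical.

```typescript
// config/env.ts (hypothetical helper; names are illustrative)
const BASE_URLS = {
  dev: 'http://localhost:3000',
  staging: 'https://staging.yourapp.com',
  production: 'https://yourapp.com',
} as const;

type EnvName = keyof typeof BASE_URLS;

// Type guard: narrows an arbitrary string to one of the known environment names
function isEnvName(value: string): value is EnvName {
  return Object.prototype.hasOwnProperty.call(BASE_URLS, value);
}

// Resolve TEST_ENV to a name and base URL, throwing on unknown values
// instead of silently falling back to staging.
export function resolveEnv(raw: string | undefined): { name: EnvName; baseURL: string } {
  const name = raw ?? 'staging'; // same default as the config: unset means staging
  if (!isEnvName(name)) {
    const known = Object.keys(BASE_URLS).join(', ');
    throw new Error(`Unknown TEST_ENV "${name}". Expected one of: ${known}`);
  }
  return { name, baseURL: BASE_URLS[name] };
}

console.log(resolveEnv('production').baseURL);
```

In playwright.config.ts this would replace the inline lookup, for example const { name, baseURL } = resolveEnv(process.env.TEST_ENV), so a misspelled environment stops the run before any test starts.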
If your team is already maintaining separate config files per environment, that's the first thing worth fixing before the suite grows further.
Page Object Model: what it actually looks like in TypeScript
The standard POM tutorial shows a LoginPage with a username field, a password field, and a submit button. That tells you the concept, not the pattern.
Real page objects in a SaaS application share behavior: every page needs to wait for network activity, assert toast notifications, and handle navigation. Without a shared base class, those methods get copied into every page object, diverge slightly over time, and require the same fix applied in twelve places when the toast component changes.
Start with a BasePage that every other page class inherits from:
// pages/BasePage.ts
import { Page, expect, Locator } from '@playwright/test';

export class BasePage {
  constructor(protected page: Page) {}

  async navigate(path: string): Promise<void> {
    await this.page.goto(path);
    // Note: Playwright's docs discourage 'networkidle' as a readiness signal;
    // where possible, assert on a visible element of the target page instead.
    await this.page.waitForLoadState('networkidle');
  }

  async assertToastMessage(expected: string): Promise<void> {
    const toast: Locator = this.page.getByRole('alert');
    await expect(toast).toContainText(expected);
  }

  async waitForNetworkIdle(): Promise<void> {
    await this.page.waitForLoadState('networkidle');
  }
}
Then extend it for each feature area:
// pages/DashboardPage.ts
import { Page, expect, Locator } from '@playwright/test';
import { BasePage } from './BasePage';

export class DashboardPage extends BasePage {
  private readonly welcomeBanner: Locator;
  private readonly projectList: Locator;

  constructor(page: Page) {
    super(page);
    this.welcomeBanner = this.page.getByTestId('welcome-banner');
    this.projectList = this.page.getByRole('list', { name: 'projects' });
  }

  async assertWelcomeMessage(name: string): Promise<void> {
    await expect(this.welcomeBanner).toContainText(name);
  }

  async expectProjectCount(count: number): Promise<void> {
    await expect(this.projectList.getByRole('listitem')).toHaveCount(count);
  }
}
Note what's not here: no goto() calls inside spec files, no locators defined inline in tests, no page.locator('.dashboard-list > li') scattered across a dozen test files. When the projects list component gets refactored, you update DashboardPage.ts and nothing else breaks.
One note on when not to use POM: simple smoke tests or single-assertion health checks don't benefit from the abstraction overhead. If a test file will never exceed three assertions and has no shared state with other tests, writing a page object for it creates more maintenance than it prevents. Use the pattern where it earns its cost.
Fixtures: the layer most teams skip and then rebuild
If there's one architectural decision that separates a tutorial-level framework from a production one, it's fixture composition. Teams that skip fixtures end up with auth logic copy-pasted across beforeEach blocks, test data leaking between tests, and spec files that read like setup scripts rather than test descriptions.
Three fixture patterns belong in any SaaS application framework from day one.
The storageState auth fixture
The goal is to log in once, save the session to auth/user.json, and inject that authenticated state into every test that needs it without a single beforeEach. The mechanism is a global setup script combined with a fixture that reads the saved state:
// global-setup.ts
import { chromium, type FullConfig } from '@playwright/test';

async function globalSetup(config: FullConfig): Promise<void> {
  // Reuse the baseURL resolved in playwright.config.ts.
  // No BASE_URL variable is defined anywhere in this setup.
  const { baseURL } = config.projects[0].use;

  const browser = await chromium.launch();
  const context = await browser.newContext({ baseURL });
  const page = await context.newPage();

  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.TEST_USER_EMAIL!);
  await page.getByLabel('Password').fill(process.env.TEST_USER_PASSWORD!);
  await page.getByRole('button', { name: 'Sign in' }).click();
  await page.waitForURL('**/dashboard');

  // Save session cookies and localStorage to disk
  await context.storageState({ path: 'auth/user.json' });
  await browser.close();
}

export default globalSetup;
The playwright.config.ts line globalSetup: './global-setup' runs this once before the test suite starts. The use: { storageState: 'auth/user.json' } line in the config applies that saved state to every test by default.
If you've inherited a suite where 80 tests each call page.goto('/login') and fill credentials inside beforeEach, replacing all of that with this pattern is one of the highest-leverage refactors available.
The API seeding fixture
Flaky tests are often a state problem, not a timing problem. When tests share data that one test creates and another test expects to find, execution order matters and parallelism breaks things. The fix is test-scoped data that gets created before the test and destroyed after it:
// fixtures/seed.fixture.ts
import { test as base } from '@playwright/test';

type SeedFixtures = {
  seededUser: { id: string; email: string };
};

export const test = base.extend<SeedFixtures>({
  seededUser: async ({ request }, use) => {
    // Create isolated test data via internal API
    const response = await request.post('/api/test/users', {
      data: { email: `seed-${Date.now()}@example.com`, role: 'member' },
    });
    const user = await response.json();

    // Hand the data to the test
    await use(user);

    // Teardown runs automatically after the test completes
    await request.delete(`/api/test/users/${user.id}`);
  },
});
The use() call is the boundary: everything before it is setup, everything after it is teardown. Playwright guarantees the teardown runs even when the test fails. This is the pattern that eliminates the category of failures caused by dirty state from a previous test run.
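To make the use() boundary concrete, here is a deliberately simplified model of a fixture runner in plain TypeScript. It is not Playwright's implementation; runWithFixture and the fake seededUser fixture are hypothetical names used only to show why code after use() still runs when the test body throws.

```typescript
type Use<T> = (value: T) => Promise<void>;
type Fixture<T> = (use: Use<T>) => Promise<void>;

// Simplified model: run a test body inside a fixture and always reach teardown.
async function runWithFixture<T>(
  fixture: Fixture<T>,
  testBody: (value: T) => Promise<void>,
): Promise<void> {
  let failed = false;
  let testError: unknown;

  await fixture(async (value) => {
    try {
      await testBody(value);
    } catch (err) {
      // Remember the failure, but resolve use() so the fixture continues to teardown
      failed = true;
      testError = err;
    }
  });

  // Report the original failure only after teardown has finished
  if (failed) throw testError;
}

async function demo(): Promise<string[]> {
  const log: string[] = [];

  const seededUser: Fixture<{ id: string }> = async (use) => {
    log.push('setup'); // everything before use() is setup
    await use({ id: 'user-1' });
    log.push('teardown'); // everything after use() is teardown
  };

  try {
    await runWithFixture(seededUser, async (user) => {
      log.push(`test sees ${user.id}`);
      throw new Error('assertion failed');
    });
  } catch {
    log.push('test reported as failed');
  }
  return log;
}

demo().then((entries) => console.log(entries.join(' -> ')));
// prints: setup -> test sees user-1 -> teardown -> test reported as failed
```

The ordering is the point: the failure is reported only after the cleanup has run, so the next test starts from a clean state.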
Composing fixtures into a single import
This is where the architecture pays off. Rather than importing authFixture and seedFixture and DashboardPage separately in each spec file, you compose them once:
// fixtures/index.ts
import { test as base, expect, type Page } from '@playwright/test';
import { DashboardPage } from '../pages/DashboardPage';

type MyFixtures = {
  authenticatedPage: Page;
  seededUser: { id: string; email: string };
  dashboardPage: DashboardPage;
};

export const test = base.extend<MyFixtures>({
  authenticatedPage: async ({ browser }, use) => {
    const context = await browser.newContext({ storageState: 'auth/user.json' });
    const page = await context.newPage();
    await use(page);
    await context.close();
  },

  seededUser: async ({ request }, use) => {
    const res = await request.post('/api/test/users', {
      data: { email: `seed-${Date.now()}@example.com` },
    });
    const user = await res.json();
    await use(user);
    await request.delete(`/api/test/users/${user.id}`);
  },

  // dashboardPage depends on authenticatedPage; Playwright resolves the dependency graph
  dashboardPage: async ({ authenticatedPage }, use) => {
    await use(new DashboardPage(authenticatedPage));
  },
});

export { expect };
Every spec file then imports just two things:
import { test, expect } from '../fixtures';

// Illustrative: assumes the dashboard banner surfaces the seeded user's email
test('dashboard greets the seeded user', async ({ dashboardPage, seededUser }) => {
  await dashboardPage.navigate('/dashboard');
  await dashboardPage.assertWelcomeMessage(seededUser.email);
});
The spec file declares what it needs. The fixture layer handles everything else.
This is the pattern that separates a framework you can hand to five engineers from a framework only one person understands. Spec files should not orchestrate setup.
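The dependency resolution behind that composition can be pictured with a toy model. The sketch below is not Playwright's internals: it is a simplified, hypothetical resolver (runTest, FixtureDef, and the string values are all invented for illustration, and it has no cycle detection) that sets up only the fixtures a test requests, dependencies first, and tears them down in reverse order.

```typescript
interface FixtureDef {
  deps: string[];
  create: (deps: Record<string, unknown>) => unknown;
}

// Toy model: resolve requested fixtures depth-first, then tear down in reverse.
async function runTest(
  registry: Record<string, FixtureDef>,
  requested: string[],
  body: (fixtures: Record<string, unknown>) => Promise<void>,
  log: string[],
): Promise<void> {
  const ready = new Map<string, unknown>();
  const teardowns: Array<() => void> = [];

  async function resolve(name: string): Promise<unknown> {
    if (ready.has(name)) return ready.get(name); // each fixture is set up once per test
    const def = registry[name];
    if (!def) throw new Error(`Unknown fixture: ${name}`);

    const deps: Record<string, unknown> = {};
    for (const dep of def.deps) deps[dep] = await resolve(dep); // dependencies first

    log.push(`setup ${name}`);
    const value = await def.create(deps);
    ready.set(name, value);
    teardowns.push(() => log.push(`teardown ${name}`));
    return value;
  }

  try {
    const fixtures: Record<string, unknown> = {};
    for (const name of requested) fixtures[name] = await resolve(name);
    await body(fixtures);
  } finally {
    // Reverse order: dependents are torn down before what they depend on
    for (const teardown of teardowns.reverse()) teardown();
  }
}

const registry: Record<string, FixtureDef> = {
  authenticatedPage: { deps: [], create: () => 'page-with-session' },
  seededUser: { deps: [], create: () => ({ id: 'user-1' }) },
  dashboardPage: {
    deps: ['authenticatedPage'],
    create: (deps) => `dashboard on ${String(deps.authenticatedPage)}`,
  },
};

async function demoOrder(): Promise<string[]> {
  const log: string[] = [];
  await runTest(registry, ['dashboardPage'], async () => { log.push('test body'); }, log);
  return log;
}

demoOrder().then((entries) => console.log(entries.join(' -> ')));
```

In this run seededUser is never created because the test did not ask for it, which is why a large fixture file costs nothing for tests that use only a small part of it.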
Wiring up GitHub Actions: treat CI as day-one infrastructure
A framework that isn't CI-ready from the first commit will be retrofitted later. Retrofitting CI means touching every configuration assumption made during development and finding out which ones break at scale. Start with a production-grade CI setup before you have enough tests to need it.
The minimal working workflow: npm ci, npx playwright install --with-deps, npx playwright test, artifact upload. That covers the first 4 weeks.
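The production version adds sharding. Browser binary downloads run 200 to 400MB depending on the browser; caching them cuts 60 to 90 seconds from every CI run. The workflow below keys the cache on the package-lock.json hash, which changes whenever the pinned Playwright version does. Standard GitHub-hosted Ubuntu runners have a small, fixed core count (2 or 4 vCPUs depending on the repository type, at the time of writing), which matters when you're deciding how many workers to run per shard.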
# .github/workflows/playwright.yml
name: Playwright Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    name: "Tests - shard ${{ matrix.shardIndex }}/${{ matrix.shardTotal }}"
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shardIndex: [1, 2, 3, 4]
        shardTotal: [4]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Cache Playwright browsers
        uses: actions/cache@v4
        id: playwright-cache
        with:
          path: ~/.cache/ms-playwright
          key: playwright-${{ hashFiles('**/package-lock.json') }}
      - name: Install dependencies
        run: npm ci
      - name: Install Playwright browsers
        if: steps.playwright-cache.outputs.cache-hit != 'true'
        run: npx playwright install --with-deps
      - name: Install system dependencies only
        if: steps.playwright-cache.outputs.cache-hit == 'true'
        run: npx playwright install-deps
      - name: Run tests
        run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
        env:
          TEST_ENV: staging
          TEST_USER_EMAIL: ${{ secrets.TEST_USER_EMAIL }}
          TEST_USER_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}
      - name: Upload blob report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: blob-report-${{ matrix.shardIndex }}
          path: blob-report
          retention-days: 1

  merge-reports:
    name: Merge reports
    if: always()
    needs: [test]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - name: Install dependencies
        run: npm ci
      - name: Download blob reports
        uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true
      - name: Merge into HTML report
        run: npx playwright merge-reports --reporter html ./all-blob-reports
      - name: Upload HTML report
        uses: actions/upload-artifact@v4
        with:
          name: html-report
          path: playwright-report
          retention-days: 14
The workers: 1 setting in playwright.config.ts applies to every run where CI is set, which keeps non-sharded runs stable. Sharded jobs inherit it too: each shard runs its portion of the test suite with a single worker unless you raise the value for sharded runs, for example with the --workers CLI flag. The practical rule: keep workers: 1 in CI until your serial run time exceeds 10 minutes, then introduce sharding on 4 shards. The overhead of spinning up four GitHub Actions runners is not worth it below that threshold.
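Because the config keys workers off CI alone, one way to give sharded jobs a different worker count is a small helper that the config calls. This is a sketch under assumptions: PW_SHARDED is a hypothetical variable that the sharded workflow would have to set itself, resolveWorkers is an invented name, and the value 2 is a placeholder to tune against your runner's core count.

```typescript
// config/workers.ts (hypothetical helper)
type Env = Record<string, string | undefined>;

// Decide the worker count from the environment.
// Returning undefined lets Playwright fall back to its own default.
export function resolveWorkers(env: Env): number | undefined {
  if (!env.CI) return undefined; // local run: use Playwright's default
  if (env.PW_SHARDED === 'true') return 2; // sharded CI job (assumed value, tune per runner)
  return 1; // single serial CI job, as in the config above
}

console.log(resolveWorkers({ CI: 'true' }));
console.log(resolveWorkers({ CI: 'true', PW_SHARDED: 'true' }));
```

The config line would then read workers: resolveWorkers(process.env), and the sharded workflow would add PW_SHARDED: 'true' to the env block of its test step.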
The blob reporter is non-negotiable in the sharded setup. Each shard writes a binary blob to blob-report/. The merge-reports job collects all four blobs and produces a single HTML report. Switch any shard to html reporter and the merge step breaks silently.
One hard requirement for CI: forbidOnly must be true. The config above derives it from process.env.CI, which GitHub Actions sets automatically, so confirm the variable is present if you run on another CI provider. With it enabled, a stray test.only() fails the run instead of reaching the main branch and quietly reducing the CI run to that one focused test. It has happened to every team that didn't enforce it.
Framework readiness checklist: how to know you're done
A framework is ready to hand off to a team when it passes every one of these conditions, not "most of them":
- npx playwright test runs clean on a fresh clone with no manual steps beyond npm ci and setting the env variables documented in a .env.example file.
- Switching environments takes a single variable, for example TEST_ENV=production.
- Auth runs through the storageState fixture: no beforeEach block in any spec file performs a login.
- Failed CI runs produce a trace via trace: 'on-first-retry', so engineers can diagnose failures without rerunning locally.
- forbidOnly: true and retries: 2 are active in the CI environment.
- pages/, fixtures/, and utils/ directories each contain at least one real, populated class, not empty placeholder files that signal the structure was created but not followed.
The last point is worth naming explicitly. The most common failure mode after following a framework setup article is creating the folder structure, then writing every test in tests/ without ever populating fixtures/ or pages/. The structure is only useful if the conventions are followed. A readiness review before a team starts writing tests in earnest is worth an hour of your time.
The framework is the foundation, not the feature
Frameworks assembled reactively — test by test, problem by problem — accumulate decisions that contradict each other. The config has three different base URL strings. Auth lives in beforeEach and also in a fixture someone added in month three. CI was added last and runs serially because nobody had time to wire sharding properly.
The cost of that approach isn't visible at 80 tests. By 300, it owns your maintenance schedule.
The architectural decisions covered here — TypeScript from the start, a single environment-aware config, fixture-driven auth and data seeding, CI as day-one infrastructure — each cost an hour at setup time. Together, they determine whether your framework is something your team trusts and extends, or something they work around.
If you're inheriting a framework that doesn't match any of this — or starting from scratch and want the architecture right the first time — DeviQA's test automation engineers build production-ready Playwright setups as part of our QA service engagements.

About the author
Chief Operating Officer
Anastasiia Sokolinska is the Chief Operating Officer at DeviQA, responsible for operational strategy, delivery performance, and scaling QA services for complex software products.