Testing – Arc Rules

Strategy

MUST: Write unit tests for core logic.
SHOULD: Write integration tests for features that cross boundaries.
MUST: Write E2E tests with Playwright for critical user flows.
SHOULD: Co-locate test files with source or use __tests__ directories consistently.
SHOULD: Run pnpm test before merging.

MUST: Use Vitest for unit and integration tests.
SHOULD: Use Browser Mode (@vitest/browser-playwright) for component tests that need a real DOM.
SHOULD: Use expect.element() with toBeInViewport() for visibility assertions in browser mode.

MUST: Always await or return promises in tests. A forgotten await lets the test finish before its assertions run, which produces a silent false pass.
MUST: Use vi.hoisted() for variables referenced inside vi.mock(). Mock calls are hoisted above imports, so normal const declarations aren't available yet.
MUST: Use vi.mocked(fn) to access mock methods with full TypeScript types instead of casting.
SHOULD: Use happy-dom over jsdom for component tests. It is significantly faster and sufficient for most cases.
SHOULD: Use vi.useFakeTimers() for time-dependent code (debounce, throttle, setTimeout). Call vi.useRealTimers() in afterEach.
SHOULD: Use expect.assertions(N) in async tests to catch cases where assertions never execute.
SHOULD: Use a // @vitest-environment jsdom comment to override the environment per file when most tests use node.
SHOULD: Use --shard=1/N in CI to distribute tests across parallel runners.

MUST: Use data-testid attributes for E2E selectors.
MUST: Use kebab-case for test IDs, matching component filenames. See react.md.
NEVER: Select by text content, CSS classes, or DOM structure. These change frequently.
SHOULD: Use semantic locators (getByRole, getByLabel) for accessible elements.
SHOULD: Prefix child element test IDs with the parent component name.
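
The naming rules above (kebab-case, child IDs prefixed with the parent component name) can be enforced with a small helper. This is a minimal sketch: `testId` and the `user-card` component name are hypothetical and not part of Playwright or any library.

```typescript
// Hypothetical helper: builds a kebab-case test ID, prefixed with the parent component name.
const KEBAB_CASE = /^[a-z0-9]+(-[a-z0-9]+)*$/;

function testId(component: string, ...children: string[]): string {
  const parts = [component, ...children];
  for (const part of parts) {
    // Fail loudly so a camelCase or PascalCase ID never reaches the DOM.
    if (!KEBAB_CASE.test(part)) {
      throw new Error(`Test ID segment "${part}" is not kebab-case`);
    }
  }
  return parts.join("-");
}

// In a component (assumed file name user-card.tsx):
//   <img data-testid={testId("user-card", "avatar")} />
// In a Playwright test:
//   page.getByTestId(testId("user-card", "avatar"))
```

Sharing one helper between components and tests keeps the two sides from drifting apart, at the cost of importing test-related code into production components.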

MUST: Wait for hydration before interacting in Next.js apps. Clicking before hydration completes causes missed event handlers. Use page.waitForFunction(() => document.readyState === 'complete') as a coarse signal that the page has finished loading, or, more reliably, wait for a known interactive element.
MUST: Use --trace on in CI for failed test debugging. Trace viewer shows timeline, screenshots, DOM snapshots, and network, which is essential for diagnosing CI-only failures.
SHOULD: Authenticate via API calls in globalSetup, not UI login flows. API auth takes ~100ms vs 2-5s for UI login per worker.
SHOULD: Store auth state with storageState and load it per worker for parallel test isolation.
SHOULD: Use --shard=1/N to distribute E2E tests across CI machines.
SHOULD: Block unnecessary requests (analytics, tracking pixels, images) with page.route() + route.abort() to speed up tests.
SHOULD: Use expect.soft() for non-blocking assertions when you want to collect multiple failures in one run.
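
The request-blocking rule above could look roughly like the sketch below. To keep the file import-free, `RouteLike` declares only the subset of Playwright's `Route` that the handler uses. The blocked host list, `shouldBlock`, and `blockNoise` are assumptions for illustration; substitute the trackers your app actually loads.

```typescript
// Minimal structural subset of Playwright's Route (request, abort, continue).
interface RouteLike {
  request(): { url(): string; resourceType(): string };
  abort(): Promise<void>;
  continue(): Promise<void>;
}

// Assumption: example analytics hosts, not a complete list.
const BLOCKED_HOSTS = ["google-analytics.com", "googletagmanager.com"];
// "image" is one of Playwright's resource type strings.
const BLOCKED_TYPES = new Set(["image"]);

function shouldBlock(url: string, resourceType: string): boolean {
  if (BLOCKED_TYPES.has(resourceType)) return true;
  const host = new URL(url).hostname;
  // Match the host itself or any subdomain of it.
  return BLOCKED_HOSTS.some((h) => host === h || host.endsWith(`.${h}`));
}

async function blockNoise(route: RouteLike): Promise<void> {
  const req = route.request();
  if (shouldBlock(req.url(), req.resourceType())) {
    await route.abort();
  } else {
    await route.continue();
  }
}

// Usage in a test or fixture:
//   await page.route("**/*", blockNoise);
```

Keeping the decision in a pure `shouldBlock` function means the blocking policy can be unit tested without launching a browser.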

Tests that hit real external APIs MUST run. Do not skip them on the grounds that there is "no live API". Use fail-fast patterns to control cost:

MUST: Run E2E tests against real APIs for critical flows. Mocks hide real failures.
MUST: Use aggressive timeouts (15s max for API calls, 30s max per test).
MUST: Run AI/LLM-dependent tests serially (test.describe.configure({ mode: "serial" })).
MUST: Set retries: 0 for API-dependent tests so that a flaky upstream does not burn credits.
SHOULD: Include an API health check as the first test to abort early if the service is down.
SHOULD: Centralize timeout constants (TIMEOUT.API_RESPONSE, TIMEOUT.PAGE_LOAD).
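
One way to centralize the limits above is a single constants module plus a small wrapper for calls that have no timeout option of their own. This is a sketch under assumptions: the 15s and 30s values come from the rules above, while the `PAGE_LOAD` value, the `TEST` key, and the `withTimeout` helper are illustrative.

```typescript
// 15s per API call and 30s per test come from the rules above.
// PAGE_LOAD is an assumed value for illustration.
const TIMEOUT = {
  API_RESPONSE: 15_000,
  PAGE_LOAD: 10_000,
  TEST: 30_000,
} as const;

// Hypothetical helper: rejects if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, label: string): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`${label} exceeded ${ms}ms`)),
      ms,
    );
    // Whichever settles first wins; later resolve/reject calls are no-ops.
    work.then(resolve, reject).finally(() => clearTimeout(timer));
  });
}

// Usage (healthUrl is a placeholder):
//   const res = await withTimeout(fetch(healthUrl), TIMEOUT.API_RESPONSE, "health check");
```

In a Playwright suite the same constants can feed `test.setTimeout(TIMEOUT.TEST)`, so the numbers live in one place.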

Mock at system boundaries. Never mock your own code.

Litmus test: Would a different implementation producing the same behavior still pass this test? If not, you're testing implementation.

Boundary	Mock Tool	Example
External HTTP APIs	MSW (`http.get(...)`)	Third-party REST/GraphQL services
Database	Test database or in-memory adapter	Postgres, Redis, SQLite
Time	`vi.useFakeTimers()`	Debounce, expiry, scheduled jobs
File system	`memfs` or temp directories	File uploads, log writing
Randomness	Seeded values or `vi.spyOn(Math, 'random')`	UUIDs, tokens, shuffling
Environment	`vi.stubEnv()`	`NODE_ENV`, feature flags

Don't Mock	Do This Instead
Your own modules (`vi.mock('./utils')`)	Import and call the real code
Internal collaborators	Use dependency injection, test through the public API
Simple data transformations	Test input → output directly
Framework internals (React, Next.js)	Use testing-library, render real components

MUST: Mock only at system boundaries: external APIs, databases, time, file system, randomness.
NEVER: Mock your own modules or internal collaborators. If you need vi.mock('./my-module'), your design needs dependency injection instead.
SHOULD: Design APIs as SDK-style interfaces ({ getUser, createOrder }) that accept a client parameter, not hardcoded fetch calls.
SHOULD: Accept dependencies as parameters. Functions that take a db or client argument are trivially testable with real or fake implementations.
SHOULD: Prefer fakes (simplified real implementations) over mocks when a boundary is complex. A fake in-memory store is more trustworthy than vi.fn() with .mockResolvedValue().
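
The dependency-injection and fake rules above can be sketched as follows. All names here (`User`, `UserStore`, `createUserService`, `createInMemoryUserStore`, `renameUser`) are hypothetical; the point is only that the service receives its boundary as a parameter, so a test can pass a working in-memory fake instead of calling `vi.mock()`.

```typescript
interface User {
  id: string;
  name: string;
}

// The boundary: anything that can load and save users (a database, an HTTP client, ...).
interface UserStore {
  get(id: string): Promise<User | undefined>;
  save(user: User): Promise<void>;
}

// SDK-style factory: the store is injected rather than imported.
function createUserService(store: UserStore) {
  async function getUser(id: string): Promise<User> {
    const user = await store.get(id);
    if (!user) throw new Error(`User ${id} not found`);
    return user;
  }

  async function renameUser(id: string, name: string): Promise<User> {
    const updated: User = { ...(await getUser(id)), name: name.trim() };
    await store.save(updated);
    return updated;
  }

  return { getUser, renameUser };
}

// Fake: a simplified but real implementation backed by a Map.
function createInMemoryUserStore(seed: User[] = []): UserStore {
  const users = new Map<string, User>(
    seed.map((u): [string, User] => [u.id, u]),
  );
  return {
    async get(id) {
      return users.get(id);
    },
    async save(user) {
      users.set(user.id, user);
    },
  };
}

// In a Vitest test:
//   const service = createUserService(createInMemoryUserStore([{ id: "1", name: "Ada" }]));
//   await service.renameUser("1", "  Grace ");
//   expect(await service.getUser("1")).toEqual({ id: "1", name: "Grace" });
```

Because the test asserts on what `getUser` returns rather than on which store methods were called, it passes the litmus test: a different implementation with the same behavior would still pass.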