Nadia Heng · 2026-01-18

Triage language that calms flaky UI wars

culture automation

Teams often treat intermittent UI failures like weather: everyone complains, nobody measures. In our Marina Square cohorts we ask learners to attach three artefacts before a meeting starts: the last green commit, the failing trace or video, and the dataset fingerprint used in the run.

That discipline sounds bureaucratic until you see how quickly conversations shorten. Developers stop guessing about selectors; testers stop defending “the environment was fine on my machine.” The article walks through phrases we ban (“random failure”) and replacements that point to verifiable checks.

We also share a lightweight rubric for deciding when to quarantine a suite versus when to fix data setup. The goal is not perfect green builds overnight; it is a fair process where both sides know what evidence counts.

If you are preparing for a Playwright-heavy programme, bring one painful example from work—we shape the rubric around real language your company already uses, then gently tighten it over two facilitated sessions.