OpenOnco quality control: testing a tightly integrated diagnostics database and codebase with LLM data review + UI regression testing.

OpenOnco grew from prototype to production in about a month: 80+ diagnostic tests, complex filtering, PDF export, comparison tools, and roughly 12K lines of code. Manual QA stopped scaling; fortunately, some experienced software engineers advised us. Here's our system:

(1) Multi-LLM data verification. Before each deploy, I run the full database through Claude, Grok, GPT-5, and Gemini 3. Each model reviews the test data for:
→ Inconsistencies between related tests
→ Outdated information vs. current clinical guidelines
→ Missing fields that should be populated
→ Logical errors (e.g., an FDA-approved test with no approval date)

Different models catch different things: Claude finds logical inconsistencies, GPT-5 catches formatting problems, Grok flags outdated clinical data, and Gemini spots missing cross-references.

(2) Automated UI regression testing. Regression testing answers one question: "Did my changes break something that was working?" For us this means testing actual user workflows — clicking buttons, filling forms, navigating between pages — and verifying the interface behaves correctly every time.
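Some of the logical checks from part (1) are deterministic enough to script before the LLM pass even runs. A minimal sketch, assuming a hypothetical record shape (the field names `fda_approved`, `approval_date`, and the required-field list are illustrative, not OpenOnco's actual schema):

```python
# Hypothetical pre-LLM sanity check: flag records whose fields are
# logically inconsistent, e.g. an FDA-approved test with no approval date.
REQUIRED_FIELDS = ("name", "category", "vendor")  # assumed schema

def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one test record."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not record.get(field):
            problems.append(f"missing field: {field}")
    if record.get("fda_approved") and not record.get("approval_date"):
        problems.append("FDA-approved test with no approval date")
    return problems

def validate_database(records: list[dict]) -> dict[str, list[str]]:
    """Map record id -> problems, keeping only records with issues."""
    report = {}
    for rec in records:
        issues = validate_record(rec)
        if issues:
            report[rec.get("id", "<unknown>")] = issues
    return report
```

Checks like these catch the mechanical errors cheaply, leaving the LLM pass to focus on the judgment calls (outdated guidelines, cross-test inconsistencies).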
We test the actual UI, not just components in isolation:
→ Filter interactions: click the "IVD Kit" filter → verify the correct tests appear → click the "MRD" category → verify the intersection is correct → clear filters → verify all tests return
→ Test card workflows: click a test card → modal opens with the correct data → click "Compare" → test is added to the comparison → open the comparison modal → verify all fields populate
→ Search behavior: type "EGFR" → verify matching tests surface → clear the search → verify the full list returns
→ Direct URL testing: navigate to /mrd?test=mrd-1 → verify the modal auto-opens with the correct test; navigate to /tds?compare=tds-1,tds-2,tds-3 → verify the comparison modal loads with all three
→ PDF export: generate a comparison PDF → verify the page count matches the content → verify no repeated pages (this caught a real bug where page 1 rendered on every page)
→ Mobile responsiveness: run the full suite at the 375px, 768px, 1024px, and 1440px breakpoints

We run these tests with Playwright, an open-source browser automation framework. It launches real browsers (Chromium, Firefox, WebKit), executes user actions, and asserts on the outcomes. Tests run on every push via GitHub Actions; deploy is blocked if anything fails. The full suite takes ~4 minutes 🤯

The combination of LLM data review and real UI regression testing catches what unit tests miss: hundreds of issues so far 👍🏼
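The direct-URL checks above rest on a small piece of deep-link logic: building and parsing the `compare` query parameter. A hypothetical sketch of that pure half (the parameter name and URL shape come from the examples above; the function names are mine, and the browser-side assertions would live in the Playwright suite):

```python
from urllib.parse import urlparse, parse_qs

def build_compare_url(category: str, test_ids: list[str]) -> str:
    """Build a shareable deep link like /tds?compare=tds-1,tds-2,tds-3."""
    return f"/{category}?compare={','.join(test_ids)}"

def parse_compare_param(url: str) -> list[str]:
    """Recover the test ids a comparison deep link should open with."""
    query = parse_qs(urlparse(url).query)
    raw = query.get("compare", [""])[0]
    return [tid for tid in raw.split(",") if tid]
```

Keeping this logic in one tested helper means the regression suite and the app can't drift apart on how deep links are interpreted.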
Managing Major Regressions in Software Testing
Summary
Managing major regressions in software testing means identifying and addressing scenarios where new changes unintentionally break software features that previously worked, which can disrupt user experience and business operations. By focusing on targeted regression tests and automation, teams can quickly catch issues, maintain quality, and avoid unnecessary work.
- Automate key workflows: Use automated tools to regularly test critical user flows so you can catch broken features before they reach customers.
- Prioritize high-risk areas: Focus your regression testing on parts of the software most likely to be impacted by recent changes, rather than retesting everything.
- Use production data: Incorporate real user behavior and logs from live environments into your tests to ensure you’re covering scenarios that matter most.
I’ve never seen a team work this hard… They ran 2,500 tests every month… and found nothing. Four engineers. One massive app. And barely any bugs to show for it. In six months, just three issues. Not one made it to production. On paper, it looked like a win. But something felt off. Releases were slipping. Budgets were tightening. And the team? Stretched thin. That’s when the Application Director called me. We sat down with the business: Claims, Billing, Underwriting. Mapped their workflows. Tracked the risks. And asked a tough question: why are we testing all of this?

We rebuilt the regression suite from scratch:
- Grouped tests into Platinum, Gold, and Silver tiers
- Defined when each tier should run (e.g., only Platinum for hotfixes)
- Automated the Platinum tier: fast, reliable, focused
- Plugged real coverage gaps

Three months later, the numbers told the story:
1. Regression load cut in half
2. QA effort down 60%
3. Defects found: 5x more
4. Still zero production issues

They weren’t just testing smarter. They were finally testing what mattered. Sometimes, less testing means better quality. You just have to be brave enough to change the rules. What part of your process feels "safe" but isn't working?
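The tiering scheme in this story is simple to encode: tag each test with a tier, map each release type to the tiers it must run, and select accordingly. A minimal sketch, assuming an illustrative policy (the mapping values are mine, not the team's actual rules):

```python
# Illustrative tier policy: which tiers run for which kind of release.
TIER_ORDER = ["Platinum", "Gold", "Silver"]
RELEASE_POLICY = {
    "hotfix": ["Platinum"],                   # only the critical flows
    "minor": ["Platinum", "Gold"],
    "major": ["Platinum", "Gold", "Silver"],  # full regression
}

def select_tests(tests: list[dict], release_type: str) -> list[str]:
    """Pick the test ids to run for a release, highest tier first."""
    allowed = RELEASE_POLICY[release_type]
    selected = [t for t in tests if t["tier"] in allowed]
    selected.sort(key=lambda t: TIER_ORDER.index(t["tier"]))
    return [t["id"] for t in selected]
```

The payoff in the story comes from this kind of explicit policy: a hotfix runs a small, fast, trusted set instead of the entire 2,500-test suite.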
-
🤖 Your test suite might look solid… until production tells a different story. Every release, something slips through. Not because teams aren’t testing enough, but because they’re not testing what actually happens in production. In this episode, I sat down with Tanvi Mittal to break down how production logs can be turned into real regression tests using AI. Instead of guessing edge cases, you’re working with real user behavior, real failures, and real flows that already happened. We walk through how teams can take messy logs, cluster them with AI, and convert them into Gherkin scenarios that plug directly into your automation suite. It’s a practical way to close the gap between what you test and what users actually do. 🎧 Listen here: https://lnkd.in/eMUGtnuM #SoftwareTesting #AutomationTesting #QA #DevOps #AIinTesting #ShiftLeft #QualityEngineering #TestAutomation #TestGuildPodcast
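The last step of the pipeline described in the episode — turning a clustered production flow into a Gherkin scenario — needs no AI at all once the clustering (done there by an AI over raw logs) has produced a named flow. A hypothetical sketch, assuming each cluster arrives as a name, an ordered list of observed actions, and an observed outcome:

```python
def cluster_to_gherkin(name: str, actions: list[str], outcome: str) -> str:
    """Render one clustered production flow as a Gherkin scenario.

    The first action becomes the Given, subsequent actions become
    When/And steps, and the observed outcome becomes the Then.
    """
    lines = [f"Scenario: {name}", f"  Given {actions[0]}"]
    for i, act in enumerate(actions[1:]):
        keyword = "When" if i == 0 else "And"
        lines.append(f"  {keyword} {act}")
    lines.append(f"  Then {outcome}")
    return "\n".join(lines)
```

The resulting scenarios plug into whatever Cucumber/Gherkin-compatible runner the automation suite already uses, which is what closes the gap between tested flows and real user flows.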
-
"We have to run a full regression test suite on every build!" First: you don’t *have* to do anything. There is no law of nature, nor any human regulation, that says you must repeat any particular test. You *choose* to do things. In the Rapid Software Testing namespace, we say that regression testing is *any testing motivated by change to a previously tested product*. When a product changes, risk is not evenly distributed. A brief pause for some analysis of where the changes might affect things can help focus your testing on plausible risk. Test — that is, challenge — the idea that a change had only the desired effects in the area it was made, and didn’t introduce undesirable effects. Test things that might be connected to or influenced by that change. It might make sense to do *some* testing in places where you believe risk is low — to reveal hidden risks. If you want to find problems that matter, though, diversifying your techniques, tools, and tactics is essential. Rote repetition can limit you badly. Obsession with looking for problems in exactly the same way as you've already looked displaces two things: 1) Your ability to find problems that were there all along, but that your testing has missed all along; and 2) Your ability to find new problems introduced by a change that your existing set of tests won’t cover. Don’t fixate on tests you’ve done before. Consider the gap between what you thought you knew about the system before the change, and what you need to know about the product as it is now. It’s the latter that’s the most important bit, and your old tests might not be up to the task. Need help convincing management of this? Let me know.
-
A bug hits production. Silence. Panic. And then the blame. Testers get the heat. As if they wrote the bug. As if it passed through everyone else undetected. Let’s be clear: testing didn’t put the bug there. And blaming testers won’t stop the next one. The presence of a bug is not proof of testing failure. It’s a crack in the system. And if your first instinct is to find who missed it instead of how it slipped through, you’ve already failed quality.

Here’s what to do instead. When a production bug appears:
1. Acknowledge it - no blame.
2. Fix it - no shortcuts.
3. Investigate it - no ego.
4. Strengthen your process - not your defense.

Run a collaborative RCA. Not to find a scapegoat, but to learn, adapt, and prevent. Ask the right questions:

On risk and coverage
• Was a risk analysis done?
• Did tests address those risks?
• Were critical scenarios missed?

On testing
• Were unit and integration tests solid?
• Did flaky or failed tests get ignored?
• Was regression testing thorough, or rushed?

On communication
• Did everyone understand the problem?
• Did everyone have the same understanding?
• Were last-minute changes tested?

On deployment
• Was post-deploy validation done?
• Was there enough monitoring in production?

On culture
• Were concerns raised and addressed?
• Did deadlines pressure quality?

And once the RCA is done, don’t just document it. Change your habits. Change your defaults. Next time a bug hits prod, don’t ask “Who missed it?” Ask “Where did we all go wrong?” Then ask “What are we doing about it now?” So how do you handle bugs in production?

#softwaretesting #productionbug #rca #qualityengineering #teamexcellence #brijeshsays
-
I’m inspired by this story on AI-powered process innovation around QA from Thorsten Ott and his SiteWatch team at Fueled. 🤩 https://lnkd.in/gGeinzNY Anyone who’s built or maintained large digital properties knows regression testing is essential... but often a tedious time sink relative to its value. Visual Regression Testing (VRT) tools promised to streamline QA by showing heat maps of visual differences before and after a code change or update. In reality, I consistently found two big issues that made them maddening: (1) Most “flags” are just expected content changes, like new headlines, ad swaps, or personalized components, so you mostly waste time poring over false positives. 😑 (2) Even small, intentional design tweaks (like typography adjustments) can flood the screen with red highlights, obscuring real problems. 😣 When I was involved in testing, I often found myself giving up on VRT and relying on error-prone manual checks.

Fueled’s new homegrown tool changes the equation: it uses AI to automatically review every flagged difference, separating real breakages from harmless updates, then summarizes issues in plain language right inside Slack. 🤖🧠 This solution not only makes VRT faster, it also makes it more effective. By filtering out noise and focusing on true regressions, teams can expand their test coverage and review more pages with each release without exponentially driving up time and effort. Meaning: the AI saves time *and* reduces the likelihood of missing real quality issues. It’s a perfect example of how smart AI integrations can improve efficiency *and* quality.
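The filtering step Fueled's tool performs can be pictured as a simple triage pipeline: every flagged visual diff passes through a classifier (an AI model in their tool; any callable here) and only diffs judged to be real breakage surface to the team. A hypothetical sketch — the `VisualDiff` shape and the label names are assumptions, not their implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class VisualDiff:
    page: str            # URL of the page that changed
    region: str          # which component/area was flagged
    pixels_changed: int  # raw size of the flagged difference

def triage(diffs: list[VisualDiff],
           classify: Callable[[VisualDiff], str]) -> list[VisualDiff]:
    """Keep only diffs the classifier judges to be real breakage.

    `classify` returns "breakage" or "expected"; in the tool described
    above that role is played by an AI model reviewing each flag.
    """
    return [d for d in diffs if classify(d) == "breakage"]
```

The design point is that the noisy raw flags never reach a human: only the post-triage list gets summarized into Slack, which is why coverage can grow without review effort growing with it.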
-
Playwright visual regression testing using your existing automation framework with minimal custom code.

🏗️ HOOK-DRIVEN ARCHITECTURE

Phase 1: Specialized agents
- visual-regression-agent: handles baseline capture, comparison, and management
- url-change-detector-agent: maps code changes to affected URLs using git diff analysis
- playwright-baseline-agent: manages scrolling capture and segmented storage

Phase 2: Git hooks integration
- post-commit hook: triggers change detection and selective visual testing
- pre-push hook: validates all baselines are current before deployment
- post-merge hook: updates baselines after approved changes

Phase 3: Rule-based automation
- visual-regression-rules.md: defines when/how visual tests trigger
- baseline-storage-rules.md: governs folder vs. file storage logic
- change-detection-rules.md: maps file patterns to affected URLs

Phase 4: Minimal custom code
- URL mapping config (JSON): file patterns → affected URLs
- Baseline storage config (JSON): page → storage strategy
- Integration scripts: glue code connecting hooks → agents → rules

🛠️ IMPLEMENTATION COMPONENTS

Agents (leveraging the existing Task tool)
.claude/agents/
├── visual-regression.md # baseline management
├── url-change-detector.md # change impact analysis
└── playwright-baseline.md # capture automation

Hooks (extending the existing hook system)
.claude/hooks/
├── post-commit-visual.sh # trigger after commits
├── pre-push-baseline.sh # validate before push
└── visual-test-runner.sh # execute selective tests

Rules (auto-loaded by keyword)
.claude/rules/
├── visual-regression.md # testing workflow rules
├── baseline-management.md # storage and versioning
└── change-detection.md # impact analysis rules

Minimal code (configuration-driven)
playwright/
├── visual-config.json # URL mappings & storage rules
├── baseline-runner.js # lightweight test executor
└── change-detector.js # git diff → URL mapper

🔄 AUTOMATED WORKFLOW
1. Code change → post-commit hook detects changes
2. Hook → launches url-change-detector-agent
3. Agent → applies change-detection rules to identify affected URLs
4. Hook → launches visual-regression-agent with the affected URL list
5. Agent → runs selective Playwright tests with baseline comparison
6. Results → automatically update baselines or report failures

🎯 BENEFITS
✅ 90% automation through the existing hook/agent/rule system
✅ Minimal custom code: mostly configuration
✅ Self-managing: hooks handle the trigger logic
✅ Rule-driven: easy to modify behavior without code changes
✅ Agent-powered: leverages existing Task tool capabilities
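The change-detection core of this architecture (file patterns → affected URLs, Phase 4's URL mapping config) fits in a few lines. A minimal sketch, assuming an illustrative pattern map and glob-style matching; the paths and URLs are placeholders:

```python
from fnmatch import fnmatch

# Illustrative visual-config.json contents: file patterns -> affected URLs.
URL_MAP = {
    "src/components/Header*": ["/", "/tds", "/mrd"],
    "src/pages/tds/*": ["/tds"],
    "src/pages/mrd/*": ["/mrd"],
}

def affected_urls(changed_files: list[str]) -> list[str]:
    """Given file paths from `git diff --name-only`, list the URLs to retest."""
    urls = set()
    for path in changed_files:
        for pattern, mapped in URL_MAP.items():
            if fnmatch(path, pattern):
                urls.update(mapped)
    return sorted(urls)
```

This is what makes the visual tests selective: a commit touching only the TDS pages triggers screenshot comparisons for /tds rather than the whole site.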