Written by: Ievgen Ievdokymov, Senior AQA Engineer

Posted: 05.05.2026

10 min read

Every fintech engineering budget has a line item for QA. What it rarely has is a line item for not doing QA well.

That second number is almost always larger, and almost always invisible until it's too late. A missed edge case in a payments flow. A validation rule that silently fails under load. A compliance check that passes in staging but breaks in production. Each of these is a delayed invoice, and fintech companies pay it eventually, in fines, in customer churn, in incident response hours, and in reputational damage that doesn't show up on any balance sheet.

Industry research consistently shows that bugs found in production cost between 10x and 100x more to fix than those caught during development. In fintech, where regulatory scrutiny is constant and customer trust is the product, that multiplier extends well beyond engineering costs.

This article examines three real-world failure patterns (a payments processor outage, a compliance breach, and a silent data bug), each with documented costs and extractable lessons. The goal is not to make QA feel urgent. It should already feel urgent. The goal is to make the financial argument undeniable.

Case 1. The payments processor outage

The challenge

In 2012, Knight Capital Group deployed a new trading system without properly retiring legacy code. The old system had a dormant flag called "Power Peg", unused for years, that was accidentally reactivated by the new deployment. There was no automated test validating the interaction between the new and legacy components, and no kill-switch testing in staging.

Within 45 minutes of markets opening, Knight's systems sent erroneous orders at a rate of 10,000 per minute. By the time the team identified the source, $440 million in losses had accumulated. Knight Capital nearly ceased to exist.

What failed in QA

The failure was multi-layered, but the QA breakdown was specific:

  • No regression testing on the interaction between new and legacy system flags

  • No canary deployment or phased rollout with monitored validation

  • No automated circuit-breaker testing to verify the kill-switch behavior under abnormal order volume

None of these are exotic practices. They are table stakes for any system handling financial transactions at scale.
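The last item on that list, kill-switch testing, is also the easiest to automate. The sketch below is purely illustrative (the `OrderGateway` class, its ceiling, and the test are invented for this article, not Knight Capital's actual system); it shows the shape of a regression test that verifies a circuit breaker actually trips under abnormal order volume:

```python
# Hypothetical sketch: verify a kill-switch halts order flow once a
# configured rate ceiling is breached. All names are illustrative.

class CircuitBreakerTripped(Exception):
    """Raised when the order-rate ceiling is breached or the gateway is halted."""

class OrderGateway:
    MAX_ORDERS_PER_MINUTE = 1_000  # illustrative ceiling, not a real limit

    def __init__(self):
        self.sent = 0
        self.halted = False

    def send(self, order):
        # Fail closed: a halted gateway refuses all further orders.
        if self.halted:
            raise CircuitBreakerTripped("gateway halted")
        self.sent += 1
        if self.sent > self.MAX_ORDERS_PER_MINUTE:
            self.halted = True
            raise CircuitBreakerTripped(
                f"order rate exceeded {self.MAX_ORDERS_PER_MINUTE}/min"
            )

def test_kill_switch_trips_under_abnormal_volume():
    # Simulate a runaway loop like the one a reactivated legacy flag can cause.
    gw = OrderGateway()
    tripped = False
    for i in range(2_000):
        try:
            gw.send({"id": i})
        except CircuitBreakerTripped:
            tripped = True
            break
    assert tripped, "kill-switch never engaged under abnormal volume"
    # At most one order past the ceiling should ever have been accepted.
    assert gw.sent <= OrderGateway.MAX_ORDERS_PER_MINUTE + 1
```

A test like this belongs in the regression suite of any order-routing system, and it is exactly the kind of check that would have had to fail loudly before the deployment described above went out.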

The result

Knight Capital survived only through an emergency $400M capital infusion from outside investors, diluting existing shareholders by roughly 75%. The firm was eventually acquired by Getco in 2013. The direct cost of the outage, $440M in 45 minutes, remains one of the most cited examples of what absent deployment validation actually costs.

The takeaway

For payments infrastructure, the QA question isn't "did we test the happy path?" It's "did we test every interaction between new code and every existing component it touches?" Legacy code doesn't forgive new deployments that ignore it.

Case 2. The compliance breach

The challenge

In 2024, Starling Bank was fined £29 million by the UK Financial Conduct Authority (FCA), not for fraud, not for a data breach, but for failures in its financial crime controls system. Specifically, the bank had automated sanctions screening that failed to cover a significant portion of customer accounts. The screening tool had gaps in its logic that weren't caught during testing, and the bank continued onboarding customers against an incomplete controls framework for years.

This wasn't a case of deliberate evasion. It was a QA problem. The validation logic had untested conditional paths that, under certain customer onboarding flows, bypassed the screening step entirely.

What failed in QA

Compliance logic is notoriously undertested in fintech for a simple reason: most of the time, it works. The failure paths are edge cases (specific combinations of customer attributes, account types, and transaction flags) that rarely appear in standard test suites.

The gaps that led to Starling's fine followed a familiar pattern:

  • Test coverage focused on the positive path (customer screened, result returned) rather than the boundary conditions (what happens when screening is skipped, delayed, or returns null)

  • No independent compliance QA layer: testing was done by the same teams building the feature, without a separate validation pass against regulatory requirements

  • Integration testing didn't cover all onboarding flows: the gaps existed specifically at the junction between the new automated screening tool and legacy onboarding journeys that hadn't been updated

The result

The £29M fine was the headline. The operational cost (remediating thousands of account records, conducting retrospective screening, rebuilding the controls framework, and managing the FCA relationship) was separately estimated to run into tens of millions more in internal engineering and compliance hours.

The reputational cost in B2B fintech circles is harder to quantify but real: enterprise clients evaluating banking infrastructure partners scrutinize regulatory history closely.

The takeaway

Compliance logic requires its own QA discipline. It cannot be treated as a feature and tested like one. A dedicated compliance testing layer, with test cases derived directly from regulatory requirements, not from product specs, is not optional in a regulated fintech environment. The FCA (and equivalent bodies globally) don't accept "we didn't know the edge case existed" as mitigation.

A practical framework for compliance QA:

  • Map every regulatory requirement to at least one negative test case (what happens when the requirement is not met)

  • Require sign-off from a compliance officer on test coverage before any controls-related release

  • Run automated regression tests against regulatory scenarios on every deployment, not just major releases
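The first item, mapping each requirement to a negative test, can be made concrete. The sketch below is hypothetical (the `onboard_customer` function and screening clients are invented for illustration, not Starling's actual code); it shows what "fail closed" looks like when the screening step is skipped, returns nothing, or times out:

```python
# Illustrative negative tests for a sanctions-screening step.
# All names here are invented for this sketch.

def onboard_customer(customer, screening_client):
    """Onboarding must fail closed: no clear screening result, no account."""
    result = screening_client.screen(customer)
    if result is None or result.get("status") != "clear":
        raise PermissionError("screening missing or not clear; onboarding blocked")
    return {"customer": customer, "account": "opened"}

class NullScreening:
    """Simulates the failure path where screening silently returns nothing."""
    def screen(self, customer):
        return None

class DelayedScreening:
    """Simulates a timeout surfaced as an exception by the client library."""
    def screen(self, customer):
        raise TimeoutError("screening service did not respond")

def test_onboarding_blocked_when_screening_returns_null():
    try:
        onboard_customer({"name": "A"}, NullScreening())
    except PermissionError:
        return  # correct: account was not opened
    raise AssertionError("account opened without a screening result")

def test_onboarding_blocked_when_screening_times_out():
    try:
        onboard_customer({"name": "B"}, DelayedScreening())
    except (PermissionError, TimeoutError):
        return  # correct: the failure propagated instead of being swallowed
    raise AssertionError("account opened despite screening timeout")
```

The point of tests like these is that they encode the regulatory requirement itself ("no unscreened customer may be onboarded"), not the product spec, which is precisely the gap described above.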

Case 3. The silent data bug

The challenge

In 2016, a major European retail bank discovered that its interest calculation engine had been applying incorrect rounding logic to a subset of fixed-term savings accounts. The bug had been introduced 18 months earlier during a migration to a new core banking platform. It wasn't a crash bug: accounts still opened, transactions still processed, statements still generated. Everything appeared to work.

What was wrong: a rounding rule that should have applied at the transaction level was instead being applied at the batch level, producing minor discrepancies in interest accrual per account, anywhere from a few pence to several pounds per year, depending on account balance.
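This class of bug is easy to demonstrate. The sketch below (with an invented daily rate and invented balances, purely for illustration) shows how rounding per transaction and rounding once per batch produce different totals, and why the discrepancy is invisible at the level of any single account:

```python
# Minimal sketch of the bug class described above: rounding interest
# per transaction vs once per batch. Figures are invented.
from decimal import Decimal, ROUND_HALF_UP

DAILY_RATE = Decimal("0.0000383")  # roughly 1.4% p.a., illustrative only

def round_pence(amount):
    """Round a Decimal amount to whole pence, half up."""
    return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

def interest_per_transaction(balances):
    """One rule: round each account's daily accrual, then sum."""
    return sum(round_pence(b * DAILY_RATE) for b in balances)

def interest_per_batch(balances):
    """The other rule: sum raw accruals, round once at batch level."""
    return round_pence(sum(b * DAILY_RATE for b in balances))
```

With 1,000 accounts each holding £130, the per-transaction rule accrues £0.00 per account per day (each daily accrual rounds below a penny), while the batch-level rule accrues £4.98 across the batch. Neither figure is visible on any one statement, which is exactly how a discrepancy of a few pence to a few pounds per account per year stays silent for 18 months.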

The bank discovered the bug during an internal audit, not through customer complaints. By the time it was caught, approximately 1.2 million accounts had been affected. The remediation required:

  • Recalculating 18 months of interest accrual across 1.2 million accounts

  • Manual review of a sample set to validate the recalculation logic

  • Proactive customer communication to affected accounts

  • A voluntary £36M remediation payment to customers

  • Regulatory notification and subsequent supervisory review

What failed in QA

The migration QA process had focused on functional parity: does the new system produce the same outputs as the old one? It used a representative sample of test accounts, not an exhaustive dataset. The rounding discrepancy only manifested at a specific account balance threshold that wasn't represented in the test set.

Key gaps:

  • Data-level testing was sampling-based, not exhaustive. In a migration context, sampling is insufficient for calculation logic.

  • No financial reconciliation testing: no automated check that total interest liability matched expected aggregate calculations after each batch run

  • No anomaly detection in production: a simple statistical monitor on average interest rates per product tier would have flagged the anomaly within weeks
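A production monitor of the kind the last item describes does not need to be sophisticated. The sketch below is an assumption-laden illustration (the function name, the 2% tolerance, and the "baseline mean" input are all invented for this article): it flags a product tier whose mean interest accrual drifts from its historical baseline.

```python
# Hypothetical drift monitor for a financial calculation job.
# Tolerance and inputs are illustrative assumptions.
from statistics import mean

def check_accrual_drift(current_accruals, baseline_mean, tolerance=0.02):
    """Return True if mean accrual deviates from baseline by more than
    the tolerance (default 2%). An empty batch is itself an anomaly."""
    if not current_accruals:
        return True
    observed = mean(current_accruals)
    drift = abs(observed - baseline_mean) / baseline_mean
    return drift > tolerance
```

Run per product tier after every batch job, a check like this turns an 18-month silent liability into an alert within a few accrual cycles.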

The result

The £36M remediation payment was the direct cost. Add to that an estimated £8–12M in internal engineering and operations effort, plus regulatory scrutiny that required third-party audit validation of the fix.

The harder cost: the bank's internal review found that the same calculation engine had six other product types using similar logic. Each one required separate audit and validation, adding a further six months of engineering time.

The takeaway

Silent bugs are the most expensive category in fintech. They don't alert. They don't break dashboards. They accumulate liability quietly over months or years, and the longer they run, the more expensive they become to remediate.

Prevention requires testing at the data layer, not just the application layer:

  • For any migration, use production-equivalent data volumes with synthetic edge cases inserted

  • Run financial reconciliation checks as part of every release pipeline: total debits must equal total credits, and aggregate calculations must match expected theoretical outputs

  • Build anomaly detection into core financial calculation jobs, not just transactional APIs
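The reconciliation check in the second bullet can be a single pipeline step. The sketch below assumes a simple ledger-entry shape (`side`, `type`, `amount` keys) invented for illustration; the principle, not the schema, is the point:

```python
# Hypothetical reconciliation gate for a release pipeline.
# The ledger-entry shape and tolerance are assumptions for this sketch.
from decimal import Decimal

def reconcile(ledger_entries, expected_total_interest,
              tolerance=Decimal("0.01")):
    """Raise if the ledger doesn't balance or aggregate interest drifts
    from the theoretically expected total."""
    debits = sum(e["amount"] for e in ledger_entries if e["side"] == "debit")
    credits = sum(e["amount"] for e in ledger_entries if e["side"] == "credit")
    if debits != credits:
        raise AssertionError(
            f"ledger imbalance: debits {debits} != credits {credits}")
    interest = sum(e["amount"] for e in ledger_entries
                   if e["side"] == "credit" and e["type"] == "interest")
    if abs(interest - expected_total_interest) > tolerance:
        raise AssertionError(
            f"interest {interest} deviates from expected "
            f"{expected_total_interest}")
```

Wired in as a blocking step, this makes reconciliation a first-class deliverable rather than a quarterly audit activity.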

The pattern across all three

Different companies, different products, different failure modes. But the QA breakdown follows a consistent structure across all three cases:

1. Testing covered the designed flow, not the failure modes. Knight Capital tested whether the new system worked. Nobody tested what happened when it interacted with deactivated legacy logic. Starling tested whether screening ran. Nobody tested what happened when it didn't. The bank tested whether interest was calculated. Nobody tested what happened when the calculation was slightly wrong at scale.

2. Test environments didn't reflect production reality. Staging environments with limited data, simplified configurations, and clean account states don't surface the bugs that production finds. The gap between "works in staging" and "fails in production" is almost always an environment fidelity problem.

3. QA was a phase, not a function. In all three cases, QA happened before release and then stopped. There was no continuous validation in production, no statistical monitoring of financial outputs, no automated regression layer watching for behavioral drift. Modern fintech QA isn't a pre-release gate; it's a permanent operational function.

The number that should change every CFO's mind

Cost-per-bug figures widely attributed to IBM's Systems Sciences Institute trace how remediation cost grows across the software development lifecycle. The exact numbers have been updated repeatedly across the industry, but the ratios remain consistent:

Stage at which the bug is found, and the relative cost to fix it:

  • Design / Requirements: 1x

  • Development: 5x

  • QA / Testing: 10x

  • Production (minor): 25–50x

  • Production (compliance/regulatory): 100–1000x

The Starling fine illustrates the far end of that range. A validation logic gap that would have taken a compliance QA engineer a day to identify and a developer two days to fix cost £29M in regulatory fines plus the remediation overhead.

That's not a QA cost. That's a QA investment that wasn't made.

For engineering and product leaders, the framing question is simple: what is the expected cost of a production compliance failure in our environment, and what is the annual cost of the QA function that would prevent it? In regulated fintech, that math almost never favors underinvesting in QA.
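The framing question above is expected-value arithmetic. The numbers in the sketch below are invented purely for illustration (the failure probability, the QA budget, and the risk-reduction factor are all assumptions, not measurements from any of the cases in this article):

```python
# Back-of-envelope expected-cost framing. Every input here is an
# invented assumption for illustration, not a measured figure.

def expected_annual_loss(p_failure, cost_if_failure):
    """Expected yearly loss = probability of failure x cost if it occurs."""
    return p_failure * cost_if_failure

# Assumption: 5% annual chance of a £29M-scale regulatory failure.
loss_without_qa = expected_annual_loss(0.05, 29_000_000)   # ~£1.45M/yr
# Assumption: dedicated compliance QA costs £600k/yr and cuts the
# failure probability by 90%.
qa_budget = 600_000
loss_with_qa = expected_annual_loss(0.005, 29_000_000)     # ~£145k/yr

net_benefit = (loss_without_qa - loss_with_qa) - qa_budget  # ~£705k/yr
```

Under these (deliberately conservative) assumptions the QA function pays for itself roughly twice over each year, before counting the fines, remediation hours, and reputational costs the cases above put on the far side of the ledger.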

What good QA investment actually buys:

  • Shift-left testing infrastructure that catches logic bugs during development, not deployment

  • Dedicated compliance QA with regulatory test suites maintained alongside the product

  • Production observability for financial calculation outputs, not just uptime and latency

  • Chaos and boundary testing as standard practice for any system handling money movement

  • Data-layer validation for all migrations, with reconciliation as a first-class deliverable

Conclusion: QA is risk management

The three cases in this article cost a combined total north of £500M in direct financial impact, fines, remediation payments, and emergency capital raises. All three were preventable with QA practices that exist, are well understood, and cost a fraction of the liability they would have eliminated.

Poor QA in fintech isn't a technical problem. It's a risk management problem that gets labeled technical after the fact.

Senior engineering leaders who treat QA as a cost center to be optimized down are making a bet: that their system won't fail in the expensive ways that other systems have failed. That bet has poor odds and asymmetric downside. The companies that treat QA as a core financial risk function, with dedicated investment, independent authority, and continuous operation, are the ones whose names don't appear in FCA enforcement notices or post-mortem case studies.

The hidden costs of poor QA aren't hidden at all. They're just paid later, by someone who didn't budget for them.

Weak QA is expensive. In fintech, it's catastrophic. DeviQA helps teams build QA systems that prevent outages, compliance gaps, and costly data errors before they reach production.

Your dev team needs a solid QA partner

About the author

Ievgen Ievdokymov

Senior AQA engineer

Ievgen Ievdokymov is a Senior AQA Engineer at DeviQA, focused on building efficient, scalable testing processes for modern software products.