Written by: Dmitry Reznik, Chief Technology Officer

Posted: 30.04.2026

25 min read

At a certain scale, monolithic architecture stops being an asset and starts being a constraint. Deployment windows become coordination nightmares. A bug in one module can take down the whole system. Scaling means scaling everything, even the parts that don't need it.

Microservices address this by breaking applications into independently deployable services, each owning a specific business function, its own data store, and its own release cycle. Teams ship faster. Systems scale surgically. A failure in one service is far less likely to cascade into a system-wide outage.

But that independence comes with a cost that most teams underestimate until they're deep in it: testing a distributed system is fundamentally harder than testing a monolith. The failure modes multiply. Dependencies become invisible until they break. A passing test suite is no longer a guarantee that the system works.

This guide covers what you actually need to know — the testing strategies, automation approaches, tools, and best practices that make microservices quality achievable at scale.

The unique testing challenges of microservices architecture

Microservices give your team independence, speed, and the ability to scale individual components without touching the rest of the system. But that same independence is exactly what makes testing hard.

When you move from a monolith to microservices, your test suite doesn't just get bigger; it becomes structurally different. The failure modes change. The dependencies multiply. And the strategies that worked perfectly at 3 services start breaking down at 30.

Before you build your automation strategy, you need to understand the specific challenges you're solving for. Here are the five that derail most microservices testing.

1. Distributed tracing: when a test fails, you can't tell why

In a monolith, a failed test points to a line of code. In a microservices system, a failed request has traveled through 4, 6, maybe 12 services before something went wrong. Your test output tells you the end state — it doesn't tell you where in the chain the failure originated.

What this looks like in practice: An end-to-end checkout test fails with a generic 500 error. Was it the payment service? The inventory check? A timeout in the notification queue? Without distributed tracing instrumented across your test environments, your engineers spend more time diagnosing failures than fixing them.

The fix: Instrument every service with a correlation ID that propagates through the full request chain. Tools like Jaeger, Zipkin, and OpenTelemetry let you trace a single test execution across service boundaries and pinpoint the exact service, method, and timestamp where the failure occurred. In practice, this means your test pipeline should capture and log the full trace ID for every failed test run — not just the error message.

What to implement: Add trace context propagation to all inter-service calls. Configure your CI pipeline to output a direct trace link on any test failure. Make distributed tracing a first-class requirement in your test environment setup, not an afterthought.
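
To make the propagation idea concrete, here is a minimal sketch in plain Python. It is hand-rolled for illustration and does not use the OpenTelemetry API; the header name `X-Correlation-ID`, the two simulated services, and the report format are all assumptions. A real setup would more likely rely on a tracing SDK and the W3C `traceparent` header.

```python
import uuid

TRACE_HEADER = "X-Correlation-ID"  # assumed header name, chosen for this sketch


def ensure_trace_id(headers: dict) -> dict:
    """Return a copy of headers that always carries a correlation ID."""
    out = dict(headers)
    out.setdefault(TRACE_HEADER, uuid.uuid4().hex)
    return out


class ServiceError(Exception):
    """Failure annotated with the service and trace ID where it occurred."""

    def __init__(self, service: str, trace_id: str, message: str):
        super().__init__(f"[{service}] trace={trace_id} {message}")
        self.service = service
        self.trace_id = trace_id


def inventory_service(headers: dict, in_stock: bool) -> str:
    """Hypothetical downstream service: reuses the caller's ID if present."""
    headers = ensure_trace_id(headers)
    if not in_stock:
        raise ServiceError("inventory", headers[TRACE_HEADER], "item out of stock")
    return "reserved"


def checkout_service(headers: dict, in_stock: bool = True) -> str:
    """Hypothetical entry point: creates the ID once, then passes it on."""
    headers = ensure_trace_id(headers)
    inventory_service(headers, in_stock)
    return headers[TRACE_HEADER]


def run_checkout_test(in_stock: bool) -> str:
    """Return a one-line report that a CI job could print for this run."""
    try:
        trace_id = checkout_service({}, in_stock)
        return f"PASS trace={trace_id}"
    except ServiceError as err:
        return f"FAIL service={err.service} trace={err.trace_id}"
```

Calling `run_checkout_test(False)` yields a report that names the failing service and the trace ID, which is the piece of information a generic 500 error hides.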

2. Test data management across services: consistency at scale is harder than it looks

In a monolith, you seed one database and you're done. In microservices, each service owns its own data store. A single user journey — registration, purchase, delivery confirmation — might touch 5 separate databases across 5 separate services. Getting those data stores into a consistent, known state before every test run is one of the most underestimated problems in microservices QA.

What this looks like in practice: Your order service test creates a test order, but the inventory service database still has the item marked as out-of-stock from a previous test run that didn't clean up correctly. The test fails — not because of a bug, but because of dirty state from a previous execution. This is the source of a significant share of flaky tests in distributed systems.

The fix: Two approaches work in combination. First, adopt service-owned test data — each service is responsible for seeding and tearing down its own data sets, completely independent of what other services do. Second, use synthetic data generation tools (Faker, Mimesis, custom factories) to create isolated, non-colliding test data per run rather than relying on shared fixtures.

| Approach | Best for | Avoid when |
| --- | --- | --- |
| Shared fixture database | Small teams, early-stage | Tests run in parallel |
| Service-owned seed scripts | Independent services | Tight cross-service dependencies |
| Synthetic data generation | High parallelism | Services require relational consistency |
| Data virtualization | Regulated data (PII, PCI) | Performance is the primary concern |

For integration and E2E tests, ephemeral environments provisioned fresh per test run (using Kubernetes namespaces or Docker Compose) eliminate dirty-state failures entirely — at the cost of higher infrastructure overhead.
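
As a rough illustration of service-owned, run-scoped test data, the sketch below uses an in-memory SQLite database as a stand-in for one service's data store. The `InventoryTestData` class, the `items` table, and the run ID scheme are hypothetical; a tool like Faker could replace the simple key generation shown here.

```python
import sqlite3
import uuid


def new_run_id() -> str:
    """Unique suffix so parallel runs do not collide on keys."""
    return uuid.uuid4().hex[:8]


class InventoryTestData:
    """Seed and teardown owned by the (hypothetical) inventory service."""

    def __init__(self, conn: sqlite3.Connection, run_id: str):
        self.conn = conn
        self.run_id = run_id
        conn.execute(
            "CREATE TABLE IF NOT EXISTS items "
            "(sku TEXT PRIMARY KEY, in_stock INTEGER, run_id TEXT)"
        )

    def seed_item(self, name: str, in_stock: bool = True) -> str:
        sku = f"{name}-{self.run_id}"  # synthetic, run-scoped key
        self.conn.execute(
            "INSERT INTO items VALUES (?, ?, ?)", (sku, int(in_stock), self.run_id)
        )
        return sku

    def teardown(self) -> None:
        # Remove only the rows this run created; other runs are untouched.
        self.conn.execute("DELETE FROM items WHERE run_id = ?", (self.run_id,))


def is_in_stock(conn: sqlite3.Connection, sku: str) -> bool:
    """Look up stock status the way a test assertion would."""
    row = conn.execute("SELECT in_stock FROM items WHERE sku = ?", (sku,)).fetchone()
    return bool(row and row[0])
```

Two runs can seed an item with the same name without seeing each other's rows, and a teardown that fails in one run cannot leave an out-of-stock record behind for another.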

3. Environment parity: your staging environment is lying to you

Your tests pass in the CI environment. They fail in staging. They pass in staging. They fail in production. Environment parity — keeping your test environments genuinely representative of production — is harder to maintain in microservices than in any other architecture.

What this looks like in practice: Service A runs version 2.3 in production. Your staging environment is still running 2.1 because the last deployment was manually blocked pending a review. Your tests pass against 2.1 and give you false confidence. The release goes out, and an integration breaks that your tests should have caught.

The specific failure modes to watch for:

  • Version drift between services across environments

  • Configuration differences (environment variables, feature flags, secrets) that aren't version-controlled

  • Infrastructure differences — production uses Redis Cluster, staging uses a single Redis node

  • Third-party service differences — you mock Stripe in staging but hit the real API in production

The fix: Treat your test environment configuration as code. Infrastructure-as-Code tools (Terraform, Pulumi) ensure your staging environment is provisioned identically to production on every run. Service mesh configurations, feature flag states, and environment variables should all live in version control with environment-specific overrides tracked explicitly. For cross-service version compatibility, use consumer-driven contract tests (covered in the next section) rather than relying on a staging environment being perfectly current.
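
Once environment configuration lives in version control, drift can be checked mechanically. The sketch below assumes each environment can be exported as a flat key/value manifest; the key names and the `allowed` override set are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drift:
    key: str
    production: object
    staging: object


def find_drift(production: dict, staging: dict, allowed: frozenset = frozenset()) -> list:
    """List every key whose value differs between two environment manifests.

    Keys in `allowed` are explicit, tracked overrides (for example hostnames)
    and are skipped. A key missing on one side is reported with value None.
    """
    drifts = []
    for key in sorted(set(production) | set(staging)):
        if key in allowed:
            continue
        prod_value, stage_value = production.get(key), staging.get(key)
        if prod_value != stage_value:
            drifts.append(Drift(key, prod_value, stage_value))
    return drifts


if __name__ == "__main__":
    prod = {"service-a.version": "2.3", "redis.mode": "cluster", "db.host": "prod-db"}
    stage = {"service-a.version": "2.1", "redis.mode": "single", "db.host": "stage-db"}
    for drift in find_drift(prod, stage, allowed=frozenset({"db.host"})):
        print(f"{drift.key}: production={drift.production} staging={drift.staging}")
```

A CI step could fail the build whenever `find_drift` returns a non-empty list, which turns version drift and infrastructure differences into visible, reviewable items.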

4. API versioning and contract drift: the silent breaking change

Microservices live and die by their APIs. When Team A changes the response schema of their service — even a small, seemingly backward-compatible change — every downstream consumer that depends on that API becomes a potential failure point. At scale, with dozens of services evolving independently, managing this dependency web manually is impossible.

What this looks like in practice: The user service team renames a field in the /users/{id} response to locale. They test their service, and it passes. They deploy. The recommendation service, which consumes that endpoint, was never updated to read the renamed field and starts throwing null pointer exceptions in production at 11 PM on a Friday.

The fix: Consumer-driven contract testing closes this gap. Here's how it works:

1. The consumer (recommendation service) defines a contract: a formal specification of exactly what it expects from the provider's API, including which fields, which data types, and which response codes.

2. The provider (user service) runs those consumer contracts as part of its own test suite before every deployment.

3. If the provider's changes break any consumer contract, the build fails before the change is ever deployed.

Pact is the standard tool for this pattern. Pactflow extends it for enterprise-scale broker management across many services. Spring Cloud Contract is the preferred option in Java/Spring ecosystems.

The key rule: no service should be able to deploy a breaking API change without a failing test that catches it first. If your current pipeline doesn't guarantee this, contract testing is the highest-leverage investment you can make in your microservices QA strategy.
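
The mechanics of the three steps can be sketched without any tooling. The code below is not Pact's API; it is a hand-rolled stand-in in which the contract format, the field names (`id`, `email`), and the `verify_contract` helper are all assumptions made for illustration.

```python
# Written by the consumer team: only the fields the consumer actually reads.
CONSUMER_CONTRACT = {
    "consumer": "recommendation-service",
    "provider": "user-service",
    "request": {"method": "GET", "path": "/users/{id}"},
    "response": {"status": 200, "body": {"id": int, "email": str}},
}


def verify_contract(contract: dict, status: int, body: dict) -> list:
    """Run on the provider side; an empty list means the contract holds."""
    expected = contract["response"]
    violations = []
    if status != expected["status"]:
        violations.append(f"status: expected {expected['status']}, got {status}")
    for field, field_type in expected["body"].items():
        if field not in body:
            violations.append(f"missing field: {field}")
        elif not isinstance(body[field], field_type):
            violations.append(f"wrong type for {field}: {type(body[field]).__name__}")
    # Extra provider fields are tolerated: consumers pin only what they use.
    return violations


if __name__ == "__main__":
    # The provider renamed "email" to "email_address": the build should fail.
    print(verify_contract(CONSUMER_CONTRACT, 200, {"id": 7, "email_address": "a@example.com"}))
```

In a provider pipeline, any non-empty result would fail the build, so the rename is caught before deployment rather than by the consumer in production.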

5. Flaky tests at scale: the trust problem that compounds over time

A flaky test, one that passes and fails non-deterministically, is an annoyance on a monolith. On a microservices system, flaky tests are a systemic risk. When your system spans hundreds of services and your suite holds thousands of tests, even a 2% flakiness rate produces dozens of unreliable signals per run (2% of 2,000 tests is 40). Engineers stop trusting the pipeline, start overriding failures, and gradually the test suite loses its entire purpose.

What causes flakiness in microservices specifically:

  • Timing dependencies: Service B isn't fully initialized when Service A's test fires its first request

  • Shared state pollution: Tests running in parallel write to the same database records

  • Network instability: Test environments use shared infrastructure that introduces real latency variation

  • External service dependencies: Tests that hit real third-party APIs are subject to rate limits, downtime, and response variation

  • Event ordering: Asynchronous message queues often don't guarantee delivery order, so event-driven test assertions that assume one fail intermittently

The fix is a combination of four practices:

Retry logic with quarantine: Automatically retry failed tests once before marking them as failures. Flag tests that fail-then-pass as flaky and quarantine them in a separate tracked report. Don't let them block deployments — but don't ignore them. Set a zero-tolerance policy: a quarantined test gets fixed within one sprint.

Service virtualization: Replace real external dependencies with stubbed, deterministic equivalents in your test environment. WireMock and Mountebank let you define exact request/response behavior for dependent services, removing the flakiness that real network calls to those dependencies introduce.

Test isolation: Each test owns its setup and teardown. Tests should never depend on execution order or on state left behind by a previous test. This is non-negotiable in a parallel execution environment.

Observability in your test pipeline: Track flakiness metrics over time. A test that has failed intermittently 7 times in the last 30 runs is a structural problem, not bad luck. Tools like BuildPulse and Trunk Flaky Tests give you per-test failure rate history and trend data.
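
The retry-and-quarantine rule and the failure-rate metric are simple enough to sketch directly. The function names and the three result labels below are assumptions for illustration, not the behavior of any particular CI tool.

```python
from typing import Callable


def run_with_retry(test: Callable[[], None], retries: int = 1) -> str:
    """Classify one test as 'passed', 'flaky' (failed, then passed) or 'failed'."""
    for attempt in range(retries + 1):
        try:
            test()
        except Exception:
            continue  # try again until the retry budget is used up
        return "passed" if attempt == 0 else "flaky"
    return "failed"


def failure_rate(history: list) -> float:
    """Share of failed runs in a pass/fail history (True means passed)."""
    return history.count(False) / len(history) if history else 0.0


def triage(results: dict) -> dict:
    """Split results into a deploy gate and a separate quarantine report."""
    return {
        "blocking": sorted(name for name, r in results.items() if r == "failed"),
        "quarantine": sorted(name for name, r in results.items() if r == "flaky"),
    }


if __name__ == "__main__":
    # 7 failures in the last 30 runs, as in the example above.
    history = [False] * 7 + [True] * 23
    print(f"failure rate: {failure_rate(history):.1%}")
```

Only the `blocking` list should stop a deployment; the `quarantine` list feeds the tracked report so that flaky tests stay visible until they are fixed.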

The underlying pattern across all five challenges: microservices testing problems are almost always distributed systems problems in disguise. Flaky tests are a timing and isolation problem. Data inconsistency is a state management problem. Breaking changes are a dependency visibility problem. Your testing strategy needs to address the distribution head-on — not paper over it with more tests.

None of these challenges are unsolvable. But they do require a deliberate strategy — one that accounts for the specific failure modes of distributed systems rather than scaling up monolithic testing practices and hoping they hold. That strategy is what the rest of this guide covers.

What is the microservices testing strategy?

Given their decentralized, autonomous design, microservices-based apps need a dedicated testing strategy to ensure reliability, stability, and performance. Complex inter-service communication, data consistency requirements, and the distributed nature of the architecture call for testing at every level.

Individual microservices, the interactions between them, and the application as a whole should all be tested, so that you can be confident each service operates as intended and contributes to a working system. The following levels, taken together, make up an effective approach:

Unit testing: Unit testing happens during development and is the responsibility of developers. It checks individual classes, methods, or functions to ensure that each code unit works as expected, and it isolates the unit from its dependencies with test doubles: fakes, stubs, mocks, dummies, and spies.

Unit testing helps developers maintain code quality and catch bugs early, before defects grow into more consequential problems.

The choice of unit testing tools depends on the tech stack. For example, JUnit and Mockito are common choices for Java, while Pytest is a common choice for Python.
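
A small example shows what isolating a unit looks like in practice. It uses Python's standard `unittest.mock`; the `order_total` function and the `rate_for` method on the injected tax client are hypothetical names invented for this sketch.

```python
from unittest.mock import Mock


def order_total(items: list, tax_client, region: str = "US") -> float:
    """Sum line items and add tax from an injected rate provider."""
    subtotal = sum(item["price"] * item["qty"] for item in items)
    rate = tax_client.rate_for(region)
    return round(subtotal * (1 + rate), 2)


def test_order_total_applies_tax():
    tax_client = Mock()
    tax_client.rate_for.return_value = 0.10  # stubbed answer, no network call
    items = [{"price": 20.0, "qty": 2}, {"price": 5.0, "qty": 1}]
    assert order_total(items, tax_client) == 49.5
    # Mock-style verification: the dependency was called exactly as expected.
    tax_client.rate_for.assert_called_once_with("US")
```

Pytest would collect `test_order_total_applies_tax` automatically; the same test also runs as a plain function call.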

Component testing: Component testing ensures that each microservice works flawlessly and efficiently on its own. Therefore, each service is tested in isolation from the rest. To isolate the testing scope, dependencies on DBs, APIs, or other services are mocked or stubbed. Component testing can be executed either in-process or out-of-process.

Libraries such as PowerMock, WireMock, and Mockito are widely used to mock external dependencies during component testing.
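
An in-process component test exercises a whole service through its request handler while its data store and downstream services are replaced with test doubles. The sketch below is one possible shape, with a hypothetical `OrderService`, an in-memory repository, and a stubbed inventory client.

```python
class InMemoryOrderRepo:
    """Fake data store: same interface as a real repository, no database."""

    def __init__(self):
        self.orders = {}

    def save(self, order_id: str, order: dict) -> None:
        self.orders[order_id] = order


class StubInventoryClient:
    """Stub for the downstream inventory service with canned answers."""

    def __init__(self, in_stock: set):
        self.in_stock = in_stock

    def is_available(self, sku: str) -> bool:
        return sku in self.in_stock


class OrderService:
    """Component under test: request handling plus business rules."""

    def __init__(self, repo, inventory):
        self.repo = repo
        self.inventory = inventory

    def handle_post_order(self, payload: dict) -> tuple:
        """Return (status code, response body) for a create-order request."""
        sku = payload.get("sku")
        if not sku:
            return 400, {"error": "sku is required"}
        if not self.inventory.is_available(sku):
            return 409, {"error": "out of stock"}
        order_id = f"order-{len(self.repo.orders) + 1}"
        self.repo.save(order_id, {"sku": sku, "status": "created"})
        return 201, {"id": order_id}


def test_order_service_component():
    repo = InMemoryOrderRepo()
    service = OrderService(repo, StubInventoryClient({"sku-1"}))
    assert service.handle_post_order({"sku": "sku-1"}) == (201, {"id": "order-1"})
    assert service.handle_post_order({"sku": "sku-2"}) == (409, {"error": "out of stock"})
    assert repo.orders == {"order-1": {"sku": "sku-1", "status": "created"}}
```

An out-of-process variant would start the service as a separate process and point it at an HTTP stub such as WireMock instead of injecting the doubles directly.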

Contract testing: Contract testing checks interactions between microservices against predefined API contracts that define the expected inputs, outputs, and behaviors. It verifies that particular microservices can communicate with each other properly by testing the APIs through which they interact.

Spring Cloud Contract, Pact, and Pactflow are great tools for automating contract testing within microservices environments.

Integration testing: Integration testing checks whether independently developed microservices work seamlessly and correctly when they are connected. It focuses on catching defects related to communication, data consistency, and overall system integration.

Quite a lot of time and effort are required to create and run integration tests. Mocha, Supertest, Hoverfly, and Jest are widespread test automation solutions for integration testing.

End-to-end testing: End-to-end testing, also known as system testing, examines a microservices-based app from A to Z. At this final testing level, the entire user flow is rigorously tested to ensure that the application functions as expected, successfully achieving its business goals.

Because these tests require spinning up and connecting several microservices, automating and maintaining them is demanding.

Selenium, Cypress, and Playwright are common test automation frameworks for this purpose.

How to adopt automated testing for microservices?

Microservices architecture perfectly aligns with the DevOps philosophy where automated testing is vital. The adoption of automated testing for microservices requires a strategic approach to ensure the smooth integration of QA practices into the SDLC. Here are a few steps to take for the efficient implementation of automated testing in a microservices environment:

1. Carefully study microservices architecture

Gain a complete understanding of microservices architecture, including its principles, communication patterns, and dependencies. This knowledge will let you design appropriate automated microservices testing strategies.

2. Design your automated microservices testing strategy

Draw up a testing strategy specific to microservices, considering various levels of testing, such as unit testing, component testing, contract testing, integration testing, and e2e testing. Clearly outline the scope of testing, taking into account regression testing, smoke testing, performance testing, security testing, etc.

3. Select appropriate microservices testing tools

Choose those automated testing tools that support the tech stack used in your microservices architecture. This may include unit testing tools, API testing tools, e2e testing tools, performance testing tools, and others.

4. Ensure efficient data management

Introduce best practices for efficient test data management to ensure that relevant and consistent data is available for testing while production data is not impacted.

5. Set up a testing environment

Establish a special testing environment that closely imitates the production environment. We strongly recommend leveraging IaC for consistent testing environments and streamlined deployment processes.

6. Create comprehensive test suites

Develop stable, well-structured, and comprehensive test suites covering various aspects of microservices.

7. Integrate tests into CI/CD pipelines

Integrate your automated tests into CI/CD pipelines to ensure their consistent execution on every code change and establish a continuous feedback loop. This not only streamlines the QA process but also improves the overall efficiency and reliability of your software delivery lifecycle.

8. Implement robust analytics

Build analytics into your testing framework to gain insight into test outcomes and to spot trends, patterns, and areas for improvement early. This goes beyond tracking pass and fail rates: it lets your QA team dig into the data and use it to continuously improve the QA strategy.

9. Embrace a culture of continuous improvement

Regularly review and update your testing practices, strategies, and processes based on feedback, test results, modifications of your microservices app, and technology advancements.

Best practices for microservices test automation

We'd also like to draw your attention to some best practices that can maximize the benefits of automated testing for microservices apps:

  • Shift-left testing: Introduce testing as early in the development process as possible to catch bugs at the initial stages of development before they turn into critical issues.

  • Logging and monitoring: Implement robust logging and monitoring to gain insight into the health of your microservices and to speed up debugging and issue resolution. Use monitoring tools such as Grafana, Prometheus, or the ELK stack to track the state of your microservices app in real time and collect metrics.

  • Version control: Maintain disciplined version control to manage updates and ensure backward compatibility. Git is the most widely used tool for this purpose.

  • Documentation and reporting: Document test cases, scenarios, and results comprehensively. Introduce clear reporting mechanisms to identify successes and failures.

  • Capability for scaling: Develop strategies for scaling testing efforts as the number of microservices increases. This includes managing dependencies and orchestrating complex testing scenarios.

  • Parallel test execution: Leverage parallel test execution to significantly speed up test runs and enhance test efficiency, especially as the number of microservices grows.

  • Infrastructure-as-code (IaC): This DevOps practice lets you provision and manage infrastructure resources such as virtual machines, containers, and networks programmatically and reproducibly. As a result, test environments can be set up quickly and reliably with all the required configuration. Terraform and AWS CloudFormation are tools that support this.

  • Test data management: Microservices testing involves complex test data management. Data virtualization and synthetic data generation are two effective techniques for handling it.

  • Service virtualization: Service virtualization simulates the behavior of the services a microservice depends on. This decouples testing from those dependencies, so testing can continue even when some services are unavailable or being updated. WireMock, Mountebank, and Hoverfly are tools that provide service virtualization capabilities.

  • Efficient collaboration and communication: Foster close collaboration between development, QA, and ops teams to build a culture of shared responsibility for quality and ensure that everyone is aligned on testing goals and strategies.

  • Ongoing training and skill development: Make sure that the members of your QA team possess the necessary skill sets to design, implement, and maintain automated tests for microservices. Provide ongoing training as needed.

By following the presented steps and implementing these best practices, one can successfully adopt automated testing for microservices, ensuring a reliable and efficient QA process.

What tools are best for microservices test automation?

Choosing the right tool for each layer of your testing pyramid matters. The wrong tool creates maintenance overhead; the right one becomes invisible infrastructure your team stops thinking about.

Here's a consolidated view of the tools most commonly used across microservices testing layers, what they're best at, and where they fall short.

| Tool | Test layer | Language / ecosystem | Best for | Watch out for |
| --- | --- | --- | --- | --- |
| JUnit 5 | Unit | Java | Standard Java unit testing, parameterized tests | Java-only |
| Mockito | Unit | Java | Mocking dependencies in Java services | Verbose setup for complex dependency graphs |
| Pytest | Unit | Python | Clean, minimal unit testing for Python services | Limited built-in parallelism |
| Jest | Unit / Integration | JavaScript / TypeScript | Node.js services, fast test runner | Memory-heavy on large suites |
| xUnit | Unit | .NET | .NET microservices, parallel test execution | .NET ecosystem only |
| WireMock | Component | Language-agnostic | HTTP service stubbing, request matching | Static stubs need maintenance as APIs evolve |
| Testcontainers | Component / Integration | Java, Go, .NET, Node | Spinning up real DBs and dependencies in Docker | Slower than pure mocks; needs Docker runtime |
| Mountebank | Component | Language-agnostic | Multi-protocol virtualization (HTTP, TCP, SMTP) | Steeper learning curve than WireMock |
| MSW (Mock Service Worker) | Component | JavaScript / TypeScript | API mocking in Node and browser environments | JavaScript ecosystem only |
| Pact | Contract | Multi-language | Consumer-driven contract testing, independent deployability | Requires Pact Broker for team-scale use |
| Pactflow | Contract | Multi-language | Managed Pact Broker, bi-directional contracts | Paid service for enterprise features |
| Spring Cloud Contract | Contract | Java / Spring | Contract testing in Spring Boot ecosystems | Tightly coupled to Spring stack |
| Supertest | Integration | JavaScript / TypeScript | HTTP integration testing for Node.js APIs | Limited to HTTP; no messaging protocol support |
| REST Assured | Integration | Java | Fluent HTTP integration testing for Java APIs | Java-only |
| Karate | Integration | Language-agnostic | API testing + contract + performance in one framework | DSL has a learning curve |
| Hoverfly | Integration | Language-agnostic | Traffic simulation, latency injection, service virtualization | Less adoption than WireMock |
| Playwright | E2E | Multi-language | Modern E2E, parallel execution, built-in tracing | Overkill for pure API services |
| Cypress | E2E | JavaScript / TypeScript | Developer-friendly E2E, real-time debugging | Single-tab limitation; multi-origin flows require cy.origin() |
| Selenium | E2E | Multi-language | Cross-browser E2E, legacy compatibility | Slowest E2E option; high maintenance |
| k6 | Performance | JavaScript | Load testing, CI-friendly, scripted scenarios | No built-in UI; results need Grafana or similar |
| Gatling | Performance | Scala / Java | High-throughput load testing, detailed HTML reports | Scala DSL is unfamiliar to many teams |
| Chaos Monkey | Resilience | JVM | Random instance termination in production-like environments | Netflix-origin; complex setup outside AWS |
| Gremlin | Resilience | Language-agnostic | Controlled chaos engineering, attack library | Paid platform |

The DeviQA microservices testing maturity model

Most engineering teams know their testing isn't where it should be. Fewer know exactly where they stand — or what the next step actually looks like.

We've built and audited testing programs across 300+ software products. The pattern that emerges is consistent: teams don't fail at testing randomly. They get stuck at predictable points, for predictable reasons. This maturity model maps those points and gives you a clear line of sight from where you are to where you need to be.

Five levels. Each one builds on the last.

Level 1 — Ad hoc

The signal: Testing happens, but nobody owns it.

At this level, testing is reactive. Engineers write tests when they feel like it or when a bug forces them to. There's no defined strategy, no coverage targets, and no shared understanding of what "tested" means. QA — if it exists at all — is manual, sprint-end, and bottlenecked.

In a microservices context, this typically means each service team tests in isolation with no coordination. Integration gaps are discovered in staging, or worse, in production.

What's present:

  • Manual smoke tests before deployment

  • No CI integration

  • Test coverage below 30%, untracked

  • No contract or integration test layer

  • Post-release bug discovery is normal

The cost: Every deployment is a calculated risk. Incidents are frequent, fixes are rushed, and confidence in releases is low. Teams compensate with longer freeze periods and manual release checklists — which slow velocity without actually improving quality.

Level 2 — Reactive automation

The signal: Automation exists, but it's fragile and trusted by nobody.

The team has invested in test automation — usually at the E2E or UI layer because that's the most visible. Unit tests exist in some services but not others. The test suite runs in CI, but fails so often for non-code reasons (flaky tests, environment drift, test data issues) that engineers have learned to re-run pipelines rather than investigate failures.

This is the most common level we find when teams come to us. It's the most dangerous — because it creates the appearance of quality coverage without the substance.

What's present:

  • E2E tests covering core flows, often brittle

  • Unit tests in some services, inconsistent coverage

  • CI pipeline exists but has a high flakiness rate (>10%)

  • No contract testing — integration issues caught in staging

  • Test environments manually maintained, frequently out of sync

The cost: False confidence. The suite is green, the deployment goes out, production breaks. Trust in the test suite erodes. The team starts shipping with crossed fingers and calling it "acceptable risk."

Level 3 — Structured and reliable

The signal: The test pyramid has a shape. Failures mean something.

At Level 3, the team has made deliberate architectural decisions about testing. Unit tests are the foundation. Component tests validate each service in isolation. Contract tests are in place for the most critical service interactions. The CI pipeline is stable — a red build means a real problem, not a flaky environment.

This is the level where testing starts returning measurable value in terms of release confidence and reduced incident rate.

What's present:

  • Unit test coverage >70% across all services

  • Component tests for all microservices (dependencies stubbed)

  • Contract tests covering primary service-to-service interfaces

  • CI flakiness rate below 3%

  • Test environments provisioned via IaC, consistent across stages

  • Dedicated QA involvement in sprint planning

The cost of staying here: The strategy is solid, but manual decisions still govern test scope and coverage gaps. New services get added without automatic test coverage requirements. Performance and security testing are periodic — not continuous.

Level 4 — Proactive and integrated

The signal: Quality is built into the delivery process, not added at the end.

At Level 4, testing is genuinely shift-left. QA engineers are involved in API design, not just implementation review. Contract tests cover all inter-service communication. Performance testing runs on every significant release. Security testing is automated and integrated into the pipeline. Observability tools (distributed tracing, structured logging, real-time dashboards) give the team immediate visibility into both test failures and production health.

What's present:

  • Full test pyramid implemented across all services

  • Contract testing covers 100% of service-to-service interfaces

  • Automated performance baselines — regressions block deployment

  • Security scanning integrated into CI (SAST, DAST, dependency scanning)

  • Distributed tracing instrumented across all services

  • Feature flags enabling safe gradual rollouts

  • QA metrics tracked and reviewed in sprint retrospectives
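
One item from the list above, performance baselines that block deployment on regression, comes down to a small comparison. The sketch assumes a p95 latency baseline and a 10% tolerance; both the metric choice and the threshold are illustrative, and a load tool such as k6 or Gatling would supply the samples.

```python
import math


def percentile(samples: list, pct: float) -> float:
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def performance_gate(samples_ms: list, baseline_p95_ms: float, tolerance: float = 0.10) -> dict:
    """Fail the gate when p95 latency exceeds the baseline by more than `tolerance`."""
    p95 = percentile(samples_ms, 95)
    limit = baseline_p95_ms * (1 + tolerance)
    return {"p95_ms": p95, "limit_ms": limit, "passed": p95 <= limit}


if __name__ == "__main__":
    samples = list(range(1, 101))  # stand-in for measured latencies in ms
    print(performance_gate(samples, baseline_p95_ms=80))
```

A pipeline step would exit non-zero when `passed` is false, which is what turns a performance regression into a blocked deployment rather than a dashboard observation.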

The cost of staying here: Resilience is still untested. The system is well-covered under normal conditions, but its behavior under failure — network partitions, cascading timeouts, service unavailability — is assumed rather than proven.

Level 5 — Fully automated, observable, and chaos-resilient

The signal: The system is tested under conditions that reflect production reality, including failure.

Level 5 teams don't just test that their microservices work — they test that their microservices fail gracefully. Chaos engineering is a scheduled practice, not an emergency drill. The team runs controlled failure experiments — killing instances, injecting latency, simulating upstream outages — and uses the results to harden recovery mechanisms before incidents expose them.
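
A controlled failure experiment can be prototyped in a test long before a chaos tool is introduced. The sketch below injects connection failures into a hypothetical upstream call and checks that the caller degrades to a fallback; the class and function names and the fixed random seed are assumptions made for the example.

```python
import random


class FlakyDependency:
    """Wraps a callable and injects failures at a configurable rate."""

    def __init__(self, call, failure_rate: float, rng: random.Random):
        self.call = call
        self.failure_rate = failure_rate
        self.rng = rng

    def __call__(self, *args):
        if self.rng.random() < self.failure_rate:
            raise ConnectionError("injected fault: upstream unavailable")
        return self.call(*args)


def get_recommendations(user_id: str, upstream, fallback=("bestsellers",)) -> tuple:
    """Degrade to a static fallback instead of failing the whole request."""
    try:
        return tuple(upstream(user_id))
    except ConnectionError:
        return tuple(fallback)


def run_experiment(trials: int, failure_rate: float, seed: int = 0) -> dict:
    """Count requests served normally versus via the fallback."""
    upstream = FlakyDependency(lambda uid: [f"for-{uid}"], failure_rate, random.Random(seed))
    outcomes = {"normal": 0, "fallback": 0}
    for i in range(trials):
        result = get_recommendations(f"user-{i}", upstream)
        outcomes["fallback" if result == ("bestsellers",) else "normal"] += 1
    return outcomes
```

The useful assertion is that every request gets an answer at any failure rate. Tools such as Gremlin or Chaos Monkey apply the same idea to real infrastructure instead of an in-process wrapper.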

Deployment pipelines are fully automated with progressive delivery (canary releases, blue/green deployments) and automated rollback triggers. The observability layer is comprehensive: every service emits structured logs, traces, and metrics. The team knows about degradation before users do.

What's present:

  • Everything from Levels 3 and 4, by definition

  • Chaos engineering on a defined schedule (Chaos Monkey, Gremlin, Pumba)

  • Canary deployments with automated rollback on metric regression

  • Synthetic monitoring in production — critical flows tested continuously

  • AI-assisted test generation and anomaly detection

  • MTTR (mean time to recovery) tracked as a primary engineering metric

  • Zero-surprise deployments: every release is a non-event
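
Two items from this list reduce to small calculations: the rollback decision for a canary and the MTTR metric. The thresholds below (one percentage point of extra error rate, 20% extra p95 latency) and the metric names are assumptions chosen for illustration, not recommended values.

```python
def canary_decision(baseline: dict, canary: dict,
                    max_error_rate_increase: float = 0.01,
                    max_latency_ratio: float = 1.2) -> str:
    """Return 'promote' or 'rollback' by comparing canary metrics to baseline."""
    error_delta = canary["error_rate"] - baseline["error_rate"]
    latency_ratio = canary["p95_ms"] / baseline["p95_ms"]
    if error_delta > max_error_rate_increase or latency_ratio > max_latency_ratio:
        return "rollback"
    return "promote"


def mean_time_to_recovery(incidents: list) -> float:
    """Mean minutes between detection and recovery.

    Each incident is a (detected_at, recovered_at) pair in minutes.
    """
    durations = [recovered - detected for detected, recovered in incidents]
    return sum(durations) / len(durations)


if __name__ == "__main__":
    baseline = {"error_rate": 0.010, "p95_ms": 200.0}
    canary = {"error_rate": 0.030, "p95_ms": 210.0}
    print(canary_decision(baseline, canary))
    print(mean_time_to_recovery([(0, 30), (100, 110), (200, 220)]))
```

In a progressive delivery setup, the decision function would be evaluated repeatedly while traffic shifts to the canary, with a rollback triggered automatically on the first regression.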

The reality: Very few teams operate at a pure Level 5 across their entire system. Most Level 5 organizations apply chaos engineering and synthetic monitoring selectively — to their highest-criticality services first. That's the right approach. Full coverage comes over time.

Where does your team sit?

Use this as a diagnostic, not a judgment. The goal isn't to reach Level 5 everywhere simultaneously; it's to know exactly where you are, why you're stuck, and what one level of improvement looks like concretely.

| Level | Name | Defining characteristic | Typical release confidence |
| --- | --- | --- | --- |
| 1 | Ad hoc | No strategy, reactive testing | Low: every release is a risk |
| 2 | Reactive automation | Automation exists but isn't trusted | False confidence: the suite lies |
| 3 | Structured and reliable | Pyramid in place, failures are real signals | Moderate: known gaps remain |
| 4 | Proactive and integrated | Quality built into delivery, full observability | High: issues caught before production |
| 5 | Chaos-resilient | Failure-tested, synthetic monitoring, progressive delivery | Very high: degradation detected before users |

The most common trap: Teams move from Level 1 to Level 2 by investing heavily in E2E automation — which is the hardest, most expensive layer to maintain. They build a fragile top-heavy suite, it starts failing for the wrong reasons, and they conclude that "test automation doesn't work for us." It does. They just built the pyramid upside down. Levels 3 and 4 require going back to the foundation — unit and component tests first — and building up from there.

Not sure which level you're at?

The honest answer often requires an outside perspective: it's difficult to audit your own testing blind spots when you're inside the system. DeviQA offers a QA architecture review for engineering teams looking to understand their current maturity level and map a practical path forward.

About the author

Dmitry Reznik

Chief Technology Officer

Dmitry Reznik is the Chief Technology Officer and co-founder at DeviQA, bringing deep technical expertise across software architecture, implementation, and long-term system operation.