Top tips for generating API tests leveraging AI

Automated API testing has always been a balancing act: speed and coverage on one side; maintainability, determinism, and trust on the other. So what happens when you invite AI into this equation and ask it to write production-grade API tests inside a real microservice ecosystem? Here are our top tips from doing exactly that.

This article shares the results of a hands-on experiment with GitHub Copilot in agent mode, showing how AI is moving beyond toy examples and becoming a reliable contributor to automated API testing, not in theory, but in practice.

Why This Experiment Mattered

AI-generated code is everywhere. Demos look impressive. Blog posts are optimistic. Yet one question keeps coming up in engineering teams: Can AI actually generate maintainable, framework-compliant API tests, or does it just look convincing until you run them?

Our goals were clear:

Test AI-assisted API test generation in a real-world microservice architecture
Use existing documentation only, no peeking at implementations
Enforce strict framework rules, type safety, and deterministic behaviour

What we explicitly did not aim for:

100% test coverage
Bug hunting
Making tests “pass at any cost”

This was not about shortcuts. It was about discipline.

The Playground: A Real Microservice System

The system under test was a fully containerised online shop built on microservices:

Independent services (users, cart, catalogue)
A single API gateway routing all REST requests
Services written in different languages
An existing TypeScript-based API test automation framework

In other words, realistic complexity.

The challenge was obvious from the start:
API documentation existed, but it was incomplete, inconsistent, and unaware of the gateway layer. Which brings us to the most important rule of the experiment.

One Source of Truth. No Exceptions

For this experiment:

Service API documentation was the only source of truth.

No gateway docs.
No “tribal knowledge.”
No fixing tests to match reality.

If the behaviour through the gateway differed from the documentation, it was treated as a bug, not something Copilot should “work around”.

This single constraint defined everything:

The scope of generated tests
Their correctness
Their limitations

And it exposed something crucial about AI-generated testing.

How We Actually Generated the Tests

Letting Copilot “just generate tests” doesn’t work. It hallucinates. It invents patterns. It ignores your framework. The real breakthrough came from adding a Framework Instruction Layer.

The Instruction Layer: The Missing Piece

Before generating a single test, we created a detailed set of rules describing:

API client inheritance and structure
Naming conventions and file layout
Assertion patterns
Error handling and status code validation
Positive vs. negative flow structure
Good and bad examples

These rules lived in Markdown files that Copilot was required to read before writing code. Think of it as teaching Copilot how your team thinks, not just what the API looks like. Without this layer, Copilot consistently:

Invented new abstractions
Misnamed entities
Ignored shared utilities
Produced unreviewable code

With it, the quality difference was dramatic.
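To make the first rule in that list concrete, here is the kind of client structure an instruction file might prescribe. This is a minimal TypeScript sketch, not the framework used in the experiment: BaseApiClient, CartClient, Transport, CartItem, and the /cart/items path are all hypothetical names chosen for illustration.

```typescript
// Hypothetical shapes: none of these names come from the framework in the article.
interface ApiResponse<T> {
  status: number;
  body: T;
}

// The transport is injected so the sketch runs without a network or HTTP library.
type Transport = (
  method: string,
  path: string,
  body?: unknown,
) => Promise<ApiResponse<unknown>>;

abstract class BaseApiClient {
  protected constructor(
    private readonly transport: Transport,
    private readonly basePath: string,
  ) {}

  // Shared request helper: service clients go through it instead of
  // calling the transport directly, so every call is built the same way.
  protected async request<T>(
    method: string,
    path: string,
    body?: unknown,
  ): Promise<ApiResponse<T>> {
    const response = await this.transport(method, `${this.basePath}${path}`, body);
    // Unchecked cast: it trusts the documented shape. A real framework
    // would probably validate the body at runtime.
    return response as ApiResponse<T>;
  }
}

interface CartItem {
  productId: string;
  quantity: number;
}

// One client per service, always extending the shared base class.
class CartClient extends BaseApiClient {
  constructor(transport: Transport) {
    super(transport, "/cart");
  }

  addItem(item: CartItem): Promise<ApiResponse<CartItem>> {
    return this.request<CartItem>("POST", "/items", item);
  }
}
```

A rule such as "every service client extends the base client and never calls the transport directly" is easy to state in Markdown and easy to check in review, which is what makes the generated code reviewable.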

Prompting Is Not a Detail, It’s the System

Once the framework rules were in place, test generation followed a strict, repeatable flow:

Define the role – Copilot acts as a Senior SDET, not a code generator.
Constrain the scope – One service, specific endpoints, and clear scenario types.
Specify scenario depth – Positive, negative, and edge cases, with examples.
Attach the API documentation – JSON files only. No hidden context.
Validate and iterate – Review style, correctness, and coverage against documentation. Fix prompts, not the code, whenever possible.

Over time, this became less like “asking AI for help” and more like programming the generator itself.
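One way to picture "programming the generator" is to treat the prompt as structured data rather than free text. The sketch below assembles a prompt from the first four steps above. It is an assumption about how such a flow could be encoded, not the prompts used in the experiment: PromptSpec, buildPrompt, and every file name are hypothetical.

```typescript
// Hypothetical structure for a repeatable prompt; all field names are assumptions.
interface PromptSpec {
  role: string;
  service: string;
  endpoints: string[];
  scenarioTypes: string[];
  documentationFiles: string[];
  instructionFiles: string[];
}

function buildPrompt(spec: PromptSpec): string {
  // An empty scope is exactly the "just generate tests" request that does not work.
  if (spec.endpoints.length === 0) {
    throw new Error("Scope must name at least one endpoint");
  }
  const sections = [
    `Role: ${spec.role}`,
    `Read these framework rules first: ${spec.instructionFiles.join(", ")}`,
    `Scope: service "${spec.service}", endpoints: ${spec.endpoints.join(", ")}`,
    `Scenarios to cover: ${spec.scenarioTypes.join(", ")}`,
    `Source of truth (JSON documentation only): ${spec.documentationFiles.join(", ")}`,
  ];
  return sections.join("\n");
}

// Example usage with made-up file names.
const cartPrompt = buildPrompt({
  role: "Senior SDET",
  service: "cart",
  endpoints: ["POST /items", "DELETE /items/{id}"],
  scenarioTypes: ["positive", "negative", "edge cases"],
  documentationFiles: ["docs/cart-api.json"],
  instructionFiles: ["rules/api-clients.md", "rules/assertions.md"],
});
console.log(cartPrompt);
```

Keeping the prompt in a structure like this makes the final step, fixing the prompt rather than the code, a change to one field that can be reviewed and repeated.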

What Worked Surprisingly Well

Generating typed API clients directly from documentation
Producing consistent service-level tests aligned with the framework
Reusing generated code across services for E2E scenarios
Enforcing deterministic, independent tests

With the right constraints, Copilot behaved less like a junior developer and more like a very fast, literal senior engineer.

Where Things Fell Apart

AI is not magic, and the cracks were instructive.

1. Prompt Size and Context Loss

Large prompts caused Copilot to:

Forget earlier rules
Duplicate entities
Import unused or nonexistent code

Smaller, segmented tasks worked far better.
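Segmentation can be as simple as grouping endpoints by service and capping how many go into one generation task. The sketch below shows that idea under stated assumptions: Endpoint, GenerationTask, segmentTasks, and the cap value are hypothetical, and the article does not say what task size worked best.

```typescript
// Hypothetical types: one generation task maps to one prompt.
interface Endpoint {
  service: string;
  method: string;
  path: string;
}

interface GenerationTask {
  service: string;
  endpoints: Endpoint[];
}

// Groups endpoints by service, then splits each group into chunks of at
// most maxPerTask endpoints, so no single prompt mixes services or grows too large.
function segmentTasks(endpoints: Endpoint[], maxPerTask: number): GenerationTask[] {
  if (!Number.isInteger(maxPerTask) || maxPerTask < 1) {
    throw new Error("maxPerTask must be a positive integer");
  }
  const byService = new Map<string, Endpoint[]>();
  for (const endpoint of endpoints) {
    const group = byService.get(endpoint.service) ?? [];
    group.push(endpoint);
    byService.set(endpoint.service, group);
  }
  const tasks: GenerationTask[] = [];
  byService.forEach((group, service) => {
    for (let i = 0; i < group.length; i += maxPerTask) {
      tasks.push({ service, endpoints: group.slice(i, i + maxPerTask) });
    }
  });
  return tasks;
}
```

Each resulting task can then be turned into its own prompt, with the framework rules attached every time instead of relying on the model to remember them from an earlier step.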

2. “Fixing” Failing Tests

When tests failed due to real system behaviour mismatches, Copilot often tried to:

Relax assertions
Remove validations
Change expectations

In other words, create false positives. Human oversight is non-negotiable.
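The difference between a real check and a "fixed" one is easy to miss in a diff, so a small illustration may help. The sketch below assumes a hypothetical endpoint documented to return 201 with a string id. The function names and the response shape are invented for the example and do not come from the experiment.

```typescript
interface ObservedResponse {
  status: number;
  body: Record<string, unknown>;
}

// Strict check: expectations come from the documentation
// (assumed here: status 201 and a string `id` in the body).
function strictCheck(response: ObservedResponse): string[] {
  const failures: string[] = [];
  if (response.status !== 201) {
    failures.push(`expected status 201, got ${response.status}`);
  }
  if (typeof response.body["id"] !== "string") {
    failures.push("expected body.id to be a string");
  }
  return failures;
}

// The kind of "fix" to reject in review: it accepts almost any response.
function relaxedCheck(response: ObservedResponse): string[] {
  return response.status < 500 ? [] : [`unexpected server error ${response.status}`];
}

// A response that contradicts the documentation.
const observed: ObservedResponse = { status: 200, body: {} };
console.log(strictCheck(observed)); // reports two failures
console.log(relaxedCheck(observed)); // reports none: a false positive
```

Both versions compile and both look like tests, but only the first can fail for the right reason. That is why a weakened assertion should be treated as a review finding rather than a fix.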

3. Documentation Quality Is Everything

Incomplete or unclear documentation led directly to:

Missing coverage
Invalid assumptions
Broken tests

AI doesn’t fill gaps responsibly; it guesses.
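Since the model guesses wherever the documentation is silent, it can pay to list the gaps before generating anything. The sketch below checks a simplified endpoint description for missing pieces. It was not part of the experiment as described: the DocumentedEndpoint shape is a hypothetical simplification, not a real documentation format, and the rules are only examples.

```typescript
// Hypothetical, simplified view of one documented endpoint.
interface DocumentedEndpoint {
  method: string;
  path: string;
  requestSchema?: object;
  responses: { status: number; schema?: object }[];
}

// Returns human-readable gaps that an engineer should resolve
// before asking the model to generate tests for this endpoint.
function findDocumentationGaps(endpoint: DocumentedEndpoint): string[] {
  const gaps: string[] = [];
  const label = `${endpoint.method} ${endpoint.path}`;
  const sendsBody = ["POST", "PUT", "PATCH"].includes(endpoint.method.toUpperCase());

  if (sendsBody && endpoint.requestSchema === undefined) {
    gaps.push(`${label}: request body is not described`);
  }
  if (endpoint.responses.length === 0) {
    gaps.push(`${label}: no responses documented`);
  }
  if (!endpoint.responses.some((response) => response.status >= 400)) {
    gaps.push(`${label}: no error responses documented`);
  }
  for (const response of endpoint.responses) {
    if (response.schema === undefined) {
      gaps.push(`${label}: response ${response.status} has no schema`);
    }
  }
  return gaps;
}
```

An endpoint with open gaps is a candidate for fixing the documentation first, because any negative test generated for it would rest on a guess.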

The Real Lesson: AI Amplifies Your Discipline

This experiment didn’t prove that AI can replace test engineers. It proved something more interesting:

AI magnifies the quality of your existing processes, good or bad.

Strong frameworks → scalable test generation
Clear documentation → reliable automation
Vague rules → confident nonsense

Copilot didn’t remove the need for thinking. It punished the absence of it.

So… Can AI Write Your API Tests?

Yes, if you’re willing to do the hard work first. AI won’t save you from:

Poor documentation
Weak frameworks
Inconsistent conventions

But if those foundations exist, it can:

Speed up onboarding
Standardise test quality
Generate boilerplate at scale
Free engineers to focus on intent, not repetition

The future of test automation isn’t “AI instead of engineers”. It’s AI guided by engineers who know exactly what they want.

Raman Piatlitski, SDET

Posted 30 Mar 2026
