L1: Unit & Integration Test Generation

Unit and Integration Test Generation Strategies

In Modules 5 and 6, we designed the APIs and generated the frontend and backend code. However, in an enterprise CI/CD pipeline, unverified code is a liability. Because LLMs are probabilistic, they can generate code that looks syntactically perfect but contains subtle logical flaws.

This lesson covers how AI Architects use Claude to autonomously generate robust test suites, moving from manual Quality Assurance (QA) to an AI-driven Continuous Verification model.

1. The "Tautological Testing" Trap (The Greatest Vulnerability)

The most common mistake developers make when generating tests with AI is prompting: "Here is my function. Write a unit test for it."

The Architectural Failure: If the generated function contains a bug (e.g., adding taxes instead of subtracting discounts), Claude is likely to read the code, assume the logic is intentional, and write a test that validates the bug. The test will pass, and the bug will be silently deployed. This is a tautological test: it proves only that the code does what the code currently does.

The Architectural Solution: Test against the Contract, not the Implementation.

You must instruct Claude to generate tests based on the API Contract (OpenAPI spec), the Business Requirements Document (BRD), or the Jira ticket acceptance criteria, never just the source code.
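The difference can be sketched in a few lines. This is an illustration only: the function `apply_discount`, its bug, and the acceptance criterion ("a 10% discount on a 100.00 order yields 90.00") are all hypothetical and do not come from the earlier modules.

```python
from decimal import Decimal


def apply_discount(total: Decimal, rate: Decimal) -> Decimal:
    """Hypothetical generated implementation containing a sign bug."""
    return total + total * rate  # BUG: adds the discount instead of subtracting it


def tautological_test() -> bool:
    # The expected value was copied from what the code currently returns,
    # so this check passes even though the behavior is wrong.
    return apply_discount(Decimal("100.00"), Decimal("0.10")) == Decimal("110.00")


def contract_test() -> bool:
    # The expected value comes from the (assumed) acceptance criterion:
    # "a 10% discount on a 100.00 order yields 90.00".
    return apply_discount(Decimal("100.00"), Decimal("0.10")) == Decimal("90.00")
```

Against the buggy implementation, `tautological_test()` returns True while `contract_test()` returns False, which is the signal the pipeline needs.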

2. Agentic Test-Driven Development (TDD)

To avoid tautological tests, architects deploy a strict Agentic TDD workflow. Claude Code and multi-agent systems excel in this paradigm.

The Workflow:

  1. Phase 1 (The Test): You pass the Jira ticket or BRD to Claude and prompt: "Based on these requirements, write the unit tests for the authentication service. Do not write the implementation yet."

  2. Phase 2 (The Red State): You run the tests. They will fail because the code doesn't exist.

  3. Phase 3 (The Green State): You prompt the agent: "Write the implementation code to make these specific tests pass."

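  4. Phase 4 (The Retry and Refactor): If the tests fail, the Validation-Retry Loop (from Module 4) catches the error trace and feeds it back to the agent to self-correct until the pipeline is green. Once the tests pass, the agent can refactor the implementation while the tests stay unchanged and green.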
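The retry portion of this workflow might look roughly like the sketch below. It is not the Module 4 implementation: `run_tests` and `ask_agent` are hypothetical hooks supplied by the caller (for example, a wrapper around your test runner and a wrapper around your agent call), and the retry limit of 3 is an arbitrary assumption.

```python
from typing import Callable, Tuple


def validation_retry_loop(
    run_tests: Callable[[str], Tuple[bool, str]],
    ask_agent: Callable[[str], str],
    first_prompt: str,
    max_attempts: int = 3,
) -> Tuple[bool, str, int]:
    """Request code, run the tests, and feed failures back until green.

    run_tests(code) returns (passed, error_trace).
    ask_agent(prompt) returns candidate implementation code.
    Both are hypothetical hooks, not part of any real library.
    """
    prompt = first_prompt
    code = ""
    for attempt in range(1, max_attempts + 1):
        code = ask_agent(prompt)
        passed, trace = run_tests(code)
        if passed:
            return True, code, attempt
        # Feed the failure back to the agent; the tests themselves are never edited.
        prompt = (
            f"{first_prompt}\n\nThe previous attempt failed with:\n{trace}\n"
            "Fix the implementation without modifying the tests."
        )
    # Give up after max_attempts and report the last candidate.
    return False, code, max_attempts
```

Capping the number of attempts keeps a stuck agent from looping indefinitely, and the instruction not to modify the tests prevents the agent from reaching green by weakening the contract.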

3. Unit Test Generation & Mocking Boundaries

Unit tests must be incredibly fast and completely isolated. An unconstrained LLM might write a unit test that attempts to make a live call to a production AWS S3 bucket.

Prompting Constraints for Unit Tests:

You must enforce strict mocking boundaries in your CLAUDE.md or System Prompt:

  • The "No Network" Rule: "Unit tests must be strictly isolated. You are forbidden from making live database queries, file system writes, or external HTTP requests."

  • Mocking Standards: "You must use our standard mocking libraries. For Node.js, use jest.mock(). For Python, use unittest.mock.patch. Mock all external Service and Repository layers when testing Controllers."

  • The AAA Pattern: "All generated tests must strictly follow the Arrange-Act-Assert structure with clear comments separating the phases."
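A test that satisfies all three constraints could look like the following sketch. The `UserController` and `UserRepository` classes are hypothetical stand-ins for your own layers; only `unittest` and `unittest.mock.patch` are real library APIs.

```python
import unittest
from unittest.mock import patch


class UserRepository:
    """Hypothetical repository; the real one would query a database."""

    def find_by_id(self, user_id: int):
        raise RuntimeError("unit tests must never reach the real database")


class UserController:
    """Hypothetical controller under test."""

    def __init__(self, repository: UserRepository):
        self.repository = repository

    def get_display_name(self, user_id: int) -> str:
        user = self.repository.find_by_id(user_id)
        if user is None:
            return "Unknown user"
        return f"{user['first_name']} {user['last_name']}"


class UserControllerTest(unittest.TestCase):
    def test_returns_full_name_for_existing_user(self):
        # Arrange
        controller = UserController(UserRepository())
        fake_user = {"first_name": "Ada", "last_name": "Lovelace"}

        with patch.object(UserRepository, "find_by_id", return_value=fake_user) as find:
            # Act
            result = controller.get_display_name(42)

        # Assert
        self.assertEqual(result, "Ada Lovelace")
        find.assert_called_once_with(42)

    def test_returns_placeholder_for_missing_user(self):
        # Arrange
        controller = UserController(UserRepository())

        with patch.object(UserRepository, "find_by_id", return_value=None):
            # Act
            result = controller.get_display_name(7)

        # Assert
        self.assertEqual(result, "Unknown user")
```

The real `find_by_id` raises on purpose: if a test ever forgets to patch it, the failure is loud instead of silently touching infrastructure.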

4. Integration Test Generation Strategies

While unit tests isolate components, Integration Tests verify the "seams" between them (e.g., does the Controller correctly save to the Database?). Mocking is forbidden here; we want to test the real infrastructure logic.

Architectural Patterns for Integration Tests:

When generating integration tests, Claude must be aware of the testing environment's physical limits.

  • Ephemeral Environments: Instruct Claude to write setup and teardown hooks (for example, Jest's beforeAll and afterEach) that initialize in-memory databases (like SQLite) or spin up disposable Docker containers with Testcontainers.

  • Constraint Example: "For integration tests, do not use mocks. Generate a test suite that connects to the test database URL provided in the .env.test file. Ensure you write an afterEach block to truncate all database tables to prevent state leakage between test runs."
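As a rough Python analogue of these hooks, the sketch below uses an in-memory SQLite database instead of the .env.test URL so that it stays self-contained. `OrderRepository` and the `orders` table are hypothetical; `setUpClass` plays the role of beforeAll and `tearDown` the role of afterEach.

```python
import sqlite3
import unittest


class OrderRepository:
    """Hypothetical repository under test; it talks to a real SQLite connection."""

    def __init__(self, conn: sqlite3.Connection):
        self.conn = conn

    def save(self, customer: str, total_cents: int) -> int:
        cur = self.conn.execute(
            "INSERT INTO orders (customer, total_cents) VALUES (?, ?)",
            (customer, total_cents),
        )
        self.conn.commit()
        return cur.lastrowid

    def count(self) -> int:
        return self.conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


class OrderRepositoryIntegrationTest(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Equivalent of beforeAll: one ephemeral in-memory database per suite.
        cls.conn = sqlite3.connect(":memory:")
        cls.conn.execute(
            "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total_cents INTEGER)"
        )

    @classmethod
    def tearDownClass(cls):
        cls.conn.close()

    def tearDown(self):
        # Equivalent of afterEach: SQLite has no TRUNCATE, so DELETE empties the table.
        self.conn.execute("DELETE FROM orders")
        self.conn.commit()

    def test_save_persists_one_row(self):
        repo = OrderRepository(self.conn)
        repo.save("alice", 1999)
        self.assertEqual(repo.count(), 1)

    def test_table_starts_empty(self):
        # Passes only if teardown removed rows written by other tests.
        self.assertEqual(OrderRepository(self.conn).count(), 0)
```

No mocks appear anywhere: the repository issues real SQL against a real engine, and the per-test cleanup is what keeps the second test independent of the first.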

5. Fuzzing and Edge Case Generation

Humans rarely anticipate every way a user might break an input field. LLMs, trained on large amounts of public code and security write-ups, are good at proposing malicious or unexpected inputs.

  • The Fuzzing Prompt: After generating the standard "Happy Path" tests, architects run a secondary generation pass.

  • "You are an adversarial QA Engineer. Review this API endpoint. Generate a suite of edge-case tests. Include SQL injection payloads, extremely long strings, null values, negative integers, and invalid date formats to ensure the endpoint gracefully rejects them with a 400 Bad Request."

By separating the "Happy Path" generation from the "Adversarial" generation, you give the model a single goal per pass: breaking the code rather than confirming it. This tends to surface more edge cases and makes the pipeline more resilient.
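The output of such an adversarial pass might resemble the sketch below. Everything in it is assumed for illustration: the `create_booking` handler, its fields, the 100-character name limit, and the payloads. A real suite would call your actual endpoint.

```python
from datetime import date
from typing import Any, Dict, Tuple

MAX_NAME_LENGTH = 100  # assumed limit, for illustration only


def create_booking(payload: Dict[str, Any]) -> Tuple[int, str]:
    """Hypothetical endpoint handler: returns (status_code, message)."""
    name = payload.get("name")
    guests = payload.get("guests")
    day = payload.get("date")
    if not isinstance(name, str) or not name.isalnum() or len(name) > MAX_NAME_LENGTH:
        return 400, "invalid name"
    if not isinstance(guests, int) or isinstance(guests, bool) or guests < 1:
        return 400, "invalid guests"
    try:
        date.fromisoformat(day)
    except (TypeError, ValueError):
        return 400, "invalid date"
    return 201, "created"


VALID = {"name": "Ada", "guests": 2, "date": "2030-01-15"}

# One payload per category named in the adversarial prompt.
ADVERSARIAL_CASES = {
    "sql_injection": {**VALID, "name": "x'; DROP TABLE bookings;--"},
    "very_long_string": {**VALID, "name": "A" * 10_000},
    "null_value": {**VALID, "name": None},
    "negative_integer": {**VALID, "guests": -5},
    "invalid_date_format": {**VALID, "date": "31/12/2030"},
}


def run_adversarial_suite() -> Dict[str, int]:
    """Return the status code the handler produced for each adversarial case."""
    return {label: create_booking(p)[0] for label, p in ADVERSARIAL_CASES.items()}
```

Every entry returned by `run_adversarial_suite()` should be 400, while the `VALID` payload should still produce 201. Input validation like this is only a first line of defense; parameterized queries remain the actual protection against SQL injection.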