LLM Collaboration¶

How to work effectively with LLMs using Specwright's specification-first approach.

The Handoff Model¶

Specwright enables a clean handoff between human and LLM:

Responsibility	Human	LLM
What the function does	Type hints, docstring	-
How it does it	-	Implementation
What to test	`@requires_tests` cases	Test implementations
Error policy	`@handle_errors` mapping	-
Valid states	`StateMachine` definition	Transition logic

The human defines the contract. The LLM fulfills it. Specwright verifies the result.

Workflow: Function Generation¶

1. Human Writes the Spec¶

from specwright import spec, requires_tests

@requires_tests(
    happy_path=True,
    edge_cases=["empty_list", "single_item", "duplicates"],
    error_cases=["non_numeric", "mixed_types"],
)
@spec
def median(values: list[float]) -> float:
    """Calculate the median of a list of numbers.

    For even-length lists, returns the average of the two middle values.
    Raises ValueError for empty lists.
    """
    ...

2. LLM Sees the Spec¶

An LLM reading this function knows:

Input: list[float]
Output: float
Behavior: Median calculation, with averaging for even-length lists
Error: ValueError for empty lists
Required tests: 6 specific test functions

3. LLM Writes the Implementation¶

@requires_tests(
    happy_path=True,
    edge_cases=["empty_list", "single_item", "duplicates"],
    error_cases=["non_numeric", "mixed_types"],
)
@spec
def median(values: list[float]) -> float:
    """Calculate the median of a list of numbers.

    For even-length lists, returns the average of the two middle values.
    Raises ValueError for empty lists.
    """
    if not values:
        raise ValueError("Cannot calculate median of empty list")
    sorted_vals = sorted(values)
    n = len(sorted_vals)
    mid = n // 2
    if n % 2 == 0:
        return (sorted_vals[mid - 1] + sorted_vals[mid]) / 2
    return sorted_vals[mid]

4. Specwright Validates¶

At decoration time: type hints and docstring are present
At runtime: arguments are list[float], return is float
At test time: all 6 required test functions exist

Workflow: State Machine Generation¶

1. Human Defines the States¶

from specwright import StateMachine, transition, spec

class DocumentReview(StateMachine):
    states = ["draft", "submitted", "in_review", "approved", "rejected", "published"]
    initial_state = "draft"
    track_history = True

2. LLM Adds Transitions¶

class DocumentReview(StateMachine):
    states = ["draft", "submitted", "in_review", "approved", "rejected", "published"]
    initial_state = "draft"
    track_history = True

    @transition(from_state="draft", to_state="submitted")
    @spec
    def submit(self, author: str) -> str:
        """Submit the document for review."""
        return f"Submitted by {author}"

    @transition(from_state="submitted", to_state="in_review")
    @spec
    def assign_reviewer(self, reviewer: str) -> str:
        """Assign a reviewer to the document."""
        return f"Assigned to {reviewer}"

    @transition(from_state="in_review", to_state="approved")
    @spec
    def approve(self, reviewer: str) -> str:
        """Approve the document."""
        return f"Approved by {reviewer}"

    @transition(from_state="in_review", to_state="rejected")
    @spec
    def reject(self, reviewer: str, reason: str) -> str:
        """Reject the document with a reason."""
        return f"Rejected by {reviewer}: {reason}"

    @transition(from_state=["rejected", "draft"], to_state="draft")
    @spec
    def revise(self, changes: str) -> str:
        """Revise the document."""
        return f"Revised: {changes}"

    @transition(from_state="approved", to_state="published")
    @spec
    def publish(self) -> str:
        """Publish the approved document."""
        return "Published"

3. Specwright Prevents Bugs¶

doc = DocumentReview()
doc.publish()  # Can't publish a draft!
# InvalidTransitionError: Cannot transition from 'draft' to 'published'
# via 'publish'. Valid source state(s): approved

The LLM can't introduce a bug where a draft gets published without review — the state machine makes it structurally impossible.

Tips for LLM Prompts¶

When asking an LLM to implement Specwright-decorated functions:

Do: Include the Full Decorator Stack¶

Implement the following function. The decorators define the contract —
follow the type hints, docstring, and test requirements exactly.

@requires_tests(happy_path=True, edge_cases=["empty"])
@spec
def process(items: list[str]) -> dict:
    """Process a list of items into a frequency dict."""
    ...

Do: Ask for Tests That Match Requirements¶

Also write the tests. The @requires_tests decorator expects these
test function names: test_process_happy_path, test_process_empty

Don't: Let the LLM Change the Spec¶

The spec is the human's domain. If an LLM suggests changing type hints, adding parameters, or modifying the docstring, that's a conversation — not an automatic change.

Using the CLI with LLMs¶

The specwright new command generates stubs that are perfect for LLM handoff:

specwright new function search_users \
    --params "query: str, limit: int" \
    --returns "list[dict]"

This creates search_users.py with the spec already defined. Hand the file to an LLM with:

Fill in the implementation for search_users.py. Follow the type hints
and docstring exactly. Also complete the test file test_search_users.py.

The LLM writes code, you run specwright validate, and the framework tells you if everything checks out.