Theme 05 · Test-Confidence Oriented

Observable behavior over mocks/spies

Explanation

Plain Human Explanation

A confident test checks what the user or system can observe: the response, saved record, queued message, emitted event, or visible state. A test built on a mock or spy checks only that the code called another function, not that the call produced the right outcome.

Mocks are sometimes useful, especially at true outside boundaries. But if the test mostly says “this method was called,” it can pass even when the product behavior is wrong.
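
The sketch below illustrates how a call-only check can pass while the result is wrong. The names applyDiscount, Logger, and makeSpyLogger are hypothetical, and the spy is hand-rolled so the snippet runs without a test framework.

```typescript
// Hypothetical names throughout: applyDiscount, Logger, and makeSpyLogger
// are illustrative and not part of the curriculum's examples.
type Logger = { info(message: string): void };

// Deliberately broken: the discount is added instead of subtracted.
function applyDiscount(
  priceCents: number,
  discountCents: number,
  logger: Logger,
): number {
  logger.info("applying discount");
  return priceCents + discountCents; // bug: should be priceCents - discountCents
}

// A hand-rolled spy that records every message it receives.
function makeSpyLogger(): Logger & { calls: string[] } {
  const calls: string[] = [];
  return {
    calls,
    info(message) {
      calls.push(message);
    },
  };
}

const logger = makeSpyLogger();
const discounted = applyDiscount(1000, 200, logger);

// Call-based check: true even though the result is wrong.
const callCheckPasses = logger.calls.includes("applying discount");

// Outcome-based check: false, which exposes the bug.
const outcomeCheckPasses = discounted === 800;
```

A test that asserted only the log call would stay green here. A test that asserted the discounted price would fail, which is the signal you want.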

Technical Explanation

Prefer assertions against outputs, state changes, outbox rows, audit events, and rendered responses. Use spies only when the call itself is the observable behavior, such as sending a provider request that has no local result.

This often pairs with real seams: put the outside dependency behind a port, substitute a local implementation in tests, then inspect what that implementation recorded instead of checking an internal call count.
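
As a minimal sketch of that pairing, assume an audit log sits behind a small port. AuditEvent, AuditLog, InMemoryAuditLog, and deactivateAccount are hypothetical names chosen for illustration, and the port is synchronous only to keep the snippet short.

```typescript
// Hypothetical names: AuditEvent, AuditLog, InMemoryAuditLog, and
// deactivateAccount are illustrative assumptions.
type AuditEvent = {
  action: "account-deactivated";
  accountId: string;
  actor: string;
};

// The port: production code would back this with a real audit store.
interface AuditLog {
  record(event: AuditEvent): void;
}

// The local substitute: stores events in memory and lets a test read them back.
class InMemoryAuditLog implements AuditLog {
  private readonly stored: AuditEvent[] = [];

  record(event: AuditEvent): void {
    this.stored.push(event);
  }

  events(): AuditEvent[] {
    return [...this.stored];
  }
}

function deactivateAccount(accountId: string, actor: string, audit: AuditLog) {
  audit.record({ action: "account-deactivated", accountId, actor });
  return { accountId, active: false };
}

// Exercise the workflow, then look at results instead of call counts.
const auditLog = new InMemoryAuditLog();
const deactivation = deactivateAccount("acct_1", "admin_7", auditLog);
const recordedEvents = auditLog.events();
```

A test would compare deactivation and recordedEvents with the expected values. Nothing in it depends on how many times record was called internally.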

Why It Matters

User impact: tests fail when the actual outcome is wrong, not only when implementation details change.
Product behavior: assertions describe what the system promises to do.
Risk: call-count tests can lock in private structure while missing broken responses or saved state.
Decision point: assert observable behavior unless the outbound call is truly the behavior.

The Core Move

Ask, “What would a user, support tool, downstream job, or database record observe?” Assert that result directly.

Small Example

Bad TypeScript Example

import { expect, test, vi } from "vitest";

type Cart = {
  items: Array<{
    priceCents: number;
  }>;
};

function calculateTotal(
  cart: Cart,
  logger: {
    info(message: string): void;
  },
) {
  logger.info("calculating cart");
  return cart.items.reduce((sum, item) => sum + item.priceCents, 0);
}

test("calculates total", () => {
  const logger = { info: vi.fn() };

  // The returned total is discarded, so a wrong sum would go unnoticed.
  calculateTotal({ items: [{ priceCents: 500 }] }, logger);

  // Only the log call is asserted.
  expect(logger.info).toHaveBeenCalledWith("calculating cart");
});

Good TypeScript Example

import { expect, test } from "vitest";

type Cart = {
  items: Array<{
    priceCents: number;
  }>;
};

function calculateTotal(cart: Cart) {
  return cart.items.reduce((sum, item) => sum + item.priceCents, 0);
}

test("calculates the total users will see", () => {
  const total = calculateTotal({
    items: [{ priceCents: 500 }, { priceCents: 125 }],
  });

  expect(total).toBe(625);
});

What Changed

The bad version checks an implementation detail and ignores the result.
The good version asserts the visible behavior: the total the user will see.
Dropping the logger from calculateTotal leaves a pure function, so the test is simpler and less brittle.

Realistic Example

Notification tests are often written as “email client was called.” A stronger test checks the durable message the system queued and the response the caller received.

Bad TypeScript Example

test("sends invoice email", async () => {
  const email = { send: vi.fn() };

  await sendInvoiceReceipt({ customerId: "cus_1", invoiceId: "in_1" }, { email });

  expect(email.send).toHaveBeenCalled();
});

Good TypeScript Example

import { expect, test } from "vitest";

type ReceiptCommand = {
  customerId: string;
  invoiceId: string;
};

type OutboxMessage = {
  kind: "invoice-receipt";
  customerId: string;
  invoiceId: string;
};

type Outbox = {
  enqueue(message: OutboxMessage): Promise<void>;
  all(): Promise<OutboxMessage[]>;
};

async function queueInvoiceReceipt(command: ReceiptCommand, outbox: Outbox) {
  await outbox.enqueue({
    kind: "invoice-receipt",
    customerId: command.customerId,
    invoiceId: command.invoiceId,
  });

  return { status: "queued" };
}

test("queues an invoice receipt message", async () => {
  const messages: OutboxMessage[] = [];
  const outbox: Outbox = {
    async enqueue(message) {
      messages.push(message);
    },
    async all() {
      return messages;
    },
  };

  const result = await queueInvoiceReceipt({ customerId: "cus_1", invoiceId: "in_1" }, outbox);

  expect(result).toEqual({ status: "queued" });
  await expect(outbox.all()).resolves.toEqual([
    { kind: "invoice-receipt", customerId: "cus_1", invoiceId: "in_1" },
  ]);
});

What Changed

The bad version passes no matter which message is sent or who receives it.
The good version checks the durable outbox record and the workflow result.
The assertion survives internal refactors because it is tied to system behavior, not a private call.
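
To illustrate the refactor claim, the sketch below runs one observable check against two internally different implementations. The message shape mirrors the example above, but queueReceiptInline, queueReceiptWithHelper, buildMessage, and observedOutcome are hypothetical, and a plain array stands in for the outbox to keep the snippet synchronous.

```typescript
// Hypothetical sketch: a plain array stands in for the outbox.
type ReceiptCommand = { customerId: string; invoiceId: string };

type OutboxMessage = {
  kind: "invoice-receipt";
  customerId: string;
  invoiceId: string;
};

type QueueReceipt = (
  command: ReceiptCommand,
  outbox: OutboxMessage[],
) => { status: string };

// Version 1: builds the message inline.
const queueReceiptInline: QueueReceipt = (command, outbox) => {
  outbox.push({
    kind: "invoice-receipt",
    customerId: command.customerId,
    invoiceId: command.invoiceId,
  });
  return { status: "queued" };
};

// Version 2: same behavior after extracting a private helper.
function buildMessage(command: ReceiptCommand): OutboxMessage {
  return { kind: "invoice-receipt", ...command };
}

const queueReceiptWithHelper: QueueReceipt = (command, outbox) => {
  outbox.push(buildMessage(command));
  return { status: "queued" };
};

// One observable check: the returned result plus the stored messages.
function observedOutcome(queue: QueueReceipt): string {
  const outbox: OutboxMessage[] = [];
  const result = queue({ customerId: "cus_1", invoiceId: "in_1" }, outbox);
  return JSON.stringify({ result, outbox });
}

const inlineOutcome = observedOutcome(queueReceiptInline);
const helperOutcome = observedOutcome(queueReceiptWithHelper);
```

Both versions produce the same observed outcome, so an outcome-based test needs no changes after the refactor. A spy on buildMessage would have nothing to attach to in the first version.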

System Example

At system scale, observable assertions protect product outcomes while letting implementation details change.

Larger System-Level Bad TypeScript Example

test("admin refund calls services", async () => {
  const payments = { refund: vi.fn().mockResolvedValue({ id: "re_1" }) };
  const audit = { write: vi.fn() };
  const support = { notify: vi.fn() };

  await refundOrder("ord_1", { payments, audit, support });

  expect(payments.refund).toHaveBeenCalled();
  expect(audit.write).toHaveBeenCalled();
  expect(support.notify).toHaveBeenCalled();
});

Larger System-Level Good TypeScript Example

import { expect, test } from "vitest";

type Order = {
  id: string;
  status: "paid" | "refunded";
  amountCents: number;
};

type RefundRecord = {
  orderId: string;
  amountCents: number;
  reason: string;
};

type RefundSystem = {
  findOrder(id: string): Promise<Order | null>;
  saveOrder(order: Order): Promise<void>;
  saveRefund(refund: RefundRecord): Promise<void>;
  refunds(): Promise<RefundRecord[]>;
};

async function refundOrder(orderId: string, reason: string, system: RefundSystem) {
  const order = await system.findOrder(orderId);
  if (!order || order.status !== "paid") {
    return { status: 409, body: { error: "order-not-refundable" } };
  }

  await system.saveOrder({ ...order, status: "refunded" });
  await system.saveRefund({ orderId, amountCents: order.amountCents, reason });

  return { status: 200, body: { refunded: true, amountCents: order.amountCents } };
}

test("refund marks the order refunded and records the refund", async () => {
  let order: Order = { id: "ord_1", status: "paid", amountCents: 5000 };
  const refundRecords: RefundRecord[] = [];

  const system: RefundSystem = {
    async findOrder(id) {
      return id === order.id ? order : null;
    },
    async saveOrder(nextOrder) {
      order = nextOrder;
    },
    async saveRefund(refund) {
      refundRecords.push(refund);
    },
    async refunds() {
      return refundRecords;
    },
  };

  const response = await refundOrder("ord_1", "customer request", system);

  expect(response).toEqual({ status: 200, body: { refunded: true, amountCents: 5000 } });
  expect(order).toEqual({ id: "ord_1", status: "refunded", amountCents: 5000 });
  await expect(system.refunds()).resolves.toEqual([
    { orderId: "ord_1", amountCents: 5000, reason: "customer request" },
  ]);
});

What Changed

The bad version proves the service choreography, not the refund outcome.
The good version asserts the response, saved order state, and refund record.
The test explains the product behavior while leaving room to change how the workflow is organized.
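
The same observable style also covers the rejection path, which a call-based test can easily miss. The sketch below repeats Order, RefundRecord, RefundSystem, and refundOrder from the good example so it stands alone. The helpers inMemoryRefundSystem and refundAlreadyRefundedOrder are hypothetical additions.

```typescript
type Order = { id: string; status: "paid" | "refunded"; amountCents: number };
type RefundRecord = { orderId: string; amountCents: number; reason: string };

type RefundSystem = {
  findOrder(id: string): Promise<Order | null>;
  saveOrder(order: Order): Promise<void>;
  saveRefund(refund: RefundRecord): Promise<void>;
  refunds(): Promise<RefundRecord[]>;
};

// Same workflow as the good example above.
async function refundOrder(orderId: string, reason: string, system: RefundSystem) {
  const order = await system.findOrder(orderId);
  if (!order || order.status !== "paid") {
    return { status: 409, body: { error: "order-not-refundable" } };
  }

  await system.saveOrder({ ...order, status: "refunded" });
  await system.saveRefund({ orderId, amountCents: order.amountCents, reason });

  return { status: 200, body: { refunded: true, amountCents: order.amountCents } };
}

// Hypothetical helper: a local substitute seeded with a single order.
function inMemoryRefundSystem(seed: Order) {
  let order = seed;
  const refundRecords: RefundRecord[] = [];

  const system: RefundSystem = {
    async findOrder(id) {
      return id === order.id ? order : null;
    },
    async saveOrder(nextOrder) {
      order = nextOrder;
    },
    async saveRefund(refund) {
      refundRecords.push(refund);
    },
    async refunds() {
      return [...refundRecords];
    },
  };

  return { system, currentOrder: () => order };
}

// Hypothetical scenario: try to refund an order that was already refunded,
// then collect everything an observer could see afterwards.
async function refundAlreadyRefundedOrder() {
  const { system, currentOrder } = inMemoryRefundSystem({
    id: "ord_1",
    status: "refunded",
    amountCents: 5000,
  });

  const response = await refundOrder("ord_1", "customer request", system);

  return { response, order: currentOrder(), refunds: await system.refunds() };
}
```

A test built on this scenario would assert three things: the response is the 409 error, the stored order is unchanged, and no refund record was written.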

When To Use It

Use This When

The test can inspect a response, saved record, rendered output, queued message, audit event, or state transition.
A mock-heavy test keeps breaking during harmless refactors.
The important question is whether the product outcome happened, not which helper method ran.

Avoid This When

The outbound call is the only meaningful behavior and there is no local result to inspect.
You are testing a tiny adapter whose job is to call a provider with the right request shape.
Observing the final result would require a full end-to-end test when a focused spy would be clearer.
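
For the thin-adapter case, a focused spy can be the clearest assertion as long as it checks the request shape and not just that a call happened. The sketch below is a minimal illustration under assumptions: SmsRequest, SmsTransport, sendLoginCode, and the request fields are hypothetical and do not describe a real provider's API.

```typescript
// Hypothetical names: SmsRequest, SmsTransport, sendLoginCode, and the
// request fields are illustrative assumptions, not a real provider's API.
type SmsRequest = { to: string; body: string };

type SmsTransport = {
  post(request: SmsRequest): Promise<void>;
};

// A thin adapter: its whole job is to build the provider request.
async function sendLoginCode(
  phone: string,
  code: string,
  transport: SmsTransport,
): Promise<void> {
  await transport.post({
    to: phone,
    body: `Your login code is ${code}`,
  });
}

// A hand-rolled spy that records each request it receives.
function recordingTransport(): SmsTransport & { requests: SmsRequest[] } {
  const requests: SmsRequest[] = [];
  return {
    requests,
    async post(request) {
      requests.push(request);
    },
  };
}

// Exercise the adapter and return what the provider would have received.
async function requestsSentForLoginCode(): Promise<SmsRequest[]> {
  const transport = recordingTransport();
  await sendLoginCode("+15550100", "482913", transport);
  return transport.requests;
}
```

Here the outbound request is the behavior, so the spy is the observation. Comparing the whole recorded request keeps the test meaningful in a way that a bare toHaveBeenCalled would not.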

Tradeoffs

Observable assertions can require slightly better seams or local substitutes. The payoff is a test suite that protects behavior instead of private implementation shape.

Related Concepts

Real seams
Integration tests
SQLite/local substitutes

Practice Prompt

Beginner Exercise

Find a test whose main assertion is toHaveBeenCalled. Write down the user-visible or system-visible outcome that call was supposed to create.

Intermediate Exercise

Replace one call-count assertion with an assertion against a returned value, stored record, queued message, or emitted event.

Stretch Exercise

Introduce a small outbox or local port so a workflow can be tested by inspecting durable behavior instead of spying on a provider client.

Reflection Question

When is a spy the clearest assertion, and when is it hiding the behavior you actually care about?
