Run report

Expert prompt scorecard

Full Stack Ecommerce Checkout Web App: PromptGolf compares visible app completion against hidden product-engineering checks.

Codex CLI · gpt-5.5

Expert ecommerce spec

Use integer-cents totals; normalize promos; cap discounts; pre-discount free shipping; enforce stock, double-submit, loading/error, ARIA, and mobile behavior.
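
The pricing rules in this spec can be sketched roughly as follows. This is a minimal illustration, not the app the run generated: the promo table, the `TAKE20` code, the free-shipping threshold, the flat shipping fee, and the tax rate are all hypothetical values, and the function names are made up for the example.

```typescript
// All money values are integer cents, so no floating-point amounts are summed.
type Cents = number;

interface CartLine {
  unitCents: Cents;
  qty: number;
}

type Promo =
  | { kind: "percent"; percent: number }
  | { kind: "fixed"; cents: Cents };

// Hypothetical configuration, for illustration only.
const PROMOS = new Map<string, Promo>([
  ["PROMO15", { kind: "percent", percent: 15 }],
  ["TAKE20", { kind: "fixed", cents: 2000 }],
]);
const FREE_SHIPPING_MIN: Cents = 5000;
const FLAT_SHIPPING: Cents = 599;
const TAX_BASIS_POINTS = 850; // 8.5%, an assumed rate

// Trim whitespace and ignore case so " promo15 " matches "PROMO15".
function normalizePromo(raw: string): string {
  return raw.trim().toUpperCase();
}

function priceCart(lines: CartLine[], rawPromo: string = "") {
  const subtotal = lines.reduce((sum, l) => sum + l.unitCents * l.qty, 0);

  const code = normalizePromo(rawPromo);
  const promo = PROMOS.get(code);
  const promoError =
    code !== "" && promo === undefined
      ? `Promo code "${code}" is not valid.`
      : null;

  let discount = 0;
  if (promo) {
    discount =
      promo.kind === "percent"
        ? Math.round((subtotal * promo.percent) / 100)
        : promo.cents;
  }
  // Cap the discount so the discounted amount never drops below zero.
  discount = Math.min(discount, subtotal);

  // Free shipping is decided on the subtotal BEFORE the discount is applied.
  const shipping = subtotal >= FREE_SHIPPING_MIN ? 0 : FLAT_SHIPPING;

  // Assumption: tax is charged on the discounted amount, rounded to a cent.
  const tax = Math.round(((subtotal - discount) * TAX_BASIS_POINTS) / 10000);
  const total = subtotal - discount + shipping + tax;

  return { subtotal, discount, shipping, tax, total, promoError };
}
```

With these assumed values, a 5500-cent cart with the code `" promo15 "` still ships free even though the discounted amount (4675 cents) falls under the threshold, because the threshold check uses the pre-discount subtotal.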

Score: 98.5
10/10 hidden · 1 prompt
Public tests: 5/5
Hidden tests: 10/10
  • Displays cart items, prices, and quantities (public)

    Cart table is visible and scannable.

    pass
  • Allows quantity changes (public)

    Increment and decrement controls are present.

    pass
  • Shows subtotal, shipping, tax, discount, and total (public)

    Order summary includes expected rows.

    pass
  • Accepts promo codes (public)

    Promo input and apply action are present.

    pass
  • Shows order confirmation (public)

    Checkout reaches a success state.

    pass
  • Integer cents math (hidden)

    Avoids floating-point totals and tax drift.

    pass
  • Promo normalization (hidden)

    Trims codes and handles case-insensitive matches.

    pass
  • Invalid code error (hidden)

    Bad codes produce clear, recoverable feedback.

    pass
  • Discount floor (hidden)

    Discounts cannot push payable total below zero.

    pass
  • Shipping threshold order (hidden)

    Free shipping uses the specified subtotal-before-discount rule.

    pass
  • Out-of-stock block (hidden)

    Unavailable line items prevent checkout.

    pass
  • Double-submit prevention (hidden)

    Repeated clicks cannot create duplicate orders.

    pass
  • Quantity boundaries (hidden)

    Quantities cannot go negative, zero accidentally, or above stock.

    pass
  • Loading and error states (hidden)

    Async states are visible and buttons disable while pending.

    pass
  • Mobile usability and accessibility (hidden)

    Core controls work on small screens with labels and keyboard affordances.

    pass
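
The quantity, stock, and double-submit checks listed above could be guarded along these lines. This is a hedged sketch: `clampQty`, `canCheckout`, and `submitOnce` are hypothetical helper names, and the report does not show how the generated app actually implements these behaviors.

```typescript
interface StockedLine {
  name: string;
  qty: number;
  stock: number;
}

// Clamp a requested quantity into [1, stock]. Non-finite input falls back to 1.
// A line whose stock is 0 still clamps to 1 and is rejected by canCheckout.
function clampQty(requested: number, stock: number): number {
  if (!Number.isFinite(requested)) return 1;
  const whole = Math.trunc(requested);
  return Math.min(Math.max(whole, 1), Math.max(stock, 1));
}

// Checkout is blocked for an empty cart, an out-of-stock line,
// or any quantity outside [1, stock].
function canCheckout(lines: StockedLine[]): boolean {
  return (
    lines.length > 0 &&
    lines.every((l) => l.stock > 0 && l.qty >= 1 && l.qty <= l.stock)
  );
}

// Wrap an async submit so that calls made while a request is pending
// reuse the in-flight promise instead of starting a second order.
// Assumption: the guard resets once the request settles, so the user
// can retry after an error.
function submitOnce<T>(submit: () => Promise<T>): () => Promise<T> {
  let pending: Promise<T> | null = null;
  return () => {
    if (pending === null) {
      const started = submit().finally(() => {
        pending = null;
      });
      pending = started;
      return started;
    }
    return pending;
  };
}
```

In a UI, the same pending flag would typically also drive the disabled state and loading label of the submit button, so the guard and the visible state stay in sync.
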
Production-aware checkout
generated checkout preview
Checkout
PROMO15
Canvas Tote x1
Field Notebook x2
USB-C Dock x3
Subtotal $152.00
Discount -$22.80
Total $140.18

The prompt names the domain quirks that the hidden tests check, so the generated app handles the edge cases a production checkout has to deal with.
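
The three amounts shown in the preview can be cross-checked in integer cents. The preview does not show shipping or tax rows, so the last step is only an inference: the remaining 1098 cents is consistent with free shipping plus an 8.5% tax on the discounted amount, but neither figure is stated in the report.

```typescript
// Figures copied from the preview, converted to integer cents.
const previewSubtotal = 15200;
const previewDiscount = 2280;
const previewTotal = 14018;

// The discount is exactly 15% of the subtotal, matching the PROMO15 code name.
const discountPercent = (previewDiscount * 100) / previewSubtotal;

// Amount left after the discount, and what must be added to reach the total.
const afterDiscount = previewSubtotal - previewDiscount;
const shippingPlusTax = previewTotal - afterDiscount;

// Assumption (not stated in the report): shipping is free and tax is 8.5%
// (850 basis points) of the discounted amount, rounded to the nearest cent.
const assumedTax = Math.round((afterDiscount * 850) / 10000);

console.log({ discountPercent, afterDiscount, shippingPlusTax, assumedTax });
```

Under that assumption the assumed tax equals the whole gap between the discounted amount and the total, which is what a pre-discount free-shipping rule would produce for a $152.00 subtotal.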

Failure categories
None - all requested hidden checkout checks passed

Sandbox/run timeline

  1. Resolve model
    complete
    Codex CLI provider selected: gpt-5.5 through AI SDK adapter.
  2. Provision sandbox
    running
    Live sandbox adapter is configured; run creation probes the sandbox API and reports connected or degraded state.
  3. Generate app
    complete
    Agent applied the submitted spec to a Next.js checkout implementation.
  4. Install + build
    complete
    npm install cache restored, TypeScript build completed.
  5. Playwright evaluation
    complete
    Public and hidden checkout tests executed with product seed cart data.
  6. Scorecard
    complete
    Scoring rewards public tests, hidden tests, UX/style, and prompt efficiency.