Pricing calculation audit trail (RFC)

Design doc proposing a typed, milestone-frozen trace of every pricing decision so disputes about quote numbers can be answered exactly — even weeks later.

Author: @mileszim | PR: dealops#5706 | Type: RFC / design only | System: Dealops 2 | Files: 1 | Diff: +207 / −0 | Door: Two-way | Status: Draft

The problem

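Answering “Why does this PricingQuote’s TCV/ARR show this number?” currently requires an engineer to reconstruct the calculation by hand. For numbers generated weeks ago, the engine code may have changed since, so even a manual reconstruction may not reproduce the figure.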

The proposal

Capture a typed TraceNode tree during engine runs, freeze it at three milestones (SAVE / APPROVAL_SUBMIT / CRM_PUSH), and expose an explainQuote tRPC procedure.

What ships in this PR

Just the RFC at rfcs/2026-05-29-pricing-calculation-audit-trail.md. No schema, engine, runtime, or UI changes. Feedback wanted on milestone capture, node taxonomy, and shadow-validation rollout.

1. Why this exists

The Dealops 2 pricing engine already funnels every TCV/ARR computation through one orchestrator (PricingEngineSummary.calculateSummary) and one service (PricingEngineService). A peLogger seam is already threaded through every layer. The architecture is ready — two things are missing.

Gap 1 — the reasoning path

Decisions like “volume 1,200 matched tier 3 [1000–5000]”, “applied 0.5 proration because the product runs Jul–Dec against a Jan–Dec contract”, or “monthly-min tier not found → fell back to selfServe” are made and then discarded. Yet these decisions are exactly what explains a number.

Gap 2 — no link to the displayed number

PricingEngineLogger.invocationId is a fresh UUID per run and is never persisted on the quote, and the logger emits only coarse checkpoints to Datadog. You can’t go from “quote shows $4.2M” to “here is the calc.”

What does exist today

Final numbers persist on PricingQuote.output as a rich PricingEngineSummaryOutput: per-product breakdown (byProduct, blended prices, tier discounts, ramp monthlyBreakdown) and per-commitment tiers (byCommitment). The decomposition of the total is in the DB; the path that produced each piece is not.

2. Locked-in decisions vs. open questions

Dimension	Decision	Implication
Audience	Both, phased — internal first, customer-facing later	Nodes carry a `visibility` field
Fidelity	Must match history exactly	Trace captured & frozen at the moment; can’t backfill
Granularity	Full line-item path	Every decision — pricebook row, tier, proration, fallback, variable
Manual explain	Ephemeral on drafts; stored on history	`explainQuote` branches on locked state
Retention	Keep all milestone traces forever	Dedupe via `@@unique([pricingQuoteId, inputHash, trigger])`
Storage	Inline Postgres JSON for now	Revisit blob offload later

3. The trace, as a typed tree

New types in packages/types/v2/. A tree of TraceNodes — renderable as a drill-down waterfall, diffable between two quotes.

Rollup kinds

tcv, arr, deal_total
product_revenue
category_rollup
commitment

Decision kinds

list_price_pick — which pricebook row / variable
tier_match — which tier the volume landed in
proration — factor + the date math
ramp_step — per-month volume/revenue
variable_eval — registry var evaluated to X
fallback — alternate path + reason

Sample shape

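// Kinds enumerated above: rollups first, then decisions.
type TraceNodeKind =
  | 'tcv' | 'arr' | 'deal_total' | 'product_revenue' | 'category_rollup' | 'commitment'
  | 'list_price_pick' | 'tier_match' | 'proration' | 'ramp_step' | 'variable_eval' | 'fallback';

// AnnotatedNumber is not defined in this RFC; it is assumed to be an existing type.
type TraceNode = {
  kind: TraceNodeKind;
  label: string;                      // "Product: Card Issuing"
  inputs: Record<string, unknown>;    // { volume: 1200, tierBounds: [1000, 5000] }
  output?: AnnotatedNumber;
  source?: { kind: 'pricebook_row' | 'variable' | 'product_input'; id: string };
  note?: string;                      // "monthly-min tier not found → selfServe"
  visibility: 'internal' | 'customer';
  children: TraceNode[];
};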

Visibility gating

Set at emit time. Cost/margin/internal-fallback nodes are internal; price × volume × term × discount nodes are customer. The Phase 3 customer-facing view renders only customer nodes.
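
The gating rule can be illustrated with a small filter. This is only a sketch under stated assumptions: the TraceNode here is a simplified stand-in (no inputs, source, or output), the helper name `toCustomerView` is hypothetical, and it assumes an internal node hides its entire subtree. Whether customer-visible children of an internal node should instead be hoisted is not specified in this RFC.

```typescript
// Simplified stand-in for the RFC's TraceNode.
type Visibility = "internal" | "customer";

interface TraceNode {
  kind: string;
  label: string;
  visibility: Visibility;
  children: TraceNode[];
}

// Hypothetical helper: returns a copy of the tree containing only customer nodes.
// Assumption: an internal node hides its whole subtree.
function toCustomerView(node: TraceNode): TraceNode | null {
  if (node.visibility !== "customer") return null;
  const children = node.children
    .map(toCustomerView)
    .filter((child): child is TraceNode => child !== null);
  return { ...node, children };
}

const sample: TraceNode = {
  kind: "product_revenue",
  label: "Product: Card Issuing",
  visibility: "customer",
  children: [
    { kind: "list_price_pick", label: "pricebook_row #pb_7f2", visibility: "internal", children: [] },
    { kind: "tier_match", label: "volume=1,200 → tier 3", visibility: "customer", children: [] },
  ],
};

const customerView = toCustomerView(sample);
```

Because the filter returns a copy, the stored trace stays intact and the same record can serve both the internal and the customer-facing view.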

What a trace looks like, sketched

tcv  Deal TCV [customer] → $4,212,000
├─ arr  Year 1 ARR [customer]
│  ├─ product_revenue  Card Issuing [customer]
│  │  ├─ list_price_pick  pricebook_row #pb_7f2 [internal]
│  │  ├─ tier_match  volume=1,200 → tier 3 [1000–5000] [customer]
│  │  └─ proration  factor=0.5 (Jul–Dec vs Jan–Dec) [customer]
│  └─ product_revenue  Monthly Min [customer]
│     └─ fallback  monthly-min tier missing → selfServe [internal]
└─ commitment  Yr2 commitment scale [customer]

4. Capturing it — enrich the existing seam

A TraceCollector implements the peLogger interface (same log() method, so existing call sites still feed Datadog) and adds two methods: a structured event(node, parentId?) and a scoped withNode(node, fn). It is injected exactly where PricingEngineLogger is constructed today.
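
A minimal sketch of what such a collector might look like follows. The `PeLogger` interface shape, the numeric node ids, and the simplified TraceNode are assumptions made for illustration; the real peLogger signature may differ, and this version handles synchronous scopes only.

```typescript
// Assumed minimal logger contract; the real peLogger signature may differ.
interface PeLogger {
  log(message: string, data?: unknown): void;
}

// Simplified stand-in for the RFC's TraceNode.
interface TraceNode {
  kind: string;
  label: string;
  children: TraceNode[];
}

class TraceCollector implements PeLogger {
  readonly roots: TraceNode[] = [];
  private readonly inner: PeLogger;
  private readonly byId = new Map<number, TraceNode>();
  private readonly stack: number[] = []; // ids of currently open withNode scopes
  private nextId = 1;

  constructor(inner: PeLogger) {
    this.inner = inner;
  }

  // Same log() surface, so existing call sites keep feeding the wrapped logger.
  log(message: string, data?: unknown): void {
    this.inner.log(message, data);
  }

  // Records a node under parentId, else under the innermost open scope, else as a root.
  event(node: TraceNode, parentId?: number): number {
    const id = this.nextId++;
    this.byId.set(id, node);
    const parent: number | undefined = parentId ?? this.stack[this.stack.length - 1];
    const parentNode = parent !== undefined ? this.byId.get(parent) : undefined;
    (parentNode ? parentNode.children : this.roots).push(node);
    return id;
  }

  // Opens a scope: nodes emitted while fn runs become children of `node`.
  // Synchronous only; an async fn would need the scope held until it settles.
  withNode<T>(node: TraceNode, fn: () => T): T {
    const id = this.event(node);
    this.stack.push(id);
    try {
      return fn();
    } finally {
      this.stack.pop();
    }
  }
}

const logged: string[] = [];
const collector = new TraceCollector({ log: (message: string) => { logged.push(message); } });
collector.withNode({ kind: "product_revenue", label: "Card Issuing", children: [] }, () => {
  collector.log("tier lookup");
  collector.event({ kind: "tier_match", label: "volume=1,200 → tier 3", children: [] });
});
```

Wrapping the existing logger rather than replacing it keeps the Datadog stream unchanged while the tree is built alongside it.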

Decision point	Location (function / file)	Node emitted
List-price selection	`getListPrice` / `formulaUtils.ts`	`list_price_pick`
Tier match	`findTierForVolume` / `blendTiered` (~1493–1553)	`tier_match`
Proration	`applyProductDateProration` (~2344)	`proration`
Ramp month	`getRevenue` ramp path	`ramp_step`
selfServe / other fallbacks	`penguin_calculator.ts:370`	`fallback`
Variable eval	`formulaUtils.ts` `getVal:30` / `getVar:71`	`variable_eval`
Rollups	`pricingEngineSummary.ts`, `getTCV`, `computeArrAtScale`	`product_revenue` / `category_rollup` / `tcv` / `arr`

Opt-in by design. useCalculatePricing fires on every keystroke; collecting a full trace there would be pure waste. The collector is enabled only when (a) persisting at a milestone, or (b) answering an explicit explainQuote. Default path keeps the cheap PricingEngineLogger. Low architectural risk (wiring exists), high surface area (3,640-line service) — mitigated by shadow validation, below.

5. Persisting it — at milestones, atomically

A new table written inside the existing withPlatformEvents transaction, so the trace commits atomically with the milestone’s PlatformEvent. Never a milestone without its trace, never an orphan trace.

Milestones: SAVE / APPROVAL_SUBMIT / CRM_PUSH

enum TraceTrigger {
  SAVE
  APPROVAL_SUBMIT
  CRM_PUSH
}

model PricingCalculationTrace {
  id             String   @id @default(uuid())
  organizationId String
  pricingQuoteId String

  trigger    TraceTrigger        // SAVE | APPROVAL_SUBMIT | CRM_PUSH
  capturedAt DateTime @default(now())

  // Reproducibility / drift detection
  engineVersion  String           // git SHA
  inputHash      String           // hash of effectiveInput (+ spec/pricebook versions)
  effectiveInput Json             // the augmented input actually fed to the engine

  // Denormalized top-line: fast list reads, drift checks, indexing
  tcv Json
  arr Json

  invocationId String              // ties trace ↔ the Datadog peLogger run
  trace        Json                // the TraceNode tree

  @@unique([pricingQuoteId, inputHash, trigger]) // dedupe identical re-saves
  @@index([pricingQuoteId, capturedAt])
}

Why only milestones

Disputes are almost never about a transient editor keystroke. They’re about a number that became real: saved, approved, pushed to CRM. Those three transitions already emit PlatformEvents — natural and sufficient freeze points.

Why effectiveInput, not just input

Amendments run the engine on a baseline-merged input. get.ts:195–281 recomputes amendments on every read. We store the full effective input so the historical number is genuinely re-derivable, aligned with ApprovalGraphState.inputSnapshot.
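
One way the inputHash column could be derived is sketched below. The RFC only says "hash of effectiveInput (+ spec/pricebook versions)", so the choice of SHA-256, the key-sorted serialization, and the names `canonicalize` and `computeInputHash` are all assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serializes with object keys sorted, so logically equal inputs hash identically
// regardless of property order.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const entries = Object.keys(value)
    .sort()
    .map((key) => `${JSON.stringify(key)}:${canonicalize(value[key])}`);
  return `{${entries.join(",")}}`;
}

// Hypothetical helper: SHA-256 (an assumed algorithm) over the effective input
// plus the spec and pricebook version identifiers.
function computeInputHash(
  effectiveInput: Json,
  versions: { spec: string; pricebook: string },
): string {
  const payload = canonicalize({
    effectiveInput,
    spec: versions.spec,
    pricebook: versions.pricebook,
  });
  return createHash("sha256").update(payload).digest("hex");
}

const versions = { spec: "v1", pricebook: "pb_1" }; // illustrative values
const hashA = computeInputHash({ volume: 1200, product: "card_issuing" }, versions);
const hashReordered = computeInputHash({ product: "card_issuing", volume: 1200 }, versions);
const hashChanged = computeInputHash({ product: "card_issuing", volume: 1201 }, versions);
```

A stable serialization matters here because the unique constraint dedupes on this value: two saves of the same effective input should collide, and any change to the input or the versions should not.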

6. Reading it back — `explainQuote`

A new tRPC procedure on the pricingEngine router. The branch logic encodes the ephemeral-vs-stored decision.

explainQuote({ pricingQuoteId, atTrace?: traceId })

  if atTrace provided OR the quote is in a historical/locked state:
      → return the STORED PricingCalculationTrace (exact-match history)
  else (live/edited draft):
      → recompute calculateSummary with a TraceCollector,
        return the tree, persist NOTHING

“Locked” keys off existing signals: PricingQuote.isLocked / isDocuSignLocked and the presence of milestone traces.
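
The branch can be isolated as a pure decision function, sketched here. Only isLocked and isDocuSignLocked come from the RFC; the name `planExplain`, the `ExplainPlan` type, and the "unavailable" outcome for locked quotes that predate trace capture are assumptions. This sketch also assumes that the mere presence of milestone traces does not make an edited draft count as locked.

```typescript
// Lock signals named in the RFC; everything else here is hypothetical.
interface QuoteLockState {
  isLocked: boolean;
  isDocuSignLocked: boolean;
}

type ExplainPlan =
  | { source: "stored"; traceId: string } // return the frozen PricingCalculationTrace
  | { source: "ephemeral" }               // recompute with a collector, persist nothing
  | { source: "unavailable" };            // locked, but no milestone trace was captured

function planExplain(
  quote: QuoteLockState,
  latestTraceId: string | null,
  atTrace?: string,
): ExplainPlan {
  // An explicit trace id always wins: the caller asked for exact history.
  if (atTrace !== undefined) return { source: "stored", traceId: atTrace };
  const locked = quote.isLocked || quote.isDocuSignLocked;
  if (!locked) return { source: "ephemeral" };
  return latestTraceId !== null
    ? { source: "stored", traceId: latestTraceId }
    : { source: "unavailable" };
}

const unlocked = { isLocked: false, isDocuSignLocked: false };
const lockedQuote = { isLocked: true, isDocuSignLocked: false };

const draftPlan = planExplain(unlocked, "trace_2");
const lockedPlan = planExplain(lockedQuote, "trace_2");
const pinnedPlan = planExplain(unlocked, "trace_2", "trace_1");
const legacyPlan = planExplain(lockedQuote, null);
```

Keeping the decision separate from the loading and recomputing steps makes the ephemeral-vs-stored rule easy to unit test without a database or an engine run.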

7. Rollout — history-capture first

The unusual part: persistence ships before the UI, because traces cannot be backfilled.

Phase 1 of 3: Collector + milestone persistence, no UI

Ship the TraceCollector, the table, and the write inside withPlatformEvents at SAVE / APPROVAL_SUBMIT / CRM_PUSH, all behind a per-org flag. Shadow validation asserts recomputed top-line == persisted output. Start accruing history.

8. Alternatives considered

Alternative	Verdict	Reason
Recompute-on-demand only	Rejected as primary	Reflects engine as it behaves now; silently disagrees with historical figures. Kept as the live-draft path.
Use existing `output.byProduct` / `byCommitment`	Kept as fallback	Has no path info (which tier, which row, which fallback). Fine degraded view for pre-capture quotes.
Stuff trace into `PlatformEvent` payload	Rejected	Payload is a Zod-validated external API contract; traces are large/nested/internal.
Datadog-only (enrich `peLogger`)	Rejected as system of record	Retention-limited, not CS/customer-accessible, `invocationId` isn’t tied to the quote. Kept feeding Datadog regardless.
Persist on every `calculateSummary`	Rejected	Enormous write volume for transient states nobody disputes.
Blob storage now	Deferred	Start inline with size cap + truncation marker; revisit if p99 size warrants offload.

9. Risks & mitigations

Shadow-validation rollout is the de-risking mechanism for instrumenting a 3,640-line engine. While flagged on, every milestone capture asserts that the top-line recomputed from the trace equals the persisted output.tcv/arr; mismatches alert via errorNotificationService. The explain UI cannot depend on traces until the mismatch rate is ~0.
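
The invariant might be checked along these lines. This is a sketch under assumptions: node outputs are plain integers in minor units (the real AnnotatedNumber differs), the top-line is read from the first tcv and arr nodes in the tree, and `shadowValidate` is a hypothetical name. The ARR figure below is illustrative, and a real implementation would route mismatches to errorNotificationService rather than return them.

```typescript
// Simplified stand-ins; output is an integer amount in minor units (assumption).
interface TraceNode {
  kind: string;
  label: string;
  output?: number;
  children: TraceNode[];
}

interface PersistedTopLine {
  tcv: number;
  arr: number;
}

interface Mismatch {
  field: "tcv" | "arr";
  fromTrace: number | undefined;
  persisted: number;
}

// Depth-first search for the first node of a given kind.
function findFirst(node: TraceNode, kind: string): TraceNode | undefined {
  if (node.kind === kind) return node;
  for (const child of node.children) {
    const hit = findFirst(child, kind);
    if (hit) return hit;
  }
  return undefined;
}

// Returns one entry per top-line field whose traced value differs from the persisted one.
function shadowValidate(root: TraceNode, persisted: PersistedTopLine): Mismatch[] {
  const mismatches: Mismatch[] = [];
  for (const field of ["tcv", "arr"] as const) {
    const fromTrace = findFirst(root, field)?.output;
    if (fromTrace !== persisted[field]) {
      mismatches.push({ field, fromTrace, persisted: persisted[field] });
    }
  }
  return mismatches;
}

const trace: TraceNode = {
  kind: "tcv",
  label: "Deal TCV",
  output: 421200000, // $4,212,000 in cents
  children: [{ kind: "arr", label: "Year 1 ARR", output: 210600000, children: [] }],
};

const clean = shadowValidate(trace, { tcv: 421200000, arr: 210600000 });
const drifted = shadowValidate(trace, { tcv: 421200000, arr: 210600001 });
```

Comparing integers in minor units avoids floating-point noise, which matters when the requirement is an exact match rather than a tolerance.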

Risk	Mitigation
Instrumentation regresses a number or perf on the hot path	Collector off unless persisting/explaining. Keystroke path unchanged. Per-org flag.
Captured trace doesn’t reflect displayed number	Shadow validation (described above). Hold the UI back until clean.
Trace size for 50 products × 36 ramped months	Depth/count caps with explicit `truncated: true` marker (mirrors logger’s depth-10 guard). Denormalized `tcv`/`arr` keep list reads off the JSON.
“Match history exactly” fails if engine code changed	The stored trace is the historical record — faithful regardless. `engineVersion` (git SHA) lets us flag when a recompute would diverge.
Amendment effective-input drift	Store `effectiveInput` aligned with `ApprovalGraphState.inputSnapshot`.
Unbounded growth (keep-forever)	Accepted per decision; `@@unique([pricingQuoteId, inputHash, trigger])` dedupes identical re-saves.
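
The depth/count caps from the trace-size row could work roughly as follows. The cap names, the `capTrace` helper, and the choice to mark the parent whose children were dropped are assumptions; the RFC specifies only that truncation is explicit via `truncated: true`.

```typescript
// Simplified stand-in for the RFC's TraceNode, plus the truncation marker.
interface TraceNode {
  kind: string;
  label: string;
  children: TraceNode[];
  truncated?: true;
}

interface Caps {
  maxDepth: number; // root is depth 1
  maxNodes: number; // total nodes kept, root included
}

// Returns a capped copy; any node that lost children is marked truncated: true.
function capTrace(root: TraceNode, caps: Caps): TraceNode {
  let count = 0;
  const visit = (node: TraceNode, depth: number): TraceNode => {
    count += 1;
    const kept: TraceNode[] = [];
    let dropped = false;
    for (const child of node.children) {
      if (depth + 1 > caps.maxDepth || count >= caps.maxNodes) {
        dropped = true;
        break;
      }
      kept.push(visit(child, depth + 1));
    }
    const copy: TraceNode = { kind: node.kind, label: node.label, children: kept };
    if (dropped) copy.truncated = true;
    return copy;
  };
  return visit(root, 1);
}

// 36 ramp months under one product, capped to 5 nodes in total.
const ramped: TraceNode = {
  kind: "product_revenue",
  label: "Card Issuing",
  children: Array.from({ length: 36 }, (_, i): TraceNode => ({
    kind: "ramp_step",
    label: `month ${i + 1}`,
    children: [],
  })),
};
const countCapped = capTrace(ramped, { maxDepth: 10, maxNodes: 5 });

// A three-level chain, capped to two levels.
const chain: TraceNode = {
  kind: "tcv",
  label: "Deal TCV",
  children: [{ kind: "arr", label: "Year 1 ARR", children: [ramped] }],
};
const depthCapped = capTrace(chain, { maxDepth: 2, maxNodes: 100 });
```

Marking the parent rather than silently dropping nodes lets a renderer show "more items not stored" at the exact point where detail was cut.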

10. What this RFC does not change

No Prisma migration in this PR — the PricingCalculationTrace model is proposed, not added.
No changes to pricingEngineService.ts, pricingEngineSummary.ts, or any calculator.
No new tRPC procedure — explainQuote is proposed only.
No UI changes; no apps/client code touched.
No change to the PlatformEvent payload schema.
No change to keystroke-path performance — even when implemented, the collector is opt-in.
Sandbox extractor registry.ts is expected to need no entry (direct Organization FK) — to be confirmed when the model lands.

11. Open questions (explicitly deferred)

Customer exposure model

Is the curated waterfall shown to reps only, or to end customers (order form / proposal)? Affects how conservative visibility: 'customer' gating must be. Deferred to Phase 3 design note.

CRM-push granularity

One trace per push, or per quote? Leaning one-per-quote keyed to the push event; confirm against qliFanOut.ts writeback.

Trace schema versioning

Old stored traces must still render as TraceNode evolves. Likely a schemaVersion column if the shape proves volatile.

Diff tooling

“Diff two quotes’ traces” is arguably the highest-leverage debugging feature — out of scope here, but the data model is shaped to support it later.

12. Feedback wanted

Milestone capture points

Are SAVE, APPROVAL_SUBMIT, CRM_PUSH the right freeze points? Anything missing (e.g. quote duplication, version forks)?

TraceNode taxonomy & visibility gating

Is the kind set right? Is the internal vs customer binary sufficient, or do we need finer roles (e.g. rep-visible vs admin-only)?

Shadow-validation approach

Is asserting recomputed-top-line == persisted output the right invariant? Sufficient fixture coverage (new-business, amendment, renewal, ramped, tiered, multi-commitment)?
