Plaid v2 diff harness: three fixes to cut 1246 structural diffs to ~550

Triage of the first Plaid v2 hux smoke — split into one classifier change, one missing-field emission, and one narrow canary backfill.

PR: dealops#5683 · Author: @mehulshinde · Bead: dealops-020.2 · Parent: dealops-020 · Files: 7 · +/-: +1543 / -0 · Status: Open

Fix 1 · Harness classifier

Adds a placeholder rule to classifyDiff and 6 custom fields to the v1 readback SOQL. Eliminates ~350 false-positive diffs that were never real divergence — just harness blindness.

Fix 2 · Top-level QLI fields

v2 now threads SBQQ__ProductCode__c and SBQQ__Description__c from the canary SalesforceProduct row into umbrella + tier QLI bodies. Closes ~100 genuine emission gaps.

Fix 3 · Narrow backfill

New CLI fills 5 missing extraFields keys on canary rows from a live SF read. Dry-run by default; --commit guarded against the read replica. Closes ~250 data-side diffs.

1. Why this exists

The first hux smoke of the Plaid v2 diff harness compared v1's Salesforce readback against v2's would-write payload across 4 quotes and reported 1246 structural diffs. That number is too high to triage line-by-line, so dealops-020 ran a matched-pair inspection on individual QLIs to bin the diffs by root cause.

Bin A · Harness noise (~350)

v1's readback SOQL didn't SELECT custom fields that v2 already emits correctly, so v2's correct value diffed against v1=null. Composite-graph placeholders like @{dealops_new_quote.id} were also flagged as structural.

Bin B · Code gaps (~100)

SBQQ__ProductCode__c and SBQQ__Description__c were dropped at the summary boundary in loadSalesforceProductsByProduct2Ids. Real v2 bug.

Bin C · Partial canary data (~250)

Some products' extraFields blob is missing 5 specific SBQQ keys. Other products carry them. Targeted fill on the canary org only.

Bin D · Out of scope (~550)

cjd concern #7.20 (per-unit price + discount math) and a PricebookEntry gap on one Product2Id. Separate beads.

Projected diff reduction

Baseline: 1246
+ Fixes 1 & 2: ~900
+ Fix 3 backfill: ~550
Remaining: cjd #7.20

2. What changes — by fix

1a. Placeholder rule in `classifyDiff`

v2's composite-graph QLI bodies reference the not-yet-created quote via the SF placeholder @{dealops_new_quote.id} (and child-of-parent QLIs via @{dealops_<n>.id}). SF resolves these at execution time; v1's readback sees the resolved Id. Treat that as semantically equal.

// runPlaidDiffHarness.ts
function classifyDiff(field, v1, v2) {
  // Detection: v2 is a string starting with `@{` and ending with `}`.
  if (typeof v2 === 'string' && v2.startsWith('@{') && v2.endsWith('}')) {
    return 'acceptable';
  }
  if (STRUCTURAL_FIELDS.has(field)) return 'structural';
  // ...rounding-tolerance path follows
}

The rule lives before the structural/rounding dispatch, but only triggers on the exact @{...} shape — partial matches, non-string values, and v1-only mismatches all fall through to the normal classifier.
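The boundary behaviour described above can be exercised with a small standalone sketch. The contents of `STRUCTURAL_FIELDS`, the `ROUNDING_TOLERANCE` value, the `"rounding"` label, and the sample Ids are assumptions made for illustration; only the placeholder check itself is taken from the excerpt.

```typescript
type DiffClass = "acceptable" | "structural" | "rounding";

// Hypothetical stand-ins: the real set and tolerance live in runPlaidDiffHarness.ts.
const STRUCTURAL_FIELDS = new Set<string>(["SBQQ__Quote__c", "SBQQ__RequiredBy__c"]);
const ROUNDING_TOLERANCE = 0.005;

// True only for the exact `@{...}` shape: a string that starts with `@{` and ends with `}`.
function isCompositePlaceholder(value: unknown): value is string {
  return typeof value === "string" && value.startsWith("@{") && value.endsWith("}");
}

function classifyDiff(field: string, v1: unknown, v2: unknown): DiffClass {
  // Placeholder rule runs first, and only inspects v2.
  if (isCompositePlaceholder(v2)) return "acceptable";
  if (STRUCTURAL_FIELDS.has(field)) return "structural";
  // Assumed shape of the rounding path; the real one is elided in the PR excerpt.
  if (
    typeof v1 === "number" &&
    typeof v2 === "number" &&
    Math.abs(v1 - v2) <= ROUNDING_TOLERANCE
  ) {
    return "rounding";
  }
  return "structural";
}

// Resolved Id on the v1 side, placeholder on the v2 side: treated as equal.
const quoteRef = classifyDiff("SBQQ__Quote__c", "a0Q000000000001AAA", "@{dealops_new_quote.id}");
// Missing closing brace: not the exact shape, so the normal classifier applies.
const partial = classifyDiff("SBQQ__Quote__c", "a0Q000000000001AAA", "@{dealops_new_quote.id");
// Placeholder on the v1 side only: the rule does not fire.
const v1Only = classifyDiff("SBQQ__Quote__c", "@{dealops_new_quote.id}", "a0Q000000000001AAA");
```

In this sketch `quoteRef` is `"acceptable"`, while `partial` and `v1Only` both stay `"structural"`, matching the fall-through cases listed above.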

1b. v1 readback SOQL extension

Six custom fields added to SBQQ_QUOTELINE_FIELDS_TO_READ:

Field	Why it was a false positive
`Product_GBT__c`	v2 emits from `extraFields`; v1 readback was blind
`SBQQ__Hidden__c`	same
`Fee_Structure__c`	same
`Risk_Category__c`	same
`SBQQ__NonDiscountable__c`	same
`Product_Name_Summary_Variable_Mapping__c`	same

Design choice: extending the SELECT (rather than allowlisting in classifyDiff) keeps future drift visible. If v2 stops emitting one of these, the diff comes back automatically.
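As a rough illustration of the SELECT-extension approach, the sketch below builds the readback query from a single field list. The six added field names come from the table above; `BASE_FIELDS`, the `buildQuoteLineReadbackSoql` helper, and the WHERE clause are hypothetical and may not match how the harness actually assembles its SOQL.

```typescript
// Hypothetical pre-existing fields; the real list is not shown in this PR summary.
const BASE_FIELDS = ["Id", "SBQQ__Quote__c", "SBQQ__Product__c"] as const;

// The six custom fields named in the table above.
const ADDED_CUSTOM_FIELDS = [
  "Product_GBT__c",
  "SBQQ__Hidden__c",
  "Fee_Structure__c",
  "Risk_Category__c",
  "SBQQ__NonDiscountable__c",
  "Product_Name_Summary_Variable_Mapping__c",
] as const;

const SBQQ_QUOTELINE_FIELDS_TO_READ: readonly string[] = [
  ...BASE_FIELDS,
  ...ADDED_CUSTOM_FIELDS,
];

// Salesforce record Ids are 15 or 18 alphanumeric characters.
const SF_ID_PATTERN = /^[a-zA-Z0-9]{15}([a-zA-Z0-9]{3})?$/;

// Hypothetical helper: builds the v1 readback query for one quote.
function buildQuoteLineReadbackSoql(quoteId: string): string {
  if (!SF_ID_PATTERN.test(quoteId)) {
    // Reject anything that is not Id-shaped rather than interpolating it into SOQL.
    throw new Error(`Not a Salesforce Id: ${quoteId}`);
  }
  const fields = SBQQ_QUOTELINE_FIELDS_TO_READ.join(", ");
  return `SELECT ${fields} FROM SBQQ__QuoteLine__c WHERE SBQQ__Quote__c = '${quoteId}'`;
}

const soql = buildQuoteLineReadbackSoql("a0Q000000000001AAA");
```

Because every field v1 reads back is compared against v2, a field that v2 later stops emitting shows up as a diff again. An allowlist inside `classifyDiff` would hide that regression.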

The summary boundary was dropping two fields

v1's per-QLI body reads SBQQ__ProductCode__c and SBQQ__Description__c off salesforceProduct.{ProductCode,Description}. v2's loadSalesforceProductsByProduct2Ids previously narrowed the SF row to concern #3/#5/#7 fields and silently dropped these two.

Before · summary builder

{
  Product2Id,
  CurrencyIsoCode: row.CurrencyIsoCode ?? null,
  extraFields: narrowSalesforceExtraFields(row.extraFields),
  children: [...],
}

After · summary builder

{
  Product2Id,
  CurrencyIsoCode: row.CurrencyIsoCode ?? null,
  extraFields: narrowSalesforceExtraFields(row.extraFields),
  ProductCode: row.ProductCode,
  Description: row.Description,
  children: [...] // children also gain ProductCode + Description
}

Emission in `buildPlaidQliTailFields`

// Empty string === unset (Prisma `@default("")`).
// Emitting "" would create a false-positive diff vs v1's real value.
const productCode =
  sfRow?.ProductCode != null && sfRow.ProductCode !== ''
    ? sfRow.ProductCode : undefined;
const description =
  sfRow?.Description != null && sfRow.Description !== ''
    ? sfRow.Description : undefined;

return {
  ...(extraFields ?? {}),
  CurrencyIsoCode: ...,
  SBCF_Approval_Level__c: approvalLevel ?? 0,
  Order_Form_Product_Name__c: ...,
  SBQQ__RenewedSubscription__c: renewedSubscriptionId ?? null,
  // Declared last so they win on key collision with extraFields.
  SBQQ__ProductCode__c: productCode,
  SBQQ__Description__c: description,
};

Picked up automatically by all three QLI paths — segment, tier, single — because they all flow through this tail helper. Per-tier child QLIs source from the child SF row, not the umbrella (mirrors the existing CurrencyIsoCode plumbing).
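The interaction between the empty-string check and the "declared last" ordering can be seen in a reduced sketch. `nonEmpty`, `tailFields`, and the sample values are hypothetical; the sketch also assumes the QLI body is eventually JSON-serialized, which this summary does not state.

```typescript
type SfRow = { ProductCode?: string | null; Description?: string | null };

// Hypothetical helper mirroring the check in the excerpt: null, undefined and "" all count as unset.
function nonEmpty(value: string | null | undefined): string | undefined {
  return value != null && value !== "" ? value : undefined;
}

// Reduced version of the tail helper: only the spread and the two new fields.
function tailFields(
  extraFields: Record<string, unknown> | null,
  sfRow?: SfRow,
): Record<string, unknown> {
  return {
    ...(extraFields ?? {}),
    // Declared after the spread, so these win on key collision with extraFields.
    SBQQ__ProductCode__c: nonEmpty(sfRow?.ProductCode),
    SBQQ__Description__c: nonEmpty(sfRow?.Description),
  };
}

// Made-up sample data: extraFields carries a stale code, the SF row has an empty description.
const body = tailFields(
  { SBQQ__ProductCode__c: "STALE", Fee_Structure__c: "Flat" },
  { ProductCode: "PLAID-AUTH", Description: "" },
);

// JSON.stringify omits keys whose value is undefined, so the unset description never reaches the wire.
const wire = JSON.stringify(body);
const parsed = JSON.parse(wire) as Record<string, unknown>;
```

Here the SF row's product code replaces the stale `extraFields` value, and the empty description is absent from `parsed`. One consequence of the ordering worth keeping in mind: in this sketch an unset tail value also overrides a same-named key coming from `extraFields`, because the explicit `undefined` is still assigned last.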

Why a backfill, not an emission fix

The diagnostic script queryPlaidCanarySalesforceProductExtraFields.ts (also on this branch) refuted the broader "extraFields is empty" thesis at org level — 100% of canary rows have a populated blob. The matched-pair inspection narrowed it: some products are missing the same 5 keys across all their rows, while others carry them. Field-level partial data, not empty data.

The 5 target keys

SBQQ__BillingFrequency__c

SBQQ__SubscriptionBase__c

SBQQ__DefaultSubscriptionTerm__c

SBQQ__ProrateMultiplier__c

AdditionalDiscountUnit__c

Exported as TARGET_EXTRA_FIELD_KEYS and pinned by a test — changing the set without updating the bead is flagged as scope creep.

Merge policy

Scenario	Outcome
Prisma row has key, SF also has key	Preserved: Prisma value untouched
Prisma row missing key, SF has value	Filled: SF value merged in
Prisma row missing key, SF returns null	Skipped: never invent values
Prisma row has `key: null`	Filled: null treated as missing
Prisma `extraFields` blob is null	Coerced to `{}`, then merged
SF returned no row for Product2Id	Skipped: warning logged
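The merge policy in the table can be expressed as one pure function. This is a minimal sketch, not the CLI's actual code: `mergeMissingExtraFields`, `MergeResult`, and the sample values are hypothetical, and the warning for a missing SF row is left to the caller.

```typescript
const TARGET_EXTRA_FIELD_KEYS = [
  "SBQQ__BillingFrequency__c",
  "SBQQ__SubscriptionBase__c",
  "SBQQ__DefaultSubscriptionTerm__c",
  "SBQQ__ProrateMultiplier__c",
  "AdditionalDiscountUnit__c",
] as const;

type Blob = Record<string, unknown>;

interface MergeResult {
  merged: Blob;         // blob to write back (unchanged when nothing was filled)
  filledKeys: string[]; // keys copied from SF, useful for dry-run reporting
}

// Hypothetical helper implementing the table row by row.
function mergeMissingExtraFields(
  prismaExtraFields: Blob | null,
  sfRow: Blob | undefined,
): MergeResult {
  // A null blob is coerced to {} before merging.
  const merged: Blob = { ...(prismaExtraFields ?? {}) };
  const filledKeys: string[] = [];

  // SF returned no row for this Product2Id: skip (the caller logs the warning).
  if (sfRow === undefined) return { merged, filledKeys };

  for (const key of TARGET_EXTRA_FIELD_KEYS) {
    const current = merged[key];
    const incoming = sfRow[key];
    // Prisma already has a real value: preserve it untouched.
    if (current !== null && current !== undefined) continue;
    // SF has nothing to offer: never invent a value.
    if (incoming === null || incoming === undefined) continue;
    // Key is missing (or explicitly null) in Prisma and SF has a value: fill it.
    merged[key] = incoming;
    filledKeys.push(key);
  }
  return { merged, filledKeys };
}

// Made-up sample: one key preserved, one filled, one skipped because SF returned null.
const result = mergeMissingExtraFields(
  { SBQQ__BillingFrequency__c: "Monthly", SBQQ__SubscriptionBase__c: null },
  { SBQQ__BillingFrequency__c: "Annual", SBQQ__SubscriptionBase__c: "List", SBQQ__ProrateMultiplier__c: null },
);
```

In a dry run, a function like this would be called and `filledKeys` reported without writing `merged` back. Only keys in `TARGET_EXTRA_FIELD_KEYS` are ever considered, so other SF fields cannot leak into the blob.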

Safety guards

Dry-run is the default; --commit is required to write.
--commit is refused when DATABASE_URL === READ_ONLY_DATABASE_URL.
Reads go through prismaReadOnly; SF calls are read-only SOQL.
SF Id IN (...) queries are batched at 200 to stay well under the SOQL character limit and the 1000-element IN cap.
dotenv/config is imported at the very top so env vars are set before Prisma's eager module-load side effects run.
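Two of the guards above, the commit refusal and the batching of Ids, are simple enough to sketch in isolation. The names `resolveMode`, `chunk`, and `SOQL_IN_BATCH_SIZE` are hypothetical, and the environment is passed in as a plain object so the sketch does not depend on how the real CLI reads its configuration.

```typescript
const SOQL_IN_BATCH_SIZE = 200;

type Env = Record<string, string | undefined>;

// Dry-run unless --commit was passed; refuse to commit against the read replica.
function resolveMode(env: Env, commitFlag: boolean): "dry-run" | "commit" {
  if (!commitFlag) return "dry-run";
  if (env["DATABASE_URL"] === env["READ_ONLY_DATABASE_URL"]) {
    // Also trips when both are unset, which errs on the side of not writing.
    throw new Error("--commit refused: DATABASE_URL matches READ_ONLY_DATABASE_URL");
  }
  return "commit";
}

// Split a list into consecutive batches of at most `size` items.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError(`Batch size must be a positive integer, got ${size}`);
  }
  const batches: T[][] = [];
  for (let start = 0; start < items.length; start += size) {
    batches.push(items.slice(start, start + size));
  }
  return batches;
}

// 450 made-up Ids become batches of 200, 200 and 50, one `Id IN (...)` query each.
const sampleIds = Array.from({ length: 450 }, (_, i) => `id-${i}`);
const batchSizes = chunk(sampleIds, SOQL_IN_BATCH_SIZE).map((batch) => batch.length);
```

Keeping the mode decision in one function means every write path can ask the same question, and the default answer is always the non-writing one.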

Operator invocation

# Dry-run (default)
npx tsx src/dealops2/scripts/populatePlaidCanarySalesforceProductExtraFields.ts \
  --organization=e846ccc7-40b5-4897-a31b-761d6f51654c

# Commit (operator runs against writable DATABASE_URL after merge)
npx tsx src/dealops2/scripts/populatePlaidCanarySalesforceProductExtraFields.ts \
  --organization=e846ccc7-40b5-4897-a31b-761d6f51654c --commit

3. Tests

16 new Mocha cases across three files, all green. The tests pin both the happy path and the boundary conditions that, if loosened, would silently widen each rule.

Surface	Cases	What's pinned
`classifyDiff`	6	Placeholder happy path, child `SBQQ__RequiredBy__c` shape, non-placeholder mismatch stays structural, partial `@{...` doesn't trigger, non-string values follow normal path, rounding tolerance still works on price fields.
`buildPlaidQliTailFields`	3	Happy path emits both new fields, empty-string Prisma default collapses to undefined, missing SF row cleanly omits.
`populatePlaidCanarySalesforceProductExtraFields`	7	Dry-run writes nothing, commit writes only missing keys, SF null never invents, missing SF row skips, Prisma null-value treated as missing, null blob coerced to `{}`, `TARGET_EXTRA_FIELD_KEYS` list pinned.

4. What this PR doesn't change

v2's writeback graph structure — no changes to composite-request shape, batching, or ordering.
v1 code — readback SOQL extension is a SELECT-only addition on the harness side.
Production org behavior — Fix 3 is canary-only and operator-gated; it doesn't run automatically.
The Prisma schema — ProductCode and Description were already on the row; the summary builder just wasn't surfacing them.
cjd concern #7.20 (per-unit price + discount math, ~250 diffs) and the PricebookEntry gap on 01tUV000000SZ1HYAW — both filed as separate beads.

5. Operator follow-up after merge

Run Fix 3 with --commit against a writable DATABASE_URL.
Re-run the harness: runPlaidDiffHarness.ts --since=… --until=… --limit=5.
Expect structural diffs to drop from 1246 to ~550.
With only Fixes 1+2 (backfill not yet applied), expect ~900.

Rollback. Fixes 1 & 2 are pure additions — reverting the PR restores prior behavior with no schema or data side-effects. Fix 3 is operator-gated and only writes missing keys; rolling it back means leaving the merged data in place (harmless) or hand-clearing the 5 keys on the affected canary rows.
