Strategy & Operations · Builder

Organizations don't lack information.

They lack judgment infrastructure.

I've spent my career finding the thing underneath the thing — the structural failure mode hiding beneath the presenting problem. Across Braze, ASAPP, and Artsy, the pattern was always the same: organizations drowning in data, starving for decisions that stick. I'm building the instruments to close that gap.

Thomas Meerschwam
ex-Braze · ASAPP · Artsy · NYC
Active Building

The Ground Truth Decisioning System

01
What happened?
WBR Generator
Live
02
What was decided?
Meeting Intelligence
Live
03
What will happen?
Pipeline Synthesizer
Live
04
What matters most?
Prioritization Engine
In dev
05
Is it working?
Eval Harness
In dev
Five tools. One argument.

Organizations have systems of record for data, communication, and tasks. They have no equivalent for decisions — no persistent layer representing what is decided, who owns it, and under what assumptions those commitments remain valid. Each tool is a chapter in the same case: organizational clarity is an infrastructure problem, not a leadership personality trait.

01
WBR Generator
What happened?
Live

Most Weekly Business Reviews are rituals, not diagnostics. The WBR Generator deliberately separates two layers. A deterministic Pandas pipeline computes what actually happened: week-over-week changes, 4-week trend slopes, derived ratios, and z-score anomaly detection against a historical baseline. The reasoning layer (Claude Sonnet) receives only structured facts, never raw numbers, and is explicitly instructed not to summarize, not to hallucinate absent fields, and to produce the decisions required this week along with non-obvious leadership questions. A DATA AVAILABILITY block prevents hallucination on fields that aren't in the CSV.
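To make the deterministic layer concrete, here is a minimal sketch of the three computations named above: week-over-week change, a 4-week trend slope, and a z-score check against the |z| ≥ 2.0 cutoff. It is an illustration under assumptions, not the code in analysis.py: it uses the Python standard library instead of Pandas so it runs anywhere, and the function names, the sample series, and the choice of "all earlier weeks" as the baseline are hypothetical.

```python
from statistics import mean, stdev

Z_THRESHOLD = 2.0  # the |z| >= 2.0 cutoff named in the architecture


def wow_change(series):
    """Change of the latest week relative to the prior week, as a fraction."""
    prev, curr = series[-2], series[-1]
    return (curr - prev) / prev


def trend_slope(series, window=4):
    """Least-squares slope over the last `window` weeks (units per week)."""
    ys = series[-window:]
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def z_score(series):
    """z-score of the latest week against all earlier weeks (assumed baseline)."""
    baseline, latest = series[:-1], series[-1]
    return (latest - mean(baseline)) / stdev(baseline)


def structured_facts(name, series):
    """Bundle the computed values into the kind of record a reasoning layer could read."""
    z = z_score(series)
    return {
        "metric": name,
        "wow_change": round(wow_change(series), 4),
        "trend_slope_4wk": round(trend_slope(series), 4),
        "z_score": round(z, 2),
        "anomaly": abs(z) >= Z_THRESHOLD,
    }


# Hypothetical metric: seven ordinary weeks followed by a spike.
weekly_signups = [100, 102, 98, 101, 99, 103, 100, 130]
facts = structured_facts("weekly_signups", weekly_signups)
print(facts)
```

The point of the split is that every value in `facts` is computed, not inferred, so the model is only asked to interpret.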

"The bottleneck isn't data — it's interpretation. Dashboards can't solve a judgment problem."

WBR Generator · Architecture
Input · Streamlit UI
Metrics CSV upload
Fuzzy column mapping
Leadership context (3 sentences)
Layer 1 · Deterministic Analysis (analysis.py)
WoW Δ · 4-wk trend slopes
Pipeline coverage ratio
Z-score anomaly (|z|≥2.0)
Business logic heuristics
Runway / churn / CAC flags
DATA AVAILABILITY block
↓ structured facts only — no raw numbers passed
Layer 2 · Reasoning (Claude Sonnet · prompt.py)
Decisions required this week
Anomaly mechanism (not just flag)
Non-obvious leadership questions
Streaming output
Python · Pandas · NumPy
Anthropic API
Streamlit Cloud
02
Meeting Intelligence
What was decided?
Live

The gap between what got decided in a meeting and what anyone acts on is where execution debt is born. Meeting Intelligence runs deterministic preprocessing before Claude sees anything: timestamp stripping, speaker normalization, commitment detection, and decision signal scoring. Two Claude API passes follow, both using structured tool use. Pass 1 extracts decisions, open items, blockers, and unresolved questions with confidence scoring. Pass 2 drafts a follow-up from the structured output only, so raw transcripts never reach the follow-up pass. Name a recurring series and the tool tracks patterns across sessions via Supabase, surfacing what keeps getting left unresolved week over week.
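A rough sketch of what that preprocessing step could look like, covering timestamp stripping, speaker normalization, a naive commitment flag, and the 200-word quality gate. Everything specific here is an assumption for illustration: the transcript shape ("[00:00:05] speaker: text"), the alias table, and especially the commitment regex, which is a hypothetical stand-in for whatever rules preprocessing.py actually uses.

```python
import re

MIN_WORDS = 200  # quality gate threshold from the architecture

# Assumed line shape: "[00:00:05] speaker: text". Real transcript exports vary.
TIMESTAMP = re.compile(r"^\s*\[?\d{1,2}:\d{2}(?::\d{2})?\]?\s*")

# Hypothetical commitment phrases; a real detector would need a richer rule set.
COMMITMENT = re.compile(
    r"\b(i will|i'll|we will|we'll|by (monday|tuesday|wednesday|thursday|friday))\b",
    re.IGNORECASE,
)


def preprocess(transcript, aliases):
    """Strip leading timestamps, normalize speaker labels, flag commitment lines."""
    turns = []
    for line in transcript.splitlines():
        line = TIMESTAMP.sub("", line).strip()
        if ":" not in line:
            continue  # skip blank lines and anything without a speaker label
        speaker, text = line.split(":", 1)
        speaker = aliases.get(speaker.strip().lower(), speaker.strip())
        text = text.strip()
        turns.append(
            {
                "speaker": speaker,
                "text": text,
                "commitment": bool(COMMITMENT.search(text)),
            }
        )
    return turns


def passes_quality_gate(turns, min_words=MIN_WORDS):
    """Reject transcripts too short to contain reliable decision signals."""
    return sum(len(t["text"].split()) for t in turns) >= min_words


raw = """[00:00:05] tom m.: I'll send the pricing doc by Friday.
[00:00:12] Priya: We discussed the Q3 launch but nothing is final.
"""
turns = preprocess(raw, aliases={"tom m.": "Tom"})
print(turns)
print(passes_quality_gate(turns))
```

Only the first turn is flagged as a commitment, and this two-line sample fails the gate, which is the intended behavior for a transcript that short.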

"Anyone can summarize a meeting. The judgment is in knowing what a decision actually looks like versus a discussion."

Meeting Intelligence · Architecture
Layer 1 · Preprocessing (preprocessing.py)
Timestamp strip · speaker norm
Commitment detection
Decision signal scoring
Quality gate (min 200 words)
Layer 2 · Two-Pass Reasoning (Claude Sonnet · structured tool use)
Pass 1 (extract): decisions, open items, blockers, questions + confidence
Pass 2: follow-up draft from structured output only
Ownership vacuum · circular dependency detection
↓ structured output only — raw transcripts never written
Layer 3 · Cross-Session Persistence (storage.py · Supabase)
Meeting Series tracking
Fuzzy token-overlap pattern match
Weighted execution debt score
Org Friction Report
Python 3.11
Anthropic API
Supabase (Postgres)
pypdf
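The fuzzy token-overlap match in Layer 3 can be illustrated with a Jaccard-style comparison between open items from different sessions. This is a sketch under assumptions: the tokenizer, the 0.4 threshold, and the function names are hypothetical, and the Supabase read/write that would surround it is left out.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_overlap(a, b):
    """Jaccard similarity: shared tokens divided by all distinct tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def recurring_items(previous, current, threshold=0.4):
    """Pair each current open item with its best earlier match above the threshold."""
    matches = []
    for item in current:
        best = max(previous, key=lambda p: token_overlap(item, p), default=None)
        if best is not None and token_overlap(item, best) >= threshold:
            matches.append((item, best))
    return matches


# Hypothetical open items from two consecutive sessions of the same series.
last_week = ["Finalize pricing page copy", "Hire second SDR"]
this_week = ["Pricing page copy still not finalized", "Plan offsite agenda"]

carried_over = recurring_items(last_week, this_week)
print(carried_over)
```

Exact string matching would miss the pricing item because the wording changed between weeks. Token overlap catches it while leaving the unrelated offsite item unmatched.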
03
Pipeline & Forecast Synthesizer
What will happen?
Live

CRMs capture activity. They don't capture judgment. The Pipeline Synthesizer connects directly to HubSpot via REST API — resolving owner IDs to rep names and internal stage IDs to human-readable labels — then runs deterministic preprocessing before Claude sees anything: concentration analysis, slippage detection, deal age, days since last activity, weighted pipeline by stage probability. Claude then reasons like a CRO preparing for a board meeting, producing six structured sections via forced tool use: executive summary, forecast confidence, revenue risks, revenue opportunities, rep insights, and leadership actions.
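A small sketch of three of the deterministic metrics named above: probability-weighted pipeline, concentration in the largest deals, and staleness by days since last activity. The deal records, stage probabilities, and the 30-day idle cutoff are all hypothetical, and concentration is computed here as a share of open pipeline amount, which is an assumption about how the real tool defines it.

```python
from datetime import date

# Hypothetical stage probabilities; real values would come from the CRM's stage settings.
STAGE_PROBABILITY = {"discovery": 0.1, "proposal": 0.4, "negotiation": 0.7}

# Hypothetical deals, already normalized into one schema.
deals = [
    {"name": "Acme", "amount": 500_000, "stage": "negotiation", "last_activity": date(2025, 1, 20)},
    {"name": "Globex", "amount": 120_000, "stage": "proposal", "last_activity": date(2024, 12, 1)},
    {"name": "Initech", "amount": 80_000, "stage": "discovery", "last_activity": date(2025, 1, 28)},
    {"name": "Umbrella", "amount": 100_000, "stage": "proposal", "last_activity": date(2025, 1, 25)},
]


def weighted_pipeline(deals):
    """Sum of deal amounts, each discounted by its stage probability."""
    return sum(d["amount"] * STAGE_PROBABILITY[d["stage"]] for d in deals)


def concentration(deals, top_n=1):
    """Share of total open pipeline held by the top N deals."""
    amounts = sorted((d["amount"] for d in deals), reverse=True)
    return sum(amounts[:top_n]) / sum(amounts)


def stale_deals(deals, today, max_idle_days=30):
    """Names of deals with no recorded activity in more than `max_idle_days`."""
    return [d["name"] for d in deals if (today - d["last_activity"]).days > max_idle_days]


today = date(2025, 2, 1)
metrics = {
    "weighted_pipeline": weighted_pipeline(deals),
    "top1_concentration": round(concentration(deals, top_n=1), 3),
    "stale": stale_deals(deals, today),
}
print(metrics)
```

In this sample one deal holds well over half of the open pipeline, which is exactly the kind of fact the reasoning layer needs handed to it rather than left to infer.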

"3.2x coverage is a signal. 'The forecast risk is concentration, not volume' is judgment. This tool produces the latter."

Pipeline Synthesizer · Architecture
Input · Dual Path
HubSpot REST API: owner ID resolve · stage label resolve · deal pagination
CSV upload: Salesforce · Attio · any CRM
↓ both paths → same DataFrame schema
Layer 1 · Deterministic Preprocessing (preprocessing.py)
Stage normalization · probability weighting
Concentration analysis (top N as % ARR)
Slippage detection · deal age · staleness
Coverage ratio · single-rep dependency
↓ structured metrics only — no raw CRM data passed
Layer 2 · Six-Section Reasoning (Claude Sonnet · forced tool use)
Executive summary · Forecast confidence
Revenue risks · Revenue opportunities
Rep insights · Leadership actions
Python · Pandas
HubSpot CRM API v3
Anthropic API
Streamlit Cloud
04
Prioritization Engine
What matters most?
In Development

Prioritization fails not because teams lack frameworks but because the assumptions driving scores are invisible and uncontested. This engine makes tradeoffs explicit — surfacing the disagreements hidden inside prioritization scores and forcing the real strategic conversation rather than deferring it into a spreadsheet.
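Since this tool is still in development, the following is only a sketch of one way score divergence could be surfaced: rank initiatives by how far stakeholder scores spread, not by their average. The initiatives, the 1-5 scale, the stakeholder names, and the 1.0 spread threshold are all hypothetical.

```python
from statistics import pstdev

# Hypothetical 1-5 impact scores, one per stakeholder group.
scores = {
    "Self-serve onboarding": {"Sales": 5, "Product": 2, "Finance": 4},
    "SOC 2 certification": {"Sales": 4, "Product": 4, "Finance": 4},
    "Usage-based pricing": {"Sales": 1, "Product": 5, "Finance": 3},
}


def divergence_report(scores, threshold=1.0):
    """List initiatives where stakeholders disagree, most contested first."""
    rows = []
    for initiative, by_stakeholder in scores.items():
        spread = pstdev(by_stakeholder.values())
        if spread >= threshold:
            rows.append(
                {
                    "initiative": initiative,
                    "spread": round(spread, 2),
                    "highest": max(by_stakeholder, key=by_stakeholder.get),
                    "lowest": min(by_stakeholder, key=by_stakeholder.get),
                }
            )
    return sorted(rows, key=lambda r: r["spread"], reverse=True)


report = divergence_report(scores)
for row in report:
    print(row)
```

An averaged matrix would rank SOC 2 certification first and hide the fact that Sales and Product hold opposite views on usage-based pricing. The report puts that disagreement at the top.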

"A prioritization matrix that everyone agrees on is usually a sign that no real decision got made."

Prioritization Engine · Planned Architecture
MCP Integration Layer
Project management MCP (Linear / Jira)
Stakeholder input via structured form
↓ live org data — no manual export
Skill · Assumption Extraction
Score divergence detection
Hidden assumption surfacing
Stakeholder weight mapping
Reasoning · Disagreement Report (Claude Sonnet)
Contested assumption report
Forced tradeoff articulation
Decision prompt for leadership
↺ Human Calibration Loop
Assumptions revised
Re-score → new divergence map
MCP servers
Claude Sonnet
Streamlit
05
Eval Harness
Is it working?
In Development

The meta-tool. An evaluation framework for assessing whether the outputs of the other four tools are actually improving organizational decision quality over time — not just generating more structure. The loop that keeps the system honest.
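The harness is still planned, so this is only a sketch of its simplest piece: comparing eval scores before and after a prompt change and flagging anything that dropped. The eval names, the scores, and the 0.05 tolerance are hypothetical, and in practice the scores would come from an LLM judge.

```python
def regressions(before, after, tolerance=0.0):
    """Return evals whose score dropped by more than `tolerance` after a change."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] - after[name] > tolerance
    }


# Hypothetical judge scores (0 to 1) for one tool, before and after a prompt edit.
before = {"surfaces_decision": 0.80, "no_hallucinated_fields": 1.00, "flags_edge_cases": 0.60}
after = {"surfaces_decision": 0.85, "no_hallucinated_fields": 0.90, "flags_edge_cases": 0.60}

dropped = regressions(before, after, tolerance=0.05)
print(dropped)
```

The headline eval improved here, but the check still blocks the change because a hallucination eval regressed. Catching that kind of trade is what running the suite on every change is for.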

"Any system that can't measure its own impact will eventually become another ritual."

Eval Harness · Planned Architecture
Eval Suite · Defined Before Build
10 targeted evals per tool
LLM-as-judge scoring (Claude Sonnet)
Adversarial synthetic inputs
Regression suite: score before/after every change
↓ evals defined first — not after
Skill · Output Quality Scoring
Does output surface the right decision?
Hallucination detection
Edge case flagging vs. auto-deciding
Trace Log · Supabase
Agent run trace · planned
Score tracked over time
Drift detection across sessions
↺ The Loop That Keeps the System Honest
Prompts updated
Re-eval → score change tracked
Claude Sonnet (judge)
Supabase
CLI-driven iteration
The Arc

Three companies. Three failure modes.
One pattern underneath all of them.

Six years at the inflection point between strategy and operations — the place where good thinking goes to die if the infrastructure isn't there to hold it. At every company, the presenting problem was different. The underlying problem was always the same.

Braze
2024 – 2025
Scale through complexity

Product Manager, SMS & RCS · $3.5B public company

Owned GTM strategy for a high-eight-figure ARR product line. The mandate: launch RCS — a fundamentally new channel — across 11 international markets with 20+ cross-functional teams and no playbook. The presenting challenge was coordination. The real challenge was that each team optimized locally with no visibility into the full dependency chain.

Built the dependency map, held the line on a text-only MVP against Sales pressure (6 weeks to market vs. 4 months), and created the operational infrastructure before a single customer went live. Result: 40 enterprise customers in 90 days across 11 markets. Google's first official RCS for Business partner. 50%+ YoY ARR growth.

Leverage — the durable proof point

Separately: support cases growing faster than revenue — 100 tickets/quarter consuming 10% of engineering bandwidth. Everyone saw a capacity problem. The real problem was a signal problem: no one could see patterns beneath the volume, so every fix was reactive. Built a tagging framework (<30 seconds per ticket), ran cross-functional root cause analysis, isolated the 20% of issues driving 80% of volume. Fixed those specifically. Process continued after I left.

11 Markets launched
40 Enterprise customers, 90 days
20+ Teams orchestrated
50%+ YoY ARR growth
ASAPP
2022 – 2023
Align competing worlds

Product Manager & Strategic Advisor to CPO · Conversational AI, Fortune 500

The structural tension at ASAPP: enterprise customers needed deep customization; the product needed coherent architecture. I operated at that seam — managing commercial relationships while keeping the product from accumulating fatal debt.

Led a WhatsApp deployment to 2 million users in 30 days. When a large F500 customer threatened churn over a critical configuration requirement, I worked with engineering to modularize the core product rather than build a one-off fix — virtualizing the state layer in a way that protected the account, closed the expansion, and created a reusable architectural pattern that made similar enterprise configurations tractable going forward.

2M Users in 30 days
F500 Enterprise accounts managed
Artsy
2020 – 2022
Rebuild broken infrastructure

Revenue Operations & Product · Fine art marketplace

The revenue infrastructure had been patched for years — 15+ fragmented data sources, no single source of truth, a CRO preparing board reports from memory. Consolidated the data layer, built a revenue intelligence dashboard that compressed board prep from days to hours, and rebuilt subscription management from scratch — cutting contract management time by 50% while keeping churn below 5%.

Built self-service commissions analytics in Salesforce: a daily ETL pipeline pulling from Redshift that let 65+ reps model "if I close these three deals, I earn $X" — eliminating 100+ hours per quarter of manual calculation and changing how reps prioritized their pipelines in real time. Reps changed behavior because they could finally see reality.

15+ Data sources consolidated
50% Reduction in contract mgmt time
100+ Hours/quarter saved
The Argument

Why organizations accumulate execution debt despite having more information than ever

Over the last two decades, companies built extraordinary infrastructure for capturing and retrieving information — dashboards, CRMs, project tools, meeting transcripts, knowledge bases. The implicit assumption was that access to data was the binding constraint. It wasn't.

"The real constraint is what happens after information appears: how it is interpreted, turned into decisions, assigned ownership, and kept aligned as reality changes. On that front, we built almost nothing."

Why? Because decision infrastructure is hard to sell. The value of a CRM is legible: here are your contacts, your pipeline, your activity history. The value of a system that maintains organizational decision state is diffuse, slow to show up in metrics, and politically uncomfortable — genuine clarity exposes misalignment and accountability gaps that organizations have strong incentives to keep ambiguous. So the market built what was measurable and avoided what was necessary.

The predictable failure mode follows. Goals multiply without hierarchy. Metrics proliferate but lose trust. Meetings generate discussion but not durable, enforceable decisions. Ownership becomes implicit. Companies stop operating on a shared system of record and instead rely on informal memory, repeated clarification, and social alignment. They still function, but they are no longer synchronized. This is execution debt. It accumulates when decisions go unrecorded, commitments stay ambiguous, and the reasoning behind choices is lost faster than the work evolves.

AI makes this gap decisive rather than manageable. As the cost of generating outputs collapses, the bottleneck shifts from production to coherence. Without a decision layer, increased output produces entropy at higher velocity. The constraint was never information. It is judgment made durable: translating information into stable, traceable commitments that survive time, scale, and organizational complexity. That is not a data problem. It is an infrastructure problem, and missing infrastructure doesn't announce itself until the cost has already compounded.

Organizations have systems of record for data, communication, and tasks. They have no equivalent for decisions. Notion, Coda, Jira, and Confluence are useful tools that address adjacent problems. None were designed to maintain organizational decision state as a first-order concern. The layer has not been built as a unified system. That's what I'm working on.

If this resonates,
let's talk.

If you're building something where organizational clarity is the constraint — or if you just want to argue about decision infrastructure — I'd like to hear from you.

Send an email