From a demo to 59% approved as-sent in twelve weeks.
Next Level Sports runs 80+ youth flag football, basketball, volleyball, and cheer programs across 11 states, plus an e-commerce team store at store.nextlevelsports.com. We run a managed AI agent team for them across customer service, operations, finance, and program configuration. The numbers below are the published helpdesk production metrics; the finance pipelines are still in rollout toward BigQuery and report no metrics of their own yet.
- Drafts shipped: 1,942
- Approved as-sent: 59%
- Rewrite rate: 35%
- Live agents: 3
- Articles ingested: 141 KB
What the agent team owns.
Next Level Sports has hundreds of daily customer touch-points: parents asking about registration, jersey sizing, refunds, schedule changes, and order status. Their team handles all of it through Front. The same platform runs internal Slack work for daily deposit reporting and operational lookups today, with finance data syncs and program-configuration roles in rollout. Their connected stack includes Slack, Front, ShipStation, SportsConnect, Postgres, Contentful, Customer.io, QuickBooks, Bill.com, BigQuery, Fast.io, and Nango.
Helpdesk drafting on Front
Alfred and Selina draft replies in Front, grounded in 141 KB of ingested help articles, program data, prior tickets, ShipStation order context, and read-only customer data. A team reviewer approves or edits; edits become few-shot examples for the next draft. The agents can tag, comment, archive, unassign, or route conversations while humans keep send authority.
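As a rough illustration of how reviewer edits can feed the next draft, here is a minimal Python sketch. It is not the production implementation: the `ReviewedDraft` record, the `few_shot_examples` and `build_prompt` helpers, and the prompt layout are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class ReviewedDraft:
    question: str  # the customer's message
    draft: str     # what the agent proposed
    sent: str      # what the reviewer actually sent


def few_shot_examples(history, limit=3):
    """Return the most recent drafts a reviewer changed before sending."""
    edited = [r for r in history if r.sent.strip() != r.draft.strip()]
    return edited[-limit:]


def build_prompt(history, new_question):
    """Prepend reviewer-corrected replies to the next drafting request."""
    parts = [
        f"Customer: {ex.question}\nApproved reply: {ex.sent}"
        for ex in few_shot_examples(history)
    ]
    parts.append(f"Customer: {new_question}\nApproved reply:")
    return "\n\n".join(parts)
```

Only edited drafts are kept as examples in this sketch, on the assumption that a correction carries more signal than an untouched approval; a real system might weigh both.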
Daily deposit reporting + program lookups
Frank logs into SportsConnect each morning, pulls yesterday's deposits, generates a CSV, files it to Fast.io, and posts a Slack summary. He also handles ad-hoc program and customer lookups against the database, with a browser fallback for systems that do not expose a clean API.
QuickBooks + Bill.com into BigQuery
The newest finance role is in rollout: a pair of Nango-backed sync pipelines where QuickBooks journal entries and Bill.com bills/line items load into the nls-finance BigQuery project, with per-agent configuration, scoped credentials, high-water marks, and Slack summaries. It carries no production approval metrics yet. The daily deposit report stays the live source-file workflow; BigQuery becomes the analysis layer as the role earns its way in.
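For readers unfamiliar with high-water marks, the idea can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the actual pipeline: the `sync_increment` function and the `updated_at` field are hypothetical, and no Nango, QuickBooks, Bill.com, or BigQuery calls are shown.

```python
from datetime import datetime


def sync_increment(rows, high_water_mark):
    """Return rows updated after the mark, plus the new mark.

    rows: iterable of dicts with an "updated_at" datetime (assumed field name).
    """
    fresh = [r for r in rows if r["updated_at"] > high_water_mark]
    # If nothing is new, the mark stays where it was.
    new_mark = max((r["updated_at"] for r in fresh), default=high_water_mark)
    return fresh, new_mark


# Made-up rows for illustration only.
rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1)},
    {"id": 2, "updated_at": datetime(2024, 1, 3)},
]
fresh, mark = sync_increment(rows, datetime(2024, 1, 2))
# fresh holds only row 2; a second run with the returned mark loads nothing.
```

Storing the mark after each successful load is what lets a rerun pick up where the last one stopped instead of reloading everything.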
Program operations and configuration review
Bruce is the program-operations role now in rollout: auditing Contentful program builds, checking division structures, schedules, locations, registration windows, and conflicts before staff publish changes. The rule is read first, verify second, then route any live-registration change for approval. The role is still earning its way in and does not report its own metrics yet.
Routing + supervision
A supervisor pattern routes work between the customer service, finance, program ops, and lookup specialists; escalates to the human team when needed; and keeps the review trail visible so nothing falls through the cracks.
What twelve weeks of weekly review looks like.
- Week 1
Shadow mode. ~30% approval.
The agent drafts every reply; nothing ships without a human on the keys. We meet the team for the first weekly review, look at the misses, and tag the ten patterns that account for most of the rewrites.
- Week 4
Drafting in production. ~50% approval.
Top patterns absorbed. The agent's drafts are good enough that the team is mostly editing tone, not rewriting from scratch. Approval rate is up about 20 points from week one.
- Week 12
Approved as-sent. 59%.
The team stops editing most drafts entirely. Where they do edit, it is mostly a light touch on tone, context the agent couldn't have known, or a policy detail, rather than a rewrite from scratch. Rewrite rate is 35%, trending down month over month.
1,942 drafts processed across the first eight months in production. 59% were approved without rewrite: the human reviewer hit send as-is. 35% were lightly edited for tone, missing context, or policy detail. The rewrite rate trends down over time as the agent learns from each edit.
- Tone and phrasing 42%
- Missing context 28%
- Policy detail 18%
- Other 12%
When the team edits, they most often touch tone and phrasing. Those edits are the highest-leverage training signal — every edit becomes a few-shot example for the next draft.
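Rates like these come down to simple counting over review outcomes. The sketch below shows one way to compute them in Python; the `review_rates` function and the outcome labels are hypothetical, and the sample data is made up rather than taken from the production log.

```python
from collections import Counter


def review_rates(outcomes):
    """Share of each review outcome, as a fraction of all drafts."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


# Four made-up reviews: three sent as-is, one edited.
rates = review_rates(["approved", "approved", "approved", "edited"])
# rates == {"approved": 0.75, "edited": 0.25}
```

The same function works for the edit-reason breakdown if it is fed one reason label per edited draft instead of one outcome per draft.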
The edit loop is the moat.
Most AI customer-service tools try to go 100% autonomous on day one, fail on edge cases, and quietly get turned off. Our agents start in shadow mode, graduate to drafting, and earn more responsibility as the rewrite rate drops. The same operating model now applies beyond support: add a finance sync, a program-audit role, or a reporting workflow only after the role is clear enough to review.
The 59% number is what progress looks like in real production, not a demo.