Use cases

Five scenes from inside the product.

WorkReef is industry-agnostic by design. Customer-specific context lives in one config field, not in code. The five scenes below are real shapes from the alpha; the same pattern repeats with different surface area per vertical.

Medical device + field services · HIPAA

Movemedical, customer one.

Tuesday, 5:47am. A Stryker rep named Dana files a ticket from the parking lot of a surgical center in Indianapolis. Her case is in OR-3 at 8am. The device pairing didn't sync to her case file overnight, and she needs the right serial numbers in the chart before the surgeon scrubs. Pylon takes the ticket. Two minutes later it's queued behind 47 other tickets, most of them not surgical.

What the Cartographer pulled on day one. Pylon tickets clustered by surgical case (Dana's is one of 2,847 over the last 90 days). Jira issues for the product team. GitHub Actions failures (CI reliability matters when your software runs inside operating rooms). Anthropic admin and OpenAI admin keys for the AI tools already in use. Salesforce for rep credit and territory. Snowflake for the BI layer.

What the Architect proposed. Surgical-case escalation routing as a hybrid candidate. The AI reads the case-file metadata, flags whether the ticket is OR-day blocking, routes Dana's straight to a human who knows the OR slot is in two hours. Customer impact during OR hours (6am to 6pm local) is priority-bumped automatically. The Steward watches Anthropic and OpenAI spend by department weekly and posts the digest to the Movemedical leadership channel on Monday at 7:00.
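The two routing rules in that proposal can be sketched in a few lines of Python. This is an illustration under stated assumptions, not WorkReef's implementation: the `Ticket` fields, the queue names, and the priority labels are hypothetical, and the timestamp is assumed to be already in the surgical site's local time. Only the 6am to 6pm window comes from the scene above.

```python
from dataclasses import dataclass
from datetime import datetime, time

OR_HOURS_START = time(6, 0)   # 6am local, from the scene above
OR_HOURS_END = time(18, 0)    # 6pm local

@dataclass
class Ticket:
    filed_local: datetime        # assumed: already in the site's local time
    or_day_blocking: bool        # assumed flag derived from case-file metadata
    queue: str = "default"       # hypothetical queue name
    priority: str = "normal"     # hypothetical priority label

def route(ticket: Ticket) -> Ticket:
    """Apply both rules independently: human routing and the OR-hours bump."""
    if ticket.or_day_blocking:
        ticket.queue = "human"   # never drafted or answered by the AI
    if OR_HOURS_START <= ticket.filed_local.time() < OR_HOURS_END:
        ticket.priority = "high"
    return ticket

# Dana's 5:47am ticket on an arbitrary Tuesday: blocking, so it goes to a human.
dana = route(Ticket(filed_local=datetime(2025, 1, 7, 5, 47), or_day_blocking=True))
```

Keeping the two checks separate means a ticket filed before 6am can still reach a human immediately, which is the case that matters for Dana.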

What the gate refused to advance. Autonomous response on surgical-case tickets. The quorum panel found PHI exposure on 11% of the cases the AI wanted to draft replies for. Compliance is a hard veto on the Architect's output. Recurrence numbers do not override it. Dana's ticket stays human-answered.

Customer Support · tier-1 with an AI Supervisor

AI handles tier-1. Maria oversees.

Ticket triage at scale is the obvious AI candidate. It is also the canonical "company shipped AI and customers noticed it was an AI" failure. The escape hatch is the AI Supervisor: a former tier-2 engineer retrained to oversee the AI's tier-1 work. WorkReef names this role explicitly.

Maria, your AI Supervisor, signs in Monday morning. 247 tickets handled by the AI in the last week. 18 of those landed in her queue because the shadow comparison flagged a disagreement with the human baseline. Each disagreement has the AI's draft, the human's actual call, the system prompt the AI used, and a calibration button. Maria's feedback rewrites the agent's system brief, not just a single response.

The promotion gate has held the candidate at the pilot phase for three weeks. Agreement was 84% across 234 runs as of last Friday, just under the 85% threshold the platform requires to promote. By Tuesday morning agreement crosses 86%. Maria pulls eight more shadow runs to be sure before clicking promote. The platform does not push her. Backward moves are always allowed.
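A minimal sketch of the gate check, assuming agreement is simply the share of shadow runs where the AI's draft matched the human's call. The function names and the inclusive comparison are assumptions; only the 85% threshold comes from the text. The counts in the usage lines are illustrative, not the pilot's real numbers.

```python
PROMOTION_THRESHOLD = 0.85  # threshold named in the text

def agreement_rate(agreed_runs: int, total_runs: int) -> float:
    """Share of shadow runs where the AI matched the human baseline."""
    if total_runs == 0:
        return 0.0
    return agreed_runs / total_runs

def eligible_for_promotion(agreed_runs: int, total_runs: int,
                           threshold: float = PROMOTION_THRESHOLD) -> bool:
    """True only means the gate no longer blocks; a human still clicks promote."""
    return agreement_rate(agreed_runs, total_runs) >= threshold

# Illustrative counts: 84 of 100 is held, 86 of 100 is eligible.
held = eligible_for_promotion(84, 100)
ready = eligible_for_promotion(86, 100)
```

The function only reports eligibility. Promotion itself stays a human action, as it does for Maria.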

PHI and PCI tickets route to humans regardless of recurrence numbers. The compliance class is a hard veto in the quorum, not a soft signal.
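What "hard veto, not a soft signal" means in practice can be sketched as follows. The `Candidate` fields, the majority-vote rule, and the outcome labels are hypothetical; the point is only that the compliance check runs first and nothing after it can reverse it.

```python
from dataclasses import dataclass, field

HARD_VETO_CLASSES = {"PHI", "PCI"}  # compliance classes named in the text

@dataclass
class Candidate:
    name: str
    recurrence: int                                  # how often the task recurs
    compliance_classes: set = field(default_factory=set)
    panel_votes: list = field(default_factory=list)  # assumed: True = AI-feasible

def quorum_decision(candidate: Candidate) -> str:
    # Hard veto: checked before votes or recurrence are even looked at.
    if candidate.compliance_classes & HARD_VETO_CLASSES:
        return "human_only"
    # Assumed rule for everything else: a simple majority of the panel.
    approvals = sum(candidate.panel_votes)
    if approvals * 2 > len(candidate.panel_votes):
        return "ai_candidate"
    return "human_only"
```

A soft signal would feed into a weighted score that a high recurrence count could outvote. Returning early removes that possibility.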

If you're the support dept lead running this pilot, read this

Engineering · CI Reliability

Diego's flaky workflow becomes an automation candidate.

Diego is your platform engineer. He signs in at 7am. Fourteen CI runs failed overnight. He's seen the same Cypress integration test fail thirty-one times across two branches in the last month. He has half a fix in his head and zero time to write it before standup.

WorkReef's GitHub connector saw the pattern first. It walked failure runs across Diego's ten most-active repos, clustered by workflow and repo, and surfaced the failure as a task in a CI Reliability workstream. Thirty-one failures across two branches passed the AI-candidate threshold. Datadog log clustering corroborated. Two sources, same fingerprint. The cluster jumped to the top of the Architect's analysis queue last week. The quorum panel scored it AI-feasible. Diego approved the pilot.
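The clustering step can be sketched like this, assuming each run has already been fetched and flattened into a dict. The field names and the threshold value of 30 are hypothetical; the text says only that thirty-one failures passed the threshold.

```python
from collections import defaultdict

AI_CANDIDATE_THRESHOLD = 30  # hypothetical value; the real threshold is not stated

def cluster_failures(runs, threshold=AI_CANDIDATE_THRESHOLD):
    """Group failed runs by (repo, workflow) and keep clusters at or over threshold."""
    clusters = defaultdict(lambda: {"failures": 0, "branches": set()})
    for run in runs:
        if run["conclusion"] != "failure":
            continue
        key = (run["repo"], run["workflow"])
        clusters[key]["failures"] += 1
        clusters[key]["branches"].add(run["branch"])
    candidates = [
        {"repo": repo, "workflow": workflow, **stats}
        for (repo, workflow), stats in clusters.items()
        if stats["failures"] >= threshold
    ]
    # Most frequent failure first, so it lands at the top of the analysis queue.
    return sorted(candidates, key=lambda c: c["failures"], reverse=True)
```

Corroboration from a second source, such as log clustering, would be a separate check on top of this one.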

Now an agent monitors that workflow. When it fires, the agent files a PR with a draft fix and labels it. The tool scope is intentionally narrow: file a PR, label it, summon the IC overseeing AI for review. No merge authority. Diego's senior, Ana, gets the review request in Slack with the agent's name and the proposed diff. She approves or rewrites. Either way, agreement metrics climb. Eight more weeks like this and the promotion gate might propose advancing the agent to merge-without-human-review. The platform won't propose it before then. The backward move (revoke merge authority) is always available.

AI program management · the Steward

"What did AI cost us last month?"

Your CFO asked the question at the all-hands. You don't have a clean answer. The Anthropic key is on Pradeep's personal Console account. ChatGPT Enterprise is paid monthly on a corporate card without a department tag. GitHub Copilot seats sit assigned to people who left. The Vertex AI bill is buried inside Google Cloud somewhere. You have been promised the number for six weeks.

The Steward pulls from every connected source daily: Anthropic admin, OpenAI admin, Microsoft Copilot, GitHub Copilot, Power Automate, Vertex via the GCP billing export. Spend lands in one ledger. Last month's total was $42,180 across six providers. Engineering accounted for 53% of it (Copilot seats plus Diego's CI Reliability agent's API calls). Customer Support 28% (Maria's tier-1 agent). The remaining 19% split across Marketing, Sales, Finance, and the Steward's own runs.

Cost-per-value, ranked worst-first. Each Monday the Steward divides spend by a value-USD-equivalent per department and per source. The worst ratios get flagged. When it ranks, measured value counts for more than self-reported value. Last week's worst ratio: Marketing's Custom GPT spend rose from $3,200 to $4,800, and its attributed value still reads zero. The Steward posted it to Teams on Monday at 7:04am with one click to chase up.
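A sketch of the worst-first ranking, under two assumptions of ours: that a measured value, when present, replaces the self-reported one, and that spend with zero attributed value ranks as the worst possible ratio. The row fields are hypothetical, and apart from Marketing's $4,800 the figures are made up for illustration.

```python
def value_usd(row):
    """Prefer measured value; fall back to self-reported only when none exists."""
    measured = row.get("measured_value_usd")
    if measured is not None:
        return measured
    return row.get("self_reported_value_usd") or 0.0

def cost_per_value(row):
    spend = row["spend_usd"]
    value = value_usd(row)
    if spend == 0:
        return 0.0
    if value == 0:
        return float("inf")   # spend with no attributed value ranks worst
    return spend / value

def rank_worst_first(rows):
    return sorted(rows, key=cost_per_value, reverse=True)

rows = [
    {"dept": "Engineering", "source": "CI agent", "spend_usd": 1000.0,
     "measured_value_usd": 4000.0},
    {"dept": "Marketing", "source": "Custom GPT", "spend_usd": 4800.0,
     "measured_value_usd": None, "self_reported_value_usd": 0.0},
    {"dept": "Sales", "source": "Copilot", "spend_usd": 500.0,
     "measured_value_usd": None, "self_reported_value_usd": 500.0},
]
ranked = rank_worst_first(rows)
```

With these rows, Marketing comes out on top because its ratio is unbounded, which matches the flag in the scene above.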

Adoption health. A tool that was used heavily and is now dormant surfaces as waste. Copilot seats paid for and not invoked in 30 days surface as the easiest cut. The Steward does not moralize about spend. It shows the math and lets you decide.
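The dormant-seat check is simple enough to show in full. This sketch assumes each seat record carries a last-invoked date, or `None` if the seat was never used; the field names are hypothetical, and the 30-day window comes from the text.

```python
from datetime import date, timedelta

def dormant_seats(seats, today, window_days=30):
    """Return seats with no invocation inside the window, including never-used ones."""
    cutoff = today - timedelta(days=window_days)
    return [
        seat for seat in seats
        if seat["last_invoked"] is None or seat["last_invoked"] < cutoff
    ]
```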

Security · CISO governance review

Janelle exports the audit log.

Janelle is the CISO. She does not log in to demo the agent canvas. She logs in to certify the platform is safe to operate inside the company. Her workspace is a read-only Governance page that names every agent's current scope, the audit trail status, the customer-managed-key status, the redaction posture, and the security position against SOC 2 and HIPAA controls. What's done, what's mid-stride, what's still in front of us.

Audit verification. Janelle exports the audit log as CSV. She runs the verifier locally on her own hardware. Every entry's hash matches; the chain holds. She picks a random row from three months ago, edits one character, re-runs the verifier. It names the tampered row and breaks the chain at every subsequent entry. She doesn't have to take our word for it. She takes a screenshot for her quarterly board report.
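For readers who want to see why one edited character breaks everything after it, here is a generic hash-chain verifier in Python. It is a sketch of the technique, not WorkReef's verifier: the use of SHA-256, the all-zero genesis value, and the row layout are assumptions.

```python
import hashlib

GENESIS = "0" * 64  # assumed starting value for the first entry

def entry_hash(prev_hash: str, payload: str) -> str:
    """Each entry's hash covers its payload and the previous entry's hash."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

def build_chain(payloads):
    rows, prev = [], GENESIS
    for payload in payloads:
        prev = entry_hash(prev, payload)
        rows.append({"payload": payload, "hash": prev})
    return rows

def verify_chain(rows):
    """Return the indices of every row whose stored hash fails to verify."""
    broken, prev = [], GENESIS
    for index, row in enumerate(rows):
        expected = entry_hash(prev, row["payload"])
        if expected != row["hash"]:
            broken.append(index)
        # Carry the recomputed hash forward so one edit invalidates all later rows.
        prev = expected
    return broken

log = build_chain(["agent-7 opened PR", "ana approved", "scope revoked", "export"])
clean = verify_chain(log)                 # no broken rows
log[1]["payload"] = "ana approvee"        # edit one character
tampered = verify_chain(log)              # first index is the edited row
```

The first index in the result names the tampered row. Every later index fails too, because each hash depends on the recomputed one before it.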

Approval queue. Janelle sees who approved what, when, and what fired afterward. Restricted actions wait on a named approver with a 24-hour window. Her job is reviewing the approvals her team is making, not approving every per-task call herself. The platform tells her if anyone is rubber-stamping; it surfaces approvals that have an unusually short consideration time as a signal.
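Both signals in that paragraph reduce to timestamp arithmetic. In this sketch the 24-hour window comes from the text, while the 30-second floor for "unusually short" is a placeholder of ours; a real system might compare against each approver's own median instead. Field names are hypothetical.

```python
from datetime import datetime, timedelta

APPROVAL_WINDOW = timedelta(hours=24)       # window named in the text
MIN_CONSIDERATION = timedelta(seconds=30)   # hypothetical rubber-stamp floor

def review_approvals(approvals, now):
    """Split approvals into expired (never answered in time) and rushed."""
    expired, rushed = [], []
    for approval in approvals:
        requested = approval["requested_at"]
        approved = approval["approved_at"]
        if approved is None:
            if now - requested > APPROVAL_WINDOW:
                expired.append(approval["id"])
        elif approved - requested < MIN_CONSIDERATION:
            rushed.append(approval["id"])
    return {"expired": expired, "rushed": rushed}
```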

Spend caps. The monthly AI spend cap gates every cost-incurring path, including scheduled agent runs. When the cap is hit, further LLM calls return a clear error and the runs record as suspended. Sustained suspended runs surface on the on-call dashboard so Janelle's SRE counterpart sees the signal before she does.
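One way to picture the cap gating every cost-incurring path is a single ledger that each LLM call must charge before it runs. The class names, the error type, and the choice to reject a call that would exceed the cap (rather than one made after the cap is reached) are assumptions in this sketch.

```python
class SpendCapExceeded(Exception):
    """Raised when a call would push spend past the monthly cap."""

class SpendLedger:
    def __init__(self, monthly_cap_usd: float):
        self.monthly_cap_usd = monthly_cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd: float) -> None:
        if self.spent_usd + cost_usd > self.monthly_cap_usd:
            raise SpendCapExceeded(
                f"monthly AI spend cap of ${self.monthly_cap_usd:,.2f} reached"
            )
        self.spent_usd += cost_usd

def run_agent(ledger: SpendLedger, agent: str, estimated_cost_usd: float) -> dict:
    """Scheduled runs go through the same ledger as interactive calls."""
    try:
        ledger.charge(estimated_cost_usd)
    except SpendCapExceeded as err:
        return {"agent": agent, "status": "suspended", "error": str(err)}
    return {"agent": agent, "status": "completed", "error": None}
```

Recording the run as suspended, rather than dropping it, is what lets sustained suspensions show up on a dashboard later.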

Your scene isn't on the list?

The platform is industry-agnostic. The connector catalog has 28+ entries and a Custom REST escape hatch for the SaaS you built yourself. Talk to us about your stack.
