Microsoft 365-native · Bring your own key, your own LLM · Audit trail your CISO can verify
The demo went great. The CFO has a question your AI team can't answer.
Claude is on every desk. Copilot is in Office. Three of your engineers built ChatGPT scripts that touch the production database. Marketing pays $4,800 a month for a Custom GPT nobody owns. Your CFO asks, "What does this cost, and what is it worth?" The answer takes six weeks of cross-functional scrambling. Then it's stale the moment you deliver it.
The companies whose AI rollouts work share one thing. They can answer, in one sentence, what this bot does, who approved it, and what stops it when it gets one wrong. The rest can't. WorkReef is the layer that makes you one of them. Not by replacing the AI you already run. By governing it, and giving you a defensible answer the next time the board asks.
28+
connectors
3
frontier models, every recommendation
30/85%
shadow runs, agreement to promote
7
personas, each with their own surface
5
layers between AI and your data
What it feels like
It's a conversation. The answer paints itself.
Type a question. WorkReef thinks. The page beside the chat redraws into the cards, charts, and tables that answer it. Approve in one click. Drill into a specific candidate. Ask a follow-up. The canvas updates.
This is not a chatbot bolted onto a dashboard. The dashboard is composed from the conversation. The Lead agent narrates what changed overnight. The Steward agent shows you spend without you asking. The Architect proposes takeover candidates with the quorum trace attached. You read it like email, not like Tableau.
"What did the Customer Support team spend on Anthropic last week?"
"Why is Sarah's queue overloaded?"
"Promote the Tier-1 triage candidate to live."
"Show me everything that needs my approval today."
How the platform operates
Four moves. Then on repeat.
01
Discover
Connect your stack. The Cartographer agent ingests Microsoft 365, Salesforce, Pylon, QuickBooks, Jira, GitHub, Snowflake, AWS, Datadog, and 19 more. People, departments, workstreams, applications. The shadow AI nobody told you about. Spend nobody tracks. Inferred from real signal, not an interview script.
Twelve discovery dimensions worked in passes, never all at once
Existing AI surfaced before any new AI is proposed
Cross-source corroboration before anything lands as a proposal
02
Understand
Humans and AI agents both fill positions in your org. Capacity, cost, task portfolio. When a position is pressed, you see it before the people in it tell you. When Anthropic spend doubles week-over-week, the Steward flags the anomaly in your Monday digest. Where the work is, where the slack is, who is overloaded. Visible without you asking.
The Steward agent watches spend, value, and adoption daily
Capacity rebalance proposed before any AI takeover is considered
Self-reported value rated lower than measured value in the rankings
03
Transform
For every task that might become AI, the Architect drafts the analysis. Then Claude, GPT, and Gemini each vote on the same four questions: does the math work, is this human-sensitive, is the customer at risk, does compliance allow it. A deliberator pass synthesizes the panel and names the agreement and disagreement explicitly. If the panel didn't agree, the recommendation gate refuses to surface do_now. The home page never shows an unvetted "AI takeover ready" call. PHI and PCI tasks stay human-only even at perfect recurrence numbers.
Three impact dimensions: cost · service · risk
NONE is honest. No padded scores.
Re-runnable per candidate, idempotent on double-click
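For readers who want to see the gate as logic, here is a minimal sketch of the rule described above: three panelists answer the same four questions, and "do_now" only surfaces when they all agree and the answers are favorable. Every name here (Vote, recommend, the question keys, the "needs_review" and "human_only" labels) is hypothetical and chosen for illustration. Only the four questions, the do_now label, and the PHI/PCI rule come from the text.

```python
from dataclasses import dataclass

# Hypothetical keys for the four questions named in the text.
QUESTIONS = ("math_works", "human_sensitive", "customer_at_risk", "compliance_allows")


@dataclass(frozen=True)
class Vote:
    panelist: str   # e.g. "claude", "gpt", "gemini"
    answers: dict   # question key -> bool


def panel_agrees(votes):
    """True only if every panelist gave the same answer to every question."""
    return all(len({v.answers[q] for v in votes}) == 1 for q in QUESTIONS)


def recommend(votes, touches_phi_or_pci=False):
    """Return "human_only", "do_now", or "needs_review" (last label is an assumption)."""
    if touches_phi_or_pci:
        return "human_only"          # PHI and PCI tasks never become takeover candidates
    if not panel_agrees(votes):
        return "needs_review"        # the gate refuses to surface do_now on a split panel
    a = votes[0].answers             # safe: agreement means all votes are identical
    favorable = (a["math_works"] and a["compliance_allows"]
                 and not a["human_sensitive"] and not a["customer_at_risk"])
    return "do_now" if favorable else "needs_review"
```

The PHI/PCI check comes first on purpose, so a unanimous panel cannot override it.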
04
Drive
The differentiator. Once a candidate is approved, the platform provisions the agent itself. A placeholder lands in the org-map with an intentionally narrow tool scope: no Teams, no calendar, no destructive surface. It runs in shadow alongside the human. Agreement rate climbs, or it doesn't. Promotion to live requires 30 runs at 85% agreement. Backward moves (autonomous, assist, off) are always allowed.
Mock-provisioning lands a real Agent + AI Position; the org-map updates end-to-end
Restricted actions block until a named approver releases them, 24-hour window
Monthly per-customer AI spend cap gates every cost-incurring path
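The promotion rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's implementation: it reads "30 runs at 85% agreement" as at least 30 shadow runs with an overall agreement rate of 85% or higher, and it treats agreement as simple equality between the agent's output and the human's. ShadowLog and its methods are hypothetical names.

```python
from dataclasses import dataclass

MIN_SHADOW_RUNS = 30    # from the text
MIN_AGREEMENT = 0.85    # from the text


@dataclass
class ShadowLog:
    """Counts shadow runs and how often the agent matched the human."""
    runs: int = 0
    agreed: int = 0

    def record(self, agent_output, human_output):
        # Assumption: agreement is exact equality; a real comparison may be fuzzier.
        self.runs += 1
        if agent_output == human_output:
            self.agreed += 1

    @property
    def agreement_rate(self):
        return self.agreed / self.runs if self.runs else 0.0

    def can_promote(self):
        # Both thresholds must hold before the agent may leave shadow mode.
        return self.runs >= MIN_SHADOW_RUNS and self.agreement_rate >= MIN_AGREEMENT
```

Under this reading, 29 perfect runs still do not qualify, and 30 runs with 25 matches (about 83%) do not either.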
Why a panel, not a model
One model is one opinion. Your AI program needs a vote.
Single-model AI sounds confident about everything. That is the problem. WorkReef puts every consequential decision (should this take over, what is the customer risk, what is the rollout) to Claude, GPT, and Gemini at the same time. Each panelist's response is persisted with tokens, cost, latency, and any errors. The deliberator synthesizes the verdict and names the agreement and disagreement.
If one panelist errors out, confidence drops a level. If two panelists agree and one disagrees on a PHI dimension, the gate downgrades the call from "do now" to "pilot" silently. The operator sees the math behind the verdict, not a confident headline that papered over the dissent.
Inside Microsoft 365. Where your team already works.
WorkReef is a Microsoft 365-native platform. Agents are real first-class participants in your tenant, not bots living in someone else's app. They show up where your team already shows up.
Teams adaptive cards
The Steward posts your weekly digest to Teams on Monday morning as an adaptive card. With the agent's name, avatar, and a one-click "ask a follow-up" button. Not a ping from a generic bot account. A named participant your team recognizes.
Calendar invites from named agents
When the Concierge needs ten minutes to walk you through Pylon, it sends a calendar invite. From its own name, on the right person's calendar, with the right context already pre-loaded. Your day fills with agents the same way it fills with colleagues.
Entra-native identity + org chart
People come from Entra. Departments are inferred from groups. The reporting hierarchy is what Active Directory already says it is. The change leader does not type a single colleague's name into a form to bootstrap the platform. Entra and the discovery pass do it.
Every morning
An executive briefing. Not another dashboard you forget to open.
Your Lead agent is the platform's tenant-level chief of staff. It drafts a briefing at 7am. What changed overnight. Three decisions waiting on you. Two anomalies the Steward caught. One agent that ran into trouble and needs your call. Read it on your phone in two minutes.
The Lead is yours. It has a name. It has a voice the team learns to recognize. It is not a personality bolted on top of GPT. Its system brief reflects your industry, its memory is your tenant, and its tools are scoped to what you authorize.
The home page reframe
An inbox, not a dashboard.
After discovery, the platform already knows who everyone is, what they do, and what they touch. You don't navigate to that information. You act on it.
Persona assignments
147 stakeholders discovered. The platform proposes a persona for each: change leader, dept lead, IC augmented, IC overseeing, IC replaced. Ranked by confidence. Bulk approve. The platform stops being empty.
Transformation proposals
Per workstream, per role: quorum-vetted takeover candidates with cost, service, and risk dimensions named, a rollout plan, kill criteria. Cluster-grouped by department so 24 decisions collapse into one approval.
Department + workstream cleanups
Free-text department strings from Entra collapse into canonical Departments. Discovered apps match to Workstreams. Approve the suggestions; the platform commits them.
Integration follow-ups
The Concierge couldn't probe X. A credential expires in four days. A connector returned partial data. Here's the scope to upgrade. Boring infrastructure stops being your problem.
Five layers between AI and your data
Each one drops independently when your CISO asks.
"We can't have third-party inference." Layer 3 swaps the provider. "We hold the key." Layer 1's BYOK is the cryptographic shred. "Prove this didn't fire." Layer 4's audit chain is exportable and verifiable on their hardware. Every claim defensible against a real security review.
Each customer's data is in a separate database. Bring your own KMS key from Azure Key Vault, AWS KMS, or GCP KMS. Revoke the key and the data is unreadable to us.
2
Data minimization, enforced at storage
Observation summaries capped at 240 characters by a hard storage limit. Raw email and chat content cannot be persisted, even if an extractor tries. K-anonymity of 3 on team aggregations.
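A rough sketch of what these two limits could look like in code, assuming the cap rejects oversized summaries rather than truncating them, and assuming k-anonymity here means a team aggregate is suppressed unless at least 3 distinct people contribute to it. The function names and the row shape are hypothetical.

```python
from collections import Counter

MAX_SUMMARY_CHARS = 240   # from the text
K_ANONYMITY = 3           # from the text


def store_summary(summary: str) -> str:
    """Refuse anything over the cap (assumption: reject, do not truncate)."""
    if len(summary) > MAX_SUMMARY_CHARS:
        raise ValueError(f"summary exceeds {MAX_SUMMARY_CHARS} characters")
    return summary


def team_aggregates(rows):
    """rows: iterable of (team, person_id, hours).

    Returns total hours per team, dropping any team with fewer than
    K_ANONYMITY distinct contributors.
    """
    people = {}
    totals = Counter()
    for team, person, hours in rows:
        people.setdefault(team, set()).add(person)
        totals[team] += hours
    return {t: totals[t] for t in totals if len(people[t]) >= K_ANONYMITY}
```

With this shape, a two-person team never appears in an aggregate, so no number can be traced back to one of two people.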
3
Choose where inference happens
Azure OpenAI, Amazon Bedrock, on-prem, or our managed providers. Per customer. Model allowlist enforced on the server. Redaction applied before any prompt leaves your trust boundary.
4
Tamper-evident audit
Every entry cryptographically linked to the one before it. Tampering with any past row breaks every subsequent entry. Your compliance team verifies the chain offline on their own hardware.
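To make "cryptographically linked" concrete, here is a minimal hash-chain sketch using SHA-256 from Python's standard library. It is an illustration of the general technique, not the product's actual format: the entry layout, the all-zero genesis value, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64   # assumed starting value for the first entry's "prev" field


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, payload):
    """Add an entry whose hash depends on everything before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain):
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return i
        prev = entry["hash"]
    return None
```

Because verify needs only the exported entries and a SHA-256 implementation, a check like this can run offline on hardware the vendor never touches.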
5
Per-action approval gates
Restricted actions refuse to fire without a named approver. 24-hour window. Approval requests and decisions both audit-logged. You decide which actions need a gate.
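A small sketch of the gate, assuming the 24-hour window means an approval request expires if no named approver acts within 24 hours of the request. GatedAction, its methods, and the audit tuple layout are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional

APPROVAL_WINDOW = timedelta(hours=24)   # from the text


@dataclass
class GatedAction:
    name: str
    requested_at: datetime
    approver: Optional[str] = None
    audit: List[tuple] = field(default_factory=list)

    def __post_init__(self):
        # The request itself is logged, not just the decision.
        self.audit.append(("requested", self.name, None, self.requested_at))

    def approve(self, approver: str, now: datetime) -> bool:
        """Record the decision; fail if unnamed or outside the window."""
        if not approver or now - self.requested_at > APPROVAL_WINDOW:
            self.audit.append(("rejected", self.name, approver, now))
            return False
        self.approver = approver
        self.audit.append(("approved", self.name, approver, now))
        return True

    def fire(self) -> str:
        """Refuse to run until a named approver has released the action."""
        if self.approver is None:
            raise PermissionError(f"{self.name} has no named approver")
        return f"{self.name} released by {self.approver}"
```

Passing the current time in as an argument, rather than reading the clock inside approve, keeps the expiry rule easy to test.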
Designed for seven kinds of people
Not for the buyer with everyone else shrugged off.
The change leader is the buyer. They are also one of seven kinds of people who will sign in. Each persona below sees a different page, built for what they came to do.
AI Change Leader
Portfolio rollups, escalations, trust signals. The page the board sees in screenshots.
Department Lead
Their team's scope. Their approvals. Their "I want to weigh in" channel.
AI Supervisor
The new role the platform names. Live ops, disagreement queue, incident review.
IC being augmented
Hours saved. Drafts to approve. Boundaries the IC sets themselves.
IC overseeing AI
Live feed of the AI's calls with intervene buttons. Calibration tools.
IC being replaced
Honesty. Voice. Career path. Exit dignity. No automation-viability score about themselves, ever.