Every SaaS Company Is Launching an AI Assistant. Here's What the Usage Data Is Revealing.

Junil Kim

CEO

Feb 12, 2026


Shopping cart casts a long shadow against wall.


The first wave of enterprise SaaS companies has shipped in-app AI assistants. Xero has JAX. Optimizely has Opal. Atlassian has Rovo. HubSpot has Breeze. Descript has Underlord. The list keeps growing.

If you've been paying attention, you've noticed the pattern: a chat surface (usually a panel that slides out from the right), backed by tool calling for key workflows, with the rest handled by text-based answers grounded in product docs. Ship it, announce it, move on to iteration.

But now something interesting is happening. The companies that launched early and actually track how their assistants get used are starting to see real usage data. And that data is surfacing a problem nobody fully anticipated.

What gets automated first (and what doesn't)

When a product team builds an AI assistant, the first instinct is to automate the highest-frequency workflows through backend tool calling. This makes sense: if 40% of users ask "create an invoice" or "set up an A/B test," you wire up the tools so the assistant can execute those actions end-to-end.
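To make that concrete, here's a minimal TypeScript sketch of what "wiring up a tool" tends to look like: a backend action exposed to the model with a name, a description, and a JSON Schema for its arguments. The endpoint, shapes, and names here are illustrative, not any particular vendor's API.

```typescript
// Illustrative only: a hypothetical create_invoice tool, not any vendor's real API.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema describing the arguments
  execute: (args: Record<string, unknown>) => Promise<string>;
}

// Backend action the assistant can run end-to-end, with no UI involved.
async function createInvoice(args: { contactId: string; amount: number; dueDate: string }): Promise<string> {
  const res = await fetch("/api/invoices", { // hypothetical endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(args),
  });
  const invoice = await res.json();
  return `Created draft invoice ${invoice.id} for ${args.amount}, due ${args.dueDate}.`;
}

export const createInvoiceTool: ToolDefinition = {
  name: "create_invoice",
  description: "Create a draft invoice for an existing contact.",
  parameters: {
    type: "object",
    properties: {
      contactId: { type: "string" },
      amount: { type: "number" },
      dueDate: { type: "string", format: "date" },
    },
    required: ["contactId", "amount", "dueDate"],
  },
  execute: (args) => createInvoice(args as { contactId: string; amount: number; dueDate: string }),
};
```

The model sees the name, description, and schema; when a user says "invoice Acme for $1,200, due Friday," it fills in the arguments and the execute call does the rest.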

Xero's JAX, for example, is strong at backend automation for invoices, contacts, and financial insights. Optimizely's Opal can route to tools across the platform. Atlassian's Rovo has a massive tool inventory spanning Confluence actions, Jira operations, Google and Microsoft integrations, and multi-tool orchestration. Descript's Underlord handles scripts, scenes, speakers, media edits, and video exports.

These are impressive systems. But they all share the same structural limitation.

For every workflow the assistant can fully automate, there are dozens it can't. The assistant knows what the user wants to do. It can explain how to do it. But it can't actually do it for them, so it falls back to step-by-step text instructions.

The long tail is bigger than anyone expected

This is where the usage data gets interesting. When you look at what users actually ask an AI assistant, the distribution is not what product teams planned for. There's a long tail of requests, and they fall into two buckets:

Genuinely long-tail workflows that were never prioritized for tool-calling automation. Think account settings, bank feed connections, third-party integrations, template customization, permission changes, or manual uploads. These are real tasks that real users need to complete, but they weren't in the top 10 workflows the team built tools for.

Core workflows with edge-case parameters that make them too specific to one customer or scenario for a generic tool to handle. The workflow itself might be common, but the way this particular user needs to do it requires navigating UI that the backend tool doesn't fully cover.

Take Xero's JAX as an example. JAX handles invoices and contacts beautifully through tool calling. But when a user asks about setting up bank feeds, customizing invoice templates, managing third-party integrations, or changing account permissions, JAX responds with numbered text instructions. "Step 1: Go to Settings. Step 2: Click on Bank Accounts. Step 3: Select Add Bank Connection..."

Descript's Underlord has a similar pattern. It has an export-video tool that handles standard video exports. But ask it to export a timeline as XML, and it responds with numbered instructions walking you through the UI. The tool exists for the common case, but the long-tail variant falls back to text.

Atlassian's Rovo might be the most interesting case because of its sheer tool breadth. Rovo can create and edit Confluence pages, manage Jira issues, handle bulk operations, and orchestrate across multiple tools. But ask it to embed a team calendar into a Confluence page, and it inserts some legacy markup and then tells you: "Here are the remaining steps to finish this in the UI." The assistant did what it could programmatically and then handed the rest to you as instructions.

This pattern repeats everywhere. The assistant automates the head of the distribution and narrates the tail.

Why users drop off at text instructions

Here's the part that the usage data makes painfully clear: users don't follow text instructions.

When an AI assistant responds with "Step 1: Navigate to Settings > Integrations > Click Add New..." the user is essentially back to reading product documentation. The mental effort required to parse multi-step text, map it to the actual UI they're looking at, find the right buttons and menus, and execute each step in sequence is not meaningfully different from just searching the help center themselves.

This is the first time many product teams are dealing with non-deterministic product behavior at scale. Their traditional product is deterministic: you click a button, it does the thing. The AI assistant introduces a spectrum of user experiences they've never had to manage before. Sometimes the assistant fully solves the problem. Sometimes it partially solves it. Sometimes it gives you a wall of text and wishes you luck.

The drop-off happens right at that transition. The user asked a question expecting the assistant to help them do something. Instead, they got instructions. Now they have to do the cognitive work themselves, and many of them simply don't.

What shows up in the data: repeated follow-up questions after the assistant answers. Users dropping off mid-workflow after receiving instructions. Support tickets and escalations that originate from AI assistant sessions. "How do I..." queries where the assistant's answer didn't actually get the user to completion.

Why backend automation can't solve this

The instinct is to say: "We just need more tools. If we automate more workflows, the assistant won't need to fall back to text."

This is partially true but fundamentally limited, for two reasons.

First, the long tail is genuinely long. For a complex product like Xero, Atlassian, or HubSpot, the number of possible UI workflows runs into the hundreds or thousands when you account for configuration paths, edge cases, and customer-specific setups. Building backend tool-calling coverage for all of them is a multi-year engineering investment that never fully catches up.

Second, and more importantly, companies can't abandon the UIs they've built. The product interface is the product. Users still need to navigate it, configure it, customize it. Even if you could automate every backend action, there are workflows that are inherently UI-bound: setting up visual layouts, configuring drag-and-drop elements, navigating nested settings pages, or working with interfaces that require seeing and interacting with the screen.

The assistant needs a way to help users in the UI, not just behind it.

The missing layer: on-screen guidance as a callable tool

This is where we landed after talking to a number of these enterprise AI teams. The pattern kept repeating: the assistant is smart, it knows what the user needs, it can explain the steps, but it can't show them.

What these assistants need is an on-screen guidance tool they can call when they decide it's better to show than tell. The assistant stays the main experience. It still handles questions, still automates what it can through tool calling. But for the workflows where it would otherwise fall back to text instructions, it calls an on-screen agent instead.

The on-screen agent takes the user's question and context from the assistant, reads the current state of the UI, and guides the user through the workflow step by step. It highlights the right elements, waits for the user to act, and adapts to what they actually did. If the user asks a follow-up question mid-flow, it handles that too.
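In tool-calling terms, the guidance agent is just another tool the assistant can choose. Here's a rough TypeScript sketch of that loop, under the assumption of a hypothetical guidance endpoint; every name in it (guideOnScreen, captureScreenState, highlight, waitForUserAction) is made up for illustration.

```typescript
// Hypothetical shapes: adapt to whatever your assistant framework and guidance agent expose.
interface GuidanceRequest {
  question: string;    // the user's original ask, verbatim
  context: string;     // what the assistant already knows from the conversation
  screenState: string; // serialized snapshot of the live UI (e.g. a trimmed DOM)
}

interface GuidanceStep {
  selector: string;    // element to highlight next
  instruction: string; // one short sentence shown beside it
  done: boolean;       // true once the workflow is complete
}

// Placeholder helpers a real integration would provide.
declare function captureScreenState(): string;
declare function highlight(selector: string, instruction: string): void;
declare function waitForUserAction(selector: string): Promise<void>;

// Ask the guidance agent for the next step, given the question and the live screen.
async function guideOnScreen(req: GuidanceRequest): Promise<GuidanceStep> {
  const res = await fetch("/guidance/next-step", { // hypothetical endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  return res.json();
}

// What the assistant calls instead of emitting "Step 1: Go to Settings...".
export async function runGuidedWorkflow(question: string, context: string): Promise<void> {
  while (true) {
    // Re-read the UI before every step so the guidance adapts to what the user actually did.
    const step = await guideOnScreen({ question, context, screenState: captureScreenState() });
    if (step.done) return;
    highlight(step.selector, step.instruction);
    await waitForUserAction(step.selector);
  }
}
```

The key design choice is the loop: each step is generated from the current screen, not from a pre-recorded flow, which is what lets the guidance survive UI changes and mid-flow detours.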

This is fundamentally different from static tour tools like Pendo or WalkMe, which require pre-mapping every flow and break when the UI changes. The on-screen agent is LLM-native: it generates guidance on the fly from the user's question and the live screen state.

We built this as a headless tool, Moss, that any product AI assistant can call. We integrated it behind Optimizely's Opal as the first proof point: when a user asks Opal "show me how to..." or when Opal determines the workflow requires UI navigation, it calls Moss. Moss pops up and guides the user through the UI. Opal stays the orchestrator. Moss is the hands.

It turns the assistant into a full Swiss Army knife: automation when it can, on-screen guidance when it should, text instructions never.

What to watch for in your own assistant data

If you've launched a product AI assistant, or are about to, here are the signals that indicate this last-mile UI gap is real for your users:

Repeated follow-up questions after answers. If users keep asking variations of the same question after the assistant responds, they likely understood the answer but couldn't execute it in the UI.

Drop-off after text instructions. If session data shows users receiving a step-by-step answer and then not completing the workflow, the instructions aren't landing.

Support escalations originating from AI sessions. If users go from assistant to support ticket, the assistant got them partway but not all the way.

"How do I..." queries with low backend tool coverage. If the majority of user questions can't be fully resolved through tool calling, you're sitting on a large volume of text-instruction responses.

These are the patterns we keep seeing across every company we talk to. The assistant is working. Users are asking good questions. The gap isn't intelligence or knowledge. It's the last mile between knowing what to do and actually doing it in the UI.
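If you want a rough read on how big this gap is in your own product, a crude first pass over assistant session logs will surface it: look for sessions where the assistant answered with numbered instructions and the workflow never reached completion. A sketch, assuming a hypothetical SessionEvent shape from your own analytics:

```typescript
// Hypothetical event shape; map this onto whatever your analytics pipeline actually emits.
interface SessionEvent {
  sessionId: string;
  role: "user" | "assistant";
  text: string;
  workflowCompleted?: boolean; // set by product analytics when the task finishes
  timestamp: number;
}

// Crude marker for "the assistant answered with numbered step-by-step instructions."
const STEP_PATTERN = /\bstep\s*\d+\b|^\s*\d+\.\s/im;

export function flagInstructionDropoffs(events: SessionEvent[]): string[] {
  // Group events by session.
  const bySession = new Map<string, SessionEvent[]>();
  for (const e of events) {
    const list = bySession.get(e.sessionId) ?? [];
    list.push(e);
    bySession.set(e.sessionId, list);
  }

  // Flag sessions where instructions were given but the workflow never completed.
  const flagged: string[] = [];
  for (const [sessionId, evts] of bySession) {
    const gaveInstructions = evts.some(e => e.role === "assistant" && STEP_PATTERN.test(e.text));
    const completed = evts.some(e => e.workflowCompleted === true);
    if (gaveInstructions && !completed) flagged.push(sessionId);
  }
  return flagged;
}
```

The regex is deliberately dumb; the point is the denominator. If a meaningful share of sessions lands in that bucket, the long tail is already costing you completions.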

If you're seeing the same pattern, we'd love to hear about it. Get in touch.