The Runbook | Fission's Blog

HubSpot is building AI on 280,000 portals. Most of them would fail an audit.

Written by Connor Skelly | May 11, 2026

HubSpot recently published its vision for what it calls "growth context": an intelligence layer that draws patterns from over 280,000 customer portals and exposes them through APIs. Deal risk scores, conversion benchmarks, industry-specific pattern recognition. A developer makes a single call and gets back a pre-computed risk assessment that accounts for deal velocity, stakeholder engagement, and comparisons to similar deals across the ecosystem.
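HubSpot hasn't published a contract for this API, so as a purely hypothetical sketch, here's what consuming such a payload might look like. Every field name, value, and threshold below is invented for illustration; none of it reflects an actual HubSpot response shape.

```python
# Hypothetical sketch only: the payload shape and field names below are
# invented -- HubSpot has not published this API's actual contract.
import json

sample_response = json.loads("""
{
  "dealId": "12345",
  "riskScore": 0.72,
  "signals": [
    {"type": "stage_velocity", "detail": "38 days in-stage vs. industry median of 21"},
    {"type": "stakeholder_engagement", "detail": "champion inactive for 14 days"},
    {"type": "peer_comparison", "detail": "similar deals stalled on the same objection"}
  ]
}
""")

def summarize_risk(assessment: dict) -> str:
    """Collapse a pre-computed risk payload into a one-line summary."""
    score = assessment["riskScore"]
    level = "high" if score >= 0.7 else "moderate" if score >= 0.4 else "low"
    return f"Deal {assessment['dealId']}: {level} risk ({len(assessment['signals'])} signals)"

print(summarize_risk(sample_response))  # Deal 12345: high risk (3 signals)
```

The appeal is obvious: one call, one number, a list of reasons. The rest of this post is about what's behind that number.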

The claim is direct: this intelligence is "something no standalone model, and no platform without a network of this scale, can replicate."

The engineering ambition is real. But there's a question underneath this announcement that matters more than the technology itself, and it's one that nobody building the layer seems positioned to answer.

What does the data underneath actually look like?

We audit these portals (and know what the data quality is)

We work inside HubSpot portals almost every day at Fission: the portals that real companies rely on to run their sales, marketing, and service operations.

We know what the average HubSpot portal looks like because we're the ones opening the hood. And what we find, consistently, is operational drift. Properties created by people who left the company years ago, with no descriptions and no naming conventions. Automation built by cloning instead of architecting. Pipeline stages that stopped reflecting how the team actually sells sometime around the second or third rep hire. Lifecycle definitions that exist on paper but aren't enforced by any system rule.
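This kind of drift is mechanically easy to detect once you look. As a toy sketch, assuming an invented property-export shape (the field names here are illustrative, not HubSpot's actual export schema), an audit pass for two of the findings above might look like:

```python
# Toy audit sketch: 'properties' mimics a portal property export.
# The field names are invented for illustration, not HubSpot's real schema.
properties = [
    {"name": "deal_score_v2", "description": "", "created_by": "j.doe (deactivated)"},
    {"name": "lead_source", "description": "Original acquisition channel", "created_by": "ops@acme.com"},
    {"name": "tmp_field_copy", "description": "", "created_by": "intern2019 (deactivated)"},
]

def audit_drift(props):
    """Flag properties with no description or a creator who has left."""
    findings = []
    for p in props:
        issues = []
        if not p["description"]:
            issues.append("missing description")
        if "(deactivated)" in p["created_by"]:
            issues.append("creator no longer at company")
        if issues:
            findings.append((p["name"], issues))
    return findings

for name, issues in audit_drift(properties):
    print(f"{name}: {', '.join(issues)}")
```

The hard part of a real audit isn't writing checks like this; it's deciding which of the flagged properties are safe to archive, which is exactly the institutional knowledge that left with the people who created them.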

We wrote about this pattern in detail in our post on portal drift. It's not negligence. It's the natural result of a system that grows faster than the operational discipline around it. Properties accumulate. Workflows multiply. Nobody cleans it up because nobody knows what's safe to touch.

This is the norm across the HubSpot ecosystem (other partners agree). And these are the portals feeding the intelligence layer.

The benchmark problem

When Duncan, writing for HubSpot, says context is the moat, he's describing aggregate context: patterns drawn from 280,000+ portals, statistical regularities, industry benchmarks, what's normal across the ecosystem. That kind of context is genuinely valuable. We'll use it ourselves where it makes sense.

But aggregate context is only half of what actually drives good decisions in a specific business. The other half is your specific operational context: the way your sales team actually sells, what your buyers respond to, how deals progress through your pipeline. What "healthy" looks like given your sales cycle, your ICP, your team's structure, and your operational reality today. Aggregate context describes the population average. Your specific context describes you.

This distinction matters because it changes what you have to invest in.

Consider the deal risk score example Duncan uses in the piece. The intelligence layer can return a risk score that knows whether 30 days in-stage is fast or slow for your industry, that knows a champion went quiet after a reorg, that knows a comparable deal stalled on the same objection last quarter. Useful aggregate signals. But the actual decision of what to do about that specific deal, in your specific pipeline, depends on context the aggregate doesn't have. Who's the buyer's actual decision-maker? What's the relationship history? Is this deal stalling on the same objection that killed your last three losses, or is it different this time? The aggregate intelligence can flag the risk. Your specific operational context determines whether the flag leads to the right action.
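The division of labor can be sketched in a few lines. This is a toy illustration, not anyone's real decision logic; the field names and thresholds are invented:

```python
# Toy sketch: the aggregate layer supplies a risk score, but the right
# action depends on deal-specific context the aggregate model can't see.
# All field names and thresholds are invented for illustration.

def next_action(aggregate_risk: float, deal_context: dict) -> str:
    if aggregate_risk < 0.5:
        return "monitor"
    if deal_context["same_objection_as_recent_losses"]:
        return "escalate: pattern matches your last three losses"
    if not deal_context["decision_maker_engaged"]:
        return "re-engage the economic buyer first"
    return "coach the rep on this deal's specific blocker"

# Same aggregate flag, two different right answers:
print(next_action(0.72, {"same_objection_as_recent_losses": True,
                         "decision_maker_engaged": True}))
print(next_action(0.72, {"same_objection_as_recent_losses": False,
                         "decision_maker_engaged": False}))
```

The first branch is all the aggregate layer can decide on its own. Everything after it requires context that lives in your portal, your team's heads, or nowhere at all.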

The same logic applies to organizational readiness. Aggregate context might tell HubSpot that "most companies of this size have this many active workflows." Your specific context is what your team can actually absorb right now, what your operational capability looks like across data architecture, process discipline, and team adoption. A solution that fits an advanced team is wrong for a beginning team, and the aggregate context can't tell you which one you are. Only specific context can.

The part that doesn't add up

HubSpot has spent the past year telling customers to get their data in order before using AI tools. That's been the Breeze message. Clean your data. Fix your properties. AI on bad data scales bad decisions.

They're right. We tell our clients the same thing constantly. We wrote a whole post about what an operations audit actually uncovers and why the foundation has to come first.

But the platform is now building its intelligence from the ecosystem's data as-is. The individual customer hears: your portal needs to be clean for AI to work. The platform itself is learning from every portal, clean or not. The benchmarks your well-maintained portal gets scored against are shaped by portals that would fail a basic audit.

The standard being applied to the customer and the standard being applied to the training data are not the same standard.

What this means if you actually maintain your portal

If you're running a reasonably governed HubSpot instance with enforced deal stages, documented properties, and reliable close dates, the intelligence layer creates a specific problem for you. The benchmarks may not reflect your reality because they're averaged across portals that operate nothing like yours.

Your deal might look "at risk" because it's moving slower than a benchmark built on unreliable data. Or it might look healthy because the benchmark is so noisy that real warning signs don't surface as anomalies. Either way, you're measuring against a yardstick you can't inspect and didn't build.
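To make the noise problem concrete, here's a toy sketch with invented numbers. The same stalled deal reads as a glaring anomaly against a disciplined cohort, but vanishes entirely against a mixed-quality one, because the noisy cohort's variance swallows it:

```python
# Toy illustration with invented days-in-stage numbers: the same deal is
# an obvious outlier against a clean cohort and invisible against a noisy one.
import statistics

def z_score(x, population):
    """Standard deviations from the population mean."""
    mu = statistics.mean(population)
    sigma = statistics.stdev(population)
    return (x - mu) / sigma

clean_portfolio = [18, 20, 21, 19, 22, 20, 21, 19]   # disciplined portals
noisy_ecosystem = [18, 95, 3, 60, 21, 150, 7, 40]    # mixed-quality portals

stalled_deal = 45  # genuinely stalled for a well-run team

print(round(z_score(stalled_deal, clean_portfolio), 1))   # many sigma out: clear anomaly
print(round(z_score(stalled_deal, noisy_ecosystem), 1))   # under one sigma: lost in the noise
```

Same deal, same number of days. Whether it surfaces as a warning depends entirely on the quality of the population it's compared against.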

The companies with the cleanest data get the least accurate benchmarks, because their operational discipline puts them furthest from the messy average the model learned from. That's a strange reward for doing the work.

All this being said, HubSpot is not a company of fools. My guess is that the first version of this will look like their existing marketing benchmarks for assets like marketing emails: opt-in comparisons of your CRM operations against similar companies. At least, that's what I hope they build.

How we apply this in practice

This is why our consulting methodology starts where it starts.

When we're brought in to scope work for a new client, we don't lead with a HubSpot recommendation. We frame the problem in their specific context first. What's the underlying business challenge? What's the desired outcome? What's the organizational reality around that challenge? What constraints sit around the work itself, including capacity, timeline, and the practitioner-client relationship? Only after the framing is clear does the question of which HubSpot implementation, which AI tool, or which intelligence layer become useful to ask.

That diagnostic stance is the difference between solving the problem someone described and solving the problem they actually have. It's why we walk into engagements asking why before we ask what.

The four lenses we apply to every recommendation are explicitly about specific context, not aggregate patterns. The core process framework asks what category of business challenge this actually is for you. The organizational assessment grades your specific team across team structure, experience level, and operational capability. The solution implementation decision tree branches on what exists in your environment versus what's missing. The decision-making framework governs whether and how to take on the work given your specific situation. Every lens points inward.

And our crawl-walk-run sequencing is built on the same principle. Crawl is establishing your specific operational foundation: data architecture that fits your business, deal stages that reflect how you actually sell, lifecycle definitions that match how your customers move. Walk is building the operational habits to act on signals: review cadences, coaching practices, processes that fit your team. Run is where aggregate intelligence (HubSpot's, third-party tools, or anything else) starts producing real value, because it now has specific operational context to land in.

The order matters. Aggregate intelligence layered on top of weak specific context produces generic recommendations that nobody trusts and nobody acts on. Strong specific context with aggregate intelligence layered on top produces decision-making that compounds over time.

The operational foundation still comes first

None of this means HubSpot shouldn't build the intelligence layer. Giving developers access to pre-computed risk scores and conversion patterns through an API is a useful idea and a real engineering achievement.

But the value of any AI built on CRM data depends entirely on what's underneath it. And right now, the ecosystem that intelligence is learning from is one where the average portal has more operational debt than operational discipline.

280,000 portals is a scale advantage if the data is governed. It's a liability if it isn't.

The companies that will actually benefit from this era of CRM intelligence are the ones that already have their foundation in order. Clean data architecture. Governed properties. Enforced pipeline stages. Documented processes. The operational layer that determines whether AI has something real to work with or just a larger volume of noise to pattern-match against.

That hasn't changed. If anything, it just became the most important investment a company on HubSpot can make, and one your company should make whether you use HubSpot or not. If you do use HubSpot, the structure, schema, and execution functionality come right out of the box.