AI Analysts Need Trusted Data Foundations to Deliver

AI analysts fail not from model limitations but from dirty pipelines — fix your data trust layer before deploying AI to business users.

[Image: A figure constructing a complex data pipeline bridge while AI systems wait on the other side]
Illustrated by Mikael Venne

Pointing AI at your data warehouse sounds simple. Here's why trusted data architecture is the real unlock for AI-powered customer analytics in Southeast Asia (SEA).

Two executives walk into a meeting. They asked the same question of your AI analyst the night before. They got two different numbers. The meeting derails. This is not a hypothetical.

That scenario — documented by Monte Carlo Data’s Lior Gavish — is what happens when organisations skip the unglamorous middle layer between raw data and AI-generated answers. Everyone wants to point Claude, or GPT, or whatever model is trending this quarter, directly at their warehouse and declare victory. The first week looks great. Then the cracks appear.

The uncomfortable truth is that the AI isn’t the problem. The data architecture underneath it is.

The ‘Point and Pray’ Approach Fails at Scale

Large language models are phenomenally good at reasoning over well-structured, well-labelled, semantically consistent data. They are equally good at confabulating plausible-sounding nonsense when the data is ambiguous — joining the wrong tables, misinterpreting field names, or surfacing metrics that don’t share a common definition.

Monte Carlo Data identifies the core failure mode clearly: without a governed semantic layer sitting between the warehouse and the AI, every query becomes a fresh interpretation exercise. Your AI analyst might define ‘active user’ as someone who logged in this month. Your product team defines it as someone who completed a core action. Both are defensible, but they are not the same metric, so the numbers will not match. At that point, you don’t have an analytics capability; you have a very expensive disagreement machine.

The fix isn’t more prompting. It’s a metrics layer — a single, versioned, business-logic-encoded definition of every KPI — that the AI queries against, not around. Teams using dbt Semantic Layer, Looker’s LookML, or Cube.js are already building this. The AI sits on top. The truth lives in the layer.
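To make the idea concrete, here is a minimal sketch of what "querying against the layer" could look like. It is not how dbt Semantic Layer, LookML, or Cube actually work internally; the registry, the `MetricDefinition` fields, and the `compute_metric` helper are all hypothetical names chosen for illustration. The point is only that the definition of ‘active user’ lives in one versioned place, and anything not defined there is refused rather than improvised.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MetricDefinition:
    """One governed, versioned definition of a KPI (hypothetical structure)."""
    name: str
    version: int
    description: str
    event_type: str  # the event that qualifies a user for this metric


# Hypothetical registry: exactly one definition per metric name.
METRICS = {
    "active_user": MetricDefinition(
        name="active_user",
        version=2,
        description="Distinct users who completed a core action in the period",
        event_type="core_action",
    ),
}


def compute_metric(name, events, start, end):
    """Count distinct users using the governed definition; refuse unknown metrics."""
    if name not in METRICS:
        raise KeyError(f"No governed definition for metric '{name}'")
    definition = METRICS[name]
    users = {
        e["user_id"]
        for e in events
        if e["event_type"] == definition.event_type and start <= e["date"] <= end
    }
    # Return the version alongside the value so every answer is traceable.
    return {"metric": name, "version": definition.version, "value": len(users)}


# Illustrative data only.
events = [
    {"user_id": "u1", "event_type": "login", "date": date(2024, 5, 2)},
    {"user_id": "u1", "event_type": "core_action", "date": date(2024, 5, 3)},
    {"user_id": "u2", "event_type": "login", "date": date(2024, 5, 4)},
    {"user_id": "u3", "event_type": "core_action", "date": date(2024, 6, 1)},
]
result = compute_metric("active_user", events, date(2024, 5, 1), date(2024, 5, 31))
print(result)
```

In this sketch, a user who only logged in is not counted, because the governed definition says ‘core action’. Whoever asks, human or AI, gets the same number and the definition version it came from.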

Real-Time Context Is the New CDP Mandate

Tealium’s CMO Heidi Bullock summarised the mood at Digital Velocity New York with refreshing directness: the next decade belongs to teams that can turn trusted, real-time customer data into machine-consumable context — for both human analysts and AI agents.

That framing matters because it shifts the CDP conversation away from ‘unified profile as a reporting asset’ toward ‘unified profile as an operational input.’ The customer data platform is no longer just a place data goes to rest. It’s the connective tissue between behavioural signals, transactional history, and declared preferences — and it needs to emit that context in real time to wherever a decision is being made: a personalisation engine, a next-best-action model, an AI agent fielding a customer service query.

For Southeast Asian brands, this is especially pointed. A customer who browsed on Shopee, purchased through a LINE Official Account, and redeemed loyalty points in-store has left signals across three separate data environments. Without a CDP that stitches those touchpoints into a coherent, timestamped identity, your AI agent is working from a partial script. It will personalise confidently and incorrectly, which is arguably worse than not personalising at all.
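As a rough illustration of what "stitching" means, the sketch below maps source-specific identifiers onto one customer ID and orders the events by timestamp. Everything here is assumed for the example: the source names, the identifiers, the `IDENTITY_MAP` lookup, and the `stitch_timeline` helper. A real CDP resolves identities with far more machinery than a dictionary.

```python
from datetime import datetime

# Hypothetical identity map: (source, source-specific id) -> unified customer id.
IDENTITY_MAP = {
    ("shopee", "sp_881"): "cust_001",
    ("line_oa", "U4af2"): "cust_001",
    ("pos", "loyalty_7731"): "cust_001",
}


def stitch_timeline(events, identity_map):
    """Group events by unified customer id, ordered by timestamp.

    Events whose identifier is not in the map are returned separately
    instead of being silently attached to a profile.
    """
    timelines, unmatched = {}, []
    for event in events:
        customer_id = identity_map.get((event["source"], event["source_id"]))
        if customer_id is None:
            unmatched.append(event)
            continue
        timelines.setdefault(customer_id, []).append(event)
    for customer_events in timelines.values():
        customer_events.sort(key=lambda e: e["ts"])
    return timelines, unmatched


# Illustrative events, deliberately out of order.
events = [
    {"source": "pos", "source_id": "loyalty_7731", "action": "redeem_points",
     "ts": datetime(2024, 5, 3, 18, 30)},
    {"source": "shopee", "source_id": "sp_881", "action": "browse",
     "ts": datetime(2024, 5, 1, 9, 15)},
    {"source": "line_oa", "source_id": "U4af2", "action": "purchase",
     "ts": datetime(2024, 5, 2, 20, 5)},
    {"source": "shopee", "source_id": "sp_999", "action": "browse",
     "ts": datetime(2024, 5, 2, 11, 0)},
]
timelines, unmatched = stitch_timeline(events, IDENTITY_MAP)
for e in timelines["cust_001"]:
    print(e["ts"].isoformat(), e["source"], e["action"])
```

The unmatched list matters as much as the timeline: an agent that knows a signal could not be resolved is in a better position than one that was handed a profile with a silent gap.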


Governance Isn’t a Tax — It’s What Makes AI Trustworthy

The instinct in most organisations is to treat data governance as a compliance cost: something the legal and IT teams care about, something that slows down the data team. That framing is now actively dangerous.

When AI systems are querying your data and surfacing answers to business stakeholders who lack the context to interrogate those answers, governance becomes a product feature. It’s what allows you to say, with confidence, that the number your CMO sees on Monday morning is the same number your Head of eCommerce sees — and that both match the figure in the board pack.

Monte Carlo’s framework for making AI a trusted analyst maps directly onto CDP implementation best practice: document your data contracts, monitor for schema drift, flag anomalies before they reach the model, and version your metric definitions as rigorously as you version your code. None of this is exotic. All of it is skipped in the rush to demo an AI analytics capability to leadership.
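Of those four practices, schema drift monitoring is the easiest to picture in code. The following is a minimal sketch, assuming a data contract can be expressed as a simple column-to-type mapping; the column names and the `detect_schema_drift` helper are hypothetical, and this is not Monte Carlo's implementation or any vendor's API.

```python
# Hypothetical data contract for an orders table: column name -> expected type.
EXPECTED_SCHEMA = {
    "order_id": "string",
    "customer_id": "string",
    "amount": "float",
    "created_at": "timestamp",
}


def detect_schema_drift(expected, observed):
    """Compare an observed schema against the contract and report differences."""
    expected_cols, observed_cols = set(expected), set(observed)
    return {
        "missing": sorted(expected_cols - observed_cols),
        "added": sorted(observed_cols - expected_cols),
        "type_changed": sorted(
            col for col in expected_cols & observed_cols
            if expected[col] != observed[col]
        ),
    }


# Illustrative observed schema: one column dropped, one added, one retyped.
observed = {
    "order_id": "string",
    "customer_id": "string",
    "amount": "string",
    "channel": "string",
}
drift = detect_schema_drift(EXPECTED_SCHEMA, observed)
print(drift)
```

A check like this would typically run before the data is exposed to the AI, so that a retyped `amount` column raises a flag with the data team rather than surfacing as a strange revenue figure in a stakeholder's answer.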

The brands getting durable value from AI-powered data activation in this region — think regional retailers running demand forecasting across fragmented logistics networks, or telcos personalising prepaid offers at scale — have one thing in common: they invested in the plumbing before they invested in the interface.

Activation Is Only as Good as the Signal Feeding It

There’s a reason the best CDP implementations treat data quality as a continuous process rather than a launch-phase checklist. Customer behaviour in Southeast Asia is high-frequency and cross-platform by default. A user in Jakarta might interact with a brand across a mobile app, a WhatsApp Business account, an in-store QR redemption, and a Tokopedia product listing — all within 72 hours. Each touchpoint carries signal. Most brands are capturing fragments.

The Tealium thesis — collection to action — is the right mental model, but collection without curation is just noise at higher volume. The activation layer, whether that’s a journey orchestration tool, an AI recommendation engine, or an LLM-powered agent, will inherit whatever assumptions and errors live in your ingestion pipeline. Garbage in, confident AI out.

Practically, this means CDP implementations should prioritise three things before AI activation: identity resolution with explicit conflict-resolution rules (not just probabilistic matching), event schema governance with mandatory field validation at ingestion, and a data observability layer that alerts the data team — not the AI — when something breaks.
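The three priorities above can be sketched in a few lines each. This is an illustration under stated assumptions, not a reference implementation: the required fields, the source priority order, and the `validate_event`, `resolve_conflict`, and `ingest` helpers are all hypothetical, and the "alert" is just a callback standing in for whatever observability tool the data team uses.

```python
from datetime import datetime

# Assumed mandatory fields for every event at ingestion.
REQUIRED_FIELDS = {"event_name", "customer_id", "source", "ts"}

# Assumed explicit conflict rule: lower number wins; ties go to the newest value.
SOURCE_PRIORITY = {"crm": 0, "app": 1, "web": 2}


def validate_event(event):
    """Return a list of problems; an empty list means the event passes."""
    missing = sorted(f for f in REQUIRED_FIELDS if event.get(f) in (None, ""))
    return [f"missing field: {f}" for f in missing]


def resolve_conflict(candidates):
    """Pick one value from conflicting records using the explicit rule above."""
    best_rank = min(SOURCE_PRIORITY.get(c["source"], 99) for c in candidates)
    top = [c for c in candidates if SOURCE_PRIORITY.get(c["source"], 99) == best_rank]
    return max(top, key=lambda c: c["updated_at"])["value"]


def ingest(events, alert):
    """Accept valid events; route invalid ones to the data team via `alert`."""
    accepted = []
    for event in events:
        problems = validate_event(event)
        if problems:
            alert(event, problems)  # humans are told; the model never sees it
        else:
            accepted.append(event)
    return accepted


# Illustrative run.
alerts = []
accepted = ingest(
    [
        {"event_name": "purchase", "customer_id": "cust_001", "source": "app",
         "ts": datetime(2024, 5, 2, 20, 5)},
        {"event_name": "purchase", "customer_id": "", "source": "web",
         "ts": datetime(2024, 5, 2, 21, 0)},
    ],
    alert=lambda event, problems: alerts.append(problems),
)
email = resolve_conflict([
    {"source": "web", "value": "old@example.com", "updated_at": datetime(2024, 5, 3)},
    {"source": "crm", "value": "new@example.com", "updated_at": datetime(2024, 1, 10)},
])
print(len(accepted), alerts, email)
```

Note that the conflict rule here deliberately prefers the CRM value even though the web value is newer. Whether that is the right rule is a business decision; the point is that it is written down and applied the same way every time, not left to a matching score.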

Those aren’t exciting deliverables to present to a CMO. But they’re the difference between an AI analyst that builds trust and one that quietly erodes it.

Key Takeaways

  • Deploy a governed semantic layer between your warehouse and any AI analytics tool — consistent metric definitions are non-negotiable before you give AI access to business stakeholders.
  • Treat your CDP as an operational real-time context engine, not a reporting asset — AI agents need timestamped, identity-resolved signals to personalise accurately across Southeast Asia’s fragmented platform ecosystem.
  • Build data observability into your ingestion pipeline from day one — anomaly detection should alert humans before errors reach the model, not after a stakeholder meeting goes sideways.

The organisations that will define AI-powered marketing in Southeast Asia over the next three years aren’t the ones with the most sophisticated models. They’re the ones that did the unglamorous work of making their data trustworthy first. The question worth sitting with: does your current data architecture deserve the AI you’re about to put on top of it?


At grzzly, we spend a lot of time in exactly this space: auditing data foundations, designing CDP architectures, and helping growth teams in Southeast Asia build the trust layer that makes AI activation actually work. If your team is navigating the gap between ambitious AI use cases and the messy reality of your current data stack, we’d like to be useful.

Written by

Velvet Grizzly

Architecting the unified customer profile — stitching together behavioural, transactional, and declared data into platforms that actually earn their licence fee.
