Your AI Data Layer Is Only as Smart as What Feeds It

Before activating AI on your customer data, audit whether your data layer can carry context — not just attributes.

An editorial illustration of a figure assembling a complex data pipeline from mismatched pipes and connectors, with an AI brain floating above waiting to receive input
Illustrated by Mikael Venne

The AI data layer isn't a feature upgrade — it's a new infrastructure contract. Here's what CDP teams in Southeast Asia need to get right first.

Most CDPs in production today were built to answer a simple question: who is this customer? The question the next three years will demand is harder — what does this customer mean right now, in this context, across these signals? That’s not a model problem. It’s a data layer problem.

Context Is the New Currency of the AI Data Layer

Tealium’s Nick Albertini makes a pointed argument: every era of digital infrastructure has been defined by the data layer underneath it. Tag management shaped analytics. Event streams shaped personalisation. The AI era, he contends, will be shaped by a layer that doesn’t just collect data — it carries context. The distinction matters enormously for how CDP teams structure their pipelines.

A traditional customer profile stores attributes: purchase history, device type, segment membership. An AI-ready data layer needs to carry something richer — the sequence of intent signals, the recency and decay of behavioural patterns, the relationship between declared preferences and observed contradictions. Without that contextual scaffolding, feeding a customer profile into an LLM or a recommendation engine produces confident-sounding output from poorly understood input. In Southeast Asian markets, where a single user might transact on Shopee, browse on TikTok Shop, and redeem loyalty points in-store within 24 hours, the cross-channel context problem is structurally more complex than most Western CDP playbooks anticipate.
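
One way to make "recency and decay" concrete is to weight each behavioural signal by its age. The sketch below assumes simple exponential decay with a half-life; the `decayed_weight` helper, the signal names, and the 24-hour half-life are all illustrative assumptions, not something the source prescribes.

```python
def decayed_weight(age_hours, half_life_hours):
    """Weight in (0, 1] that halves every `half_life_hours`."""
    if age_hours < 0 or half_life_hours <= 0:
        raise ValueError("age must be non-negative and half-life positive")
    return 0.5 ** (age_hours / half_life_hours)


# Hypothetical signals: (name, hours since the signal was observed)
signals = [("viewed_product", 2), ("added_to_cart", 24), ("searched_brand", 72)]
half_life = 24  # illustrative value, not a recommendation
weights = {name: decayed_weight(age, half_life) for name, age in signals}
```

A profile that stores only "added to cart: yes" cannot support this calculation; one that stores when the signal was observed can.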

The implementation implication: before you wire AI to your CDP, audit whether your event schema actually preserves sequence and session context — or just snapshots state.
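
As a rough illustration of that audit, the sketch below checks whether a batch of events carries enough information to reconstruct order within a session. The field names (`session_id`, `sequence`, `occurred_at`) and the `audit_events` helper are hypothetical and not taken from any particular CDP's schema.

```python
from collections import defaultdict

# Hypothetical field names; substitute whatever your event schema uses.
REQUIRED_CONTEXT_FIELDS = ("session_id", "sequence", "occurred_at")


def audit_events(events):
    """Report events that cannot be placed in sequence within a session."""
    missing = defaultdict(int)   # field name -> number of events lacking it
    seen = defaultdict(set)      # session_id -> sequence numbers observed
    duplicates = 0

    for event in events:
        absent = [f for f in REQUIRED_CONTEXT_FIELDS if event.get(f) is None]
        for field in absent:
            missing[field] += 1
        if absent:
            continue
        if event["sequence"] in seen[event["session_id"]]:
            duplicates += 1      # two events claim the same position
        seen[event["session_id"]].add(event["sequence"])

    # A session has a gap when its sequence numbers are not contiguous.
    sessions_with_gaps = sorted(
        sid for sid, seqs in seen.items()
        if max(seqs) - min(seqs) + 1 != len(seqs)
    )
    return {
        "missing_fields": dict(missing),
        "duplicate_positions": duplicates,
        "sessions_with_gaps": sessions_with_gaps,
    }


events = [
    {"name": "view_product", "session_id": "s1", "sequence": 1, "occurred_at": "2024-01-01T10:00:00Z"},
    {"name": "add_to_cart", "session_id": "s1", "sequence": 3, "occurred_at": "2024-01-01T10:02:00Z"},
    {"name": "purchase", "session_id": None, "sequence": None, "occurred_at": "2024-01-01T10:05:00Z"},
]
report = audit_events(events)
```

In this sample, the purchase event has no session or position, and session `s1` is missing its second event. Both are the kind of loss that turns a sequence into a snapshot.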

LLMs Are Components, Not Oracles — Structure Your Pipeline Accordingly

One of the more useful reframes circulating in data engineering circles right now comes from Clara Chong at Towards Data Science. Working through a batch of 100 unstructured PDFs, she found that treating an LLM as a monolithic problem-solver produced inconsistent, hard-to-validate output. The fix wasn’t a better prompt — it was building a deterministic loop around the agent: strict input schemas, constrained output formats, explicit validation steps between each stage.

This maps directly to CDP activation challenges. Marketing teams who drop a customer segment into an AI tool and expect it to generate personalised messaging at scale are skipping the engineering discipline that makes that output trustworthy. The LLM is a transformation layer, not a data layer. It needs upstream structure — clean identity resolution, normalised event taxonomy, explicit context fields — before it can reliably produce downstream value.
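
A minimal sketch of that discipline might look like the following. It is not Chong's implementation: the `MessageRequest` schema, the allowed tones, and the `call_model` parameter are assumptions, and the model is replaced by a stand-in function so the example runs without any external service.

```python
import json
from dataclasses import dataclass

ALLOWED_TONES = {"friendly", "formal"}  # hypothetical output constraint


@dataclass(frozen=True)
class MessageRequest:
    """Strict input schema: every field the model sees is explicit."""
    customer_id: str
    language: str
    last_event: str


def validate_output(raw):
    """Accept only JSON with exactly the expected keys and allowed values."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != {"tone", "message"}:
        return None
    if not isinstance(data["tone"], str) or data["tone"] not in ALLOWED_TONES:
        return None
    if not isinstance(data["message"], str) or not data["message"].strip():
        return None
    return data


def generate_message(request, call_model, max_attempts=3):
    """Run the model in a bounded retry loop; return None if nothing validates."""
    prompt = json.dumps({
        "customer_id": request.customer_id,
        "language": request.language,
        "last_event": request.last_event,
        "respond_with": {"tone": sorted(ALLOWED_TONES), "message": "string"},
    })
    for _ in range(max_attempts):
        result = validate_output(call_model(prompt))
        if result is not None:
            return result
    return None  # caller falls back to a non-AI default


# Stand-in for a real model client: fails once, then returns valid JSON.
responses = iter(["not json", '{"tone": "friendly", "message": "Your cart is waiting."}'])


def fake_model(prompt):
    return next(responses)


request = MessageRequest(customer_id="c-123", language="en", last_event="cart_abandoned")
result = generate_message(request, fake_model)
```

The design choice worth noting is that the loop fails closed: output that does not validate never reaches a customer, and the number of attempts is bounded.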

For teams running multi-language environments across Thai, Bahasa Indonesia, Vietnamese, and English (standard in any regional CDP deployment), this matters doubly. LLM output quality tends to degrade faster in lower-resource languages when the input context is ambiguous. Garbage in, hallucinated personalisation out.


AI Agents Are Now Data Consumers — Design Your Layer for Them

Monte Carlo’s Lior Gavish surfaces something CDP architects haven’t fully reckoned with yet: the definition of a “user” has changed. For three decades, data infrastructure was designed around a human sitting at a screen. AI agents — autonomous systems that query, reason, and act on data without human intervention — are now operating in production environments. Gavish calls this the Agent Experience (AX) paradigm, and it has direct consequences for how data layers need to be designed.

A unified customer profile built only for human analysts or campaign managers is missing a consumer. AI agents need data that is machine-readable in structure, not just human-interpretable in content. That means explicit data contracts, consistent schema versioning, and metadata that describes not just what a field contains but what it represents — its lineage, confidence level, and freshness. For Southeast Asian brands running AI-assisted customer service on LINE or automated campaign triggers through Grab’s ecosystem, the agent reading your customer data needs to understand that a 72-hour-old cart abandonment signal in Bangkok behaves differently than one in Jakarta.
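
To make field-level metadata concrete, here is a small sketch of a profile field that carries its own lineage, confidence, freshness, and schema version. The `FieldValue` class and the per-market freshness windows are hypothetical, and the window values are made up purely to show how an agent could treat the same 72-hour-old signal differently by market.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class FieldValue:
    """A profile field plus the metadata an agent needs to interpret it."""
    value: object
    source: str             # lineage: which system produced the value
    confidence: float       # 0.0 to 1.0, as scored by the producing system
    observed_at: datetime   # freshness: when the value was last observed
    schema_version: str     # contract version the field conforms to

    def is_fresh(self, max_age, now=None):
        """True if the value was observed within `max_age` of `now`."""
        now = now or datetime.now(timezone.utc)
        return now - self.observed_at <= max_age


# Illustrative freshness windows per market; these numbers are made up.
MAX_SIGNAL_AGE = {"TH": timedelta(hours=48), "ID": timedelta(hours=96)}

now = datetime(2024, 1, 4, 12, 0, tzinfo=timezone.utc)
cart_abandoned = FieldValue(
    value=True,
    source="web_sdk",
    confidence=0.9,
    observed_at=now - timedelta(hours=72),
    schema_version="1.2.0",
)
usable_in = {
    market: cart_abandoned.is_fresh(max_age, now=now)
    for market, max_age in MAX_SIGNAL_AGE.items()
}
```

With these assumed windows, the signal is stale for the Thai market and still usable for the Indonesian one. The agent can only make that call because the field states when it was observed.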

This is less a technology challenge than a data governance one. The teams that will extract value from AI activation fastest are those who treat their CDP schema as a product with internal consumers — including non-human ones.

Feedback Loops Close the Activation Cycle

The Buffalo Sabres case — an NHL franchise using Alchemer’s AI-powered feedback platform — is an instructive example of what happens when declared data is actually used. The organisation integrated real-time fan sentiment signals into its CX operation, creating a loop where feedback influenced operational decisions that in turn shaped the next feedback cycle. The result was measurable movement across revenue, engagement, and satisfaction metrics, though the precise figures weren’t disclosed.

For CDP teams, the lesson isn’t about sports — it’s about the activation loop. Most platforms are better at ingesting data than closing the feedback cycle. Behavioural and transactional data flows in reliably; declared data (survey responses, preference centre inputs, post-purchase sentiment) often lives in a silo. The AI data layer argument is that these streams need to converge, because a model optimising on behavioural signals alone will miss the declared intent that contradicts the pattern. A Southeast Asian consumer who purchases premium skincare once during a Harbolnas sale looks very different in behavioural data than they do in declared preference — and treating them identically is how you waste retargeting budget.

Implementation note: before adding another ingestion connector, map your feedback data back to your identity graph. If declared signals can’t be resolved to the same profile as your behavioural signals, you’re not closing a loop — you’re opening another silo.
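
A quick way to test this is to measure how many declared signals resolve to an existing profile. The sketch below assumes the identity graph can be read as a flat lookup from identifier to profile ID; real identity graphs are more involved, and every name and value here is hypothetical.

```python
def resolve_declared_signals(signals, identity_graph):
    """Split declared signals into those that map to a profile and those that don't."""
    resolved = {}     # profile_id -> list of declared signals
    unresolved = []   # signals whose identifier is unknown to the graph
    for signal in signals:
        profile_id = identity_graph.get(signal["identifier"])
        if profile_id is None:
            unresolved.append(signal)
        else:
            resolved.setdefault(profile_id, []).append(signal)
    return resolved, unresolved


# Hypothetical identity graph: any known identifier -> unified profile ID.
identity_graph = {
    "email:a1b2": "profile-1",
    "device:xyz": "profile-1",
    "email:c3d4": "profile-2",
}
survey_responses = [
    {"identifier": "email:a1b2", "question": "preferred_tier", "answer": "budget"},
    {"identifier": "email:zzzz", "question": "preferred_tier", "answer": "premium"},
]
resolved, unresolved = resolve_declared_signals(survey_responses, identity_graph)
unresolved_rate = len(unresolved) / len(survey_responses)
```

A high unresolved rate suggests the declared data is sitting in its own silo, however many connectors feed the platform.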

Key Takeaways

  • Audit your CDP event schema for context preservation — sequence, session, and signal decay — before connecting any AI layer on top of it.
  • Treat LLMs as transformation components inside a deterministic pipeline, not as autonomous problem-solvers; strict input schemas upstream are what make output trustworthy downstream.
  • Design your data layer for AI agent consumers, not just human analysts — that means machine-readable contracts, schema versioning, and field-level metadata about lineage and confidence.

The brands that win the next phase of CDP maturity won’t necessarily have the most data — they’ll have the most contextually coherent data. The question worth sitting with: if an AI agent queried your unified customer profile today, would it understand what it was reading, or just what it was seeing?


At grzzly, we work with regional marketing and data teams to get CDPs past the licence-fee-justification stage, architecting data layers that actually serve AI activation, not just reporting dashboards. If your unified profile is more unified in name than in practice, we’ve had that conversation before and we know where the bodies are buried.

Written by

Velvet Grizzly

Architecting the unified customer profile — stitching together behavioural, transactional, and declared data into platforms that actually earn their licence fee.
