Designing the AI-Driven Data Foundations
Architecture, Principles, and Practice
A practical guide to building modern data foundations for the AI era — from architecture patterns and platform design to the principles that help enterprises move from experimentation to real outcomes.
About the Book
In an era where artificial intelligence is reshaping every industry, the foundation of any successful AI initiative lies in its data architecture. Designing the AI-Driven Data Foundations provides a comprehensive guide to building the modern data infrastructure that AI demands.
Drawing on decades of experience advising enterprises on cloud, data, and AI strategy, Sanjeev Mohan delivers a practical framework covering architecture patterns, core principles, and real-world practices for designing data platforms that are not just AI-ready, but AI-driven from the ground up.
Whether you are a data architect, engineering leader, or technology executive, this book equips you with the knowledge to design, implement, and evolve data foundations that power the next generation of intelligent applications.
Table of Contents
An overview of the themes and chapters covered in the book.
The gap between what organizations want from AI and what their data can actually support has never been wider. This Introduction defines what “AI-driven data foundations” really means and previews the book’s trade-off-centered, AI-first approach. Written for architects, CDOs, engineers, and the growing number of practitioners who now find themselves making data platform decisions.
A tour of how we got here, from early data warehouses to the modern AI era, and why the architectural principles beneath the churn are more stable than the headlines suggest. Establishes the market dynamics, guiding principles, and building blocks that anchor every decision in the rest of the book.
The systems that run the business, from relational databases to NoSQL, NewSQL, and vector stores. Covers when each category fits, how they interact with AI workloads, and what to watch for as operational and analytical boundaries blur.
Data warehouses, lakes, lakehouses, and the rise of open table formats. A practical framework for choosing between them based on workload, scale, governance needs, and AI readiness.
A nine-dimension framework for evaluating any data store against AI-era requirements, including performance, cost, openness, governance, interoperability, and agent accessibility. Designed to cut through vendor claims and surface what actually matters for your context.
How ingestion, transformation, and orchestration are changing as machines become primary data consumers. Introduces the Extract, Context, Link (ECL) pattern for agent-ready data access and examines modern pipeline architectures, streaming, and the shift from ETL to context engineering.
From dashboards to semantic layers to data products as first-class citizens. Covers how analytics is being reshaped by AI, what data product thinking really requires, and how to build analytical capabilities that serve both humans and agents.
A practitioner's view of the generative AI stack: foundation models, RAG, fine-tuning, MCP, and the emerging world of AI agents. Includes patterns such as the circuit breaker for agent reliability, and honest guidance on where generative AI actually delivers value versus where it stalls.
Governance treated as an enabler, not a brake. Presents a four-layer framework covering the operating model, data and AI products, core capabilities, and enforcement controls. Explains why traditional approaches fall short in AI-driven environments and how metadata makes modern governance actionable.
The controls layer of governance, examined in depth. Covers why “good data” isn’t good enough for AI, the shift from static rules to dynamic discovery, the expanded attack surface from prompt injection to agent hijacking, and emerging obligations such as the EU AI Act.
What happens after deployment, when stateful agents and nondeterministic behavior collide with traditional operations. Traces the evolution from DevOps to DataOps to MLOps, and covers data contracts, DataFinOps, production metrics for agentic systems, and using AI agents themselves for root cause analysis and automated remediation.