During FabCon Poland 2025, the Microsoft Fabric conference organised by Onex Group, one of the most important architectural topics was the role of OneLake as the foundation of the entire Fabric platform. Microsoft often refers to OneLake as “OneDrive for data”, and the comparison goes far beyond a catchy metaphor. In practice, OneLake is the layer through which all data in Fabric is stored, accessed, processed and governed.
From a data architecture perspective, OneLake addresses challenges that organisations have struggled with for years: fragmented data estates, siloed platforms, inconsistent security models and increasing pressure to deliver analytics and AI at scale. In this context, OneLake becomes the common foundation not only for reporting and analytics, but also for machine learning and generative AI use cases.

OneLake in Microsoft Fabric as a single, central data repository
OneLake is designed as a single, unified storage layer for all organisational data, regardless of where that data originates or in what format it exists. It allows organisations to consolidate data coming from the Microsoft ecosystem, such as Azure, Dynamics and Microsoft 365, as well as from external platforms including AWS, Google Cloud Platform, Oracle and on-premises systems.
A key principle behind OneLake is centralisation without lock-in. Structured, semi-structured and unstructured data can coexist within the same repository. Traditional tabular data sits alongside CSV, JSON and XML files, while documents such as PDFs and images can also be stored and analysed, increasingly in the context of generative AI scenarios.
From an AI readiness standpoint, OneLake does not introduce new requirements. The same fundamentals that enable trustworthy analytics apply equally to AI. Data must be well-modelled, contextualised and aligned with business logic. Without that foundation, neither reporting nor GenAI can deliver meaningful results.
Ingesting data into OneLake: shortcuts, transformations and mirroring
The first stage of the data lifecycle in OneLake is ingestion. Microsoft Fabric provides several mechanisms that allow organisations to ingest data both through traditional integration patterns and through significantly simplified approaches that reduce engineering effort.

One of the most important mechanisms is shortcuts. Shortcuts create virtual connections to Delta-formatted data stored outside of Fabric. They can point to data in AWS S3, Google Cloud Storage, Azure or even another Fabric tenant. The data is not physically copied when the shortcut is created. Instead, Fabric exposes the table structure, while the actual data is read only when it is queried.
This approach is particularly valuable in multi-cloud environments. For example, data generated in Databricks on AWS can be exposed in Fabric as if it were native, enabling immediate joins and analysis using SQL or other Fabric engines without duplication or complex ETL pipelines.
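For teams that want to automate this rather than click through the portal, the Fabric REST API exposes a shortcuts endpoint (POST /v1/workspaces/{workspaceId}/items/{itemId}/shortcuts). The sketch below only builds the request body for an Amazon S3 shortcut; the field names follow the documented S3 target shape, but the bucket URL, connection ID and names here are placeholders, and the exact schema is worth verifying against the current API reference.

```python
import json

def build_s3_shortcut_payload(name: str, subpath: str,
                              location: str, connection_id: str) -> dict:
    """Request body for the Fabric OneLake Shortcuts REST API.

    Field names follow the documented S3 target shape; all values
    passed in by the caller are placeholders for illustration.
    """
    return {
        "path": "Tables",   # create the shortcut under the lakehouse Tables folder
        "name": name,       # shortcut name as it will appear in OneLake
        "target": {
            "s3": {
                "location": location,           # bucket endpoint URL
                "subpath": subpath,             # folder inside the bucket
                "connectionId": connection_id,  # Fabric connection holding S3 credentials
            }
        },
    }

payload = build_s3_shortcut_payload(
    "sales_orders",
    "/delta/sales_orders",
    "https://example-bucket.s3.amazonaws.com",
    "00000000-0000-0000-0000-000000000000",
)
print(json.dumps(payload, indent=2))
```

Once the shortcut exists, the table is queryable like any native lakehouse table, while the underlying Parquet/Delta files stay in the source bucket.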

For data delivered in formats such as CSV, JSON or XML, Fabric introduces Shortcut Transformations, currently available in preview. This capability automatically converts semi-structured data into Delta format without requiring custom pipelines. The mechanism listens for changes in a defined location, detects new or modified files and continuously applies those changes to the data stored in OneLake.
These transformations can be enriched with AI capabilities. Text can be translated, summarised, analysed for sentiment or scrubbed for sensitive information as part of the ingestion process. This enables near-real-time scenarios where data is not only ingested, but also semantically processed as it arrives.
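The listen-detect-apply loop described above can be sketched in a few lines. This is emphatically not the Fabric implementation, just a minimal stdlib illustration of the pattern the feature automates: track each file's last processed modification time and re-read only files that are new or have changed since the previous pass.

```python
import csv
import pathlib

def sync_new_csvs(src_dir: str, seen: dict) -> list:
    """Return rows from CSV files in src_dir that are new or modified.

    `seen` maps file path -> last processed mtime; the caller keeps it
    between runs, so unchanged files are skipped on subsequent passes.
    """
    new_rows = []
    for path in pathlib.Path(src_dir).glob("*.csv"):
        mtime = path.stat().st_mtime
        if seen.get(str(path)) == mtime:
            continue                        # unchanged since the last pass
        with path.open(newline="") as fh:
            new_rows.extend(csv.DictReader(fh))
        seen[str(path)] = mtime             # remember this version of the file
    return new_rows
```

In the real feature the "apply" step writes the converted rows to a Delta table in OneLake and can chain the AI enrichments mentioned above; the sketch stops at detection and parsing.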
A third key integration mechanism is mirroring. A concept familiar from earlier database technologies, mirroring in Fabric enables continuous synchronisation between operational systems and OneLake. It is particularly relevant for systems such as Azure SQL Database or Dataverse, where analytical models and reports must reflect operational changes quickly. Mirroring removes the need to build complex pipelines for change detection and data reconciliation.
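To make concrete what mirroring takes off a team's plate, here is a sketch of the high-watermark change detection that would otherwise be built by hand (the `updated_at` column and row shape are invented for the example): each sync pulls rows modified since the last watermark and advances it.

```python
from datetime import datetime

def pull_changes(source_rows, watermark):
    """Hand-rolled change detection of the kind mirroring replaces:
    select rows modified after the last synchronised watermark and
    advance the watermark to the newest change seen."""
    changed = [r for r in source_rows if r["updated_at"] > watermark]
    new_mark = max((r["updated_at"] for r in changed), default=watermark)
    return changed, new_mark

orders = [
    {"id": 1, "updated_at": datetime(2025, 1, 10)},
    {"id": 2, "updated_at": datetime(2025, 6, 2)},
]
changed, mark = pull_changes(orders, watermark=datetime(2025, 3, 1))
```

Real pipelines also need deletes, schema drift and late-arriving updates handled correctly, which is exactly the reconciliation logic mirroring eliminates.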
A unified data format across Fabric engines
One of the most significant architectural changes introduced with Microsoft Fabric is the use of a unified data format: Delta Lake tables stored as Parquet files. All Fabric engines, including Spark, the SQL-based Data Warehouse, Real-Time Analytics and Analysis Services, operate on the same data stored in OneLake.

This eliminates the need to copy data between layers or systems to support different workloads. Transformations performed in Spark are immediately visible to the Data Warehouse and can be consumed by Analysis Services through Direct Lake. The result is a dramatically shortened data lifecycle and a much simpler architecture.
By removing repeated data movement between staging, warehousing and reporting layers, organisations reduce complexity, improve consistency and accelerate time to insight.
OneLake as the foundation for AI and integration with Microsoft Foundry
OneLake also plays a central role in AI-driven architectures. Data stored in OneLake can be consumed by analytics workloads and by AI platforms integrated with Fabric.
In this context, Microsoft Foundry acts as the AI counterpart to Fabric. While Fabric enables end-to-end analytics, Foundry supports advanced AI workloads including machine learning, generative AI and AI agents. Crucially, these solutions do not require a separate data layer. They access data directly from OneLake, using the same governed and secured foundation.

Fabric IQ further complements this approach by introducing a structured understanding of data semantics and business meaning. This logical description of data, long familiar in data governance practices, becomes increasingly important in GenAI scenarios where context is critical.
Security, governance and data discovery in OneLake
Centralising data requires a consistent and enforceable security model. OneLake Security allows organisations to define row-level and column-level security policies that apply across all Fabric engines and beyond. The same policies are respected when data is accessed via Power BI, Excel or other integrated tools.
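The effect of such a policy can be illustrated with a toy sketch (the column names and predicate are invented for the example, and real enforcement happens inside the Fabric engines, not in client code): rows are filtered before columns are projected, so restricted data never reaches the caller regardless of which tool issued the query.

```python
def apply_policies(rows, allowed_columns, row_filter):
    """Toy model of what a OneLake Security policy enforces at query time:
    row_filter stands in for row-level security, allowed_columns for
    column-level security."""
    return [
        {col: val for col, val in row.items() if col in allowed_columns}
        for row in rows
        if row_filter(row)        # drop rows the caller may not see
    ]

employees = [
    {"id": 1, "region": "EU", "salary": 70000},
    {"id": 2, "region": "US", "salary": 95000},
]
# Hypothetical policy: EU analysts see only EU rows, and never the salary column.
visible = apply_policies(employees, {"id", "region"}, lambda r: r["region"] == "EU")
```

Because the policy is defined once in OneLake, the same filtered view is what Power BI, Excel or a Spark notebook would observe.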
Data governance and discoverability are addressed through the OneLake Catalog, which serves as a central hub for discovering data assets across the organisation. Users can explore available datasets, understand metadata, review security policies and see how assets are structured. Integration with Microsoft Purview enables a broader, enterprise-wide view of data assets.

Discovery is further enhanced through Copilot, which allows users to search and explore data assets using natural language, making large and complex data estates more accessible.
Key takeaways: why centralised data is critical for analytics and AI
AI readiness does not fundamentally differ from analytics readiness. Both require high-quality, well-modelled data grounded in business logic. OneLake provides the architectural foundation that enables this at scale.
Centralisation, unification and security are essential, but technology alone is not enough. Successful analytics and AI initiatives also require operational discipline, clear communication and a strong understanding of business processes. Fabric and OneLake provide the implementation layer, but value is realised only when they are embedded within a broader organisational strategy.