Microsoft Fabric and OneLake: The Unified Analytics Platform Guide for 2026

One capacity, seven workloads, one lake. How Fabric actually fits together, with a realistic cost model, Direct Lake patterns, and the gotchas nobody puts in the brochure.

Sai Kiran Pandrala

OneLake, the OneDrive for data

Microsoft Fabric is built on OneLake: a single, tenant-wide, multi-region lake that stores everything in Delta Lake format. Every Fabric workload reads and writes to OneLake by default. Different workloads (Data Engineering, Warehouse, Real-Time Intelligence, Data Science, Power BI) share the same data without copying.

This solves the historical data platform headache: silos between warehouses, lakes, BI semantic models, and real-time stores. In Fabric you don't have a warehouse and a lake; you have OneLake, and it's both.

  • Lakehouses: items that hold Delta tables plus files; the SQL analytics endpoint gives read-only T-SQL.
  • Warehouses: read/write T-SQL, transactional, with Delta storage underneath.
  • KQL databases: real-time event stores whose tables can be exposed in OneLake as Delta.
  • Datasets (Power BI semantic models): Import, Direct Lake, or DirectQuery storage mode.
  • Shortcuts: virtual links to data in S3, ADLS Gen2, Dataverse, or another Fabric workspace, without copying.
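
Because every item lives in one lake, any engine that speaks the ADLS Gen2 protocol can address it by URI. The sketch below builds such URIs with plain string formatting. The workspace, lakehouse, and table names are hypothetical, and the helper functions are illustrative rather than part of any Fabric SDK.

```python
# Minimal sketch: build OneLake URIs for lakehouse tables and files.
# Assumption: workspace and item names contain no spaces or special characters;
# in practice workspace and item GUIDs are the safer choice.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace: str, lakehouse: str, table: str) -> str:
    """URI of a Delta table under the lakehouse's Tables folder."""
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Tables/{table}"


def onelake_file_uri(workspace: str, lakehouse: str, relative_path: str) -> str:
    """URI of an unmanaged file under the lakehouse's Files folder."""
    clean = relative_path.lstrip("/")
    return f"abfss://{workspace}@{ONELAKE_HOST}/{lakehouse}.Lakehouse/Files/{clean}"


# Hypothetical names, for illustration only.
print(onelake_table_uri("SalesWS", "Gold", "fact_orders"))
# abfss://SalesWS@onelake.dfs.fabric.microsoft.com/Gold.Lakehouse/Tables/fact_orders
```

The same URI can be handed to a Spark notebook or to an external Delta reader, which is what "share the same data without copying" means in practice.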

Seven workloads, one capacity

Workload | What it does | Replaces (traditional)
Data Factory | Pipelines, dataflows, copy jobs | Azure Data Factory + SSIS
Data Engineering | Spark notebooks, lakehouses | Databricks / Synapse Spark
Data Warehouse | T-SQL warehouse with Delta | Synapse Dedicated SQL pool
Data Science | ML notebooks, experiments, MLflow | Azure ML + Databricks ML
Real-Time Intelligence | Eventstreams, KQL databases, activator | Event Hubs + ADX + Logic Apps
Power BI | Semantic models, reports, dashboards | Power BI Premium
Activator (Reflex) | Trigger-based automation on streams | Logic Apps / Azure Functions

The shared asset is capacity. You buy an F-SKU (F2, F4, F8, … F2048), and every workload consumes capacity units (CUs) from it. Bursting lets a job temporarily draw more CUs than the SKU provides, and smoothing spreads that consumption over a longer window so short peaks don't trigger throttling. This is the core economic abstraction, and the thing most Fabric adopters budget wrong.

Direct Lake, the feature that changes Power BI

Direct Lake is Power BI's third storage mode. In Import mode, Power BI holds an in-memory copy: fast, but stale between refreshes and slow to refresh. In DirectQuery, every visual queries the source: fresh, but slow. Direct Lake loads the Parquet files behind Delta tables from OneLake into the Power BI engine on demand, column by column, at near Import-mode speed, with no data-copying refresh and almost no staleness.

For data that lives in OneLake (or is shortcut in), Direct Lake is the default answer. Expect:

  • Import-class query speed on large Delta tables, within the Direct Lake guardrails for your SKU.
  • No more "refresh failed" emails at 6am.
  • Automatic fallback to DirectQuery for queries Direct Lake can't serve (rare, but watch your model design).

If you're still importing nightly into Power BI from a data warehouse in 2026, you have tech debt. Migrate the fact tables to OneLake, then switch semantic models to Direct Lake. Your team gets a week of their life back.

Mirroring, free CDC from your operational databases

Fabric Mirroring continuously replicates operational databases (Azure SQL, Azure Cosmos DB, Snowflake, Databricks Unity Catalog tables, on-prem SQL Server through a data gateway) into OneLake as Delta tables. It's near-real-time (typically seconds to minutes of lag), transactionally consistent, and free on the mirror side: you pay only for the load on the source.
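
To make "continuous replication" concrete, here is a conceptual sketch of what applying a change feed to a keyed replica involves. It is not how Fabric implements Mirroring; the event shape, the `apply_changes` helper, and the sample rows are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Tuple

# A change event is (operation, primary_key, changed_columns). Illustrative shape only.
Change = Tuple[str, int, dict]


def apply_changes(replica: Dict[int, dict], changes: Iterable[Change]) -> Dict[int, dict]:
    """Apply an ordered change feed to a keyed replica, in commit order."""
    for op, key, row in changes:
        if op in ("insert", "update"):
            # Upsert: merge the changed columns over whatever is already there.
            replica[key] = {**replica.get(key, {}), **row}
        elif op == "delete":
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return replica


feed = [
    ("insert", 1, {"customer": "Contoso", "status": "new"}),
    ("insert", 2, {"customer": "Fabrikam", "status": "new"}),
    ("update", 1, {"status": "shipped"}),
    ("delete", 2, {}),
]
replica = apply_changes({}, feed)
print(replica)  # {1: {'customer': 'Contoso', 'status': 'shipped'}}
```

Applying events strictly in commit order is what keeps the replica transactionally consistent; the lag is simply how far behind the feed the replica is at any moment.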

Why this changes things

  1. Far fewer ETL pipelines. Much of Azure Data Factory's job was replicating operational data for analytics. Mirroring does that with almost no engineering.
  2. The data lake is continuously fresh. Analytics run on data that's minutes (sometimes seconds) old.
  3. You can do real-time HTAP on top of Azure SQL without Business Critical tier.

When Mirroring won't cover you

  • Unsupported sources such as PostgreSQL, MySQL, and Oracle: pipeline them, or use Fabric's open-mirroring partner connectors.
  • Required transformations before data lands (use Dataflow Gen2 or notebooks on top of the mirror).
  • Strict schema versioning (you'll need DLT or notebook-driven Silver/Gold layers).

Capacity sizing, the 15-minute version

SKU | CUs | ~Price/mo (PAYG) | ~Price/mo (reserved 1yr) | Suits
F2 | 2 | $263 | $156 | Dev/learning
F8 | 8 | $1,051 | $623 | Small team, 1-2 reports, small lakehouse
F32 | 32 | $4,205 | $2,490 | Mid org, many users on Direct Lake
F64 | 64 | $8,410 | $4,981 | Replaces Power BI Premium P1; broad enterprise use
F256 | 256 | $33,640 | $19,922 | Heavy Spark/DW loads, large BI userbase

Rules of thumb:

  • Start one size smaller than you think. Capacity smoothing absorbs bursts.
  • Pause dev capacities at night (1-click; you're billed by the minute).
  • Monitor with the Capacity Metrics app. If you see throttling more than 2% of minutes, size up.
  • Reserved pricing cuts ~40%, commit once you have 2 months of telemetry.
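
The table and rules of thumb reduce to a little arithmetic. The sketch below assumes a pay-as-you-go rate of $0.18 per CU-hour and 730 hours per month, both back-derived from the table above rather than taken from an official price list, plus the approximate 40% reserved discount. Check current regional pricing before budgeting with it.

```python
HOURS_PER_MONTH = 730          # average hours in a month
PAYG_RATE_PER_CU_HOUR = 0.18   # assumption: $263 / 2 CUs / 730 h from the table
RESERVED_DISCOUNT = 0.41       # assumption: approximate, from the reserved column


def monthly_payg(cus: int, active_hours: float = HOURS_PER_MONTH) -> float:
    """Pay-as-you-go cost for the hours the capacity is actually running."""
    return cus * PAYG_RATE_PER_CU_HOUR * active_hours


def monthly_reserved(cus: int) -> float:
    """Reserved cost; billed for the whole month whether or not you pause."""
    return monthly_payg(cus) * (1 - RESERVED_DISCOUNT)


def should_size_up(throttled_minutes: int, total_minutes: int, threshold: float = 0.02) -> bool:
    """Rule of thumb from the list above: throttled in more than 2% of minutes."""
    return throttled_minutes / total_minutes > threshold


print(round(monthly_payg(64)))                    # 8410, always-on F64
print(round(monthly_payg(8, active_hours=220)))   # 317, dev F8 running 10 h on 22 weekdays
print(should_size_up(900, 30 * 24 * 60))          # True, about 2.1% of minutes throttled
```

The second line is the argument for pausing dev capacities: an F8 that runs only during working hours costs roughly a third of the always-on price.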

The Fabric Medallion architecture

The Medallion pattern (Bronze → Silver → Gold) still applies, with a Fabric twist:

  1. Bronze. Raw data lands via Mirroring, pipelines, or Eventstreams. Delta tables, append-only.
  2. Silver. Cleansed and conformed. Notebooks or Dataflow Gen2 transform Bronze to Silver with schema enforcement, PII handling, SCD2.
  3. Gold. Aggregated, business-facing. Star schemas for Power BI Direct Lake. One table per business concept.

In Fabric, a common layout puts each layer in its own lakehouse in a dedicated workspace, with shortcuts where convenient. This gives you access control per layer: analysts get Gold, engineers get Silver and Bronze.
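
As a rough illustration of the Bronze-to-Silver step, the sketch below type-casts raw rows, keeps the latest version per key, and sets aside rows that fail schema enforcement. It uses plain Python so it runs anywhere; in Fabric the same logic would normally live in a Spark notebook or Dataflow Gen2. The column names and sample rows are hypothetical.

```python
from datetime import datetime

# Raw, append-only Bronze rows: everything arrives as strings.
bronze = [
    {"order_id": "1", "amount": "10.50", "loaded_at": "2026-01-01T08:00:00"},
    {"order_id": "1", "amount": "12.00", "loaded_at": "2026-01-02T08:00:00"},  # later version
    {"order_id": "2", "amount": "oops", "loaded_at": "2026-01-02T08:00:00"},   # bad amount
]


def to_silver(rows):
    """Cast types, keep the newest row per order_id, and collect rejects."""
    latest, rejected = {}, []
    for raw in rows:
        try:
            row = {
                "order_id": int(raw["order_id"]),
                "amount": float(raw["amount"]),
                "loaded_at": datetime.fromisoformat(raw["loaded_at"]),
            }
        except (KeyError, ValueError):
            rejected.append(raw)  # schema enforcement: quarantine, don't drop silently
            continue
        current = latest.get(row["order_id"])
        if current is None or row["loaded_at"] > current["loaded_at"]:
            latest[row["order_id"]] = row
    return list(latest.values()), rejected


silver, rejected = to_silver(bronze)
print(len(silver), len(rejected))  # 1 1
```

Bronze is never modified here; Silver is derived from it, so a bad transformation can always be replayed from the raw rows.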

Which workload for which transformation?

Transformation | Best tool
Small-to-mid Dataflow (up to ~5M rows) | Dataflow Gen2 (low-code)
Large batch ETL / schema evolution | Notebooks (PySpark or T-SQL)
Set-based SQL transformations | Warehouse T-SQL
Real-time streams | Eventstream → KQL DB → Activator
ML feature engineering | Data Science notebooks + MLflow

Governance with Purview + Fabric

Fabric integrates with Microsoft Purview for classification, sensitivity labels, and lineage. Each OneLake item can have a Purview sensitivity label; downstream reports inherit it; DLP policies block or warn on export of sensitive content.

Minimum governance to turn on

  • Label every lakehouse and warehouse (Public/General/Confidential/Highly Confidential).
  • Enable Purview auto-labelling with trainable classifiers (financial, HR, etc.).
  • Add Purview DLP on "Fabric and Power BI" location.
  • Turn on Fabric Information Protection for downloadable artefacts.
  • Audit exports to Excel/CSV; require justification for Highly Confidential exports.
  • Enforce RLS (row-level security) and OLS (object-level security) in Power BI semantic models.

Real-Time Intelligence, the new Eventhouse pattern

Real-Time Intelligence brings together Eventstreams (Kafka / Event Hubs / CDC ingestion), KQL Databases (time-series analytics), and Activator (trigger-based automation). It's what you build when you need dashboards and alerts on seconds-old data.

Use cases that actually need this

  • IoT device telemetry with anomaly alerts.
  • Financial trading surveillance.
  • Fraud detection triggered by unusual patterns.
  • Operational dashboards for call centers, logistics, retail ops.
  • AIOps: correlating logs/traces/metrics from your cloud infrastructure in real time.

If your "real-time" dashboard refreshes every 15 minutes, you don't need Real-Time Intelligence; Direct Lake is enough. Use RTI when the SLA is seconds.

Data Science in Fabric, MLflow, autoML, and ModelOps

Fabric Data Science packages Jupyter-compatible notebooks with MLflow tracking, a model registry, AutoML (via FLAML, alongside SynapseML, LightGBM, and scikit-learn), and model-serving endpoints. For SMB and mid-market teams that don't have full Azure ML or Databricks, it's enough.

Integration with OneLake means you train on Delta tables that are continuously updated, and your inference can be batch-scored back into Delta for downstream BI, no extra plumbing.

Limits to know: GPU clusters are supported but pricier than raw Azure ML. Custom Docker environments are limited. For heavyweight ML (distributed training, custom PyTorch runtimes, research workflows), stay on Azure ML or Databricks ML.

Migrating from Synapse / ADF / Power BI Premium

Most Fabric adopters come from Synapse Analytics, Azure Data Factory, or Power BI Premium. Migration playbook:

  1. Power BI Premium P1 → Fabric F64. Same compute, now with OneLake and all workloads. Premium capacities can be upgraded in-place by Microsoft support.
  2. ADF pipelines → Fabric Data Factory. Most activities map 1:1. The role of the self-hosted integration runtime is taken by the on-premises data gateway in Fabric.
  3. Synapse Dedicated SQL Pool → Fabric Warehouse. Schema and data portable. CETAS + BACPAC + scripts.
  4. Synapse Spark → Fabric Data Engineering. Notebooks migrate with minor path updates for OneLake.
  5. ADX → Fabric KQL DB. Data can be shortcut from ADX into Fabric until you're ready to move fully.

Don't do a big-bang migration. Move one workload at a time; run dual-write for a month; switch consumers; retire the old.
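
During the dual-write month you need a cheap way to prove that the old and new platforms agree before switching consumers. One possible approach, sketched here with hypothetical table contents and helper names, is to compare row counts plus an order-independent checksum, then list the keys that differ.

```python
import hashlib


def fingerprint(rows):
    """Return (row_count, order-independent checksum) for a list of dict rows."""
    total = 0
    for row in rows:
        canonical = repr(sorted(row.items())).encode("utf-8")
        chunk = int.from_bytes(hashlib.sha256(canonical).digest()[:8], "big")
        total = (total + chunk) % 2**64  # addition, so row order does not matter
    return len(rows), total


def diff_keys(old_rows, new_rows, key):
    """Keys missing, extra, or holding different values on the new platform."""
    old = {r[key]: r for r in old_rows}
    new = {r[key]: r for r in new_rows}
    return {
        "missing_in_new": sorted(old.keys() - new.keys()),
        "extra_in_new": sorted(new.keys() - old.keys()),
        "mismatched": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


# Hypothetical extracts from the legacy warehouse and from Fabric.
legacy = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.0}, {"id": 3, "amount": 30.0}]
fabric = [{"id": 3, "amount": 30.0}, {"id": 1, "amount": 10.0}, {"id": 2, "amount": 25.0}]

if fingerprint(legacy) != fingerprint(fabric):
    print(diff_keys(legacy, fabric, key="id"))
# {'missing_in_new': [], 'extra_in_new': [], 'mismatched': [2]}
```

The fingerprint is the fast daily check; the key-level diff only needs to run when the fingerprints disagree.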

Capacity planning: sizing F-SKUs without burning the budget

Company size | Workload profile | Starting SKU | Upgrade trigger
< 50 users | Power BI + light pipelines | F2–F4 | Throttling events > 5/week
50–200 | Warehouse + Power BI + DS | F16–F32 | CU 90th percentile > 80%
200–1000 | Enterprise BI + RTI | F64 | Concurrent interactive queries > 120
1000+ | Multi-workload, multi-tenant | F128–F256 | Use multi-capacity + workload isolation

Levers that let you stay on a smaller SKU: (1) schedule heavy pipelines off-peak and let smoothing spread the cost, (2) move reports from Import to Direct Lake where freshness allows, (3) keep cold data, such as Sentinel Auxiliary Logs, outside Fabric and reach it through OneLake shortcuts, (4) pre-aggregate hot dashboards into materialized views.
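
The "CU 90th percentile > 80%" trigger from the table is easy to automate once you export utilisation samples. This is a minimal sketch assuming you already have a list of CU utilisation percentages, for example one per interval from the Capacity Metrics app; the export itself and the sample values are not shown in the source and are hypothetical.

```python
from statistics import quantiles


def p90(samples):
    """90th percentile: the last of the nine decile cut points."""
    return quantiles(samples, n=10, method="inclusive")[-1]


def needs_upgrade(cu_utilisation_pct, threshold: float = 80.0) -> bool:
    """True when the 90th percentile of CU utilisation exceeds the threshold."""
    return p90(cu_utilisation_pct) > threshold


# Mostly idle capacity with a handful of spikes: the spikes alone don't justify a bigger SKU.
spiky = [60.0] * 95 + [100.0] * 5
# Sustained high utilisation: time to size up.
busy = [85.0] * 80 + [95.0] * 20

print(needs_upgrade(spiky))  # False
print(needs_upgrade(busy))   # True
```

Using a percentile rather than the maximum is the point: a few spikes are what smoothing is for, while a high 90th percentile means the capacity is busy most of the time.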

Medallion done right: concrete pipeline patterns

  1. Bronze. Land raw data with schema-on-read. Never mutate. Partition by ingestion date. Retain 30–90 days then archive to OneLake cool tier.
  2. Silver. Type-cast, dedupe, apply SCD2 where needed. One notebook per source. Test with dbt or Great Expectations before merging.
  3. Gold. Business-ready marts. Star schema. Slowly-changing dims separated from facts. Refreshed incrementally, not full.
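
SCD2 is the step in this list that teams most often get subtly wrong, so here is a minimal sketch of the logic in plain Python. The column names (`valid_from`, `valid_to`, `is_current`) and the sample dimension are assumptions; in Fabric you would typically express the same thing as a Delta MERGE in a notebook.

```python
from datetime import date

OPEN_END = date(9999, 12, 31)  # conventional "still valid" end date


def apply_scd2(dimension, incoming, key, tracked, as_of):
    """Close changed rows and append new versions. Mutates and returns `dimension`.

    Assumes `incoming` holds at most one row per key.
    """
    current = {r[key]: r for r in dimension if r["is_current"]}
    for new in incoming:
        old = current.get(new[key])
        if old is not None and all(old[c] == new[c] for c in tracked):
            continue  # no tracked attribute changed, keep the current version
        if old is not None:
            old["valid_to"] = as_of      # close the previous version
            old["is_current"] = False
        dimension.append({**new, "valid_from": as_of, "valid_to": OPEN_END, "is_current": True})
    return dimension


dim_customer = [
    {"customer_id": 1, "city": "Oslo", "valid_from": date(2025, 1, 1),
     "valid_to": OPEN_END, "is_current": True},
]
updates = [{"customer_id": 1, "city": "Bergen"}, {"customer_id": 2, "city": "Lima"}]

apply_scd2(dim_customer, updates, key="customer_id", tracked=["city"], as_of=date(2026, 1, 15))
print(len(dim_customer))  # 3: closed Oslo row, current Bergen row, new Lima row
```

Running the same batch a second time adds nothing, which is the property that makes incremental refresh safe to retry.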

Anti-patterns to avoid: business logic in Bronze ("just add a calc column"), Gold tables nobody owns, and nested notebooks > 3 deep (makes debugging impossible).

Ten levers to cut Fabric spend by 30% without losing capability

  • Move small reports to a shared capacity instead of dedicated F-SKU
  • Schedule pipelines at night when CU is idle (smoothing does the rest)
  • Use Direct Lake on Lakehouse for dashboards, no import refresh CU cost
  • Prune unused Power BI datasets; each refresh costs CU
  • Switch to Incremental Refresh on any dataset > 1 GB
  • Replace Copy Activity with Shortcuts when data lives in ADLS / S3 / GCS
  • Use Mirroring instead of nightly ingest from Cosmos / Snowflake / SQL
  • Drop raw JSON that you transform immediately, keep Parquet only
  • Apply V-Order + optimize write to keep small files under control
  • Pause F-SKU during weekends in non-prod tenants

Migrating from Synapse: a pragmatic 6-week plan

Treat Synapse-to-Fabric like any warehouse migration: don't lift and shift, refactor as you go.

Week | Activity | Output
1 | Inventory datasets, pipelines, workspaces | Migration backlog
2 | Spin up Fabric F-SKU, import Power BI workspaces | Reports live on Fabric
3 | Shortcut ADLS folders into OneLake | No data movement yet
4 | Rebuild Gold tables as Lakehouse + Warehouse | Dual-write for 2 weeks
5 | Convert pipelines to Fabric Data Pipelines / Notebooks | Orchestration moved
6 | Cut over reporting, decommission Synapse | One platform, lower bill

Direct Lake: when it is magic, when it isn't

Use Direct Lake when | Fall back to Import when
Tables fit comfortably in capacity memory (F-SKU dependent) | Heavy calculated columns / calculated tables needed
You want near-real-time freshness without scheduled refresh | Complex RLS with dynamic security patterns
Delta tables land via OneLake, Shortcuts, or Mirroring | Non-OneLake data sources that are not shortcut-able
Users hit the report less than 200x/hour | High-concurrency dashboards (use Composite + DQ)

Tactical tip: mix Direct Lake for the fact table and Import for dim tables you want to enrich with DAX-heavy logic. This is the "compose modes" pattern and it is officially supported in 2026.

Real-Time Intelligence patterns worth copying

  1. Ingest once, fan out many. Eventstream lands a topic into both Eventhouse (hot analytics) and OneLake (cold archive) in one definition.
  2. Materialised views on Eventhouse for the five metrics your ops team watches. Sub-second on billions of rows.
  3. Reflexes fire when thresholds break - Teams message, webhook, Power Automate. Replace hand-rolled alerting.
  4. Fabric Real-Time Dashboard with auto-refresh 5-10s. Ops TVs finally work without a bespoke web app.

Typical latency end-to-end for the full chain: 2-5 seconds from device emit to dashboard. Good enough for 99% of operational use cases.
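
The "fire when thresholds break" pattern is worth understanding even though Activator rules are configured in Fabric rather than coded by hand. The sketch below is not Activator's API; it is a hypothetical `ThresholdRule` class showing the two details that hand-rolled alerting usually misses: requiring several consecutive breaches, and firing once per incident instead of on every reading.

```python
class ThresholdRule:
    """Fire once when a value stays above `threshold` for `consecutive` readings."""

    def __init__(self, threshold: float, consecutive: int = 3):
        self.threshold = threshold
        self.consecutive = consecutive
        self.streak = 0        # how many readings in a row breached the threshold
        self.firing = False    # True while the current incident is already reported

    def observe(self, value: float) -> bool:
        if value > self.threshold:
            self.streak += 1
        else:
            self.streak = 0
            self.firing = False  # incident over, re-arm the rule
        if self.streak >= self.consecutive and not self.firing:
            self.firing = True
            return True
        return False


rule = ThresholdRule(threshold=80.0, consecutive=3)
readings = [70, 82, 85, 90, 95, 60, 88]  # hypothetical device temperatures
alerts = [rule.observe(r) for r in readings]
print(alerts)  # [False, False, False, True, False, False, False]
```

The single `True` is the moment a Teams message or webhook would go out; the following breach at 95 is part of the same incident and stays quiet.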

Smoothing, throttling, and the 24-hour window

Fabric's capacity metrics confuse most finance teams. The short version: you pay for a fixed number of CUs, background usage is smoothed over a 24-hour moving window, and Microsoft throttles only when sustained burn exceeds your ceiling.

  • Burst workloads absorb into the smoothing window - schedule heavy pipelines at 2am and they cost nothing extra.
  • Scheduled pause: pause dev/test capacities over weekends.
  • Workload isolation: separate F-SKU for Real-Time Intelligence if it is noisy - Power BI users won't see freezes.
  • Pay-as-you-go bursting: turn it on for month-end spikes; turn it off the rest of the time.

A well-tuned F64 can carry a 500-user BI load plus nightly Data Engineering. Poorly tuned, the same workload needs F128. Do the tuning before you resize.
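
Smoothing is easier to reason about with numbers. The sketch below is a simplification that spreads each background job's total CU-seconds evenly across the 24-hour window described above and adds a flat interactive load; the job sizes are hypothetical and the real platform evaluates usage at a much finer grain.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def smoothed_cu(cu_seconds: float, window_seconds: int = SECONDS_PER_DAY) -> float:
    """Average CU draw once a job's total CU-seconds are spread over the window."""
    return cu_seconds / window_seconds


def fits(capacity_cus: float, background_jobs_cu_seconds, interactive_cu: float = 0.0) -> bool:
    """True if smoothed background load plus interactive load stays within the SKU."""
    background = sum(smoothed_cu(job) for job in background_jobs_cu_seconds)
    return background + interactive_cu <= capacity_cus


# Hypothetical nightly Spark job: bursts to 128 CUs for 2 hours on an F64.
nightly_job = 128 * 2 * 3600  # CU-seconds

print(round(smoothed_cu(nightly_job), 2))               # 10.67 CUs once smoothed
print(fits(64, [nightly_job], interactive_cu=40))       # True
print(fits(64, [nightly_job] * 3, interactive_cu=40))   # False, 32 + 40 exceeds 64
```

A job that briefly uses twice the capacity costs about a sixth of it after smoothing, which is why tuning the schedule and the job count comes before resizing the SKU.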

Organizational readiness: the non-technical success factors

Fabric wins or loses based on organizational shape more than technical choices. Four patterns predict success.

Pattern 1 - a named Fabric admin, not a committee

Every successful tenant has one person whose calendar reflects Fabric ownership - capacity assignment, workspace governance, and cost review. Every failed tenant has a committee. If you can't name the Fabric admin today, name one before week two.

Pattern 2 - domains before workspaces

Set up domains first - Finance, Sales, Operations, Product. Workspaces live inside domains with inherited governance. Without domains, workspaces proliferate and nobody knows where the real data lives six months in.

Pattern 3 - certified datasets, not certified ambitions

Endorse datasets that have a named owner, documented refresh schedule, and business approval. Everything else is Promoted at best. Users learn quickly to trust certification - or distrust it - based on early consistency.

Pattern 4 - the data literacy program

Buying F64 and expecting adoption without training is like buying a plane without pilots. Run a monthly Fabric academy: one hour live, one hour recorded, one hour office hours. Every business unit sends two representatives. After three months, you have 24-60 internal champions and you stop needing external consultants for most tasks.

Budget expectations

Typical 500-user mid-market Fabric program, year one: F64 at roughly $100K/yr, Power BI Premium Per User for 40 makers at roughly $20K/yr, implementation partner at $150-250K one-time, internal training at $40K. Year two it halves because implementation is behind you and capacity is right-sized.

Fabric rewards organizations that invest in data literacy. It punishes organizations that expect technology to compensate for missing roles and processes. Decide now which one you are.

The one-page operating model for Fabric

Fabric scales when the operating model is clear. Keep it to one page and everyone in the org can see where they fit.

  • Central platform team. Owns capacity, domains, governance, cost reporting, certified items, security. 3-6 people at most organizations.
  • Business-unit data teams. Own domain workspaces, build Gold marts, build certified Power BI datasets. 1-3 people per business unit.
  • Makers and analysts. Build reports in personal workspaces, promote to business-unit workspace when ready, certified by platform when it graduates to tenant-wide.
  • Data engineering guild. Horizontal community of practice across business units. Shares notebooks, pipeline patterns, and lessons. Monthly guild call.
  • Governance council. Quarterly. Reviews cost, risk, capacity, roadmap. Chaired by the Fabric admin. Members include data protection, security, finance, and the CDO or equivalent.

Skip any of these five and the weak link becomes the bottleneck within two quarters. Invest in all five early and Fabric becomes a durable platform rather than an expensive experiment.

Fabric in one sentence - and what it means for your roadmap

Fabric is Microsoft's bet that the future of analytics is a single lake, a single SKU, and a single governance plane. If the bet pays off, every organization eventually lands here. If it doesn't, Fabric becomes a friendlier face on the same Synapse-plus-Power-BI stack you already know - which is still useful.

Either way, the move is the same: land your data in OneLake once, govern it once, and let different engines serve different workloads. Do this and your platform becomes portable, your costs become visible, and your analysts stop arguing about the canonical source. Skip it and you end up with the same silo war you had in 2018 with a more expensive logo.

Start small - one domain, one capacity, one certified dataset. Prove value in 90 days. Scale from success. That is the path that worked for every customer I have seen succeed, and the path that the successful ones always wish they had started sooner.

Tools that pair well

  • dbt for Fabric: Delta Lake transformations with modular SQL and tests.
  • Great Expectations: data quality checks on Silver/Gold.
  • Git integration: Fabric items now version in GitHub/Azure DevOps natively.
  • Tabular Editor 3: heavy-duty semantic model editing.
  • DAX Studio: query profiling.
  • VS Code + Fabric extension: local development for notebooks and pipelines.
  • Open-source Delta-rs: read/write Delta tables from Python/Rust for ad-hoc local work.
  • DuckDB: query Parquet files from OneLake for prototyping without spinning capacity.

Frequently Asked Questions

Can I use Fabric without Power BI Premium?

Yes. Any F-SKU can host Power BI content, and report authors still need a Power BI Pro license. At F64 and above, viewers can consume reports without a paid per-user license, which is why those SKUs effectively replace Premium Per Capacity (P-SKUs). Below F64, report consumers also need Pro.

Is Fabric a replacement for Databricks?

For many mid-market and Power BI-centric organisations, yes. For serious data engineering, ML/AI platforms, and open-source-heavy teams, Databricks still has depth advantages. The two coexist well: OneLake shortcuts can point at Databricks Unity Catalog tables and vice versa.

How is Fabric priced versus the old Synapse + Power BI Premium combo?

For equivalent compute, Fabric's unified F-SKU is roughly the same total cost as separate Synapse + Power BI Premium licenses, but utilisation is far better because unused capacity in one workload absorbs bursts in another. Most adopters save 10-25% at equal coverage.

Can I shortcut to AWS S3 data?

Yes. OneLake shortcuts can virtualise data from S3, ADLS Gen2, Google Cloud Storage, Dataverse, and other Fabric workspaces without copying. Permissions and egress fees apply at the source.

What happens if I exceed my capacity?

Capacity smoothing absorbs short bursts by spreading them over up to 24 hours. Sustained overage triggers throttling: individual operations may be delayed or rejected. The Capacity Metrics app shows exactly when and where throttling happens. The fix is to scale up, optimize the workload, or enable autoscale where it is offered (Spark-based workloads such as Data Science).

Is Fabric data really in Delta Lake format?

Yes. Lakehouse and Warehouse tables persist to Delta in OneLake, and KQL DB tables can be made available there as Delta. External tools that can read Delta (Databricks, Spark, Trino, DuckDB with delta-kernel-rs, Python with deltalake) can read Fabric data directly. This is a genuine open-format commitment.
