Delta Lake on Microsoft Fabric: Migration Gotchas & Fixes

Migrating Delta Lake workloads to Microsoft Fabric is not a lift-and-shift. The OneLake storage model, the way Fabric handles Delta table registration, and the quirks around external shortcuts mean there are specific failure modes that appear only after you go live. Here are the ones we hit in production.

Amit Kumar Singh - Technology Consulting Partner at MyData Insights

14+ years in industrial data · Former Accenture & EY · GCC, India, SEA

25 May 2026 · 12 min read

The bottom line

Delta Lake on Fabric works well but has specific migration traps: ADLS shortcuts need explicit Delta log paths, schema evolution settings differ from Databricks defaults, and the Lakehouse SQL endpoint needs table registration before Power BI can read via Direct Lake.

Why Migrate Delta Lake to Fabric

Most organisations arriving at Microsoft Fabric already have Delta Lake data somewhere — usually Azure Data Lake Storage Gen2 (ADLS), sometimes Databricks, occasionally Azure Synapse Analytics. The migration to Fabric is attractive because Fabric unifies the analytical layer: one platform for data engineering, BI, and AI instead of three. The OneLake storage model means a single Delta table is queryable by Spark notebooks, the SQL analytics endpoint, Power BI via Direct Lake, and the Fabric Data Agent without copying data. That is the architectural win.

What the documentation does not adequately cover is the specific ways the migration fails in practice. We have run this migration for clients in manufacturing, FMCG, and logistics where the source data lives in ADLS Gen2 with existing Delta tables, and the same set of issues appears every time.

OneLake Shortcuts: The First Trap

OneLake Shortcuts let you point a Fabric Lakehouse at external data in ADLS, S3, or GCS without copying it. The trap is that when you create a Shortcut under Files to an ADLS container that holds a Delta table, Fabric does not automatically recognise it as a Delta table. The Shortcut creates a file path reference, not a managed Delta table. In the Lakehouse Files explorer the folder appears, but it does not show up in the Tables section, and the SQL analytics endpoint cannot query it.

The fix is explicit Delta table registration. After creating the Shortcut, you need to run a notebook command to register the table against the SQL endpoint. Only after this step does the table appear in the SQL endpoint and become queryable by Power BI Direct Lake.

OneLake Shortcuts are a pointer to data, not a Delta table registration. You always need the explicit CREATE TABLE step to make the table visible to the SQL endpoint.
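As a minimal sketch of that registration step, assuming a Fabric notebook attached to the Lakehouse where spark is the session the notebook provides: the helper below only builds the SQL text. The table name sales_orders and the Shortcut path Files/adls_sales/orders are hypothetical placeholders, and the relative path is assumed to resolve against the notebook's default Lakehouse.

```python
def registration_sql(table_name: str, delta_path: str) -> str:
    """Build a CREATE TABLE statement that registers an existing Delta folder.

    No data is copied: the statement points the catalog at the folder that
    already contains the _delta_log directory.
    """
    # Refuse quotes rather than guess at escaping rules for the SQL literal.
    if "'" in delta_path:
        raise ValueError("delta_path must not contain single quotes")
    return (
        f"CREATE TABLE IF NOT EXISTS {table_name} "
        f"USING DELTA LOCATION '{delta_path}'"
    )


# Hypothetical names: replace with your own table and Shortcut path.
statement = registration_sql("sales_orders", "Files/adls_sales/orders")
print(statement)
# In a Fabric notebook you would then run: spark.sql(statement)
```

Keeping the statement builder separate from the spark.sql call makes it easy to review the generated DDL for every table before anything touches the catalog.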

Table Registration and the SQL Endpoint

The Fabric Lakehouse SQL analytics endpoint is an auto-generated, read-only SQL endpoint that exposes all managed Delta tables in the Lakehouse. "Managed" here means tables created with a CREATE TABLE statement through spark.sql, or written through the DeltaTable.createOrReplace() builder in PySpark, not just files that happen to be in Delta format sitting in the Files section. This regularly catches teams migrating from Databricks who expect the table catalog to auto-populate.

Another gap: if you are migrating from Azure Synapse Analytics, Synapse external tables defined over ADLS do not migrate to Fabric automatically. You need to re-create the table definitions in the Fabric Lakehouse using the CREATE TABLE syntax. The column definitions, partition columns, and Delta properties all need to be explicitly set if they differ from defaults.
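To illustrate what "explicitly set" looks like, here is a small sketch that renders a full CREATE TABLE statement from a recorded definition. The table plant_readings, its columns, and the chosen property value are hypothetical; the point is only that columns, partition columns, and Delta properties are all written out rather than inherited.

```python
from typing import Dict, List, Tuple


def create_table_ddl(
    table: str,
    columns: List[Tuple[str, str]],
    partition_by: List[str],
    properties: Dict[str, str],
) -> str:
    """Render an explicit Delta CREATE TABLE statement."""
    # Backticks keep column names with unusual characters valid in Spark SQL.
    cols = ", ".join(f"`{name}` {dtype}" for name, dtype in columns)
    ddl = f"CREATE TABLE IF NOT EXISTS {table} ({cols}) USING DELTA"
    if partition_by:
        parts = ", ".join(f"`{c}`" for c in partition_by)
        ddl += f" PARTITIONED BY ({parts})"
    if properties:
        # Sort keys so the generated DDL is stable between runs.
        props = ", ".join(f"'{k}' = '{v}'" for k, v in sorted(properties.items()))
        ddl += f" TBLPROPERTIES ({props})"
    return ddl


# Hypothetical definition recreated from a Synapse external table.
ddl = create_table_ddl(
    "plant_readings",
    [("plant_id", "STRING"), ("reading_ts", "TIMESTAMP"), ("value", "DOUBLE")],
    ["plant_id"],
    {"delta.logRetentionDuration": "interval 30 days"},
)
print(ddl)
```

Generating the DDL from a recorded definition means the same inventory can drive both the migration and a later audit of what was actually created.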

Schema Evolution Differences

Databricks enables schema evolution on Delta tables by default in many notebook contexts. Fabric does not. If your existing Delta pipelines rely on the mergeSchema writer option (.option("mergeSchema", "true") on the DataFrameWriter) or on the session-level auto-merge setting (spark.databricks.delta.schema.autoMerge.enabled, which is a Spark session configuration rather than a table property), set them explicitly in Fabric. Do not assume a Databricks session default for auto-merge carries over to Fabric Spark; set mergeSchema on each write operation explicitly.
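A minimal sketch of an explicit schema-evolving append follows. It assumes df is a PySpark DataFrame and that target_path points at an existing Delta table; the function name is hypothetical. Nothing here depends on a session default.

```python
def append_with_schema_evolution(df, target_path: str) -> None:
    """Append df to a Delta table, allowing new columns to be merged in.

    df is expected to be a pyspark.sql.DataFrame; it is not type-annotated
    so the helper can be imported without a Spark installation.
    """
    (
        df.write.format("delta")
        .mode("append")
        # Stated on the write itself, so it does not rely on session config.
        .option("mergeSchema", "true")
        .save(target_path)
    )
```

Wrapping the write in one helper gives a single place to audit when a pipeline that worked on Databricks starts rejecting new columns after the move.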

Column mapping mode is another schema evolution gotcha. Databricks handles column names containing spaces or special characters through Delta column mapping (the delta.columnMapping.mode table property). If your source tables use column mapping and you migrate the Parquet files to Fabric without explicitly setting this property, column reads will fail with cryptic errors, because the physical column names stored in the Parquet files can differ from the logical names the table exposes. Always check the source Delta table properties before migration using DESCRIBE EXTENDED (or SHOW TBLPROPERTIES) in Databricks and recreate any non-default properties in Fabric.
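The check can be sketched as a small helper that takes the properties recorded from the source table and returns what must be carried over for column mapping. It assumes the properties were collected into a plain dict; the function name is hypothetical. The protocol versions reflect the Delta protocol requirement that column mapping needs reader version 2 and writer version 5.

```python
from typing import Dict


def column_mapping_properties(source_props: Dict[str, str]) -> Dict[str, str]:
    """Return the table properties needed to preserve column mapping.

    An empty dict means the source table does not use column mapping.
    """
    mode = source_props.get("delta.columnMapping.mode", "none")
    if mode == "none":
        return {}
    return {
        "delta.columnMapping.mode": mode,
        # Column mapping requires at least these protocol versions.
        "delta.minReaderVersion": "2",
        "delta.minWriterVersion": "5",
    }


# Properties as they might be recorded from the source table (illustrative).
recorded = {"delta.columnMapping.mode": "name", "delta.appendOnly": "false"}
print(column_mapping_properties(recorded))
```

The returned dict can be passed into whatever builds the TBLPROPERTIES clause for the recreated table.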

Direct Lake Mode Gotchas

Direct Lake mode in Power BI reads Delta tables from OneLake directly without import, giving import-speed query performance on live data. The current per-table row guardrails are 1.5 billion rows at F64 and 300 million rows at F32, scaling up with capacity SKU. There is no support for calculated columns that reference DirectQuery sources. Composite Direct Lake + Import semantic models are in public preview as of May 2026, so mixed-mode tables in the same model are now possible — but not yet GA. Tables that exceed the SKU row guardrail fall back to DirectQuery against the SQL endpoint, with the corresponding performance drop.
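A quick pre-flight check against those guardrails might look like the sketch below. It encodes only the two figures quoted above (F32 and F64); other SKUs are deliberately left out, and the table names and row counts are made up for illustration.

```python
from typing import Dict, List

# Per-table row guardrails quoted in this article; extend from current
# Microsoft documentation for other SKUs rather than extrapolating.
ROW_GUARDRAILS = {"F32": 300_000_000, "F64": 1_500_000_000}


def tables_over_guardrail(row_counts: Dict[str, int], sku: str) -> List[str]:
    """Return the tables whose row count exceeds the guardrail for the SKU."""
    if sku not in ROW_GUARDRAILS:
        raise KeyError(f"No guardrail recorded for SKU {sku!r}")
    limit = ROW_GUARDRAILS[sku]
    return sorted(name for name, rows in row_counts.items() if rows > limit)


# Illustrative counts only.
counts = {"telemetry": 2_100_000_000, "orders": 450_000_000, "plants": 1_200}
print(tables_over_guardrail(counts, "F64"))  # ['telemetry']
print(tables_over_guardrail(counts, "F32"))  # ['orders', 'telemetry']
```

Any table this returns is a candidate for aggregation, partition pruning into a smaller serving table, or a larger capacity before the semantic model is built.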

The fallback to DirectQuery is the most common surprise. When a Direct Lake model hits a query it cannot serve from the in-memory column store (usually a complex DAX measure or a query that exceeds the row limit), it falls back to DirectQuery against the SQL endpoint. The performance difference shows up in Power BI Performance Analyzer as queries taking 3-5 seconds instead of under 200 ms. The fix is to optimise the Delta table: run OPTIMIZE with ZORDER on the columns most commonly filtered, and check that the Lakehouse table statistics are up to date by running ANALYZE TABLE.
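The maintenance step can be sketched as a helper that emits the two statements per table. The table and column names are hypothetical, and the exact ANALYZE TABLE form shown (COMPUTE STATISTICS FOR ALL COLUMNS) is standard Spark SQL that is assumed, not confirmed, to be accepted for Delta tables on your Fabric runtime.

```python
from typing import List


def maintenance_statements(table: str, zorder_columns: List[str]) -> List[str]:
    """Build OPTIMIZE ... ZORDER BY and ANALYZE TABLE statements for a table."""
    if not zorder_columns:
        raise ValueError("Provide at least one column to Z-order by")
    cols = ", ".join(zorder_columns)
    return [
        # Compacts small files and co-locates rows by the filter columns.
        f"OPTIMIZE {table} ZORDER BY ({cols})",
        # Refreshes column statistics (assumed syntax, check your runtime).
        f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS",
    ]


# Hypothetical table and filter columns.
for stmt in maintenance_statements("sales_orders", ["order_date", "plant_id"]):
    print(stmt)
    # In a Fabric notebook: spark.sql(stmt)
```

Z-ordering only pays off on columns that reports actually filter on, so the column list is worth deriving from real query patterns rather than guessing.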

A Migration Sequence That Holds Up in Production

The order that avoids the late surprises is consistent. First, inventory the source: run DESCRIBE EXTENDED on every Delta table in ADLS or Databricks and record the non-default properties — column mapping, deletion vectors, auto-merge, partition columns. That inventory is the migration plan; the failures in this guide all trace back to a property nobody recorded. Second, decide per table whether it is a OneLake Shortcut (data stays in ADLS, Fabric reads in place) or a managed copy into the Lakehouse — Shortcuts for data you do not own the write path to, managed tables for anything Fabric will OPTIMIZE or VACUUM.
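The per-table Shortcut-versus-managed decision can be written down as a small rule, sketched here. The TablePlan fields and the function name are hypothetical. Two choices are assumptions of this sketch rather than rules from the text: tables where you own the write path but Fabric runs no maintenance default to a Shortcut, and the contradictory case is rejected outright.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TablePlan:
    name: str
    owns_write_path: bool   # does your team control every writer to the table?
    fabric_maintains: bool  # will Fabric run OPTIMIZE or VACUUM on it?


def storage_choice(plan: TablePlan) -> str:
    """Return 'shortcut' or 'managed' for a table, per the rule above."""
    if plan.fabric_maintains and not plan.owns_write_path:
        # Fabric would mutate a Delta log that another system also writes.
        raise ValueError(
            f"{plan.name}: Fabric maintenance on a write path you do not own"
        )
    if plan.fabric_maintains:
        return "managed"
    # Assumption of this sketch: with no Fabric maintenance planned,
    # reading in place via a Shortcut is the default.
    return "shortcut"


print(storage_choice(TablePlan("erp_orders", False, False)))
print(storage_choice(TablePlan("telemetry", True, True)))
```

Recording the decision as data alongside the property inventory makes it reviewable before any Shortcut or copy job is created.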

Third, register every table explicitly against the SQL endpoint with CREATE TABLE — Shortcuts and stray Delta files in the Files section do not auto-populate the catalog, and Power BI Direct Lake cannot see a table the SQL endpoint cannot. Fourth, set schema-evolution and column-mapping properties explicitly on write, because Fabric does not inherit the Databricks session defaults. Only then build the Power BI Direct Lake semantic model, and immediately run OPTIMIZE with ZORDER on the common filter columns plus ANALYZE TABLE so the first user query lands in the column store, not a DirectQuery fallback.

Validate before you cut over. Confirm row counts match the source, run the heaviest report through Performance Analyzer to prove sub-200ms reads, and check no table silently exceeded the SKU row guardrail. A migration that skips the validation pass looks done and fails in week two — which is exactly when the business has started trusting it.
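The row-count part of that validation pass is easy to automate. The sketch below assumes the counts have already been collected into two dicts (for example from COUNT(*) queries on each side); the table names and numbers are illustrative.

```python
from typing import Dict, Optional, Tuple


def row_count_mismatches(
    source: Dict[str, int], target: Dict[str, int]
) -> Dict[str, Tuple[int, Optional[int]]]:
    """Return {table: (source_rows, target_rows)} for every table that differs.

    A target value of None means the table is missing on the Fabric side.
    """
    mismatches = {}
    for table, expected in source.items():
        actual = target.get(table)
        if actual != expected:
            mismatches[table] = (expected, actual)
    return mismatches


# Illustrative counts only.
source_counts = {"orders": 1_000, "plants": 42, "telemetry": 5_000_000}
fabric_counts = {"orders": 1_000, "plants": 41}
print(row_count_mismatches(source_counts, fabric_counts))
```

An empty result is a necessary condition for cut-over, not a sufficient one: matching counts do not prove matching values, so pair it with the Performance Analyzer and guardrail checks.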

Where This Still Breaks

Governance metadata does not travel. Databricks Unity Catalog carries column-level security, lineage, and tag properties that have no automatic equivalent in Fabric — the Delta files themselves are fully compatible, but the access model is not. If row/column security was enforced in Unity Catalog, you rebuild it in Fabric with workspace roles, OneLake data-access roles, and semantic-model RLS before go-live, or you ship a migration that quietly drops your security posture.

Shared write paths are the second trap. A OneLake Shortcut that lets Fabric Spark run VACUUM or OPTIMIZE against an ADLS location still owned by another pipeline means two engines mutating the same Delta log — a recipe for corrupted transactions. Decide a single writer per table, and keep Fabric read-only on any Shortcut whose source another system still writes.
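The single-writer rule can be checked mechanically once the writers are listed. In the sketch below, the engine labels and storage paths are hypothetical; the input is simply a list of (engine, Delta path) pairs gathered from your pipeline inventory.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def multi_writer_paths(writers: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Return the Delta paths that more than one engine writes to."""
    engines_by_path = defaultdict(set)
    for engine, path in writers:
        # Normalise trailing slashes so the same folder is not counted twice.
        engines_by_path[path.rstrip("/")].add(engine)
    return {
        path: sorted(engines)
        for path, engines in engines_by_path.items()
        if len(engines) > 1
    }


# Hypothetical inventory of which engine writes where.
inventory = [
    ("databricks_job", "abfss://lake@acct.dfs.core.windows.net/orders/"),
    ("fabric_spark", "abfss://lake@acct.dfs.core.windows.net/orders"),
    ("fabric_spark", "abfss://lake@acct.dfs.core.windows.net/plants"),
]
print(multi_writer_paths(inventory))
```

Every path this reports needs a decision about which engine keeps write access before Fabric is allowed to run VACUUM or OPTIMIZE there.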

And the honest "do not migrate yet" cases: a heavy interactive Spark ML workload that depends on Databricks-specific runtime features may not have a clean Fabric equivalent today, and a handful of bleeding-edge Delta 4.x table properties still need to be removed before the tables can be read on older Fabric runtimes. Check each flag against your target runtime. Fabric unifies the estate well, but lift-and-shift assumes a compatibility that the property inventory exists to verify.
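Checking each flag against the target runtime can be sketched as a set difference. The supported-feature set below is a placeholder for illustration, not a statement about what any actual Fabric runtime supports; fill it in from the documentation for your runtime version.

```python
from typing import Dict, List, Set


def blocking_features(
    table_features: Dict[str, Set[str]], runtime_supported: Set[str]
) -> Dict[str, List[str]]:
    """Return, per table, the Delta features the target runtime lacks."""
    report = {}
    for table, features in table_features.items():
        missing = sorted(features - runtime_supported)
        if missing:
            report[table] = missing
    return report


# Features recorded during the source inventory (illustrative).
recorded_features = {
    "orders": {"columnMapping"},
    "events": {"deletionVectors", "variantType"},
}
# Placeholder only: replace with the features your target runtime supports.
supported = {"columnMapping", "deletionVectors"}
print(blocking_features(recorded_features, supported))
```

Tables that appear in the report either wait for a newer runtime or have the blocking feature removed at the source before migration.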

The Delta files are compatible; the catalog, security, and table properties are not. The migration work is recreating those — which is why a property inventory, not a data copy, is the real first step.

What This Means for the Data Lead

The architectural prize is real: one platform for data engineering, BI, and AI instead of three, with a single Delta table readable by Spark, the SQL endpoint, Power BI Direct Lake, and the Fabric Data Agent without copies. That consolidation is usually where the cost and operational case for the move lives. But it is earned through the property-and-catalog work, not delivered by a lift-and-shift — the teams that treat it as a copy job are the ones debugging cryptic column-read errors in production.

Scope the timeline by schema complexity, not data volume. Ten to twenty tables with straightforward schemas and no Unity Catalog dependencies is a 2–4 week migration; fifty-plus tables with column mapping, custom properties, and active pipeline rebuilds runs 8–16 weeks. Phase it — migrate one well-understood domain end to end, prove Direct Lake performance and security parity, then move the rest on the validated runbook.

Inventory the properties, choose Shortcut versus managed per table, register explicitly, rebuild the security that Unity Catalog held, validate, then cut over. Do that and Delta Lake on Microsoft Fabric delivers the unified estate it promises. Skip the inventory and the migration is not faster — it just moves the failures from your test plan to your production reports.

Delta Lake migrations to Fabric are worth doing — the architecture simplification is genuine and the Direct Lake mode for Power BI is a real performance improvement over DirectQuery. The gotchas are specific and avoidable if you know they are coming. If you are planning a migration from Databricks, ADLS, or Synapse to Fabric and want a technical review of your specific setup, I am happy to work through it.
