Power BI Copilot Needs a Clean Semantic Model

Copilot in Power BI looks magical in the demo. Then it meets a real semantic model with 40 measures named Measure 1 through Measure 40, and the magic produces confident wrong answers.

Amit Kumar Singh - Technology Consulting Partner at MyData Insights

13+ years in industrial data · Former Accenture & EY · GCC, India, SEA

24 May 2026 · 9 min read

The bottom line

Power BI Copilot is production-ready — but only on a hardened semantic model. Garbage measures, ambiguous columns and missing synonyms produce confidently wrong answers. Clean the model first; enable Copilot second.

Introduction

Most Copilot in Power BI pilots fail before anyone types a prompt. Not because the feature is broken — because the semantic model underneath it was never properly built.

Copilot does not know what your business means by "revenue." It does not know that your SAP S/4HANA has three different document types that all appear as sales orders. It does not know that your OTIF calculation has a different denominator depending on which plant is reporting. It reads your model and answers based on what it finds. If what it finds is inconsistent, it will give you a confident answer built on inconsistent logic — faster than any analyst would.

The contrarian position worth stating clearly

AI amplifies the quality of the data underneath it. Always — in both directions. If the semantic model is clean, governed, and aligned with how the business actually talks about performance, Copilot becomes a genuinely useful tool for an analyst or an operations director who wants to query performance without writing DAX.

If the model has duplicate measures with different logic, tables joined on the wrong grain, and column names that made sense to the developer who built them eighteen months ago, Copilot will produce wrong answers confidently and frequently. That is worse than no Copilot at all, because executives will trust the first few answers and then, once a discrepancy surfaces, spend months distrusting every subsequent one.

The premise of most Copilot pilots — "let's turn it on and see what it does" — is the wrong approach. The right question is: "Is our semantic model ready for an AI to reason over it, and would we trust the output if it were a junior analyst reading the same data?"

What "semantic model readiness" actually means for an industrial business

A Power BI semantic model for a manufacturing or FMCG business is not just a set of tables. It is a formalised set of business rules about how the organisation's data should be interpreted. When it is built correctly, it encodes:

- One definition of OTIF, agreed between finance, operations, and logistics, with a single denominator that applies consistently across plant sites
- One definition of OEE, calculated from MES downtime records, not from a production supervisor's manual log
- One definition of forecast accuracy, tracking demand forecast from SAP S/4HANA planning at the SKU-month level against actual shipped quantity
- Row-level security that ensures a plant manager in Chennai sees their plant's data, not the entire group's margin figures
- Relationships between fact and dimension tables that are correct at the right grain, so an OTIF calculation does not accidentally double-count partial deliveries

When any of these are missing or inconsistent, Copilot does not fix them. It reasons over the model as-is and gives you answers that reflect its inconsistencies.
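To make the grain point concrete, here is a minimal Python sketch (not DAX, and not taken from any client model) that scores the same three shipment rows two ways. The record layout, the order-line IDs and the function names are invented for illustration. The assumed rule is that OTIF is judged per order line, and a line counts as a hit only when the full quantity has arrived by the promised date.

```python
from collections import defaultdict
from datetime import date

# Hypothetical delivery records: one row per shipment, so an order line
# delivered in two parts appears twice.
# (order_line, promised_date, delivered_date, ordered_qty, delivered_qty)
deliveries = [
    ("SO-1/10", date(2026, 3, 10), date(2026, 3, 9), 100, 60),
    ("SO-1/10", date(2026, 3, 10), date(2026, 3, 10), 100, 40),
    ("SO-2/10", date(2026, 3, 12), date(2026, 3, 15), 50, 50),
]


def otif_by_order_line(rows):
    """OTIF with the order line as the denominator, not the shipment."""
    lines = defaultdict(
        lambda: {"ordered": 0, "delivered": 0, "promised": None, "last": None}
    )
    for line_id, promised, delivered, ordered_qty, delivered_qty in rows:
        entry = lines[line_id]
        entry["ordered"] = ordered_qty
        entry["promised"] = promised
        entry["delivered"] += delivered_qty  # sum the partial deliveries
        # Keep the date of the latest shipment for this line.
        entry["last"] = delivered if entry["last"] is None else max(entry["last"], delivered)
    hits = sum(
        1
        for e in lines.values()
        if e["delivered"] >= e["ordered"] and e["last"] <= e["promised"]
    )
    return hits / len(lines)


def otif_by_shipment(rows):
    """Naive version: every shipment row is treated as its own delivery."""
    hits = sum(
        1
        for _, promised, delivered, ordered_qty, delivered_qty in rows
        if delivered_qty >= ordered_qty and delivered <= promised
    )
    return hits / len(rows)
```

On this sample the order-line version returns 0.5 (one of two lines was complete and on time), while the per-shipment version returns 0.0, because neither partial shipment covers the full quantity on its own. The data is identical and the grain differs, so the "OTIF" differs. A model that leaves the grain ambiguous leaves Copilot to pick one.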

The most common failure is duplicate measures. An organisation builds a Power BI report in 2023. Someone builds another in 2024 with slightly different revenue logic. Both measures end up in the model as "Revenue" under near-identical display names. Copilot picks one. Nobody knows which. The executive asks Copilot for last quarter's revenue and gets a number that does not match the board report. Trust in both Copilot and Power BI collapses at the same moment.
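A check for this kind of collision can be run before Copilot is switched on. The sketch below is one hedged way to do it in Python, assuming the measure names and expressions have already been exported from the model into a simple list. The export format, the sample measures and the helper names are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical export of measure metadata (name, DAX expression as text).
measures = [
    {"name": "Revenue", "expression": "SUM(Sales[NetAmount])"},
    {"name": "Revenue (2024)", "expression": "SUM(Sales[GrossAmount]) - SUM(Sales[Discount])"},
    {"name": "Measure 7", "expression": "COUNTROWS(Orders)"},
    {"name": "OTIF %", "expression": "DIVIDE([OTIF Lines], [Order Lines])"},
]


def base_name(name):
    """Drop parenthesised suffixes, punctuation and case to get a comparable key."""
    name = re.sub(r"\(.*?\)", "", name)
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


def audit_measures(items):
    """Return (conflicts, generic): same-name measures with different logic,
    and measures still carrying a default-style name."""
    groups = defaultdict(list)
    for m in items:
        groups[base_name(m["name"])].append(m)
    conflicts = {
        key: [m["name"] for m in group]
        for key, group in groups.items()
        if len({m["expression"] for m in group}) > 1
    }
    generic = [
        m["name"]
        for m in items
        if re.fullmatch(r"measure\s*\d*", m["name"].strip(), re.IGNORECASE)
    ]
    return conflicts, generic
```

On the sample list, `audit_measures(measures)` groups "Revenue" and "Revenue (2024)" as a conflict and flags "Measure 7" as a generic name. It compares expression text only, so it will not catch two measures that are written differently but mean the same thing. Treat it as a first-pass filter before the conversation with finance, not a replacement for it.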

The foundation layer: Microsoft Fabric, OneLake, and Direct Lake

Copilot in Power BI does not operate on raw SAP S/4HANA data. The data has to travel through a pipeline, land somewhere structured, and be modelled before Copilot can reason over it.

For mid-market industrial businesses, the foundation that makes this work reliably is Microsoft Fabric with OneLake as the storage layer and Delta Lake as the open table format. Azure Data Factory pipelines pull from SAP S/4HANA — financial documents, sales orders, production confirmations, goods movements — and from other sources (SAP ByDesign procurement data, Dynamics 365 customer records, MES output tables) into OneLake as Delta Parquet files.

From there, the Fabric transformation layer builds the curated tables that the semantic model reads from. Not raw SAP extracts — curated, governed, validated tables where a "confirmed customer order" has a consistent definition applied at the pipeline level, before it ever reaches Power BI.

Direct Lake is the connection mode that makes this performant at scale. Instead of importing data into the Power BI model (which introduces a refresh lag and a data size ceiling), Direct Lake reads directly from the OneLake Delta tables, so the model is as current as the last pipeline write rather than the last scheduled import. For an operations director who wants to see this morning's production performance against yesterday's OEE baseline, this is the difference between a live operational tool and a reporting artefact that is already twelve hours out of date by the time it loads.

ADF pipeline design matters here. An SAP S/4HANA extraction that pulls full loads every four hours is structurally different (in complexity, cost, and latency) from a change-data-capture approach that streams incremental updates. The right choice depends on your SAP landscape, your Fabric capacity tier, and what "live" actually needs to mean for the specific decisions the dashboard supports. There is no universal answer, and any consultant who tells you otherwise has not built one of these in a real SAP environment.
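The trade-off between reloading everything on a schedule and moving only what changed can be sketched in a few lines of Python. This is a simplified stand-in, not an ADF pipeline and not true change data capture: it uses a last-changed timestamp as a watermark, which is a common lighter-weight alternative and will miss hard deletes. The field names and sample documents are hypothetical.

```python
from datetime import datetime

# Hypothetical source rows carrying a last-changed timestamp.
source = [
    {"doc": "4500001", "changed_at": datetime(2026, 5, 1, 8, 0), "qty": 10},
    {"doc": "4500002", "changed_at": datetime(2026, 5, 1, 12, 30), "qty": 5},
    {"doc": "4500003", "changed_at": datetime(2026, 5, 1, 16, 45), "qty": 7},
]


def full_load(rows):
    """Scheduled full extract: every row is moved on every run."""
    return list(rows)


def incremental_load(rows, watermark):
    """Watermark extract: move only rows changed since the last run.

    Returns the changed rows and the new watermark to store for next time.
    """
    changed = [r for r in rows if r["changed_at"] > watermark]
    new_watermark = max((r["changed_at"] for r in changed), default=watermark)
    return changed, new_watermark
```

With a stored watermark of noon on 1 May 2026, `incremental_load` moves two rows where `full_load` moves all three, and a second run with the returned watermark moves none. The cost of that saving is state: the watermark has to be persisted, and late-arriving or deleted records need their own handling.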

Row-level security: the governance piece most organisations skip

Row-level security (RLS) in a Power BI semantic model is not optional in an industrial business with multiple plants, business units, or regional P&Ls. It is the mechanism that ensures a regional supply chain manager in the GCC sees their region's stock cover and OTIF performance — not the group's consolidated figures that include business units they have no visibility into.

Getting RLS wrong has two failure modes. The first is over-permissive: people see data they should not, which creates a data privacy and commercial sensitivity problem. The second is under-designed: the RLS implementation is so restrictive that senior leaders cannot get cross-business views when they legitimately need them, so they export to Excel and the governance model breaks down immediately.

RLS in Microsoft Fabric / Power BI is implemented at the semantic model level, applied via DAX filter expressions. In a multi-plant SAP S/4HANA environment, this means the user's plant assignments in SAP need to be surfaced and maintained in a Dataverse or Fabric table that the RLS logic reads from. When someone's plant responsibilities change in SAP, the Power BI access should update automatically — not require a manual ticket to the BI team.
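The mapping-table pattern is easier to reason about once it is written down. The Python sketch below imitates the filter logic only. In the model itself this would be a DAX filter expression on a role, typically keyed on the signed-in user (DAX provides `USERPRINCIPALNAME()` for that). The table contents, plant codes and email addresses are made up for illustration.

```python
# Hypothetical user-to-plant mapping, maintained from the SAP plant assignments.
# Users are assumed to be stored in lower case.
user_plants = [
    {"user": "chennai.manager@example.com", "plant": "IN01"},
    {"user": "gcc.supply@example.com", "plant": "AE01"},
    {"user": "gcc.supply@example.com", "plant": "SA01"},
]

# Hypothetical fact rows, one per plant.
facts = [
    {"plant": "IN01", "otif": 0.94},
    {"plant": "AE01", "otif": 0.91},
    {"plant": "SA01", "otif": 0.88},
]


def visible_rows(user, rows, mapping):
    """Return only the fact rows for plants mapped to this user."""
    allowed = {m["plant"] for m in mapping if m["user"] == user.lower()}
    return [r for r in rows if r["plant"] in allowed]
```

Here `visible_rows("gcc.supply@example.com", facts, user_plants)` returns the AE01 and SA01 rows, and a user with no mapping sees nothing. Access then follows the mapping table: change a row there and the user's view changes with it, with no edit to the model.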

This is not complex to build. It is, however, regularly not built at all — because the initial model was a proof-of-concept that became production without the governance layer ever being added.

Where the master data problem surfaces in Copilot answers

Broken master data in SAP S/4HANA is the most common reason Copilot in Power BI gives wrong numbers — and it is entirely invisible in the UI.

A customer record with duplicate entries. An item code that exists in two plant codes with different units of measure. A cost centre assignment that changed in SAP six months ago but was never corrected in the historical records. None of these issues are visible in the Power BI interface — they look like data. Copilot reads them as data. The answer Copilot produces is wrong, and the wrong answer looks exactly like a right answer.

Copilot is not a data cleansing tool. It will not flag master data inconsistencies. It will not warn you that three records exist for the same customer under slightly different names. It will aggregate them, calculate a number, and present it with the same confidence it would present a number built on clean data.
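Since Copilot will not raise these flags, something upstream has to. As a rough illustration, the Python sketch below looks for customer records whose names are nearly identical once punctuation, spacing and case are stripped. The customer IDs, names and the 0.9 threshold are assumptions for the example. A real master data check would also compare tax IDs, addresses and other attributes.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical customer master extract: (customer_id, name).
customers = [
    ("C1001", "Al Noor Trading LLC"),
    ("C1002", "Al-Noor Trading L.L.C."),
    ("C1003", "Bharat Polymers Pvt Ltd"),
]


def normalise(name):
    """Lower-case the name and drop everything except letters and digits."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def likely_duplicates(records, threshold=0.9):
    """Return ID pairs whose normalised names are at least `threshold` similar."""
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(records, 2):
        score = SequenceMatcher(None, normalise(name_a), normalise(name_b)).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b))
    return pairs
```

On the sample, C1001 and C1002 normalise to the same string and are returned as a likely duplicate pair. The pairwise comparison is fine for a few thousand records but grows quadratically, so a larger customer master would need blocking or a dedicated matching tool.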

The implication: before enabling Copilot in Power BI in a production environment, the master data in the source systems — SAP S/4HANA, SAP ByDesign, Dynamics 365 — needs to be in a state where you would trust a junior analyst to work from it. If the answer to that is "not quite," the Copilot programme needs to wait until the master data programme catches up.

What this looks like in practice

A manufacturing business with operations across three plant sites had been running Power BI reports for two years. The reports were trusted by the local site managers but regularly contradicted each other when the group CFO pulled a consolidated view. The root cause: each site's Power BI model had been built by a different report developer, each with slightly different OTIF and gross margin definitions.

The programme: rebuild the semantic model on a Microsoft Fabric / OneLake foundation, with a single set of agreed definitions established with finance and operations jointly. ADF pipelines from SAP S/4HANA into Delta Lake tables in OneLake. One semantic model with one OTIF measure, one OEE measure, Direct Lake connection for live data, and RLS aligned to the SAP plant authorisation hierarchy.

Copilot was enabled in the final week of the prototype phase, against the rebuilt model. The first prompt the CFO typed — "Show me OTIF for last quarter by plant" — returned a number that matched the manually reconciled board report for the first time in two years. That was the sign-off moment.

Where this approach doesn't fit

If your SAP S/4HANA master data is fundamentally broken — duplicate vendors, inconsistent material master units, plant assignments that do not reflect actual operational structure — the Microsoft Fabric / semantic model build will surface those inconsistencies rather than resolve them. The data remediation work in SAP is a prerequisite, not something that can be done in parallel with the analytics build.

This programme is also not right for businesses that do not have analytical consumers ready to use Copilot — where the culture of Power BI adoption is still low and dashboards are not regularly used in leadership meetings. Building a Copilot-ready semantic model on top of a reporting culture that does not yet trust Power BI produces an expensive, underused asset.

Six weeks to first value

In Discover (weeks one and two), we assess your current Power BI semantic models: measure consistency, RLS coverage, data freshness, and the ADF pipeline architecture feeding them. We identify the one metric where the data is clean enough and the business agreement is clear enough to build a Copilot-ready model quickly.

In Prototype (weeks three to six), we build or rebuild the semantic model for that metric on a Microsoft Fabric / OneLake foundation with Direct Lake, apply RLS, instrument the ADF pipeline from SAP S/4HANA or SAP ByDesign, and enable Copilot against the governed model. One metric. One clean answer. Demonstrated in your environment, not in a demo tenant.

The model decides Copilot's quality. Spend the four weeks hardening it. Then enable Copilot — and the plant manager who used to email the analyst at 7am gets the OEE number in plain English at the morning huddle.
