North America · Packaging Manufacturing
Korpack
“The production manager would open Power BI at 8am. The data was from yesterday.”
Semantic model refresh slashed 90% - from 1.5 hours to 15 minutes with Microsoft Fabric.
Key Results
15 min
Semantic model refresh
Was 1.5 hours
−85%
Pipeline runtime
Was 3–4 hours
Real-time
Data freshness
As soon as data lands
1
Storage layer
OneLake replaces 4 separate stores
Tech Stack
The Situation
At Korpack, the analytics platform existed - Power BI was deployed, dashboards were built, KPIs were defined. The problem was that by the time anyone opened a dashboard, the data was already hours old. The semantic model took 1.5 hours to refresh. The Azure Data Factory pipeline that fed it ran for 3–4 hours. And because there was no Change Data Capture, every single run processed the entire dataset from scratch - regardless of how much had actually changed. The result was a BI platform that the business had lost confidence in. "Is this live data or yesterday's?" became a standard question before anyone acted on a number. When BI data can't be trusted to be current, people stop using it - and go back to calling the ERP directly or asking someone in operations.
In manufacturing and packaging operations, this pattern is extremely common:
- ✓
Your dashboards take 1–2 hours to refresh - so they're almost never showing live data
- ✓
Your ETL or data pipelines run for hours even when very little has actually changed
- ✓
People have stopped trusting the BI tool because they're never sure if the data is current
- ✓
You're running full dataset refreshes because nobody set up incremental loading from the start
- ✓
You've moved to a modern BI tool but the underlying architecture is still batch-based and slow
- ✓
Operations decisions default back to ERP or direct system queries because "the dashboard is probably out of date"
If three or more of these describe your operation, you're looking at the right case study.
The Root Problem
- 1
Power BI semantic model refresh took 1.5 hours - meaning dashboards were stale for most of the working day
- 2
Azure Data Factory pipelines processed the entire dataset from scratch on every run (3–4 hours each)
- 3
No Change Data Capture meant 85–90% of each pipeline run was redundant - reprocessing unchanged records
- 4
Business users had lost confidence in the BI platform because they couldn't rely on data being current
- 5
Operations and finance decisions were reverting to direct ERP queries, undermining the BI investment
How We Fixed It
Diagnose the architecture before touching anything
The first step was understanding exactly why the pipeline was slow - not assuming. We traced data from source to dashboard and found two root causes: the ADF pipeline had no incremental logic (full table scans every run), and the Power BI semantic model was using Import mode which required a full data reload on every refresh. Both were fixable without replacing the underlying data.
Migrate to Microsoft Fabric - OneLake as the single storage layer
We migrated the platform to Microsoft Fabric, collapsing what had been separate compute and storage layers into one. OneLake became the single storage destination for all data, eliminating the copy-move-transform cycles that were adding latency at every step. Fabric's native integration between ADF, Spark, and Power BI also removed the handoff overhead between tools.
Implement Change Data Capture on Azure SQL
CDC was enabled on all Azure SQL source tables. Now ADF pipelines only process rows that have actually changed since the last run - not the full dataset. A pipeline that previously scanned 40 million rows now scans 15,000 changed rows. That is where the 85% time reduction came from.
Rebuild semantic models in Direct Lake mode
Power BI semantic models were rebuilt using Direct Lake mode - which queries OneLake directly without importing data into a separate engine. The 1.5-hour import refresh cycle was eliminated entirely. Dashboards now reflect data as soon as it lands in OneLake - no separate refresh job needed.
Implement Hub & Spoke to enforce clean data boundaries
A Hub & Spoke architecture was put in place within Fabric: a central Hub containing curated, governed data, with department-specific Spokes consuming from it. This ensured incremental processing was maintained at every layer and gave the business a scalable pattern for adding new data sources without rebuilding the core.
Measured Outcomes
Semantic model refresh
1.5 hours
15 minutes (−90%)
↑ Key win
ADF pipeline runtime
3–4 hours
15–20 minutes (−85%)
↑ Key win
Data freshness
Hours stale
Near real-time
Processing method
Full table scans every run
CDC incremental only
BI trust / adoption
Low - "is this current?"
High - live data, verified
What This Means For You
What this means if your BI platform feels slow or untrustworthy
Slow pipelines and stale dashboards are almost never a tooling problem - they're an architecture problem. Power BI, Fabric, and ADF are all capable of near-real-time data at scale. But if the underlying pattern is "full table scan → import refresh", speed is structurally impossible regardless of compute power you throw at it. The fix - CDC plus Direct Lake - is well-established and deployable without replacing your existing BI investments. If your team has stopped trusting the dashboard, that's the symptom. The root cause is almost certainly in the pipeline architecture.
Next Step
Is this your situation?
Book a 30-minute call. No slides, no pitch. We'll look at your specific setup, tell you what's causing the problem, and what a realistic fix looks like - including timeline and cost range.