The Bottom Line
Predictive maintenance vendors will show you a demo with clean data, pre-labelled failure events, and a well-integrated CMMS. Your plant has none of that. The three problems that determine whether a PdM deployment succeeds - labelled training data scarcity, CMMS integration gaps, and poor asset selection - are almost never discussed in vendor conversations. Solving them is more valuable than any algorithm upgrade. Before you buy a PdM platform, fix the data layer underneath the one you already have.
The Labelled Training Data Problem
Supervised machine learning models - the type used in most predictive maintenance applications - require historical examples of the event they are trying to predict. For bearing failure prediction, the model needs sensor readings (vibration, temperature, current) from hundreds of historical bearing failures, labelled with the failure type and timing. Without this labelled dataset, the model cannot learn to distinguish pre-failure signatures from normal operational variation.
Most manufacturers do not have this data in structured form. They have maintenance work orders in a CMMS, written in free text by technicians who did not know they were creating training data. Converting those work orders into structured failure labels - with asset ID, failure mode, failure date, and the sensor readings that preceded it - is a data engineering and data quality task that typically takes 3–6 months before model training can begin.
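As a rough illustration of what "converting work orders into structured failure labels" involves, here is a minimal keyword-matching sketch in Python. It is a deliberately simple rule-based stand-in, not a recommended production method. The keyword map, the `FailureLabel` fields, and the `label_work_order` function are all hypothetical names chosen for this example. Real work orders will need a much richer vocabulary and human validation of every ambiguous record.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical keyword map: failure mode -> phrases a technician might write.
FAILURE_KEYWORDS = {
    "bearing_failure": ["bearing seized", "bearing noise", "replaced bearing"],
    "seal_leak": ["seal leak", "leaking seal", "replaced seal"],
    "motor_overheat": ["overheat", "thermal trip"],
}


@dataclass
class FailureLabel:
    asset_id: str
    failure_mode: str
    failure_date: date
    needs_review: bool  # True when the text matched more than one failure mode


def label_work_order(asset_id: str, closed_on: date, text: str) -> Optional[FailureLabel]:
    """Return a structured label, or None if no known failure phrase is found."""
    lowered = text.lower()
    matches = [
        mode
        for mode, phrases in FAILURE_KEYWORDS.items()
        if any(phrase in lowered for phrase in phrases)
    ]
    if not matches:
        return None
    # Ambiguous text (several modes matched) is kept but flagged for a human.
    return FailureLabel(asset_id, matches[0], closed_on, needs_review=len(matches) > 1)
```

A work order reading "Pump tripped. Replaced bearing on drive end." would come back as a `bearing_failure` label, while "Routine lubrication" would return `None`. The share of records that return `None` or `needs_review=True` gives an early, honest estimate of how much manual labelling effort remains.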
Vendors demonstrate PdM on assets with well-documented failure histories because the demo is designed to show the model working, not to show the implementation effort. The gap between a successful demonstration and a production deployment is the labelled data problem - and it is the most common reason PdM pilots fail to scale.
The first question to ask any PdM vendor is: "Where did your training data come from, and how long did it take to prepare?"
The CMMS Integration Gap
Predicting a failure is only useful if a work order gets created, the right parts are pre-positioned, and the maintenance technician receives the alert in a system they already use. Without CMMS integration, a PdM alert is a number on a screen that requires a human to notice it, interpret it, decide it is credible, and manually create a work order. In a busy plant, that human step is where most PdM value is lost.
A PdM prediction that triggers an automatic work order in the CMMS - with the right failure code, asset history, and recommended parts - closes the loop from insight to action. That integration is not straightforward. CMMS products differ in their data models, APIs, and criticality classification schemes, so the prediction-to-work-order link has to be built for the specific system in place. It is an engineering task that vendors routinely underscope in their implementation proposals.
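To make the prediction-to-work-order step concrete, the sketch below maps a model output onto a work order payload. Everything here is an assumption for illustration: the `Prediction` fields, the failure code and parts tables, the 0.8 probability threshold, and the 7-day urgency cut-off are hypothetical and would need to be replaced by the codes and rules your own CMMS uses. The sketch stops at building the payload; sending it depends entirely on the API of the CMMS in question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Prediction:
    asset_id: str
    failure_mode: str
    probability: float    # model confidence, 0.0 to 1.0
    days_to_failure: int  # estimated remaining days


# Hypothetical lookup tables; a real CMMS defines its own codes and parts lists.
FAILURE_CODES = {"bearing_failure": "MECH-BRG", "seal_leak": "MECH-SEAL"}
RECOMMENDED_PARTS = {"bearing_failure": ["bearing kit", "grease"]}


def build_work_order(pred: Prediction, threshold: float = 0.8) -> Optional[dict]:
    """Turn a prediction into a work order payload, or None if below threshold."""
    if pred.probability < threshold:
        return None  # not credible enough to raise automatically
    return {
        "asset_id": pred.asset_id,
        "failure_code": FAILURE_CODES.get(pred.failure_mode, "UNKNOWN"),
        "priority": "urgent" if pred.days_to_failure <= 7 else "planned",
        "recommended_parts": RECOMMENDED_PARTS.get(pred.failure_mode, []),
        "source": "pdm_model",
    }
```

Most of the real integration work is in the two lookup tables: agreeing how model failure modes map to the CMMS's failure codes and parts catalogue is usually where the scoping effort goes.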
The rule of thumb is: if the PdM implementation does not include CMMS integration in scope, the operational value is at risk. The prediction may be accurate. The response may still be manual, slow, and inconsistent.
Asset Selection: The Decision That Determines ROI
The most common asset selection mistake in PdM programmes is selecting assets based on sensor availability or equipment complexity rather than maintenance cost history and failure consequence. Vendors tend to pitch on the assets that are most instrumented - because those are the easiest to demonstrate. Operations teams are drawn to the most sophisticated equipment - because those seem most worthy of AI investment.
The correct starting point is a maintenance cost analysis. Which assets have generated the most unplanned downtime over the past 24 months? Which failures have caused the longest production stops? Which assets have the highest maintenance cost per unit of output? Those are your first targets - regardless of how many sensors they currently have.
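The maintenance cost analysis described above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the `WorkOrder` fields are hypothetical, the input is assumed to be already filtered to the past 24 months, and ranking by downtime first and cost second is one reasonable choice rather than the only one.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WorkOrder:
    asset_id: str
    unplanned: bool
    downtime_hours: float
    cost: float


def rank_assets(work_orders, top_n=5):
    """Rank assets by total unplanned downtime, then by total maintenance cost."""
    totals = defaultdict(lambda: {"downtime_hours": 0.0, "cost": 0.0, "failures": 0})
    for wo in work_orders:
        if not wo.unplanned:
            continue  # planned maintenance is not a PdM target
        entry = totals[wo.asset_id]
        entry["downtime_hours"] += wo.downtime_hours
        entry["cost"] += wo.cost
        entry["failures"] += 1
    ranked = sorted(
        totals.items(),
        key=lambda item: (item[1]["downtime_hours"], item[1]["cost"]),
        reverse=True,
    )
    return ranked[:top_n]
```

The `failures` count matters as much as the ranking: an asset at the top of the list with only one or two recorded failures may still lack enough events to train a supervised model.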
Starting with high-cost, high-frequency failures means the ROI is measurable within 12 months. Starting with complex, low-frequency failures means you may spend 18 months building a model that has never seen enough failure events to be reliable. Asset selection is not a technical decision - it is a business prioritisation decision, and it should be made from maintenance cost data, not engineering instinct.
What a PdM Foundation Actually Looks Like
Before the model comes the foundation the demo never shows. Sensor streams — vibration, temperature, current — land in Microsoft Fabric Real-Time Analytics, with OneLake holding the historian: years of granular signal the model will eventually learn from. Asset hierarchy and maintenance history come across from the CMMS and ERP via Azure Data Factory, so a reading can be tied to the specific asset, its criticality, and its failure record. This is the unglamorous data integration work that determines whether PdM is possible at all.
The labelled-data problem is largely a structuring problem, and this is where it gets solved. Technicians wrote failure history as free text in the CMMS without knowing they were creating training data; Copilot and language models can now help convert those work orders into structured failure labels — asset ID, failure mode, date — joined to the sensor readings that preceded each event. It still takes months and human validation, but it turns an unusable archive into a training set rather than starting data collection from zero.
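Joining failure labels to the sensor readings that preceded each event can be sketched as follows. This is a minimal Python illustration, not a prescribed method: the function name `label_readings` is hypothetical, the readings are assumed to belong to a single asset, and the 7-day pre-failure horizon is an assumed value that should be set per failure mode.

```python
from datetime import datetime, timedelta


def label_readings(readings, failure_times, horizon=timedelta(days=7)):
    """Mark each reading 1 if a failure follows within `horizon`, else 0.

    readings: list of (timestamp, value) tuples for one asset.
    failure_times: list of datetimes taken from the structured failure labels.
    """
    labelled = []
    for ts, value in readings:
        # A reading is "pre-failure" if some failure occurs at or after it,
        # and no later than `horizon` after it.
        pre_failure = any(
            timedelta(0) <= failure_time - ts <= horizon
            for failure_time in failure_times
        )
        labelled.append((ts, value, 1 if pre_failure else 0))
    return labelled
```

With a failure recorded on 15 January and a 7-day horizon, readings from 10 and 14 January are labelled 1, while readings from 1 January or from after the failure are labelled 0. The accuracy of the failure date in the work order directly limits the quality of these labels, which is one reason human validation remains necessary.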
Then the loop is closed in Power Platform, not left as a number on a screen. A prediction triggers Power Automate to raise the CMMS work order with the failure code, asset history, and recommended parts; Power Apps gives the technician the screen to accept it. A Power BI semantic model surfaces fleet health on the same governed data, so the maintenance planner and the model see one version of asset condition. The same foundation feeds the wider manufacturing analytics estate — OEE on the assets PdM is protecting — so it is one platform, not a bolt-on.
The order is the whole point: foundation and labels first, model second, automated response third. Vendors invert it — lead with the model — because the model demos well. The plants that scale PdM are the ones that built the layer underneath it first.
Where PdM Still Breaks
On a new asset group with no failure history, supervised PdM simply cannot start — there is nothing to learn from. The honest path is 6–12 months of sensor deployment and data collection before a reliable model is possible, or unsupervised anomaly detection as an interim. A vendor who promises a working model on day one of an uninstrumented asset is selling the demo, not the deployment.
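For the interim option, one simple form of unsupervised anomaly detection is a rolling z-score: each new reading is compared against the mean and standard deviation of the readings just before it. The sketch below is a minimal Python example of that one technique. The window size of 20 and the threshold of 3.0 are assumed values, and this is only one of many possible approaches. It flags unusual readings; it does not predict a failure mode or a time to failure.

```python
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=20, threshold=3.0):
    """Return indices whose value deviates strongly from the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        baseline = values[i - window:i]  # the readings just before index i
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: a z-score is undefined, so skip
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies
```

Because no failure labels are needed, this can run from the first weeks of sensor deployment, and the anomalies it flags, once checked by a technician, become early labelled examples for the supervised model that follows.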
CMMS integration is the second routinely underscoped task. CMMS products differ in their data models, APIs, and criticality schemes, and the prediction-to-work-order link is real engineering. If integration is not in the implementation scope, the prediction may be accurate while the response stays manual, slow, and inconsistent, and most of the value leaks out at that human step.
And asset selection is where ROI is won or lost before any model runs. Choose by sensor availability or equipment glamour and you may spend 18 months on a model that never saw enough failures to be reliable. Choose by maintenance cost and downtime history — a business decision, not an engineering one — and the return is measurable inside a year. Skipping that analysis is the quiet reason many PdM programmes never pay back.
You are not funding a learning exercise. Pick the assets by maintenance cost, fix the data layer, then model — in that order.
What Changes for the Maintenance Leader
Done in the right order, PdM moves maintenance from calendar-based and reactive to condition-based — typically a 15–25% reduction in unplanned downtime once the programme matures on the assets that actually drive cost. The gain is real, but it comes from the foundation and the closed loop, not from the algorithm the vendor led with.
The sequencing also de-risks the spend. A six-week Discover and Foundation build connects the priority assets' sensors, structures the CMMS history into labels, and stands up fleet-health visibility on Microsoft Fabric, so the first value arrives before the supervised model work begins. You learn whether the data can carry a model before you commit to the model, rather than discovering the gap 18 months in.
The first question to ask any PdM vendor is where their training data came from and how long it took to prepare. If the honest answer to your own version of that question is "we don't have it yet," that is the project — fix the data layer underneath the platform you already have, and the predictions the vendor promised become achievable rather than aspirational.
The vendor demonstration works because the vendor controlled the data. Your site isn't the vendor's demonstration environment. The labelled failure history, the clean sensor feeds, the integrated CMMS - none of it comes with the licence. Plan for the 12–18 months it takes to build them, and the programme you end up with will actually deliver.