Industrial KPIs & Maintenance Metrics
Transforming Raw Factory Data into Strategic Performance Intelligence
"You cannot manage what you do not measure." In the industrial world, Key Performance Indicators (KPIs) act as the cockpit instrumentation for plant management. Without them, decisions are made on gut feeling, which tends to end in avoidable failures or excessive cost.
1. The KPI Hierarchy
Effective performance measurement is structured in a pyramid. If you measure everything, you measure nothing.
Strategic (Level 1)
**OEE, Maintenance Cost/RAV, Safety Record.** The boardroom metrics.
Tactical (Level 2)
**MTBF, Backlog (Wks), Schedule Compliance.** The department manager metrics.
Operational (Level 3)
**MTTR, PM Completion Rate, Rework %.** The technician/team lead metrics.
2. Modern Maintenance Dashboard
[Dashboard charts: Schedule Compliance (leading indicator) and OEE Trend (lagging indicator)]
3. OEE: The Ultimate Efficiency Metric
Overall Equipment Effectiveness (OEE) is the universal metric for measuring the percentage of planned production time that is truly productive. It is calculated by multiplying three factors:
$$\text{OEE} = \text{Availability} \times \text{Performance} \times \text{Quality}$$
Availability
Accounts for **Downtime Losses** (Breakdowns, Changeovers, Setup time).
Performance
Accounts for **Speed Losses** (Idling, Minor Stoppages, Reduced Speed operation).
Quality
Accounts for **Quality Losses** (Scrap, Rework, Yield loss during startup).
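The three factors can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name, the parameter names, and the sample shift figures are all hypothetical, and the Performance factor is computed from an ideal cycle time, which is a common convention but is not spelled out in the text above.

```python
def oee(planned_min, downtime_min, ideal_cycle_min, total_units, good_units):
    """Return (availability, performance, quality, oee) as fractions.

    Assumption: Performance = (ideal cycle time x total units) / run time.
    """
    run_min = planned_min - downtime_min          # time the asset actually ran
    availability = run_min / planned_min          # downtime losses
    performance = (ideal_cycle_min * total_units) / run_min  # speed losses
    quality = good_units / total_units            # quality losses
    return availability, performance, quality, availability * performance * quality


# Hypothetical shift: 480 planned minutes, 60 minutes down,
# 1.0 minute ideal cycle time, 378 units made, 360 of them good.
a, p, q, total = oee(480, 60, 1.0, 378, 360)
print(f"Availability {a:.1%}, Performance {p:.1%}, Quality {q:.1%}, OEE {total:.1%}")
```

With these sample numbers the factors are 87.5%, 90.0%, and about 95.2%, which multiply out to an OEE of 75%. No single factor looks alarming on its own, yet a quarter of the planned time is lost.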
4. Financial KPIs: Maintenance as % of RAV
One of the most powerful "Top-Level" metrics is **Maintenance Cost as a Percentage of Replacement Asset Value (RAV)**.
The Benchmark
World-class facilities typically operate between **2.0% and 3.0%**. If your ratio is above 5%, you are likely in a "firefighting" loop with high emergency spending. If it's below 1%, you are likely under-maintaining, which will lead to a "Reliability Debt" that eventually manifests as catastrophic failure.
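As a rough sketch, the ratio and the benchmark bands above can be expressed in Python. The function names, the band labels, and the sample cost figures are hypothetical; the text does not characterize values that fall between the stated thresholds, so the sketch only reports them as "between benchmarks".

```python
def maintenance_cost_pct_rav(annual_maintenance_cost, replacement_asset_value):
    """Annual maintenance cost as a percentage of Replacement Asset Value."""
    return 100.0 * annual_maintenance_cost / replacement_asset_value


def classify_rav_ratio(pct):
    """Map a percentage onto the benchmark bands quoted in the text."""
    if pct < 1.0:
        return "possible under-maintenance"
    if 2.0 <= pct <= 3.0:
        return "world-class range"
    if pct > 5.0:
        return "likely firefighting"
    return "between benchmarks"  # the text gives no verdict for these values


# Hypothetical plant: 4.2M annual maintenance spend on a 120M asset base.
pct = maintenance_cost_pct_rav(4_200_000, 120_000_000)
print(f"{pct:.1f}% of RAV: {classify_rav_ratio(pct)}")
```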
5. Wrench Time: The Productivity Leak
A common misconception is that a technician working 8 hours a day is 100% productive. In reality, **Wrench Time** (the actual time spent performing maintenance) is often as low as 25-35%.
Where does the time go?
The "Non-Productive" 65% is consumed by: **Searching for parts (20%)**, **Travel time (15%)**, **Waiting for instructions/permits (15%)**, and **Administrative paperwork (15%)**.
By measuring Wrench Time through "Day-in-the-Life" (DILO) studies or CMMS data analysis, organizations can identify systemic bottlenecks. Increasing wrench time from 30% to 45% effectively increases the maintenance workforce by 50% without hiring a single new person.
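The 50% figure follows directly from the ratio of the two wrench-time fractions (0.45 / 0.30 = 1.5). A small Python sketch makes the arithmetic explicit; the function names and the crew size of ten are hypothetical values chosen for illustration.

```python
def effective_capacity_gain(current_wrench, target_wrench):
    """Fractional increase in productive hours for the same headcount."""
    return target_wrench / current_wrench - 1.0


def wrench_hours(headcount, shift_hours, wrench_fraction):
    """Productive maintenance hours delivered per day by a crew."""
    return headcount * shift_hours * wrench_fraction


# Hypothetical crew: 10 technicians on 8-hour shifts.
before = wrench_hours(10, 8, 0.30)
after = wrench_hours(10, 8, 0.45)
gain = effective_capacity_gain(0.30, 0.45)
print(f"{before:.0f} -> {after:.0f} wrench hours per day ({gain:.0%} gain)")
```

In this example the same ten people deliver about 36 productive hours a day instead of 24, which is what fifteen technicians would deliver at the original 30%.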
6. The 6 Big Losses of OEE
To fix OEE, you must understand where the losses occur. Total Productive Maintenance (TPM) categorizes these into six specific buckets:
1. Equipment Failure
Large-scale downtime events (Breakdowns).
2. Setup & Adjustments
Time lost during changeovers or machine tuning.
3. Idling & Minor Stoppages
The "micro-stops" that aren't recorded as breakdowns but kill performance.
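4. Reduced Speed
Running the machine slower than its nameplate capacity.
5. Process Defects
Scrap and rework produced during stable production.
6. Reduced Yield
Units lost during startup, before the process reaches stable production.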
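One way to use these buckets is to roll logged losses up to the OEE factor they erode. The sketch below assumes the standard TPM naming for the two quality-related losses (Process Defects and Reduced Yield), and the mapping mirrors the Availability, Performance, and Quality descriptions given earlier. The event list and its minute values are invented for illustration.

```python
from collections import defaultdict

# Each loss bucket erodes exactly one OEE factor.
LOSS_TO_FACTOR = {
    "Equipment Failure": "Availability",
    "Setup & Adjustments": "Availability",
    "Idling & Minor Stoppages": "Performance",
    "Reduced Speed": "Performance",
    "Process Defects": "Quality",
    "Reduced Yield": "Quality",
}


def minutes_lost_by_factor(events):
    """Sum (loss_category, minutes) pairs per OEE factor."""
    totals = defaultdict(float)
    for category, minutes in events:
        totals[LOSS_TO_FACTOR[category]] += minutes
    return dict(totals)


# Hypothetical loss log for one shift (minutes of equivalent lost time).
shift_log = [
    ("Equipment Failure", 45),
    ("Setup & Adjustments", 30),
    ("Idling & Minor Stoppages", 22),
    ("Reduced Speed", 18),
    ("Process Defects", 9),
    ("Reduced Yield", 6),
]
for factor, minutes in minutes_lost_by_factor(shift_log).items():
    print(f"{factor}: {minutes:.0f} min lost")
```

Grouping by factor shows at a glance whether the next improvement effort belongs with reliability, line speed, or process quality.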
7. Leading Indicators: The PM-to-CM Ratio
A critical leading indicator is the **PM-to-CM Ratio** (Preventive Maintenance to Corrective Maintenance). It measures the health of your maintenance strategy.
The 80/20 Rule
World-class maintenance organizations aim for an **80/20** ratio: 80% proactive work (PM, PdM) and only 20% reactive work (CM). If your ratio is 50/50, your technicians are constantly "fighting fires," which means they lack the time to perform high-quality PMs, leading to even more failures. It is a death spiral that can only be broken by rigorous schedule compliance.
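A minimal sketch of the calculation, assuming the split is measured in labor hours charged to each work type (work-order counts are another common basis). The function name and the sample hours are hypothetical.

```python
def proactive_share(pm_hours, pdm_hours, cm_hours):
    """Fraction of maintenance labor hours spent on proactive work (PM + PdM)."""
    total = pm_hours + pdm_hours + cm_hours
    return (pm_hours + pdm_hours) / total


# Hypothetical month: 520 h preventive, 120 h predictive, 160 h corrective.
share = proactive_share(520, 120, 160)
print(f"Proactive/reactive split: {share:.0%} / {1 - share:.0%}")
```

With these figures the split comes out at 80/20. Tracking the same number month over month shows whether the organization is moving toward or away from the target.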
8. MTTR: Breaking Down the Repair Clock
Mean Time To Repair (MTTR) is often misunderstood as just "the time it takes to fix it." To actually improve MTTR, you must break it down into its constituent parts:
1. Detection & Notification
How long does it take for someone to notice and report the failure?
2. Response & Diagnosis
Travel time to the asset and the time required to find the root cause.
3. Parts & Tools Logistics
The biggest killer of MTTR: waiting for the storeroom to find the spare parts.
4. Active Repair & Testing
The actual "wrench time" and the time to verify the asset is safe to run.
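The four phases above can be measured if each repair event records a timestamp at every hand-off. The sketch below assumes a hypothetical record with five timestamps (`failed`, `reported`, `diagnosed`, `parts_ready`, `restored`); real CMMS exports will use different field names and may not capture every one of them.

```python
from datetime import datetime

# (phase name, start field, end field); field names are hypothetical.
PHASES = [
    ("Detection & Notification", "failed", "reported"),
    ("Response & Diagnosis", "reported", "diagnosed"),
    ("Parts & Tools Logistics", "diagnosed", "parts_ready"),
    ("Active Repair & Testing", "parts_ready", "restored"),
]


def repair_breakdown(event):
    """Return minutes spent in each phase of a single repair event."""
    return {
        name: (event[end] - event[start]).total_seconds() / 60
        for name, start, end in PHASES
    }


# Hypothetical repair event.
event = {
    "failed": datetime(2024, 3, 4, 8, 0),
    "reported": datetime(2024, 3, 4, 8, 12),
    "diagnosed": datetime(2024, 3, 4, 8, 47),
    "parts_ready": datetime(2024, 3, 4, 10, 2),
    "restored": datetime(2024, 3, 4, 10, 50),
}
breakdown = repair_breakdown(event)
total = sum(breakdown.values())
for phase, minutes in breakdown.items():
    print(f"{phase}: {minutes:.0f} min ({minutes / total:.0%})")
```

In this invented example the repair clock runs for 170 minutes, of which 75 are spent waiting on parts and tools, so the largest gain would come from the storeroom rather than from faster wrenching.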
From Chaos to Control: The Steel Mill
A high-output rolling mill suffered from 15% unplanned downtime. The management team was focused on "Tons Produced" (a lagging metric) and ignored the maintenance backlog.
The Intervention
The facility implemented a "Leading Metric" dashboard focusing on **Schedule Compliance** and **Backlog Health**. They discovered their backlog was at 12 weeks, a state of total reactive chaos. By freezing non-critical work and focusing strictly on **Preventive Maintenance (PM) Compliance**, they reduced the backlog to 4 weeks over six months.
Result: Unplanned downtime dropped from 15% to 4.5%, and tons produced increased by 20% without adding a single machine.
Glossary
MTBF: Mean Time Between Failures. A measure of an asset's reliability (total uptime / number of failures).
MTTR: Mean Time To Repair. A measure of an asset's maintainability (total downtime / number of repairs).
Backlog: The amount of approved work not yet completed, usually measured in labor weeks.
RAV: Replacement Asset Value. The current cost to replace an asset with a new one of similar capacity.
Yield: The ratio of good units produced to the total units started in a process.
Uptime: The total time an asset is operational and capable of performing its intended function.
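The MTBF and MTTR definitions above translate directly into code. The backlog calculation is an assumption added for illustration: it converts backlog labor hours into weeks by dividing by the crew's weekly available hours, which is one common way to arrive at "labor weeks". All function names and sample figures are hypothetical.

```python
def mtbf(total_uptime_hours, failures):
    """Mean Time Between Failures: total uptime / number of failures."""
    return total_uptime_hours / failures


def mttr(total_downtime_hours, repairs):
    """Mean Time To Repair: total downtime / number of repairs."""
    return total_downtime_hours / repairs


def backlog_weeks(backlog_labor_hours, weekly_available_labor_hours):
    """Backlog in labor weeks (assumed: backlog hours / weekly crew capacity)."""
    return backlog_labor_hours / weekly_available_labor_hours


# Hypothetical month for one asset: 700 h uptime, 20 h downtime, 4 failures.
print(f"MTBF: {mtbf(700, 4):.0f} h")
print(f"MTTR: {mttr(20, 4):.0f} h")

# Hypothetical crew: 8 technicians x 40 h per week, 1,920 h of approved work.
print(f"Backlog: {backlog_weeks(1920, 8 * 40):.0f} weeks")
```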
