Operation and Maintenance (O&M) costs are an important factor in the total Life Cycle Cost (LCC) of energy systems. A good O&M optimization strategy needs to keep many pieces of equipment running efficiently while balancing operational variables and service crew limitations. Those complexities make simulation and reinforcement learning an ideal pairing for optimizing energy system O&M plans. 

Pathmind partnered with Engineering Group to demonstrate how reinforcement learning can help improve O&M efficiency in energy systems. This collaboration features a modified version of a well-known field service model hosted on AnyLogic Cloud and shows how reinforcement learning outperforms existing heuristics, such as routine maintenance.


The wind farm shown in the field service model contains 20 turbines equipped with Prognostics & Health Management (PHM) capabilities. PHM combines knowledge of failure mechanisms with system lifecycle management. Three service crews are available to perform preventive maintenance on working turbines or corrective maintenance on those that are broken. 

Preventive maintenance often results in less downtime because it averts breakdowns. It also lowers costs, since it addresses equipment issues before more expensive components need to be replaced entirely.

A turbine continues to operate and produce revenue as long as its Remaining Useful Life (RUL) is greater than zero. An O&M optimization strategy has to maximize revenue while accounting for limited maintenance staff and the uncertainties of equipment failure and production variability.
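The revenue rule above can be illustrated with a minimal sketch. The `Turbine` class, the `daily_revenue` and `farm_revenue` functions, and the price and production figures are all hypothetical names and values chosen for illustration; they are not taken from the AnyLogic model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turbine:
    # Remaining Useful Life; the unit (days) is an assumption for this sketch.
    rul_days: float


def daily_revenue(turbine: Turbine, production_mwh: float, price_per_mwh: float) -> float:
    """Return one day's revenue; a turbine earns only while its RUL is above zero."""
    if turbine.rul_days <= 0:
        return 0.0
    return production_mwh * price_per_mwh


def farm_revenue(turbines: List[Turbine], production_mwh: float, price_per_mwh: float) -> float:
    """Sum daily revenue over the farm, assuming equal production per working turbine."""
    return sum(daily_revenue(t, production_mwh, price_per_mwh) for t in turbines)


# Two working turbines and one failed turbine (RUL of zero).
farm = [Turbine(rul_days=5.0), Turbine(rul_days=0.0), Turbine(rul_days=2.5)]
total = farm_revenue(farm, production_mwh=10.0, price_per_mwh=50.0)
```

In this toy setup only the two turbines with positive RUL contribute, which is why keeping RUL above zero across the farm is central to the revenue objective.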


Why Pathmind Was Needed

O&M optimization tools need to be flexible and intelligent to accommodate energy system complexities. Other optimizers cannot handle the number of factors that must be considered when making O&M decisions, such as crew sizes and locations, variable energy production levels, changing demand, equipment failures and actual degradation, and the quality of the data received from PHM algorithms.

Likewise, routine maintenance heuristics do not guarantee optimal performance since they use recurring maintenance schedules without taking the actual state of the equipment into account.
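To make the limitation concrete, here is a rough sketch contrasting a fixed-interval rule with a rule that looks at equipment state. Both functions, the 90-day interval, and the 10-day RUL threshold are assumptions for illustration only. The state-aware rule is a simple contrast case, not the reinforcement learning policy used in the project.

```python
def routine_maintenance_due(days_since_service: int, interval_days: int = 90) -> bool:
    """Fixed schedule: service is due once the interval has elapsed, regardless of condition."""
    return days_since_service >= interval_days


def condition_based_due(rul_days: float, threshold_days: float = 10.0) -> bool:
    """State-aware rule: service is due when the estimated RUL falls to the threshold."""
    return rul_days <= threshold_days


# A turbine serviced 30 days ago whose estimated RUL has dropped to 4 days.
missed_by_schedule = not routine_maintenance_due(days_since_service=30)
caught_by_condition = condition_based_due(rul_days=4.0)
```

In this example the fixed schedule would leave a degrading turbine unattended for another 60 days, while a rule that reads the estimated RUL flags it immediately.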

Deep RL For Optimal Operation and Maintenance of Energy Systems model


Pathmind’s reinforcement learning policy made informed decisions by considering each turbine’s status, production predictions for the next three days, expected RUL, maintenance cost, and the current position of the maintenance crew within the wind farm. Under the Pathmind policy, the maintenance crews coordinated spatially to reduce the distance travelled and favored preventive maintenance over corrective repairs, cutting down on equipment failures. Compared to the routine maintenance heuristic, reinforcement learning increased revenue by more than 30%.
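The inputs listed above could be assembled into a per-turbine observation roughly as follows. This is a sketch under stated assumptions: the `TurbineState` class, the `build_observation` function, the field names, and the use of crew-to-turbine distance as the spatial feature are hypothetical and do not reflect the actual model or the Pathmind API.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TurbineState:
    is_working: bool                 # turbine status
    expected_rul_days: float         # expected RUL (unit assumed to be days)
    maintenance_cost: float          # cost of servicing this turbine
    position: Tuple[float, float]    # location within the wind farm


def build_observation(
    turbine: TurbineState,
    production_forecast: List[float],
    crew_position: Tuple[float, float],
) -> List[float]:
    """Flatten one turbine's state, a three-day forecast, and crew distance into a vector."""
    if len(production_forecast) != 3:
        raise ValueError("expected production predictions for exactly three days")
    distance_to_crew = math.dist(turbine.position, crew_position)
    return [
        1.0 if turbine.is_working else 0.0,
        turbine.expected_rul_days,
        turbine.maintenance_cost,
        distance_to_crew,
        *production_forecast,
    ]


state = TurbineState(is_working=True, expected_rul_days=12.0,
                     maintenance_cost=500.0, position=(3.0, 4.0))
observation = build_observation(state, [1.0, 2.0, 3.0], crew_position=(0.0, 0.0))
```

Encoding the crew position as a distance is one possible design choice; it gives a policy a direct signal for reducing travel, which matches the spatial coordination described above.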

About Engineering Group

With over 12,000 professionals in 65 global locations, Engineering Ingegneria Informatica S.p.A designs, develops and manages innovative solutions for the business areas where digitalization is having the biggest impact.

Project Resources

  • View this model on AnyLogic Cloud.
  • This project was featured during “Deep Reinforcement Training and Machine Learning Applications for Industry 4.0: Optimization of Field Service Management and Manufacturing Operation” at INFORMS Business Analytics 2021. Watch the presentation to learn more.