Every business wants to make smart decisions amid uncertainty, ward off risks, and seize opportunities. Every manager wants to make sure their operations hit performance goals while keeping costs down. Using reinforcement learning to make better operational decisions can give you a real competitive advantage.
Some business application areas are better suited to reinforcement learning than others. These include manufacturing, supply chain, resource allocation, and sequencing and routing problems. In general, problems that involve many agents, a large range of decisions, or fast-changing data can benefit from optimization with reinforcement learning.
Simulation and Optimization – A Proven Business Problem Solver
Businesses that simulate their core operations tend to understand those operations better. And businesses that optimize those simulations tend to find ways to keep everything running smoothly, increasing efficiency and resilience.
Viruses, lockdowns, supply shocks and economic dislocations have made business less predictable than ever. Those disruptions can raise costs and destabilize companies, which increases the value of both simulation and optimization.
Simulations are attempts to mirror the real world, and because real-world problems are complex, useful simulations must be, too. But as simulations grow more complex, traditional optimization tools can fail.
Progress in artificial intelligence, however, has made it possible to solve problems and optimize simulations that used to be unworkable. Research labs such as DeepMind and OpenAI have made breakthroughs applying artificial intelligence to video games, which are essentially recreational simulations. One type of AI they use is called reinforcement learning, a family of algorithms that reward actions that lead toward your stated goal and penalize ones that don’t. Reinforcement learning, or RL, was a key ingredient in AlphaGo, the DeepMind program that defeated top professional Go player Lee Sedol in 2016.
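To make the reward-and-penalty idea concrete, here is a minimal sketch of tabular Q-learning, one common RL algorithm, on a made-up five-position corridor. The environment, the reward values (+1 for reaching the goal, -0.1 for every other move), and all names below are assumptions chosen for illustration, not anything used by the labs mentioned above.

```python
import random

N_STATES = 5          # positions 0..4 on a line; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
GOAL = N_STATES - 1


def step(state, action):
    """Toy environment: move, reward reaching the goal, penalize every other move."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else -0.1
    return next_state, reward, next_state == GOAL


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning: q[state][action_index] estimates long-run reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
            next_state, reward, done = step(state, ACTIONS[a])
            # Nudge the estimate toward reward plus discounted future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for each non-goal state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(GOAL)]
```

Calling greedy_policy(train()) returns the learned move for each position. Real business simulations have far larger state and action spaces, which is where neural-network-based RL tends to replace a simple table like this one.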

Reinforcement Learning for Complex Business Problems
Now RL is being applied to business problems. Complex problems, which people have a hard time developing intuition about, often combine many components in systems that show emergent properties.
What are Emergent Properties?
If you understand how a muscle cell works in the heart, you still may not understand how all those cells combine to form the pump that keeps us alive. The heart operates on a larger scale than a single cell, and it functions as a single organ. That action is an emergent property of all those cells combined. If you study birds, you might think you understand, say, a single starling. But flocks of starlings exhibit behavior that cannot be predicted using knowledge of individual birds.
Human behavior also has emergent properties. Traffic shock waves on highways are one example of a macro phenomenon that emerges from many individual actions coming together. And that’s why emergent properties are important for business operations.
When many robots or many delivery vehicles make decisions in concert, new patterns arise that simulations can help you detect, so that you can make better decisions. Or when you attempt to model many businesses in a sector, or many consumers in an economy, you may identify trends that will help you strategize.
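The traffic example can be sketched in a few lines. The code below is a minimal version of the Nagel-Schreckenberg cellular automaton, a standard toy traffic model that is brought in here purely for illustration. Each car follows only local rules (speed up, keep a gap, occasionally brake at random), yet stop-and-go waves can appear across the whole road. The road length, car count, and braking probability are assumed values.

```python
import random


def simulate_ring_road(n_cells=100, n_cars=35, v_max=5, p_brake=0.3,
                       steps=200, seed=1):
    """Cars on a circular one-lane road, each following simple local rules."""
    rng = random.Random(seed)
    # Cars are listed in driving order, so car i + 1 is directly ahead of car i.
    positions = sorted(rng.sample(range(n_cells), n_cars))
    speeds = [0] * n_cars
    for _ in range(steps):
        new_speeds = []
        for i in range(n_cars):
            ahead = positions[(i + 1) % n_cars]
            gap = (ahead - positions[i] - 1) % n_cells  # empty cells in front
            v = min(speeds[i] + 1, v_max)   # rule 1: accelerate if possible
            v = min(v, gap)                 # rule 2: never reach the car ahead
            if v > 0 and rng.random() < p_brake:
                v -= 1                      # rule 3: occasional random braking
            new_speeds.append(v)
        speeds = new_speeds
        # All cars move at once, wrapping around the ring.
        positions = [(x + v) % n_cells for x, v in zip(positions, speeds)]
    return positions, speeds
```

Counting the cars with speed zero after a run gives a rough sense of how much of the road is jammed. No single car's rules mention a jam, which is the point: the pattern exists only at the level of the group.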
In one example, agent-based modeling is used to understand how pharmaceutical representatives influence the behavior of physicians in larger group practices. The group dynamics shape the emergent behavior of a medical practice that “has a mind of its own,” beyond the actions of the individual physicians. In another, PwC outlines some emergent phenomena that an airline faces as consumer behavior responds to its marketing strategies and those of its competitors, affecting revenue.
Simulation modelers will see the benefits of reinforcement learning when emergent properties arise in complex business problems. RL highlights new decision paths through those simulations, taking group dynamics into account. In addition, simulations that include quickly changing data, or a large array of decisions to make over many steps, are also suitable for RL. Businesses then have the opportunity for better decision-making and are ultimately better positioned to achieve their goals.
Reinforcement Learning Examples
Here are a few business applications where reinforcement learning and simulation optimization work especially well.
Manufacturing
The best-known applications of AI in manufacturing are probably semi-autonomous robots that combine reinforcement learning and computer vision. But RL can also act as a control center that orchestrates many machines across an entire factory. That is, beyond making individual machines more autonomous, it can help factory-wide control systems coordinate action across those machines. For example, moving multiple products simultaneously over a complex assembly line requires coordination between those products to avoid collisions and minimize delays.
Supply Chain
Supply chains involve many agents, a large range of decisions, and fast-changing data, which is the kind of problem that can benefit from optimization with reinforcement learning.
Resource Optimization
Allocating personnel or capital resources is usually managed with human intervention or automated with hard rules. But RL algorithms can effectively allocate resources and predict resource gaps tied to changing demand in complex environments such as call centers or data centers.
Routing/Scheduling
Routing product deliveries cost-effectively can be a challenge for large manufacturers. Deciding which factories should deliver to which warehouses based on real-world trial and error often results in lost time and unnecessary costs. Adding reinforcement learning to a routing simulation can show how to increase efficiency for manufacturers and distributors.
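As a rough illustration of learning routes in simulation rather than by real-world trial and error, the sketch below uses an epsilon-greedy bandit, one of the simplest forms of reinforcement learning, to learn which factory should serve each warehouse. It is not a full routing optimizer. The cost table, the noise level, and every name in the code are hypothetical.

```python
import random

# Hypothetical average delivery cost: MEAN_COST[factory][warehouse].
MEAN_COST = [
    [4.0, 9.0, 7.0],
    [8.0, 3.0, 5.0],
]


def simulated_delivery_cost(factory, warehouse, rng):
    """Toy simulator: the average cost plus random day-to-day variation."""
    return MEAN_COST[factory][warehouse] + rng.gauss(0.0, 1.0)


def learn_routing(trials=3000, epsilon=0.1, seed=0):
    """Learn, per warehouse, which factory is cheapest on average."""
    rng = random.Random(seed)
    n_factories, n_warehouses = len(MEAN_COST), len(MEAN_COST[0])
    estimate = [[0.0] * n_factories for _ in range(n_warehouses)]
    count = [[0] * n_factories for _ in range(n_warehouses)]
    for _ in range(trials):
        w = rng.randrange(n_warehouses)          # an order arrives for warehouse w
        if rng.random() < epsilon:
            f = rng.randrange(n_factories)       # explore a random factory
        else:                                    # exploit the cheapest estimate
            f = min(range(n_factories), key=lambda i: estimate[w][i])
        cost = simulated_delivery_cost(f, w, rng)
        count[w][f] += 1
        # Running average of the costs observed for this factory-warehouse pair.
        estimate[w][f] += (cost - estimate[w][f]) / count[w][f]
    return [min(range(n_factories), key=lambda i: estimate[w][i])
            for w in range(n_warehouses)]
```

The function returns one factory index per warehouse. Because every trial runs against the simulator, a poor routing choice costs only compute time, not a late delivery.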
How to Use Reinforcement Learning for Business Applications
No matter what problem you apply RL to, there are four steps to improve your operational decision making.
1. Simulation
Simulation mirrors the relevant aspects of your operations in a virtual model.
2. Optimization
Optimization can surface new decisions to make within the simulation, to improve performance.
3. Validation
Validation compares the simulation to its real-world counterpart to ensure that it is useful and that the optimized actions are applicable.
4. Deployment (Inference)
Once validated, a predictive model can be deployed to affect your business’s operations in real time and achieve the performance gains indicated in the simulation. A reinforcement learning policy server allows you to integrate a trained, decision-making AI with your business’s existing software stack.
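The four steps above can be sketched end to end on a toy call-center staffing problem. To keep the example short, a simple search over staffing levels stands in for RL training in the optimization step, and the deployed "policy" is a plain function rather than a policy server. All costs, capacities, and historical records are made-up assumptions.

```python
import random

CALLS_PER_AGENT = 10  # assumed capacity of one call-center agent per shift


def simulate_shift(agents, calls, agent_cost=50.0, missed_call_cost=20.0):
    """Step 1, simulation: cost of one shift for a given staffing decision."""
    missed = max(calls - agents * CALLS_PER_AGENT, 0)
    return agents * agent_cost + missed * missed_call_cost


def optimize(demand_samples, max_agents=30):
    """Step 2, optimization: staffing level with the lowest average simulated cost."""
    def average_cost(agents):
        total = sum(simulate_shift(agents, calls) for calls in demand_samples)
        return total / len(demand_samples)
    return min(range(max_agents + 1), key=average_cost)


def validate(history, tolerance=0.1):
    """Step 3, validation: simulated cost must track recorded cost closely."""
    errors = [abs(simulate_shift(agents, calls) - real) / real
              for agents, calls, real in history]
    return sum(errors) / len(errors) <= tolerance


def make_policy(agents):
    """Step 4, deployment: wrap the decision so other software can call it."""
    return lambda: agents


rng = random.Random(0)
demand_samples = [rng.randint(80, 120) for _ in range(1000)]
# Hypothetical records: (agents scheduled, calls received, actual cost observed).
history = [(10, 95, 510.0), (12, 130, 790.0), (9, 100, 640.0)]

best = optimize(demand_samples)
policy = make_policy(best) if validate(history) else None
```

The ordering matters: the optimized decision is only wrapped for deployment if the simulation passes validation against real records, which mirrors the sequence of steps described above.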
Reinforcement learning uses cutting-edge AI to solve complex business problems in supply chain, manufacturing, resource allocation, and more. Once you have trained your predictive model in simulation and validated it against the real world, you can embed it in your business and reap the benefits of better decisions.