Microsoft open sources tool to use AI in simulated attacks



As part of Microsoft’s research in ways to use machine learning and AI to improve security defenses, the company has released an open-source attack toolkit to let researchers create simulated network environments and see how they fare against attacks.

Microsoft 365 Defender Research released CyberBattleSim, which creates a network simulation and models how threat actors can move laterally through the network looking for weak points. When building the attack simulation, enterprise defenders and researchers create various nodes on the network and indicate which services are running, what vulnerabilities are present, and what security controls are in place. Automated agents, representing threat actors, are deployed in the attack simulation to randomly execute actions as they try to take over the nodes.
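To make the setup concrete, here is a minimal sketch of such a simulation in plain Python. It does not use CyberBattleSim's actual API: the `Node` class, its fields, the `hardened` flag standing in for a security control, and the `random_attack` function are all hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Node:
    """One simulated machine; every field name here is illustrative."""
    services: list                                        # e.g. ["http", "ssh"]
    vulnerabilities: dict = field(default_factory=dict)   # vulnerability name -> node ids it exposes
    hardened: bool = False                                # stand-in for a control that blocks takeover

def random_attack(network, start, max_steps=50, seed=0):
    """Return the set of node ids owned after random exploitation attempts."""
    rng = random.Random(seed)
    owned = {start}
    for _ in range(max_steps):
        # Pick an already-owned node and one of its vulnerabilities at random.
        source = rng.choice(sorted(owned))
        vulns = network[source].vulnerabilities
        if not vulns:
            continue  # nothing to exploit from this node
        exploit = rng.choice(sorted(vulns))
        for target in vulns[exploit]:
            if not network[target].hardened:
                owned.add(target)
    return owned

network = {
    "client": Node(services=["rdp"], vulnerabilities={"leaked_credentials": ["web"]}),
    "web": Node(services=["http", "ssh"], vulnerabilities={"key_in_repo": ["db"]}),
    "db": Node(services=["sql"], hardened=True),
}
owned = random_attack(network, start="client")
```

In this toy network the random agent moves from the client to the web server, but the hardened database stays out of reach, which is the kind of outcome a defender would want to confirm.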

“The simulated attacker’s goal is to take ownership of some portion of the network by exploiting these planted vulnerabilities. While the simulated attacker moves through the network, a defender agent watches the network activity to detect the presence of the attacker and contain the attack,” the Microsoft 365 Defender Research Team wrote in a post discussing the project.

Using reinforcement learning for security

Microsoft has been exploring how machine learning algorithms such as reinforcement learning can be used to improve information security. Reinforcement learning is a type of machine learning in which autonomous agents learn how to make decisions based on what happens while interacting with the environment. The agent’s goal is to maximize the reward, and agents gradually make better decisions (to get a bigger reward) through repeated attempts.

The most common example is playing a video game. The agent (the player) gets better at playing the game after repeated tries by remembering the actions that worked in previous rounds.
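The learn-by-repetition loop can be sketched with tabular Q-learning, one common reinforcement learning algorithm. The article does not say which algorithms Microsoft used, so this is only an illustration; the five-position "game", the reward of 1 for reaching the last position, and the hyperparameters are assumptions made for the example.

```python
import random

N_STATES = 5        # positions 0..4; reaching position 4 ends the round with reward 1
ACTIONS = (-1, +1)  # action 0 steps left, action 1 steps right

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a table of action values by replaying the game many times."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            if rng.random() < epsilon:
                action = rng.randrange(2)  # occasionally explore
            else:
                best = max(q[state])       # otherwise reuse what worked, ties broken randomly
                action = rng.choice([a for a in range(2) if q[state][a] == best])
            nxt = min(max(state + ACTIONS[action], 0), N_STATES - 1)
            reward = 1.0 if nxt == N_STATES - 1 else 0.0
            # Move the estimate toward the reward plus the discounted value of the next state.
            q[state][action] += alpha * (reward + gamma * max(q[nxt]) - q[state][action])
            state = nxt
    return q

q_table = train()
# Greedy policy for the non-terminal positions: index of the best action in each state.
policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy picks "step right" in every position, because the table has recorded that moving toward the goal earned the reward in earlier rounds.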

In a security scenario, there are two types of autonomous agents: the attackers trying to steal information out of the network and defenders trying to block, or mitigate the effects of, an attack. The agents’ actions are the commands attackers can execute on the computers and the steps defenders can perform in the network. Using the language of reinforcement learning, the agent’s goal is to maximize the reward of a successful attack by discovering and taking over more systems on the network, and finding more things to steal. The agent has to execute a series of actions to gradually explore the networks, but to do so without setting off any of the security defenses that may be in place.
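The trade-off between gaining ground and staying undetected can be expressed as a reward function. The sketch below is hypothetical: the function name `attack_reward`, its parameters, and the bonus and penalty weights are invented for illustration and are not taken from CyberBattleSim.

```python
def attack_reward(newly_owned_values, data_found, alerts_triggered,
                  data_bonus=5.0, alert_penalty=20.0):
    """Score one attacker step: value gained minus the cost of tripping defenses."""
    gained = sum(newly_owned_values) + data_bonus * data_found
    return gained - alert_penalty * alerts_triggered

# A quiet step that takes one node worth 10 and finds one item to steal.
quiet = attack_reward([10.0], data_found=1, alerts_triggered=0)
# A noisy step that takes two such nodes but sets off two alerts.
noisy = attack_reward([10.0, 10.0], data_found=1, alerts_triggered=2)
```

With these assumed weights the quiet step scores 15.0 and the noisy one scores -15.0, so an agent maximizing the reward is pushed toward actions that avoid the defenses.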

Security training and games

Much like the human mind, AI learns better by playing games, so Microsoft turned CyberBattleSim into a game. Capture the flag competitions and phishing simulations help strengthen security by creating scenarios where defenders can learn from attacker methods. By using reinforcement learning to get the reward of “winning” a game, the CyberBattleSim agents can make better decisions on how they interact with the simulated network.

CyberBattleSim focuses on modeling how an attacker can move laterally through the network after the initial breach. In the attack simulation, each node represents a machine with an operating system, software applications, specific properties (security controls), and a set of vulnerabilities. The toolkit uses the OpenAI Gym interface to train automated agents using reinforcement learning algorithms. The open source Python code is available on GitHub.
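A Gym-style environment exposes `reset()` to start an episode and `step(action)` to apply one action and return an observation, a reward, a done flag, and an info dictionary. The sketch below mimics that shape in plain Python without importing Gym. The class `LateralMovementEnv`, its action format, and its reward values are hypothetical and are not CyberBattleSim's actual environment.

```python
class LateralMovementEnv:
    """Toy environment with Gym-style reset() and step(); names are illustrative."""

    def __init__(self, edges, start, target):
        self.edges = edges    # node id -> node ids reachable by exploiting it
        self.start = start    # node the attacker controls after the initial breach
        self.target = target  # owning this node ends the episode
        self.owned = set()

    def reset(self):
        """Begin a new episode and return the initial observation."""
        self.owned = {self.start}
        return frozenset(self.owned)

    def step(self, action):
        """Apply a (source, destination) move; return (observation, reward, done, info)."""
        source, dest = action
        valid = source in self.owned and dest in self.edges.get(source, [])
        newly_owned = valid and dest not in self.owned
        if newly_owned:
            self.owned.add(dest)
        reward = 1.0 if newly_owned else -0.1  # small cost for wasted or invalid moves
        done = self.target in self.owned
        return frozenset(self.owned), reward, done, {"valid": valid}

env = LateralMovementEnv({"client": ["web"], "web": ["db"]}, start="client", target="db")
obs = env.reset()
obs, r1, done1, info1 = env.step(("client", "db"))  # no direct path, so the move is rejected
obs, r2, done2, info2 = env.step(("client", "web"))
obs, r3, done3, info3 = env.step(("web", "db"))     # reaches the target and ends the episode
```

Because the agent only sees `reset()` and `step()`, the same training loop can be reused across differently shaped networks, which is the main appeal of a shared interface.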

Erratic behavior should quickly trigger alarms, and security tools would respond and evict the malicious actor. But if the actor has learned how to compromise systems faster by reducing the number of steps it needs to succeed, that gives defenders insight into where security controls are needed to detect the activity sooner.

The CyberBattleSim is part of Microsoft’s broader research to apply machine learning and AI to automate many of the tasks security defenders are currently handling manually. In a recent Microsoft study, almost three-quarters of organizations said their IT teams spent too much time on tasks that should be automated. Autonomous systems and reinforcement learning “can be harnessed to build resilient real-world threat detection technologies and robust cyber-defense strategies,” Microsoft wrote.

“With CyberBattleSim, we are just scratching the surface of what we believe is a huge potential for applying reinforcement learning to security,” Microsoft wrote.

