TECHNOLOGY
“The agent can choose from a set of actions. With each action comes feedback, good or bad, that becomes part of its memory”
SAMRAT CHATTERJEE, CHIEF DATA SCIENTIST, PACIFIC NORTHWEST NATIONAL LABORATORY
The attack was divided into several stages: reconnaissance, execution, persistence, defence evasion, command and control, collection, and exfiltration, the stage at which data is transferred out of the system. The adversary was declared the winner if they reached this final exfiltration stage.
By testing the DRL algorithms under these controlled conditions, the team was able to assess the strengths and weaknesses of each approach, providing valuable insights into the potential of this technology to enhance cybersecurity defence strategies.
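For readers who want to see the idea in miniature, the staged contest described above can be sketched in a few lines of Python. This is an illustration only, not the PNNL team's simulation: it swaps deep reinforcement learning for plain tabular Q-learning, and the defender actions (`monitor`, `block`), the probabilities, the costs, and the breach penalty are all assumptions invented for the example. Only the stage names come from the article.

```python
import random

# Attack stages as listed in the article; reaching the last one means the adversary wins.
STAGES = [
    "reconnaissance", "execution", "persistence", "defence evasion",
    "command and control", "collection", "exfiltration",
]

# Everything below is a hypothetical toy model, not the researchers' setup.
ACTIONS = ["monitor", "block"]                  # assumed defender actions
ADVANCE_PROB = {"monitor": 0.9, "block": 0.3}   # assumed chance the adversary advances
ACTION_COST = {"monitor": 0.0, "block": 1.0}    # assumed cost of each defender action
BREACH_PENALTY = 10.0                           # assumed penalty when data is exfiltrated


def step(stage, action, rng):
    """Play one move of the contest; return (next_stage, reward, done)."""
    reward = -ACTION_COST[action]
    if rng.random() < ADVANCE_PROB[action]:
        stage += 1  # the adversary moves on to the next stage
    if stage == len(STAGES) - 1:
        return stage, reward - BREACH_PENALTY, True  # exfiltration: adversary wins
    return stage, reward, False


def train(episodes=2000, max_moves=20, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning: the table q is the agent's 'memory' of past feedback."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in STAGES]
    for _ in range(episodes):
        stage = 0
        for _ in range(max_moves):  # the defender wins by surviving max_moves
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore a random action
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[stage][i])  # exploit
            nxt, reward, done = step(stage, ACTIONS[a], rng)
            target = reward if done else reward + gamma * max(q[nxt])
            q[stage][a] += alpha * (target - q[stage][a])  # fold feedback into memory
            stage = nxt
            if done:
                break
    return q


if __name__ == "__main__":
    # Show which action the trained agent prefers at each stage it can act in.
    for name, values in zip(STAGES[:-1], train()):
        print(f"{name:20s} -> {ACTIONS[values.index(max(values))]}")
```

Each row of the table holds the agent's running estimate of how good each action is at a given stage, which is the sense in which feedback "becomes part of its memory". A deep reinforcement learning system would replace the table with a neural network so that it can cope with far larger state spaces.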
“Our algorithms operate in a competitive environment, a contest with an adversary intent on breaching the system,” says Chatterjee. “It’s a multistage attack, where the adversary can pursue multiple attack paths that can change over time as they try to go from reconnaissance to exploitation. Our challenge is to show how defences based on deep reinforcement learning can stop such an attack.”
“Our goal is to create an autonomous defence agent that can learn the most likely next step of an adversary, plan for it, and then respond in the best way to protect the system,” says Chatterjee.
Despite the progress, no one is ready to entrust cyber defence entirely to an AI system. Instead, a DRL-based cybersecurity system would need to work in concert with humans, says coauthor Arnab Bhattacharya, formerly of PNNL.
“AI can be good at defending against a specific strategy but isn’t as good at understanding all the approaches an adversary might take,” says Bhattacharya. “We are nowhere near the stage where AI can replace human cyber analysts. Human feedback and guidance are important.”