BotTokenizer

BotTokenizer: exploring network tokens of HTTP-based botnet using malicious network traces. Nowadays, malicious software and especially botnets leverage HTTP protocol as their communication and command (C&C) channels to connect to the attackers and control compromised clients. Due to its large popularity and facility across firewall, the malicious traffic can blend with legitimate traffic and remains undetected. While network signature-based detection systems and models show extraordinary advantages, such as high detection efficiency and accuracy, their scalability and automatization still need to be improved.{par}In this work, we present BotTokenizer, a novel network signature-based detection system that aims to detect malicious HTTP C&C traffic. BotTokenizer automatically learns recognizable network tokens from known HTTP C&C communications from different botnet families by using words segmentation technologies. In essence, BotTokenizer implements a coarse-grained network signature generation prototype only relying on Uniform Resource Locators (URLs) in HTTP requests. Our evaluation results demonstrate that BotTokenizer performs very well on identifying HTTP-based botnets with an acceptable classification errors.