Growing system complexity and newly emerging security threats make it a constant challenge to keep IT systems secure and to detect or prevent attacks. Recently, data-driven techniques based on machine learning have emerged as a promising direction for flexible, “intelligent” IT security systems that learn to detect threats, attacks, or fraud from large-scale data. In contrast to more traditional approaches, which rely largely on manual analysis by security experts, data-driven approaches can in some scenarios adapt more quickly, exploit subtle patterns in data that human experts do not easily recognize, and reduce the workload on security experts through automation.
The lecture studies different threats and tasks in IT security (such as filtering malicious email messages, detecting malicious executable files, discovering security vulnerabilities in source code, or detecting fraudulent activity). We discuss how such tasks can be cast as machine learning problems, the process of data collection and data representation, and appropriate machine learning techniques for solving these tasks.
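As a minimal illustration of what casting such a task as a machine learning problem can look like, the sketch below treats email filtering as binary classification: each message is represented as a bag of words, and a naive Bayes classifier with add-one smoothing is trained on labeled examples. The training messages, the function names, and the choice of naive Bayes are assumptions made for this example only; they are not taken from the lecture material, and a realistic system would need far more data and a more careful representation.

```python
import math
from collections import Counter

# Toy, made-up training data (hypothetical): label 1 = malicious, 0 = benign.
TRAIN = [
    ("win a free prize now", 1),
    ("free money click now", 1),
    ("claim your free prize", 1),
    ("meeting agenda for monday", 0),
    ("lecture notes for monday", 0),
    ("project meeting notes attached", 0),
]


def tokenize(text):
    """Data representation step: lowercase the text and split on whitespace."""
    return text.lower().split()


def train(examples):
    """Count words per class and messages per class; collect the vocabulary."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def log_score(text, label, model):
    """Log prior plus summed log likelihoods, with add-one (Laplace) smoothing."""
    word_counts, class_counts, vocab = model
    total_words = sum(word_counts[label].values())
    score = math.log(class_counts[label] / sum(class_counts.values()))
    for word in tokenize(text):
        if word in vocab:  # words never seen in training are ignored
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocab)))
    return score


def predict(text, model):
    """Return the label (0 or 1) with the higher score."""
    return max((0, 1), key=lambda label: log_score(text, label, model))


model = train(TRAIN)
print(predict("free prize now", model))
print(predict("meeting notes monday", model))
```

The same pattern of collecting labeled data, choosing a representation, and fitting a model carries over in principle to the other tasks mentioned above, although the representation of an executable file or a piece of source code differs considerably from a bag of words.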