Bachelor or Master thesis: Dynamic Analysis for Parser Detection in Binaries

Description

Parsers for network protocols, file formats, and digital signatures in binary executables and firmware are particularly exposed components of any software system, as they are usually the first layer of the attack surface. Automatically discovering such parsers would enable security analysts to comprehensively understand the communication protocols, data formats, and state machines that govern a device’s interactions. With this knowledge, security experts can uncover hidden vulnerabilities, map attack surfaces, and develop targeted exploits that demonstrate them, which ultimately helps to improve the device's security posture.
Moreover, in the context of fuzzing, knowledge of parsers plays a pivotal role in crafting practical test cases. Understanding the parsing logic makes it possible to generate inputs that systematically explore different code paths and uncover potential weaknesses and edge cases. This precision in testing saves time and resources and makes the security assessment more efficient. In addition, knowing the exact location of a parser enables automatic harness generation, allowing a fuzzer to fuzz a protocol parser without additional human interaction.

Goal

In this thesis, you will develop a utility that dynamically analyzes a given binary for parsers of a predefined set of protocols and file formats. The exact way your tool will operate is up to you, but we have some ideas to get you started.
You will then define multiple metrics to measure the characteristics of your utility, such as performance and correctness. Using these metrics, you will evaluate your tool against existing solutions.
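
As one possible starting point for a correctness metric, parser detection can be scored like a retrieval task: compare the function addresses your tool reports against a labeled ground truth and compute precision, recall, and F1. The sketch below is only an illustration under that assumption. The function name `detection_metrics` and all addresses are hypothetical, and the thesis may well settle on different metrics.

```python
def detection_metrics(detected, ground_truth):
    """Score reported parser entry points against a labeled ground truth.

    Both arguments are iterables of function addresses (assumed format).
    """
    detected, ground_truth = set(detected), set(ground_truth)
    true_positives = len(detected & ground_truth)

    # Precision: share of reported functions that really are parsers.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of real parsers that the tool found.
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    # F1: harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0

    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical addresses, for illustration only.
reported = {0x401000, 0x401200, 0x402000}
labeled = {0x401000, 0x402000, 0x403000}
metrics = detection_metrics(reported, labeled)
print(metrics)
```

Performance could be measured alongside this, for example as the wall-clock overhead of the dynamic analysis relative to an uninstrumented run of the same binary.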


Related Work

Automated binary analysis: A survey

Requirements

– This is a Bachelor or Master thesis topic
– Knowledge of C++ and Python
– Some experience in reverse engineering (ideally with Ghidra)