Status
Available
Description
The main goal of this thesis is the analysis of redaction tools for PDF documents. The thesis is structured around three research questions:
- Test Catalog: A comprehensive test-bed covering basic tests as well as edge cases will be created. The main goal is to establish a systematic approach for generating such test cases. In addition, a test environment covering as many redaction tools as possible will be set up.
- Test Methodology and Evaluation: First, a testing methodology will be developed that defines how a large number of PDFs can be redacted and evaluated automatically. Second, each of the collected redaction tools will be evaluated against the files in the test-bed. The results will be systematized, and discovered issues will be highlighted.
- Redacted Files in the Wild: Many websites such as “Frag den Staat” offer redacted files for download. Such files will be evaluated using the previously established methodology.
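To give an idea of what an automated evaluation step could look like, here is a minimal Python sketch. It is not the methodology of the thesis, only an illustration under simplifying assumptions: the text that should have been removed is known in advance, streams are compressed with FlateDecode only, and strings are stored literally (no hex strings or font-specific encodings). The names `STREAM_RE` and `leaked_secrets` are hypothetical.

```python
import re
import zlib

# Captures everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def leaked_secrets(pdf_bytes: bytes, secrets: list[str]) -> list[str]:
    """Return the secrets that still occur in the raw or inflated PDF data."""
    haystacks = [pdf_bytes]
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # Assumption: FlateDecode only. decompressobj() tolerates the
            # end-of-line bytes that trail the compressed data.
            haystacks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            continue  # not a zlib stream, skip it
    return [
        s for s in secrets
        if any(s.encode("latin-1") in h for h in haystacks)
    ]
```

A redacted file would be read with `Path(...).read_bytes()` and passed in together with the strings that the redaction was supposed to remove; any string returned indicates that the tool only hid the text visually. A real evaluation would additionally need a proper PDF parser to handle other filters, object streams, and text encodings.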
Challenge
Download the redacted files and analyze them carefully. Extract the hidden information and send it to vladislav.mladenov@rub.de.
Requirements
- Python
- Message-Level Security
Contact
Supervision: Christian Mainka, Vladislav Mladenov, Simon Rohlmann
Contact: vladislav.mladenov@rub.de
Start date: immediately