Advancing software fuzzing techniques through the exploration of cryptographic concepts and machine learning

Student thesis: PhD Thesis

Abstract

Modern software and networks are the backbone of our digital society, yet they are increasingly susceptible to security vulnerabilities that malicious actors can exploit. Addressing these vulnerabilities requires proactive, automated strategies to identify and mitigate risks, particularly within large-scale datasets. Fuzzing has emerged as a pivotal technique in this domain; however, traditional methods face significant challenges in deep bug discovery, input quality, and scalability. Novel mutation strategies, coupled with machine learning (ML) techniques, including advanced architectures such as Long Short-Term Memory (LSTM) networks, Generative Adversarial Networks (GANs), and Gated Recurrent Units (GRUs), offer a systematic framework for addressing these challenges. However, existing approaches often lack either mutation strategies in fuzzers or optimised ML-based models that leverage high-entropy techniques and mitigate vanishing-gradient issues, and they also lack a systematic integration of such transformations, which limits their effectiveness in diversifying input spaces and improving code coverage. This dissertation begins by categorising the integration of various ML paradigms into fuzzing, including Traditional ML (TML), Deep Learning (DL), Reinforcement Learning (RL), and Deep Reinforcement Learning (DRL), and then reviews the advancements, methodologies, and challenges in applying these paradigms to fuzzing.

Building on this foundation, we introduce novel enhancements to fuzzing mutation techniques through the integration of cryptographic structures. Specifically, substitution-permutation networks (SPNs) and Feistel networks (FNs) were embedded into the custom mutator of the AFL++ framework, termed the HonggFuzz library. This led to the development of HonggFuzz+, a new custom mutator for AFL++, which achieves improved performance in identifying software bugs and discovering new code edges through optimised search-space exploration. Preliminary experiments, focusing on the number of unique bugs identified across various targets, validate the effectiveness of these methods in diversifying memory region relationships and advancing fuzzing tool development.
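
To make the idea of a Feistel structure used as a byte-level transformation concrete, the following is a minimal sketch of a generic balanced Feistel network applied to an input buffer. It is not the HonggFuzz+ mutator: the function names, the SHA-256-based round function, and the round keys are all assumptions chosen for illustration.

```python
import hashlib


def round_function(half: bytes, round_key: int) -> bytes:
    """Hypothetical round function F: a keyed hash stretched to len(half).

    round_key is assumed to fit in 32 bits.
    """
    digest = hashlib.sha256(round_key.to_bytes(4, "little") + half).digest()
    repeats = len(half) // len(digest) + 1
    return (digest * repeats)[: len(half)]


def feistel_mutate(block: bytes, round_keys: list) -> bytes:
    """Apply one balanced Feistel round per key: (L, R) -> (R, L xor F(R))."""
    if len(block) % 2:
        raise ValueError("this sketch only handles even-length blocks")
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in round_keys:
        f_out = round_function(right, key)
        left, right = right, bytes(a ^ b for a, b in zip(left, f_out))
    return left + right


def feistel_invert(block: bytes, round_keys: list) -> bytes:
    """Undo feistel_mutate by replaying the rounds in reverse key order."""
    half = len(block) // 2
    left, right = block[:half], block[half:]
    for key in reversed(round_keys):
        f_out = round_function(left, key)
        left, right = bytes(a ^ b for a, b in zip(right, f_out)), left
    return left + right


if __name__ == "__main__":
    seed_input = b"ABCDEFGH"
    keys = [1, 2, 3, 4]  # hypothetical round keys
    mutated = feistel_mutate(seed_input, keys)
    print(mutated.hex())
    print(feistel_invert(mutated, keys) == seed_input)
```

A Feistel structure is invertible whatever round function is used, so the transformation permutes the input space rather than collapsing distinct inputs onto one another, and the output length always equals the input length.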

Subsequently, we extended our experiments to a wider range of targets and aspects using the FuzzBench benchmarking suite. We optimised Feistel-inspired transformations (Feistel swaps) for the arithmetic operations of the main mutator by integrating them directly into the AFL++ baseline rather than into custom mutators, and we enhanced the entropy of the random number generator (RNG) by incorporating the Permuted Congruential Generator (PCG) into AFL++. We present three innovative fuzzing models (CAFL++, PCGAFL++, and CPCGAFL++) that integrate Feistel-inspired transformations and unbiased RNG mechanisms into AFL++, resulting in improved code coverage and stability. This approach addresses challenges related to studying the code-coverage behaviour of targets on a large-scale benchmark and mitigates bias caused by the current RNG embedded in AFL++. Additionally, it eliminates the need for a custom mutator and streamlines the integration of cryptographic mutators.
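
For readers unfamiliar with PCG, the sketch below follows the widely published PCG32 (XSH-RR) reference algorithm in pure Python. The abstract does not say which PCG variant was integrated into AFL++, so the variant, the class name `PCG32`, the default stream value, and the `below` helper are assumptions for illustration only. The rejection loop in `below` is the standard way to avoid modulo bias when drawing a bounded value; whether this is the bias the thesis refers to is not stated in the abstract.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1
MULTIPLIER = 6364136223846793005  # 64-bit LCG multiplier used by PCG32


class PCG32:
    """Pure-Python PCG32 (XSH-RR): 64-bit LCG state, 32-bit permuted output."""

    def __init__(self, seed: int, stream: int = 54):
        # The increment must be odd; the stream id selects the sequence.
        self.state = 0
        self.inc = ((stream << 1) | 1) & MASK64
        self.next_u32()
        self.state = (self.state + seed) & MASK64
        self.next_u32()

    def next_u32(self) -> int:
        old = self.state
        # Advance the underlying linear congruential generator.
        self.state = (old * MULTIPLIER + self.inc) & MASK64
        # Output permutation: xorshift the high bits, then rotate right.
        xorshifted = (((old >> 18) ^ old) >> 27) & MASK32
        rot = old >> 59
        return ((xorshifted >> rot) | (xorshifted << ((-rot) & 31))) & MASK32

    def below(self, bound: int) -> int:
        """Return a value in [0, bound) without modulo bias."""
        if not 0 < bound <= MASK32:
            raise ValueError("bound must be in 1..2**32-1")
        # Reject the low values that would make some residues more likely.
        threshold = (1 << 32) % bound
        while True:
            value = self.next_u32()
            if value >= threshold:
                return value % bound


if __name__ == "__main__":
    rng = PCG32(seed=42)
    print([rng.below(256) for _ in range(8)])  # e.g. pick byte values
```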

Finally, neural network (NN) optimisations in multi-task fuzzing were investigated, employing techniques such as the Leaky Rectified Linear Unit (LReLU) to counteract vanishing gradients, Nesterov-accelerated Adaptive Moment Estimation (Nadam) for refined weight updates, and sensitivity analysis for model refinement. These innovations, combined with game-theoretic insights into dominant strategies, significantly improve fuzzing efficacy: evaluated against the state-of-the-art fuzzer MTFuzz, they achieve better accuracy, edge coverage, and unique bug identification on specific targets, as detailed in this thesis. This dissertation thus contributes novel methodologies and insights to advance the state of the art in software fuzzing, enhancing both effectiveness and reliability in the evolving cybersecurity landscape.
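
As a small illustration of why a leaky activation helps gradients survive, the sketch below compares the derivative of ReLU with that of Leaky ReLU when backpropagating through several units whose pre-activations are negative. The slope `alpha = 0.1` and the helper names are assumptions for illustration; the abstract does not give the values or the network used in the thesis.

```python
def leaky_relu(x: float, alpha: float = 0.1) -> float:
    """Leaky ReLU: identity for positive inputs, small slope alpha otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x: float, alpha: float = 0.1) -> float:
    """Derivative of Leaky ReLU; never exactly zero when alpha > 0."""
    return 1.0 if x > 0 else alpha


def relu_grad(x: float) -> float:
    """Derivative of plain ReLU; zero for non-positive inputs."""
    return 1.0 if x > 0 else 0.0


def chained_grad(pre_activations, grad_fn) -> float:
    """Product of activation derivatives along a chain of units.

    This is only the activation part of backpropagation (weights omitted).
    """
    product = 1.0
    for z in pre_activations:
        product *= grad_fn(z)
    return product


if __name__ == "__main__":
    pre_acts = [-0.5, 1.2, -2.0]  # hypothetical pre-activation values
    print(chained_grad(pre_acts, relu_grad))        # gradient is blocked
    print(chained_grad(pre_acts, leaky_relu_grad))  # small but non-zero
```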

Date of Award: 23 Aug 2025
Original language: English
Awarding Institution
  • University of the West of England
Sponsors: University of the West of England
Supervisors: Phil Legg (Director of Studies), Jun Hong (Second Supervisor 1) & Michail-Antisthenis Tsompanas (Second Supervisor 2)

Keywords

  • Fuzzing
  • Software Testing
  • Fuzz Testing
  • Automated Vulnerability Assessment
