Binary function similarity, which often relies on learning-based algorithms to identify which functions in a pool are most similar to a given query function, is a sought-after topic in different communities, including machine learning, software engineering, and security. Its importance stems from its impact in facilitating several crucial tasks, from reverse engineering and malware analysis to automated vulnerability detection. Whereas recent work has cast light on the performance achievable on this long-studied problem, the research landscape remains largely lacking in understanding of the resiliency of state-of-the-art machine learning models against adversarial attacks. As security requires reasoning about adversaries, in this work we assess the robustness of such models through a simple yet effective black-box greedy attack, which modifies the topology and the content of the control flow graph of the attacked functions. We demonstrate that this attack succeeds in compromising all the models, achieving average attack success rates of 57.06% and 95.81% depending on the problem setting (targeted and untargeted attacks, respectively). Our findings are insightful: top performance on clean data does not necessarily translate into top robustness, which explicitly highlights performance-robustness trade-offs one should consider when deploying such models, calling for further research.
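
For illustration, a black-box greedy attack of this kind can be pictured as a loop that repeatedly queries the target model and keeps the single perturbation that moves the similarity score the most. The sketch below is only an assumption-laden outline, not the paper's implementation: all helper names (similarity, candidate_perturbations, apply_perturbation) are hypothetical placeholders.

    # Illustrative sketch only: helper names are hypothetical placeholders,
    # not the paper's actual attack code.
    def greedy_attack(func, reference, similarity, candidate_perturbations,
                      apply_perturbation, budget=50, targeted=True):
        """At each step, keep the single semantics-preserving CFG edit that moves
        the model's similarity score the most in the attacker's desired direction
        (towards `reference` if targeted, away from it if untargeted)."""
        current = func
        for _ in range(budget):
            best_score, best_variant = None, None
            for p in candidate_perturbations(current):
                variant = apply_perturbation(current, p)  # e.g., add dead blocks/edges
                score = similarity(variant, reference)    # black-box model query
                improved = (best_score is None or
                            (score > best_score if targeted else score < best_score))
                if improved:
                    best_score, best_variant = score, variant
            if best_variant is None:
                break
            current = best_variant  # commit the best single edit and iterate
        return current
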
Publication detail
2025, 10th IEEE European Symposium on Security and Privacy
On the Lack of Robustness of Binary Function Similarity Systems (04b Conference paper in proceedings)
Capozzi Gianluca, Tang Tong, Wan Jie, Yang Ziqi, D'Elia Daniele Cono, Di Luna Giuseppe Antonio, Cavallaro Lorenzo, Querzoni Leonardo