Offensive AI Lab

Gilad Gressel

Gilad Gressel is a PhD student who focuses on the cybersecurity implications of advanced AI systems. His research examines how large language models and autonomous agents can be weaponized for social engineering, fraud, cybercrime automation, and post-release model misuse, as well as how these systems can be monitored, attributed, and defended.

On the threat side, Gilad studies the emerging use of LLMs in organized cybercrime. His USENIX 2026 paper, "Love, Lies, and Language Models: Investigating AI’s Role in Romance-Baiting Scams," investigates how LLM agents can automate long-horizon trust-building, emotional manipulation, and financial grooming in romance-baiting scams. Drawing on insider interviews, victim accounts, and a controlled long-term conversation study, the work shows how AI agents may increase both the scale and effectiveness of text-based social engineering attacks.

On the defense side, Gilad develops practical mechanisms for AI security, accountability, and governance. He is a co-author of GAVEL, published at ICLR 2026 and to be presented at Black Hat USA 2026 as "Rules for Neural Traffic: A New Defensive Layer for LLMs." GAVEL introduces rule-based activation monitoring, enabling interpretable, configurable safeguards that inspect model behavior below the surface-text level. His broader defensive work also includes PRISM, a framework for recovering the active instruction set steering an LLM from its internal activations; agent attribution methods for tracing harmful AI agents back to their operators; and adaptive evaluations showing why current defenses against malicious fine-tuning can fail under realistic adversaries.

Alongside his PhD studies, Gilad holds an M.S. in Computer Science with a specialization in Machine Learning from Georgia Tech and a B.A. in Linguistics from the University of California, Santa Cruz. His research is driven by the goal of building secure, interpretable, and accountable AI systems for the next generation of cyber defense.

Email: gresselg@post.bgu.ac.il