Venue: Hall G2 (Level -2)
Thursday, June 25
 

10:30am CEST

Scanning Agentic AI Systems: Beyond Traditional LLM Red Teaming
Thursday June 25, 2026 10:30am - 11:15am CEST
Agentic AI systems are evolving from simple LLM interfaces into autonomous, multi-agent workflows. Given their high autonomy, these systems demand detailed risk assessment, and traditional LLM-focused red teaming is no longer enough. Unlike standalone LLMs with text input and output, agentic systems interact with tools, memory, external data, and other agents, creating many new attack surfaces. Attacks may be introduced through emails, tool descriptions, or environmental content, and their impact can go beyond model responses to affect system behavior and planning, and to trigger harmful real-world actions.

In this talk, we share our hands-on journey building a comprehensive red teaming scanning solution tailored for agentic AI systems. We begin by analyzing why current scanning tools fall short, specifically their emphasis on structured components (e.g., protocols like MCP, A2A, and Skills) while overlooking the unstructured and highly dynamic attack vectors where most real-world risks emerge. We then walk through the technical challenges of simulating realistic attacks without harming production environments, handling the diversity of agent architectures, frameworks, and agency levels, and designing scanners that generalize across heterogeneous systems.

We present a practical full scanning pipeline that creates a novel holistic solution, including sandboxing and emulation strategies, automated system discovery pipelines, abstraction-based scanning mechanisms, and a risk-aware robustness scoring framework that goes beyond binary attack success. Throughout the talk, we highlight concrete lessons learned, trade-offs between cost and reliability, and real examples of agent-specific vulnerabilities.
We conclude with a concrete end-to-end scanning workflow and discuss open challenges such as adaptive scanner generation and black-box agent discovery. Attendees will leave with a deep understanding of why agentic AI requires fundamentally new red teaming methodologies and with actionable techniques for securing real-world autonomous AI systems.
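To make the idea of scoring "beyond binary attack success" concrete: a risk-aware robustness score of the kind described might weight each successful attack by the severity of what it achieved. The function below is a minimal illustrative sketch of that idea under assumed inputs, not the speakers' actual framework.

```python
def robustness_score(results):
    """Risk-aware robustness scoring sketch (illustrative, not the
    speakers' framework). Each attack attempt is a (succeeded, severity)
    pair with severity in [0, 1]. Instead of a binary attack-success
    rate, successful attacks are weighted by severity, so a jailbreak
    that merely changes wording counts less than one that triggers a
    harmful tool call."""
    if not results:
        return 1.0  # no attempts: vacuously robust
    weighted_successes = sum(sev for succeeded, sev in results if succeeded)
    return 1.0 - weighted_successes / len(results)

# A low-severity success plus a blocked high-severity attempt
# yields a high score; a single high-severity success yields zero.
print(robustness_score([(True, 0.2), (False, 1.0)]))  # ~0.9
print(robustness_score([(True, 1.0)]))                # 0.0
```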
Speakers
Roman Vainshtein

Research Director, GenAI Trust, Fujitsu Research of Europe

I am Research Director of the Generative AI Trust and Security Research team at Fujitsu Research of Europe, where I lead efforts to enhance the security, trustworthiness, and resilience of Generative AI systems. My work focuses on bridging the gap between AI security, red-teaming...
Amit Giloni

Principal Researcher, GenAI Trust team, Fujitsu Research

Dr. Amit Giloni is a Principal Researcher at Fujitsu Research of Europe, where she is part of the GenAI Trust team.
Her research spans multiple areas of machine learning, including classical ML, deep learning, generative AI, and agentic AI. She focuses on key challenges in trustworthy AI, such as bias and fairness, explainability, adversarial machine learning, robustness to abnormalities, and confidentiality...
Roy Betser

Senior Researcher, GenAI Trust team, Fujitsu Research

Roy Betser is a PhD candidate at the Technion and a senior AI security researcher at Fujitsu Research of Europe, where he is part of the GenAI Trust team. His research focuses on analyzing representation and embedding spaces in foundation models and on developing practical trust and...
  Testing

11:30am CEST

Developing Effective Security Testing Skills with Objective Structured Assessments
Thursday June 25, 2026 11:30am - 12:15pm CEST
Technical skill development and evaluation for application (software) security testers remains underdeveloped. There is no widely adopted framework defining core competencies, proficiency levels, or objective assessment criteria. In the absence of such standards, the industry has defaulted to a fragmented ecosystem of private organizations offering training and certifications that insufficiently prepare the next generation of security testers for real-world testing.

This environment disproportionately rewards those who benefit from exceptional mentorship or possess the time, resources, and aptitude for intensive self-directed learning. The popular mantra “Try Harder” reflects this culture of self-made expertise, but it also serves as a substitute for formalized training models. Further, aspiring security professionals are left to navigate this fragmented landscape largely on their own.

In contrast, more mature, life-critical disciplines that demand high levels of technical skill (such as aviation and surgery) are built upon standardized curricula, clearly defined skill progressions, and objective methods for evaluating competence. This is not by chance; over many decades, these (and related) fields have honed how to achieve optimal outcomes through evidence-based training programs and practices.

In this talk, we will examine the past, present, and prospective future of application security tester training in comparison to more mature professions that demand a high level of technical skill. We will introduce a novel framework for evaluating technical skills and demonstrate its application in combination with a comprehensive AppSec curriculum. Both the assessment framework and the curriculum will be released to the open-source community at the time of presentation.
Speakers
Ryan Armstrong

AppSec Manager, Tester, and Teacher, Digital Boundary Group (DBG)

Ryan Armstrong is the Manager of Application Security Services at Digital Boundary Group (DBG). Ryan began with DBG as an application penetration tester and security consultant following completion of his PhD in Biomedical Engineering at Western University in 2016. With a passion...

2:15pm CEST

This Build can Break You - Evil Runners and eBPF for Detection
Thursday June 25, 2026 2:15pm - 3:00pm CEST
CI/CD pipelines play an important role in modern software development. From a security perspective, this methodology contributes to more secure products, as automated checks can be applied on every run. Developers define tasks in a metadata file, and the system executes the defined jobs automatically. But what if the build chain itself becomes the security problem, allowing attackers to manipulate artifacts or take control of backend infrastructure? Let’s take a deep dive into “Poisoned Pipeline Execution” (OWASP CICD-SEC-4).

Builds are typically carried out in multiple steps using Runners—agents that pick up jobs and execute build instructions. These instructions, such as compiling a program or building a container image, are usually performed inside containers. Containers may provide isolation, but their security effectiveness strongly depends on the Runner’s configuration. Attackers can abuse Runners to execute arbitrary commands, leading to information disclosure or privilege escalation. While such attacks are well documented, effective detection mechanisms are often lacking.

Any viable detection method must be independent of the source code, language-agnostic, and container-friendly. The eBPF technology, which enables tracing of kernel-level activity, is well suited for this purpose. In this talk, we explore security vulnerabilities in CI Runners, how they become targets for attackers, and how malicious activities can be detected using eBPF.
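As a flavor of what such detection can look like: an eBPF program (for instance, a tracepoint on process execution) streams exec events from inside build containers to userspace, where a policy layer flags anomalies. The sketch below shows only a userspace policy check with an entirely hypothetical deny-list; it is not the speaker's tooling, and a real deployment would tailor both the event source and the rules to its Runners.

```python
# Illustrative userspace policy check for exec events emitted by an
# eBPF tracepoint (e.g. on sys_enter_execve). The deny-list and event
# shape are hypothetical assumptions, not from the talk.
SUSPICIOUS_BINARIES = {"nc", "ncat", "socat"}

def flag_exec_event(argv):
    """Return True if an exec event observed inside a build container
    looks like post-exploitation activity rather than a build step."""
    if not argv:
        return False
    binary = argv[0].rsplit("/", 1)[-1]  # strip the directory path
    if binary in SUSPICIOUS_BINARIES:
        return True
    # Bash reverse-shell pattern: /dev/tcp/ redirection in the arguments
    return "/dev/tcp/" in " ".join(argv)

print(flag_exec_event(["/usr/bin/nc", "-e", "/bin/sh"]))  # True
print(flag_exec_event(["/usr/bin/gcc", "-c", "main.c"]))  # False
```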
Speakers
Reinhard Kugler

Principal Security Consultant, SBA Research

Reinhard’s focus lies on security testing of IT and industrial cyber-physical systems. Based on his prior experience in cyber defense, he works with companies to develop security capabilities and secure products. Reinhard is an experienced instructor and develops tailored security...

3:30pm CEST

Boiling the Ocean for Signal: Lessons from High-Volume OSS Malware Detection
Thursday June 25, 2026 3:30pm - 4:15pm CEST
Malicious open source packages are on the rise, targeting more and more ecosystems. And while open source maintainers and users struggle to secure the immense attack surface of today’s software development practice, attackers continue to evolve their techniques.

This talk presents lessons learned from developing and operating an end-to-end malware detection pipeline in an enterprise setup that automatically scans tens of thousands of packages a day, followed by human review of reported malware. It provides an overview of fundamental design decisions, starting from a suitable classification scheme and the selection of meaningful signals with a high signal-to-noise ratio, to the compilation of Indicators of Compromise and the final reporting of confirmed malicious packages to the respective registries and third-party databases like OSV. The individual sections and learnings will be motivated and illustrated through real-world samples as well as descriptive statistics obtained from our system.

Session attendees will learn about:
- Latest open source malware trends,
- common evasion techniques used by attackers, from encoding techniques, code transformations and payload splitting to prompt instructions aiming to sabotage LLM-based detectors,
- the shortcomings of current malware datasets in regard to supporting developers in the evaluation of malware scanners, e.g., the lack of accompanying metadata and qualitative descriptions,
- the importance and complementarity of code and metadata-based detection signals,
- requirements and design decisions for an end-to-end OSS malware scanner, e.g., the realization that a binary classification benign/malicious is not colorful enough for the breadth of software distributed through OSS registries like npm or PyPI, and
- descriptive statistics obtained from our system, showing the prevalence of techniques used in the wild, e.g., the prevalence of different malware triggers and targeted platforms.

As such, the presentation targets both open source users interested in the latest malware trends and safeguards, as well as builders wanting to create an end-to-end OSS scan pipeline, e.g., because their ecosystem is already targeted by attackers but not yet or not sufficiently covered by state-of-the-art scanners.
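To make one of the listed evasion techniques concrete, consider layered encoding: malicious packages frequently hide their payload under one or more rounds of base64. The scanner-side sketch below, which surfaces such strings, is purely illustrative and not the pipeline presented in the talk; the regex threshold and recursion depth are arbitrary assumptions.

```python
import base64
import re

# Long runs of base64-alphabet characters are candidate encoded payloads.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")

def decoded_layers(source, max_depth=3):
    """Recursively decode base64-looking literals in source code.
    A common OSS-malware obfuscation is layering several rounds of
    encoding over the real payload; decoding candidates recursively
    up to max_depth surfaces the innermost strings for inspection."""
    found = []
    frontier = [source]
    for _ in range(max_depth):
        next_frontier = []
        for text in frontier:
            for match in B64_CANDIDATE.finditer(text):
                try:
                    decoded = base64.b64decode(
                        match.group(0), validate=True
                    ).decode("utf-8")
                except (ValueError, UnicodeDecodeError):
                    continue  # not valid base64, or not text: skip
                found.append(decoded)
                next_frontier.append(decoded)  # scan the decoded layer too
        frontier = next_frontier
    return found
```

A plain `print('hello')` yields nothing, while a doubly-encoded download-and-execute string is unwrapped layer by layer.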
Speakers
Henrik Plate

Security Researcher, Endor Labs

In his current position, Henrik aims at improving the security of today’s software supply chains, and in particular the secure consumption of open source. He formerly worked for SAP Security Research, where he led the focus topic "open source security" starting in 2014. He co-authored...
 
Friday, June 26
 

10:30am CEST

Your Localhost Is Lying to You: Trust Boundary Failures in Enterprise SSO
Friday June 26, 2026 10:30am - 11:15am CEST
When an attacker lands on a user’s machine, your SSO should not hand them the keys to your network. Yet many enterprise systems do because they assume localhost subdomains are safe. They are not.

This talk shows how a common DNS misconfiguration (localhost.target.com → 127.0.0.1), combined with domain-wide cookies (Domain=.target.com), allows a locally executed request context to inherit an authenticated session. No XSS. No phishing. Just browser-native behavior.
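The two ingredients are easy to check for yourself: a subdomain that resolves to a loopback address, and a parent-domain cookie scope that includes it. The sketch below illustrates both checks using the talk's `localhost.target.com` example; it is a self-assessment aid, not tooling from the session.

```python
import ipaddress
import socket

def resolves_to_loopback(hostname):
    """True if DNS maps this hostname to 127.0.0.0/8 or ::1 -- the
    localhost.<domain> misconfiguration described above."""
    try:
        addr = socket.gethostbyname(hostname)
    except OSError:
        return False
    return ipaddress.ip_address(addr).is_loopback

def cookie_sent_to(cookie_domain, hostname):
    """Browser cookie-scope rule: a Domain=.target.com cookie is sent
    to the parent domain and every subdomain -- including one that
    resolves to 127.0.0.1, where any local process can listen."""
    parent = cookie_domain.lstrip(".")
    return hostname == parent or hostname.endswith("." + parent)

# If both conditions hold for localhost.target.com, an authenticated
# session cookie reaches whatever is bound to 127.0.0.1:
#   resolves_to_loopback("localhost.target.com")
#   cookie_sent_to(".target.com", "localhost.target.com")
```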

This flaw is rarely detected by scanners or standard penetration tests, yet it appears in real enterprise deployments today. The session presents a practical testing methodology, a defensive checklist, and research-based validation techniques to assess this class of trust boundary failure safely.

Attendees will leave able to identify and fix this issue in their own SSO deployments next week.
Speakers
Rupesh Kumar

Application Security Researcher | Red Team Practitioner

Rupesh Kumar is an offensive security researcher with 1.5 years of experience in web application testing, vulnerability research, and red team operations. He has reported critical and high-severity vulnerabilities to organizations across government, defense, healthcare, and critical...

11:30am CEST

Effort is All You Need: Testing LLM Applications in the Real World
Friday June 26, 2026 11:30am - 12:15pm CEST
Security testing of GenAI systems is often reduced to "LLM red teaming": probing a model in isolation to see what unsafe/offensive content it will generate. In practice, this approach falls short. As security practitioners, we need to assess complete LLM application use cases, focusing on how inputs and outputs propagate through application logic and enable concrete security risks such as data exfiltration, cross-site scripting, and authorization bypass.

In this talk, we share practical experience and supporting open-source tooling we developed for assessing LLM applications. These focus on testing systems where the LLM is embedded in application logic rather than exposed as a simple inference endpoint.

The talk covers approaches for testing non-conversational GenAI workflows, WebSockets, and custom APIs; building scoped prompt injection datasets aligned with application logic and engagement constraints; applying effort-based jailbreak techniques (e.g. anti-spotlighting, best-of-n, crescendo, ...) to evaluate guardrail robustness and demonstrate practical bypasses; and conducting meaningful testing in isolated or air-gapped environments.
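As one concrete example of the effort-based techniques mentioned, best-of-n jailbreaking repeatedly perturbs a prompt and submits each variant until one slips past the guardrail. A minimal sketch of the variant generator is below; only random capitalization is shown (the published technique also shuffles and substitutes characters), and the submission loop against the target is left abstract.

```python
import random

def best_of_n_variants(prompt, n, seed=0):
    """Generate n randomly perturbed variants of a prompt, best-of-n
    style (illustrative sketch; only random capitalization shown).
    In an assessment, each variant would be submitted to the target
    application until one bypasses the guardrail or the budget n is
    exhausted."""
    rng = random.Random(seed)  # fixed seed for reproducible test runs
    variants = []
    for _ in range(n):
        variants.append(
            "".join(ch.upper() if rng.random() < 0.5 else ch.lower()
                    for ch in prompt)
        )
    return variants
```

The perturbations preserve the prompt's meaning for the model while changing its surface form, which is what defeats brittle string- or classifier-based guardrails.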

Speakers
Donato Capitella

Principal Security Consultant, Reversec

Donato Capitella is a Software Engineer and Principal Security Consultant at Reversec, with over 15 years of experience in offensive security and software engineering. Donato spent the past 3 years conducting research and assessments on Generative AI applications, covering topics...
Thomas Cross

Security Consultant, Reversec


1:15pm CEST

What Our Pen Tests Never Found — And How Attackers Did
Friday June 26, 2026 1:15pm - 2:00pm CEST
Penetration testing is a crucial part of application security practices, yet attackers often succeed in ways no test ever reported. No injection, no memory corruption, no failed authentication. The applications behaved exactly as designed — and that was enough.

In this talk, we will explore what penetration testing is intended to detect and how attackers actually compromise systems. We will address why well-scoped penetration testing frequently revealed "no critical findings" while attackers later leveraged legitimate workflows, permission assumptions, and trust boundaries to cause serious harm.

Based on real-world examples and post-incident analysis, this talk will walk through security issues that were frequently overlooked during testing, not because testers lacked skill, but because the testing process made assumptions that attackers did not follow. We will focus on examining the blind spots in the penetration testing process, which include behaviors that only appear in production, cross-feature chaining, abuse of business logic, and trust assumptions built into system architecture.

The objective of this talk is not to replace penetration testing, but to understand where it ends and how defenders might adapt their testing tactics accordingly. We will break down the classes of issues pen tests routinely miss, how attackers discover them post-deployment, and what changed when testing strategies shifted from endpoint coverage to adversary-aware validation.

Attendees will leave with practical techniques to evolve their AppSec testing without increasing cost or abandoning penetration testing.
Speakers
Ramya M

Application Analyst, Okta, Inc.

Ramya M is a cybersecurity professional, currently working at Okta, Inc., specializing in application security, product security, identity security, and secure SDLC automation. She has led enterprise-scale initiatives across secure coding, DevSecOps hardening, vulnerability triage...

2:15pm CEST

Trust No History: Why Every "Remembered" Interaction is a Potential Backdoor
Friday June 26, 2026 2:15pm - 3:00pm CEST
As AI transitions from stateless tools to autonomous agents, the context window has become the primary attack surface. By giving agents the ability to remember, summarize, and collaborate, we have created a machine that can be gaslit. This session moves beyond transient prompt injections into the realm of persistent memory corruption. We explore how an adversary can rewrite an agent’s history, bias its knowledge base, and plant sleeper instructions that trigger long after the initial interaction. We will dissect the systematic subversion of the agentic memory stack and demonstrate why developers must stop treating agent memory as a passive data store and start defending it as the engine of the agent’s survival.
Speakers
Rico Komenda

Senior Security Consultant

Rico is a senior product security engineer. His main security areas are in application security, cloud security, offensive security and AI security.

For him, general security intelligence in various aspects is a top priority. Today’s security world is constantly changing and you...
Barno Kaharova

Senior Consultant, AI Security Expert, adesso SE

Barno is an expert specializing in data engineering, data modeling, and machine learning security. Driven by a passion for innovation, she develops cutting-edge methodologies to protect AI systems from adversarial threats, pushing the boundaries of what’s possible in AI security...

3:30pm CEST

Rewriting DAST Playbook: AI Agents and the Future of Web App Security
Friday June 26, 2026 3:30pm - 4:15pm CEST
The landscape of DAST (Dynamic Application Security Testing) tools is evolving to address modern web application complexities. While these tools are effective at detecting classic vulnerabilities like injection flaws, misconfigurations, and broken access control, they struggle with JavaScript-heavy SPAs, complex workflows, file upload/download analysis, and second-order vulnerabilities. To improve, modern DAST solutions are beginning to integrate AI-driven agentic browsers (e.g., Playwright + AI), out-of-band payloads, timing-based testing, and workflow-aware automation to better simulate real user behavior and detect deeper, context-sensitive issues.
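One of the hard parts hinted at here, crawling JavaScript-heavy SPAs, comes down to recognizing when two rendered views are "the same state". Below is a crude structural fingerprint sketched under the assumption that text content changes while layout defines the state; it is illustrative and not any particular scanner's approach.

```python
import hashlib
import re

def state_fingerprint(dom_html):
    """Fingerprint an SPA view by its opening-tag structure, ignoring
    text content, so a workflow-aware crawler (e.g. one driving an
    agentic browser via Playwright) can treat two views that differ
    only in displayed data as a single state and avoid re-testing it."""
    tags = re.findall(r"<\s*([a-zA-Z][a-zA-Z0-9-]*)", dom_html)
    structure = " ".join(t.lower() for t in tags)
    return hashlib.sha256(structure.encode()).hexdigest()[:16]
```

Two views with identical markup but different text hash identically, while a structurally different view gets a new fingerprint; real scanners typically combine such structural signals with URL and event-state information.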
Speakers
Divyansh Jain

Application Security Analyst, Checkmarx Ltd.

Divyansh Jain is a passionate security engineer with experience in building and enhancing automated vulnerability scanners, focusing on issues like IDOR, broken access control, and authentication flaws. He has contributed extensively to open-source security tools, improved detection...
Aditya Dixit

Application Security Analyst, Checkmarx Ltd.

Security Analyst with a hybrid background in software engineering, artificial intelligence, and cybersecurity. Experienced in developing AI/ML solutions and now focused on securing intelligent systems against emerging threats. Areas of interest include application security, adversarial...
  Testing
 