Publications

Preprint
Evaluation Guidelines for Empirical Studies in Software Engineering involving LLMs
Sebastian Baltes, Florian Angermeir, Chetan Arora, Marvin Muñoz Barón, Lukas Böhme, Fabio Calefato, Chunyang Chen, Neil Ernst, Davide Falessi, Brian Fitzgerald, Davide Fucci, Marcos Kalinowski, Stefano Lambiase, Mircea Lungu, Lutz Prechelt, Paul Ralph, Christoph Treude, Stefan Wagner
preprint, 2025
Large language models (LLMs) are increasingly being integrated into software engineering (SE) research and practice, yet their non-determinism, opaque training data, and evolving architectures complicate the reproduction and replication of empirical studies. We present a community effort to scope this space, introducing a taxonomy of LLM-based study types in SE together with eight guidelines for designing and reporting empirical studies involving LLMs. The guidelines present essential (MUST) criteria as well as desired (SHOULD) criteria and target transparency throughout the research process: declaring LLM usage and role; reporting model versions, configurations and fine-tuning; documenting tool architectures; disclosing prompts and interaction logs; using human validation; employing an open LLM as a baseline; reporting suitable baselines, benchmarks, and metrics; and articulating limitations and mitigations. Our goal is to enable reproducibility and replicability despite LLM-specific barriers to open science. We maintain the study types and guidelines online as a living resource for the community to use and shape (see llm-guidelines.org).
Conference
Towards Automated Continuous Security Compliance
Florian Angermeir, Jannik Fischbach, Fabiola Moyón, Daniel Mendez
ESEM, 2024
Continuous Security Compliance is crucial for adopting Continuous Software Engineering in highly regulated domains, but traditional manual compliance methods are resource-intensive and error-prone, and research on the topic remains scarce. This paper defines Continuous Security Compliance, outlines key challenges through a tertiary study, and proposes a research roadmap developed in collaboration with an industry partner to advance automation in this area.
Preprint
No Free Lunch: Research Software Testing in Teaching
Michael Dorner, Andreas Bauer, Florian Angermeir
preprint, 2024
This study explores the integration of research software testing into teaching, demonstrating that such efforts can improve research software quality—particularly documentation and dependency management—while also exposing students to real-world research software engineering. However, despite thoughtful student contributions, challenges such as unclear intellectual property rights and lack of incentives hinder the direct reuse of student code, limiting the full potential of this approach.
Conference
Automated Security Findings Management: A Case Study in Industrial DevOps
Markus Voggenreiter, Florian Angermeir, Fabiola Moyón, Ulrich Schöpp, Pierre Bonvin
ICSE-SEIP, 2024
Through an industrial case study, we examine how automated security findings management can be integrated into DevOps workflows. Our findings show significant improvements in vulnerability response times and developer productivity.
Conference
Industrial Challenges in Secure Continuous Development
Fabiola Moyón, Florian Angermeir, Daniel Mendez
ICSE-SEIP, 2024
The intersection of security and continuous software engineering remains a critical focus as agile and DevOps practices continue to shape development processes, with growing academic and practical interest in secure methodologies. This work summarizes validated challenges identified through practitioner engagement and outlines four key research directions to guide future efforts in scalable, secure continuous software engineering.
Conference
Enterprise-Driven Open Source Software: A Case Study on Security Automation
Florian Angermeir, Markus Voggenreiter, Fabiola Moyón, Daniel Mendez
ICSE-SEIP, 2021
This study investigates the integration of automated security activities within CI pipelines of enterprise-driven open source projects, revealing that such practices are rare despite maintainers recognizing the importance of security. By analyzing over 8,000 repositories and surveying project maintainers, the research highlights a significant gap in security automation and suggests areas for practical improvement and further study.
Technical Report
RefA: Reference Architecture for Security-compliant DevOps
Fabiola Moyón, Daniel Mendez, Tony Gorschek, Florian Angermeir, Pierre-Louis Bonvin, Markus Voggenreiter
Technical Report, 2023
This report introduces RefA, a reference architecture designed to support security-compliant DevOps by outlining relevant artefacts, practice areas, and the roles of people, processes, and technology. Developed through standards analysis, literature review, and industrial experience, RefA serves both practitioners aiming to assess secure DevOps lifecycles and researchers shaping future security-focused studies.