Publications

Conference

Reflections on the Reproducibility of Commercial LLM Performance in Empirical Software Engineering Studies

Florian Angermeir, Maximilian Amougou, Mark Kreitz, Andreas Bauer, Matthias Linhuber, Davide Fucci, Daniel Mendez, Tony Gorschek

ICSE, 2026

The paper finds that reproducibility in LLM-centric empirical software engineering studies is very poor: of 69 papers using OpenAI models, only 5 were runnable, and none of those reproduced fully. It concludes that missing or incomplete artefacts, undocumented experimental details, dependency issues, and deprecated commercial models undermine reproducibility, and that current ACM artefact badges do not reliably indicate reproducible research.

Preprint DOI

Preprint

Evaluation Guidelines for Empirical Studies in Software Engineering involving LLMs

Sebastian Baltes, Florian Angermeir, Chetan Arora, Marvin Muñoz Barón, Lukas Böhme, Fabio Calefato, Chunyang Chen, Neil Ernst, Davide Falessi, Brian Fitzgerald, Davide Fucci, Marcos Kalinowski, Stefano Lambiase, Mircea Lungu, Lutz Prechelt, Paul Ralph, Christoph Treude, Stefan Wagner

Preprint, 2025

This paper introduces community-developed guidelines for conducting and reporting empirical software engineering studies involving large language models (LLMs). It presents a taxonomy of study types and eight key recommendations to improve transparency, reproducibility, and replicability, covering areas such as model documentation, prompt reporting, tool architecture, and human validation.

Preprint DOI

Conference

Towards Automated Continuous Security Compliance

Florian Angermeir, Jannik Fischbach, Fabiola Moyón, Daniel Mendez

ESEM, 2024

Continuous Security Compliance is crucial for adopting Continuous Software Engineering in highly regulated domains, but traditional manual compliance methods are resource-intensive and error-prone, and the field lacks sufficient research. This paper defines continuous security compliance, outlines key challenges through a tertiary study, and proposes a research roadmap developed in collaboration with an industry partner to advance automation in this area.

Preprint DOI

Preprint

No Free Lunch: Research Software Testing in Teaching

Michael Dorner, Andreas Bauer, Florian Angermeir

Preprint, 2024

This study explores the integration of research software testing into teaching. It demonstrates that such efforts can improve research software quality, particularly documentation and dependency management, while also exposing students to real-world research software engineering. However, despite thoughtful student contributions, unclear intellectual property rights and a lack of incentives hinder the direct reuse of student code, limiting the full potential of this approach.

Preprint DOI

Conference

Automated Security Findings Management: A Case Study in Industrial DevOps

Markus Voggenreiter, Florian Angermeir, Fabiola Moyón, Ulrich Schöpp, Pierre Bonvin

ICSE, 2024

Through an industrial case study, we examine how automated security findings management can be integrated into DevOps workflows. Our findings show significant improvements in vulnerability response times and developer productivity.

Preprint DOI

Conference

Industrial Challenges in Secure Continuous Development

Fabiola Moyón, Florian Angermeir, Daniel Mendez

ICSE, 2024

The intersection of security and continuous software engineering remains a critical focus as agile and DevOps practices continue to shape development processes, with growing academic and practical interest in secure methodologies. This work summarizes validated challenges identified through practitioner engagement and outlines four key research directions to guide future efforts in scalable, secure continuous software engineering.

Preprint DOI

Technical Report

RefA: Reference Architecture for Security-compliant DevOps

Fabiola Moyón, Daniel Mendez, Tony Gorschek, Florian Angermeir, Pierre-Louis Bonvin, Markus Voggenreiter

Technical Report, 2023

This report introduces RefA, a reference architecture designed to support security-compliant DevOps by outlining relevant artefacts, practice areas, and the roles of people, processes, and technology. Developed through standards analysis, literature review, and industrial experience, RefA serves both practitioners aiming to assess secure DevOps lifecycles and researchers shaping future security-focused studies.

Preprint

Conference

Enterprise-Driven Open Source Software: A Case Study on Security Automation

Florian Angermeir, Markus Voggenreiter, Fabiola Moyón, Daniel Mendez

ICSE, 2021

This study investigates the integration of automated security activities within CI pipelines of enterprise-driven open source projects, revealing that such practices are rare despite maintainers recognizing the importance of security. By analyzing over 8,000 repositories and surveying project maintainers, the research highlights a significant gap in security automation and suggests areas for practical improvement and further study.

Preprint DOI
