1st International Workshop on Mining Software Repositories for Software Architecture (MSR4SA’21)

  • Co-located with ECSA 2021
  • Virtual
  • September, 2021

Theme & Goals

Mining software repositories (MSR) has become essential to supporting several software architectural design activities, such as architectural recovery, architectural knowledge capturing, and mining architectural tactics. Nevertheless, applying MSR to support software architecture (SA) is a challenging task that requires expertise and focus. Moreover, MSR and SA are two separate communities with a small intersection. On the one hand, SA researchers are concerned with proposing approaches to support architectural activities. On the other hand, the MSR community exploits different software repositories and systematically applies many data mining techniques.

The goal of MSR4SA is to gather researchers and practitioners from both the MSR and SA communities to create the starting point for a dedicated research community that focusses on applying MSR techniques to resolve architectural problems. Moreover, we aim to create the first research agenda for the area of MSR for SA.

Topics of Interest

MSR4SA 2021 seeks contributions addressing, but not limited to, the following topics related to MSR and software architecture:

  • Analysis of challenges in applying MSR for SA
  • Datasets for software architecture
  • Machine learning techniques for mining software architectures
  • Natural language processing for software architecture
  • Software architecture in DevOps
  • Qualitative studies on software repositories, regarding software architecture
  • Use cases and case studies on applying MSR for SA
  • Processes and tools supporting the development of approaches to MSR for SA
  • State-of-the-art analysis which relates MSR and SA
  • Best practices and lessons learned for applying MSR for SA
  • Mining and integrating multiple data sources for software architecture
  • Mining software repositories for assessing architectural technical debt
  • Mining software repositories for architectural analysis
  • Mining architectural tactics
  • Mining architectural patterns and styles
  • Mining reference architectures
  • Architecture visualization from software repositories
  • Integration of structured and unstructured data for software architecture
  • Mining architectural knowledge, decisions, and rationale
  • Mining software repositories for non-functional quality attributes
  • Mining software ecosystems and systems-of-systems
  • Foundational principles of MSR for SA
  • Mining architecturally-relevant requirements
  • MSR and architecture description languages
  • Mining architecture views and viewpoints
  • Mining microservice-based systems
  • Systematic literature reviews on MSR for SA
  • Empirical studies on MSR for SA
  • Industrial experiences and challenges on MSR for SA
  • Model-Driven Engineering techniques for MSR for SA

Keynote

We are happy to host the following keynote talk.

Nenad Medvidovic - Professor of Computer Science, University of Southern California

Mining Architectural Information to Stem Technical Debt - Software engineers tend to document their systems’ architectures sporadically and superficially. Additionally, engineers frequently neglect to carefully consider the architectural impact of their changes to a system’s implementation. As a result, an existing system’s architecture will over time deviate from the engineers’ intent, and it will decay through unplanned introduction of new and/or invalidation of existing design decisions. Technical debt accumulates through architectural decay, increasing the cost of making modifications to a system and decreasing the system’s dependability. In this talk, I will focus on isolating three types of architectural information from the details readily available about a system’s implementation: architectural design decisions, change, and decay. I will show how this information can be used to identify the locations in a software system’s implementation that reflect the architectural decay, the points in a system’s lifetime when that decay tends to occur, and the reasons why it occurs. I will show how architectural decay tends to correlate with the occurrence of commonly reported implementation-level issues, and how it can be predicted. Finally, I will identify steps that engineers can take to manage the accumulated technical debt by stemming and reversing the decay. Data obtained by analyzing hundreds of versions of several well-known systems — Android, Hadoop, Cassandra, Struts, etc. — will be used to illustrate the discussion throughout.

Workshop Program

  • Ilaria Pigazzini, Davide Foppiani and Francesca Arcelli Fontana. Two different facets of architectural smells criticality: an empirical study - Architectural smells (AS) represent symptoms of problems at the architectural level that have an impact on architectural debt. It is important to identify the most critical ones among them, so that developers can prioritize them for removal. In order to evaluate the criticality of AS, in this paper we consider two facets: the PageRank metric, to assess the centrality of a smell in a project, and Severity, a metric to estimate the cost-solving of smells. We have proposed these two metrics in a previous work and here we perform an empirical analysis of the evolution and correlation of these metrics in the version history of 10 projects (at least 22 versions each, 264 projects in total). The analysis of the evolution is useful in order to identify which architectural smell types tend to become more critical. The analysis of the correlation is useful to study whether the criticality of a smell has an influence on how much it costs to remove it, and vice versa.
  • Carlos Paradis, Rick Kazman. Design Choices in Building an MSR Tool: The Case of Kaiaulu - Background: Since Alitheia Core was proposed and subsequently retired, tools that support empirical studies of software continue to be proposed, such as Codeface, Codeface4Smells, GrimoireLab and SmartSHARK, but they all make different design choices with overlapping functionality. Aims: We seek to understand the design decisions adopted in these tools, good and bad, and their consequences, to understand why their authors reinvented functionality already present in other tools, and to help inform the design of future tools. Method: We used action research to evaluate the tools, and determine principles and anti-patterns to motivate a new tool design. Results: We identified 7 major design choices among the tools: 1) Abstraction Debt, 2) the use of Project Configuration Files, 3) the choice of Batch or Interactive Mode, 4) Minimal Paths to Data, 5) Familiar Software Abstractions, 6) Licensing and 7) the Perils of Code Reuse. Building on the observed good and bad design decisions, we created our own architecture and implemented it as an R package. Conclusions: Tools should not require onerous setup for users to obtain data. Authors should consider the conventions and abstractions used by their chosen language and build upon these instead of redefining them. Tools should encourage best practices in experiment reproducibility by leveraging self-contained and readable schemas that are used for tool automation, and reuse must be done with care to avoid depending on dead code.

Submission Guidelines

Prospective participants are invited to submit:

  • Research papers presenting novel contributions on mining software repositories for software architecture (max. 10 pages)
  • Industry papers that show the application and challenges of MSR techniques for SA (max. 6 pages)
  • Dataset and tool papers on MSR for SA (max. 6 pages)

Workshop papers must follow the two-column CEUR-ART style Format and Submission Guidelines. All submitted papers will be reviewed on the basis of technical quality, relevance, significance, and clarity by the program committee. All workshop papers should be submitted electronically in PDF format through the EasyChair workshop website. All accepted papers will be published at CEUR. After the workshop, we will organize post-proceedings of selected and extended papers of workshops that will be published in a Springer LNCS volume (16 to 18 pages). Candidate papers for post-proceedings will be identified by the reviewers and the workshop chairs, based on the paper quality and its relevance to the workshop. Workshop papers submitted for the post-proceedings will undergo an additional review cycle.

In addition to the workshop publications, we aim to publish the first research agenda for MSR for SA. This will be based on the discussions within the MSR4SA 2021 workshop, and thus all workshop participants will be included as authors in a joint publication.

Important Dates

  • Paper submissions due: July 02, 2021 (AoE time; extended from June 25, 2021)
  • Notification to authors: July 22, 2021 (AoE time)
  • Camera-ready copies due: July 29, 2021 (AoE time)

Organizing Committee

Program Committee

  • Michel Albonico, Vrije Universiteit Amsterdam (The Netherlands), Federal University of Technology - Parana, UTFPR (Brazil)
  • Muhammad Ali Babar, University of Adelaide, Australia
  • Mauricio Aniche, Delft University of Technology, The Netherlands
  • Francesca Arcelli Fontana, University of Milano-Bicocca, Italy
  • Paris Avgeriou, University of Groningen, The Netherlands
  • Magiel Bruntink, Software Improvement Group, The Netherlands
  • Barbora Buhnova, Masaryk University, Czech Republic
  • Yuanfang Cai, Drexel University, USA
  • Massimiliano Di Penta, University of Sannio, Italy
  • Neil Ernst, University of Victoria, Canada
  • Matthias Galster, University of Canterbury, New Zealand
  • Joshua Garcia, University of California, Irvine, USA
  • Ian Gorton, Northeastern University, USA
  • Rick Kazman, University of Hawaii, USA
  • Heiko Koziolek, ABB Corporate Research, Germany
  • Peng Liang, Wuhan University, China
  • David Lo, Singapore Management University, Singapore
  • Patrick Mäder, Technische Universität Ilmenau, Germany
  • Ipek Ozkaya, SEI - Carnegie Mellon University, USA
  • Antonino Sabetta, SAP Labs, France
  • Bradley Schmerl, Carnegie Mellon University, USA
  • Arie van Deursen, Delft University of Technology, The Netherlands
  • Roberto Verdecchia, Vrije Universiteit Amsterdam, The Netherlands
  • Stefan Wagner, University of Stuttgart, Germany
  • Elisa Yumi Nakagawa, University of São Paulo, Brazil

Call for Papers