Semantically Enriched Jira Issue Tracking Dataset

This is the webpage of the Semantically Enriched Jira Issue Tracking Dataset, a dataset of Jira issues, including extracted topics, which can be used for semantic mining tasks.

The data stored in online project management systems carry semantics that can help answer several interesting questions, such as finding the most suitable developer for a task, or determining the priority of an issue or the workload of a software team. In this context, we have built a dataset that includes more than a million issues retrieved from the Jira infrastructure of the Apache Software Foundation. The dataset has been further analyzed to extract issue metadata (users, comments, events, worklogs) and to assign the issues to semantic topics.

On this webpage you can find all dataset releases, along with information for understanding and using the dataset. You can also find publications that use the dataset and details on how to cite it in your work.

The dataset has been developed by the Intelligent Systems & Software Engineering Labgroup of the Aristotle University of Thessaloniki. It has been created and maintained by Themistoklis Diamantopoulos, Dimitrios-Nikitas Nastos, and Andreas Symeonidis.