Python Systems for Empirical Analysis

URL

Authors

Matteo Orrù, Ewan Tempero, Michele Marchesi, Roberto Tonelli, Giuseppe Destefanis

Change Log

When What
Aug 14th, 2015 Donated by Matteo Orrù

Reference

Studies that use the data (in any form) are required to include the following reference:

@inproceedings{Orru2015,
abstract = {The aim of this paper is to present a dataset of metrics associated to the first release of a curated collection of Python
software systems. We describe the dataset along with the adopted criteria and the issues we faced while building such
corpus. This dataset can enhance the reliability of empirical studies, enabling their reproducibility, reducing their cost,
and it can foster further research on Python software.},
author = {Orrù, Matteo and Tempero, Ewan and Marchesi, Michele and Tonelli, Roberto and Destefanis, Giuseppe},
booktitle = {Submitted to PROMISE '15},
keywords = {Python, Empirical Studies, Curated Code Collection},
title = {A Curated Benchmark Collection of Python Systems for Empirical Studies on Software Engineering},
year = {2015}
}

About the Data

Overview

The dataset contains metrics taken from a curated collection of 51 popular Python software systems.

The dataset reports 41 metrics in several categories: volume/size, complexity, and object-oriented metrics. These metrics are computed at both the file and class level. We provide metrics for every file and class of each system, along with global metrics computed on the entire system. In addition, we provide 14 metadata items for each system.
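The paper does not specify the tool used to extract the metrics, but the file-level volume/size and object-oriented counts can be illustrated with a small sketch based on Python's standard `ast` module. The function below and its metric names (`loc`, `sloc`, `num_classes`, etc.) are hypothetical and for illustration only; they are not the dataset's actual extraction code or column names.

```python
import ast

def file_metrics(source: str) -> dict:
    """Illustrative file-level metrics for one Python source file.

    This is a sketch of the *kind* of size and object-oriented
    counts found in the dataset, not the authors' extraction tool.
    """
    tree = ast.parse(source)
    lines = source.splitlines()

    # Volume/size metrics: total lines, and source lines
    # (non-blank, non-comment).
    loc = len(lines)
    sloc = sum(
        1 for line in lines
        if line.strip() and not line.strip().startswith("#")
    )

    # Object-oriented counts: classes, functions, and methods
    # (functions defined directly in a class body).
    classes = [n for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]
    functions = [
        n for n in ast.walk(tree)
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    methods = sum(
        1
        for cls in classes
        for n in cls.body
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef))
    )

    return {
        "loc": loc,
        "sloc": sloc,
        "num_classes": len(classes),
        "num_functions": len(functions),
        "num_methods": methods,
    }
```

Applied to a file containing one class with one method plus one free function, this would report `num_classes=1`, `num_methods=1`, and `num_functions=2` (methods are also `FunctionDef` nodes and so are counted among the functions).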

Paper Abstract

The aim of this paper is to present a dataset of metrics associated with the first release of a curated collection of Python software systems. We describe the dataset along with the adopted criteria and the issues we faced while building such a corpus. This dataset can enhance the reliability of empirical studies, enabling their reproducibility, reducing their cost, and it can foster further research on Python software.