Data Categories

  • Code Analysis 11
  • Defect 61
    • Bad Smells 2
    • Bug Reports 6
    • CK 33
    • McCabe & Halstead 14
    • Other 6
  • Dump 5
  • Effort 14
    • Cobol 1
    • Cocomo 3
    • Function Points Analysis 4
    • ISBSG 2
    • Personnel 1
    • Other 3
  • Green Mining 3
  • Issues 16
  • Model 5
  • MSR 11
  • Performance Predict 1
  • Refactoring 2
  • Requirements 8
    • NRP 2
    • Other 6
  • Search-Based SE 4
  • Social Analysis 6
  • Software Aging 2
  • Software Maintenance 2
  • Spreadsheet 5
  • Test Generation 10
  • Other 35

tera-PROMISE Home
  • Dataset Categories
  • About
  • People
  • Contribute
    • By donating data
    • By finding data
  • repo

Welcome to one of the largest repositories of SE research data

The tera-PROMISE Repository specializes in software engineering research datasets. We offer free, long-term storage for your research artifacts. Learn more on our about page.

How to Reference Us:

  • Menzies, T., Krishna, R., Pryor, D. (2016). The Promise Repository of Empirical Software Engineering Data; https://openscience.us/repo. North Carolina State University, Department of Computer Science.

You can browse all of our datasets by the categories listed above or on the categories page.


Find research datasets

We have everything from McCabe & Halstead to Spreadsheets to Green Mining.

Contribute your data

Learn how to contribute your research data, whether you're a researcher or a student.


Featured dataset: EUSES

This is a large sample of spreadsheets (5607 total files, 4499 of which are unique and suitable for automated processing in Excel) that researchers can use to evaluate their methodologies and tools for creating and maintaining spreadsheets.

Latest News

June 06, 2016: The OpenScience Repository has experienced a server outage. For help, reach us through our contact form.

September 23, 2015: The tera-PROMISE team welcomes a new member, David Pryor!

May 05, 2015: The tera-PROMISE frontend has been rebuilt with Bootstrap.

December 06, 2014: 80% of data migrated from Google Code site.

November 01, 2014: N.C. State approves the new repository space. One terabyte!

Most Recently Added Datasets

Use Case Points Benchmark Dataset

Analysis and selection of a regression model for the Use Case Points method using a stepwise approach

Qualitas Corpus

The Qualitas Corpus is a curated collection of software systems intended to be used for empirical studies of code artefacts.

Mutation Testing Operators

An intuitive approach to determine test adequacy in safety-critical software

RefactDataSetValidated

A Manually Validated Code Refactoring Dataset and Its Assessment Regarding Software Maintainability

Maline

Evaluation of Android Malware Detection Based on System Calls


Site design by Mitch Rees-Jones
Supported by the National Science Foundation