Data Categories

  • Code Analysis 11
  • Defect 61
    • Bad Smells 2
    • Bug Reports 6
    • CK 33
    • McCabe & Halstead 14
    • Other 6
  • Dump 5
  • Effort 14
    • Cobol 1
    • Cocomo 3
    • Function Points Analysis 4
    • ISBSG 2
    • Personnel 1
    • Other 3
  • Green Mining 3
  • Issues 16
  • Model 5
  • MSR 11
  • Performance Predict 1
  • Refactoring 2
  • Requirements 8
    • NRP 2
    • Other 6
  • Search-Based SE 4
  • Social Analysis 6
  • Software Aging 2
  • Software Maintenance 2
  • Spreadsheet 5
  • Test Generation 10
  • Other 35

How to Contribute

Looking for the hackathon walkthrough? Click Here

We are always looking for new data to add to the PROMISE repository. The OpenScience project depends on people like you putting time and effort into maintaining and expanding this repository. If you are a researcher who would like to donate data, or if you would like to help expand the repository by attending a hackathon (an approximately monthly event where we go through SE conferences to find datasets), read on.

There are two ways of contributing to the OpenScience project:

1. If you have software engineering-related data sets: Click here

You fall under this category if, for example, you are a researcher with a set of CK/spreadsheet/Cocomo data (just to name a few).

2. If you want to help us find datasets: Click here

You fall under this category if you are interested in helping us add datasets to tera-PROMISE by going through software engineering conferences to look for research papers that have SE datasets associated with them.


Help us find data

We periodically host hackathons, in which we systematically go through software engineering conferences looking for papers with associated datasets. If you are interested in joining one, either remotely via Google Hangouts or in person in Engineering Building II at North Carolina State University, fill out this Google interest form and check out the instructions on how to help during the hackathon. The next hackathon has not yet been scheduled. Check back later for updates.


Context Notes

If you are a researcher emailing us your data, please include the following sections of information about your data/research in the email.

If you are participating in a hackathon, please comment on the GitHub issue associated with the data, including the following sections of information in your comment.

1. Data source

  • Example: paper title, if this data is closely related to some original paper.

2. Link to the material associated with the dataset (if available)

  • Examples:
    • A link to the ACM digital library (for example, https://dl.acm.org/citation.cfm?id=2635868.2635905)
    • A link to the IEEE digital library (for example, https://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6982619)
    • Any other documentation

3. Attribution list for the material

  • Include at least one contact email
  • Example: author list for a paper

4. BibTeX reference

  • NOTE: This will be how others will cite the data.
  • Example: BibTeX for the paper

5. Link to the datasets

  • If the dataset comes as multiple files, include links to all of them, not just the parent file.
    • Raise a red flag if the complete download is larger than 100 MB (compressed).

6. PROMISE repo category (effort, requirements, model, defect, etc.)

  • Categorize the data into one of the categories in the navbar on the OpenScience website.
    • If it doesn’t fit nicely into one of these categories, put “other” and propose a category name.

7. General overview of the data

  • Note that, if the data is closely related to a paper, this should not be the abstract of the paper.
  • If you can find an overview of the dataset in the author’s own words, just copy those words. If not, make your own overview. This should be at least a paragraph long.

8. Attribute info

  • If the data has clearly defined columns or attributes, describe each one.

9. Paper abstract (if appropriate)

10. Is this dataset part of a larger series or collection?

  • If so, link to the master index of the series or collection (if possible)
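
The ten sections above can be assembled and sanity-checked before they are emailed or posted as a GitHub comment. The following is only a minimal sketch, not part of the tera-PROMISE tooling: the class name ContextNotes, its field names, the helper exceeds_size_limit, and the reading of "100 MB" as 100 * 1024 * 1024 bytes are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: "100 MB" is read as 100 * 1024 * 1024 bytes.
MAX_COMPRESSED_BYTES = 100 * 1024 * 1024


@dataclass
class ContextNotes:
    """Hypothetical container for the ten context-note sections."""
    data_source: str                          # section 1
    attribution: List[str]                    # section 3
    bibtex: str                               # section 4
    dataset_links: List[str]                  # section 5
    category: str                             # section 6
    overview: str                             # section 7
    material_link: Optional[str] = None       # section 2
    proposed_category: Optional[str] = None   # section 6, only for "other"
    attribute_info: Optional[str] = None      # section 8
    abstract: Optional[str] = None            # section 9
    collection_link: Optional[str] = None     # section 10

    def problems(self) -> List[str]:
        """Return the checks from the guidelines that these notes fail."""
        issues = []
        if not any("@" in entry for entry in self.attribution):
            issues.append("3: include at least one contact email")
        if not self.dataset_links:
            issues.append("5: link every data file")
        if self.category.lower() == "other" and not self.proposed_category:
            issues.append("6: propose a category name for 'other'")
        if not self.overview.strip():
            issues.append("7: write an overview")
        return issues

    def render(self) -> str:
        """Format the notes as numbered plain-text sections."""
        category = self.category
        if self.proposed_category:
            category += f" (proposed: {self.proposed_category})"
        sections = [
            ("Data source", self.data_source),
            ("Link to associated material", self.material_link),
            ("Attribution list", ", ".join(self.attribution)),
            ("BibTeX reference", self.bibtex),
            ("Link to the datasets", "\n".join(self.dataset_links)),
            ("PROMISE repo category", category),
            ("General overview of the data", self.overview),
            ("Attribute info", self.attribute_info),
            ("Paper abstract", self.abstract),
            ("Part of a larger series or collection?", self.collection_link),
        ]
        # Optional sections that were left empty are rendered as "N/A".
        return "\n\n".join(
            f"{number}. {title}\n{body or 'N/A'}"
            for number, (title, body) in enumerate(sections, start=1)
        )


def exceeds_size_limit(compressed_sizes: List[int]) -> bool:
    """True if the combined compressed size is above the red-flag threshold."""
    return sum(compressed_sizes) > MAX_COMPRESSED_BYTES


if __name__ == "__main__":
    # Every value below is a placeholder, not a real dataset.
    notes = ContextNotes(
        data_source="Hypothetical Paper Title",
        attribution=["A. Author <author@example.org>"],
        bibtex="@inproceedings{placeholder2015, title={Hypothetical Paper Title}}",
        dataset_links=["https://example.org/data/part1.csv"],
        category="defect",
        overview="Placeholder overview, at least a paragraph long in practice.",
    )
    print(notes.problems() or "No problems found")
    print(notes.render())
```

Keeping the checks in problems() separate from render() lets a contributor see what is missing before anything is formatted. The size check takes a list of file sizes so that it covers every linked file rather than only the parent file.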

Site design by Mitch Rees-Jones
Supported by the National Science Foundation