Context Notes

Datasets added to the PROMISE repository must have context notes so that people accessing the repository know what each dataset is about. As we centralize research data, it is important to also centralize the knowledge associated with that data, in the form of the context notes outlined here.

If you are a researcher emailing us your data, please include the following sections of information about your data/research in the email.

If you are participating in a hackathon, please comment on the GitHub issue associated with the data, including the following sections of information in your comment.

1. Data source

  • Example: paper title, if this data is closely related to some original paper.

2. Link to the material associated with the dataset (if available)

  • Examples:
    • A link to the ACM digital library (for example, https://dl.acm.org/citation.cfm?id=2635868.2635905)
    • A link to the IEEE digital library (for example, https://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6982619)
    • Any other documentation

3. Attribution list for the material

  • Include at least one contact email
  • Example: author list for a paper

4. BibTeX reference

  • NOTE: This is how others will cite the data.
  • Example: BibTeX for the paper

5. Link to the datasets

  • If the dataset comes with multiple files, include links to all of them, not just the parent file.
    • Treat it as a red flag if the complete download is larger than 100 MB (compressed).
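
The size check above can be sketched in a few lines of Python. This is only an illustration: it assumes the compressed files have already been downloaded locally, and it reads "100 MB" as 100 × 1024 × 1024 bytes, which the guideline does not specify. The function and constant names are hypothetical.

```python
from pathlib import Path

# Assumption: "100 MB" is interpreted as binary megabytes.
SIZE_LIMIT_BYTES = 100 * 1024 * 1024


def total_size_bytes(paths):
    """Sum the on-disk sizes of the given (already downloaded) files."""
    return sum(Path(p).stat().st_size for p in paths)


def exceeds_limit(paths, limit=SIZE_LIMIT_BYTES):
    """Return True if the files together are larger than the limit."""
    return total_size_bytes(paths) > limit
```

Calling `exceeds_limit` with the list of every downloaded file for a dataset gives a quick yes/no answer before the dataset is submitted.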

6. PROMISE repo category (effort, requirements, model, defect, etc.)

  • Categorize the data into one of the categories in the navbar on the OpenScience website.
    • If it doesn’t fit nicely into one of these categories, put “other” and propose a category name.

7. General overview of the data

  • Note that, if the data is closely related to a paper, this overview is not the paper’s abstract (the abstract goes in section 9).
  • If you can find an overview of the dataset in the author’s own words, copy those words. If not, write your own overview. It should be at least a paragraph long.

8. Attribute info

  • If there are clearly defined columns or attributes in the data, describe each one.
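
As a starting point for the attribute descriptions, a skeleton can be generated from the data itself. The sketch below assumes the dataset is a CSV file whose first row is a header; many datasets in the repository may use other formats, and the function name is hypothetical. It only lists the columns with one sample value each; the descriptions still have to be written by hand.

```python
import csv
import io


def attribute_skeleton(csv_text):
    """Return one placeholder description line per column of a CSV with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to describe
    # Use the first data row (if any) to show an example value per column.
    first_row = next(reader, [""] * len(header))
    return [
        f"{name}: TODO describe (example value: {sample})"
        for name, sample in zip(header, first_row)
    ]
```

For example, `attribute_skeleton("loc,bugs\n120,3\n")` yields one line for `loc` and one for `bugs`, each waiting for a human-written description.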

9. Paper abstract (if appropriate)

10. Is this dataset part of a larger series or collection?

  • If so, link to the master index of the series or collection (if possible)
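
The ten sections above can also be treated as a simple checklist. The following is a minimal sketch, not an official validator: the key names are hypothetical, and the split into required and optional sections is an assumption based on which items carry qualifiers such as "if available" or "if appropriate". The email pattern is deliberately loose.

```python
import re

# Assumption: sections without an "if available/appropriate" qualifier are required.
REQUIRED = ["data_source", "attribution", "bibtex",
            "dataset_links", "category", "overview"]
OPTIONAL = ["material_link", "attribute_info", "abstract", "collection"]

# Loose pattern: something@something.something, with no whitespace.
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def check_context_notes(notes):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for key in REQUIRED:
        value = notes.get(key)
        if not value or (isinstance(value, str) and not value.strip()):
            problems.append(f"missing section: {key}")
    attribution = notes.get("attribution") or ""
    if str(attribution).strip() and not EMAIL_RE.search(str(attribution)):
        problems.append("attribution has no contact email")
    return problems
```

Running `check_context_notes` on a draft before emailing it or posting it on the GitHub issue catches empty sections and a missing contact email; it cannot judge whether the overview is actually informative.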

Site design by Mitch Rees-Jones
Supported by the National Science Foundation