nickel

URL

Change Log

When What
February 6, 2016 Updated BibTeX, datalinks
March 31, 2005 Donated by Bart Massey

Notes from the Author

This is a PROMISE Software Engineering Repository data set made publicly available in order to encourage repeatable, verifiable, refutable, and/or improvable predictive models of software engineering.

  1. Title: Nickle Repository Transaction Data

  2. Sources: (a) Original creators of database: Bart Massey 01 503 725-5393 Computer Science Dept. PO Box 751 MS CMPS Portland State University Portland, OR USA 97207-0751 bart@cs.pdx.edu

(b) Donor of database: owner

(c) Date received: 31 March 2005

  1. Past Usage: none

  2. Relevant Information:

This dataset was assembled by analyzing the publicly-available CVS archives of the Nickle programming language (http://nickle.org) using a modified version of CVSAnalY (http://metricsgrimoire.github.io/CVSAnalY). It is intended for a wide variety of uses, and thus no dependent variable is specified. See Massey’s PROMISE 2005 paper, “Longitudinal Analysis of Long-Timescale Open Source Repository Data”, for further information.

  1. Number of Instances: 2972

  2. Number of Attributes: 10 (incl. 2 inferred)

  3. Attribute information:

    • Inferred filetype:
      • non-numeric—nominal
      • 9 values (documentation,images,i18n,ui,multimedia,code,build,devel-doc,unknown)
    • File Pathname: UNIX relative path name
      • non-numeric—structured
      • 15 directories, 265 files
    • Revision: dotted-decimal revision string
      • non-numeric—structured
      • 161 unique revision numbers
    • Author ID: integer identifier
      • non-numeric—nominal
      • 6 unique authors
    • Lines added: integer
      • numeric—integer
      • MIN: 0 MAX: 1011 MEAN: 24.0148 STDEV: 65.6858
    • Lines removed: integer
      • numeric—integer
      • MIN: 0 MAX: 678 MEAN: 11.6413 STDEV: 42.4373)
    • File has since been removed (“in Attic”): boolean (0, 1)
      • non-numeric—nominal
      • 11.98% positive
    • Commit has CVS_SILENT flag: boolean (0, 1)
      • non-numeric—nominal
      • never set
    • Inferred that committer was not author: boolean (0, 1)
      • non-numeric—nominal
      • never set
    • Commit date: date
      • MIN: “1999-01-13 06:22:11” MAX: “2005-01-14 16:58:53”
      • Note: CVS has trouble with timezones; we assume all dates are UTC.
  4. Missing Attribute Values:

  • unknown inferred filetype (attr 1): 69
  1. Class Distribution: N/A

Reference

This is one of the datasets contributed by Bart Massey.
Longitudinal analysis of long-timescale open source repository data

@inproceedings{Massey:2005:LAL:1083165.1083167,
 author = {Massey, Bart},
 title = {Longitudinal Analysis of Long-timescale Open Source Repository Data},
 booktitle = {Proceedings of the 2005 Workshop on Predictor Models in Software Engineering},
 series = {PROMISE '05},
 year = {2005},
 isbn = {-159593-125-2},
 location = {St. Louis, Missouri},
 pages = {1--5},
 numpages = {5},
 url = {http://doi.acm.org/10.1145/1082983.1083167},
 doi = {10.1145/1082983.1083167},
 acmid = {1083167},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {open source, source repositories},
}