nasa93

This is a cross-listed data set that contains effort and defect information.

This data uses the attributes defined in the model.

URL

Change Log

When What
March 2010 JPL experts added expected number of defects and months to create nasa93-dem
February 8, 2006 Donated by Tim Menzies

Notes from the Author

  • Title/Topic: COCOMO NASA 2 / Software cost estimation

Sources

  • 93 NASA projects from different centers for projects from the following years:
n year
1 1971
1 1974
2 1975
2 1976
10 1977
4 1978
19 1979
11 1980
13 1982
7 1983
7 1984
6 1985
8 1986
2 1987
  • Collected by: Jairus Hihn, JPL, NASA, Manager SQIP Measurement & Benchmarking Element. Phone (818) 354-1248 (Jairus.M.Hihn@jpl.nasa.gov)

Past Usage

  • None with this specific data set. But for older work on similar data, see:
    • “Validation Methods for Calibrating Software Effort Models”, T. Menzies and D. Port and Z. Chen and J. Hihn and S. Stukes, Proceedings ICSE 2005,http://menzies.us/pdf/04coconut.pdf
    • Results:
      • Given background knowledge on 60 prior projects, a new cost model can be tuned to local data using as little as 20 new projects.
      • A very simple calibration method (COCONUT) can achieve PRED(30)=7% or PRED(20)=50% (after 20 projects). These are results seen in 30 repeats of an incremental cross-validation study.
      • Two cost models are compared; one based on just lines of code and one using over a dozen “effort multipliers”. Just using lines of code loses 10 to 20 PRED(N) points.
  • Additional Usage:
    • “Feature Subset Selection Can Improve Software Cost Estimation Accuracy” Zhihao Chen, Tim Menzies, Dan Port and Barry Boehm Proceedings PROMISE Workshop 2005,http://promise.site.uottawa.ca/proceedings/pdf/1.pdf P02, P03, P04 are used in this paper.
    • Results
      • To the best of our knowledge, this is the first report of applying feature subset selection (FSS) to software effort data.
      • FSS can dramatically improve cost estimation.
      • T-tests are applied to the results to demonstrate that always in our data sets, removing attributes improves performance without increasing the variance in model behavior.

Relevant Information

Number of instances

93

Number of attributes

24:

  • 15 standard COCOMO-I discrete attributes in the range Very_Low to Extra_High
  • 7 others describing the project
  • one lines of code measure
  • one goal field being the actual effort in person months.

Reference

See above section “Relevant Information” for further reference.

People