msr13

URL

Change Log

When | What
April 8th, 2015 | Donated by Alberto Bacchelli

About the Data

Overview of Data

Studies that use the data (in any form) are required to include the following reference in their report/paper:

@INPROCEEDINGS{MSRChallenge2013,
   author={Alberto Bacchelli},
   title={Mining Challenge 2013: Stack Overflow},
   booktitle={The 10th Working Conference on Mining Software Repositories},
   year={2013},
   pages={to appear}
}

Data

The 2013 data all comes from Stack Overflow. It is provided as an XML-formatted “dump” of Stack Overflow with irrelevant entries removed, though I’m not sure exactly what this is a dump of (file size >7GB). A PostgreSQL-formatted dump of the same data is also provided.

XML: 201208_stack_overflow_official.tar.bz

PostgreSQL: 201208_stack_overflow_postgres_dump.tar.bz
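
Because the XML dump is larger than 7GB, loading it into memory in one go is unlikely to work on most machines. The sketch below shows one way to stream such a file with Python's standard library. It is an illustration, not a description of the actual file layout. It assumes that records are stored as `row` elements directly under the root element, with fields held as attributes, which is how the public Stack Overflow dumps are usually laid out but is not confirmed for this archive. The function name and the `PostTypeId` attribute mentioned afterwards are likewise assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_attribute_values(source, attribute, tag="row"):
    """Stream an XML file and count the values of one attribute.

    `source` is a file path or a binary file object. The element name
    `tag` and the attribute name are assumptions about the dump layout.
    """
    counts = Counter()
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the start of the root element; keep a handle on it.
    _event, root = next(context)
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            value = elem.get(attribute)
            if value is not None:
                counts[value] += 1
            # Drop the rows parsed so far from the root so that memory use
            # stays small (this assumes rows are direct children of the root).
            root.clear()
    return counts
```

Calling `count_attribute_values(path, "PostTypeId")`, where `path` is an XML file extracted from the archive, would then give a rough breakdown of entry types, provided the file really carries that attribute.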