Kaggle hosts a data set named "IBM HR Analytics Employee Attrition & Performance", intended for predicting the attrition of your valuable employees. It is a very popular data set and has a usability index of 8.8.
https://www.kaggle.com/pavansubhasht/ibm-hr-analytics-attrition-dataset
However, this data set has only 1470 rows, whereas testing sometimes needs a much larger data set. So I have generated files with up to 5 million records.
Disclaimer – The data sets are generated through random logic in VBA. They are not real HR data and should not be used for any purpose other than testing.
Other data sets – Human Resources, Credit Card, Sales, Bank Transactions
Note – Individuals and organizations have approached me for permission to use this data set, so I want to clarify one thing. Anything published on this site is completely copyright free. You can use anything from this site without any obligation, and you can even present the content as your own. There is absolutely no need to ask for permission to use this data set.
You can download sample CSV files ranging from 100 records to 5000000 records. The 5 million record file exceeds Excel's worksheet limit of 1,048,576 rows, so it cannot be loaded onto a single sheet, but it is useful for testing Power Query / Power Pivot.
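Before opening one of the larger files, it can help to check whether it fits on a worksheet at all. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and UTF-8 encoding of the files is an assumption.

```python
import csv

# Excel worksheets hold at most 1,048,576 rows, header row included.
EXCEL_MAX_ROWS = 1_048_576


def count_csv_rows(path):
    """Count the rows in a CSV file (header included) by streaming it."""
    # Assumption: the file is UTF-8 encoded; adjust if it is not.
    with open(path, newline="", encoding="utf-8") as handle:
        return sum(1 for _ in csv.reader(handle))


def fits_in_excel_sheet(path):
    """Return True if every row of the CSV fits on one worksheet."""
    return count_csv_rows(path) <= EXCEL_MAX_ROWS
```

Because the file is read one row at a time, this works on the largest files without holding them in memory. Files that fail the check are the ones better suited to Power Query / Power Pivot.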
The fields below appear as the first line of each CSV file.
A glossary for some of the fields is also given below.
The Excel workbook containing the macro that generates these records can be downloaded from IBM HR Analytics Employee Attrition & Performance. The resulting data is populated in the Master tab.
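For readers without Excel, the idea of generating records through random logic can be sketched in Python. This is not the VBA macro itself: the column names are a small illustrative subset borrowed from the Kaggle data set, and the value ranges, function names and seed are assumptions made for the example.

```python
import csv
import random

# Illustrative subset of columns; the generated files may contain other fields.
FIELDS = ["EmployeeNumber", "Age", "Department", "MonthlyIncome", "Attrition"]
DEPARTMENTS = ["Sales", "Research & Development", "Human Resources"]


def random_record(employee_number, rng):
    """Build one synthetic employee row from uniformly random values."""
    return {
        "EmployeeNumber": employee_number,
        "Age": rng.randint(18, 60),              # assumed age range
        "Department": rng.choice(DEPARTMENTS),
        "MonthlyIncome": rng.randint(1000, 20000),  # assumed income range
        "Attrition": rng.choice(["Yes", "No"]),
    }


def write_records(path, count, seed=0):
    """Write a header line plus `count` synthetic rows to a CSV file."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        for number in range(1, count + 1):
            writer.writerow(random_record(number, rng))
```

Calling `write_records("hra_sample.csv", 1000)` (a hypothetical file name) would produce a test file of 1000 rows. As with the downloadable files, the values are random and carry no real HR meaning.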
100 HRA Records | 1000 HRA Records | 5000 HRA Records | 10000 HRA Records |
zip, 5KB | zip, 40KB | zip, 194KB | zip, 386KB |
50000 HRA Records | 100000 HRA Records | 500000 HRA Records | 1000000 HRA Records |
zip, 1.93MB | zip, 3.85MB | zip, 19.27MB | zip, 38.54MB |
1500000 HRA Records | 2000000 HRA Records | 5000000 HRA Records |
zip, 58.81MB | zip, 77.08MB | zip, 193.71MB |