The Diverse Community Excerpts are a set of tabular demographic data of households in the United States drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparisons of synthetic demographic data generator performance. Detailed documentation on the usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist
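Because the description states that all three partitions share an identical feature set, a quick sanity check after download is to compare the column headers and record counts of the partition files. The following is a minimal sketch, not part of SDNist: it assumes each partition is available as a CSV file with a header row, and the function names and partition keys are hypothetical.

```python
import csv

# Record counts as stated in the dataset description.
# The dictionary keys are hypothetical labels, not official file names.
EXPECTED_RECORDS = {"boston": 7634, "dallas_fort_worth": 9276, "national": 27254}


def summarize_partition(csv_path):
    """Return (column names, record count) for one partition CSV file."""
    with open(csv_path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row holds the feature names
        n_records = sum(1 for row in reader if row)  # skip blank lines
    return header, n_records


def check_partitions(paths):
    """Verify that all partitions share one feature set.

    `paths` maps a partition label to a CSV path (assumed layout).
    Returns a dict of label -> record count, or raises ValueError
    if the headers differ between partitions.
    """
    summaries = {name: summarize_partition(path) for name, path in paths.items()}
    headers = {tuple(header) for header, _ in summaries.values()}
    if len(headers) != 1:
        raise ValueError("partitions do not share an identical feature set")
    return {name: count for name, (_, count) in summaries.items()}
```

Calling `check_partitions` with a mapping from the labels above to the local paths of the downloaded files returns the per-partition record counts, which can then be compared against `EXPECTED_RECORDS`.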
About this Dataset
Title | NIST Diverse Community Excerpts Data |
---|---|
Description | The Diverse Community Excerpts are a set of tabular demographic data of households in the United States drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparisons of synthetic demographic data generator performance. Detailed documentation on the usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist |
Modified | 2022-12-05 00:00:00 |
Publisher Name | National Institute of Standards and Technology |
Contact | mailto:[email protected] |
Keywords | privacy, synthetic data, demographic data, American Community Survey, SDNist |
{ "identifier": "ark:\/88434\/mds2-2895", "accessLevel": "public", "contactPoint": { "hasEmail": "mailto:[email protected]", "fn": "Gary Howarth II" }, "programCode": [ "006:045" ], "landingPage": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/nist%20diverse%20communities%20data%20excerpts", "title": "NIST Diverse Community Excerpts Data", "description": "The Diverse Community Excerpts are a set of tabular demographic data of households in the United States drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparisons of synthetic demographic data generator performance. Detailed documentation on the usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the \"SDNist: Synthetic Data Report Tool\", a package for evaluating synthetic data generators: https:\/\/github.com\/usnistgov\/SDNist", "language": [ "en" ], "distribution": [ { "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/nist%20diverse%20communities%20data%20excerpts", "format": "A data repository", "description": "This repository contains limited-feature (22 columns) excerpts from the American Community Survey partitioned into three geographic regions. It is available as part of the SDNist synthetic data evaluation package.", "title": "NIST Diverse Communities Excerpt Data" } ], "bureauCode": [ "006:55" ], "modified": "2022-12-05 00:00:00", "publisher": { "@type": "org:Organization", "name": "National Institute of Standards and Technology" }, "theme": [ "Information Technology:Privacy", "Information Technology:Data and informatics", "Information Technology:Software research" ], "keyword": [ "privacy", "synthetic data", "demographic data", "American Community Survey", "SDNist" ] }