The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically.

Benchmark Excerpts, Jan 2025:
- NIST American Community Survey (ACS) Data Excerpts: 24 demographic features over 40k records
- NIST Survey of Business Owners (SBO) Data Excerpts: 130 demographic and financial features over 161k records

The data are curated subsets of U.S. Census Bureau products.
About this Dataset
Title | NIST Excerpts Benchmark Data |
---|---|
Description | The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically. Benchmark Excerpts, Jan 2025: NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records; NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records. The data are curated subsets of U.S. Census Bureau products. |
Modified | 2025-01-31 00:00:00 |
Publisher Name | National Institute of Standards and Technology |
Contact | mailto:[email protected] |
Keywords | privacy, synthetic data, demographic data, American Community Survey, SDNist |
{ "identifier": "ark:\/88434\/mds2-2895", "accessLevel": "public", "contactPoint": { "hasEmail": "mailto:[email protected]", "fn": "Gary Howarth II" }, "programCode": [ "006:045" ], "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2895", "title": "NIST Excerpts Benchmark Data", "description": "The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with \"SDNist: Synthetic Data Report Tool\", a package for evaluating synthetic data generators: https:\/\/github.com\/usnistgov\/SDNist. An installation of SDNist will download the data resources automatically. Benchmark Excerpts, Jan 2025: NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records; NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records. The data are curated subsets of U.S. Census Bureau products.", "language": [ "en" ], "distribution": [ { "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/BenchmarkData", "format": "A data repository", "description": "The NIST Data Excerpts are curated subsets of publicly released tabular data sets, drawn from real households and businesses in the U.S. The Excerpts serve as benchmark data for the [SDNist v2: Deidentified Data Report Tool](https:\/\/github.com\/usnistgov\/SDNist\/).", "title": "NIST Excerpt Benchmark Data" } ], "bureauCode": [ "006:55" ], "modified": "2025-01-31 00:00:00", "publisher": { "@type": "org:Organization", "name": "National Institute of Standards and Technology" }, "theme": [ "Information Technology:Privacy", "Information Technology:Data and informatics", "Information Technology:Software research" ], "keyword": [ "privacy", "synthetic data", "demographic data", "American Community Survey", "SDNist" ] }
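
The metadata record above is plain JSON, so it can be read with any standard JSON parser. The following is a minimal sketch, not part of the NIST tooling: it uses a trimmed copy of the record, and the names `RAW_RECORD` and `summarize` are hypothetical helpers introduced only for illustration. It shows how the landing page and the distribution URL could be pulled out of the record.

```python
import json

# Trimmed copy of the record above (illustrative only).
# The "\/" sequences are valid JSON escapes and decode to "/".
RAW_RECORD = r'''
{
  "identifier": "ark:\/88434\/mds2-2895",
  "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2895",
  "title": "NIST Excerpts Benchmark Data",
  "distribution": [
    {
      "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/BenchmarkData",
      "title": "NIST Excerpt Benchmark Data"
    }
  ],
  "keyword": ["privacy", "synthetic data", "demographic data",
              "American Community Survey", "SDNist"]
}
'''


def summarize(raw: str) -> dict:
    """Parse a metadata record and keep only the fields used for lookup."""
    record = json.loads(raw)
    return {
        "identifier": record["identifier"],
        "landing_page": record["landingPage"],
        # A record may list several distributions; collect every access URL.
        "access_urls": [d["accessURL"] for d in record.get("distribution", [])],
        "keywords": record.get("keyword", []),
    }


summary = summarize(RAW_RECORD)
print(summary["landing_page"])
for url in summary["access_urls"]:
    print(url)
```

The same function should work on the full record, since it only reads keys that appear there and ignores the rest.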