NIST Excerpts Benchmark Data

The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically.

Jan 2025 -- Benchmark Excerpts:

  - NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records
  - NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records

The data are curated subsets of U.S. Census Bureau products.

About this Dataset

Updated: 2025-04-06
Metadata Last Updated: 2025-01-31 00:00:00
Date Created: N/A
Data Provided by:
Dataset Owner: N/A

Title: NIST Excerpts Benchmark Data
Description: The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically. Jan 2025 -- Benchmark Excerpts: NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records; NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records. The data are curated subsets of U.S. Census Bureau products.
Modified: 2025-01-31 00:00:00
Publisher Name: National Institute of Standards and Technology
Contact: mailto:[email protected]
Keywords: privacy, synthetic data, demographic data, American Community Survey, SDNist
{
    "identifier": "ark:\/88434\/mds2-2895",
    "accessLevel": "public",
    "contactPoint": {
        "hasEmail": "mailto:[email protected]",
        "fn": "Gary Howarth II"
    },
    "programCode": [
        "006:045"
    ],
    "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2895",
    "title": "NIST Excerpts Benchmark Data",
    "description": "The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with \"SDNist: Synthetic Data Report Tool\", a package for evaluating synthetic data generators: https:\/\/github.com\/usnistgov\/SDNist. An installation of SDNist will download the data resources automatically. Jan 2025 -- Benchmark Excerpts: - NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records, - NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records. The data are curated subsets of U.S. Census Bureau products.",
    "language": [
        "en"
    ],
    "distribution": [
        {
            "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/BenchmarkData",
            "format": "A data repository",
            "description": "The NIST Data Excerpts are curated subsets of publicly released tabular data sets, drawn from real households and businesses in the U.S. The Excerpts serve as benchmark data for the [SDNist v2: Deidentified Data Report Tool](https:\/\/github.com\/usnistgov\/SDNist\/) .",
            "title": "NIST Excerpt Benchmark Data"
        }
    ],
    "bureauCode": [
        "006:55"
    ],
    "modified": "2025-01-31 00:00:00",
    "publisher": {
        "@type": "org:Organization",
        "name": "National Institute of Standards and Technology"
    },
    "theme": [
        "Information Technology:Privacy",
        "Information Technology:Data and informatics",
        "Information Technology:Software research"
    ],
    "keyword": [
        "privacy",
        "synthetic data",
        "demographic data",
        "American Community Survey",
        "SDNist"
    ]
}
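
The metadata record above is plain JSON, so its fields can be read with any JSON parser. The following is a minimal sketch in Python, assuming an abbreviated copy of the record held in a string; the helper name `access_urls` is hypothetical and not part of SDNist or any NIST tooling. It shows that the escaped slashes (`\/`) decode to ordinary URLs and how the distribution access URL might be pulled out.

```python
import json

# Abbreviated copy of the metadata record above (only a few fields kept).
# A raw string is used so the JSON "\/" escapes reach the parser unchanged.
record_text = r'''
{
    "identifier": "ark:\/88434\/mds2-2895",
    "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2895",
    "title": "NIST Excerpts Benchmark Data",
    "distribution": [
        {
            "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/BenchmarkData",
            "title": "NIST Excerpt Benchmark Data"
        }
    ],
    "keyword": ["privacy", "synthetic data", "demographic data",
                "American Community Survey", "SDNist"]
}
'''

record = json.loads(record_text)


def access_urls(rec):
    """Hypothetical helper: return the accessURL of each distribution entry that has one."""
    return [d["accessURL"] for d in rec.get("distribution", []) if "accessURL" in d]


print(record["title"])
print(record["landingPage"])
for url in access_urls(record):
    print(url)
```

The same approach should apply to the full record, since the fields omitted here do not change how `distribution` is structured.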