NIST Diverse Community Excerpts Data

The Diverse Community Excerpts are a set of tabular demographic data of households in the United States drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparisons of synthetic demographic data generator performance. Detailed documentation for usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist

About this Dataset

Updated: 2024-02-22
Metadata Last Updated: 2022-12-05 00:00:00
Date Created: N/A
Data Provided by:
privacy
Dataset Owner: N/A

Access this data

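Access URL: https://github.com/usnistgov/SDNist/tree/main/nist%20diverse%20communities%20data%20excerpts
Landing Page URL: https://data.nist.gov/od/id/mds2-2895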
Table representation of structured data
Title: NIST Diverse Community Excerpts Data
Description: The Diverse Community Excerpts are a set of tabular demographic data of households in the United States drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparisons of synthetic demographic data generator performance. Detailed documentation for usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist
Modified: 2022-12-05 00:00:00
Publisher Name: National Institute of Standards and Technology
Contact: mailto:[email protected]
Keywords: privacy, synthetic data, demographic data, American Community Survey, SDNist
{
    "identifier": "ark:\/88434\/mds2-2895",
    "accessLevel": "public",
    "contactPoint": {
        "hasEmail": "mailto:[email protected]",
        "fn": "Gary Howarth II"
    },
    "programCode": [
        "006:045"
    ],
    "@type": "dcat:Dataset",
    "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2895",
    "description": "The Diverse Community Excerpts are a set of tabular demographic data of households in the United States drawn from real records released in the American Community Survey, a product of the US Census Bureau. The data contain 24 features and are partitioned into three geographic regions: Boston area (7634 records), Dallas-Fort Worth area (9276 records), and US national (27254 records). The feature set is identical for all partitions, but the demographics vary radically between the geographic regions. Therefore, these data are well suited for comparisons of synthetic demographic data generator performance. Detailed documentation for usage, design, and purpose of the data is included in the repository, including brief descriptions of the localities that the data represent. These data are incorporated into the \"SDNist: Synthetic Data Report Tool\", a package for evaluating synthetic data generators: https:\/\/github.com\/usnistgov\/SDNist",
    "language": [
        "en"
    ],
    "title": "NIST Diverse Community Excerpts Data",
    "distribution": [
        {
            "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/nist%20diverse%20communities%20data%20excerpts",
            "format": "A data repository",
            "description": "This repository contains limited feature (22 columns) excerpts from the American Community Survey partitioned into three geographic regions. It is available as part of the SDNist synthetic data evaluation package.",
            "title": "NIST Diverse Communities Excerpt Data"
        }
    ],
    "license": "https:\/\/www.nist.gov\/open\/license",
    "bureauCode": [
        "006:55"
    ],
    "modified": "2022-12-05 00:00:00",
    "publisher": {
        "@type": "org:Organization",
        "name": "National Institute of Standards and Technology"
    },
    "describedBy": "https:\/\/raw.githubusercontent.com\/usnistgov\/SDNist\/main\/nist%20diverse%20communities%20data%20excerpts\/data_dictionary.json",
    "theme": [
        "Information Technology:Privacy",
        "Information Technology:Data and informatics",
        "Information Technology:Software research"
    ],
    "issued": "2023-06-02",
    "keyword": [
        "privacy",
        "synthetic data",
        "demographic data",
        "American Community Survey",
        "SDNist"
    ]
}
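
The metadata record above can be read programmatically to find where the data and its data dictionary live. The following is a minimal sketch in Python, not part of the NIST tooling: the `RECORD` constant is a trimmed copy of the record above, and the `summarize` helper and its output keys are hypothetical names chosen for illustration. It uses only the standard `json` module and shows that the escaped `\/` sequences decode to plain `/` characters.

```python
import json

# Trimmed copy of the metadata record shown above (assumption: only the
# fields needed for this illustration are kept). A raw string preserves
# the "\/" escapes exactly as they appear in the record.
RECORD = r'''
{
    "identifier": "ark:\/88434\/mds2-2895",
    "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2895",
    "title": "NIST Diverse Community Excerpts Data",
    "distribution": [
        {
            "accessURL": "https:\/\/github.com\/usnistgov\/SDNist\/tree\/main\/nist%20diverse%20communities%20data%20excerpts",
            "title": "NIST Diverse Communities Excerpt Data"
        }
    ],
    "describedBy": "https:\/\/raw.githubusercontent.com\/usnistgov\/SDNist\/main\/nist%20diverse%20communities%20data%20excerpts\/data_dictionary.json",
    "keyword": ["privacy", "synthetic data", "demographic data",
                "American Community Survey", "SDNist"]
}
'''


def summarize(record_text: str) -> dict:
    """Hypothetical helper: pull the access-related fields out of the record."""
    record = json.loads(record_text)  # "\/" is a valid JSON escape for "/"
    return {
        "title": record["title"],
        "landing_page": record["landingPage"],
        # A record may list several distributions; collect every access URL.
        "access_urls": [
            dist["accessURL"]
            for dist in record.get("distribution", [])
            if "accessURL" in dist
        ],
        "data_dictionary": record.get("describedBy"),
        "keywords": record.get("keyword", []),
    }


if __name__ == "__main__":
    for key, value in summarize(RECORD).items():
        print(f"{key}: {value}")
```

The `describedBy` field points at the data dictionary for the excerpts, so it is a reasonable first stop before working with the feature columns.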
