Toolkit and Curated Archive for COVID-19 Research Challenge Dataset

This GitHub repository contains a downloadable snapshot of the National Institute of Standards and Technology (NIST) COVID-19 Data Repository, curated from the COVID-19 Open Research Dataset (CORD-19) provided by the Allen Institute for AI. About the Curated Archive for COVID-19 Research Challenge Dataset: the COVID-19 Data Repository provides searchable CORD-19 data and metadata, including full text extracted from the original CORD-19 JavaScript Object Notation (JSON) files. It is built using the Configurable Data Curation System (CDCS) developed at NIST.

About this Dataset

Updated: 2025-04-06
Metadata Last Updated: 2020-04-03 00:00:00
Date Created: N/A
Data Provided by:
Dataset Owner: N/A

Table representation of structured data

Title: Toolkit and Curated Archive for COVID-19 Research Challenge Dataset
Description: This GitHub repository contains a downloadable snapshot of the National Institute of Standards and Technology (NIST) COVID-19 Data Repository, curated from the COVID-19 Open Research Dataset (CORD-19) provided by the Allen Institute for AI. The COVID-19 Data Repository provides searchable CORD-19 data and metadata, including full text extracted from the original CORD-19 JavaScript Object Notation (JSON) files. It is built using the Configurable Data Curation System (CDCS) developed at NIST.
Modified: 2020-04-03 00:00:00
Publisher Name: National Institute of Standards and Technology
Contact: mailto:[email protected]
Keywords: virus, covid19, natural language processing, nlp, embedding

{
    "identifier": "ark:\/88434\/mds2-2201",
    "accessLevel": "public",
    "contactPoint": {
        "hasEmail": "mailto:[email protected]",
        "fn": "Rachael Sexton"
    },
    "programCode": [
        "006:045"
    ],
    "landingPage": "https:\/\/github.com\/usnistgov\/cv-py",
    "title": "Toolkit and Curated Archive for COVID-19 Research Challenge Dataset",
    "description": "This GitHub repository contains a downloadable snapshot of National Institute of Standards and Technology's COVID-19 Data Repository, curated from the COVID-19 Open Research Dataset (CORD-19) provided by the Allen Institute for AI. Curated Archive for Covid-19 Research Challenge Dataset- The COVID-19 Data Repository provides searchable CORD-19 data and metadata, including full-text extracted from the original CORD-19 JavaScript Object Notation (JSON) files. It is built using the Configurable Data Curation System (CDCS) developed at NIST.",
    "language": [
        "en"
    ],
    "distribution": [
        {
            "accessURL": "https:\/\/doi.org\/10.18434\/M32201",
            "title": "DOI access to Toolkit and Curated Archive for COVID-19 Research Challenge Dataset"
        },
        {
            "accessURL": "https:\/\/github.com\/usnistgov\/cord19-cdcs-nist",
            "format": "json, xml, csv",
            "description": "This GitHub repository contains a downloadable snapshot of National Institute of Standards and Technology's COVID-19 Data Repository, curated from the COVID-19 Open Research Dataset (CORD-19) provided by the Allen Institute for AI. About the Curated Archive for Covid-19 Research Challenge Dataset - The COVID-19 Data Repository provides searchable CORD-19 data and metadata, including full-text extracted from the original CORD-19 JavaScript Object Notation (JSON) files. It is built using the Configurable Data Curation System (CDCS) developed at NIST",
            "title": "cord19-cdcs-nist"
        }
    ],
    "bureauCode": [
        "006:55"
    ],
    "modified": "2020-04-03 00:00:00",
    "publisher": {
        "@type": "org:Organization",
        "name": "National Institute of Standards and Technology"
    },
    "theme": [
        "Information Technology:Data and informatics",
        "Health:Clinical diagnostics"
    ],
    "keyword": [
        "virus",
        "covid19",
        "natural language processing",
        "nlp",
        "embedding"
    ]
}
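
The metadata record above can be read programmatically to find where the data actually lives. The following is a minimal sketch, assuming the record is available as a JSON string; it uses only the Python standard library, and the helper name `summarize_record` and the abbreviated `RECORD_JSON` constant are illustrative, not part of any NIST tooling. It collects each distribution's access URL and splits the comma-separated `format` field into a list.

```python
import json

# Abbreviated copy of the record above; only the fields used below are kept.
# A raw string preserves the escaped slashes ("\/") exactly as published.
RECORD_JSON = r'''
{
    "identifier": "ark:\/88434\/mds2-2201",
    "landingPage": "https:\/\/github.com\/usnistgov\/cv-py",
    "distribution": [
        {
            "accessURL": "https:\/\/doi.org\/10.18434\/M32201",
            "title": "DOI access to Toolkit and Curated Archive for COVID-19 Research Challenge Dataset"
        },
        {
            "accessURL": "https:\/\/github.com\/usnistgov\/cord19-cdcs-nist",
            "format": "json, xml, csv",
            "title": "cord19-cdcs-nist"
        }
    ],
    "keyword": ["virus", "covid19", "natural language processing", "nlp", "embedding"]
}
'''


def summarize_record(raw: str) -> dict:
    """Hypothetical helper: pull the access points out of one catalog record."""
    record = json.loads(raw)  # json.loads turns the escaped "\/" into "/"
    distributions = [
        {
            "title": dist.get("title", ""),
            "url": dist.get("accessURL", ""),
            # "format" is optional and comma-separated in this record
            "formats": [f.strip() for f in dist.get("format", "").split(",") if f.strip()],
        }
        for dist in record.get("distribution", [])
    ]
    return {
        "identifier": record["identifier"],
        "landing_page": record.get("landingPage"),
        "distributions": distributions,
        "keywords": record.get("keyword", []),
    }


summary = summarize_record(RECORD_JSON)
for dist in summary["distributions"]:
    print(dist["title"], "->", dist["url"], dist["formats"])
```

Note that the record lists two different GitHub repositories: the `landingPage` points to `usnistgov/cv-py`, while the second distribution points to `usnistgov/cord19-cdcs-nist`. The sketch reports both without assuming which one a reader needs.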