Code used to produce terms list in the work "NLP-Driven Electron Microscopy Ontology Development"

This is a collection of code written by Maurice Curran that was used to process the Microscopy and Microanalysis conference proceedings corpus into the word products described in the publication "NLP-Driven Electron Microscopy Ontology Development". The scripts are written in Python and are to be used in the following order:

  1. SettingUpTextFiles.py and CopyingText.py to get the raw text files
  2. SentenceConversion.py
  3. reference_remover.py
  4. testing.py and testingavg.py
  5. SentenceCreator.py
  6. matscholar_model.py to get matscholar tags
  7. training_model_gensim.py to get gensim model
  8. word2vecscript.py and gensim_visual.py
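
The ordering above could be captured in a small driver script. The following is a minimal sketch and is not part of the published archive. It assumes that all scripts sit in one directory, that each runs without command-line arguments, and that scripts sharing a step run in the order they are listed; the description confirms none of this. The names `PIPELINE`, `ordered_scripts`, and `run_pipeline` are hypothetical.

```python
import subprocess
import sys
from pathlib import Path

# Script order taken from the dataset description (steps 1 to 8).
# Assumption: scripts that share a step run in the order listed.
PIPELINE = [
    (1, ["SettingUpTextFiles.py", "CopyingText.py"]),
    (2, ["SentenceConversion.py"]),
    (3, ["reference_remover.py"]),
    (4, ["testing.py", "testingavg.py"]),
    (5, ["SentenceCreator.py"]),
    (6, ["matscholar_model.py"]),
    (7, ["training_model_gensim.py"]),
    (8, ["word2vecscript.py", "gensim_visual.py"]),
]


def ordered_scripts():
    """Flatten the step table into a single ordered list of file names."""
    return [name for _, names in PIPELINE for name in names]


def run_pipeline(script_dir, dry_run=True):
    """Build one command per script; execute them only if dry_run is False."""
    commands = [
        [sys.executable, str(Path(script_dir) / name)]
        for name in ordered_scripts()
    ]
    if not dry_run:
        for command in commands:
            # check=True stops the run at the first script that fails.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    # Dry run: print the commands without executing anything.
    for command in run_pipeline(".", dry_run=True):
        print(" ".join(command))
```

Running with `dry_run=True` first is a cheap way to confirm the order before any script touches the corpus.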

About this Dataset

Updated: 2025-04-06
Metadata Last Updated: 2021-12-31 00:00:00
Date Created: N/A
Data Provided by:
Dataset Owner: N/A

Access this data

Landing Page URL: https://data.nist.gov/od/id/mds2-3198
Download URL: https://data.nist.gov/od/ds/ark:/88434/mds2-3198/PythonFiles_Maurice_clean.zip
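
Once the archive has been downloaded, its contents can be checked before anything is run. The sketch below lists the Python files inside a zip archive using only the standard library. The function name `list_python_files` is hypothetical, and the folder layout inside the real archive is not documented here, so the demo builds a stand-in archive in memory rather than assuming one.

```python
import io
import zipfile


def list_python_files(zip_source):
    """Return the sorted names of all .py members in a zip archive.

    zip_source may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(zip_source) as archive:
        return sorted(n for n in archive.namelist() if n.endswith(".py"))


if __name__ == "__main__":
    # Stand-in archive built in memory; replace with the path to the
    # downloaded zip file to inspect the real contents.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as archive:
        archive.writestr("example/SentenceConversion.py", "")
        archive.writestr("example/notes.txt", "")
    demo.seek(0)
    print(list_python_files(demo))
```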
Table representation of structured data

Title: Code used to produce terms list in the work "NLP-Driven Electron Microscopy Ontology Development"
Description: This is a collection of code written by Maurice Curran that was used to process the Microscopy and Microanalysis conference proceedings corpus into the word products described in the publication "NLP-Driven Electron Microscopy Ontology Development". The scripts are written in Python and are to be used in the following order: 1. SettingUpTextFiles.py and CopyingText.py to get the raw text files; 2. SentenceConversion.py; 3. reference_remover.py; 4. testing.py and testingavg.py; 5. SentenceCreator.py; 6. matscholar_model.py to get matscholar tags; 7. training_model_gensim.py to get gensim model; 8. word2vecscript.py and gensim_visual.py.
Modified: 2021-12-31 00:00:00
Publisher Name: National Institute of Standards and Technology
Contact: mailto:[email protected]
Keywords: Natural language processing, NLP, electron microscopy, controlled vocabulary, ontology
{
    "identifier": "ark:\/88434\/mds2-3198",
    "accessLevel": "public",
    "contactPoint": {
        "hasEmail": "mailto:[email protected]",
        "fn": "June W. Lau"
    },
    "programCode": [
        "006:045"
    ],
    "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-3198",
    "title": "Code used to produce terms list in the work \"NLP-Driven Electron Microscopy Ontology Development\"",
    "description": "This is a collection of code written by Maurice Curran that was used to process the Microscopy and Microanalysis conference proceeding corpus into word products described in the publication \"NLP-Driven Electron Microscopy Ontology Development\". The scripts are written in Python, to  be used in the following order:1. SettingUpTextFiles.py and CopyingText.py to get the raw text files; 2. SentenceConversion.py; 3. reference_remover.py; 4. testing.py and testingavg.py; 5. SentenceCreator.py; 6. matscholar_model.py to get matscholar tags; 7. training_model_gensim.py to get gensim model;8. word2vecscript.py and gensim_visual.py;",
    "language": [
        "en"
    ],
    "distribution": [
        {
            "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-3198\/PythonFiles_Maurice_clean.zip",
            "description": "This zip file contains a set of scripts that extracts frequently occurring words from the conference proceedings of Microscopy & Microanalysis between the years of 2002 and 2019.",
            "mediaType": "application\/zip",
            "title": "NLP code to produce words about electron microscopy"
        }
    ],
    "bureauCode": [
        "006:55"
    ],
    "modified": "2021-12-31 00:00:00",
    "publisher": {
        "@type": "org:Organization",
        "name": "National Institute of Standards and Technology"
    },
    "theme": [
        "Information Technology:Data and informatics",
        "Materials:Modeling and computational material science",
        "Materials:Materials characterization"
    ],
    "keyword": [
        "Natural language processing",
        "NLP",
        "electron microscopy",
        "controlled vocabulary",
        "ontology"
    ]
}
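
The record above can be read programmatically to locate the downloadable files. The following is a minimal sketch using the standard `json` module. The function name `download_urls` is hypothetical, and the sample record in the demo uses placeholder `example.org` URLs rather than the real ones. Note that the escaped slashes (`\/`) in the record are valid JSON and decode to plain `/`.

```python
import json


def download_urls(record_text, media_type="application/zip"):
    """Return the downloadURL of every distribution with the given media type."""
    record = json.loads(record_text)
    return [
        dist["downloadURL"]
        for dist in record.get("distribution", [])
        if dist.get("mediaType") == media_type and "downloadURL" in dist
    ]


if __name__ == "__main__":
    # Trimmed stand-in record with placeholder URLs; a raw string keeps
    # the JSON "\/" escapes intact in Python source.
    sample = r"""
    {
        "distribution": [
            {
                "downloadURL": "https:\/\/example.org\/files\/archive.zip",
                "mediaType": "application\/zip"
            }
        ]
    }
    """
    print(download_urls(sample))
```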