Data provided by National Institute of Standards and Technology
A code repository and accompanying data for incorporating imperfect theory into machine learning for improved prediction and explainability. Specifically, it focuses on a case study: the dimensions of a polymer chain in different solvent qualities. It includes Jupyter Notebooks for quickly testing concepts and reproducing figures, as well as source code that computes the mean squared error as a function of dataset size for various machine learning models. For additional details on the data, please refer to the README.md associated with the data. For additional details on the code, please refer to the README.md provided with the code repository (GitHub Repo for Theory aware Machine Learning). For additional details on the methodology, see a forthcoming manuscript titled "Leveraging theory for enhanced machine learning," by Debra J. Audus, Austin McDannald and Brian DeCost.
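To make "mean squared error as a function of dataset size" concrete, here is a minimal sketch of a learning curve. It is not the repository's code: the least-squares line fit stands in for the machine learning models, the data are synthetic, and the names `mse`, `fit_line`, `learning_curve` and `sample` are hypothetical.

```python
import random
import statistics


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return statistics.fmean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def learning_curve(train, test, sizes):
    """Test-set MSE of a line fit on the first n training points, for each n."""
    x_test, y_test = zip(*test)
    curve = {}
    for n in sizes:
        xs, ys = zip(*train[:n])
        a, b = fit_line(xs, ys)
        curve[n] = mse(y_test, [a * x + b for x in x_test])
    return curve


def sample(rng, k):
    """Synthetic, illustrative data only (NOT from the TaML dataset)."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(k)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)) for x in xs]


rng = random.Random(0)
train, test = sample(rng, 200), sample(rng, 100)
curve = learning_curve(train, test, [5, 20, 200])
for n, err in curve.items():
    print(f"n={n:4d}  test MSE={err:.4f}")
```

Evaluating every training size against the same held-out test set keeps the errors comparable across sizes.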
About this Dataset
Updated: 2022-07-29
Metadata Last Updated:
2022-05-06
Date Created:
N/A
Dataset Owner:
N/A
Title | Theory aware Machine Learning (TaML) |
---|---|
Description | A code repository and accompanying data for incorporating imperfect theory into machine learning for improved prediction and explainability. Specifically, it focuses on a case study: the dimensions of a polymer chain in different solvent qualities. It includes Jupyter Notebooks for quickly testing concepts and reproducing figures, as well as source code that computes the mean squared error as a function of dataset size for various machine learning models. For additional details on the data, please refer to the README.md associated with the data. For additional details on the code, please refer to the README.md provided with the code repository (GitHub Repo for Theory aware Machine Learning). For additional details on the methodology, see a forthcoming manuscript titled "Leveraging theory for enhanced machine learning," by Debra J. Audus, Austin McDannald and Brian DeCost. |
Modified | N/A |
Publisher Name | National Institute of Standards and Technology |
Contact | mailto:debra.audus@nist.gov |
Keywords | polymers, machine learning, transfer learning, theory |
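The JSON record that follows lists each downloadable file alongside a `.sha256` entry. The sketch below shows one way to pair the two and check a download. The function names `pair_checksums` and `sha256_matches` are hypothetical, and the record does not state the layout of the `.sha256` files, so the code assumes the hex digest is the first whitespace-separated token.

```python
import hashlib
import json

# Short excerpt of the "distribution" array from the record (URLs copied as-is).
EXCERPT = r'''
{ "distribution": [
  { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_direct.csv", "mediaType": "text\/csv" },
  { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_direct.csv.sha256", "mediaType": "text\/plain" },
  { "accessURL": "https:\/\/github.com\/usnistgov\/TaML", "format": "HTML" }
] }
'''


def pair_checksums(distribution):
    """Map each data downloadURL to its .sha256 companion, when one is listed."""
    # Entries with only an accessURL (such as the GitHub link) are skipped.
    urls = [d["downloadURL"] for d in distribution if "downloadURL" in d]
    sidecars = {u for u in urls if u.endswith(".sha256")}
    return {
        u: u + ".sha256"
        for u in urls
        if not u.endswith(".sha256") and u + ".sha256" in sidecars
    }


def sha256_matches(data: bytes, sidecar_text: str) -> bool:
    """Compare the SHA-256 of data with the digest in the sidecar text."""
    # Assumption: the hex digest is the first whitespace-separated token.
    expected = sidecar_text.split()[0].lower()
    return hashlib.sha256(data).hexdigest() == expected


record = json.loads(EXCERPT)  # json.loads decodes the "\/" escapes to "/"
for data_url, checksum_url in pair_checksums(record["distribution"]).items():
    print(data_url, "->", checksum_url)
```

After fetching both files by whatever means is convenient, pass the raw bytes of the CSV and the decoded text of the checksum file to `sha256_matches`.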
{ "identifier": "ark:\/88434\/mds2-2637", "accessLevel": "public", "contactPoint": { "hasEmail": "mailto:debra.audus@nist.gov", "fn": "Debra Audus" }, "programCode": [ "006:045" ], "@type": "dcat:Dataset", "landingPage": "https:\/\/data.nist.gov\/od\/id\/mds2-2637", "description": "A code repository and accompanying data for incorporating imperfect theory into machine learning for improved prediction and explainability. Specifically, it focuses on the case study of the dimensions of a polymer chain in different solvent qualities. Jupyter Notebooks for quickly testing concepts and reproducing figures, as well as source code that computes the mean squared error as a function of dataset size for various machine learning models are included.For additional details on the data, please refer to the README.md associated with the data. For additional details on the code, please refer to the README.md provided with the code repository (GitHub Repo for Theory aware Machine Learning). For additional details on the methodology, see a forthcoming manuscript titled \"Leveraging theory for enhanced machine learning,\" by Debra J. 
Audus, Austin McDannald and Brian DeCost.", "language": [ "en" ], "title": "Theory aware Machine Learning (TaML)", "distribution": [ { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_direct.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_direct.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_difference.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_difference.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_quotient.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_quotient.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_linearprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_linearprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_fixedprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_fixedprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_parameterization.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_parameterization.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_direct.csv", "mediaType": "text\/csv" }, { "downloadURL": 
"https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_direct.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_difference.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_difference.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_quotient.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_quotient.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_linearprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_linearprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_fixedprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_fixedprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_parameterization.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_parameterization.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_direct.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_direct.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_difference.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_difference.csv.sha256", 
"mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_quotient.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_quotient.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_linearprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_linearprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_fixedprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_fixedprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_parameterization.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_parameterization.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/out_theory.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/out_theory.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_direct_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_direct_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_quotient_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_quotient_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": 
"https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_latentvariable_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_latentvariable_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_multitask_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_multitask_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_difference_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_difference_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_quotient_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_quotient_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_latentvariable_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_latentvariable_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_multitask_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_multitask_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_latentvariable.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_latentvariable.csv.sha256", "mediaType": 
"text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_latentvariable.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_out_latentvariable.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_latentvariable.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_latentvariable.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_latentvariable.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/homo_out_latentvariable.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_difference_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_difference_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_direct_1000_None.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/rf_out_direct_1000_None.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/testtrain\/rgmaindata.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/testtrain\/rgmaindata.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/testtrain\/rgoutlierdata.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/testtrain\/rgoutlierdata.csv.sha256", "mediaType": "text\/plain" }, { 
"downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/README.md.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/README.md", "format": "text", "description": "README file for Theory aware Machine Learning (TaML)", "mediaType": "text\/plain", "title": "README" }, { "accessURL": "https:\/\/github.com\/usnistgov\/TaML", "format": "HTML", "title": "GitHub Repo for Theory aware Machine Learning (TaML)" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_direct.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_direct.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/theory.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/theory.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_difference.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_difference.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_quotient.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_quotient.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_linearprior.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_linearprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_fixedprior.csv", "mediaType": "text\/csv" }, { "downloadURL": 
"https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_fixedprior.csv.sha256", "mediaType": "text\/plain" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_parameterization.csv", "mediaType": "text\/csv" }, { "downloadURL": "https:\/\/data.nist.gov\/od\/ds\/ark:\/88434\/mds2-2637\/mse\/hetero_parameterization.csv.sha256", "mediaType": "text\/plain" } ], "license": "https:\/\/www.nist.gov\/open\/license", "bureauCode": [ "006:55" ], "modified": "2022-05-06 00:00:00", "publisher": { "@type": "org:Organization", "name": "National Institute of Standards and Technology" }, "accrualPeriodicity": "irregular", "theme": [ "Mathematics and Statistics:Uncertainty quantification", "Materials:Modeling and computational material science", "Information Technology:Data and informatics", "Materials:Polymers" ], "keyword": [ "polymers", "machine learning", "transfer learning", "theory" ] }