Timewarp datasets
This dataset contains molecular dynamics simulation data that was used to train the neural networks in the NeurIPS 2023 paper Timewarp: Transferable Acceleration of Molecular Dynamics by Learning Time-Coarsened Dynamics by Leon Klein, Andrew Y. K. Foong, Tor Erlend Fjelde, Bruno Mlodozeniec, Marc Brockschmidt, Sebastian Nowozin, Frank Noé, and Ryota Tomioka. Please see the accompanying GitHub repository.
This dataset consists of many molecular dynamics trajectories of small peptides (2-4 amino acids) simulated with an implicit water force field. For each protein two files are available:
- protein-state0.pdb: contains the topology and initial 3D XYZ coordinates.
- protein-arrays.npz: contains trajectory information.
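Since the names and shapes of the arrays stored in each .npz file are not listed here, it can help to inspect a file before loading it. The following is a minimal sketch, not part of the Timewarp code: the helper name `describe_npz` is hypothetical, and it relies only on the fact that an .npz file is a zip archive of .npy files, each starting with a small header that records the array's shape and dtype. It reads the headers only, so it stays cheap even for long trajectories.

```python
import ast
import struct
import zipfile


def describe_npz(path):
    """Return {array_name: (shape, dtype_descr)} without loading array data.

    `path` may be a file name or an open binary file object.
    """
    summary = {}
    with zipfile.ZipFile(path) as archive:
        for member in archive.namelist():
            if not member.endswith(".npy"):
                continue
            with archive.open(member) as handle:
                if handle.read(6) != b"\x93NUMPY":
                    raise ValueError(f"{member} is not a valid .npy file")
                major, _minor = handle.read(2)
                # Format version 1 stores the header length in 2 bytes,
                # later versions store it in 4 bytes (little-endian).
                if major == 1:
                    (header_len,) = struct.unpack("<H", handle.read(2))
                else:
                    (header_len,) = struct.unpack("<I", handle.read(4))
                # The header is a Python dict literal with the keys
                # "descr", "fortran_order" and "shape".
                header = ast.literal_eval(handle.read(header_len).decode("latin1"))
            summary[member[: -len(".npy")]] = (header["shape"], header["descr"])
    return summary
```

Calling `describe_npz("protein-arrays.npz")` on a downloaded file (with `protein` replaced by the actual peptide name) returns a dictionary of array names mapped to their shape and dtype. The arrays themselves can then be read with `numpy.load`.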
The datasets are split into the following directories:
2AA-1-big "Two Amino Acid" data set
This folder contains a data set of all-atom molecular dynamics trajectories for 380 of the 400 dipeptides, i.e. small proteins composed of two amino acids. This dataset was originally created missing 20 of the 400 possible dipeptides. The 2AA-1-complete dataset completes this by including all 400. Each peptide is simulated using classical molecular dynamics and the water is simulated using an implicit water model. The trajectories are only saved every 10000 MD steps. There is no intermediate spacing, unlike the other datasets for the Timewarp project.
2AA-1-complete "Two Amino Acid" data set
This folder contains a data set of all-atom molecular dynamics trajectories for all 400 dipeptides, i.e. small proteins composed of two amino acids. This also includes the peptides missing from the other 2AA datasets. Each peptide is simulated using classical molecular dynamics and the water is simulated using an implicit water model.
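The figure of 400 dipeptides follows from the 20 standard amino acids: every ordered pair gives 20 × 20 = 400 sequences. A minimal sketch of that enumeration is shown here. The one-letter codes are the standard ones, but whether the files in this dataset are named by these codes is an assumption, and `all_peptides` is a hypothetical helper.

```python
from itertools import product

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_peptides(length):
    """Return every amino-acid sequence of the given length, order-sensitive."""
    return ["".join(residues) for residues in product(AMINO_ACIDS, repeat=length)]


dipeptides = all_peptides(2)
print(len(dipeptides))  # 400
```

The same helper with `length=4` yields 160000 possible tetrapeptides, so the tetrapeptide data sets described here necessarily cover only a subset of sequences.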
4AA-huge "Four Amino Acid" data set, tetrapeptides
This folder contains a data set of all-atom molecular dynamics trajectories for tetrapeptides, i.e. small proteins composed of four amino acids. The data set contains mostly validation and test trajectories, as it was mostly used for validation and test purposes. The training trajectories used are usually shorter. Each peptide is simulated for 1 microsecond using classical molecular dynamics and the water is simulated using an implicit water model.
4AA-large "Four Amino Acid" data set, tetrapeptides
This folder contains a data set of all-atom molecular dynamics trajectories for 2333 tetrapeptides, i.e. small proteins composed of four amino acids. The data set is split into 1500 tetrapeptides in the train set, 400 in validation, and 433 in test. Each peptide in the train set is simulated for 50 ns using classical molecular dynamics and the water is simulated using an implicit water model. Each peptide in the validation and test sets is simulated for 500 ns.
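As a quick consistency check on the figures above, the split sizes and per-peptide simulation lengths can be tallied as follows. This is only arithmetic on the numbers stated in this section; the dictionary layout is an illustrative choice, not a format used by the dataset.

```python
# split name -> (number of tetrapeptides, simulated time per peptide in ns)
SPLITS = {
    "train": (1500, 50),
    "validation": (400, 500),
    "test": (433, 500),
}

total_peptides = sum(count for count, _ in SPLITS.values())
total_ns = sum(count * ns for count, ns in SPLITS.values())

print(total_peptides)   # 2333
print(total_ns / 1000)  # 491.5 (microseconds of simulated time in total)
```

Although the train set holds most of the peptides, the longer validation and test trajectories account for most of the simulated time.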
Responsible AI FAQ
- What is Timewarp?
- Timewarp is a neural network that predicts the future 3D positions of a small peptide (2-4 amino acids) based on its current state. It is a research project that investigates using deep learning to accelerate molecular dynamics simulations.
- What can Timewarp do?
- Timewarp can be used to sample from the equilibrium distribution of small peptides.
- What is/are Timewarp’s intended use(s)?
- Timewarp is intended for machine learning and molecular dynamics research purposes only.
- How was Timewarp evaluated? What metrics are used to measure performance?
- Timewarp was evaluated by comparing the speed of molecular dynamics sampling with standard molecular dynamics systems that rely on numerical integration. Timewarp is sometimes faster than these standard systems.
- What are the limitations of Timewarp? How can users minimize the impact of Timewarp’s limitations when using the system?
- As a research project, Timewarp has many limitations. The main ones are that it only works for very small peptides (2-4 amino acids), and that it does not lead to a wall-clock speed up for many peptides.
- What operational factors and settings allow for effective and responsible use of Timewarp?
- Timewarp should be used for research purposes only.
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.