Config Templates

We use Hydra for configuration management in the benchmark. The configuration files are stored in the config folder.

Configs are built by composing individual configs, one for each component, according to a template schema.
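As a rough mental model of this composition, the sketch below merges a few config fragments in order, with later fragments overriding earlier ones. It is not Hydra's implementation: the helper names (`deep_merge`, `compose_fragments`) and the fragment contents (`num_layers`, `lr`) are hypothetical and chosen only for illustration.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` take precedence over `base`."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Otherwise the later value replaces the earlier one.
            merged[key] = deepcopy(value)
    return merged


def compose_fragments(fragments: list) -> dict:
    """Merge config fragments in list order; later fragments win on conflicts."""
    config: dict = {}
    for fragment in fragments:
        config = deep_merge(config, fragment)
    return config


# Hypothetical fragments, one per component (values are made up).
encoder_cfg = {"encoder": {"name": "egnn", "num_layers": 4}}
optimiser_cfg = {"optimiser": {"name": "adam", "lr": 1e-3}}
# The template's own values are merged last, so they override the fragments.
template_cfg = {"seed": 52, "optimiser": {"lr": 1e-4}}

config = compose_fragments([encoder_cfg, optimiser_cfg, template_cfg])
print(config)
```

Merging the template's own values last mirrors placing `_self_` at the bottom of a Hydra defaults list, which is why values set in the template file take precedence over the composed defaults.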

Below, we document the different config templates used in the benchmark.

Training

This config can be used for both training a model from scratch and pre-training a model.

config/train.yaml
# @package _global_

# === 1. Set config parameters ===
name: "" # default name for the experiment, "" means logger (eg. wandb) will generate a unique name
seed: 52 # seed for random number generators in pytorch, numpy and python.random
num_workers: 16 # number of subprocesses to use for data loading.

# === 2. Specify defaults here. Defaults will be overwritten by equivalently named options in this file ===
defaults:
  - env: default
  - dataset: cath
  - features: ca_seq
  - encoder: egnn
  - decoder: default
  - transforms: default
  - callbacks: default
  - optimiser: adam
  - scheduler: none
  - trainer: gpu
  - extras: default
  - hydra: default
  - metrics: none
  - task: inverse_folding
  - logger: csv # Also supported: tensorboard, wandb
  # debugging config (enable through command line, e.g. `python train.py debug=default)
  - debug: null
  - optional hparams: ${encoder}_${features}
  - _self_ # see: https://hydra.cc/docs/upgrades/1.0_to_1.1/default_composition_order/. Adding _self_ at bottom means values in this file override defaults.

task_name: "train"
test: False
#compile: True
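The `optional hparams: ${encoder}_${features}` entry builds the name of the hparams config from the options selected for the encoder and features groups. The `optional` keyword means Hydra skips the entry when no config with that name exists. The sketch below only illustrates the name substitution, using Python's `string.Template` (which happens to accept the same `${name}` placeholder syntax) rather than Hydra's own resolver. The option name `other_encoder` is hypothetical.

```python
from string import Template

# Default selections taken from the defaults list in config/train.yaml.
selected = {"encoder": "egnn", "features": "ca_seq"}

# The hparams entry, written with the same ${...} placeholders as in the template.
hparams_entry = Template("${encoder}_${features}")

# With the defaults, the entry refers to an hparams config named "egnn_ca_seq".
print(hparams_entry.substitute(selected))

# Selecting a different encoder changes which hparams config is looked up.
# "other_encoder" is a hypothetical option name used only for illustration.
overridden = {**selected, "encoder": "other_encoder"}
print(hparams_entry.substitute(overridden))
```

In practice this means that changing the encoder or features selection also changes which hyperparameter config, if any, is picked up.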

Finetuning

This config should be used to finetune a pre-trained model on a downstream task.

config/finetune.yaml
# @package _global_

# === 1. Set config parameters ===
name: "" # default name for the experiment, "" means logger (eg. wandb) will generate a unique name
seed: 52 # seed for random number generators in pytorch, numpy and python.random
num_workers: 16 # number of subprocesses to use for data loading.

# === 2. Specify defaults here. Defaults will be overwritten by equivalently named options in this file ===
defaults:
  - env: default
  - dataset: cath
  - features: ca_seq
  - encoder: egnn
  - decoder: default
  - transforms: none
  - callbacks: default
  - optimiser: adam
  - scheduler: none
  - trainer: gpu
  - extras: default
  - hydra: default
  - metrics: none
  - task: inverse_folding # See: /proteinworkshop/config/task/
  - logger: wandb # wandb, tensorboard, csv
  - finetune: default # Specifies finetuning config. See: proteinworkshop/config/finetune/
  # debugging config (enable through command line, e.g. `python train.py debug=default)
  - debug: null
  - optional hparams: ${encoder}_${features}
  - _self_ # see: https://hydra.cc/docs/upgrades/1.0_to_1.1/default_composition_order/. Adding _self_ at bottom means values in this file override defaults.

task_name: "finetune"

#compile: True
compile: False

# simply provide checkpoint path to resume training
ckpt_path: null
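The two templates share most of their defaults. As a quick way to see where they differ, the sketch below compares the defaults lists of config/train.yaml and config/finetune.yaml, transcribed by hand into dictionaries. The helper `diff_defaults` is hypothetical and not part of the benchmark. The `debug`, `hparams` and `_self_` entries are identical in both files and are left out.

```python
# Defaults transcribed from config/train.yaml.
train_defaults = {
    "env": "default", "dataset": "cath", "features": "ca_seq",
    "encoder": "egnn", "decoder": "default", "transforms": "default",
    "callbacks": "default", "optimiser": "adam", "scheduler": "none",
    "trainer": "gpu", "extras": "default", "hydra": "default",
    "metrics": "none", "task": "inverse_folding", "logger": "csv",
}

# Defaults transcribed from config/finetune.yaml.
finetune_defaults = {
    "env": "default", "dataset": "cath", "features": "ca_seq",
    "encoder": "egnn", "decoder": "default", "transforms": "none",
    "callbacks": "default", "optimiser": "adam", "scheduler": "none",
    "trainer": "gpu", "extras": "default", "hydra": "default",
    "metrics": "none", "task": "inverse_folding", "logger": "wandb",
    "finetune": "default",
}


def diff_defaults(left: dict, right: dict) -> dict:
    """Map each group whose selection differs to a (left, right) pair; None means absent."""
    changes = {}
    for group in sorted(left.keys() | right.keys()):
        if left.get(group) != right.get(group):
            changes[group] = (left.get(group), right.get(group))
    return changes


differences = diff_defaults(train_defaults, finetune_defaults)
for group, (train_value, finetune_value) in differences.items():
    print(f"{group}: train={train_value}, finetune={finetune_value}")
```

Beyond the defaults list, the templates also differ in their top-level keys: `task_name` is "train" versus "finetune", `test` appears only in the training template, and `compile` and `ckpt_path` are set only in the finetuning template.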