Selecting hyperparameters
You can adjust three hyperparameters when fine-tuning a model.
| Hyperparameter | Type | Minimum | Maximum | Default |
|---|---|---|---|---|
| Epochs | integer | 1 | 5 | 2 |
| Learning rate | float | 1.00E-06 | 1.00E-04 | 1.00E-05 |
| Learning rate warmup steps | integer | 0 | 20 | 10 |
The default number of epochs is 2, which works well for most cases. In general, larger datasets need fewer epochs to converge, while smaller datasets need more. You can also speed up convergence by increasing the learning rate, but this is less desirable because it can cause training instability near convergence. We recommend starting with the default hyperparameters, which are based on our assessment across tasks of varying complexity and dataset size.
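To make the table concrete, the ranges above can be expressed as a small validation helper. This is a minimal sketch, not part of the service API; the dictionary structure, key names, and `validate_hyperparameters` function are all hypothetical.

```python
# Hypothetical sketch: the tunable hyperparameters and their allowed
# ranges from the table above, expressed as a simple validator.
HYPERPARAMETER_RANGES = {
    "epochs": {"type": int, "min": 1, "max": 5, "default": 2},
    "learning_rate": {"type": float, "min": 1e-6, "max": 1e-4, "default": 1e-5},
    "learning_rate_warmup_steps": {"type": int, "min": 0, "max": 20, "default": 10},
}

def validate_hyperparameters(overrides=None):
    """Merge user overrides with the defaults and check them against the ranges."""
    params = {name: spec["default"] for name, spec in HYPERPARAMETER_RANGES.items()}
    params.update(overrides or {})
    for name, value in params.items():
        spec = HYPERPARAMETER_RANGES.get(name)
        if spec is None:
            raise KeyError(f"unknown hyperparameter: {name}")
        if not isinstance(value, spec["type"]):
            raise TypeError(f"{name} must be {spec['type'].__name__}")
        if not spec["min"] <= value <= spec["max"]:
            raise ValueError(f"{name}={value} is outside [{spec['min']}, {spec['max']}]")
    return params

# Example: accept the defaults but train for 3 epochs.
print(validate_hyperparameters({"epochs": 3}))
```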
During warmup, the learning rate gradually increases to the set value, so avoid a large warmup step count when training on a small dataset; otherwise, the learning rate might never reach the set value during training. We recommend setting the warmup steps by dividing the dataset size by 640 for Amazon Nova Micro, by 160 for Amazon Nova Lite, and by 320 for Amazon Nova Pro.
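The following sketch applies that recommendation. The divisors come from the paragraph above; the function name and clamping to the supported 0-20 range are assumptions for illustration.

```python
# Hypothetical helper implementing the warmup-step recommendation:
# divide the dataset size by a model-specific divisor, then clamp the
# result to the supported range of 0-20 warmup steps.
WARMUP_DIVISORS = {
    "amazon-nova-micro": 640,
    "amazon-nova-lite": 160,
    "amazon-nova-pro": 320,
}

def recommended_warmup_steps(model: str, dataset_size: int) -> int:
    steps = dataset_size // WARMUP_DIVISORS[model]
    return max(0, min(steps, 20))  # keep within the 0-20 range from the table

# Example: a 5,000-sample dataset on Amazon Nova Lite gives 5000 // 160 = 31,
# which is clamped to the maximum of 20 warmup steps.
print(recommended_warmup_steps("amazon-nova-lite", 5_000))
```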