Selecting hyperparameters

There are three hyperparameters that you can adjust when fine-tuning a model.

| Hyperparameter | Type | Minimum | Maximum | Default |
| --- | --- | --- | --- | --- |
| Epochs | integer | 1 | 5 | 2 |
| Learning rate | float | 1.00E-06 | 1.00E-04 | 1.00E-05 |
| Learning rate warmup steps | integer | 0 | 20 | 10 |

The default epoch number is 2, which works for most cases. In general, larger datasets require fewer epochs to converge, while smaller datasets require more epochs to converge. Faster convergence can also be achieved by increasing the learning rate, but this is less desirable because it can lead to training instability as the model converges. We recommend starting with the default hyperparameters, which are based on our assessment across tasks of varying complexity and data sizes.
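As a minimal sketch, these hyperparameters can be passed when you submit a fine-tuning job with boto3. The hyperparameter key names (`epochCount`, `learningRate`, `learningRateWarmupSteps`), the base model identifier, and the resource names below are illustrative assumptions; check the Amazon Bedrock documentation for the exact values your model version expects.

```python
import boto3

bedrock = boto3.client("bedrock")

# Submit a fine-tuning job with explicit hyperparameter values.
# All names, ARNs, and S3 URIs below are hypothetical placeholders.
response = bedrock.create_model_customization_job(
    jobName="nova-micro-finetune-job",
    customModelName="nova-micro-finetuned",
    roleArn="arn:aws:iam::111122223333:role/BedrockFineTuneRole",
    baseModelIdentifier="amazon.nova-micro-v1:0",
    trainingDataConfig={"s3Uri": "s3://my-bucket/train.jsonl"},
    outputDataConfig={"s3Uri": "s3://my-bucket/output/"},
    hyperParameters={
        "epochCount": "2",                # default: 2 (range 1-5)
        "learningRate": "0.00001",        # default: 1e-5 (range 1e-6 to 1e-4)
        "learningRateWarmupSteps": "10",  # default: 10 (range 0-20)
    },
)
print(response["jobArn"])
```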

The learning rate gradually increases to the set value during warmup, so avoid a large warmup-step count when training on a small sample; otherwise, your learning rate might never reach the set value during training. We recommend setting the warmup steps by dividing the dataset size by 640 for Amazon Nova Micro, 160 for Amazon Nova Lite, and 320 for Amazon Nova Pro, as shown in the sketch below.
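The following helper works through that recommendation. The per-model divisors come from the guidance above; the function name, dictionary keys, and the choice to clamp the result to the allowed 0-20 range are assumptions made for illustration.

```python
# Per-model divisors from the warmup-step recommendation above.
WARMUP_DIVISORS = {
    "nova-micro": 640,
    "nova-lite": 160,
    "nova-pro": 320,
}

def recommended_warmup_steps(dataset_size: int, model: str) -> int:
    """Return suggested warmup steps, clamped to the allowed 0-20 range."""
    steps = dataset_size // WARMUP_DIVISORS[model]
    return max(0, min(20, steps))

# Example: a 6,400-sample dataset on Amazon Nova Micro suggests
# 6400 / 640 = 10 warmup steps, which matches the default.
print(recommended_warmup_steps(6400, "nova-micro"))  # -> 10
```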