Some of the parameters available in the GCP PaLM API are:
temperature
Temperature values range from 0.0 to 1.0. A higher temperature means more randomness in the model's response: the model is more likely to choose unexpected words, which makes the output more creative.
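To build intuition, here is a minimal Python sketch of temperature scaling (the logits and the sample_with_temperature helper are made up for illustration; the API applies this internally):

```python
import math
import random

def sample_with_temperature(logits, temperature):
    """Sample a token index after scaling logits by temperature.

    Lower temperature sharpens the distribution (more deterministic);
    higher temperature flattens it (more random). Temperature must be
    > 0 here; in the API, 0.0 simply means the top token is always picked.
    """
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs, k=1)[0]

# Made-up logits for four candidate tokens.
logits = [2.0, 1.0, 0.5, 2.5]
print(sample_with_temperature(logits, temperature=0.2))  # almost always token 3
print(sample_with_temperature(logits, temperature=1.0))  # noticeably more varied
```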
max_output_tokens
Describes the maximum number of tokens the model will generate. One hundred tokens correspond to roughly 60-80 words.
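For reference, all of these parameters are passed directly to the model call. The sketch below assumes the Vertex AI Python SDK (google-cloud-aiplatform) and the text-bison PaLM model; the project ID and location are placeholders:

```python
import vertexai
from vertexai.language_models import TextGenerationModel

# Placeholders: use your own GCP project ID and region.
vertexai.init(project="my-project", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison@001")
response = model.predict(
    "Write a short poem about the ocean.",
    temperature=0.7,        # 0.0-1.0; higher means more creative output
    max_output_tokens=256,  # ~100 tokens is roughly 60-80 words
    top_p=0.8,
    top_k=40,
)
print(response.text)
```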
top_p
Has a value range of 0.0 to 1.0. This sets a cutoff on the cumulative probability of the next generated token. For example, suppose the candidate next tokens are A, B, C, and D with probabilities 0.3, 0.2, 0.1, and 0.4. If top_p = 0.7, the generated token is limited to D and A, because these are the top tokens that make up 70% of the cumulative probability (0.4 + 0.3).
A lower top_p makes the model's response less random, since fewer candidate tokens survive the cutoff.
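The cutoff from the example above can be reproduced with a short sketch (the top_p_filter helper is hypothetical, written only to mirror the description):

```python
def top_p_filter(token_probs, top_p):
    """Keep the most probable tokens until their cumulative probability reaches top_p."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        cumulative += prob
        if cumulative >= top_p:
            break
    return kept

# The example from the text: four candidate tokens.
probs = {"A": 0.3, "B": 0.2, "C": 0.1, "D": 0.4}
print(top_p_filter(probs, 0.7))  # ['D', 'A'] -- 0.4 + 0.3 reaches the 0.7 cutoff
```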
top_k
This parameter takes an integer value from 1 to 40. It limits the pool of candidate next tokens to the top_k most probable ones. For example, if top_k = 5, the model samples the next token from the 5 most probable candidates. This means the smaller the top_k value, the less random the generated output.
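A matching sketch for top_k filtering (again a hypothetical helper, not the API's internals):

```python
def top_k_filter(token_probs, top_k):
    """Keep only the top_k most probable candidate tokens."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:top_k])

probs = {"A": 0.3, "B": 0.2, "C": 0.1, "D": 0.4}
print(top_k_filter(probs, 2))  # {'D': 0.4, 'A': 0.3}
```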
As for the order in which top_k and top_p are applied: the model first narrows the candidate tokens using top_k, and then applies the cumulative-probability cutoff using top_p to the remaining tokens.
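Putting the two together, here is an illustrative sketch of that order (top_k first, then top_p); the helper and values are made up:

```python
def filter_candidates(token_probs, top_k, top_p):
    """Apply top_k first, then top_p, mirroring the order described above."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    # Step 1: keep only the top_k candidates.
    ranked = ranked[:top_k]
    # Step 2: within those, keep tokens until cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        cumulative += prob
        if cumulative >= top_p:
            break
    return kept

probs = {"A": 0.3, "B": 0.2, "C": 0.1, "D": 0.4}
print(filter_candidates(probs, top_k=3, top_p=0.7))  # ['D', 'A']
```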