Advanced Model Options (Ollama)
Hollama lets you fine-tune the behavior of Ollama models by adjusting their parameters. These controls are available from the Controls tab within a session's prompt editor.
Note: These advanced options are currently available for Ollama models only.
To open them, click the Settings icon next to the model selector.
The available parameters, based on the OllamaOptions interface defined in the source code, are listed below.
Model Options
These parameters control the generation process and sampling.
interface OllamaOptions {
// Generation
mirostat: number; // Enable Mirostat sampling (0 = off, 1 = Mirostat, 2 = Mirostat 2.0).
mirostat_eta: number; // Mirostat learning rate.
mirostat_tau: number; // Mirostat target surprise value.
num_ctx: number; // Context window size.
num_predict: number; // Max number of tokens to predict.
repeat_last_n: number; // How far back to look for repetitions.
repeat_penalty: number; // Penalty for repetition.
temperature: number; // Controls randomness. Higher is more creative.
seed: number; // Random seed for reproducibility.
stop: string[]; // Sequences where the model will stop generating.
tfs_z: number; // Tail-free sampling.
top_k: number; // Top-K sampling.
top_p: number; // Top-P (nucleus) sampling.
min_p: number; // Min-P sampling.
// Penalties
penalize_newline: boolean; // Penalize newline tokens.
presence_penalty: number; // Penalize tokens that have already appeared in the output.
frequency_penalty: number; // Penalize tokens in proportion to how often they appear.
typical_p: number; // Locally typical sampling.
}
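To illustrate how these generation options are typically combined, here is a minimal sketch of merging user-supplied values with commonly cited Ollama defaults. The `GenerationOptions` type, the `withDefaults` helper, and the default values below (temperature 0.8, top_k 40, top_p 0.9, repeat_penalty 1.1, num_ctx 2048) are illustrative assumptions drawn from Ollama's documentation, not Hollama's actual implementation.

```typescript
// Hypothetical subset of the generation options above; all fields optional,
// as a caller usually overrides only a few of them.
interface GenerationOptions {
  temperature?: number;
  top_k?: number;
  top_p?: number;
  repeat_penalty?: number;
  num_ctx?: number;
  stop?: string[];
}

// Assumed defaults (values taken from Ollama's docs; may vary by model).
const DEFAULTS: Required<Omit<GenerationOptions, "stop">> = {
  temperature: 0.8,
  top_k: 40,
  top_p: 0.9,
  repeat_penalty: 1.1,
  num_ctx: 2048,
};

// Merge user options over the defaults; undefined keys are ignored so
// they do not clobber a default with `undefined`.
function withDefaults(user: GenerationOptions): GenerationOptions {
  const merged: GenerationOptions = { ...DEFAULTS };
  for (const [key, value] of Object.entries(user)) {
    if (value !== undefined) (merged as Record<string, unknown>)[key] = value;
  }
  return merged;
}

console.log(withDefaults({ temperature: 0.2, stop: ["</answer>"] }));
```

Lower temperatures make output more deterministic, so a caller might override only `temperature` while keeping the remaining defaults intact.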
Runtime Options
These parameters affect how the model is loaded and run on the hardware. They belong to the same OllamaOptions interface and are shown separately here for readability.
interface OllamaOptions {
// Hardware & Performance
num_gpu: number; // Number of model layers to offload to the GPU.
main_gpu: number; // Main GPU to use.
low_vram: boolean; // Use for systems with low VRAM.
f16_kv: boolean; // Use 16-bit floats for KV cache.
numa: boolean; // Enable NUMA support.
num_batch: number; // Batch size for prompt processing.
num_thread: number; // Number of threads to use.
// Memory Management
use_mmap: boolean; // Use memory-mapped files.
use_mlock: boolean; // Force the model to be kept in RAM.
// Other
num_keep: number; // Number of tokens to keep from the start of the context.
vocab_only: boolean; // Load only the vocabulary, not the model weights.
}
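All of these parameters travel together under a single `options` key when talking to Ollama's REST API. The sketch below assembles (but does not send) a request body for the `/api/generate` endpoint; the `model`, `prompt`, `stream`, and `options` fields follow the Ollama API, while the specific option values are illustrative, not recommendations.

```typescript
// Shape of a request body for Ollama's /api/generate endpoint.
interface OllamaRequest {
  model: string;
  prompt: string;
  stream?: boolean;
  options?: Record<string, number | boolean | string[]>;
}

function buildRequest(model: string, prompt: string): OllamaRequest {
  return {
    model,
    prompt,
    stream: false, // return one complete response instead of a token stream
    options: {
      // Generation options (from the first group above)
      temperature: 0.7,
      num_predict: 256,
      // Runtime options (from the second group above); illustrative values
      num_gpu: 1,     // offload one layer to the GPU
      use_mmap: true, // memory-map the model file
      num_thread: 4,  // CPU threads for inference
    },
  };
}

const body = JSON.stringify(buildRequest("llama3", "Why is the sky blue?"));
// To actually send it against a local server:
// fetch("http://localhost:11434/api/generate", { method: "POST", body })
console.log(body);
```

Because unspecified options fall back to the model's defaults, a payload normally includes only the parameters you want to change.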
For a detailed explanation of each parameter, please refer to the official Ollama documentation on parameters.