Configuration

AIDE ML's behavior is controlled through a YAML configuration file, environment variables, and command-line arguments. This page details the available options.

Configuration File

The default configuration is stored in aide/utils/config.yaml. To customize it, create your own YAML file and pass it to the aide command.
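For instance, a minimal custom file (the filename is illustrative) might override only a few agent settings, with everything else falling back to the defaults shown below:

```yaml
# my_config.yaml -- hypothetical custom config
agent:
  steps: 50          # run more improvement iterations than the default 20
  code:
    model: gpt-4.1   # use a different coding model
```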

Here is the default configuration file with explanations:

# path to the task data directory
data_dir: null

# either provide a path to a plaintext file describing the task
desc_file: null
# or provide the task goal (and optionally evaluation information) as arguments
goal: null
eval: null

log_dir: logs
workspace_dir: workspaces

# whether to unzip any archives in the data directory
preprocess_data: True
# whether to copy the data to the workspace directory (otherwise it will be symlinked)
copy_data: True

exp_name: null # a random experiment name will be generated if not provided

# settings for code execution
exec:
  # Timeout in seconds for each script execution
  timeout: 3600
  # The filename for the agent's code within its workspace
  agent_file_name: runfile.py
  # Format tracebacks using IPython's verbose style
  format_tb_ipython: False

# Whether to generate a final markdown report from the journal
generate_report: True

# LLM settings for the final report generation
report:
  model: gpt-4.1
  temp: 1.0

# agent hyperparams
agent:
  # how many improvement iterations to run
  steps: 20
  # number of cross-validation folds the agent is instructed to use (set to 1 to disable)
  k_fold_validation: 5
  # whether to instruct the agent to generate a prediction function
  expose_prediction: False
  # whether to provide the agent with a preview of the data
  data_preview: True

  # LLM settings for coding (draft, improve, debug)
  code:
    model: o4-mini
    temp: 0.5

  # LLM settings for evaluating program output / tracebacks
  feedback:
    model: gpt-4.1-mini
    temp: 0.5

  # hyperparameters for the tree search
  search:
    # How many consecutive debug steps are allowed for a single branch
    max_debug_depth: 3
    # Probability of choosing to debug a buggy node vs. improving a good one
    debug_prob: 0.5
    # Number of initial solutions to draft before starting the improve/debug loop
    num_drafts: 5
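Taken together, the search hyperparameters imply a simple step-selection policy: draft until num_drafts initial solutions exist, then either debug a buggy branch (with probability debug_prob, for at most max_debug_depth consecutive attempts) or improve a working one. A minimal Python sketch of that policy, illustrative only and not AIDE ML's actual implementation:

```python
import random

def choose_action(num_existing_drafts, has_buggy_nodes, has_good_nodes,
                  num_drafts=5, debug_prob=0.5):
    """Illustrative step-selection policy implied by the search hyperparameters.
    This sketches the documented behavior; it is not AIDE ML's actual code."""
    if num_existing_drafts < num_drafts:
        return "draft"    # keep drafting initial solutions
    if has_buggy_nodes and (not has_good_nodes or random.random() < debug_prob):
        return "debug"    # try to fix a buggy node (bounded by max_debug_depth)
    if has_good_nodes:
        return "improve"  # refine the most promising working solution
    return "draft"        # nothing usable yet; draft another solution
```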

Overriding Configuration via CLI

Any parameter in the YAML file can be overridden from the command line using dot notation. This is the most common way to change agent behavior for a specific run.

Example: To change the number of steps and the coding model:

aide data_dir=... goal=... eval=... agent.steps=50 agent.code.model="claude-3-5-sonnet-20240620"
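The semantics of these overrides are straightforward: each dot-separated key names a path into the nested YAML structure, and the final segment is set to the given value. A short Python sketch of that behavior (illustrative; AIDE ML's actual parsing may differ):

```python
def apply_override(config: dict, dotted_key: str, value) -> dict:
    """Set a value in a nested dict using dot notation, e.g. 'agent.code.model'.
    Illustrates the override semantics; not AIDE ML's actual implementation."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = value
    return config

cfg = {"agent": {"steps": 20, "code": {"model": "o4-mini", "temp": 0.5}}}
apply_override(cfg, "agent.steps", 50)
apply_override(cfg, "agent.code.model", "claude-3-5-sonnet-20240620")
# cfg["agent"]["steps"] is now 50; the untouched temp keeps its default
```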

Environment Variables for API Keys

AIDE ML uses environment variables to access LLM provider API keys.

  • OpenAI: OPENAI_API_KEY
  • Anthropic: ANTHROPIC_API_KEY
  • Gemini: GEMINI_API_KEY
  • OpenRouter: OPENROUTER_API_KEY

Set them in your shell before running aide:

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

Using Local LLMs (Ollama Example)

To use a local LLM that exposes an OpenAI-compatible API (like Ollama), you can set the OPENAI_BASE_URL environment variable. The agent will then route requests for OpenAI models to this URL.

  1. Set the Base URL:

    export OPENAI_BASE_URL="http://localhost:11434/v1"
  2. Run aide with the local model name:

    aide agent.code.model="qwen2" data_dir=... goal=... eval=...

    Note: The model name must still be recognized as an OpenAI model by the backend logic. For custom servers, you may need to ensure the model name does not start with prefixes like claude- or gemini- to be routed correctly.
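The routing rule described in the note amounts to a prefix check on the model name. A hypothetical sketch of the idea (the function name and backend labels are illustrative, not AIDE ML's actual code):

```python
def route_backend(model_name: str) -> str:
    """Illustrative backend routing by model-name prefix.
    Sketches the behavior described above; not AIDE ML's actual code."""
    if model_name.startswith("claude-"):
        return "anthropic"
    if model_name.startswith("gemini-"):
        return "gemini"
    # Anything else is treated as OpenAI-compatible and is sent to
    # OPENAI_BASE_URL when that variable is set.
    return "openai"
```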