RedBeat Architecture

RedBeat's design is centered around a few simple Redis data structures to provide a fast, scalable, and persistent scheduling system.

Core Data Structures

Assuming the default redbeat_key_prefix of 'redbeat:', RedBeat uses the following keys in Redis:

1. The Schedule (redbeat:schedule)

  • Type: Sorted Set
  • Purpose: This is the heart of RedBeat. It acts as a priority queue for all scheduled tasks.
  • Members: The members of the sorted set are the keys of the individual task definitions (e.g., 'redbeat:my-task-name').
  • Scores: The score of each member is a UNIX timestamp (UTC) representing the next scheduled run time. A lower score means the task is due sooner.

By using a sorted set, RedBeat can efficiently query for all tasks due to run at the current time (zrangebyscore ... 0 <now>) with minimal overhead, regardless of the total number of tasks.

2. Task Definitions (redbeat:<task-name>)

  • Type: Hash
  • Purpose: Each scheduled task has its own hash containing all the information needed to execute it.
  • Fields:
    • definition: A JSON string containing the static task details: name, task, schedule, args, kwargs, options, and enabled status.
    • meta: A JSON string containing the dynamic runtime metadata for the task. This includes last_run_at and total_run_count.

Separating the definition from the metadata allows the core task definition to remain unchanged while its runtime state is updated frequently.

3. Static Task Tracking (redbeat:statics)

  • Type: Set
  • Purpose: This set stores the names of all tasks that were defined statically in the Celery configuration (beat_schedule).
  • Function: On startup, RedBeat compares the current static configuration with the members of this set. If a task name exists in the set but not in the current configuration, RedBeat assumes it has been removed and deletes it from Redis. This prevents orphaned static tasks from continuing to run after being removed from code.

The Scheduling Tick

On each tick, the RedBeatScheduler performs the following steps:

  1. Lock Check: If locking is enabled, it extends the expiration time of its distributed lock to signal that it is still alive.
  2. Query for Due Tasks: It queries the redbeat:schedule sorted set for all members with a score less than or equal to the current timestamp.
  3. Fetch Definitions: For each due task key, it retrieves the definition and meta from the corresponding Redis hash.
  4. Send to Celery: It deserializes the task definition and sends it to the Celery broker for a worker to execute.
  5. Reschedule: After sending the task, it calculates the next run time, updates the meta (e.g., last_run_at), and updates the task's score in the redbeat:schedule sorted set with the new timestamp.
  6. Calculate Sleep Time: It peeks at the score of the next task in the schedule to calculate the optimal time to sleep before the next tick, ensuring both responsiveness and efficiency.