Expression Trees

AlphaGen represents financial signals as expression trees. This representation lets the system generate formulas that are both human-readable and mathematically valid.

Structure

The base class Expression, defined in alphagen/data/expression.py, supports recursive evaluation: evaluating a node first evaluates its children and then combines their results. A tree consists of three types of nodes:

  1. Features: Leaf nodes representing raw market data.

    • OPEN, CLOSE, HIGH, LOW, VOLUME, VWAP.
    • Implementation: Feature class. It fetches the specific column from the StockData tensor.
  2. Constants: Floating-point numbers.

    • Implementation: Constant class. Broadcasts a single value to the shape of the data tensor.
  3. Operators: Functional nodes that combine or transform their child nodes.
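
The three node types and their recursive evaluation can be sketched in a few lines. This is a toy illustration only, not AlphaGen's code: the Toy* classes, the dict-of-lists stand-in for the StockData tensor, and the Sub and Div helper functions are all assumptions made for the example, and plain Python lists replace PyTorch tensors.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List

Matrix = List[List[float]]      # rows are days, columns are stocks
MarketData = Dict[str, Matrix]  # hypothetical stand-in for the StockData tensor


class ToyExpression(ABC):
    """Toy base class: every node can evaluate itself on market data."""

    @abstractmethod
    def evaluate(self, data: MarketData) -> Matrix:
        """Return one value per (day, stock)."""


class ToyFeature(ToyExpression):
    """Leaf node: fetches one raw data column."""

    def __init__(self, name: str) -> None:
        self.name = name

    def evaluate(self, data: MarketData) -> Matrix:
        return data[self.name]

    def __str__(self) -> str:
        return self.name


class ToyConstant(ToyExpression):
    """Leaf node: broadcasts a single value to the shape of the data."""

    def __init__(self, value: float) -> None:
        self.value = value

    def evaluate(self, data: MarketData) -> Matrix:
        template = next(iter(data.values()))  # any column gives the shape
        return [[self.value for _ in row] for row in template]

    def __str__(self) -> str:
        return str(self.value)


class ToyBinary(ToyExpression):
    """Operator node: evaluates both children, then combines element-wise."""

    def __init__(self, name: str, fn: Callable[[float, float], float],
                 lhs: ToyExpression, rhs: ToyExpression) -> None:
        self.name, self.fn, self.lhs, self.rhs = name, fn, lhs, rhs

    def evaluate(self, data: MarketData) -> Matrix:
        a = self.lhs.evaluate(data)  # recursion into the left subtree
        b = self.rhs.evaluate(data)  # recursion into the right subtree
        return [[self.fn(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

    def __str__(self) -> str:
        return f"{self.name}({self.lhs}, {self.rhs})"


def Sub(lhs: ToyExpression, rhs: ToyExpression) -> ToyBinary:
    return ToyBinary("Sub", lambda x, y: x - y, lhs, rhs)


def Div(lhs: ToyExpression, rhs: ToyExpression) -> ToyBinary:
    return ToyBinary("Div", lambda x, y: x / y, lhs, rhs)


# Two days, two stocks of made-up prices.
data = {
    "OPEN": [[10.0, 20.0], [8.0, 40.0]],
    "CLOSE": [[15.0, 25.0], [10.0, 30.0]],
}
expr = Sub(Div(ToyFeature("CLOSE"), ToyFeature("OPEN")), ToyConstant(1.0))
result = expr.evaluate(data)
print(expr)    # Sub(Div(CLOSE, OPEN), 1.0)
print(result)  # [[0.5, 0.25], [0.25, -0.25]]
```

Printing the tree reproduces the formula as a string, which is what makes this representation human-readable.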

Operators List

The allowed operators are defined in alphagen/config.py and implemented in alphagen/data/expression.py.

Unary Operators

  • Abs(x): Absolute value.
  • Log(x): Natural logarithm.
  • Sign(x): Sign function (-1, 0, 1).

Binary Operators

  • Add(x, y), Sub(x, y), Mul(x, y), Div(x, y): Standard arithmetic.
  • Greater(x, y): Element-wise maximum; returns max(x, y), not a boolean comparison.
  • Less(x, y): Element-wise minimum; returns min(x, y).
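
Because the names Greater and Less suggest comparisons, a small sketch of the semantics described above may help. The lowercase functions greater and less are hypothetical helpers written for this example, operating on plain lists rather than tensors.

```python
from typing import List

Matrix = List[List[float]]  # rows are days, columns are stocks


def greater(x: Matrix, y: Matrix) -> Matrix:
    """Element-wise maximum of two equally shaped matrices."""
    return [[max(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def less(x: Matrix, y: Matrix) -> Matrix:
    """Element-wise minimum of two equally shaped matrices."""
    return [[min(a, b) for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


x = [[1.0, 5.0], [4.0, 0.0]]
y = [[3.0, 2.0], [4.0, -1.0]]
print(greater(x, y))  # [[3.0, 5.0], [4.0, 0.0]]
print(less(x, y))     # [[1.0, 2.0], [4.0, -1.0]]
```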

Rolling Operators (Time-Series)

These operators require a DeltaTime (integer days) argument, e.g., Mean(Close, 10).

  • Ref(x, d): The value of x d days ago (shift).
  • Mean(x, d): Rolling mean (moving average) over d days.
  • Sum(x, d): Rolling sum.
  • Std(x, d): Rolling standard deviation.
  • Var(x, d): Rolling variance.
  • Max(x, d), Min(x, d): Rolling max/min.
  • Delta(x, d): Difference: x(t) - x(t-d).
  • WMA(x, d): Weighted Moving Average.
  • EMA(x, d): Exponential Moving Average.
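
To make the window arithmetic concrete, here is a minimal sketch of three of these operators on a single price series. It rests on assumptions that the text above does not settle: the window is taken to include the current day, and days without enough history yield None. The function names ref, mean, and delta are hypothetical helpers, not AlphaGen's classes.

```python
from typing import List, Optional

Series = List[Optional[float]]


def ref(x: List[float], d: int) -> Series:
    """Value of x from d days earlier; None while history is too short."""
    return [x[t - d] if t >= d else None for t in range(len(x))]


def mean(x: List[float], d: int) -> Series:
    """Average of the d most recent values, current day included (assumed)."""
    return [sum(x[t - d + 1:t + 1]) / d if t >= d - 1 else None
            for t in range(len(x))]


def delta(x: List[float], d: int) -> Series:
    """Difference x(t) - x(t-d); None while history is too short."""
    return [x[t] - x[t - d] if t >= d else None for t in range(len(x))]


close = [10.0, 12.0, 11.0, 15.0, 14.0]
print(ref(close, 2))    # [None, None, 10.0, 12.0, 11.0]
print(mean(close, 2))   # [None, 11.0, 11.5, 13.0, 14.5]
print(delta(close, 2))  # [None, None, 1.0, 3.0, 3.0]
```

Note that delta(x, d) equals x minus ref(x, d) wherever both are defined, which matches the formula x(t) - x(t-d) given in the list.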

Pair Rolling Operators

  • Cov(x, y, d): Rolling covariance between x and y over d days.
  • Corr(x, y, d): Rolling correlation.
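
A pair rolling operator slides one window over two series at once. The sketch below assumes the sample covariance (dividing by d - 1) and a window that includes the current day; whether AlphaGen uses the sample or the population estimator is not stated here. The functions cov and corr are hypothetical helpers for illustration.

```python
import math
from typing import List, Optional

Series = List[Optional[float]]


def cov(x: List[float], y: List[float], d: int) -> Series:
    """Rolling sample covariance over d days (requires d >= 2)."""
    if d < 2:
        raise ValueError("window length d must be at least 2")
    out: Series = []
    for t in range(len(x)):
        if t < d - 1:
            out.append(None)  # not enough history yet
            continue
        wx, wy = x[t - d + 1:t + 1], y[t - d + 1:t + 1]
        mx, my = sum(wx) / d, sum(wy) / d
        out.append(sum((a - mx) * (b - my) for a, b in zip(wx, wy)) / (d - 1))
    return out


def corr(x: List[float], y: List[float], d: int) -> Series:
    """Rolling Pearson correlation: cov(x, y) / sqrt(var(x) * var(y))."""
    out: Series = []
    for c, vx, vy in zip(cov(x, y, d), cov(x, x, d), cov(y, y, d)):
        if c is None or vx == 0.0 or vy == 0.0:
            out.append(None)  # warm-up period or a flat window
        else:
            out.append(c / math.sqrt(vx * vy))
    return out


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # moves with x
z = [8.0, 6.0, 4.0, 2.0]  # moves against x
print(cov(x, y, 3))   # [None, None, 2.0, 2.0]
print(corr(x, y, 3))  # [None, None, 1.0, 1.0]
print(corr(x, z, 3))  # [None, None, -1.0, -1.0]
```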

Parsing

AlphaGen includes a parser in alphagen/data/parser.py that converts string representations of formulas into executable expression trees.

from alphagen.data.parser import parse_expression

# Convert string to tree
expr = parse_expression("Sub(Div(Close, Open), 1.0)")

# Evaluate on market data; stock_data is a StockData instance loaded beforehand.
# Returns a PyTorch tensor of shape (days, stocks).
result = expr.evaluate(stock_data)
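
To show what turning a string into a tree involves, here is a toy recursive-descent parser that produces nested tuples. It is not the parser in alphagen/data/parser.py: the names tokenize, parse_node, and parse_toy, the token grammar, and the tuple output format are all assumptions for this sketch, and it performs no validation of operator names or argument counts.

```python
import re
from typing import List, Tuple, Union

Node = Union[str, float, tuple]  # feature name, constant, or (operator, *args)

TOKEN = re.compile(r"\s*([A-Za-z_]\w*|-?\d+(?:\.\d+)?|[(),])")
NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def tokenize(text: str) -> List[str]:
    """Split a formula into identifiers, numbers, and punctuation."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_node(tokens: List[str], i: int) -> Tuple[Node, int]:
    """Parse one node starting at tokens[i]; return it and the next index."""
    if i >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[i]
    if NUMBER.fullmatch(tok):
        return float(tok), i + 1
    if tok in ("(", ")", ","):
        raise ValueError(f"unexpected token {tok!r}")
    if i + 1 < len(tokens) and tokens[i + 1] == "(":
        args, i = [], i + 2
        while True:
            arg, i = parse_node(tokens, i)  # recurse into each argument
            args.append(arg)
            if i >= len(tokens):
                raise ValueError("missing closing parenthesis")
            if tokens[i] == ")":
                return (tok, *args), i + 1
            if tokens[i] != ",":
                raise ValueError(f"expected ',' or ')', got {tokens[i]!r}")
            i += 1
    return tok, i + 1  # bare identifier: treated as a feature name


def parse_toy(text: str) -> Node:
    tokens = tokenize(text)
    node, end = parse_node(tokens, 0)
    if end != len(tokens):
        raise ValueError("trailing tokens after expression")
    return node


tree = parse_toy("Sub(Div(Close, Open), 1.0)")
print(tree)  # ('Sub', ('Div', 'Close', 'Open'), 1.0)
```

A real parser would map each operator name to its class and check the argument types, for example that the last argument of a rolling operator is an integer number of days.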