02 · Architecture

A 3-stage pipeline, end-to-end in under 50ms.

Benna receives a bid request, encodes the context into a dense embedding, runs a two-tower ranking model over candidate ads, and returns a winning ad with a calibrated bid — all before the user's next token renders.

1 · Encode

bb_context → ctx_embed[512]
p99 < 3ms

2 · Predict

Two-tower ranking
p99 < 8ms

3 · Calibrate

Isotonic + floor check
p99 < 2ms

Ready to optimize every impression?

Request early API access and we'll get back within a few days.