02 · Architecture
A 3-stage pipeline, end-to-end in under 50ms.
Benna receives a bid request, encodes the context into a dense embedding, runs a two-tower ranking model over candidate ads, and returns a winning ad with a calibrated bid — all before the user's next token renders.
1 · Encode: bb_context → ctx_embed[512] (p99 < 3ms)
2 · Predict: two-tower ranking (p99 < 8ms)
3 · Calibrate: isotonic + floor check (p99 < 2ms)
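To make the three stages concrete, here is a minimal Python sketch of the flow described above. It is an illustration under stated assumptions, not Benna's implementation: the hashing encoder stands in for learned towers, and the names `embed`, `score`, `calibrate`, `run_auction`, the `value_per_click` field, and the isotonic breakpoints are all hypothetical. Only the stage order and the 512-dimensional context embedding come from the description above.

```python
import hashlib
import math
from bisect import bisect_right

DIM = 512  # matches ctx_embed[512]; everything else here is assumed


def embed(features):
    """Stand-in for a learned tower: hash each key=value pair into a unit vector."""
    vec = [0.0] * DIM
    for key, value in features.items():
        digest = hashlib.sha256(f"{key}={value}".encode()).digest()
        idx = int.from_bytes(digest[:4], "big") % DIM
        vec[idx] += 1.0 if digest[4] % 2 == 0 else -1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def score(ctx_embed, ad_embed):
    """Two-tower score: dot product of the two embeddings, squashed to (0, 1)."""
    dot = sum(c * a for c, a in zip(ctx_embed, ad_embed))
    return 1.0 / (1.0 + math.exp(-dot))


# Hypothetical isotonic fit (monotone breakpoints), normally learned offline.
ISO_X = [0.0, 0.4, 0.5, 0.6, 1.0]
ISO_Y = [0.001, 0.005, 0.02, 0.05, 0.10]


def calibrate(raw):
    """Map a raw score to a click probability by interpolating the breakpoints."""
    if raw <= ISO_X[0]:
        return ISO_Y[0]
    if raw >= ISO_X[-1]:
        return ISO_Y[-1]
    i = bisect_right(ISO_X, raw) - 1
    t = (raw - ISO_X[i]) / (ISO_X[i + 1] - ISO_X[i])
    return ISO_Y[i] + t * (ISO_Y[i + 1] - ISO_Y[i])


def run_auction(bb_context, candidates, floor):
    """Return (ad_id, bid) for the top-ranked ad, or None if it misses the floor."""
    if not candidates:
        return None
    ctx_embed = embed(bb_context)                           # 1 · Encode
    scored = [(score(ctx_embed, embed(ad["features"])), ad) for ad in candidates]
    raw, winner = max(scored, key=lambda pair: pair[0])     # 2 · Predict
    bid = calibrate(raw) * winner["value_per_click"]        # 3 · Calibrate
    return (winner["id"], bid) if bid >= floor else None    # floor check


candidates = [
    {"id": "ad-travel", "features": {"topic": "travel"}, "value_per_click": 2.0},
    {"id": "ad-finance", "features": {"topic": "finance"}, "value_per_click": 2.0},
]
result = run_auction({"topic": "travel"}, candidates, floor=0.05)
print(result)
```

In this sketch the ad tower runs inline for brevity. A production two-tower setup would typically precompute ad embeddings offline so that only the context tower and the dot products run inside the request's latency budget.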
Ready to optimize every impression?
Request early API access and we'll get back within a few days.