Four AI engineering libraries.
One agent. Watch them work.

Every message is traced by TraceForge, scored by Evalify, adversarially probed by RedForge, and versioned by StateForge. Send a question to see them in action.

TraceForge

Records every LLM call, tool call, token count and latency as replayable spans.

Evalify

LLM-judge scoring for relevance, accuracy and safety after every turn.

RedForge

Prompt-injection & jailbreak probes run in the background on your first message.

StateForge

Git-like memory snapshots and diffs of what the agent knew, and when.
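
To make "snapshots and diffs" concrete, here is a minimal sketch in plain Python. It is not StateForge's API: `Snapshot`, `take_snapshot` and `diff` are hypothetical names invented for illustration, and agent memory is assumed to be a flat dictionary.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical record of agent memory at one turn (not a StateForge type)."""
    turn: int
    memory: dict
    taken_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


def take_snapshot(turn, memory):
    # Copy the dict so later changes to live memory do not alter the snapshot.
    return Snapshot(turn=turn, memory=dict(memory))


def diff(old, new):
    """Return the keys added, removed and changed between two snapshots."""
    old_keys, new_keys = old.memory.keys(), new.memory.keys()
    added = {k: new.memory[k] for k in new_keys - old_keys}
    removed = {k: old.memory[k] for k in old_keys - new_keys}
    changed = {
        k: (old.memory[k], new.memory[k])
        for k in old_keys & new_keys
        if old.memory[k] != new.memory[k]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Usage: snapshot before and after a turn, then ask what changed.
memory = {"user_name": "Ada"}
before = take_snapshot(1, memory)
memory["topic"] = "tracing"
after = take_snapshot(2, memory)
print(diff(before, after))
```

Each snapshot carries a turn number and a timestamp, so a diff between two of them shows both what the agent learned and when it learned it.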
