# ValidationFramework
`ValidationFramework` is the main entry point. It wires together an LLM and two `Pipe` guardrails into a single `validate()` call.
## Constructor
```python
ValidationFramework(
    llm: LLMProvider,
    input_guardrail: Pipe,
    output_guardrail: Pipe,
)
```

| Parameter | Type | Description |
|---|---|---|
| `llm` | `LLMProvider` | The LLM to call after the input guardrail passes. |
| `input_guardrail` | `Pipe` | Agents that evaluate the user’s query before it reaches the LLM. |
| `output_guardrail` | `Pipe` | Agents that evaluate the LLM’s response before returning it. |
## Methods
### validate(query)
```python
def validate(query: str) -> ValidationSummary
```

Runs the full pipeline and returns a structured result.
Flow:

1. `input_guardrail.evaluate(query)` — evaluates the raw query string.
2. If the input passes → `llm.call_api(query)` — calls the LLM.
3. `output_guardrail.evaluate({"question": query, "answer": response})` — evaluates the response.
4. Returns a `ValidationSummary`.
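As a rough sketch, the flow above corresponds to something like the code below. This is not the library's actual source: the exact short-circuit behaviour on a failing input guardrail, the shape of the `output` summary in that case, and how the overall status is derived are assumptions; only the score averaging is documented.

```python
# Illustrative sketch of validate(), following the documented flow.
# Not the real implementation; FAIL handling details are assumed.
def validate(self, query: str) -> ValidationSummary:
    input_summary = self.input_guardrail.evaluate(query)

    if input_summary["status"] == "FAIL":
        # Assumption: a failing input guardrail skips the LLM call,
        # and the output guardrail is reported as a failure with score 0.0.
        return {
            "status": "FAIL",
            "score": (input_summary["score"] + 0.0) / 2,
            "input": input_summary,
            "output": {"status": "FAIL", "score": 0.0},
        }

    response = self.llm.call_api(query)
    output_summary = self.output_guardrail.evaluate(
        {"question": query, "answer": response}
    )

    return {
        # Assumption: overall status follows the output guardrail.
        "status": "PASS" if output_summary["status"] == "PASS" else "FAIL",
        # Documented: overall score is the mean of the two guardrail scores.
        "score": (input_summary["score"] + output_summary["score"]) / 2,
        "input": input_summary,
        "output": output_summary,
    }
```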
#### Return type
```python
class ValidationSummary(TypedDict):
    status: Literal["PASS", "FAIL"]
    score: float  # average of input + output scores
    input: GuardrailSummary
    output: GuardrailSummary
```

```python
class GuardrailSummary(TypedDict):
    status: Literal["PASS", "FAIL"]
    score: float  # average score across all agents in the Pipe
    reason: NotRequired[str]
    results: NotRequired[list[EvaluationResult]]
```

Scores are always in the range [0.0, 1.0]. The overall score is the arithmetic mean of the input and output guardrail scores.
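For instance, an input guardrail score of 0.95 and an output guardrail score of 0.87 give an overall score of 0.91. Since `reason` and `results` are `NotRequired`, it is safest to read them defensively when reporting a failure. A small, hypothetical snippet, assuming a configured instance `vf` (constructed as in the Example below); the query and fallback strings are illustrative:

```python
result = vf.validate("Tell me my neighbour's home address.")

if result["status"] == "FAIL":
    # Both fields are NotRequired, so fall back gracefully when absent.
    print("Input guardrail:", result["input"].get("reason", "no reason given"))
    for agent_result in result["output"].get("results", []):
        # Each EvaluationResult carries a "reason" (see the Example below);
        # its other fields are not documented on this page.
        print(agent_result["reason"])
```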
## Example
```python
from llm_validation_framework import (
    ValidationFramework,
    LLMProvider,
    Pipe,
    ToxicityAgent,
    PrivacyAgent,
    AccuracyAgent,
)
from llm_validation_framework.config_loader import load_api_key

api_key = load_api_key(provider="ANTHROPIC")
llm = LLMProvider(provider="anthropic", model="claude-haiku-4-5-20251001", key=api_key)

input_guardrail = Pipe(
    steps=[ToxicityAgent(), PrivacyAgent()],
    verbose=False,
)
output_guardrail = Pipe(
    steps=[ToxicityAgent(), PrivacyAgent(), AccuracyAgent()],
    verbose=False,
)

vf = ValidationFramework(
    llm=llm,
    input_guardrail=input_guardrail,
    output_guardrail=output_guardrail,
)

result = vf.validate("What is the capital of France?")

print(result["status"])                          # "PASS"
print(f"{result['score']:.2f}")                  # e.g. "0.91"
print(result["output"]["results"][2]["reason"])  # AccuracyAgent's reasoning
```

## Customising the pipeline
You can put any combination of agents into either guardrail:
```python
# RelevancyAgent and BiasAgent are assumed to be top-level exports,
# like the agents imported in the example above.
from llm_validation_framework import RelevancyAgent, BiasAgent

# Input-only checks (no LLM calls on the way in)
input_guardrail = Pipe(steps=[ToxicityAgent(), PrivacyAgent()])

# Heavier output checks
output_guardrail = Pipe(steps=[
    ToxicityAgent(),
    PrivacyAgent(system_prompt="You are a helpful assistant."),
    AccuracyAgent(),
    RelevancyAgent(),
    BiasAgent(),
])
```

You can also use agents completely outside of `ValidationFramework` — see `Pipe` for standalone use.
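As a quick taste of standalone use (the `Pipe` page has full details), a guardrail can be evaluated directly against a raw query string or a question/answer pair, with no LLM or `ValidationFramework` in the loop. The query and answer strings below are purely illustrative:

```python
# Standalone guardrail check: no ValidationFramework or LLM call involved.
moderation = Pipe(steps=[ToxicityAgent(), PrivacyAgent()])
summary = moderation.evaluate("Is this message OK to post publicly?")
print(summary["status"], f"{summary['score']:.2f}")

# Output-style checks take the question/answer pair directly.
answer_check = Pipe(steps=[AccuracyAgent(), RelevancyAgent()])
summary = answer_check.evaluate({
    "question": "What is the capital of France?",
    "answer": "Paris is the capital of France.",
})
print(summary["status"], f"{summary['score']:.2f}")
```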