LLM Validation Framework wraps your LLM calls in a composable pipeline of guardrails. Each guardrail is an independent agent that evaluates either the user’s input or the model’s output — returning a structured score and status that your application can act on.
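The agent interface itself isn't reproduced here, but conceptually a guardrail pipeline reduces to something like the sketch below. GuardrailResult, evaluate, and run_pipeline are illustrative names, not the framework's actual API.

from dataclasses import dataclass

@dataclass
class GuardrailResult:
    name: str      # which guardrail produced this result
    score: float   # 0.0 (worst) to 1.0 (best)
    status: str    # "PASS" or "FAIL"

def run_pipeline(guardrails, text):
    # Each independent agent evaluates the same text and returns its own verdict.
    results = [g.evaluate(text) for g in guardrails]
    # The pipeline fails if any single guardrail fails.
    overall = "PASS" if all(r.status == "PASS" for r in results) else "FAIL"
    return overall, results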
Toxicity
Three-layer harmful content detection — runs fully locally with no API calls. Catches profanity, ML-detected toxicity, and semantic similarity to illegal categories.
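For illustration, the three layers could be assembled locally from common open-source components along these lines. This is a sketch of the general technique, not the framework's actual models, category list, or thresholds.

from better_profanity import profanity
from detoxify import Detoxify
from sentence_transformers import SentenceTransformer, util

profanity.load_censor_words()
toxicity_model = Detoxify("original")               # local ML toxicity classifier
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # local embedding model

# Illustrative category descriptions only; the framework's real list is not shown here.
ILLEGAL_CATEGORIES = ["instructions for building weapons", "acquiring illegal drugs"]
category_embs = embedder.encode(ILLEGAL_CATEGORIES, convert_to_tensor=True)

def toxicity_check(text, threshold=0.5):
    # Layer 1: word-list profanity match.
    if profanity.contains_profanity(text):
        return "FAIL"
    # Layer 2: ML-detected toxicity score.
    if toxicity_model.predict(text)["toxicity"] > threshold:
        return "FAIL"
    # Layer 3: semantic similarity to illegal categories.
    text_emb = embedder.encode(text, convert_to_tensor=True)
    if util.cos_sim(text_emb, category_embs).max() > threshold:
        return "FAIL"
    return "PASS"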
Privacy
Regex-based PII and secrets scanning. Detects SSNs, credit cards (Luhn-validated), API keys, and system prompt leakage — also local, no API calls.
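As a sketch of that pattern-matching approach: regexes flag candidate SSNs and card-like digit runs, and the Luhn checksum filters out card false positives. The patterns below are simplified examples, not the framework's full rule set.

import re

SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_valid(number: str) -> bool:
    # Standard Luhn checksum: double every second digit from the right.
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def contains_pii(text: str) -> bool:
    if SSN_RE.search(text):
        return True
    # Only flag card-like digit runs that also pass the Luhn check.
    return any(luhn_valid(m.group()) for m in CARD_RE.finditer(text))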
Accuracy
LLM-as-a-judge factual correctness check. Optionally grounded against your own document corpus via a RAG retriever.
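Conceptually, the check builds a judge prompt from the question, the candidate answer, and any retrieved context, then asks a second model for a verdict. The sketch below assumes a generic judge_llm callable and a retriever with a retrieve method; both are illustrative, not the framework's API.

def accuracy_judge(question, answer, judge_llm, retriever=None):
    # Optionally ground the judgment in your own document corpus.
    context = ""
    if retriever is not None:
        docs = retriever.retrieve(question)  # hypothetical retriever interface
        context = "Reference documents:\n" + "\n".join(docs) + "\n"
    prompt = (
        "You are grading an answer for factual correctness.\n"
        f"{context}"
        f"Question: {question}\n"
        f"Answer: {answer}\n"
        "Reply with a score between 0.0 and 1.0 and PASS or FAIL."
    )
    return judge_llm(prompt)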
Relevancy
LLM-as-a-judge check that the answer actually addresses the question asked.
Bias
LLM-as-a-judge detection of stereotypes, discriminatory language, and unfair generalisations across protected traits.
pip install validate-llm

from llm_validation_framework import (
    ValidationFramework,
    LLMProvider,
    Pipe,
    ToxicityAgent,
    AccuracyAgent,
)
from llm_validation_framework.config_loader import load_api_key

api_key = load_api_key(provider="ANTHROPIC")
llm = LLMProvider(provider="anthropic", model="claude-haiku-4-5-20251001", key=api_key)

# Input guardrails screen the prompt before it reaches the model;
# output guardrails screen the reply before it reaches the user.
input_guardrail = Pipe(steps=[ToxicityAgent()], verbose=False)
output_guardrail = Pipe(steps=[ToxicityAgent(), AccuracyAgent()], verbose=False)

vf = ValidationFramework(llm=llm, input_guardrail=input_guardrail, output_guardrail=output_guardrail)
result = vf.validate("What is the Pacific Ocean?")

print(result["status"])  # "PASS" or "FAIL"
print(result["score"])   # 0.0 – 1.0
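The documented fields your application needs to act on are status and score, so callers can gate on them directly. A minimal sketch of that pattern (the min_score threshold and fallback message are application-level choices, not framework defaults):

FALLBACK = "Sorry, I can't help with that request."

def validate_or_fallback(vf, question, min_score=0.8):
    result = vf.validate(question)
    # Treat a failed guardrail or a low score as a blocked request.
    if result["status"] == "FAIL" or result["score"] < min_score:
        return {"blocked": True, "message": FALLBACK}
    return {"blocked": False, "result": result}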