GuardrailSystem
The GuardrailSystem class provides core guardrail functionality for validating text content.
Overview
GuardrailSystem performs safety checks on text content, including:
- Toxicity detection
- Sensitive data detection
- Content classification (jailbreak, malicious content, etc.)
Initialization
Basic Initialization
```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig()
guardrail = GuardrailSystem(config=config)
```

With Custom Configuration
```python
config = GuardrailConfig(
    check_toxicity=True,
    check_sensitive_data=True,
    check_semantic=True,
    toxicity_threshold=0.7,
    block_toxic=True,
    block_sensitive_data=True
)
guardrail = GuardrailSystem(config=config)
```

From Config File
```python
guardrail = GuardrailSystem.from_config("config.yml")
```

With Input/Output Checks
```python
guardrail = GuardrailSystem(
    config=config,
    input_checks=True,
    output_checks=True
)
```

Methods
check_input()
Check user input through guardrails. This method is designed for validating user input before it's sent to an LLM.
```python
result = guardrail.check_input("user input text")
```
Parameters:
user_input (str): User input to validate
Returns:
GuardrailResult: Result object
Note: Adds "Please rephrase your input." to message if check fails.
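Because the failure message already includes "Please rephrase your input.", it can be returned to the user as-is. The sketch below is a minimal, hypothetical example of that pattern: it assumes a guardrail instance created as shown above, and handle_user_message is not part of the library.
```python
# Minimal sketch: surface the guardrail message directly to the user
# when their input is blocked. Assumes `guardrail` was created as above.
def handle_user_message(user_input: str) -> str:
    input_result = guardrail.check_input(user_input)
    if not input_result.passed:
        # The message already asks the user to rephrase their input
        return input_result.message
    return "..."  # placeholder: forward the input to your LLM here
```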
Example - Separate Input Check:
```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

# Create guardrail system (no LLM needed for separate checks)
guardrail = GuardrailSystem.from_config("config.yml")

# Step 1: Check user input
user_input = "What is machine learning?"
input_result = guardrail.check_input(user_input)

if not input_result.passed:
    # Do not proceed to the LLM
    print(f"Input blocked: {input_result.message}")
else:
    # Step 2: Input passed, now call your LLM manually
    llm_response = "..."  # (Your LLM call here)

    # Step 3: Check LLM output
    output_result = guardrail.check_output(llm_response)
```

check_output()
Check LLM output through guardrails. This method is designed for validating LLM-generated responses before returning them to users.
```python
result = guardrail.check_output("LLM generated response")
```
Parameters:
llm_output (str): LLM output to validate
Returns:
GuardrailResult: Result object
Note: Adds "LLM response has been blocked." to message if check fails.
Example - Separate Output Check:
```python
# After receiving LLM response
llm_response = "Generated response from LLM..."

# Check output
output_result = guardrail.check_output(llm_response)

if not output_result.passed:
    # Do not return this response to the user;
    # consider sanitizing it or asking the LLM to regenerate
    print(f"Output blocked: {output_result.message}")
else:
    # Output passed, safe to return to the user
    print(llm_response)
```

Example Usage
Basic Text Check
```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig(
    check_toxicity=True,
    check_sensitive_data=True,
    check_semantic=True
)
guardrail = GuardrailSystem(config=config)

# Check text
result = guardrail.check_text("Hello, how are you?")

if result.passed:
    print("Text passed all checks")
else:
    print(f"Text blocked: {result.message}")
```

Separate Input and Output Checks
This pattern gives you full control over the LLM call while still having guardrails:
```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

# Create guardrail system (no LLM needed)
guardrail = GuardrailSystem.from_config("config.yml")

def process_user_query(user_input: str):
    """Complete workflow: input check -> LLM call -> output check"""
    # Step 1: Check user input
    input_result = guardrail.check_input(user_input)
    if not input_result.passed:
        return {
            'success': False,
            'error': f"Input blocked: {input_result.message}",
            'blocked_at': 'input'
        }

    # Step 2: Call your LLM (input passed)
    try:
        # Replace with your actual LLM call
        from openai import OpenAI

        client = OpenAI(api_key="your-key")
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": user_input}]
        )
        llm_response = response.choices[0].message.content
    except Exception as e:
        return {
            'success': False,
            'error': f"LLM error: {str(e)}",
            'blocked_at': 'llm_call'
        }

    # Step 3: Check LLM output
    output_result = guardrail.check_output(llm_response)
    if not output_result.passed:
        return {
            'success': False,
            'error': f"Output blocked: {output_result.message}",
            'blocked_at': 'output'
        }

    # Step 4: Success
    return {
        'success': True,
        'response': llm_response
    }

# Usage
result = process_user_query("What is AI?")
if result['success']:
    print(f"Response: {result['response']}")
else:
    print(f"Blocked at {result['blocked_at']}: {result['error']}")
```

Input Validation Only
```python
user_input = "My email is user@example.com"

result = guardrail.check_input(user_input)
if not result.passed:
    print(f"Input blocked: {result.message}")
    print(f"Reason: {result.sensitive_data.get('predicted_labels', [])}")
    # Ask the user to remove sensitive information
```

Output Validation Only
```python
llm_output = "Here is the response..."

result = guardrail.check_output(llm_output)
if not result.passed:
    print(f"Output blocked: {result.message}")
    # Do not return to the user; consider sanitizing or regenerating
```
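One way to act on that last comment is to regenerate when an output is blocked. The sketch below is illustrative only: call_llm is a hypothetical stand-in for your own LLM call, the retry limit is arbitrary, and guardrail is assumed to be an existing GuardrailSystem instance.
```python
# Illustrative sketch: retry generation when check_output() blocks a response.
# `call_llm` is a hypothetical placeholder for your own LLM call.
MAX_ATTEMPTS = 3

def generate_safe_response(prompt: str):
    for attempt in range(MAX_ATTEMPTS):
        llm_output = call_llm(prompt)  # hypothetical LLM call
        result = guardrail.check_output(llm_output)
        if result.passed:
            return llm_output
        print(f"Attempt {attempt + 1} blocked: {result.message}")
    # Give up after MAX_ATTEMPTS blocked responses
    return None
```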
Configuration Options
See GuardrailConfig for all configuration options.
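As a quick recap, the options that appear in the examples on this page are collected below. The inline comments are informal interpretations rather than official descriptions, so refer to GuardrailConfig for the authoritative list.
```python
from elsai_guardrails.guardrails import GuardrailConfig

# Informal, annotated sketch of the options used on this page;
# the comments marked "assumed" are interpretations, not official docs.
config = GuardrailConfig(
    check_toxicity=True,        # run the toxicity check
    check_sensitive_data=True,  # run the sensitive-data check
    check_semantic=True,        # run content classification (jailbreak, etc.)
    toxicity_threshold=0.7,     # score above which text counts as toxic (assumed)
    block_toxic=True,           # block rather than just flag toxic content (assumed)
    block_sensitive_data=True   # block content containing sensitive data (assumed)
)
```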
Result Object
The check_text(), check_input(), and check_output() methods return a GuardrailResult object. See GuardrailResult for details.
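As a quick orientation, the sketch below inspects only the attributes that already appear on this page (passed, message, and sensitive_data); any further fields are documented in GuardrailResult. It assumes a guardrail instance created as shown earlier.
```python
# Minimal sketch: inspect a GuardrailResult using only the attributes
# shown elsewhere on this page. Assumes `guardrail` already exists.
result = guardrail.check_input("My phone number is 555-0100")

if result.passed:
    print("All checks passed")
else:
    print(f"Blocked: {result.message}")
    # sensitive_data is shown on this page as a dict with 'predicted_labels'
    labels = result.sensitive_data.get('predicted_labels', [])
    if labels:
        print(f"Detected sensitive data labels: {labels}")
```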
Next Steps
- LLMRails - High-level API with LLM integration
- GuardrailResult - Understanding results
- Configuration Guide - Configuration options
