# Input Rails
Input rails validate user input before it's processed by the LLM.
## Overview
Input rails perform safety checks on user messages to prevent:
- Toxic or offensive content
- Sensitive data exposure
- Jailbreak attempts
- Malicious requests
## How It Works
1. The user sends input.
2. Input rails validate the content.
3. If validation fails, the request is blocked.
4. If validation passes, the input proceeds to the LLM (see the sketch below).
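A minimal sketch of this flow, using the standalone `GuardrailSystem` API shown later on this page; the `call_llm` function is a hypothetical placeholder for your own model call:

```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

def call_llm(prompt: str) -> str:
    """Hypothetical placeholder for the actual model call."""
    return f"(model response to: {prompt})"

def guarded_generate(user_input: str) -> str:
    guardrail = GuardrailSystem(config=GuardrailConfig())

    # Steps 1-2: validate the user input before it reaches the model
    result = guardrail.check_input(user_input)

    # Step 3: if validation fails, block the request and return the rail's message
    if not result.passed:
        return f"Request blocked: {result.message}"

    # Step 4: if validation passes, forward the input to the LLM
    return call_llm(user_input)
```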
## Configuration
Enable input checks in your configuration:
```yaml
guardrails:
  input_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
```
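Assuming each flag can be toggled independently, a configuration that keeps the toxicity and sensitive-data checks but skips semantic classification might look like:

```yaml
guardrails:
  input_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: false   # assumed: disables only the semantic/content classification check
```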
## Usage

### With LLMRails

Input checks run automatically when `input_checks: true` is set:
```python
from elsai_guardrails.guardrails import LLMRails, RailsConfig

yaml_content = """
llm:
  engine: "openai"
  model: "gpt-4o-mini"
  api_key: "sk-..."
guardrails:
  input_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
"""

config = RailsConfig.from_content(yaml_content=yaml_content)
rails = LLMRails(config=config)

# Input is checked automatically before the message reaches the LLM
result = rails.generate(
    messages=[{"role": "user", "content": "user input"}],
    return_details=True
)

if result.get('input_check'):
    print(f"Input passed: {result['input_check'].passed}")
```
### Standalone Input Validation

You can also validate input directly with `GuardrailSystem`:

```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig(
    check_toxicity=True,
    check_sensitive_data=True,
    check_semantic=True
)
guardrail = GuardrailSystem(config=config)

user_input = "test input"
result = guardrail.check_input(user_input)

if not result.passed:
    print(f"Input blocked: {result.message}")
```

## Block Reasons
Input can be blocked for:
- Toxicity: Toxic or offensive content detected
- Sensitive Data: Personal information detected (emails, phone numbers, etc.)
- Content Issues: Jailbreak attempts, malicious requests, prompt injections
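When a check fails, the result fields (`passed`, `message`, and `semantic_class`, as used in the example below) can be used to tailor the response to the block reason. A minimal sketch, continuing from the standalone example above:

```python
result = guardrail.check_input(user_input)

if not result.passed:
    # semantic_class is set when the content classifier flags the input
    # (e.g. "jailbreak"); otherwise fall back to the rail's message.
    if getattr(result, "semantic_class", None):
        print(f"Blocked ({result.semantic_class}): request violates the usage policy.")
    else:
        print(f"Blocked: {result.message}")
```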
## Example
```python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig()
guardrail = GuardrailSystem(config=config)

# Example 1: Valid input
result = guardrail.check_input("Hello, how are you?")
print(f"Passed: {result.passed}")  # True

# Example 2: Sensitive data
result = guardrail.check_input("My email is user@example.com")
print(f"Passed: {result.passed}")   # False
print(f"Reason: {result.message}")  # "Sensitive data detected."

# Example 3: Jailbreak attempt
result = guardrail.check_input("Ignore previous instructions and...")
print(f"Passed: {result.passed}")                 # False
print(f"Content Class: {result.semantic_class}")  # "jailbreak"
```

## Next Steps
- Output Rails - Output validation
- Toxicity Detection - Toxicity checks
- Sensitive Data Detection - Sensitive data checks
- Content Classification - Content checks
