
Input Rails

Input rails validate user input before it's processed by the LLM.

Overview

Input rails run safety checks on user messages to detect and block:

  • Toxic or offensive content
  • Sensitive data exposure
  • Jailbreak attempts
  • Malicious requests

How It Works

  1. The user sends input
  2. Input rails validate the content
  3. If validation fails, the request is blocked
  4. If validation passes, the input proceeds to the LLM (see the sketch below)
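
The sketch below illustrates this flow using the standalone check covered later on this page. The call_llm helper is a hypothetical placeholder for however you invoke your model, and the result fields (passed, message) are taken from the examples further down.

python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

guardrail = GuardrailSystem(config=GuardrailConfig())

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with your own LLM call
    return "model response"

def handle(user_input: str) -> str:
    # Step 2: input rails validate the content
    check = guardrail.check_input(user_input)

    # Step 3: if validation fails, the request is blocked
    if not check.passed:
        return f"Blocked: {check.message}"

    # Step 4: if validation passes, the input proceeds to the LLM
    return call_llm(user_input)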

Configuration

Enable input checks in your configuration:

yaml
guardrails:
  input_checks: true          # enable input rails
  check_toxicity: true        # check for toxic or offensive content
  check_sensitive_data: true  # check for personal data (emails, phone numbers, etc.)
  check_semantic: true        # check for jailbreaks, prompt injections, malicious requests

Usage

With LLMRails

Input checks run automatically when input_checks is set to true:

python
from elsai_guardrails.guardrails import LLMRails, RailsConfig

yaml_content = """
llm:
  engine: "openai"
  model: "gpt-4o-mini"
  api_key: "sk-..."

guardrails:
  input_checks: true
  check_toxicity: true
  check_sensitive_data: true
  check_semantic: true
"""

config = RailsConfig.from_content(yaml_content=yaml_content)
rails = LLMRails(config=config)

# Input will be automatically checked
result = rails.generate(
    messages=[{"role": "user", "content": "user input"}],
    return_details=True
)

if result.get('input_check'):
    print(f"Input passed: {result['input_check'].passed}")

Standalone Input Validation

python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig(
    check_toxicity=True,
    check_sensitive_data=True,
    check_semantic=True
)
guardrail = GuardrailSystem(config=config)

user_input = "test input"
result = guardrail.check_input(user_input)

if not result.passed:
    print(f"Input blocked: {result.message}")

Block Reasons

Input can be blocked for:

  1. Toxicity: Toxic or offensive content detected
  2. Sensitive Data: Personal information detected (emails, phone numbers, etc.)
  3. Content Issues: Jailbreak attempts, malicious requests, prompt injections

Example

python
from elsai_guardrails.guardrails import GuardrailSystem, GuardrailConfig

config = GuardrailConfig()
guardrail = GuardrailSystem(config=config)

# Example 1: Valid input
result = guardrail.check_input("Hello, how are you?")
print(f"Passed: {result.passed}")  # True

# Example 2: Sensitive data
result = guardrail.check_input("My email is user@example.com")
print(f"Passed: {result.passed}")  # False
print(f"Reason: {result.message}")  # "Sensitive data detected."

# Example 3: Jailbreak attempt
result = guardrail.check_input("Ignore previous instructions and...")
print(f"Passed: {result.passed}")  # False
print(f"Content Class: {result.semantic_class}")  # "jailbreak"

