Skip to content

SQL Syntax Validation

Validate SQL query syntax to prevent malformed queries from reaching your database.

Overview

SQL Syntax Validation checks SQL queries for syntax errors before execution. This helps catch errors early, provides better error messages, and prevents invalid queries from being sent to your database.

How It Works

  1. Configure the SQL dialect used by your database
  2. SQL queries in user input or LLM output are parsed and validated
  3. Syntax errors are detected and reported
  4. Invalid SQL queries are blocked based on configuration

Supported Dialects

  • PostgreSQL (postgresql) - PostgreSQL
  • MySQL (mysql) - MySQL
  • SQL Server (sqlserver) - Microsoft SQL Server
  • SQLite (sqlite) - SQLite
  • MongoDB (mongodb) - MongoDB query syntax
  • Oracle (oracle) - Oracle Database
  • Redshift (redshift) - Amazon Redshift

Configuration

Enable SQL Syntax Validation

yaml
guardrails:
  check_sql_syntax: true
  sql_dialect: "mysql"  # Specify your database dialect

Validation Modes

Output-Only Validation

Validate only LLM-generated SQL (recommended for text-to-SQL):

yaml
guardrails:
  input_checks: false   # Don't validate user input
  output_checks: true   # Validate LLM-generated SQL
  check_sql_syntax: true
  sql_dialect: "postgresql"

Input-Only Validation

Validate only user-provided SQL:

yaml
guardrails:
  input_checks: true    # Validate user input
  output_checks: false  # Don't validate LLM output
  check_sql_syntax: true
  sql_dialect: "mysql"

Both Input and Output

Validate both user queries and LLM responses:

yaml
guardrails:
  input_checks: true
  output_checks: true
  check_sql_syntax: true
  sql_dialect: "sqlserver"

Detected Issues

SQL Syntax Validation can detect:

  • Syntax Errors: Malformed SQL statements
  • Missing Keywords: Required keywords like SELECT, FROM
  • Unbalanced Parentheses: Mismatched parentheses in expressions
  • Invalid Identifiers: Improperly quoted or named identifiers
  • Type Mismatches: Incompatible data types in operations
  • Reserved Word Usage: Incorrect use of reserved keywords

Best Practices

1. Match Your Database Dialect

Always use the dialect that matches your database:

python
# If using PostgreSQL
config = GuardrailConfig(
    check_sql_syntax=True,
    sql_dialect="postgresql"  # Match your database
)

2. Combine with Other Security Measures

SQL syntax validation doesn't prevent SQL injection. Use with parameterized queries:

python
# Validate syntax
result = guardrail.check_input(sql_query)

if result.passed:
    # Use parameterized queries to prevent injection
    cursor.execute(
        "SELECT * FROM users WHERE id = %s",
        (user_id,)
    )

3. Provide Helpful Error Messages

Use validation results to give users actionable feedback:

python
result = guardrail.check_output(sql_query)

if not result.passed:
    # Provide specific error message
    error_response = {
        "error": "Invalid SQL syntax",
        "details": result.message,
        "suggestion": "Please check your query for syntax errors"
    }
    return error_response

Complete Example

python
from elsai_guardrails.guardrails import LLMRails, RailsConfig

yaml_content = """
llm:
  engine: "openai"
  model: "gpt-4o-mini"
  api_key: "sk-..."

guardrails:
  input_checks: false
  output_checks: true
  check_sql_syntax: true
  sql_dialect: "postgresql"
"""

config = RailsConfig.from_content(yaml_content=yaml_content)
rails = LLMRails(config=config)

# Ask LLM to generate a SQL query
response = rails.generate(
    messages=[{
        "role": "user",
        "content": "Write a SQL query to find all orders from last month"
    }]
)

# The SQL in the response is automatically validated
if response.output_result.passed:
    print("Valid SQL generated:")
    print(response.response)
else:
    print("Invalid SQL detected:")
    print(response.output_result.message)

Result Inspection

When invalid SQL is detected, the result provides details:

python
sql_query = "SELEC * FROM users"  # Typo: SELEC instead of SELECT

result = guardrail.check_input(sql_query)

if not result.passed:
    print(f"SQL Validation Failed")
    print(f"Message: {result.message}")
    # Handle invalid SQL appropriately

API Response Format

The SQL syntax validation returns a response with the following structure:

python
{
    "is_valid": False,  # Boolean indicating if SQL is valid
    "error_message": "Syntax error: expected SELECT, got SELEC",
    "dialect": "mysql"  # The dialect used for validation
}

Response Fields:

  • is_valid - True if SQL syntax is correct, False if errors detected
  • error_message - Detailed error message when is_valid is False
  • dialect - The SQL dialect used for validation

How It Works Internally

The SQL syntax validation system:

  1. Query Parsing: Parses the SQL query according to the specified dialect
  2. Syntax Checking: Validates SQL structure, keywords, and grammar
  3. Error Reporting: Returns detailed error messages for syntax violations

Important: This is syntax validation only. It does not:

  • Prevent SQL injection attacks (use parameterized queries for that)
  • Validate table or column names against your schema
  • Check for logical errors or performance issues

Configuration Reference

OptionTypeDefaultDescription
check_sql_syntaxboolfalseEnable SQL syntax validation
sql_dialectstr"mysql"SQL dialect to use for validation

Supported Dialects

DialectValueDescription
PostgreSQL"postgresql"PostgreSQL database
MySQL"mysql"MySQL and MariaDB
SQL Server"sqlserver"Microsoft SQL Server
SQLite"sqlite"SQLite database
MongoDB"mongodb"MongoDB queries
Oracle"oracle"Oracle Database
Redshift"redshift"Amazon Redshift

Limitations

  • Syntax validation only checks if SQL is well-formed, not if it's secure or optimal
  • Does not validate table or column names against your schema
  • Does not prevent SQL injection (use parameterized queries)
  • Does not check for logical errors or performance issues

Next Steps

Released under the MIT License.