While Redactorr includes 3,000+ built-in detection rules across the 15-layer AEGIS engine, every organisation has unique PII requirements. Custom patterns let you detect proprietary identifiers, internal codes, and industry-specific data.
Why Custom Patterns?
Common use cases:
- Internal IDs: Employee numbers, asset tags, ticket IDs
- Industry-specific: Policy numbers, claim IDs, loan numbers
- Proprietary formats: Custom date formats, internal codes
- Business logic: Pricing rules, customer tiers, territory codes
YAML Pattern Structure
Custom patterns use a simple YAML format:
name: Employee ID
category: internal
description: Company employee identification numbers
pattern: "EMP-\\d{6}"
examples:
  - "EMP-123456"
  - "EMP-987654"
confidence: high
replacement: "[REDACTED_EMPLOYEE_ID]"

Required Fields
- name: Human-readable pattern name
- category: Group (e.g., "internal", "customer", "financial")
- pattern: Regular expression (JavaScript flavor)
- examples: Test cases for validation
- replacement: Redaction placeholder text
Optional Fields
- description: Pattern documentation
- confidence: Detection confidence (low, medium, high)
- context: Context-aware rules (see below)
- validation: Custom validation logic
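The required fields can be checked before a definition is uploaded. The following is a minimal sketch, not Redactorr's own validator: `check_definition` and `REQUIRED_FIELDS` are hypothetical names, the definition is assumed to be already parsed from YAML into a dictionary, and Python's `re` module stands in for the JavaScript regex flavor (the two agree on simple patterns like these).

```python
import re

# Required fields as listed above; the function name is hypothetical.
REQUIRED_FIELDS = ("name", "category", "pattern", "examples", "replacement")


def check_definition(definition):
    """Return a list of problems found in one parsed pattern definition."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in definition
    ]
    if problems:
        return problems
    try:
        regex = re.compile(definition["pattern"])
    except re.error as exc:
        return [f"invalid regular expression: {exc}"]
    # Every example should be matched in full by the pattern.
    for example in definition["examples"]:
        if regex.fullmatch(example) is None:
            problems.append(f"example does not match: {example!r}")
    return problems


employee_id = {
    "name": "Employee ID",
    "category": "internal",
    "pattern": r"EMP-\d{6}",  # the YAML string "EMP-\\d{6}" after unescaping
    "examples": ["EMP-123456", "EMP-987654"],
    "replacement": "[REDACTED_EMPLOYEE_ID]",
}

print(check_definition(employee_id))  # []
```

An empty list means the definition has every required field and each example matches its own pattern.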
Pattern Examples
1. Custom ID Format
name: Customer Reference Number
category: customer
pattern: "CRN-[A-Z]{2}-\\d{8}"
examples:
  - "CRN-US-12345678"
  - "CRN-CA-87654321"
confidence: high
replacement: "[REDACTED_CUSTOMER_REF]"

2. Internal Ticket System
name: Support Ticket ID
category: internal
pattern: "TICKET-\\d{4}-\\d{6}"
examples:
  - "TICKET-2024-123456"
  - "TICKET-2023-987654"
confidence: high
replacement: "[REDACTED_TICKET]"
context:
  preceding: ["ticket", "case", "issue"]

3. Custom Date Format
name: Internal Date Code
category: internal
description: YYYYMMDD-LOC format for shipment tracking
pattern: "\\d{8}-[A-Z]{3}"
examples:
  - "20241225-NYC"
  - "20240101-LAX"
confidence: medium
replacement: "[REDACTED_SHIPMENT]"

4. Industry-Specific Code
name: Insurance Policy Number
category: financial
description: Custom format for PolicyCo policies
pattern: "POL-\\d{4}-[A-Z]-\\d{6}"
examples:
  - "POL-2024-A-123456"
  - "POL-2023-B-987654"
confidence: high
replacement: "[REDACTED_POLICY]"
validation:
  checksum: format_validation

Context-Aware Patterns
Use context rules to reduce false positives:
name: Account Number
category: financial
pattern: "\\d{10,12}"
confidence: medium
replacement: "[REDACTED_ACCOUNT]"
context:
  preceding: ["account", "acct", "account number", "account #"]
  following: ["balance", "statement", "transaction"]
  window: 20  # tokens before/after

Result: A number such as "123456789012" is matched only when a keyword like "account" or "balance" appears nearby.
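To make the context rule concrete, here is a rough sketch of how such a filter could work. It is an illustration under stated assumptions, not Redactorr's implementation: `find_with_context` is a hypothetical function, tokens are taken to be whitespace-separated words, and a match is kept when either a preceding or a following keyword is found inside the window.

```python
import re

PRECEDING = ["account", "acct", "account number", "account #"]
FOLLOWING = ["balance", "statement", "transaction"]


def find_with_context(text, pattern, preceding, following, window):
    """Return regex matches that have a context keyword within `window` tokens."""
    hits = []
    for match in re.finditer(pattern, text):
        # Assumption: tokens are whitespace-separated, lowercased words.
        before = text[:match.start()].lower().split()
        after = text[match.end():].lower().split()
        before = before[-window:] if window > 0 else []
        after = after[:window]
        # Joining the tokens lets multi-word keywords such as "account number"
        # match. Note this is substring matching, so "accounting" also counts.
        before_text = " ".join(before)
        after_text = " ".join(after)
        near_before = any(keyword in before_text for keyword in preceding)
        near_after = any(keyword in after_text for keyword in following)
        if near_before or near_after:
            hits.append(match.group())
    return hits


print(find_with_context("Please check account 123456789012 today",
                        r"\d{10,12}", PRECEDING, FOLLOWING, 20))
# ['123456789012']
print(find_with_context("Order 123456789012 shipped",
                        r"\d{10,12}", PRECEDING, FOLLOWING, 20))
# []
```

The second call returns nothing because no keyword sits near the number, which is the kind of false positive the context rule is meant to suppress.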
Validation Functions
Add custom validation to patterns:
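A validator can be thought of as a second check applied to a string the regex has already matched. The sketch below gives plain-Python versions of the three kinds covered in this section (checksum, format, range). All three function names are hypothetical. The source does not say which algorithm a `checksum` rule runs, so the Luhn check is assumed here only because it is the standard checksum for card numbers.

```python
from datetime import datetime


def luhn_ok(number):
    """Checksum: Luhn check over the digits of a card-style number (assumed)."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def date_ok(value):
    """Format: the 8 digits must form a real calendar date in YYYYMMDD order."""
    if len(value) != 8 or not value.isdigit():
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        return False
    return True


def range_ok(value, minimum, maximum):
    """Range: the numeric part of the match must lie within [minimum, maximum]."""
    digits = "".join(ch for ch in value if ch.isdigit())
    return bool(digits) and minimum <= int(digits) <= maximum


print(luhn_ok("4111-1111-1111-1111"))             # True
print(luhn_ok("1234-5678-9012-3456"))             # False
print(date_ok("20241225"), date_ok("20240230"))   # True False
print(range_ok("EMP-123456", 100000, 999999))     # True
```

In each case the regex decides what looks like a candidate, and the validator decides whether the candidate is plausible enough to redact.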
Checksum Validation
name: Credit Card
pattern: "\\d{4}-\\d{4}-\\d{4}-\\d{4}"
validation:
  checksum: format_validation

Format Validation
name: Custom Date
pattern: "\\d{8}"
validation:
  format: date
  format_string: "YYYYMMDD"

Range Validation
name: Employee Number
pattern: "EMP-\\d{6}"
validation:
  range:
    min: 100000
    max: 999999

Importing Custom Patterns
Method 1: Upload YAML File
- Navigate to Patterns page
- Click Import Custom Patterns
- Upload your YAML file
- Review and enable patterns
Method 2: Use Pattern Builder
- Navigate to Pattern Builder tool
- Fill in the form (name, category, pattern, etc.)
- Test against examples
- Save to your pattern library
Testing Custom Patterns
Use the built-in tester to validate patterns:
name: Test Pattern
pattern: "TEST-\\d{4}"
replacement: "[REDACTED_TEST]"
examples:
  # Should match
  - input: "Ticket TEST-1234 was closed"
    expected: "Ticket [REDACTED_TEST] was closed"
  # Should NOT match
  - input: "Test results: PASS"
    expected: "Test results: PASS"

Redactorr validates:
- ✅ All examples match correctly
- ✅ No unintended matches (false positives)
- ✅ Performance impact (< 10ms added latency)
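The same checks can be approximated locally before uploading a pattern. This is only a sketch: `run_examples` is a hypothetical helper, the replacement text "[REDACTED_TEST]" is inferred from the expected output above, and the timing it reports is a rough local measurement rather than the latency figure Redactorr displays.

```python
import re
import time


def run_examples(pattern, replacement, examples):
    """Apply the pattern to each input and collect mismatches plus elapsed time."""
    regex = re.compile(pattern)
    failures = []
    started = time.perf_counter()
    for example in examples:
        # A function replacement avoids escape processing of the placeholder text.
        actual = regex.sub(lambda _match: replacement, example["input"])
        if actual != example["expected"]:
            failures.append((example["input"], actual))
    elapsed_ms = (time.perf_counter() - started) * 1000
    return failures, elapsed_ms


examples = [
    # Should match
    {"input": "Ticket TEST-1234 was closed",
     "expected": "Ticket [REDACTED_TEST] was closed"},
    # Should NOT match
    {"input": "Test results: PASS",
     "expected": "Test results: PASS"},
]

failures, elapsed_ms = run_examples(r"TEST-\d{4}", "[REDACTED_TEST]", examples)
print(failures)  # []
print(f"{elapsed_ms:.3f} ms")
```

Including inputs that should stay unchanged is what catches false positives: if the pattern were too broad, the second example would show up in `failures`.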
Pattern Library Management
Organizing Patterns
Create pattern collections by use case:
collections:
  - name: "Internal Systems"
    patterns:
      - employee_id
      - ticket_id
      - asset_tag
  - name: "Customer Data"
    patterns:
      - customer_ref
      - account_number
      - policy_number

Version Control
Track pattern changes over time:
name: Employee ID
version: 2
changelog:
  - version: 2
    date: 2024-12-01
    changes: "Added support for 7-digit IDs"
  - version: 1
    date: 2024-01-01
    changes: "Initial pattern"

Performance Considerations
- Keep patterns specific: Broad patterns (e.g., \\d+) slow down detection
- Use anchors: Start patterns with unique prefixes
- Test performance: Redactorr shows latency impact for each pattern
- Limit context windows: Large context windows increase processing time
Recommended limits:
- Max 100 custom patterns per workspace
- Max 50 tokens for context windows
- Max 500 characters per pattern
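These limits are easy to check automatically before an import. The sketch below is one way to do it, assuming the definitions are already parsed into dictionaries; `check_limits` and the constant names are hypothetical, and the thresholds are simply the recommended values listed above.

```python
# Recommended limits from the list above.
MAX_PATTERNS = 100
MAX_WINDOW_TOKENS = 50
MAX_PATTERN_CHARS = 500


def check_limits(definitions):
    """Return warnings for definitions that exceed the recommended limits."""
    warnings = []
    if len(definitions) > MAX_PATTERNS:
        warnings.append(
            f"{len(definitions)} patterns exceeds the limit of {MAX_PATTERNS}"
        )
    for definition in definitions:
        name = definition.get("name", "<unnamed>")
        if len(definition.get("pattern", "")) > MAX_PATTERN_CHARS:
            warnings.append(f"{name}: pattern is longer than {MAX_PATTERN_CHARS} characters")
        # A definition without a context block has no window to check.
        window = definition.get("context", {}).get("window", 0)
        if window > MAX_WINDOW_TOKENS:
            warnings.append(f"{name}: context window exceeds {MAX_WINDOW_TOKENS} tokens")
    return warnings


account_number = {
    "name": "Account Number",
    "pattern": r"\d{10,12}",
    "context": {"preceding": ["account", "acct"], "window": 20},
}

print(check_limits([account_number]))  # []
```

A check like this fits naturally into a review step, so oversized patterns are flagged before they reach a shared workspace.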
Enterprise Features
Shared Pattern Libraries
Teams can share custom patterns across workspaces:
- Centralized pattern repository
- Role-based access (view, edit, admin)
- Approval workflows for new patterns
- Audit logs for pattern changes
Conclusion
Custom patterns extend Redactorr to handle any PII format, from internal IDs to proprietary codes. With YAML-based definitions, context-aware rules, and built-in testing, you can confidently detect organisation-specific sensitive data.
Ready to build? Try the Pattern Builder tool or explore our template library.