Smart Detection
How Redactorr finds sensitive data through multi-layer pattern matching and validation
Smart Detection: Finding Needles in Haystacks
Redactorr does not just look for obvious patterns like "[email protected]". It runs hundreds of patterns with format verification and context analysis to find sensitive data accurately.
What Gets Detected
Personal identifiers
- Email addresses: All formats, including plus addressing and subdomains
- Phone numbers: Australian mobile and landline formats, international numbers
- Tax File Numbers, Medicare numbers, ABNs, ACNs, BSBs
- Addresses: Street addresses, PO boxes
- Names: First, last, full names with context analysis
Credentials and secrets
- API keys: 50+ providers (AWS, Stripe, GitHub, and others)
- OAuth tokens, access tokens, refresh tokens
- Passwords in code, configuration files, connection strings
- Private keys: RSA, SSH, PGP
Financial data
- Credit cards: Visa, Mastercard, Amex, Discover (format verified)
- Bank accounts: BSBs, account numbers
- IBAN: International bank account numbers
- Crypto: Wallet addresses
Healthcare data
- Medical record numbers
- Patient identifiers, health fund member IDs
- AHPRA registration numbers
Custom patterns
- Your organisation's ID formats
- Internal project codes
- Customer account numbers
How It Works
Pattern Screening**
Redactorr scans your text with hundreds of detection patterns, each tuned for a specific type of sensitive data.
Validation**
Every match is verified:
- Credit cards are checked against standard format rules
- Emails must have valid domain structure
- Account numbers must follow known structural rules
- API keys must match provider-specific formats
Context Analysis**
The surrounding text matters:
- Is this in a comment or documentation? (Probably an example)
- Is it near keywords like "fake", "test", "example"? (Probably not real)
- Is it in a code snippet? (Might be a placeholder)
Confidence Scoring**
Each detection gets a confidence score:
- High: Very likely sensitive — flagged immediately
- Medium: Probably sensitive — shown for review
- Low: Possibly sensitive — held back in standard mode
Real-World Examples
Email detection Catches:
Skips:
- user@localhost (not a real domain)
- [email protected] (whitelisted)
API key detection Catches:
- sk_live_51Hxyz123456789abcdefghijklmn (Stripe)
- AKIAIOSFODNN7EXAMPLE (AWS)
- ghp_1234567890abcdefghijklmnopqrstuvwxyz (GitHub)
Phone number detection Catches:
- 0412 345 678 (Australian mobile)
- +61 2 9876 5432 (Australian landline)
- (03) 9876 5432
Custom Patterns
Do not see your organisation's ID format? Add a custom pattern:
Go to Settings → Custom Patterns
Click "Add Pattern"
Name it (e.g., "Customer ID")
Provide the format pattern: CUST-d{6}
Test with sample data
Save and activate
Now Redactorr will detect "CUST-123456" as sensitive data.
Need help?