Data Masking & Privacy

Overview

Secure60 Collector provides data masking and privacy controls that let organisations protect sensitive data before it reaches the Secure60 platform. These controls are applied within your environment, so data covered by your configuration is never transmitted or stored in its original form.

Key Benefits

Data Protection Strategies

Secure60 Collector offers multiple complementary data protection strategies:

  1. Content Redaction - Regex-based redaction of sensitive patterns within field content
  2. Replacement Masking - Replace sensitive field values with X characters
  3. Cryptographic Hashing - Transform sensitive data using secure hash algorithms
  4. Field Removal - Remove entire fields containing sensitive information
  5. Event Filtering - Drop entire events based on content criteria

Content Redaction

Overview

Content redaction protects sensitive information by applying regex-based filters to field content while preserving the overall data structure and context.
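
As a rough illustration of the idea, the sketch below replaces each regex match inside one field with asterisks and leaves the rest of the event untouched. It is written in Python rather than the collector's own engine, and both the redact_field helper and the one-asterisk-per-character replacement are assumptions made for illustration; the collector's actual replacement format may differ.

```python
import re

def redact_field(event: dict, field: str, pattern: str) -> dict:
    """Return a copy of event with every match of pattern in event[field] starred out."""
    redacted = dict(event)
    value = redacted.get(field)
    if isinstance(value, str):
        # Replace each match with asterisks of the same length (assumed format).
        redacted[field] = re.sub(pattern, lambda m: "*" * len(m.group(0)), value)
    return redacted

event = {"user": "john", "message": "SSN: 123-45-6789 on file"}
result = redact_field(event, "message", r"\b\d{3}-\d{2}-\d{4}\b")
print(result)
```

Only the matched substring changes; the surrounding text and the other fields are preserved, which is what keeps redacted events useful for analysis.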

Features

Configuration

Via Portal UI

  1. Navigate to Integrations → Secure60 Collector
  2. Click “Advanced Config”
  3. Find “Content Redaction” section
  4. Configure redaction blocks with target fields and patterns

Via Environment Variables

# Credit card number redaction
REDACT_CONTENT_FIELD_NAME=transaction_log
REDACT_CONTENT_REGEX=r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'

# Password redaction in URLs
REDACT_CONTENT_FIELD_NAME_2=request_url
REDACT_CONTENT_REGEX_2=r'password=[^&\s]+'

# Social security number redaction
REDACT_CONTENT_FIELD_NAME_3=message
REDACT_CONTENT_REGEX_3=r'\b\d{3}-\d{2}-\d{4}\b'

# Email address redaction
REDACT_CONTENT_FIELD_NAME_4=user_data
REDACT_CONTENT_REGEX_4=r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# Phone number redaction
REDACT_CONTENT_FIELD_NAME_5=contact_info
REDACT_CONTENT_REGEX_5=r'\b\d{3}[-.]?\d{3}[-.]?\d{4}\b'

Regex Syntax

The Secure60 Collector uses the Rust regex engine for all pattern matching operations.

For complete syntax reference, see the Rust regex documentation
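
Because pattern mistakes are easy to make, it can help to try each expression against known samples before adding it to the collector configuration. The sketch below uses Python's re module as a convenient stand-in, which is an assumption and not the engine the collector runs. The patterns in this guide stay within syntax both engines accept, but the Rust regex engine does not support look-around or backreferences, so avoid those even if they pass locally. The PATTERNS table, the find_matches helper, and the sample strings are hypothetical.

```python
import re

# Patterns copied from this guide, without the r'...' wrapper used in the config.
PATTERNS = {
    "credit_card": r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b",
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "url_password": r"password=[^&\s]+",
}

def find_matches(name: str, text: str) -> list[str]:
    """Return every substring of text matched by the named pattern."""
    return [m.group(0) for m in re.finditer(PATTERNS[name], text)]

print(find_matches("credit_card", "card 4532-1234-5678-9012 charged"))
print(find_matches("ssn", "SSN: 123-45-6789"))
print(find_matches("url_password", "?user=john&password=secret123&action=login"))
print(find_matches("ssn", "order 12-345-6789"))  # differently grouped digits: no match
```

Including at least one sample that should not match, as in the last line, helps catch patterns that are too broad.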

Common Redaction Patterns

Credit Card Numbers

# Matches various credit card formats
REDACT_CONTENT_REGEX=r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'

# Examples:
# "4532 1234 5678 9012" → "****************"
# "4532-1234-5678-9012" → "****************"
# "4532123456789012" → "****************"

Social Security Numbers

# Matches SSN formats
REDACT_CONTENT_REGEX=r'\b\d{3}-\d{2}-\d{4}\b'

# Example:
# "SSN: 123-45-6789" → "SSN: ***-**-****"

Email Addresses

# Comprehensive email pattern
REDACT_CONTENT_REGEX=r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

# Example:
# "Contact: john.doe@company.com" → "Contact: *********************"

API Keys and Tokens

# Generic API key pattern
REDACT_CONTENT_REGEX=r'\b[A-Za-z0-9]{32,}\b'

# Bearer token pattern
REDACT_CONTENT_REGEX=r'Bearer\s+[A-Za-z0-9+/=]+'

# Example:
# "Authorization: Bearer abc123xyz789" → "Authorization: Bearer ***********"

URL Parameters

# Password parameters
REDACT_CONTENT_REGEX=r'password=[^&\s]+'

# Token parameters
REDACT_CONTENT_REGEX=r'[?&](token|key|secret)=[^&\s]*'

# Example:
# "?user=john&password=secret123&action=login" → "?user=john&password=***&action=login"

Replacement Masking

Overview

Replacement masking transforms sensitive field values by replacing them with X characters, either fully or partially.

Features

Configuration

Basic Configuration

# Enable replacement masking
ENABLE_DATA_MASKING_X=true

# Specify fields to mask
DATA_MASKING_ARRAY=["password", "credit_card", "ssn", "api_key"]

# Enable partial redaction (preserve first/last characters)
ENABLE_DATA_MASKING_PARTIAL_REDACT=true

Advanced Configuration

# Complete masking configuration
DATA_MASKING_ARRAY=["password", "credit_card_number", "social_security", "api_token", "user_password"]
ENABLE_DATA_MASKING_X=true
ENABLE_DATA_MASKING_PARTIAL_REDACT=true

Examples

Full Replacement Masking

# Original Event
{
  "username": "john_doe",
  "password": "mySecret123",
  "credit_card": "4532123456789012"
}

# After Full Masking
{
  "username": "john_doe", 
  "password": "XXXXXXXXXXX",
  "credit_card": "XXXXXXXXXXXXXXXX"
}

Partial Redaction

# Original Event
{
  "username": "john_doe",
  "password": "mySecret123",
  "credit_card": "4532123456789012"
}

# After Partial Redaction
{
  "username": "john_doe",
  "password": "mXXXXXXXXX3",
  "credit_card": "4XXXXXXXXXXXXXX2"
}
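
The two masking modes can be sketched in a few lines of Python. This is an illustration only: the mask_full, mask_partial, and mask_event helpers are hypothetical, and keeping exactly one character at each end is an assumption read off the examples in this section, not a documented rule.

```python
def mask_full(value: str) -> str:
    """Replace every character with X, keeping the original length."""
    return "X" * len(value)

def mask_partial(value: str) -> str:
    """Keep the first and last character and replace the rest with X."""
    if len(value) <= 2:
        # Too short to keep anything without revealing the whole value.
        return mask_full(value)
    return value[0] + "X" * (len(value) - 2) + value[-1]

def mask_event(event: dict, fields: list[str], partial: bool = False) -> dict:
    """Return a copy of event with the listed string fields masked."""
    mask = mask_partial if partial else mask_full
    return {
        key: mask(val) if key in fields and isinstance(val, str) else val
        for key, val in event.items()
    }

event = {"username": "john_doe", "password": "mySecret123", "credit_card": "4532123456789012"}
print(mask_event(event, ["password", "credit_card"]))
print(mask_event(event, ["password", "credit_card"], partial=True))
```

Partial redaction keeps a small hint of the original value, which can help support staff recognise a record but also leaks a little information, so full masking is the safer choice for secrets such as passwords.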

URL Parameter Masking

# Original Event
{
  "request_url": "https://api.example.com/login?user=john&password=secret123&token=abc123xyz"
}

# After URL Parameter Masking
{
  "request_url": "https://api.example.com/login?user=john&password=XXXXXXXX&token=XXX123"
}

Cryptographic Hashing

Overview

Cryptographic hashing replaces sensitive values with one-way hash values. Because the same input always produces the same hash, hashed fields can still be grouped, counted, and correlated across events without exposing the original data.

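Supported Algorithms

The algorithm is selected with DATA_MASKING_ENCRYPTION_ALGORITHM. The values shown in this guide are MD5, SHA1, SHA2, SHA3 (SHA3-256, the default) and SHA3-512.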

Configuration

Basic Hashing Setup

# Enable cryptographic hashing
ENABLE_DATA_MASKING_HASH=true

# Specify fields to hash
DATA_MASKING_ARRAY=["user_id", "email_address", "phone_number"]

# Select hashing algorithm (SHA3 selects SHA3-256, the default)
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA3

Algorithm Selection

# MD5 (fastest, least secure)
DATA_MASKING_ENCRYPTION_ALGORITHM=MD5

# SHA1 (moderate security/speed)
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA1

# SHA2 (good balance)
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA2

# SHA3-256 (default, high security)
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA3

# SHA3-512 (highest security, slower)
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA3-512

Examples

SHA3-256 Hashing (Default)

# Original Event
{
  "user_id": "john.doe@company.com",
  "username": "john_doe",
  "session_id": "abc123xyz789"
}

# After SHA3-256 Hashing (hash values are illustrative placeholders)
{
  "user_id": "a665a45920422f9d417e4867efdc4fb8a04a1f3fff1fa07e998e86f7f7a27ae3",
  "username": "john_doe",
  "session_id": "b1946ac92492d2347c6235b4d2611184"
}

MD5 Hashing (Legacy Support)

# Original Event
{
  "user_email": "john.doe@company.com"
}

# After MD5 Hashing (hash value is an illustrative placeholder)
{
  "user_email": "5d41402abc4b2a76b9719d911017c592"
}
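
To get a feel for what each algorithm setting produces, the sketch below maps the configuration values to Python's hashlib and prints the digest length for one input. The mapping is an assumption: in particular, "SHA2" is taken to mean SHA-256, and the collector may encode or prepare values differently before hashing, so real output need not match these digests. The ALGORITHMS table and hash_value helper are hypothetical.

```python
import hashlib

# Assumed mapping from DATA_MASKING_ENCRYPTION_ALGORITHM values to hashlib
# constructors. "SHA2" is taken to mean SHA-256 here, which is a guess.
ALGORITHMS = {
    "MD5": hashlib.md5,
    "SHA1": hashlib.sha1,
    "SHA2": hashlib.sha256,
    "SHA3": hashlib.sha3_256,
    "SHA3-512": hashlib.sha3_512,
}

def hash_value(value: str, algorithm: str = "SHA3") -> str:
    """Return the lowercase hex digest of value under the chosen algorithm."""
    return ALGORITHMS[algorithm](value.encode("utf-8")).hexdigest()

for name in ALGORITHMS:
    digest = hash_value("john.doe@company.com", name)
    print(f"{name:9} {len(digest):3} hex chars  {digest[:16]}...")
```

The same input always yields the same digest, which is what allows hashed identifiers to be correlated across events. Longer digests take more storage per field, which may matter at high event volumes.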

Combined Protection Strategies

Layered Protection

Combine multiple protection methods for comprehensive coverage:

# Layer 1: Content redaction for patterns
REDACT_CONTENT_FIELD_NAME=raw_log
REDACT_CONTENT_REGEX=r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'

# Layer 2: Field-level masking
DATA_MASKING_ARRAY=["password", "api_key", "access_token"]
ENABLE_DATA_MASKING_X=true

# Layer 3: Hashing for consistent identifiers
ENABLE_DATA_MASKING_HASH=true
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA3

Field-Specific Strategies

User Identity Protection

# Hash email addresses for consistency
DATA_MASKING_ARRAY=["email_address", "user_email"]
ENABLE_DATA_MASKING_HASH=true

# Redact email patterns in message fields
REDACT_CONTENT_FIELD_NAME=message
REDACT_CONTENT_REGEX=r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

Financial Data Protection

# Redact credit card patterns
REDACT_CONTENT_FIELD_NAME=transaction_log
REDACT_CONTENT_REGEX=r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'

# Mask account number fields
DATA_MASKING_ARRAY=["account_number", "routing_number"]
ENABLE_DATA_MASKING_X=true
ENABLE_DATA_MASKING_PARTIAL_REDACT=true

API Security Protection

# Redact bearer tokens in URLs
REDACT_CONTENT_FIELD_NAME=request_url
REDACT_CONTENT_REGEX=r'Bearer\s+[A-Za-z0-9+/=]+'

# Mask API key fields
DATA_MASKING_ARRAY=["api_key", "access_token", "secret_key"]
ENABLE_DATA_MASKING_X=true

Implementation Best Practices

Security Considerations

Algorithm Selection

  1. SHA3-256: Recommended for new implementations (default)
  2. SHA2: Good balance for performance-sensitive environments
  3. SHA1: Use only for legacy compatibility
  4. MD5: Avoid for new implementations due to security vulnerabilities

Field Classification

# High-security fields (use hashing)
HIGH_SECURITY=["social_security", "passport_number", "driver_license"]

# Medium-security fields (use partial masking)  
MEDIUM_SECURITY=["phone_number", "account_number"]

# Low-security fields (use full masking)
LOW_SECURITY=["password", "api_key", "session_token"]

Performance Optimization

Processing Order

  1. Apply content redaction first (most specific)
  2. Apply field-level masking second
  3. Apply hashing last (most computationally intensive)
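
The ordering above can be sketched as a small Python pipeline. This is a simplified model, not the collector's implementation: the protect function, its separate mask_fields and hash_fields arguments, and the asterisk replacement format are all assumptions made for illustration.

```python
import hashlib
import re

def protect(event, redact_rules, mask_fields, hash_fields):
    """Apply redaction, then masking, then hashing, and return a new event."""
    out = dict(event)
    # 1. Content redaction: star out regex matches inside the named fields.
    for field, pattern in redact_rules.items():
        if isinstance(out.get(field), str):
            out[field] = re.sub(pattern, lambda m: "*" * len(m.group(0)), out[field])
    # 2. Field-level masking: overwrite whole values with X characters.
    for field in mask_fields:
        if isinstance(out.get(field), str):
            out[field] = "X" * len(out[field])
    # 3. Hashing: replace whole values with a SHA3-256 hex digest.
    for field in hash_fields:
        if isinstance(out.get(field), str):
            out[field] = hashlib.sha3_256(out[field].encode("utf-8")).hexdigest()
    return out

event = {
    "raw_log": "paid with 4532 1234 5678 9012",
    "password": "mySecret123",
    "email_address": "john.doe@company.com",
}
protected = protect(
    event,
    redact_rules={"raw_log": r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b"},
    mask_fields=["password"],
    hash_fields=["email_address"],
)
print(protected)
```

In this sketch each field is handled by only one stage. If a field were listed for both masking and hashing, the hash would be computed over the already masked value, which is one reason to keep the field lists disjoint.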

Resource Management

# Optimize for high-volume environments
# SHA2 is faster than SHA3
DATA_MASKING_ENCRYPTION_ALGORITHM=SHA2
# Full masking is simpler to process than partial redaction
ENABLE_DATA_MASKING_PARTIAL_REDACT=false

Compliance Alignment

GDPR Compliance

# Protect personal identifiers
DATA_MASKING_ARRAY=["email", "phone", "name", "address"]
ENABLE_DATA_MASKING_HASH=true

# Redact personal data in content
REDACT_CONTENT_FIELD_NAME=message
REDACT_CONTENT_REGEX=r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

HIPAA Compliance

# Protect healthcare identifiers
DATA_MASKING_ARRAY=["patient_id", "medical_record_number", "insurance_id"]
ENABLE_DATA_MASKING_HASH=true

# Redact SSN patterns in free-text medical notes
REDACT_CONTENT_FIELD_NAME=medical_notes
REDACT_CONTENT_REGEX=r'\b\d{3}-\d{2}-\d{4}\b'

PCI DSS Compliance

# Protect payment card data
REDACT_CONTENT_FIELD_NAME=transaction_data
REDACT_CONTENT_REGEX=r'\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'

# Mask cardholder data fields
DATA_MASKING_ARRAY=["card_number", "cvv", "cardholder_name"]
ENABLE_DATA_MASKING_X=true

Validation and Testing

Test Configuration

Before deploying to production, validate your masking configuration:

# Enable debug output to verify masking
DEBUG_OUTPUT=true

# Test with sample data
echo '{"password":"test123","email":"user@example.com"}' | \
  curl -X POST -H "Content-Type: application/json" \
  --data @- http://collector:80/test

Validation Checklist

  1. Sensitive Patterns: Verify all sensitive patterns are detected and masked
  2. Field Coverage: Ensure all sensitive fields are included in configuration
  3. Data Utility: Confirm masked data retains analytical value
  4. Performance Impact: Test with representative data volumes
  5. Compliance Alignment: Validate against regulatory requirements
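
The first two checklist items can be partly automated by scanning processed events for anything that still looks sensitive. The sketch below is one possible approach in Python, under the assumption that you can capture sample output events as dictionaries; the LEAK_PATTERNS table and find_leaks helper are hypothetical and only inspect top-level string fields.

```python
import re

# Patterns that should never appear in collector output (taken from this guide).
LEAK_PATTERNS = {
    "credit_card": r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b",
    "ssn": r"\b\d{3}-\d{2}-\d{4}\b",
    "email": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b",
}

def find_leaks(event: dict) -> list[tuple[str, str]]:
    """Return (field, pattern_name) pairs for string fields that still match."""
    leaks = []
    for field, value in event.items():
        if not isinstance(value, str):
            continue
        for name, pattern in LEAK_PATTERNS.items():
            if re.search(pattern, value):
                leaks.append((field, name))
    return leaks

# A processed event in which an email address slipped through in free text.
masked = {"password": "XXXXXXX", "message": "contact user@example.com"}
print(find_leaks(masked))
```

An empty result does not prove that nothing leaked, since it only covers the listed patterns, but a non-empty result is a clear sign that a field or pattern is missing from the configuration.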

Monitoring and Auditing

Effectiveness Monitoring

Configuration Management

Troubleshooting

Common Issues

Incomplete Masking

Problem: Some sensitive data is not being masked.

Solutions:

  1. Check field names in DATA_MASKING_ARRAY match exactly
  2. Verify regex patterns cover all data formats
  3. Enable debug output to trace processing
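
For the first item, a quick comparison between the configured field names and a sample event can reveal mismatches such as differences in letter case. The sketch below is a hypothetical helper in Python; it assumes the DATA_MASKING_ARRAY value parses as a JSON array and only looks at top-level keys of the event.

```python
import json

def unmatched_fields(masking_array: str, event: dict) -> dict:
    """Map each configured field missing from event to near-miss keys, if any."""
    configured = json.loads(masking_array)  # the value is written as a JSON-style array
    report = {}
    for field in configured:
        if field in event:
            continue
        # Near misses: same name apart from letter case or surrounding whitespace.
        near = [k for k in event if k.strip().lower() == field.strip().lower()]
        report[field] = near
    return report

sample_event = {"Password": "test123", "email": "user@example.com"}
report = unmatched_fields('["password", "email", "api_key"]', sample_event)
print(report)
```

A field that maps to a near-miss key points to a naming mismatch in the configuration, while a field with no near misses is simply absent from that sample event.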

Performance Issues

Problem: High CPU usage with masking enabled.

Solutions:

  1. Use a faster hashing algorithm (SHA2 instead of SHA3)
  2. Disable partial redaction for better performance
  3. Optimize regex patterns for efficiency

Pattern Mismatches

Problem: Regex patterns are not matching the expected data.

Solutions:

  1. Test patterns with sample data before deployment
  2. Account for variations in data formatting
  3. Use online regex testing tools for validation

Debug Configuration

# Enable comprehensive debugging
DEBUG_OUTPUT=true
VECTOR_LOG=debug

# Test masking with sample data
curl -X POST -H "Content-Type: application/json" \
  --data '{"test_field":"sensitive_data"}' \
  http://collector:80/test

For assistance with data masking configuration, contact our integrations team at integrations@secure60.io
