Data Enrichment Guide

Overview

Data enrichment is the process of adding context to events stored in the Secure60 system. This context enables effective searching and reporting across multiple dimensions, and it lets the system scale to store data from hundreds of applications or environments with tens of thousands of devices.

Event enrichment enables you to:

Enrichment Strategy

The Secure60 platform supports multiple levels of enrichment that are applied in a specific order:

  1. Client-Side Enrichment - Add fields at the data source
  2. Static Field Addition - Add consistent metadata via the collector
  3. Field Extraction - Extract structured data from unstructured fields
  4. Field Mapping - Normalize field names across sources
  5. Technology-Specific Normalization - Apply pre-built transformations
  6. Subnet-Based Enrichment - Add context based on IP address ranges
  7. Exact Field Matching - Add context based on specific field values
  8. Automatic Normalization - Apply Secure60’s built-in field standardization
  9. Ingest GEO Enhancement - Add geographic and ASN information

Common Enrichment Fields

Standard Context Fields

Examples of valuable context fields include:

Secure60 Standard Fields

The platform follows the Secure60 Common Information Model with these recommended fields:

Client-Side Enrichment

Overview

Some applications and devices allow you to add custom fields at the source before sending data to the collector.

Examples

Linux Syslog Template Modification

Linux servers can add custom fields to every message by modifying the rsyslog template:

# Example rsyslog template with custom fields
template(name="CustomFormat" type="string"
    string="%timestamp:::date-rfc3339% %hostname% %app-name% %msg% app_name=\"web-server\" environment=\"production\"\n")
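
To illustrate how fields appended this way can later be recovered, here is a minimal Python sketch that pulls key="value" pairs out of a log line. It is not the collector's actual extraction logic; the pattern, the function name extract_pairs, and the sample line are assumptions made for illustration, and the pattern assumes values are double-quoted with no embedded quotes.

```python
import re

# Matches key="value" pairs such as app_name="web-server".
# Assumption: values are double-quoted and contain no embedded quotes.
PAIR_RE = re.compile(r'(\w+)="([^"]*)"')


def extract_pairs(line: str) -> dict:
    """Return every key="value" pair found in a log line."""
    return dict(PAIR_RE.findall(line))


# Hypothetical line as it might be produced by the template above.
line = ('2024-01-15T10:30:00+00:00 web01 nginx worker started '
        'app_name="web-server" environment="production"')
fields = extract_pairs(line)
print(fields)
```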

Application-Level Field Addition

Many applications support adding custom fields to their log output:

{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "message": "User login successful",
  "app_name": "user-portal",
  "environment": "production",
  "region": "us-east-1"
}
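
As one possible way to produce output like the event above, the following Python sketch uses the standard logging module with a small JSON formatter. The CONTEXT values and the JsonFormatter class are hypothetical and only show the idea of merging fixed context fields into every record at the source.

```python
import json
import logging
from datetime import datetime, timezone

# Hypothetical context values; replace with your own.
CONTEXT = {
    "app_name": "user-portal",
    "environment": "production",
    "region": "us-east-1",
}


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the context fields merged in."""

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        event = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "message": record.getMessage(),
            **CONTEXT,
        }
        return json.dumps(event)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("user-portal")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("User login successful")
```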

Best Practices

Collector-Based Enrichment

Static Fields

Add consistent metadata to all events processed by a specific collector instance.

Configuration

# Format: field_name=value,field_name2=value2
STATIC_FIELDS=environment=production,region=us-east-1,collector_zone=dmz
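
The effect of this setting can be pictured with a short Python sketch. This is an illustration only, not the collector's implementation: the function names are hypothetical, values are assumed to contain no commas, and the choice to let static values overwrite existing event fields is an assumption, since the precedence is not stated here.

```python
def parse_pairs(spec: str) -> dict:
    """Parse 'k1=v1,k2=v2' into a dict. Assumes values contain no commas."""
    pairs = {}
    for item in spec.split(","):
        if not item.strip():
            continue  # tolerate a trailing comma
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


def add_static_fields(event: dict, static: dict) -> dict:
    """Return a copy of the event with the static fields added.

    Assumption: static values win if the event already has the field.
    """
    return {**event, **static}


static = parse_pairs("environment=production,region=us-east-1,collector_zone=dmz")
print(add_static_fields({"message": "User login successful"}, static))
```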

Use Cases

Field Mapping

Normalize field names from different data sources so that the same concept always appears under the same field name.

Configuration

# Format: source_field=destination_field
MAP_FIELDS=username=user_name,clientip=source_ip,app=application_name,src_ip=source_ip

Common Mappings

# Standardize user field names
user=user_name,username=user_name,userid=user_name

# Standardize IP address fields
clientip=source_ip,src_ip=source_ip,client_addr=source_ip

# Standardize application fields
app=application_name,service=application_name,program=application_name
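
A minimal Python sketch of what such a mapping does to an event is given here. It is not the collector's implementation; parse_mapping and map_fields are hypothetical names, and the behavior when two source fields map to the same destination (the later one wins in this sketch) is an assumption.

```python
def parse_mapping(spec: str) -> dict:
    """Parse 'src1=dst1,src2=dst2' into a {source: destination} dict."""
    mapping = {}
    for item in spec.split(","):
        source, sep, destination = item.partition("=")
        if sep:  # skip malformed entries without '='
            mapping[source.strip()] = destination.strip()
    return mapping


def map_fields(event: dict, mapping: dict) -> dict:
    """Rename mapped fields; leave every other field untouched."""
    renamed = {}
    for key, value in event.items():
        renamed[mapping.get(key, key)] = value
    return renamed


mapping = parse_mapping("username=user_name,clientip=source_ip,app=application_name")
event = {"username": "alice", "clientip": "10.0.1.15", "message": "login"}
print(map_fields(event, mapping))
```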

Subnet-Based Enrichment

Overview

Enhance events with contextual information based on the source IP address’s subnet. This method calculates the subnet for an IP address and looks up enrichment data in a CSV file.

How It Works

  1. Extract the IP address from the configured field (default: ip_src_address)
  2. Calculate the subnet using the configured prefix (default: /24)
  3. Look up the subnet in the provided CSV file
  4. Add matching fields from the CSV to the event
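
The four steps above can be sketched in Python with the standard ipaddress and csv modules. This is a simplified illustration under stated assumptions, not the collector's code: the function names are hypothetical, the CSV is inlined with only two enrichment columns, and events with a missing or invalid IP address are simply returned unchanged.

```python
import csv
import io
import ipaddress

# Shortened, inlined stand-in for a subnet mapping file.
CSV_TEXT = """subnet,source_department,source_location
192.168.1.0/24,IT,New York
10.0.1.0/24,Finance,London
"""


def load_table(text: str) -> dict:
    """Index CSV rows by their subnet column."""
    return {row["subnet"]: row for row in csv.DictReader(io.StringIO(text))}


def enrich_by_subnet(event, table, source_field="ip_src_address", prefix=24):
    """Add CSV columns to the event when its IP falls in a listed subnet."""
    value = event.get(source_field)              # step 1: extract the IP
    if value is None:
        return event
    try:
        # step 2: strict=False masks the host bits, giving the network address
        network = ipaddress.ip_network(f"{value}/{prefix}", strict=False)
    except ValueError:
        return event
    row = table.get(str(network))                # step 3: look up the subnet
    if row is None:
        return event
    extra = {k: v for k, v in row.items() if k != "subnet"}
    return {**event, **extra}                    # step 4: add matching fields


table = load_table(CSV_TEXT)
print(enrich_by_subnet({"ip_src_address": "192.168.1.100"}, table))
```

Note that in this sketch a lookup only succeeds when the CSV row uses the same prefix length as the configured one.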

Configuration

Environment Variables

ENRICH_SUBNET_ENABLE=true
# Field containing the IP address
ENRICH_SUBNET_SOURCE_FIELD=source_ip
# Subnet prefix length to use
ENRICH_SUBNET_LOOKUP_PREFIX=/24
ENRICH_SUBNET_MAPPING_FIELDS=source_department,source_business_unit,source_location,source_criticality

CSV File Format

Create a file named mappings_subnet.csv:

subnet,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
192.168.1.0/24,IT,Technology,New York,High,Infrastructure,Production
10.0.1.0/24,Finance,Business,London,Critical,Applications,Production
172.16.0.0/16,Development,Technology,San Francisco,Medium,Development,Development
10.50.0.0/16,HR,Business,Chicago,Medium,Applications,Production

Deployment

Mount the CSV file into your collector container:

docker run -v "$(pwd)/mappings_subnet.csv:/etc/vector/mappings_subnet.csv" \
  --name s60-collector --env-file .env secure60/s60-collector:1.08

Example Enrichment

# Original Event
{
  "message": "User login failed",
  "source_ip": "192.168.1.100"
}

# After Subnet Enrichment (using /24 prefix)
{
  "message": "User login failed", 
  "source_ip": "192.168.1.100",
  "source_department": "IT",
  "source_business_unit": "Technology", 
  "source_location": "New York",
  "source_criticality": "High",
  "technology_group": "Infrastructure",
  "environment": "Production"
}

Exact Field Matching Enrichment

Overview

Enhance events with contextual information by looking up the value of a field in a CSV file and requiring an exact match. Hostnames are the most common lookup key, but any field can be configured.

How It Works

  1. Extract the field value from the configured field (default: host_name)
  2. Perform an exact match lookup in the provided CSV file
  3. Add matching fields from the CSV to the event
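
As a rough Python illustration of these steps (not the collector's implementation), the lookup amounts to a dictionary keyed on the first CSV column. The function names and the inlined, shortened CSV are assumptions; the comparison is case-sensitive, so a value that differs only in letter case does not match.

```python
import csv
import io

# Shortened, inlined stand-in for an exact-match mapping file.
CSV_TEXT = """field_value,source_department,source_criticality
web-server-01,IT,High
db-server-prod,Database,Critical
"""


def load_exact_table(text: str, key_column: str = "field_value") -> dict:
    """Index CSV rows by the exact string in the key column."""
    return {row[key_column]: row for row in csv.DictReader(io.StringIO(text))}


def enrich_exact(event, table, source_field="host_name", key_column="field_value"):
    """Add CSV columns when the source field matches a key exactly."""
    row = table.get(event.get(source_field))     # exact, case-sensitive lookup
    if row is None:
        return event
    extra = {k: v for k, v in row.items() if k != key_column}
    return {**event, **extra}


table = load_exact_table(CSV_TEXT)
print(enrich_exact({"host_name": "db-server-prod"}, table))
print(enrich_exact({"host_name": "DB-SERVER-PROD"}, table))  # no match: case differs
```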

Configuration

Environment Variables

ENRICH_CUSTOM_EXACT_ENABLE=true
# Field to match against
ENRICH_CUSTOM_EXACT_SOURCE_FIELD=host_name
ENRICH_CUSTOM_EXACT_MAPPING_FIELDS=source_department,source_business_unit,source_location,source_criticality

CSV File Format

Create a file named mappings_exact.csv:

field_value,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
web-server-01,IT,Technology,New York,High,Infrastructure,Production
web-server-02,IT,Technology,London,High,Infrastructure,Production
db-server-prod,Database,Technology,Frankfurt,Critical,Infrastructure,Production
app-server-dev,Development,Technology,San Francisco,Low,Applications,Development
mail-server-01,IT,Technology,Sydney,Critical,Infrastructure,Production

Alternative Field Matching

You can match against any field, not just hostnames:

# Match against application name
ENRICH_CUSTOM_EXACT_SOURCE_FIELD=application_name

# Match against user name  
ENRICH_CUSTOM_EXACT_SOURCE_FIELD=user_name

# Match against service name
ENRICH_CUSTOM_EXACT_SOURCE_FIELD=service_name

Deployment

Mount the CSV file into your collector container:

docker run -v "$(pwd)/mappings_exact.csv:/etc/vector/mappings_exact.csv" \
  --name s60-collector --env-file .env secure60/s60-collector:1.08

Example Enrichment

# Original Event
{
  "message": "Database connection established",
  "host_name": "db-server-prod"
}

# After Exact Field Matching
{
  "message": "Database connection established",
  "host_name": "db-server-prod",
  "source_department": "Database",
  "source_business_unit": "Technology",
  "source_location": "Frankfurt", 
  "source_criticality": "Critical",
  "technology_group": "Infrastructure",
  "environment": "Production"
}

Technology-Specific Normalization

Overview

The Secure60 Collector includes pre-built transformations for specific technologies that can be enabled via the Portal UI or environment variables.

Supported Technologies

Configuration

Enable technology-specific normalization via Portal UI:

  1. Navigate to Integrations → Secure60 Collector
  2. Click “Advanced Config”
  3. Enable the specific technology transformations you need

Or via environment variables:

ENABLE_LINUX_SYSLOG=true
ENABLE_M365=true
ENABLE_NGINX=true
ENABLE_AWS=true

Automatic Normalization

Overview

The Secure60 Collector automatically normalizes known field names into the Secure60 schema with zero configuration required.

Covered Log Types

Configuration

# Enabled by default
ENABLE_GENERIC_NORMALISE=true

Ingest GEO Enhancement

Overview

The Secure60 Ingest layer automatically adds geographic and ASN information for the ip_src_address and ip_dst_address fields in your events.

Added Fields

Example Enhancement

# Before GEO Enhancement
{
  "ip_src_address": "8.8.8.8",
  "message": "DNS query"
}

# After GEO Enhancement
{
  "ip_src_address": "8.8.8.8",
  "message": "DNS query",
  "geo_src_country": "United States",
  "geo_src_city": "Mountain View",
  "geo_src_latitude": 37.4056,
  "geo_src_longitude": -122.0775,
  "asn_src_org": "Google LLC",
  "asn_src_number": 15169
}

Data Architecture Considerations

Organisation and Project Structure

Organisational Design

Common Architectures

Regional Architecture

Organisation: North America
├── Project: US Production
├── Project: US Development  
└── Project: Canada Production

Organisation: Europe
├── Project: UK Production
├── Project: Germany Production
└── Project: EU Development

Application-Centric Architecture

Organisation: Enterprise IT
├── Project: ERP Systems
├── Project: Web Applications
├── Project: Database Systems
└── Project: Network Infrastructure

Environment-Based Architecture

Organisation: Company
├── Project: Production
├── Project: Staging
└── Project: Development

Best Practices

Planning Your Enrichment Strategy

  1. Work Backwards: Identify key questions you need to answer
  2. Design Around Use Cases: Structure data storage around common queries
  3. Consider Scale: Plan for hundreds of applications and thousands of devices
  4. Standardize Early: Establish consistent field naming conventions

Field Naming Conventions

Performance Considerations

Regex Best Practices

When using regex patterns for field extraction or content redaction:

Troubleshooting

Common Issues

Enrichment Not Applied

  1. Verify CSV file is properly mounted in the container
  2. Check that the source field exists in incoming events
  3. Ensure the field value matches exactly (case-sensitive)
  4. Validate CSV file format and headers
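
Checks 2 to 4 can be run offline against a copy of the mapping file before touching the collector. The Python sketch below is a hypothetical helper, not a Secure60 tool: check_mapping and its messages are invented for illustration, and it takes the CSV content as a string so it can be tried without any files.

```python
import csv
import io


def check_mapping(csv_text, key_column, mapping_fields, sample_value):
    """Return a list of problems found for one sample lookup value."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    if key_column not in headers:
        return [f"missing key column: {key_column}"]
    for field in mapping_fields:
        if field not in headers:
            problems.append(f"missing mapping column: {field}")
    keys = [row[key_column] or "" for row in reader]
    if sample_value not in keys:
        # Distinguish a case mismatch from a value that is absent entirely.
        if sample_value.lower() in {k.lower() for k in keys}:
            problems.append(f"case mismatch for value: {sample_value}")
        else:
            problems.append(f"no row for value: {sample_value}")
    return problems


sample_csv = "field_value,source_department\nweb-server-01,IT\n"
print(check_mapping(sample_csv, "field_value",
                    ["source_department", "source_location"], "Web-Server-01"))
```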

Performance Issues

  1. Reduce the number of enrichment operations
  2. Optimize CSV file size by removing unused columns
  3. Use more specific subnet masks when appropriate
  4. Consider caching strategies for frequently accessed data

Data Quality Issues

  1. Validate field extraction regex patterns with test data
  2. Monitor for null or empty values in enrichment sources
  3. Implement data validation at the source when possible
  4. Use consistent data formats across all sources
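
For the first point, a small test harness is usually enough. The Python sketch below runs an extraction pattern against sample messages with known expected results; the pattern, the field names in its named groups, and the sample messages are all hypothetical and should be replaced with your own.

```python
import re

# Hypothetical extraction pattern for failed SSH logins.
PATTERN = re.compile(
    r"Failed password for (?P<user_name>\S+) from (?P<ip_src_address>\S+)"
)

# Each sample pairs a message with the fields expected from it,
# or None when the pattern should not match at all.
SAMPLES = [
    ("Failed password for alice from 203.0.113.7 port 22 ssh2",
     {"user_name": "alice", "ip_src_address": "203.0.113.7"}),
    ("Accepted publickey for bob from 198.51.100.4 port 22 ssh2", None),
]


def run_samples(pattern, samples):
    """Return (text, expected, actual) for every sample that disagrees."""
    failures = []
    for text, expected in samples:
        match = pattern.search(text)
        actual = match.groupdict() if match else None
        if actual != expected:
            failures.append((text, expected, actual))
    return failures


failures = run_samples(PATTERN, SAMPLES)
print("all samples passed" if not failures else failures)
```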

For assistance with data enrichment configuration, contact our integrations team at integrations@secure60.io
