Secure60 Collector - Data Enrichment

The Secure60 Collector offers a powerful data enrichment feature that allows you to augment your log data with additional information based on event attributes such as the source IP address or hostname. This is particularly useful for adding context such as department, business unit, location, or any other relevant metadata to your events.

Subnet-Based Enrichment

This enrichment strategy works by taking an IP address field (by default, ip_src_address, but configurable via the ENRICH_SUBNET_SOURCE_FIELD environment variable) from an incoming event, calculating its subnet based on a configurable prefix/mask, and then looking up this subnet in a CSV (Comma Separated Values) file that you provide. If a match is found, specified fields from the CSV row are merged into the event.

Default Fields: The collector automatically maps the following fields from your CSV if they exist: source_department, source_business_unit, source_location, source_criticality, technology_group, and environment. You only need to specify the ENRICH_SUBNET_MAPPING_FIELDS environment variable if you want to add additional columns from your CSV or use different field names.

How It Works

  1. Event Ingestion: An event arrives at the collector.
  2. IP Extraction: The collector checks if the event contains the configured IP address field (defaulting to ip_src_address).
  3. Subnet Calculation: If the IP address field exists, its network subnet is calculated. For example, if the field contains 192.168.1.123 and the lookup prefix is /24, the calculated subnet will be 192.168.1.0.
  4. CSV Lookup: The collector searches for this calculated subnet (e.g., 192.168.1.0) in a dedicated column (by default, named subnet) within your mappings CSV file.
  5. Data Merging: If a matching row is found in the CSV, the collector extracts the values from the columns specified in the ENRICH_SUBNET_MAPPING_FIELDS environment variable and adds them as new fields to the event.
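
To make the steps above concrete, here is a minimal Python sketch of the same logic. It is illustrative only, not the collector's actual implementation; the field and column names mirror the defaults described in this section:

import csv
import ipaddress

DEFAULT_FIELDS = ("source_department", "source_business_unit", "source_location",
                  "source_criticality", "technology_group", "environment")

def load_subnet_mappings(path):
    # Build a lookup table keyed by the mandatory first column, "subnet".
    with open(path, newline="") as f:
        return {row["subnet"]: row for row in csv.DictReader(f)}

def enrich_event(event, mappings, source_field="ip_src_address", prefix=24,
                 mapping_fields=DEFAULT_FIELDS):
    ip = event.get(source_field)
    if ip is None:
        return event  # Step 2: no IP address field, nothing to enrich
    # Step 3: calculate the subnet, e.g. 192.168.1.123 with /24 -> 192.168.1.0
    subnet = str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False).network_address)
    row = mappings.get(subnet)  # Step 4: look up the subnet in the CSV
    if row:
        for field in mapping_fields:  # Step 5: merge mapped columns into the event
            if row.get(field):  # empty CSV cells are simply skipped
                event[field] = row[field]
    return event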

Preparing Your Subnet Mappings CSV File

You need to create a CSV file that contains your subnet mappings.

Example mappings_subnet.csv:

subnet,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
192.168.1.0,IT,Infrastructure,Data Center A,High,Web Services,Production
10.0.0.0,Sales,CRM Team,Cloud,Medium,Application,Production
172.16.0.0,Engineering,R&D,Main Campus,Critical,Development,Development

Deploying the Mappings File

To make your CSV file accessible to the Secure60 Collector running in Docker, you need to mount it as a volume.

Using docker run:

If your CSV file is named mappings_subnet.csv and is located in your current directory (./mappings_subnet.csv), you would mount it to the default path /etc/vector/mappings_subnet.csv inside the container:

docker run -i --name s60-collector \
  -p 80:80 -p 443:443 -p 514:514/udp -p 6514:6514 -p 5044:5044 \
  -v ./mappings_subnet.csv:/etc/vector/mappings_subnet.csv \
  --rm -d --env-file .env secure60/s60-collector:1.08

Make sure to adjust ./mappings_subnet.csv if your file is in a different location or has a different name. If you change the target path inside the container, you must also update the ENRICH_SUBNET_MAPPINGS_FILE environment variable.

Using docker-compose.yaml:

services:
  s60-collector:
    image: "secure60/s60-collector:1.08"
    container_name: "s60-collector"
    ports:
      - "443:443"
      - "80:80"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./mappings_subnet.csv:/etc/vector/mappings_subnet.csv
    env_file:
      - .env
    restart: 'always'
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "10"

Again, ensure the host path (./mappings_subnet.csv) correctly points to your file. The container path should match what ENRICH_SUBNET_MAPPINGS_FILE expects, or you should set that variable accordingly.

Configuration (Environment Variables)

The subnet enrichment feature is controlled by the following environment variables, which you would typically set in your .env file:
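
  • ENRICH_SUBNET_ENABLE: Set to true to enable subnet-based enrichment.
  • ENRICH_SUBNET_SOURCE_FIELD: The event field containing the IP address to look up. Defaults to ip_src_address.
  • ENRICH_SUBNET_MAPPINGS_FILE: Path to the mappings CSV file inside the container. Defaults to /etc/vector/mappings_subnet.csv.
  • ENRICH_SUBNET_LOOKUP_PREFIX: The prefix/mask used to calculate the subnet from the IP address, for example /24.
  • ENRICH_SUBNET_MAPPING_FIELDS: Comma-separated list of CSV columns to merge into events. Only required if your columns go beyond the default field names listed above.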

Use Case Example

Imagine you are collecting firewall logs. These logs contain the source IP address of traffic but lack contextual information about what that IP address represents in your organization.

  1. You prepare a mappings_subnet.csv file:

    subnet,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
    10.1.10.0,Finance,Corporate HQ,New York Office,High,Web Services,Production
    10.1.20.0,HR,Corporate HQ,New York Office,Medium,Application,Production
    192.168.5.0,Guest WiFi,Annex A,Guest Network,Low,Network Infrastructure,Production
    
  2. You configure your .env file:

    ENRICH_SUBNET_ENABLE=true
    ENRICH_SUBNET_MAPPINGS_FILE=/etc/vector/mappings_subnet.csv
    ENRICH_SUBNET_LOOKUP_PREFIX=/24
    # No need to specify ENRICH_SUBNET_MAPPING_FIELDS unless you want additional fields
    
  3. You deploy the collector, ensuring mappings_subnet.csv is mounted to /etc/vector/mappings_subnet.csv.

Now, an incoming event like this:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "ip_src_address": "10.1.10.55",
  "action": "allowed",
  "destination_port": 443
}

Will be enriched by the collector to become:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "ip_src_address": "10.1.10.55",
  "action": "allowed",
  "destination_port": 443,
  "source_department": "Finance",
  "source_business_unit": "Corporate HQ",
  "source_location": "New York Office",
  "source_criticality": "High",
  "technology_group": "Web Services",
  "environment": "Production"
}

This enriched data provides much more context for analysis, alerting, and reporting within the Secure60 platform.
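
If you want to sanity-check which CSV row a given source IP will match, you can compute the lookup key directly (a quick Python check, separate from the collector):

import ipaddress

# 10.1.10.55 with a /24 prefix resolves to the 10.1.10.0 row of the CSV above.
print(ipaddress.ip_network("10.1.10.55/24", strict=False).network_address)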

Exact Field Matching Enrichment

In addition to subnet-based enrichment, the Secure60 Collector supports exact field matching enrichment. This strategy works by taking any field value from an incoming event (by default, host_name, but configurable via the ENRICH_CUSTOM_EXACT_SOURCE_FIELD environment variable) and performing an exact match lookup against a CSV file. If a match is found, specified fields from the CSV row are merged into the event.

Default Fields: The collector automatically maps the following fields from your CSV if they exist: source_department, source_business_unit, source_location, source_criticality, technology_group, and environment. You only need to specify the ENRICH_CUSTOM_EXACT_MAPPING_FIELDS environment variable if you want to add additional columns from your CSV or use different field names.

How It Works

  1. Event Ingestion: An event arrives at the collector.
  2. Field Extraction: The collector checks if the event contains the configured field (defaulting to host_name).
  3. Exact Match Lookup: If the field exists, the collector searches for an exact match of this field value in the field_value column of your mappings CSV file.
  4. Data Merging: If a matching row is found in the CSV, the collector extracts the values from the columns specified in the ENRICH_CUSTOM_EXACT_MAPPING_FIELDS environment variable and adds them as new fields to the event.
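
As with subnet enrichment, the lookup logic can be illustrated with a minimal Python sketch (illustrative only, not the collector's implementation; names mirror the defaults described above):

import csv

DEFAULT_FIELDS = ("source_department", "source_business_unit", "source_location",
                  "source_criticality", "technology_group", "environment")

def load_exact_mappings(path):
    # Build a lookup table keyed by the mandatory first column, "field_value".
    with open(path, newline="") as f:
        return {row["field_value"]: row for row in csv.DictReader(f)}

def enrich_exact(event, mappings, source_field="host_name",
                 mapping_fields=DEFAULT_FIELDS):
    row = mappings.get(event.get(source_field))  # exact string match on the field value
    if row:
        for field in mapping_fields:
            if row.get(field):  # empty CSV cells are simply skipped
                event[field] = row[field]
    return event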

Preparing Your Exact Match Mappings CSV File

You need to create a CSV file that contains your exact field mappings.

Example mappings_exact.csv:

field_value,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
webserver01,IT,Infrastructure,Headquarters,High,Web Services,Production
firewall-dmz,Security,Networking,Perimeter,Critical,Firewall,Production
vpn-gateway01,IT,Networking,Remote Access,High,VPN,Production
kube-master-01,DevOps,Platform Engineering,Kubernetes Cluster,Critical,Orchestration,Production
marketing-site-prod,Marketing,Web Team,Public Website,Medium,Web Server,Production
crm-app-staging,Sales,CRM Team,Staging Environment,Medium,Application,Staging
ldap-auth-primary,IT,Identity Management,Authentication Services,Critical,LDAP,Production

Deploying the Exact Match Mappings File

To make your CSV file accessible to the Secure60 Collector running in Docker, you need to mount it as a volume.

Using docker run:

If your CSV file is named mappings_exact.csv and is located in your current directory (./mappings_exact.csv), you would mount it to the default path /etc/vector/mappings_exact.csv inside the container:

docker run -i --name s60-collector \
  -p 80:80 -p 443:443 -p 514:514/udp -p 6514:6514 -p 5044:5044 \
  -v ./mappings_exact.csv:/etc/vector/mappings_exact.csv \
  --rm -d --env-file .env secure60/s60-collector:1.08

Make sure to adjust ./mappings_exact.csv if your file is in a different location or has a different name. If you change the target path inside the container, you must also update the ENRICH_CUSTOM_EXACT_FILE environment variable.

Using docker-compose.yaml:

services:
  s60-collector:
    image: "secure60/s60-collector:1.08"
    container_name: "s60-collector"
    ports:
      - "443:443"
      - "80:80"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./mappings_exact.csv:/etc/vector/mappings_exact.csv
    env_file:
      - .env
    restart: 'always'
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "10"

Again, ensure the host path (./mappings_exact.csv) correctly points to your file. The container path should match what ENRICH_CUSTOM_EXACT_FILE expects, or you should set that variable accordingly.

Configuration (Environment Variables)

The exact field matching enrichment feature is controlled by the following environment variables, which you would typically set in your .env file:
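
  • ENRICH_CUSTOM_EXACT_ENABLE: Set to true to enable exact field matching enrichment.
  • ENRICH_CUSTOM_EXACT_SOURCE_FIELD: The event field whose value is looked up in the CSV. Defaults to host_name.
  • ENRICH_CUSTOM_EXACT_FILE: Path to the mappings CSV file inside the container. Defaults to /etc/vector/mappings_exact.csv.
  • ENRICH_CUSTOM_EXACT_MAPPING_FIELDS: Comma-separated list of CSV columns to merge into events. Only required if your columns go beyond the default field names listed above.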

Use Case Example

Imagine you are collecting application logs from various servers and applications. These logs contain hostnames or application names but lack organizational context about what these systems represent.

  1. You prepare a mappings_exact.csv file:

    field_value,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
    web-prod-01,IT,Infrastructure,Data Center A,High,Web Services,Production
    db-primary,IT,Infrastructure,Data Center A,Critical,Database,Production
    crm-staging,Sales,CRM Team,Cloud,Medium,Application,Staging
    analytics-worker,Data Engineering,Analytics,Data Center B,Medium,Big Data,Production
    auth-service,Security,Identity Management,Cloud,Critical,Authentication,Production
    
  2. You configure your .env file:

    ENRICH_CUSTOM_EXACT_ENABLE=true
    ENRICH_CUSTOM_EXACT_SOURCE_FIELD=host_name
    ENRICH_CUSTOM_EXACT_FILE=/etc/vector/mappings_exact.csv
    # No need to specify ENRICH_CUSTOM_EXACT_MAPPING_FIELDS unless you want additional fields
    
  3. You deploy the collector, ensuring mappings_exact.csv is mounted to /etc/vector/mappings_exact.csv.

Now, an incoming event like this:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "host_name": "web-prod-01",
  "service": "nginx",
  "message": "Request processed successfully",
  "response_code": 200
}

Will be enriched by the collector to become:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "host_name": "web-prod-01",
  "service": "nginx",
  "message": "Request processed successfully",
  "response_code": 200,
  "source_department": "IT",
  "source_business_unit": "Infrastructure",
  "source_location": "Data Center A",
  "source_criticality": "High",
  "technology_group": "Web Services",
  "environment": "Production"
}

Flexible Field Matching

Unlike subnet enrichment, which is specifically designed for IP addresses, exact field matching can work with any field in your events. You can configure it to match against:

  • Hostnames (the default field, host_name)
  • Application or service names
  • Any other string-valued field present in your events

Simply change the ENRICH_CUSTOM_EXACT_SOURCE_FIELD environment variable to point to the field you want to use for matching.
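
For example, to match on an application name instead of the hostname, you could set (application_name here is a hypothetical field name; use whichever field your events actually carry):

ENRICH_CUSTOM_EXACT_SOURCE_FIELD=application_name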

Combining Enrichment Methods

Both subnet-based and exact field matching enrichment can be enabled simultaneously. The collector will apply both enrichment strategies to your events, allowing you to add both network-based context (from subnet enrichment) and asset-specific context (from exact field matching) to the same events.
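
A minimal .env sketch enabling both strategies at once, reusing the default paths and settings from the examples above:

ENRICH_SUBNET_ENABLE=true
ENRICH_SUBNET_MAPPINGS_FILE=/etc/vector/mappings_subnet.csv
ENRICH_SUBNET_LOOKUP_PREFIX=/24
ENRICH_CUSTOM_EXACT_ENABLE=true
ENRICH_CUSTOM_EXACT_FILE=/etc/vector/mappings_exact.csv

Remember to mount both CSV files into the container when combining the two methods.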

This enriched data provides comprehensive organizational context for analysis, alerting, and reporting within the Secure60 platform.

Troubleshooting

CSV Mapping Field Rules

When working with CSV files for data enrichment, there are strict rules that must be followed to ensure proper functionality. These rules apply to both subnet-based and exact field matching enrichment.

Rule 1: Column Names and Mapping Fields Must Match

If you change the column names or count inside your CSV file, you must update the corresponding mapping fields environment variable to match exactly.

For Subnet Enrichment: update ENRICH_SUBNET_MAPPING_FIELDS so it exactly matches the column names in your CSV (excluding the subnet key column).

For Exact Field Matching: update ENRICH_CUSTOM_EXACT_MAPPING_FIELDS so it exactly matches the column names in your CSV (excluding the field_value key column).

Example Problem: Suppose you rename the columns in your CSV file:

subnet,department,business_unit,location,criticality,tech_group,env
192.168.1.0,IT,Infrastructure,Data Center A,High,Web Services,Production

Solution: You must update your environment variable to match the new column names:

ENRICH_SUBNET_MAPPING_FIELDS=department,business_unit,location,criticality,tech_group,env

Rule 2: First Column Must Remain Unchanged

The first column of your CSV file serves as the lookup key and must not be renamed or repositioned.

For Subnet Enrichment: the first column must be named subnet and holds the subnet values used as lookup keys.

For Exact Field Matching: the first column must be named field_value and holds the values used as lookup keys.

Incorrect Example:

network,source_department,source_business_unit  # ❌ Wrong - first column renamed
192.168.1.0,IT,Infrastructure

Correct Example:

subnet,source_department,source_business_unit   # ✅ Correct - first column unchanged
192.168.1.0,IT,Infrastructure

Rule 3: Data Consistency - All Rows Must Have Same Column Count

Every data row in your CSV must have the same number of columns as defined in the header row. Columns can be empty, but they cannot be missing.

Incorrect Example:

subnet,source_department,source_business_unit,source_location
192.168.1.0,IT,Infrastructure                    # ❌ Missing source_location column
10.0.0.0,Sales,CRM Team,Cloud,Extra              # ❌ Too many columns

Correct Example:

subnet,source_department,source_business_unit,source_location
192.168.1.0,IT,Infrastructure,                   # ✅ Empty but present
10.0.0.0,Sales,CRM Team,Cloud                    # ✅ All columns present
172.16.0.0,,,                                    # ✅ All empty but present
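
A short Python script can catch Rule 3 violations before deployment (a minimal sketch; any CSV validator will do the same job):

import csv
import sys

# Usage: python check_csv.py mappings_subnet.csv
with open(sys.argv[1], newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    for lineno, row in enumerate(reader, start=2):
        if len(row) != len(header):
            print(f"Line {lineno}: expected {len(header)} columns, found {len(row)}")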

Common Issues and Solutions

Issue: Enrichment Not Working

Symptoms:

  • Events pass through the collector unchanged; none of the expected enrichment fields are added.

Possible Causes and Solutions:

  1. CSV file not mounted correctly

    • Verify the volume mount path matches the environment variable
    • Check file permissions (should be readable by the container)
  2. Field names don’t match

    • Ensure ENRICH_SUBNET_MAPPING_FIELDS or ENRICH_CUSTOM_EXACT_MAPPING_FIELDS exactly match your CSV column headers
    • Field names are case-sensitive
  3. Lookup field missing or incorrect

    • For subnet enrichment: Verify the event contains the field specified in ENRICH_SUBNET_SOURCE_FIELD (default: ip_src_address)
    • For exact matching: Verify the event contains the field specified in ENRICH_CUSTOM_EXACT_SOURCE_FIELD (default: host_name)
  4. CSV format issues

    • Ensure proper CSV formatting with commas as separators
    • Check for extra spaces or special characters in headers
    • Verify all rows have the same number of columns

Issue: Partial Enrichment

Symptoms:

  • Some of the expected enrichment fields are added to events, while others are missing or empty.

Possible Causes and Solutions:

  1. Incomplete CSV data

    • Check that all required columns exist in your CSV
    • Verify data rows are complete (no missing commas)
  2. Mapping field mismatch

    • Compare your ENRICH_*_MAPPING_FIELDS environment variable with actual CSV column names
    • Ensure field names match exactly (case-sensitive)

Issue: Container Fails to Start

Symptoms:

  • The collector container exits shortly after starting, or its startup logs report errors reading the mappings file.

Possible Causes and Solutions:

  1. File path issues

    • Verify the CSV file exists at the specified host path
    • Check that the file is readable (permissions)
    • Ensure the container path in environment variables matches the volume mount
  2. CSV parsing errors

    • Validate CSV syntax using a CSV validator
    • Check for malformed rows or headers
    • Ensure file encoding is UTF-8

Best Practices

  1. Always validate your CSV file before deployment using a CSV validator or spreadsheet application
  2. Use consistent naming conventions for your column headers
  3. Test with a small subset of data first to verify enrichment is working correctly
  4. Keep backups of your CSV files before making changes
  5. Document your custom field mappings for future reference
  6. Monitor logs during initial deployment to catch any configuration issues early