Secure60 Collector - Data Enrichment

The Secure60 Collector offers a powerful data enrichment feature that allows you to augment your log data with additional information based on event attributes such as the source IP address or hostname. This is particularly useful for adding context such as department, business unit, location, or any other relevant metadata to your events.

Subnet-Based Enrichment

This enrichment strategy works by taking an IP address field (by default, ip_src_address, but configurable via the ENRICH_SUBNET_SOURCE_FIELD environment variable) from an incoming event, calculating its subnet based on a configurable prefix/mask, and then looking up this subnet in a CSV (Comma Separated Values) file that you provide. If a match is found, specified fields from the CSV row are merged into the event.

Default Fields: The collector automatically maps the following fields from your CSV if they exist: source_department, source_business_unit, source_location, source_criticality, technology_group, and environment. You only need to specify the ENRICH_SUBNET_MAPPING_FIELDS environment variable if you want to add additional columns from your CSV or use different field names.

How It Works

  1. Event Ingestion: An event arrives at the collector.
  2. IP Extraction: The collector checks if the event contains the configured IP address field (defaulting to ip_src_address).
  3. Subnet Calculation: If the IP address field exists, its network subnet is calculated. For example, if the field contains 192.168.1.123 and the lookup prefix is /24, the calculated subnet will be 192.168.1.0.
  4. CSV Lookup: The collector searches for this calculated subnet (e.g., 192.168.1.0) in a dedicated column (by default, named subnet) within your mappings CSV file.
  5. Data Merging: If a matching row is found in the CSV, the collector extracts the values from the columns specified in the ENRICH_SUBNET_MAPPING_FIELDS environment variable and adds them as new fields to the event.
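
To make the steps above concrete, here is a minimal Python sketch of the same logic. It is illustrative only, not the collector's actual implementation; the field and column names mirror the defaults described in this section:

import csv
import ipaddress

DEFAULT_FIELDS = ("source_department", "source_business_unit", "source_location",
                  "source_criticality", "technology_group", "environment")

def load_subnet_mappings(path):
    # Build a lookup table keyed by the mandatory first column, "subnet".
    with open(path, newline="") as f:
        return {row["subnet"]: row for row in csv.DictReader(f)}

def enrich_event(event, mappings, source_field="ip_src_address", prefix=24,
                 mapping_fields=DEFAULT_FIELDS):
    ip = event.get(source_field)
    if ip is None:
        return event  # Step 2: no IP address field, nothing to enrich
    # Step 3: calculate the subnet, e.g. 192.168.1.123 with /24 -> 192.168.1.0
    subnet = str(ipaddress.ip_network(f"{ip}/{prefix}", strict=False).network_address)
    row = mappings.get(subnet)  # Step 4: look up the subnet in the CSV
    if row:
        for field in mapping_fields:  # Step 5: merge mapped columns into the event
            if row.get(field):  # empty CSV cells are simply skipped
                event[field] = row[field]
    return event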

Preparing Your Subnet Mappings CSV File

You need to create a CSV file that contains your subnet mappings.

Example mappings_subnet.csv:

subnet,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
192.168.1.0,IT,Infrastructure,Data Center A,High,Web Services,Production
10.0.0.0,Sales,CRM Team,Cloud,Medium,Application,Production
172.16.0.0,Engineering,R&D,Main Campus,Critical,Development,Development

Deploying the Mappings File

To make your CSV file accessible to the Secure60 Collector running in Docker, you need to mount it as a volume.

Using docker run:

If your CSV file is named mappings_subnet.csv and is located in your current directory (./mappings_subnet.csv), you would mount it to the default path /etc/vector/mappings_subnet.csv inside the container:

docker run -i --name s60-collector \
  -p 80:80 -p 443:443 -p 514:514/udp -p 6514:6514 -p 5044:5044 \
  -v ./mappings_subnet.csv:/etc/vector/mappings_subnet.csv \
  --rm -d --env-file .env secure60/s60-collector:1.08

Make sure to adjust ./mappings_subnet.csv if your file is in a different location or has a different name. If you change the target path inside the container, you must also update the ENRICH_SUBNET_MAPPINGS_FILE environment variable.

Using docker-compose.yaml:

services:
  s60-collector:
    image: "secure60/s60-collector:1.08"
    container_name: "s60-collector"
    ports:
      - "443:443"
      - "80:80"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./mappings_subnet.csv:/etc/vector/mappings_subnet.csv
    env_file:
      - .env
    restart: 'always'
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "10"

Again, ensure the host path (./mappings_subnet.csv) correctly points to your file. The container path should match what ENRICH_SUBNET_MAPPINGS_FILE expects, or you should set that variable accordingly.

Configuration (Environment Variables)

The subnet enrichment feature is controlled by the following environment variables, which you would typically set in your .env file:
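
  • ENRICH_SUBNET_ENABLE: Set to true to enable subnet-based enrichment.
  • ENRICH_SUBNET_SOURCE_FIELD: The event field containing the IP address to look up. Defaults to ip_src_address.
  • ENRICH_SUBNET_MAPPINGS_FILE: Path to the mappings CSV file inside the container. Defaults to /etc/vector/mappings_subnet.csv.
  • ENRICH_SUBNET_LOOKUP_PREFIX: The prefix/mask used to calculate the subnet from the IP address, for example /24.
  • ENRICH_SUBNET_MAPPING_FIELDS: Comma-separated list of CSV columns to merge into events. Only required if your columns go beyond the default field names listed above.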

Use Case Example

Imagine you are collecting firewall logs. These logs contain the source IP address of traffic but lack contextual information about what that IP address represents in your organization.

  1. You prepare a mappings_subnet.csv file:

    subnet,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
    10.1.10.0,Finance,Corporate HQ,New York Office,High,Web Services,Production
    10.1.20.0,HR,Corporate HQ,New York Office,Medium,Application,Production
    192.168.5.0,Guest WiFi,Annex A,Guest Network,Low,Network Infrastructure,Production
    
  2. You configure your .env file:

    ENRICH_SUBNET_ENABLE=true
    ENRICH_SUBNET_MAPPINGS_FILE=/etc/vector/mappings_subnet.csv
    ENRICH_SUBNET_LOOKUP_PREFIX=/24
    # No need to specify ENRICH_SUBNET_MAPPING_FIELDS unless you want additional fields
    
  3. You deploy the collector, ensuring mappings_subnet.csv is mounted to /etc/vector/mappings_subnet.csv.

Now, an incoming event like this:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "ip_src_address": "10.1.10.55",
  "action": "allowed",
  "destination_port": 443
}

Will be enriched by the collector to become:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "ip_src_address": "10.1.10.55",
  "action": "allowed",
  "destination_port": 443,
  "source_department": "Finance",
  "source_business_unit": "Corporate HQ",
  "source_location": "New York Office",
  "source_criticality": "High",
  "technology_group": "Web Services",
  "environment": "Production"
}

This enriched data provides much more context for analysis, alerting, and reporting within the Secure60 platform.
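
If you want to sanity-check which CSV row a given source IP will match, you can compute the lookup key directly (a quick Python check, separate from the collector):

import ipaddress

# 10.1.10.55 with a /24 prefix resolves to the 10.1.10.0 row of the CSV above.
print(ipaddress.ip_network("10.1.10.55/24", strict=False).network_address)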

Exact Field Matching Enrichment

In addition to subnet-based enrichment, the Secure60 Collector supports exact field matching enrichment. This strategy works by taking any field value from an incoming event (by default, host_name, but configurable via the ENRICH_CUSTOM_EXACT_SOURCE_FIELD environment variable) and performing an exact match lookup against a CSV file. If a match is found, specified fields from the CSV row are merged into the event.

Default Fields: The collector automatically maps the following fields from your CSV if they exist: source_department, source_business_unit, source_location, source_criticality, technology_group, and environment. You only need to specify the ENRICH_CUSTOM_EXACT_MAPPING_FIELDS environment variable if you want to add additional columns from your CSV or use different field names.

How It Works

  1. Event Ingestion: An event arrives at the collector.
  2. Field Extraction: The collector checks if the event contains the configured field (defaulting to host_name).
  3. Exact Match Lookup: If the field exists, the collector searches for an exact match of this field value in the field_value column of your mappings CSV file.
  4. Data Merging: If a matching row is found in the CSV, the collector extracts the values from the columns specified in the ENRICH_CUSTOM_EXACT_MAPPING_FIELDS environment variable and adds them as new fields to the event.
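
As with subnet enrichment, the lookup logic can be illustrated with a minimal Python sketch (illustrative only, not the collector's implementation; names mirror the defaults described above):

import csv

DEFAULT_FIELDS = ("source_department", "source_business_unit", "source_location",
                  "source_criticality", "technology_group", "environment")

def load_exact_mappings(path):
    # Build a lookup table keyed by the mandatory first column, "field_value".
    with open(path, newline="") as f:
        return {row["field_value"]: row for row in csv.DictReader(f)}

def enrich_exact(event, mappings, source_field="host_name",
                 mapping_fields=DEFAULT_FIELDS):
    row = mappings.get(event.get(source_field))  # exact string match on the field value
    if row:
        for field in mapping_fields:
            if row.get(field):  # empty CSV cells are simply skipped
                event[field] = row[field]
    return event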

Preparing Your Exact Match Mappings CSV File

You need to create a CSV file that contains your exact field mappings.

Example mappings_exact.csv:

field_value,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
webserver01,IT,Infrastructure,Headquarters,High,Web Services,Production
firewall-dmz,Security,Networking,Perimeter,Critical,Firewall,Production
vpn-gateway01,IT,Networking,Remote Access,High,VPN,Production
kube-master-01,DevOps,Platform Engineering,Kubernetes Cluster,Critical,Orchestration,Production
marketing-site-prod,Marketing,Web Team,Public Website,Medium,Web Server,Production
crm-app-staging,Sales,CRM Team,Staging Environment,Medium,Application,Staging
ldap-auth-primary,IT,Identity Management,Authentication Services,Critical,LDAP,Production

Deploying the Exact Match Mappings File

To make your CSV file accessible to the Secure60 Collector running in Docker, you need to mount it as a volume.

Using docker run:

If your CSV file is named mappings_exact.csv and is located in your current directory (./mappings_exact.csv), you would mount it to the default path /etc/vector/mappings_exact.csv inside the container:

docker run -i --name s60-collector \
  -p 80:80 -p 443:443 -p 514:514/udp -p 6514:6514 -p 5044:5044 \
  -v ./mappings_exact.csv:/etc/vector/mappings_exact.csv \
  --rm -d --env-file .env secure60/s60-collector:1.08

Make sure to adjust ./mappings_exact.csv if your file is in a different location or has a different name. If you change the target path inside the container, you must also update the ENRICH_CUSTOM_EXACT_FILE environment variable.

Using docker-compose.yaml:

services:
  s60-collector:
    image: "secure60/s60-collector:1.08"
    container_name: "s60-collector"
    ports:
      - "443:443"
      - "80:80"
      - "514:514/udp"
      - "6514:6514"
    volumes:
      - ./mappings_exact.csv:/etc/vector/mappings_exact.csv
    env_file:
      - .env
    restart: 'always'
    logging:
      driver: "json-file"
      options:
        max-size: "50m"
        max-file: "10"

Again, ensure the host path (./mappings_exact.csv) correctly points to your file. The container path should match what ENRICH_CUSTOM_EXACT_FILE expects, or you should set that variable accordingly.

Configuration (Environment Variables)

The exact field matching enrichment feature is controlled by the following environment variables, which you would typically set in your .env file:
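
  • ENRICH_CUSTOM_EXACT_ENABLE: Set to true to enable exact field matching enrichment.
  • ENRICH_CUSTOM_EXACT_SOURCE_FIELD: The event field whose value is looked up in the CSV. Defaults to host_name.
  • ENRICH_CUSTOM_EXACT_FILE: Path to the mappings CSV file inside the container. Defaults to /etc/vector/mappings_exact.csv.
  • ENRICH_CUSTOM_EXACT_MAPPING_FIELDS: Comma-separated list of CSV columns to merge into events. Only required if your columns go beyond the default field names listed above.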

Use Case Example

Imagine you are collecting application logs from various servers and applications. These logs contain hostnames or application names but lack organizational context about what these systems represent.

  1. You prepare a mappings_exact.csv file:

    field_value,source_department,source_business_unit,source_location,source_criticality,technology_group,environment
    web-prod-01,IT,Infrastructure,Data Center A,High,Web Services,Production
    db-primary,IT,Infrastructure,Data Center A,Critical,Database,Production
    crm-staging,Sales,CRM Team,Cloud,Medium,Application,Staging
    analytics-worker,Data Engineering,Analytics,Data Center B,Medium,Big Data,Production
    auth-service,Security,Identity Management,Cloud,Critical,Authentication,Production
    
  2. You configure your .env file:

    ENRICH_CUSTOM_EXACT_ENABLE=true
    ENRICH_CUSTOM_EXACT_SOURCE_FIELD=host_name
    ENRICH_CUSTOM_EXACT_FILE=/etc/vector/mappings_exact.csv
    # No need to specify ENRICH_CUSTOM_EXACT_MAPPING_FIELDS unless you want additional fields
    
  3. You deploy the collector, ensuring mappings_exact.csv is mounted to /etc/vector/mappings_exact.csv.

Now, an incoming event like this:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "host_name": "web-prod-01",
  "service": "nginx",
  "message": "Request processed successfully",
  "response_code": 200
}

Will be enriched by the collector to become:

{
  "timestamp": "2023-10-26T10:00:00Z",
  "host_name": "web-prod-01",
  "service": "nginx",
  "message": "Request processed successfully",
  "response_code": 200,
  "source_department": "IT",
  "source_business_unit": "Infrastructure",
  "source_location": "Data Center A",
  "source_criticality": "High",
  "technology_group": "Web Services",
  "environment": "Production"
}

Flexible Field Matching

Unlike subnet enrichment, which is specifically designed for IP addresses, exact field matching can work with any field in your events. You can configure it to match against:

  • Hostnames (the default field, host_name)
  • Application or service names
  • Any other string-valued field present in your events

Simply change the ENRICH_CUSTOM_EXACT_SOURCE_FIELD environment variable to point to the field you want to use for matching.
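
For example, to match on an application name instead of the hostname, you could set (application_name here is a hypothetical field name; use whichever field your events actually carry):

ENRICH_CUSTOM_EXACT_SOURCE_FIELD=application_name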

Combining Enrichment Methods

Both subnet-based and exact field matching enrichment can be enabled simultaneously. The collector will apply both enrichment strategies to your events, allowing you to add both network-based context (from subnet enrichment) and asset-specific context (from exact field matching) to the same events.
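
A minimal .env sketch enabling both strategies at once, reusing the default paths and settings from the examples above:

ENRICH_SUBNET_ENABLE=true
ENRICH_SUBNET_MAPPINGS_FILE=/etc/vector/mappings_subnet.csv
ENRICH_SUBNET_LOOKUP_PREFIX=/24
ENRICH_CUSTOM_EXACT_ENABLE=true
ENRICH_CUSTOM_EXACT_FILE=/etc/vector/mappings_exact.csv

Remember to mount both CSV files into the container when combining the two methods.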

This enriched data provides comprehensive organizational context for analysis, alerting, and reporting within the Secure60 platform.

Troubleshooting

CSV Mapping Field Rules

When working with CSV files for data enrichment, there are strict rules that must be followed to ensure proper functionality. These rules apply to both subnet-based and exact field matching enrichment.

Rule 1: Column Names and Mapping Fields Must Match

If you change the column names or count inside your CSV file, you must update the corresponding mapping fields environment variable to match exactly.

For Subnet Enrichment: update ENRICH_SUBNET_MAPPING_FIELDS so it exactly matches the column names in your CSV (excluding the subnet key column).

For Exact Field Matching: update ENRICH_CUSTOM_EXACT_MAPPING_FIELDS so it exactly matches the column names in your CSV (excluding the field_value key column).

Example Problem: Suppose you rename the columns in your CSV file:

subnet,department,business_unit,location,criticality,tech_group,env
192.168.1.0,IT,Infrastructure,Data Center A,High,Web Services,Production

Solution: You must update your environment variable to match the new column names:

ENRICH_SUBNET_MAPPING_FIELDS=department,business_unit,location,criticality,tech_group,env

Rule 2: First Column Must Remain Unchanged

The first column of your CSV file serves as the lookup key and must not be renamed or repositioned.

For Subnet Enrichment: the first column must be named subnet and holds the subnet values used as lookup keys.

For Exact Field Matching: the first column must be named field_value and holds the values used as lookup keys.

Incorrect Example:

network,source_department,source_business_unit  # ❌ Wrong - first column renamed
192.168.1.0,IT,Infrastructure

Correct Example:

subnet,source_department,source_business_unit   # ✅ Correct - first column unchanged
192.168.1.0,IT,Infrastructure

Rule 3: Data Consistency - All Rows Must Have Same Column Count

Every data row in your CSV must have the same number of columns as defined in the header row. Columns can be empty, but they cannot be missing.

Incorrect Example:

subnet,source_department,source_business_unit,source_location
192.168.1.0,IT,Infrastructure                    # ❌ Missing source_location column
10.0.0.0,Sales,CRM Team,Cloud,Extra              # ❌ Too many columns

Correct Example:

subnet,source_department,source_business_unit,source_location
192.168.1.0,IT,Infrastructure,                   # ✅ Empty but present
10.0.0.0,Sales,CRM Team,Cloud                    # ✅ All columns present
172.16.0.0,,,                                    # ✅ All empty but present
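
A short Python script can catch Rule 3 violations before deployment (a minimal sketch; any CSV validator will do the same job):

import csv
import sys

# Usage: python check_csv.py mappings_subnet.csv
with open(sys.argv[1], newline="") as f:
    reader = csv.reader(f)
    header = next(reader)
    for lineno, row in enumerate(reader, start=2):
        if len(row) != len(header):
            print(f"Line {lineno}: expected {len(header)} columns, found {len(row)}")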

Common Issues and Solutions

Issue: Enrichment Not Working

Symptoms:

  • Events pass through the collector unchanged; none of the expected enrichment fields are added.

Possible Causes and Solutions:

  1. CSV file not mounted correctly

    • Verify the volume mount path matches the environment variable
    • Check file permissions (should be readable by the container)
  2. Field names don’t match

    • Ensure ENRICH_SUBNET_MAPPING_FIELDS or ENRICH_CUSTOM_EXACT_MAPPING_FIELDS exactly match your CSV column headers
    • Field names are case-sensitive
  3. Lookup field missing or incorrect

    • For subnet enrichment: Verify the event contains the field specified in ENRICH_SUBNET_SOURCE_FIELD (default: ip_src_address)
    • For exact matching: Verify the event contains the field specified in ENRICH_CUSTOM_EXACT_SOURCE_FIELD (default: host_name)
  4. CSV format issues

    • Ensure proper CSV formatting with commas as separators
    • Check for extra spaces or special characters in headers
    • Verify all rows have the same number of columns

Issue: Partial Enrichment

Symptoms:

  • Some of the expected enrichment fields are added to events, while others are missing or empty.

Possible Causes and Solutions:

  1. Incomplete CSV data

    • Check that all required columns exist in your CSV
    • Verify data rows are complete (no missing commas)
  2. Mapping field mismatch

    • Compare your ENRICH_*_MAPPING_FIELDS environment variable with actual CSV column names
    • Ensure field names match exactly (case-sensitive)

Issue: Container Fails to Start

Symptoms:

  • The collector container exits shortly after starting, or its startup logs report errors reading the mappings file.

Possible Causes and Solutions:

  1. File path issues

    • Verify the CSV file exists at the specified host path
    • Check that the file is readable (permissions)
    • Ensure the container path in environment variables matches the volume mount
  2. CSV parsing errors

    • Validate CSV syntax using a CSV validator
    • Check for malformed rows or headers
    • Ensure file encoding is UTF-8

Best Practices

  1. Always validate your CSV file before deployment using a CSV validator or spreadsheet application
  2. Use consistent naming conventions for your column headers
  3. Test with a small subset of data first to verify enrichment is working correctly
  4. Keep backups of your CSV files before making changes
  5. Document your custom field mappings for future reference
  6. Monitor logs during initial deployment to catch any configuration issues early