
Cribl Collector

caution

The Splunk Supporting Add-on for Cribl Stream Replay was tested on manual Single-Instance and Distributed installations; it was not tested with Orchestrated or Docker deployments.

General Settings

UTStream can query the Cribl API multiple times per second, each time with a new session, so the logon rate limit enforced on the Cribl auth endpoint must be increased. The setting resides at Settings > System > General Settings > API Server Settings > Advanced > Login rate limit; in practice, a value of 30/second is sufficient. Additionally, discovery and replay jobs might run for a long time. To prevent Cribl from invalidating the tokens in use, Settings > System > General Settings > API Server Settings > Advanced > Auth-token TTL must be set to the maximum possible job length. Ideally, no job should take longer than 5 minutes, since long jobs hurt UTStream performance, but values of up to 4 hours are possible.
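
For illustration, a minimal sketch of the per-session login pattern that makes these raised limits necessary, using the Cribl REST API's login endpoint; the base URL and credentials are placeholders for your environment:

```javascript
// Minimal sketch: clients such as UTStream authenticate per session,
// so bursts of logins hit the auth endpoint's rate limit.
// Base URL and credentials are placeholders.
const BASE = "https://cribl.example.com:9000";

async function login(username, password) {
  const res = await fetch(`${BASE}/api/v1/auth/login`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ username, password }),
  });
  // A 429 response here means the Login rate limit is set too low.
  if (!res.ok) throw new Error(`login failed: ${res.status}`);
  // The returned bearer token is only valid for the configured Auth-token TTL,
  // so it must outlive the longest discovery or replay job that uses it.
  const { token } = await res.json();
  return token;
}
```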

Replay Pipe

The following two Functions need to be configured in a distinct Pipeline for replayed events. Use this pipeline only for events collected by a collector, never in Routes.

  • Eval - Write the input ID to cribl_replay so that replay jobs can be distinguished in Splunk (see the sketch after this list)
    • `cribl_replay = __inputId.split(":").pop()`
    • `cribl_filename = filename`
  • Parse - Replace _raw with _raw from the original event
    • Operation mode: Extract
    • Type: JSON Object
    • Source: _raw
    • Fields to keep: _raw
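
For illustration, a minimal sketch of what the Eval function produces; the __inputId and filename sample values are assumptions chosen to mirror a collector run:

```javascript
// Sketch of the Eval expressions above. The sample values are assumptions:
// Cribl sets __inputId internally for events read by a collector.
const __inputId = "collection:s3_replay_collector";
const filename = "2024/05/13/main/web01/access_combined/1330-CriblOut-0.json.gz";

const cribl_replay = __inputId.split(":").pop(); // "s3_replay_collector"
const cribl_filename = filename;                 // preserves the source object key

console.log(cribl_replay, cribl_filename);
```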

S3 Destination

General Settings

  • Output ID: define a unique ID for the output
  • S3 Bucket Name: define the bucket name of your existing bucket
  • Region: define the AWS region of your bucket
  • Key Prefix: define the key prefix of your bucket; this is a folder created in the root of your bucket containing all data written by this destination
  • Partitioning Expression: `${C.Time.strftime(_time ? _time : Date.now() / 1000, '%Y/%m/%d')}/${index ? index : 'no_index'}/${host ? host : 'no_host'}/${sourcetype ? sourcetype : 'no_sourcetype'}`
  • Data Format: json
  • File Name Prefix Expression: `${C.Time.strftime(_time ? _time : Date.now() / 1000, '%H%M')}`
  • Compress: gzip

The defined partitioning expression ensures that the S3 collector can locate files in the bucket quickly.
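
As an illustration of the resulting layout, a minimal sketch that resolves both expressions for a sample event; C.Time.strftime is stubbed with a simplified UTC implementation, and the event field values are placeholders:

```javascript
// Simplified UTC stand-in for Cribl's built-in C.Time.strftime.
const strftime = (epoch, fmt) => {
  const d = new Date(epoch * 1000);
  const pad = (n) => String(n).padStart(2, "0");
  return fmt
    .replace("%Y", d.getUTCFullYear())
    .replace("%m", pad(d.getUTCMonth() + 1))
    .replace("%d", pad(d.getUTCDate()))
    .replace("%H", pad(d.getUTCHours()))
    .replace("%M", pad(d.getUTCMinutes()));
};

// Sample event fields (placeholders).
const _time = 1715607000; // 2024-05-13 13:30:00 UTC
const index = "main", host = "web01", sourcetype = "access_combined";

// Partitioning Expression and File Name Prefix Expression from above.
const partition = `${strftime(_time ? _time : Date.now() / 1000, "%Y/%m/%d")}/${index ? index : "no_index"}/${host ? host : "no_host"}/${sourcetype ? sourcetype : "no_sourcetype"}`;
const filePrefix = strftime(_time ? _time : Date.now() / 1000, "%H%M");

console.log(partition);  // "2024/05/13/main/web01/access_combined"
console.log(filePrefix); // "1330"
```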

Authentication

Define authentication to the bucket depending on your environment.

Advanced Settings

  • Max File Size (MB): 128
  • Max File Open Time (Sec): 60
  • Max File Idle Time (Sec): 60
  • Max Open Files: 2000

S3 Collector

An S3 collector must be configured to read the files written to the S3 Destination. Collectors are found in Data Sources.

Collector Settings

  • Collector ID: define a unique ID for this collector; note it down, as the ID is required for the Splunk configuration
  • Auto populate from: select the destination from which this collector should read data
  • Path: append the following string to the auto-populated key prefix (see the sketch after this list): `${_time:%Y}/${_time:%m}/${_time:%d}/${index}/${host}/${sourcetype}/${_time:%H}${_time:%M}-${filename}`
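
For illustration, a sketch of how the Path template tokens line up with a key written by the destination above; the `replay` prefix and the CriblOut file name are assumed examples:

```javascript
// An example key as written by the S3 Destination (values are assumptions;
// "replay" stands in for the auto-populated key prefix).
const key = "replay/2024/05/13/main/web01/access_combined/1330-CriblOut-0.json.gz";

// Regex equivalent of the Path template
//   ${_time:%Y}/${_time:%m}/${_time:%d}/${index}/${host}/${sourcetype}/${_time:%H}${_time:%M}-${filename}
const pattern =
  /^replay\/(\d{4})\/(\d{2})\/(\d{2})\/([^/]+)\/([^/]+)\/([^/]+)\/(\d{2})(\d{2})-(.+)$/;

const [, Y, m, d, index, host, sourcetype, H, M, filename] = key.match(pattern);
console.log({ index, host, sourcetype, filename, time: `${Y}-${m}-${d} ${H}:${M}` });
// index/host/sourcetype and _time become filters during discovery;
// filename feeds the cribl_filename field set in the Replay Pipe.
```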

Result Settings

Event Breakers

Click Add ruleset and select the Cribl event breaker.

Result Routing

  • Send to Routes: No
  • Pipeline: select the pipeline created above
  • Destination: select your Splunk destination
danger

Ensure that the data is not sent to Routes, as this might trigger a loop.