Splunk Enterprise

Transform ingested HEC JSON logs into regular logs

PT_crusher
Explorer

Looking for props.conf / transforms.conf configuration guidance.

The aim is to search logs from an HTTP Event Collector the same way we search regular logs. We don't want to be searching JSON on the search heads.

We're in the process of migrating from Splunk Forwarders to logging-operator in k8s. The thing is, the Splunk Forwarder reads log files and uses standard indexer discovery, whereas logging-operator collects stdout/stderr and must output to an HEC endpoint, meaning the logs arrive at the heavy forwarder as JSON.

We want to keep using Splunk the same way we have over the years and avoid adapting alerts, dashboards, etc. to the new JSON source.

OLD CONFIG AIMED AT THE INDEXERS (with the following config we get environment/site/node/team/pod as extracted fields)

 

[vm.container.meta]
# source: /data/nodes/env1/site1/host1/logs/team1/env1/pod_name/localhost_access_log.log
CLEAN_KEYS = 0
REGEX = \/.*\/.*\/(.*)\/(.*)\/(.*)\/.*\/(.*)\/.*\/(.*)\/
FORMAT = environment::$1 site::$2 node::$3 team::$4 pod::$5
SOURCE_KEY = MetaData:Source
WRITE_META = true
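
(For reference, a stanza like this is applied from props.conf on the indexing tier via a TRANSFORMS- class; a minimal sketch, where the sourcetype name vm:container is a placeholder rather than our real one:)

# props.conf on the indexers / heavy forwarder
[vm:container]
TRANSFORMS-containermeta = vm.container.meta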

 


SAMPLE LOG USING logging-operator

 

{
  "log": "ts=2024-10-15T15:22:44.548Z caller=scrape.go:1353 level=debug component=\"scrape manager\" scrape_pool=kubernetes-pods target=http://1.1.1.1:8050/_api/metrics msg=\"Scrape failed\" err=\"Get \\\"http://1.1.1.1:8050/_api/metrics\\\": dial tcp 1.1.1.1:8050: connect: connection refused\"\n",
  "stream": "stderr",
  "time": "2024-10-15T15:22:44.548801729Z",
  "environment": "env1",
  "node": "host1",
  "pod": "pod_name",
  "site": "site1",
  "team": "team1"
}

 


sainag_splunk
Splunk Employee

The only way to do exactly that would be to use the /services/collector/raw endpoint.

That said, while I understand the desire to maintain your existing Splunk setup, I would advise against using the raw endpoint (/services/collector/raw) to transform the JSON logs back into a regular log format: it would unnecessarily increase system load and complexity.

Instead, the best practice is to use the existing event endpoint (/services/collector/event) for ingesting data into Splunk. This is optimized for handling structured data like JSON and is more efficient.
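
For reference, an event-endpoint payload carries the log line in the "event" key and can carry metadata in a top-level "fields" object, which Splunk stores as indexed fields (conceptually similar to what WRITE_META = true gave you at index time). A minimal sketch, where the sourcetype is a placeholder and whether logging-operator can be configured to emit this shape is something to verify on your side:

{
  "sourcetype": "kube:container",
  "event": "ts=2024-10-15T15:22:44.548Z caller=scrape.go:1353 level=debug ...",
  "fields": {
    "environment": "env1",
    "site": "site1",
    "node": "host1",
    "team": "team1",
    "pod": "pod_name"
  }
}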

I recommend adjusting your alerts and dashboards to work with the new JSON structure from logging-operator. While this may require some initial effort, it's a more sustainable approach in the long run:

  1. Update your search queries to use spath, or enable automatic JSON field extraction with KV_MODE = json in props.conf (see the sketch after this list).
  2. Modify dashboards to reference the new JSON field names.
  3. Adjust alerts to use the appropriate JSON fields and structure.
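
A rough sketch of what items 1 and 2 could look like (the sourcetype and index names below are placeholders, not taken from your environment):

# props.conf on the search heads: automatic search-time JSON extraction
[kube:container]
KV_MODE = json

# example search: the metadata that used to come from the source path
# is now available as JSON fields
index=k8s sourcetype=kube:container environment=env1 site=site1 team=team1

# or, without KV_MODE, extract explicitly in the search
index=k8s sourcetype=kube:container | spath | search environment=env1 team=team1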





PickleRick
SplunkTrust

I don't think that's the issue here. The same payload sent to the /raw endpoint would end up looking the same. It's the source formatting the data differently than before.
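
To illustrate the point with a sketch (not tested, and as noted below logging-operator cannot use this endpoint anyway): the raw endpoint ingests the request body essentially as-is, so posting the same JSON document to /services/collector/raw would still leave that JSON document in _raw.

# body POSTed to /services/collector/raw (hypothetical)
{"log": "ts=2024-10-15T15:22:44.548Z ...", "stream": "stderr", "environment": "env1", ...}

# what lands in _raw on the indexer: the same JSON document, unchanged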


PT_crusher
Explorer

The raw endpoint is not an option because it is not supported by logging-operator.
