About fields
Fields appear in event data as searchable name-value pairings such as user_name=fred or ip_address=192.168.1.1. Fields are the building blocks of Splunk searches, reports, and data models. When you run a search on your event data, Splunk software looks for fields in that data.
Look at the following example search.
status=404
This search finds events with status fields that have a value of 404. When you run this search, Splunk software does not look for events with any other status value. It also does not look for events containing other fields that share 404 as a value. As a result, this search returns a set of results that is more focused than the set you get if you use only 404 in the search string.
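The difference between the two searches can be sketched in Python. This is a rough model under stated assumptions, not how Splunk software evaluates searches: the events, the search_field and search_text helpers, and the plain substring match are all hypothetical and exist only to show why the field-based search is narrower.

```python
# Toy events: raw text plus fields that have already been extracted.
# The data and both helper functions are hypothetical illustrations.
events = [
    {"_raw": "GET /missing.html status=404 bytes=512", "status": "404", "bytes": "512"},
    {"_raw": "GET /index.html status=200 bytes=404", "status": "200", "bytes": "404"},
    {"_raw": "GET /page404.html status=200 bytes=9120", "status": "200", "bytes": "9120"},
]


def search_field(events, name, value):
    """Keep only events whose named field has exactly this value."""
    return [event for event in events if event.get(name) == value]


def search_text(events, text):
    """Keep every event whose raw text contains the string anywhere."""
    return [event for event in events if text in event["_raw"]]


print(len(search_field(events, "status", "404")))  # 1
print(len(search_text(events, "404")))  # 3
```

In this toy data set the field-based filter returns one event, while the text filter also returns events where 404 is a byte count or part of a URI.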
Fields often appear in events as key=value pairs such as user_name=Fred. But in many events, field values appear in fixed, delimited positions without identifying keys. For example, you might have events where the user_name value always appears by itself after the timestamp and the user_id value.
Nov 15 09:32:22 00224 johnz
Nov 15 09:39:12 01671 dmehta
Nov 15 09:45:23 00043 sting
Nov 15 10:02:54 00676 lscott
Splunk software can identify these fields using a custom field extraction.
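As a minimal sketch of what such an extraction does, the following Python example applies one regular expression with named groups to the sample events. The pattern and the extract_fields helper are assumptions made for illustration. Python writes named groups as (?P<name>...), whereas Splunk regular expressions use the PCRE form (?<name>...).

```python
import re

# Hypothetical pattern: a timestamp, then user_id, then user_name,
# each separated by a single space, as in the sample events.
EVENT_PATTERN = re.compile(
    r"^(?P<timestamp>\w{3} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<user_id>\d+) (?P<user_name>\w+)$"
)


def extract_fields(event):
    """Return the positional values as named fields, or {} if no match."""
    match = EVENT_PATTERN.match(event)
    return match.groupdict() if match else {}


sample_events = [
    "Nov 15 09:32:22 00224 johnz",
    "Nov 15 09:39:12 01671 dmehta",
    "Nov 15 09:45:23 00043 sting",
    "Nov 15 10:02:54 00676 lscott",
]

for event in sample_events:
    print(extract_fields(event))
```

Because the values always appear in the same order, a single pattern is enough to name every field in every event.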
About field extraction
As Splunk software processes events, it extracts fields from them. This process is called field extraction.
Automatically-extracted fields
Splunk software automatically extracts host, source, and sourcetype values, timestamps, and several other default fields when it indexes incoming events.
It also extracts fields that appear in your event data as key=value pairs. This process of recognizing and extracting k/v pairs is called field discovery. You can disable field discovery to improve search performance.
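The idea behind recognizing key=value pairs can be sketched as follows. This is only an illustration of the concept, not the algorithm that Splunk software uses for field discovery. The KV_PATTERN expression, the discover_fields helper, and the sample event are hypothetical.

```python
import re

# Hypothetical pattern: a word-like key, an equals sign, then either a
# double-quoted value or a run of non-whitespace characters.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')


def discover_fields(event):
    """Collect every key=value pair found in the event text."""
    return {key: value.strip('"') for key, value in KV_PATTERN.findall(event)}


event = 'user_name=fred ip_address=192.168.1.1 action="login failed"'
print(discover_fields(event))
```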
When fields appear in events without their keys, Splunk software uses pattern-matching rules called regular expressions to extract those fields as complete k/v pairs. With a properly configured regular expression, Splunk software can extract user_name=johnz from the first of the preceding sample events. Splunk software comes with several field extraction configurations that use regular expressions to identify and extract fields from event data.
For more information about field discovery and an example of automatic field extraction, see When Splunk software extracts fields.
For more information on how Splunk software uses regular expressions to extract fields, see About Splunk regular expressions.
To get all of the fields in your data, create custom field extractions
To use the power of Splunk search, create additional field extractions. Custom field extractions allow you to capture and track information that is important to your needs, but which is not automatically discovered and extracted by Splunk software. Any field extraction configuration you provide must include a regular expression that specifies how to find the field that you want to extract.
All field extractions, including custom field extractions, are tied to a specific source, sourcetype, or host value. For example, if you create an ip field extraction, you might tie the extraction configuration for ip to sourcetype=access_combined.
Custom field extractions should take place at search time, but in certain rare circumstances you can arrange for some custom field extractions to take place at index time. See When Splunk software extracts fields.
Before you create custom field extractions, get to know your data
Before you begin to create field extractions, ensure that you are familiar with the formats and patterns of the event data associated with the source, sourcetype, or host that you are working with. One way is to investigate the predominant event patterns in your data with the Patterns tab. See Identify event patterns with the Patterns tab in the Search Manual.
Here are two events from the same source type, an Apache web server access log.
131.253.24.135 - - [03/Jun/2014:20:49:53 -0700] "GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464 "http://www.splunk.com/download" "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.0; Trident/5.0; Trident/5.0)"
10.1.10.14 - - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 "-" "Mozilla/5.0 (compatible; Nmap Scripting Engine; http://nmap.org/book/nse.html)"
While these events contain different strings and characters, they are formatted in a consistent manner. They both present values for fields such as clientIP, status, bytes, method, and so on in a reliable order.
Reliable means that the method value is always followed by the URI value, the URI value is always followed by the status value, the status value is always followed by the bytes value, and so on. When your events have consistent and reliable formats, you can create a field extraction that accurately captures multiple field values from them.
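To illustrate, here is a minimal Python sketch of one regular expression that captures several fields from both access log events. The pattern is an assumption written for these two samples and is not the extraction that ships with Splunk software. The field names follow the ones mentioned above, and the named groups use the Python (?P<name>...) form instead of the PCRE (?<name>...) form.

```python
import re

# Hypothetical pattern that relies on the reliable order of values:
# clientIP, two placeholder fields, timestamp, request line, status, bytes.
ACCESS_PATTERN = re.compile(
    r'^(?P<clientIP>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<URI>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def extract_access_fields(event):
    """Return the named fields from one access log event, or {} if no match."""
    match = ACCESS_PATTERN.match(event)
    return match.groupdict() if match else {}


events = [
    '131.253.24.135 - - [03/Jun/2014:20:49:53 -0700] '
    '"GET /wp-content/themes/aurora/style.css HTTP/1.1" 200 7464 '
    '"http://www.splunk.com/download" "Mozilla/5.0 (compatible; MSIE 9.0; '
    'Windows NT 6.0; Trident/5.0; Trident/5.0)"',
    '10.1.10.14 - - [03/Jun/2014:20:49:33 -0700] "GET / HTTP/1.1" 200 75017 '
    '"-" "Mozilla/5.0 (compatible; Nmap Scripting Engine; '
    'http://nmap.org/book/nse.html)"',
]

for event in events:
    print(extract_access_fields(event))
```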
For contrast, look at this set of Cisco ASA firewall log events:
1. Jul 15 20:10:27 10.11.36.31 %ASA-6-113003: AAA group policy for user AmorAubrey is being set to Acme_techoutbound
2. Jul 15 20:12:42 10.11.36.11 %ASA-7-710006: IGMP request discarded from 10.11.36.36 to outside:87.194.216.51
3. Jul 15 20:13:52 10.11.36.28 %ASA-6-302014: Teardown TCP connection 517934 for Outside:128.241.220.82/1561 to Inside:10.123.124.28/8443 duration 0:05:02 bytes 297 Tunnel has been torn down (AMOSORTILEGIO)
4. Apr 19 11:24:32 PROD-MFS-002 %ASA-4-106103: access-list fmVPN-1300 denied udp for user 'sdewilde7' outside/12.130.60.4(137) -> inside1/10.157.200.154(137) hit-cnt 1 first hit [0x286364c7, 0x0] "
While these events contain field values that are always space-delimited, they do not share a reliable format like the two web access log events shown earlier. In order, these events represent:
- A group policy change
- An IGMP request
- A TCP connection
- A firewall access denial for a request from a specific IP
Because these events differ so widely, it is difficult to create a single field extraction that can apply to each of these event patterns and extract relevant field values.
In situations like this, where a specific host, source type, or source contains multiple event patterns, you may want to define field extractions that match each pattern, rather than designing a single extraction that can apply to all of the patterns. Inspect the events to identify text that is common and reliable for each pattern.
Using required text in field extractions
In the four Cisco ASA events, the strings of numbers that follow %ASA-#- have specific meanings. You can find their definitions in the Cisco documentation. When you have unique event identifiers like these in your data, specify them as required text in your field extraction. Required text strings limit the events that can match the regular expression in your field extraction.
Specifying required text is optional, but it offers multiple benefits. Because required text reduces the set of events that the regular expression scans, it improves field extraction efficiency and decreases the number of false-positive field extractions.
The field extractor utility enables you to highlight text in a sample event and specify that it is required text.
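The effect of required text can be sketched in Python as a cheap string check that runs before the regular expression. This is an illustration under stated assumptions: the REQUIRED_TEXT constant, the pattern, the field names, and the extract_teardown helper are hypothetical and are written only for the TCP teardown event in the sample set.

```python
import re

# Hypothetical required text: the identifier of the TCP teardown event.
REQUIRED_TEXT = "%ASA-6-302014"

# Hypothetical pattern and field names for that one event pattern.
TEARDOWN_PATTERN = re.compile(
    r"Teardown (?P<protocol>\w+) connection (?P<connection_id>\d+) "
    r"for (?P<src_interface>\w+):(?P<src_ip>[\d.]+)/(?P<src_port>\d+) "
    r"to (?P<dest_interface>\w+):(?P<dest_ip>[\d.]+)/(?P<dest_port>\d+) "
    r"duration (?P<duration>[\d:]+) bytes (?P<bytes>\d+)"
)


def extract_teardown(event):
    """Apply the pattern only to events that contain the required text."""
    if REQUIRED_TEXT not in event:
        return {}  # skipped without running the regular expression
    match = TEARDOWN_PATTERN.search(event)
    return match.groupdict() if match else {}


igmp_event = (
    "Jul 15 20:12:42 10.11.36.11 %ASA-7-710006: IGMP request discarded "
    "from 10.11.36.36 to outside:87.194.216.51"
)
teardown_event = (
    "Jul 15 20:13:52 10.11.36.28 %ASA-6-302014: Teardown TCP connection 517934 "
    "for Outside:128.241.220.82/1561 to Inside:10.123.124.28/8443 "
    "duration 0:05:02 bytes 297 Tunnel has been torn down (AMOSORTILEGIO)"
)

print(extract_teardown(igmp_event))
print(extract_teardown(teardown_event))
```

Events without the identifier are rejected before the pattern runs, which is where both the efficiency gain and the reduction in false positives come from.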
Methods of custom field extraction
As a knowledge manager, you oversee the set of custom field extractions created by users of your Splunk deployment, and you might define specialized groups of custom field extractions yourself. The ways that you can do this include:
- The field extractor utility, which generates regular expressions for your field extractions.
- Adding field extractions through pages in Settings. You must provide a regular expression.
- Manual addition of field extraction configurations at the .conf file level. This method provides the most flexibility for field extraction.
The field extraction methods that are available to Splunk users are described in the following sections. All of these methods enable you to create search-time field extractions. To create an index-time field extraction, choose the third option: Configure field extractions directly in configuration files.
Let the field extractor build extractions for you
The field extractor utility leads you step-by-step through the field extraction design process. It provides two methods of field extraction: regular expressions and delimiter-based field extraction. The regular expression method is useful for extracting fields from unstructured event data, where events may follow a variety of different event patterns. It is also helpful if you are unfamiliar with regular expression syntax and usage, because it generates regular expressions and lets you validate them.
The delimiter-based field extraction method is suited to structured event data. Structured event data comes from sources like SQL databases and CSV files, and produces events where all fields are separated by a common delimiter, such as commas, spaces, or pipe characters. Regular expressions usually are not necessary for structured data events from a common source.
With the regular expression method of the field extractor you can:
- Set up a field extraction by selecting a sample event and highlighting fields to extract from that event.
- Create individual extractions that capture multiple fields.
- Improve extraction accuracy by detecting and removing false positive matches.
- Validate extraction results by using search filters to ensure specific values are being extracted.
- Specify that fields only be extracted from events that have a specific string of required text.
- Review stats tables of the field values discovered by your extraction.
- Manually configure the regular expression for the field extraction yourself.
With the delimiter method of the field extractor you can:
- Identify a delimiter to extract all of the fields in an event.
- Rename specific fields as appropriate.
- Validate extraction results.
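As a rough illustration of the delimiter method, the following Python sketch splits an event on a delimiter and pairs each value with a field name. The extract_delimited helper, the sample events, and the field names are hypothetical. The sketch does not handle delimiters that appear inside quoted values.

```python
def extract_delimited(event, field_names, delimiter=","):
    """Split the event on the delimiter and name each value by position."""
    values = event.split(delimiter)
    return dict(zip(field_names, values))


# Hypothetical structured events and field names.
field_names = ["date", "user_name", "action", "result"]
print(extract_delimited("2014-06-03,johnz,login,success", field_names))
print(extract_delimited("2014-06-03|dmehta|logout|success", field_names, delimiter="|"))
```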
The field extractor can build only search-time field extractions that are associated with specific sources or source types in your data. It cannot build extractions that are associated with hosts.
For more information about using the field extractor, see Build field extractions with the field extractor.
Define field extractions with the Field extractions and Field transformations pages
You can use the Field extractions and Field transformations pages in Settings to define and maintain complex extracted fields in Splunk Web.
This method of field extraction creation lets you create a wider range of field extractions than you can generate with the field extractor utility. It requires that you have the following knowledge.
- Understand how to design regular expressions.
- Have a basic understanding of how field extractions are configured in props.conf and transforms.conf.
If you create a custom field extraction that extracts its fields from _raw and does not require a field transform, use the field extractor utility. The field extractor can generate regular expressions, and it can give you feedback about the accuracy of your field extractions as you define them.
Use the Field Extractions page to create basic field extractions, or use it in conjunction with the Field Transformations page to define field extraction configurations that can do the following things.
- Reuse the same regular expression across multiple sources, source types, or hosts.
- Apply multiple regular expressions to the same source, source type, or host.
- Use a regular expression to extract fields from the values of another field.
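The last capability, extracting fields from the value of another field instead of from the raw event, can be sketched in Python. The referer field, the referer_domain field, the pattern, and the extract_from_field helper are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern that pulls the host portion out of a URL value.
REFERER_DOMAIN = re.compile(r"^https?://(?P<referer_domain>[^/]+)")


def extract_from_field(fields, source_field, pattern):
    """Run the pattern against one existing field and add any new fields."""
    result = dict(fields)
    value = fields.get(source_field)
    if value is not None:
        match = pattern.search(value)
        if match:
            result.update(match.groupdict())
    return result


fields = {"status": "200", "referer": "http://www.splunk.com/download"}
print(extract_from_field(fields, "referer", REFERER_DOMAIN))
```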
The Field extractions and Field transformations pages define only search time field extractions.
See the following topics in this manual:
Configure field extractions directly in configuration files
To get complete control over your field extractions, add the configurations directly into props.conf and transforms.conf. This method lets you create field extractions with capabilities that extend beyond what you can create with Splunk Web methods such as the field extractor utility or the Settings pages. For example, with the configuration files, you can set up:
- Delimiter-based field extractions.
- Extractions for multivalue fields.
- Extractions of fields with names that begin with numbers or underscores. This action is typically not allowed unless key cleaning is disabled.
- Formatting of extracted fields.
See Create and maintain search-time extractions through configuration files.
You can create index-time field extractions only by configuring them in props.conf and transforms.conf. Adding to the default set of indexed fields can result in search performance and indexing problems. But if you must create additional index-time field extractions, see Create custom fields at index time in the Getting Data In manual.
Create custom calculated fields and multivalue fields
Two kinds of custom fields can be persistently configured with the help of .conf files: calculated fields and multivalue fields.
Multivalue fields can appear multiple times in a single event, each time with a different value. To configure custom multivalue fields, make changes to fields.conf as well as to props.conf. See Configure multivalue fields.
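The idea of a multivalue field can be illustrated with a short Python sketch that collects every occurrence of a field in one event into a list. The extract_multivalue helper and the sample event are hypothetical, and the sketch says nothing about how the fields.conf and props.conf settings are written.

```python
import re


def extract_multivalue(event, field_name):
    """Return every value that the named field takes in a single event."""
    pattern = re.compile(rf"\b{re.escape(field_name)}=(\S+)")
    return pattern.findall(event)


# Hypothetical event in which the "to" field appears three times.
event = "from=alice to=bob to=carol to=dave subject=status"
print(extract_multivalue(event, "to"))
```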
Calculated fields provide values that are calculated from the values of other fields present in the event, with the help of eval expressions. Configure them in props.conf. See About calculated fields.
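Conceptually, a calculated field is a named expression that is evaluated against the other fields of each event. The following Python sketch models that idea with plain functions in place of eval expressions. The add_calculated_fields helper and the kilobytes field are hypothetical.

```python
def add_calculated_fields(fields, calculations):
    """Evaluate each named expression against the event's existing fields."""
    result = dict(fields)
    for name, expression in calculations.items():
        result[name] = expression(result)
    return result


# Hypothetical calculation: derive kilobytes from the extracted bytes field.
calculations = {"kilobytes": lambda f: int(f["bytes"]) / 1024}
event_fields = {"status": "200", "bytes": "7464"}
print(add_calculated_fields(event_fields, calculations))
```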
Build field extractions into search strings
The following search commands facilitate the search-time extraction of fields in different ways:
See Extract fields with search commands in the Search Manual. Alternatively, you can look up each of these commands in the Search Reference.
Field extractions facilitated by search commands apply only to the results returned by the searches in which you use these commands. You cannot use these search commands to create reusable extractions that persist after the search is completed. For that, use the field extractor utility, configure extractions with the Settings pages, or set up configurations directly in the .conf files.
This documentation applies to the following versions of Splunk Cloud Platform™: 9.3.2408, 8.2.2201, 8.2.2202, 8.2.2112, 9.0.2205, 8.2.2203, 9.0.2208, 9.0.2209, 9.0.2303, 9.0.2305, 9.1.2308, 9.1.2312, 9.2.2403, 9.2.2406 (latest FedRAMP release)