Splunk Search

How can I use a subsearch with a rex-extracted field to search over an extracted field?

mwolfe
Engager

I am trying to take the results of one search, extract a field (named "id") from those results, dedupe its values, and use them to get results from another search. Unfortunately, the second search doesn't have this field directly in the sourcetype either, so it has to be extracted with rex.

I've been having issues with this, though. From what I've read, I need to use a subsearch to extract the ids for the outer search, but it's not working. The two searches run against completely different data sets that have very little in common.

 

index=index1 source="/somefile.log"  uri="/path/with/id/some_id/"
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
[ search index=index2  source="/another.log" "condition-i-want-to-find"
  | rex field=_raw "some_id:(?<some_id>[^,]+),*"
  | dedup some_id
  | fields some_id
]

 


I've tried a bunch of variations of this with no luck, including renaming the field some_id to "search", as some have said that would help. I don't necessarily need the original uri="/path/with/id/some_id" condition in the outer search, but it would be nice to keep it to limit those results.


yuanliu
SplunkTrust

The syntax problem that @PickleRick pointed out can be rectified by adding a pipe and a search command, like this:

 

index=index1 source="/somefile.log"  uri="/path/with/id/some_id/"
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
| search
  [ search index=index2  source="/another.log" "condition-i-want-to-find"
  | rex field=_raw "some_id:(?<some_id>[^,]+),*"
  | dedup some_id
  | fields some_id
  ]

 

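This method, though, reduces the advantage of using a subsearch with your dataset, because the ids are filtered only after the outer search has already retrieved every matching event and run rex on it.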

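To improve efficiency, renaming the field some_id to "search", as some have suggested, actually will help: the extracted values are then handed to the outer search as plain search terms, so they narrow the events retrieved from index1 in the first place. (This works in part because / is a separator in Splunk's segmentation, so each id in the path is searchable as its own term.) You just need to add a format command: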

 

index=index1 source="/somefile.log"  uri="/path/with/id/some_id/"
    [ search index=index2  source="/another.log" "condition-i-want-to-find"
    | rex field=_raw "some_id:(?<search>[^,]+),*"
    | dedup search
    | fields search
    | format
    ]
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
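
One way to check what the subsearch hands to the outer search is to run it on its own. This is only a sketch, using the same placeholder index, source, and condition string as above. With format as the last command, the result should be a single row whose search field holds the generated search string:

```spl
index=index2 source="/another.log" "condition-i-want-to-find"
| rex field=_raw "some_id:(?<search>[^,]+),*"
| dedup search
| fields search
| format
```

If that string does not contain the ids you expect, the problem is more likely in the inner rex than in the outer search.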

 

Here is an emulation.  Play with it and compare with your data.

 

index = _internal log/splunk
``` the above emulates
index=index1 source="/somefile.log"  uri="/path/with/id/some_id/"
```
    [makeresults format=csv data="search
    supervisor.log
    splunkd_ui_access.log"
``` the above emulates
        [ search index=index2  source="/another.log" "condition-i-want-to-find"
    | rex field=_raw "some_id:(?<search>[^,]+),*"
    | dedup search
    | fields search
    | format
    ]
```
    | format]
| rex field=series "log/splunk/(?<some_id>[^\"]+)" ``` emulates | rex field=uri "/path/with/id/(?<some_id>[^/]+)/*" ```
| stats count by some_id

 

On my laptop, it gives

some_id                  count
splunkd_ui_access.log    59
supervisor.log           1045

As you can see, among all the logs, the output is limited to the two values in the subsearch.

PickleRick
SplunkTrust

A subsearch is executed first. If it completes successfully (which is not guaranteed: subsearches have limitations, and throwing heavy raw-data searches into them is not a good idea), it returns a set of conditions or a search string, which is substituted into the main search.

So your search as written makes no sense syntactically: the subsearch output lands directly after the rex command, and rex doesn't take any more arguments.

If anything, you'd need to do:

<something>
| search [ your subsearch here ]
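
To illustrate the substitution, suppose the subsearch from the question returned two some_id values, abc123 and def456 (both made up for this example). With the default subsearch formatting, the search that actually runs would then look roughly like this:

```spl
index=index1 source="/somefile.log"
| rex field=uri "/path/with/id/(?<some_id>[^/]+)/*"
| search ( ( some_id="abc123" ) OR ( some_id="def456" ) )
```

Written out this way, it is easier to see why the conditions need a search command in front of them: on their own they are not valid arguments to rex.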

 
