Alerting

How to set up an alert to trigger an email when license usage by a sourcetype exceeds an average?

rhall2016
New Member

So I have this search that gives me the amount logged by sourcetype in a given time frame, say 24 hours.

index=_internal source=*license_usage.log* type=Usage | stats sum(b) as bytes by st h | eval MB = round(bytes/1024/1024,3) | fields h st MB | sort -MB

What I'm trying to do is have an alert trigger and send out an email when a sourcetype goes over its daily average by some margin, say 30%, but I'm stuck on the next step.

1 Solution

lguinn2
Legend

Despite @ddrillic's nice comment, I would probably do it this way:

index=_internal source=*license_usage.log* type=Usage earliest=-8d@d
| eval cutoff=relative_time(now(),"-24h@h")
| eval group=if(_time<cutoff,"prevDays","today")
| eval hour=strftime(_time,"%H"), day=strftime(_time,"%Y-%m-%d")
| stats sum(b) as bytes by st h group hour day
| eval MB = round(bytes/1024/1024,3)
| stats avg(MB) as avgMB by hour st h group
| eval prev_avg = if(group=="prevDays",avgMB,null())
| eval today = if(group=="today",avgMB,null())
| stats first(prev_avg) as prior_days_avg first(today) as today by hour st h
| eval threshold = prior_days_avg * 1.3
| where today > threshold

How it works, line by line:
1. The search, gathering the data from the past 8 days.
2. Establish a cutoff that is 24 hours ago. We will compare the week prior to the cutoff to the most recent 24 hours.
3. Group the events based on their timestamp relative to the cutoff.
4. Identify the hour of the day (as we will compare hour-by-hour and not day-to-day) and the calendar day.
5. Add up the license usage by sourcetype, host, group, hour, and day. Keeping the day in the by clause leaves one total per hour per day; without it, a whole week of 8am events would be summed into a single row and the next step would have nothing to average.
6. Convert bytes to MB and round.
7. Compute the average for the hour. This calculates the 8am average, the 9am average, etc. I believe that this will allow you to be more responsive to changes in data patterns. For the last 24 hours of data there is normally only one total per hour, so this isn't truly an average, but that hour's sum.
8. Create a separate field for the average of the past week.
9. Create a separate field for the sum of today's data.
10. Collapse the two rows (prior days and today) for each hour, sourcetype, and host into a single row.
11. Create a threshold value; I set the threshold to 130% of the average. Not a great statistic, but you mentioned it.
12. Eliminate all rows where the threshold is not exceeded.
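
To see steps 7 through 12 in isolation, here is a small sketch that runs on synthetic rows built with makeresults instead of real license data. Everything in it is made up for illustration: demo_sourcetype, demo_host, and the MB values 100, 120, and 150. The first two rows stand in for prior-day totals for the 08 hour and the third for today's total.

```
| makeresults count=3
| streamstats count as n
| eval st = "demo_sourcetype", h = "demo_host", hour = "08"
| eval group = if(n < 3, "prevDays", "today")
| eval MB = case(n == 1, 100, n == 2, 120, n == 3, 150)
| stats avg(MB) as avgMB by hour st h group
| eval prev_avg = if(group == "prevDays", avgMB, null())
| eval today = if(group == "today", avgMB, null())
| stats first(prev_avg) as prior_days_avg first(today) as today by hour st h
| eval threshold = prior_days_avg * 1.3
| where today > threshold
```

With these numbers the prior average is 110 MB and the threshold is 143 MB, so the one row with today at 150 MB should survive the final where. Changing the third value to something below 143 should leave no results.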

If you save this as an alert, you could trigger on "number of results > 0" or "number of results is rising" and add the send email action.
The results of the search will be one line for each hour, sourcetype, and host in the last 24 hours where the threshold was exceeded.

ddrillic
Ultra Champion

Some true fun from @lguinn at Comparing Standard Deviations

index="prd_common_events" AppCode="MMX" EventName="ReportRun" earliest=-24h@h latest=@h
| fields Duration ReportType
| stats avg(Duration) as avg stdev(Duration) as standdev by ReportType
| eval avgts = avg + (2 * standdev)
| fields ReportType avgts
| join ReportType
    [ search index="prd_common_events" AppCode="MMX" EventName="ReportRun" earliest=-1h@h latest=@h
    | fields Duration ReportType
    | stats avg(Duration) as nowavg by ReportType
    | fields ReportType nowavg ]
| where nowavg > avgts
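
The same two-standard-deviation idea could be applied to the license data from the question. The sketch below is an adaptation under my own assumptions, not something taken from the linked thread: it compares yesterday's total per sourcetype against the seven full days before it, and the field names baseline_MB, yesterday_MB, avg_MB, and stdev_MB are illustrative. It tags each event with a group, as the accepted answer does, rather than using a join.

```
index=_internal source=*license_usage.log* type=Usage earliest=-8d@d latest=@d
| eval group = if(_time >= relative_time(now(), "-1d@d"), "yesterday", "baseline")
| eval day = strftime(_time, "%Y-%m-%d")
| stats sum(b) as bytes by st group day
| eval MB = round(bytes/1024/1024, 3)
| eval baseline_MB = if(group == "baseline", MB, null())
| eval yesterday_MB = if(group == "yesterday", MB, null())
| stats avg(baseline_MB) as avg_MB stdev(baseline_MB) as stdev_MB max(yesterday_MB) as yesterday_MB by st
| eval threshold = avg_MB + (2 * stdev_MB)
| where yesterday_MB > threshold
```

Compared with a flat 30% margin, a sourcetype whose daily volume is naturally spiky gets a wider band here, while a very steady one gets a tighter band.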