Skip to main content

count, count_distinct, count_frequent Operators

Aggregating (group-by) functions are used in conjunction with the group operator and a field name.

Only the word聽by聽is required to represent the聽group operator. The count function is also an operator in its own right,聽and therefore can be used with or without the word聽by.

count鈥�/a>

Counts the total number of logs that match the keyword search within the time range analyzed.

Syntax鈥�/a>

count[(<field>)] [as <field>] [by <field>, ...]

Rules鈥�/a>

Examples鈥�/a>

To count the number of logs:

| count

To count the number of logs from a specific field, in this example the field is port:

| count(port)

To count the number of logs from a specific field based on grouping聽by other fields:聽

| count(port)by srcAddress, tgtAddress

The same example above with an alias field name,聽countOfPort, and an additional aggregate operator, avg:

| count(port) as countOfPort, avg(bytes)by聽srcAddress, tgtAddress
| sort by countOfPort

When you want to count聽more than one field, you must create an alias using the as operator to rename the _count fields.

count_distinct鈥�/a>

Counts only distinct occurrences of the value of a field being counted within the time range analyzed.

An empty value still counts as a unique value and will be counted.

Syntax鈥�/a>

count_distinct(<field>) [as <field>] [by <field>, ...]

Rules鈥�/a>

  • Creates field named聽_count_distinct

Examples鈥�/a>

| count_distinct(username) group by hostname
_sourceCategory=*apache*
| parse "* -" as src_ip
| count_distinct(src_ip)

By default, ordering is not defined inside of groups created using a group-by expression. To order your results, use聽the聽sort聽operator.

If the number of distinct items returned is less than 100, the聽count_distinct聽function provides an exact number. If the number of distinct items returned is larger than 100,聽count_distinct聽instead uses an approximate algorithm, and displays a message that explains: count_distinct saw more than 100 values, results may be approximate

The approximation algorithm uses a relative error parameter of 2%, for example:

  • 65% of the time, results are within +/- 2%.
  • 95% of the time, results are within +/- 4%.
  • 99% of the time, results are within +/- 6%.

So for example, if the true count of distinct items is 1,000, the result returned by the approximation algorithm is between 950 and 1050 about 95% of the time.

The error parameter value is important to making the聽count_distinct聽function return results quickly and in a scalable way.

Also, note that when you want to count聽the distinct occurrences聽of more than one field, you must create an alias using the as operator to rename the _count_distinct fields. See this example:

_sourceCategory=PaloAltoNetworks
| count_distinct(threatid) as cntthreatid, count_distinct(repeatcnt) as cntrepeatcnt

count_frequent鈥�/a>

The count_frequent function can be used in cases where you want to identify the most common values for aggregations with over 10,000 distinct groups. This query returns the highest-count 10,000 results in sorted order. The resulting count field is called _approxcount because it is only an聽estimate聽of the true count; the estimate may be incorrect, but can only be over (it will never be under).

The count_frequent function is followed immediately by one or more field names.

You can use the聽count_frequent聽operator in Dashboard queries, but the number of results returned is limited to the top 100 most frequent results. All results are available when the search is run on the聽Search聽page, but only the top 100 are displayed in the Panel.

Syntax鈥�/a>

count_frequent <field>[, <field2>, field3, ...]

Rules鈥�/a>

  • Creates field named _approxcount
  • Cannot be used with other aggregating functions like聽sum聽or聽avg
  • Sort is built into the query and defaults to a most-to-least order

Example鈥�/a>

* | parse "srcIP=*, url=*" as srcIP, url聽
| count_frequent srcIP, url
Status
Legal
Privacy Statement
Terms of Use

Copyright 漏 2025 by Sumo Logic, Inc.