Last updated at Wed, 03 Jan 2024 21:34:12 GMT

“Where are the logs we are forwarding to InsightIDR, and what can we do with them?”

If you have been asking yourself this lately, this Log Search blog series is for you. In this series, you’ll get our advice on how to be successful with powerful queries as quickly as possible.

In Part 1, we covered different ways to select log sets, the three search modes in InsightIDR, and adjusting how the log entries appear in search results. In Part 2, we covered three useful Log Search concepts: parsed logs, the groupby function, and Log Search operators.

In this final installment, we’ll look at some simple regular expressions that will greatly expand your Log Search options.

What is a regular expression?

Regular expressions are specially encoded strings that are used to match patterns in text. Think of them as fancy wildcards that let you search through your logs and do more sophisticated pattern matching than you would be able to do otherwise. Don’t let the “regular expression” moniker scare you off! While regular expression (regex, for short) pattern matching can be used in a few complicated scenarios, it’s actually quite easy to use, particularly in InsightIDR. As in life, a few thoughtful expressions will go a long way.

Syntax basics

In InsightIDR Log Search, a regular expression is always wrapped in two forward slashes (“/”), one at each end. Those slashes simply indicate that what sits between them is a regular expression. Note that pattern matching in Log Search is case-sensitive by default. You can make a pattern case-insensitive by adding an i after the closing slash. You’ll see this in a few of the following examples.

Here are queries recommended by our team and the InsightIDR community:

Syntax: |
Example query: logon_type=/REMOTE|INTERACTIVE/
Meaning: The pipe character, | , means OR in regular expressions. Find log entries where logon_type is REMOTE OR INTERACTIVE.

Syntax: .*
Example query: source_account=/.*admin.*/
Meaning: Dot star together, .* , matches any sequence of characters, including none. Find logs where source_account contains the string “admin”.

Syntax: \d
Example query: source_asset=/t\d\d\d.*/
Meaning: \d matches any digit (0–9). Find logs where source_asset starts with the letter “t”, followed by three digits, followed by anything.

Syntax: \w
Example query: destination_account=/\wadmin/i
Meaning: \w matches any single word character (a-z, A-Z, 0-9, and underscore (“_”)). Find all destination_account values that start with any single word character and end with “admin”. Search is case-insensitive.

Syntax: +
Example query: destination_account=/\w+admin/i
Meaning: The plus sign, + , means to match one or more of the preceding character. In this case, \w+ means to match any word character “one or more times”. The i means that the search is case-insensitive. This search will match where destination_account ends with “admin” and has at least one word character before it.

Syntax: \s
Example query: destination_user=/.*\ssmith/i
Meaning: \s matches any whitespace character. Find all results where destination_user ends with the last name “smith”. Search is case-insensitive.

Syntax: []
Example query: destination_account=/[\w-]+/
Meaning: Square brackets, [], create a set. The search will match any one of the characters in the set. Find all destination_account values made up only of word characters and hyphens (i.e., an account with a space or a period in it will not match).
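If you want to try these patterns before running them in Log Search, a rough local check is possible with Python’s re module. This is only a sketch: the sample values are made up, the matches helper is hypothetical, and it assumes the pattern has to cover the whole field value (which is why “contains” is written with .* on both sides). Python’s regex engine is not the one behind Log Search, so treat the results as a sanity check rather than a guarantee.

```python
import re


def matches(pattern: str, value: str, ignore_case: bool = False) -> bool:
    """Return True when the pattern covers the entire value.

    Assumption: whole-value matching, which is how the example
    queries in this post are written.
    """
    flags = re.IGNORECASE if ignore_case else 0
    return re.fullmatch(pattern, value, flags) is not None


# (pattern, case-insensitive?, made-up sample value)
checks = [
    (r"REMOTE|INTERACTIVE", False, "REMOTE"),
    (r".*admin.*", False, "sysadmin01"),
    (r"t\d\d\d.*", False, "t123-laptop"),
    (r"\wadmin", True, "XAdmin"),
    (r"\w+admin", True, "DomainAdmin"),
    (r".*\ssmith", True, "Jane Smith"),
    (r"[\w-]+", False, "svc-backup"),
]

for pattern, ignore_case, value in checks:
    suffix = "i" if ignore_case else ""
    print(f"/{pattern}/{suffix} vs {value!r}: {matches(pattern, value, ignore_case)}")
```

Each of these sample values is chosen to match its pattern. Swapping in a value such as “svc backup” for the last check is a quick way to see a set reject a character that is not in it.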

With regex, characters that have special meaning must be “escaped” with a backslash (“\”) when you want to match them literally. Escaping a character tells the regular expression to ignore the character’s special meaning and match the character itself.

For example, the period, . , has special meaning: it means “match any single character”. If you want to match an actual period, escape it with a backslash. To match a particular IP address, your query would look similar to this:
source_address=/10\.10\.10\.10/
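To see why the backslashes matter, here is a small Python sketch. It uses re.escape to produce the escaped form of an IP address and compares the escaped and unescaped patterns against a made-up lookalike value. The assumption is that Log Search treats \. the same way Python does, which is what the query above relies on.

```python
import re

ip = "10.10.10.10"
escaped = re.escape(ip)  # each period gets a backslash in front of it

# The escaped form drops straight into the query shown above.
print(f"source_address=/{escaped}/")

# Made-up value that is clearly not the IP address we want.
lookalike = "10x10y10z10"

# Unescaped, every period matches any character, so the lookalike matches.
print(re.fullmatch(ip, lookalike) is not None)       # True

# Escaped, only a literal period is accepted, so the lookalike is rejected.
print(re.fullmatch(escaped, lookalike) is not None)  # False
```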

Let’s look at a few sample searches that can be run against your Asset Authentication Log Set. Note that all of these queries should be run in Advanced Mode.

Log entry query: where(logon_type=/REMOTE|INTERACTIVE/)
Result: Find entries where logon_type is REMOTE OR INTERACTIVE.

Log entry query: where(destination_account=/.*service/i)groupby(destination_account)
Result: Find destination_account entries that end with “service”. Search is case-insensitive. Output as a “groupby”.

Log entry query: where(destination_account=/.*service/i AND logon_type AND logon_type!=NETWORK)groupby(destination_account)
Result: Find entries where an account ending with “service” logged in, the logon_type field is present, and the logon type is anything other than NETWORK. Search is case-insensitive. Output as a “groupby”.

Log entry query: where(source_asset_address=/10\..*/)groupby(source_asset_address)
Result: Find entries where source_asset_address starts with “10.”. Output as a “groupby”.

Log entry query: where(destination_account=/[abc].*/i)groupby(destination_account)
Result: Find entries where destination_account starts with “a”, “b”, or “c”. Search is case-insensitive. Output as a “groupby”.
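It can help to picture what a where() plus groupby() query does to the log entries. The Python sketch below mimics the third query above on a handful of invented entries: keep an entry only if the account ends with “service” (ignoring case), the logon_type field is present, and it is not NETWORK, then count the survivors per destination_account. The entries, the keep helper, and the dictionary layout are all illustrative assumptions, not how InsightIDR stores or evaluates logs.

```python
import re
from collections import Counter

# Invented log entries for illustration only.
entries = [
    {"destination_account": "backup_service", "logon_type": "INTERACTIVE"},
    {"destination_account": "Backup_Service", "logon_type": "NETWORK"},
    {"destination_account": "SQLService", "logon_type": "REMOTE"},
    {"destination_account": "backup_service", "logon_type": "REMOTE"},
    {"destination_account": "jsmith", "logon_type": "INTERACTIVE"},
    {"destination_account": "web_service"},  # logon_type field is missing
]

ends_with_service = re.compile(r".*service", re.IGNORECASE)


def keep(entry: dict) -> bool:
    """Mimic: destination_account=/.*service/i AND logon_type AND logon_type!=NETWORK."""
    account = entry.get("destination_account", "")
    logon_type = entry.get("logon_type")
    return (
        ends_with_service.fullmatch(account) is not None
        and logon_type is not None      # bare "logon_type" means the field exists
        and logon_type != "NETWORK"
    )


# Mimic groupby(destination_account): count matching entries per account.
groups = Counter(e["destination_account"] for e in entries if keep(e))

for account, count in groups.most_common():
    print(account, count)
```

Note that the regex is case-insensitive but the grouping is not, so in this sketch “backup_service” and “Backup_Service” would count as separate groups if both survived the filter.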

Putting it together: Two use case examples

If you haven’t tried any regex searches yet, now’s the time to get hands-on. Please log into InsightIDR yourself and follow along with me. After you are logged in, head into the Log Search section on the left-hand side of the page.

Now that we are familiar with how to use search operators and the groupby function, let’s combine them with regular expressions to find suspicious ingress authentications onto the network.

  1. Select the Ingress Authentication log set. (NOTE: Ingress Authentication will contain authentications where the source_ip is external to your organization. These types of auths typically come from VPN, Office 365, Okta, or other cloud services. If you do not have an Ingress Authentication log set, you do not have any of this type of activity currently being collected and analyzed by InsightIDR.)
  2. Set the search mode to Advanced.
  3. Use the Time Picker to set how far back you want to search.
  4. Enter the log search query that you want to run in the “Query” box.

We’ll start with my favorite query, which hopefully is becoming a part of your repertoire: the groupby function. Simply look through parsed fields in your logs for an interesting field, and run the groupby function.

Here is what I see after running groupby(geoip_country_name):

As we covered in Part 2 of this series, groupby is a great way to start any investigation. This query shows me all of the countries where there has been a successful or failed ingress authentication. On reflection, I am only interested in the successful logons, so we can narrow the results with the following:

where(geoip_country_name AND result=SUCCESS)groupby(geoip_country_name)

Note that with this query, if there are logs that do not have the specific parsed field geoip_country_name, they will not show up in the results.

Let’s go a step further. Since my organization has locations all over the United States, ingress from the U.S. is not particularly suspicious, so I want to eliminate those logs from my results. We now have:

where(geoip_country_name AND result=SUCCESS AND geoip_country_name!="United States")groupby(geoip_country_name)

Are you wondering yet when to enclose values in quotes and when not to? If there is a space in the value, like there is in United States, you must enclose the value in quotes. Otherwise, quotes are optional.
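If you build queries in a script, the quoting rule is easy to encode. The helper names below (quote_value, not_equal_clause) are hypothetical, and the sketch covers only the rule described here: quote a value when it contains a space, and leave it bare otherwise.

```python
def quote_value(value: str) -> str:
    """Wrap the value in double quotes only when it contains a space."""
    return f'"{value}"' if " " in value else value


def not_equal_clause(field: str, value: str) -> str:
    """Build a field!=value clause for use inside where()."""
    return f"{field}!={quote_value(value)}"


print(not_equal_clause("geoip_country_name", "United States"))
# geoip_country_name!="United States"
print(not_equal_clause("geoip_country_name", "Ireland"))
# geoip_country_name!=Ireland
```

Since quotes are optional for values without spaces, always quoting would work just as well. The helper only quotes when it has to, which keeps short queries readable.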

So far, so good—but let’s say my organization actually has locations in some other countries, too. I need to eliminate those countries from my results as well. I could run a query like this:

where(geoip_country_name AND result=SUCCESS AND geoip_country_name!="United States" AND geoip_country_name!="Ireland")groupby(geoip_country_name)

However, I only have two countries listed and it is already tedious. We can use regex instead—specifically the pipe, | , which is interpreted as OR.

If we wrap the regular expression in forward slashes and use the pipe, we get:

where(geoip_country_name AND result=SUCCESS AND geoip_country_name!=/United States|Ireland|United Kingdom|Canada|India|Singapore|Japan|Australia|Germany|Mauritius|Netherlands/)groupby(geoip_country_name)
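A list like this tends to grow, so you may prefer to keep the countries in one place and generate the query text from them. The Python sketch below does that. The function names are hypothetical, and escape_literal is a small hand-rolled helper that backslash-escapes common regex metacharacters so that a name containing, say, a period or parentheses would still be matched literally. Paste the printed query into Log Search yourself; nothing here talks to InsightIDR.

```python
import re

# Common regex metacharacters that need a backslash to be matched literally.
_META = re.compile(r"([.^$*+?()\[\]{}|\\])")


def escape_literal(value: str) -> str:
    """Backslash-escape regex metacharacters; leave spaces and letters alone."""
    return _META.sub(r"\\\1", value)


def exclude_countries_query(countries: list[str]) -> str:
    """Build the 'successful logons from anywhere except these countries' query."""
    alternation = "|".join(escape_literal(c) for c in countries)
    return (
        "where(geoip_country_name AND result=SUCCESS AND "
        f"geoip_country_name!=/{alternation}/)groupby(geoip_country_name)"
    )


expected_countries = [
    "United States", "Ireland", "United Kingdom", "Canada", "India",
    "Singapore", "Japan", "Australia", "Germany", "Mauritius", "Netherlands",
]

print(exclude_countries_query(expected_countries))
```

With the eleven countries above, the output is the same query shown earlier, so the script only saves typing and reduces the chance of a stray character in a long alternation.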

With this useful query, we can head over to the InsightIDR Dashboards and create a custom dashboard card using this same query. Another option is to use this query as the base for a custom alert in Custom Alerts. Please note that for custom alerts, you should use only the section inside the where() for pattern matching.

Now that you can visualize suspicious ingress authentications onto your network, let’s try a different use case: searching across a user’s DNS queries. Let’s say your HR department has asked you for a list of all the DNS queries that a user, Holly Fox (hfox), has made over the past week, particularly those that contain the word “catfish.” Okay, you probably don’t really care about fish, but you get the idea!

Let’s start by finding the sites that the user has made DNS queries to:

  1. Change the selected Log Set to DNS Query.
  2. Find a field on which to use the function groupby. In this case, I am using the field top_private_domain.
  3. Let’s use a function I have not talked about yet, limit. By default, groupby will return up to 40 results. Using limit after groupby allows you to alter that number. limit(10) will show the top 10 sites; limit(1000) will show you the top 1,000 sites.

Therefore, we can use this as a starting query:

where(user=hfox)groupby(top_private_domain)limit(1000)

Let’s run the search to see if the user has actually browsed to a specific website, as directed by HR:

where(user=hfox AND top_private_domain=/.*catfish.*/i)groupby(top_private_domain)limit(1000)
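To make the roles of where(), groupby(), and limit() concrete, here is a local Python sketch that applies the same three steps to a few invented DNS entries. Everything in it is an assumption for illustration: the top_domains function, the entry layout, and the sample domains. The default of 40 mirrors the groupby default mentioned above, and results are ordered by count, highest first.

```python
import re
from collections import Counter


def top_domains(entries, user, keyword=None, limit=40):
    """Filter by user (and optional keyword), group by domain, keep the top N."""
    pattern = None
    if keyword is not None:
        # Same shape as /.*catfish.*/i: "contains keyword", ignoring case.
        pattern = re.compile(".*" + re.escape(keyword) + ".*", re.IGNORECASE)

    counts = Counter()
    for entry in entries:
        if entry.get("user") != user:
            continue
        domain = entry.get("top_private_domain")
        if domain is None:
            continue  # entries without the field cannot be grouped
        if pattern is not None and pattern.fullmatch(domain) is None:
            continue
        counts[domain] += 1

    # most_common(limit) plays the role of limit(N) after groupby.
    return counts.most_common(limit)


# Invented DNS entries for illustration only.
dns_entries = [
    {"user": "hfox", "top_private_domain": "example.com"},
    {"user": "hfox", "top_private_domain": "catfish-fans.example"},
    {"user": "hfox", "top_private_domain": "CatFishing.example"},
    {"user": "hfox", "top_private_domain": "catfish-fans.example"},
    {"user": "other", "top_private_domain": "catfish-fans.example"},
]

print(top_domains(dns_entries, "hfox", keyword="catfish", limit=1000))
```

Calling top_domains(dns_entries, "hfox") without a keyword corresponds to the starting query, and adding keyword="catfish" corresponds to the narrowed one.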

Below is a snippet of the search results. I was actually searching for a slightly different type of website (hence the black box). I can now use Export to CSV to give the results to HR.

For a bonus use case to track when users get added to a security group, I recorded a two-minute video you can follow along with here:

Wrapping up

I hope you all found this blog series useful. If you’ve been able to solve interesting use cases with the log search in InsightIDR, please drop a comment below. To expand on your Log Search strength, be sure to check out our official documentation here.

All of the queries demonstrated in this blog and many more are also available here.