Tuesday 8 December 2020

Analyst's Problems as a Service (APaaS) - Part 2

In this blog we will discuss some of the issues an analyst faces due to a lack of log and alert enrichment. This is a continuation of a three-part blog series. If you have not read the previous one, you can find it here.

SIEM is full of logs:

SIEM is made of logs; I mean, the whole concept of a SIEM is to bring all those records from products/applications into a single place. There are many reasons to do that, but the important ones are:

·         Someone can look for/find the logs in a single place

·         Leverage “Least Privilege” by restricting analyst access to point products (unless necessary, which is a rare instance) while still letting analysts see all the information they need

·         Create alerts, baselines, dashboards, and all sorts of meaningful things.

All of them require logs, BUT to leverage the advanced features, logs must be pulled/pushed into the SIEM, parsed, standardized, augmented, and correlated. Logs provide user/system/application-related data and help an analyst build a story from them.

There are so many devices that generate logs: firewalls, routers, proxies, endpoints, and cloud applications are a few among them. The first thing you might be stuck on is “what data do we need, do we need to collect from everything, and if yes, how are we supposed to do it?” Once we figure that out, the immediate question is how we are supposed to get these logs (are there set standards, should we use scripts, Application Programming Interfaces (APIs), or other tools?). Right after you address that issue, the next problem is what format these logs are in, because the fields you get depend on the format you receive the logs in.

But if you are just collecting raw logs without

·         enriching them with additional information such as ASN, geo, forward/reverse DNS, etc.

·         having a well-structured common schema notation for fields

you will make an analyst’s work cumbersome.
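To make the first point concrete, here is a minimal Python sketch of that kind of enrichment at ingest time. It assumes the parsed record already carries a "src_ip" field (an illustrative name, not tied to any product) and only performs a reverse DNS lookup; geo and ASN lookups are left as a placeholder comment.

import socket

def reverse_dns(ip):
    """Best-effort PTR lookup; returns None if the IP does not resolve."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        return None

def enrich(record):
    ip = record.get("src_ip")
    if ip:
        record["src_ip_rdns"] = reverse_dns(ip)
        # Geo and ASN enrichment would plug in here, e.g. lookups against a
        # local GeoIP/ASN database, attached as additional fields.
    return record

print(enrich({"src_ip": "8.8.8.8", "action": "allowed"}))

Doing this once at ingest means the analyst never has to run those lookups by hand in the middle of triage.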

Some of the questions you might ask are: what’s the problem here, why is it so important, and how does it affect an analyst? I’m glad you are asking these questions. Let’s address them.

If you don’t know what logs you need, you might have a “blind spot” in what you can see and protect in your environment. If you don’t know what format the logs can be extracted in, you don’t know what fields exist in the logs. If you are just bringing in raw logs, the important fields an analyst needs are not searchable, which has a huge impact on the queries you create to search for logs (for example, falling back to regex) and on the number of queries you run against the SIEM’s storage. Also, if you don’t know the fields, there is no consistent field naming convention across your SIEM, and an analyst now has to craft per-field searches in different data sources and triage an alert without the ability to correlate between them. Along the same lines, an alert is based on the fields you query in specified/correlated data sources; if you do not have field consistency or enrichment, you are not able to use the out-of-the-box alerts from the SIEM (this might work if you are using SIEM-specific solutions to ship data, BUT you will still have issues with custom applications) and you are not able to create custom ones either.
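To illustrate the schema point, the sketch below maps vendor-specific field names onto one common naming convention. The source names, the raw fields, and the ECS-style target names are assumptions chosen purely for illustration.

# Map vendor-specific field names onto one common schema so the same
# query works across data sources. Source names and raw fields are illustrative.
FIELD_MAP = {
    "firewall_x": {"srcip": "source.ip", "dstip": "destination.ip", "act": "event.action"},
    "proxy_y": {"c-ip": "source.ip", "cs-host": "destination.domain", "sc-status": "http.response.status_code"},
}

def normalize(source, raw):
    mapping = FIELD_MAP.get(source, {})
    return {mapping.get(field, field): value for field, value in raw.items()}

print(normalize("firewall_x", {"srcip": "10.0.0.5", "dstip": "93.184.216.34", "act": "deny"}))
print(normalize("proxy_y", {"c-ip": "10.0.0.5", "cs-host": "example.com", "sc-status": "200"}))

Once both sources are normalized, a single query on source.ip covers firewall and proxy logs alike, which is exactly the cross-source correlation the paragraph above is missing.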

This gives us a nice segue into our last set of problems in a SOC: alerts.

Alerts, what the life of an analyst revolves around:

The alert engine is an important component of a SIEM; all the things we discussed above are ultimately done to create alerts that catch evil/suspicious behavior in an organization. As you can imagine, this is one of the places where an analyst spends a lot of time in a SIEM. An analyst is normally hired, appraised, and even fired based on the number of alerts they triage. Like logs, alerts can be generated from various places, and some of the most common ones are Intrusion Detection/Prevention Systems, firewalls, and Anti-Virus (AV) engines. Though you will get some information from the point products, it does not give an analyst the full picture of what happened. You may be thinking, “well, I have an Endpoint Detection and Response (EDR) product, so I get everything from it.” You are right to a certain extent, but you still only see data from the systems the EDR agent is deployed on (we will discuss the quirks of EDR some other time).

Back to alerts: point-product-generated alerts do not have context. What is it triggering on, why is it triggering, is it relevant to our environment or not, who is the user, and so on. They are point products; they only alert on what they see. If done effectively, we can trim the noise and enrich the alerts with information about the rule the alert is triggering on, the reason for the trigger, and the logs from other data sources. Let’s first see what issues an analyst faces with point products and raw alerts.

But first, we need to understand that there are so many products generating alerts that the analyst is forced to understand and know all of these products. This is a different problem from logs being generated by different products. To triage an alert, you need to understand basic things like what it is, which device generated it, why it was generated, and in what scenarios you will see this alert trigger. That is a good and a bad thing at the same time: good because you get to learn a lot of new technologies, bad because you may not be able to triage the alert, and even worse, sometimes you will fear even attempting alerts from those systems. Now combine this with a lack of context around the alert.
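To make “context around the alert” less abstract, here is a small sketch that attaches rule metadata to a raw point-product alert before it reaches the analyst queue. The rule IDs, descriptions, and field names are made up for illustration.

# Attach rule metadata to a raw alert so the analyst can see why it fired.
RULE_CATALOGUE = {
    "IDS-1001": {
        "description": "HTTP request to a bare IP (possible second-stage download)",
        "triage_notes": "Check proxy logs for the same destination and user.",
    },
}

def add_rule_context(alert):
    alert["rule_context"] = RULE_CATALOGUE.get(
        alert.get("rule_id"),
        {"description": "unknown rule - needs documentation"},
    )
    return alert

print(add_rule_context({"rule_id": "IDS-1001", "src_ip": "10.0.0.5"}))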

If you are/were an analyst, the first thing you will look for is the reason the alert was triggered. If there is a rule associated with the alert, then it is easy to understand why we saw this alert, and it even gives you a chance to disable/re-write the rule (like the cases where the alert is triggered for Linux systems, but you are a Windows shop). The next thing is the alert context. Let’s take an example: it is good to look for HTTP requests that go to plain IPs and not to an FQDN. The reason behind this is that some malware likes to request an IP directly to download a second-stage payload, reach command and control, and such. Your IDS will trigger such an alert. Pause for a second and think whether it is a good rule or not. If you think about it, the rule is good; it is not perfect, but it helps us catch evil. However, you will probably drown in false positives because internet scanners will scan your organization’s IP block for many different reasons. Here, the rule is good, but the context in which we apply the rule must be changed. For example, we might want to re-write the rule to trigger only when the traffic is from internal to external instead of in both directions.
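As a rough illustration of that re-write, the sketch below applies the same “HTTP request to a bare IP” logic only to internal-to-external traffic. The field names and the internal address ranges are assumptions, not taken from any particular IDS.

import ipaddress
import re

# Assumed internal ranges; adjust to your environment.
INTERNAL_NETS = [ipaddress.ip_network("10.0.0.0/8"), ipaddress.ip_network("192.168.0.0/16")]
# Loose check for a bare IPv4 in the HTTP Host field.
BARE_IP_HOST = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def is_internal(ip):
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETS)

def should_alert(event):
    # Fire only when an internal host requests a bare IP on the outside.
    return (bool(BARE_IP_HOST.match(event.get("http_host", "")))
            and is_internal(event["src_ip"])
            and not is_internal(event["dst_ip"]))

print(should_alert({"src_ip": "10.0.0.5", "dst_ip": "203.0.113.7", "http_host": "203.0.113.7"}))  # True
print(should_alert({"src_ip": "198.51.100.9", "dst_ip": "10.0.0.5", "http_host": "10.0.0.5"}))    # False: inbound scanner noise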

The final and most important one is not knowing “what am I protecting”. If the analyst does not know what they are protecting, an alert for a Point-of-Sale system ends up being given the same severity as one for a Domain Controller. If there is no enrichment on the logs or alerts, and most importantly if there is no documentation around it, it is very difficult for an analyst to progress in the triage process.
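As a hedged sketch of how knowing what you protect can feed back into triage, the snippet below bumps alert priority based on an asset inventory. The hostnames, criticality labels, and priority scale are all made-up assumptions.

# Adjust alert priority based on the criticality of the affected asset
# (lower number = more urgent). Inventory values are illustrative.
ASSET_CRITICALITY = {
    "dc01.corp.example": "critical",
    "pos-07.store.example": "medium",
}
PRIORITY_BUMP = {"critical": 2, "high": 1, "medium": 0, "low": 0}

def adjust_priority(alert):
    criticality = ASSET_CRITICALITY.get(alert.get("host"), "low")
    alert["asset_criticality"] = criticality
    alert["priority"] = max(1, alert.get("base_priority", 3) - PRIORITY_BUMP[criticality])
    return alert

print(adjust_priority({"host": "dc01.corp.example", "base_priority": 3}))    # priority 1
print(adjust_priority({"host": "pos-07.store.example", "base_priority": 3})) # priority 3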

By not doing these things, or due to the above-mentioned problems, you are making the analyst work for the SIEM instead of letting the SIEM work for the analyst (sorry Justin Henderson, I stole your quote).

 

 

Thank you for reading this far; I hope you enjoyed the blog. We will conclude the series in the next blog with some important things an analyst can do to overcome the issues mentioned so far. Hope to see you there.

