Last time, we looked at how we do device-level forensics on a deception network. Today, we'll look at enterprise-wide forensics on the actual devices off the deception grid. As I explained in my previous post, device-level enterprise-wide forensics on a deception grid using an external tool usually is not feasible because the decoy sink-holes anything trying to scan it. Thus, the forensic system cannot embed an agent, and even if it could—quite unlikely—it could not return results. For these reasons, the deception network usually handles the device-level forensic chores.
My tool of choice for this blog entry is Infocyte, a next generation enterprise-grade forensics tool. There are several other tools that do roughly the same thing as Infocyte, but I am comfortable with it and have used it in my lab for several years. Of course, you might have other preferences, and for the most part this posting could have been done with them.
I said that Infocyte is a next generation tool, and that means that it uses some form of artificial intelligence. The tool is largely file-based, which means that in many cases it is looking for malware. It approaches that problem using supervised machine learning. That means that it slurps data from some source, such as a data lake or similar, and creates a labeled training dataset. It creates the dataset using the developer's algorithms to set the rules. It then uses other developer algorithms to apply those rules to unknown data.
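To make the idea of supervised learning concrete, here is a minimal sketch in Python. It is not Infocyte's algorithm, which I have not seen; it is a toy nearest-centroid classifier, and the two features (file entropy and a scaled import count), the sample values, and the labels are all made up for illustration.

```python
from statistics import mean

def train(samples):
    """Compute one centroid per label from labeled feature vectors."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }

def classify(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))

# Hypothetical training set: (entropy, imported functions / 100) plus a label.
training = [
    ((7.8, 0.1), "malicious"),
    ((7.5, 0.2), "malicious"),
    ((5.1, 1.5), "benign"),
    ((4.8, 1.2), "benign"),
]

model = train(training)
verdict = classify(model, (7.6, 0.3))  # an "unknown" file's features
print(verdict)
```

The training step is where the labeled data sets the rules, and the classify step is where those rules are applied to a file the model has never seen. A production tool would use far richer features and a far more capable model.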
Although Infocyte is file-based, it would be wrong to think of it as little more than a fancied-up anti-malware tool. The fact is that many malicious events besides malware can manifest as files, particularly in memory. Infocyte can scan memory and produce a history of the application of any file even if it is a memory artifact, including scripts of various types. In addition, it can show communications connections, so it is a complete active forensics tool with ability to do both live and dead-box forensics across the enterprise.
The tool uses embedded agents in the target devices, and those can be permanently installed or temporarily installed at the time of scanning. The temporary agents are self-dissolving, so when the scan is finished they're gone until the next scan. The agents also are quite lightweight, and I have never seen a performance hit during a scan. So, on to the forensics.
First, tell Infocyte to discover. It will go out on the network and will discover all actual devices. You can tell it to look at the decoys, but it won't return anything. Once it has discovered the devices, you can tell it to analyze. One useful feature is that you can break the network up into groups and scan a group at a time. I like this for servers that are particularly sensitive to attack either by the device's criticality or its external exposure (such as a web server). I can run a forensic scan more frequently to accommodate the level of sensitivity involved.
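The group-at-a-time idea can be sketched as a simple schedule. This is my own illustration in Python, not an Infocyte feature or API; the group names and scan intervals are hypothetical and you would choose your own based on criticality and exposure.

```python
from datetime import datetime, timedelta

# Hypothetical groups and intervals: more sensitive groups are scanned more often.
SCAN_INTERVALS = {
    "critical-servers": timedelta(hours=6),
    "internet-facing": timedelta(hours=12),
    "workstations": timedelta(days=7),
}

def groups_due(last_scanned, now):
    """Return the groups whose interval has elapsed or that were never scanned."""
    due = []
    for group, interval in SCAN_INTERVALS.items():
        last = last_scanned.get(group)
        if last is None or now - last >= interval:
            due.append(group)
    return due

now = datetime(2020, 1, 8, 12, 0)
last_scanned = {
    "critical-servers": datetime(2020, 1, 8, 5, 0),  # 7 hours ago
    "internet-facing": datetime(2020, 1, 8, 6, 0),   # 6 hours ago
    # "workstations" has no entry, so it has never been scanned
}

due = groups_due(last_scanned, now)
print(due)
```

With these sample times, the critical servers and the never-scanned workstations come due, while the internet-facing group waits for its 12-hour interval.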
Knowing that there is a possible compromise is nowhere near enough information; you need to know the details. As I told my students until they were sick of hearing it, you need to know the questions before you can get to the answers. That's where the forensic scan comes into the picture.
The questions have to do with such things as possible malicious files, scripts, or artifacts. The answers begin to be revealed by looking at the detailed sequences of events that align with the artifact. One of the most important questions is, where did the artifact communicate? We also want to know what the artifact may have hooked. The answers to both of these questions may suggest lateral movement across the enterprise.
Let's take an example scan over a single computer and see what I mean.
First, I've picked a single computer, but it could just as easily be a group or the whole enterprise. Picking a single computer lets us see details without confusion because the tool shows everything that it sees until you settle on a particular host of interest. In most practical applications, you won't pick a single target unless you have some special interest in the host, and I generally don't recommend single-host scans because it is useful to see how hosts interact with each other. For our purposes here, though, it is fine.
Remember that forensics—even live forensics—is a snapshot in time. That means that the agent will report what it sees in the scan at the moment of the scan. That's not bad, though, since we have artifacts and communications shown. The artifacts will take the form of executables that have run, hooks, communications, and memory, again, at the instant of the snapshot. What you will not see—that you do in typical dead box forensics—are all of the files on the disk if they have not been involved in some sort of activity.
My first step usually is a look at communications. That shows me what has been talking to what, both internally and externally. My next step is memory activity. I am looking specifically for signs that a script or fileless malware is present. If the malware destroys itself, things get a bit dicey. The malware may have left an artifact in memory, but if you don't know what you're looking for, finding it is a daunting task. In that case, if I suspect fileless malware (because of symptoms or behavior of the computer), a little research is in order to figure out what its indicators of compromise might be.
The next step is examining the processes that have run. This, in conjunction with communications and memory, can lead me to trace the activity associated with my findings. Processes also can lead you to executions that, in the normal operation of the computer, don't make sense. However, remember that there are many legitimate services that may seem unfamiliar. These may be Microsoft services or something started by an application.
When I see something that doesn't ring a bell with me, I simply Google it and see what others have found. Don't forget that a Trojan may masquerade as a benign file. Look at what it does, not just what it is. A good example is svchost. Service Host itself usually is not a Trojan, but because its job is to run services implemented in DLLs, a perfectly legitimate svchost might be faithfully executing a malicious DLL.
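One quick first-pass check for masquerading is to compare where a familiar-looking process is running from with where it normally lives. The sketch below is my own illustration, not something Infocyte exposes. It assumes a default Windows installation, where svchost.exe normally resides in C:\Windows\System32 (or SysWOW64); the table of expected locations is an assumption you would extend yourself.

```python
from pathlib import PureWindowsPath

# Assumed normal locations on a default Windows install; extend as needed.
EXPECTED_DIRS = {
    "svchost.exe": (
        PureWindowsPath(r"C:\Windows\System32"),
        PureWindowsPath(r"C:\Windows\SysWOW64"),
    ),
}

def looks_masqueraded(image_path):
    """Flag a well-known process name that runs from an unexpected folder."""
    path = PureWindowsPath(image_path)
    expected = EXPECTED_DIRS.get(path.name.lower())
    if expected is None:
        return False  # not a name we track, so this check has no opinion
    # PureWindowsPath comparisons are case-insensitive, like Windows itself.
    return not any(path.parent == folder for folder in expected)

print(looks_masqueraded(r"C:\Users\Public\svchost.exe"))
print(looks_masqueraded(r"c:\windows\system32\svchost.exe"))
```

A clean result here only says the name and location agree. It says nothing about which DLL the service host is running, so it does not replace looking at what the process actually does.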
Starting with communications, we select connections. As you can see from Figure 1, drstephenson-pc had no connections. But another computer, dc01.centerd1.local (shown in Figure 2) does have some. This is somewhat concerning on its surface since drstephenson-pc is a workstation and dc01 is a domain controller. We would hope that the workstation is clean, but we really hope that the domain controller has only legitimate connections.
Figure 1: drstephenson-pc - No connections
Figure 2: 7 connections showing on the domain controller
Notice that all but one of the communications are with Infocyte. That makes sense, so we'll not be alarmed at that. The last connection is with Microsoft's Active Directory Web Services. That is legitimate and, unless we see it acting maliciously, we'll leave it alone. So, we can conclude that connections on both computers are okay.
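The reasoning in that review, set aside what you can explain and look hard at whatever is left, can be sketched as a simple filter. The record layout, the process names, and the addresses below are hypothetical stand-ins, not Infocyte's export format or the actual entries in Figure 2.

```python
# Hypothetical names for processes we expect to see talking on this host.
KNOWN_GOOD = {
    "scan-agent.exe",          # stand-in name for the forensic tool's agent
    "adws-service.exe",        # stand-in name for Active Directory Web Services
}

def unexplained(connections):
    """Return only the connections whose owning process is not on the known-good list."""
    return [c for c in connections if c["process"].lower() not in KNOWN_GOOD]

# Made-up sample records for illustration only.
connections = [
    {"process": "scan-agent.exe", "remote": "10.0.0.5:443"},
    {"process": "ADWS-Service.exe", "remote": "10.0.0.9:9389"},
    {"process": "updater32.exe", "remote": "203.0.113.7:8080"},
]

leftovers = unexplained(connections)
for c in leftovers:
    print(c["process"], c["remote"])
```

An empty result corresponds to the conclusion above that the connections are okay. Anything left over is where I would start digging, keeping in mind that a known-good name alone does not prove a process is behaving.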
Now, we'll take a look at memory. This can be confusing because it is not neatly presented like a file folder. However, the deception network tools help, and Infocyte is equally intuitive for actual devices. Reviewing both devices, we find no interesting memory activity, so we're on to processes. drstephenson-pc is empty, and now I'm beginning to wonder a bit. I know that the device was running and was scanned, but I see no results. Conversely, I see one process that was running during the scan on the domain controller: armsvc.exe. This is a file that belongs to Acrobat, and since Acrobat is running, that is of little concern to me.
I am a bit curious about why drstephenson-pc seems dead even though I know it's not. When that happens, I look back in time, either 30 or 90 days, to see if something was going on. If I find results, I suspect that something went wrong with the current scan—perhaps blocked by the target host—or my anti-malware tool cleaned something up.
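Looking back in time amounts to widening the window of records you are willing to consider. Here is a minimal sketch of that idea in Python. The record fields and sample data are hypothetical; in practice the tool does this for you when you pick a 30- or 90-day view.

```python
from datetime import datetime, timedelta

def within_lookback(records, now, days):
    """Keep records observed in the last `days` days, up to and including `now`."""
    cutoff = now - timedelta(days=days)
    return [r for r in records if cutoff <= r["seen"] <= now]

now = datetime(2020, 3, 1, 9, 0)

# Made-up observations for illustration only.
records = [
    {"host": "ws-01", "process": "example-a.exe", "seen": datetime(2020, 2, 20, 14, 0)},
    {"host": "ws-01", "process": "example-b.exe", "seen": datetime(2019, 12, 15, 8, 0)},
    {"host": "ws-01", "process": "example-c.exe", "seen": datetime(2019, 10, 1, 8, 0)},
]

last_30 = within_lookback(records, now, 30)
last_90 = within_lookback(records, now, 90)
print(len(last_30), len(last_90))
```

If the current scan is empty but the wider windows are not, that supports the suspicion that the scan was blocked or that something was cleaned up in the meantime.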
In this case, I'll go back 30 days. I see nothing under connections, so I'm on to processes. Here we see a lot of activity, virtually all of it marked "suspicious"—all, that is, but the first one: tor. I have truncated the list in Figure 3.
Figure 3: Partial list of processes running on drstephenson-pc at time of scan
I am not too concerned about tor as I think that was me, but I'll check the activity anyway, just to be sure. According to the Infocyte activity log, it was me:
What we've learned in this exercise is that there has been some activity on drstephenson-pc but it goes back a few weeks. If I had found something significant—or that appeared significant—I would have dug deeper, perhaps resorting to traditional dead box forensics on the suspect device. However, with enterprise-grade tools such as Infocyte, you can screen an entire network in hours (or, at most, days) instead of months taking devices one at a time. I have used Infocyte to screen a network with about 1,500 user devices and about 700 servers in under eight hours. When I have analyzed devices manually, I was restricted from the perspective of practicality to those that have exhibited other suspicious symptoms. Between imaging, consuming the image, and analysis, I easily could spend a couple of weeks on a single large server.
I think that we've beaten this horse into submission, so next time I'll move on to something new. While sticking with the theme of next generation attacks and defenses, I'll start digging into the events of the day (or week or month) and how they impact or are impacted by the next generation of cybersecurity. We'll look at tools, their impacts, the latest adversaries and attacks, and how we can interact safely with risks to our enterprise that come in the form of AI, ML, or neural networks and their ilk.