Before VirusTotal released Yara, an open source and cross-platform tool, there were very few ways for independent security researchers to identify and detect new malware themselves. Billed as “the pattern matching swiss knife for malware researchers”, Yara takes a different approach to identifying and locating new threats. In most cases, threat identification is hash based. Once a malicious binary has been identified, it is usually classified by hashing the file and storing the resulting hash. Users of a given virus database then have to hash their own potentially harmful files and compare the results against the known signatures. One major drawback is that the malware can be changed slightly and the tracked signature will no longer match.
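The drawback of hash-based detection can be illustrated with a short Python sketch. The byte strings below are made-up stand-ins for a real sample (an assumption for illustration only), and the substring check is a simplified stand-in for what a pattern-based tool does, not Yara itself.

```python
import hashlib

# Hypothetical stand-in for a malicious binary; real samples are far larger.
original = b"\x4d\x5a\x90\x00 payload: michael rinderle \x00\x00"
# The same sample after its author changed a single trailing byte.
variant = original[:-1] + b"\x01"


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest a hash-based database would store."""
    return hashlib.sha256(data).hexdigest()


# Hash-based detection: only the exact original is in the database.
known_hashes = {sha256_hex(original)}
hash_detects_original = sha256_hex(original) in known_hashes
hash_detects_variant = sha256_hex(variant) in known_hashes

# Pattern-based detection: look for a byte sequence inside the file.
pattern = b"michael rinderle"
pattern_detects_original = pattern in original
pattern_detects_variant = pattern in variant

print(hash_detects_original, hash_detects_variant)        # True False
print(pattern_detects_original, pattern_detects_variant)  # True True
```

A one-byte change is enough to defeat the hash lookup, while the pattern search still finds the modified sample.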
Yara takes a rule-based approach to detecting bad code. Instead of identifying known malware by file hashes, it searches the target file using rules written by malware researchers. These rules can look inside the executable for known strings, methods or other logic. Here is a sample rule.
// michael_rule.yara
rule michael_rule
{
    meta:
        description = "this is a yara rule sample"
        author = "michael rinderle"
    strings:
        $a = "michael rinderle" ascii nocase
        $b = { 6d 69 63 68 61 65 6c 20 72 69 6e 64 65 72 6c 65 }
    condition:
        $a or $b
}
If you come from a programming background, you can see that the Yara rule structure resembles the C language. A rule has three major sections, of which only the condition is mandatory.
- Meta
- Strings
- Condition
The meta section holds information that helps the analyst develop and identify rules. This can include the rule description, author, revision, or other information that describes the rule.
The rest deals with how Yara matches the rule against a file. The strings section defines the specific items to look for in the executable or file. You can look for text strings and hex byte sequences. Yara also provides a few modifier keywords, such as ascii and nocase, to look for this information in different ways.
In this rule, $a looks for the string “michael rinderle” in ASCII format with no case sensitivity, so “Michael Rinderle” matches as well. The secondary string, $b, looks for the same name as a sequence of hex bytes, which matches only the lowercase spelling. The last part of the rule is the condition. This rule triggers if the name is found in either of these formats.
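To make the matching logic concrete, here is a rough Python emulation of what the sample rule checks. It is only a sketch of the two strings and the condition, not how Yara is implemented; the names STRING_A, STRING_B and rule_matches are invented for this illustration.

```python
import re

# $a = "michael rinderle" ascii nocase
STRING_A = re.compile(rb"michael rinderle", re.IGNORECASE)
# $b = { 6d 69 63 68 61 65 6c 20 72 69 6e 64 65 72 6c 65 }
STRING_B = bytes.fromhex("6d 69 63 68 61 65 6c 20 72 69 6e 64 65 72 6c 65")


def rule_matches(data: bytes) -> bool:
    """Emulate the condition `$a or $b` on a byte buffer."""
    found_a = STRING_A.search(data) is not None
    found_b = STRING_B in data
    return found_a or found_b


print(STRING_B)                                      # b'michael rinderle'
print(rule_matches(b"Hello from Michael Rinderle"))  # True, via $a only
print(STRING_B in b"Hello from Michael Rinderle")    # False, $b is lowercase
print(rule_matches(b"nothing to see here"))          # False
```

The hex bytes in $b spell the lowercase name, so a capitalized occurrence is caught only by the case-insensitive $a. Because the condition is an or, one hit is enough.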
Continue on to learn how to install Yara and test a file against a ruleset.
Installing on Linux
Before you can build Yara on a Linux machine, a few packages need to be installed first. Use your distribution’s package manager to download and install the dependencies for building and using Yara. The package names below are the Debian and Ubuntu ones; dnf and pacman use different names for some of them, and pacman installs with -S rather than install.
sudo apt-get install automake libtool make gcc pkg-config libssl-dev flex bison libjansson-dev libmagic-dev
Go ahead and download Yara from the repository.
Then extract the archive and enter the directory to run the bootstrap.sh script.
tar -zxf yara-4.0.0.tar.gz
cd yara-4.0.0
./bootstrap.sh
Before we can build Yara, we want to enable a few modules that are not compiled by default: the cuckoo module for integration with the Cuckoo Sandbox, the magic module for identifying file types, and the dotnet module for better .NET file analysis. Do this with the configure command.
./configure --enable-cuckoo --enable-magic --enable-dotnet
Lastly, make and install Yara.
make
sudo make install
Installing on Windows
You can easily install Yara on Windows with Chocolatey. To learn more about installing Chocolatey for Windows package management, read the following article. Open an elevated shell and use the following command.
choco install yara
Installing on OSX
OSX installation is made easy with the Homebrew package manager.
brew install yara
Getting Yara Rules
Instead of writing all of your own rules, you can obtain ready-made rule sets from many sources. Why reinvent the wheel when there are many trusted providers already offering their rules? The following is a collection of popular Yara rulesets created by industry leaders.
- AlienVault Labs Rules
- AppleOSX
- Burp Yara Rules
- GoDaddy ProcFilter Rules
- McAfee Advanced Threat Research IOCs
- Sophos AI YaraML Rules
- Tenable Rules
Running Yara against potential threats
With Yara installed and armed with some rules, it’s finally time to test a binary against the sample rule from the beginning of this article, which looks for my name in an executable.
Let’s create a simple C program that outputs the target string.
// michael.c
#include <stdio.h>

int main(void) {
    /* The string literal ends up in the compiled binary, where Yara can find it. */
    printf("Michael Rinderle\n");
    return 0;
}
Now we can compile the C code with gcc.
gcc michael.c -o michael.exe
With our test executable compiled, we can test the sample rule against it.
yara michael_rule.yara michael.exe
The result is what we expected. Yara found the string passed to the printf function, which is stored in the compiled binary, and triggered our rule. Since the rule looks for the case-insensitive string “michael rinderle”, it detected “Michael Rinderle” in the executable.
Note that I installed Yara with Chocolatey, so my Yara binary is named yara64.
Conclusion
If you are looking for a robust solution to malicious code detection, Yara may be for you. Scanning millions of files by hand from the command line does not scale. Fortunately, there are many ways to automate it. Yara has a Python module, yara-python, that lets you write scripts to automate scanning many files against many rules. It can be as easy as recursively collecting all of your Yara rule files into a list and then iterating over the target files, whether they are on disk or traveling the network. There are also extensions that allow it to be integrated with the Zeek intrusion detection system. Lastly, Yara is even more powerful than demonstrated here, as it can also scan other types of files besides executables, so you can even analyze the macros in a Word document for malicious activity. All of this combined makes Yara a game changer, so it is no wonder that major IT security outfits are using it these days.
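As a rough idea of what such a Python script could look like, here is a minimal sketch. It assumes the third-party yara-python package is installed (pip install yara-python); the function names collect_rule_files and scan_tree and the .yar/.yara file extensions are choices made for this example, not something Yara requires.

```python
from pathlib import Path


def collect_rule_files(rules_dir):
    """Map a namespace to each .yar/.yara file found under rules_dir."""
    root = Path(rules_dir)
    rule_files = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in {".yar", ".yara"}:
            # Use the relative path as the namespace so that rule files
            # with the same name in different folders do not collide.
            rule_files[path.relative_to(root).as_posix()] = str(path)
    return rule_files


def scan_tree(rules_dir, target_dir):
    """Compile every rule file once, then scan each file under target_dir."""
    import yara  # third-party module provided by yara-python

    rules = yara.compile(filepaths=collect_rule_files(rules_dir))
    hits = {}
    for path in Path(target_dir).rglob("*"):
        if not path.is_file():
            continue
        try:
            matches = rules.match(str(path))
        except yara.Error:
            continue  # skip files that cannot be read or scanned
        if matches:
            hits[str(path)] = [match.rule for match in matches]
    return hits
```

Calling scan_tree("rules", "samples") would return a dictionary that maps each matching file to the names of the rules that fired; both directory names are placeholders. The sketch uses a dictionary rather than a plain list because yara.compile accepts a mapping of namespaces to rule files, which lets every rule be compiled once and reused for all targets.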
Stay tuned for more articles on how to integrate Yara into your Python scripts and scan binaries as they traverse your network.