DISCLAIMER: Image is generated using ChatGPT.
1. Introduction
2. What is YaraFFI?
3. Installing YARA and YaraFFI
4. Basic Usage
5. Writing Simple YARA Rule
6. Scanning Memory Buffer
7. Scanning File
8. Callback Events
9. Advanced Event Types
10. Limitations
11. Example
Introduction
YARA is a battle‑tested pattern-matching engine for malware, relied on by reverse engineers, Digital Forensics and Incident Response (DFIR) analysts and modern Security Operations Center (SOC) pipelines. It lets you express malware characteristics in readable rules and scan files or memory for matches. YaraFFI brings this capability natively into Perl, with no shelling out to the yara CLI and no temporary files, by binding directly to the C library via FFI::Platypus.
What is YaraFFI?
YaraFFI is a minimal, modern Perl interface to the libyara C engine using Foreign Function Interface (FFI). Instead of relying on XS or spawning the yara CLI tool, it talks to YARA in‑process. This makes it ideal for embedding in automation, stream scanners, build pipelines and malware research scripts.
Installing YARA and YaraFFI
You need the native libyara shared library installed first.
$ sudo apt install libyara-dev yara
$ yara -v
4.5.0
Then install the Perl module YaraFFI from CPAN:
$ cpanm -vS YaraFFI
Basic Usage
Let’s start with a minimal real scan.
File: ex-1.pl
use YaraFFI;
my $rules = <<'YARA';
rule HelloWorld {
    strings:
        $a = "hello" ascii
    condition:
        $a
}
YARA
my $yara = YaraFFI->new;
$yara->compile($rules) or die "compile failed";
$yara->scan_buffer("hello hacker", sub {
    my ($event) = @_;
    print "Matched rule: $event\n";
}, emit_string_events => 0);
Output
$ perl ex-1.pl
Matched rule: HelloWorld
Writing Simple YARA Rule
In YARA a rule is divided into meta, strings and condition sections. For a practical first example we’ll detect the ASCII word "test".
my $rules = <<'YARA';
rule TestRule {
    meta:
        description = "Detect the literal string 'test'"
        author = "you@example.com"
    strings:
        $t1 = "test" ascii
    condition:
        $t1
}
YARA
Scanning Memory Buffer
scan_buffer is the most common in-process operation; it lets you scan arbitrary byte buffers (Perl scalars) directly. This is ideal for scanning network captures, unpacked payloads, or API-returned blobs without touching disk.
File: ex-2.pl
use YaraFFI;
my $rules = <<'YARA';
rule TestRule {
    meta:
        description = "Detect the literal string 'test'"
        author = "you@example.com"
    strings:
        $t1 = "test" ascii
    condition:
        $t1
}
YARA
my $yara = YaraFFI->new;
$yara->compile($rules) or die "compile failed";
my $payload = "this is a test payload";
$yara->scan_buffer($payload, sub {
    my ($event) = @_;
    print "Matched: $event\n";
    print "Event type: " . $event->{event}, "\n";
});
Output
$ perl ex-2.pl
Matched: TestRule
Event type: rule_match
Matched: TestRule
Event type: string_match
Important Practical Notes:
- Binary data & binmode: Ensure any data read from files or sockets is treated as raw bytes (Perl’s binmode or read_file(..., binmode => ':raw')). scan_buffer expects a Perl scalar containing the bytes to scan.
- Large buffers: Scanning very large buffers in one call consumes memory and may be slow. For very large inputs consider chunking and scanning each chunk with scan_buffer. Because YaraFFI (currently) does not report match offsets by default, if you need exact byte positions you must enable the experimental enable_offsets option and track chunk offsets yourself.
- Callback behaviour: The callback is invoked for rule_match and string_match events by default. The supplied event object stringifies to the rule name but also contains a hash-like structure, e.g. { event => 'rule_match', rule => 'RuleName' }.
- Concurrency: FFI::Platypus closures capture Perl state and concurrency models vary, so prefer process-level parallelism for heavy scanning workloads until you’ve tested threads in your environment.
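To illustrate the binary-data note, here is a minimal sketch in core Perl (it does not need YaraFFI) of reading a file as raw bytes before handing it to scan_buffer. The helper name slurp_raw and the sample bytes are hypothetical, introduced only for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read a whole file as raw bytes (no CRLF or encoding layers).
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "open $path: $!";
    local $/;                      # slurp mode: read up to end of file
    my $bytes = <$fh>;
    close $fh;
    return defined $bytes ? $bytes : '';
}

# Demo: write bytes that a text-mode layer could alter, then read them back.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "MZ\x90\x00\r\ntest\xff";
close $tmp_fh;

my $payload = slurp_raw($tmp_path);
printf "read %d bytes\n", length $payload;

# $payload could now be passed on: $yara->scan_buffer($payload, $callback)
```

Opening with the :raw layer keeps the byte count identical to what is on disk, which matters as soon as you start tracking offsets yourself.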
Collecting matches into an array:
File: ex-3.pl
use YaraFFI;
my $rules = <<'YARA';
rule TestRule {
    meta:
        description = "Detect the literal string 'test'"
        author = "you@example.com"
    strings:
        $t1 = "test" ascii
    condition:
        $t1
}
YARA
my $yara = YaraFFI->new;
$yara->compile($rules) or die "compile failed";
my $payload = "this is a test payload";
my @hits;
$yara->scan_buffer($payload, sub {
    my ($event) = @_;
    push @hits, $event;
});
print scalar(@hits) . " matches found.\n";
foreach my $e (@hits) {
    print "- " . $e->{rule} . " (" . $e->{event} . ")\n";
}
Output
$ perl ex-3.pl
2 matches found.
- TestRule (rule_match)
- TestRule (string_match)
Scanning File
scan_file in YaraFFI is a convenience wrapper that reads the file into memory and calls scan_buffer. For small-to-medium files this is usually the easiest option.
File: ex-4.pl
use YaraFFI;
die "Usage: $0 <file>\n" unless @ARGV == 1;
my $path = $ARGV[0];
my $rules = <<'YARA';
rule TestRule {
    meta:
        description = "Detect the literal string 'test'"
        author = "you@example.com"
    strings:
        $t1 = "test" ascii
    condition:
        $t1
}
YARA
my $yara = YaraFFI->new;
$yara->compile($rules) or die "compile failed";
$yara->scan_file($path, sub {
    my ($event) = @_;
    print "[event=$event->{event}] rule=$event->{rule}\n";
});
Let’s first create a demo file, malicious.bin: 64 KiB of random bytes with the string "test" written at byte offset 16384.
$ dd if=/dev/urandom of=malicious.bin bs=1K count=64 2>/dev/null
$ printf 'test' | dd of=malicious.bin bs=1 seek=16384 conv=notrunc 2>/dev/null
Output
$ perl ex-4.pl malicious.bin
[event=rule_match] rule=TestRule
[event=string_match] rule=TestRule
Practical considerations:
- Large files: scan_file slurps the whole file. For very large files read the file in chunks and call scan_buffer per chunk while tracking chunk offsets externally.
- Binary mode: scan_file uses binary read (:raw). If you implement your own reader, always open files with binmode on Windows to avoid CRLF conversions.
- Directory scanning: To scan many files, use File::Find or Path::Tiny to iterate and call scan_file for each regular file.
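The directory-scanning note can be sketched with the core File::Find module. The helper each_regular_file is a hypothetical name used for illustration; the demo collects paths from a temporary directory instead of scanning them, so it runs without libyara installed.

```perl
use strict;
use warnings;
use File::Find qw(find);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper: call $on_file->($path) for every regular file under $dir.
sub each_regular_file {
    my ($dir, $on_file) = @_;
    find({
        no_chdir => 1,    # do not chdir, so $File::Find::name stays a usable path
        wanted   => sub {
            return unless -f $File::Find::name;   # skip directories, sockets, etc.
            $on_file->($File::Find::name);
        },
    }, $dir);
    return;
}

# Demo tree: two files at the top level and one in a subdirectory.
my $root = tempdir(CLEANUP => 1);
make_path("$root/sub");
for my $name ('a.bin', 'b.bin', 'sub/c.bin') {
    open my $fh, '>:raw', "$root/$name" or die "create $name: $!";
    print {$fh} "test";
    close $fh;
}

my @paths;
each_regular_file($root, sub { push @paths, $_[0] });
print scalar(@paths), " regular file(s) found\n";

# With a compiled $yara object, the per-file action would instead be:
#   each_regular_file($dir, sub { $yara->scan_file($_[0], $callback) });
```

Passing the per-file action as a code reference keeps the directory walk independent of the scanner, which makes it easy to add filters such as a size limit before calling scan_file.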
Chunked scanning pattern:
File: ex-5.pl
use YaraFFI;
die "Usage: $0 <file>\n" unless @ARGV == 1;
my $path = $ARGV[0];
my $rules = <<'YARA';
rule TestRule {
    meta:
        description = "Detect the literal string 'test'"
        author = "you@example.com"
    strings:
        $t1 = "test" ascii
    condition:
        $t1
}
YARA
my $yara = YaraFFI->new;
$yara->compile($rules) or die "compile failed";
open my $fh, '<:raw', $path or die "open $path: $!";
my $chunk_size = 1024;
my $offset = 0;
my $overlap = 4096;
my $carry = '';
while (1) {
    my $buf;
    my $read = read($fh, $buf, $chunk_size);
    last unless $read;
    my $to_scan = $carry . $buf;
    my $chunk_start = $offset - length($carry);
    $yara->scan_buffer($to_scan, sub {
        my ($event) = @_;
        print "[event=$event->{event}] rule=$event->{rule} (chunk start: $chunk_start)\n";
    }, emit_string_events => 1);
    if (length($to_scan) > $overlap) {
        $carry = substr($to_scan, -$overlap);
    } else {
        $carry = $to_scan;
    }
    $offset += $read;
}
close $fh;
Output
$ perl ex-5.pl malicious.bin
[event=rule_match] rule=TestRule (chunk start: 12288)
[event=string_match] rule=TestRule (chunk start: 12288)
[event=rule_match] rule=TestRule (chunk start: 13312)
[event=string_match] rule=TestRule (chunk start: 13312)
[event=rule_match] rule=TestRule (chunk start: 14336)
[event=string_match] rule=TestRule (chunk start: 14336)
[event=rule_match] rule=TestRule (chunk start: 15360)
[event=string_match] rule=TestRule (chunk start: 15360)
[event=rule_match] rule=TestRule (chunk start: 16384)
[event=string_match] rule=TestRule (chunk start: 16384)
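The single occurrence at offset 16384 is reported five times above because the 4096-byte overlap is larger than the 1024-byte chunk, so the same bytes are rescanned in five consecutive windows. The sketch below illustrates the window arithmetic in core Perl only. As an assumption for illustration, it uses index() as a stand-in for the YARA scan (it is not part of YaraFFI), keeps an overlap of pattern length minus one, and reports absolute offsets. The name chunked_find is hypothetical.

```perl
use strict;
use warnings;

# Stand-in scanner: return every absolute offset of $pattern in $data,
# reading $data in $chunk_size pieces and carrying a small overlap forward.
sub chunked_find {
    my ($data, $pattern, $chunk_size) = @_;
    my $overlap = length($pattern) - 1;   # just enough to catch boundary-straddling hits
    my ($carry, $offset) = ('', 0);
    my @hits;
    while ($offset < length $data) {
        my $buf       = substr($data, $offset, $chunk_size);
        my $window    = $carry . $buf;
        my $win_start = $offset - length $carry;   # absolute offset of window byte 0
        my $pos = 0;
        while (($pos = index($window, $pattern, $pos)) >= 0) {
            push @hits, $win_start + $pos;
            $pos++;
        }
        # Keep only the tail of the window that could start a straddling match.
        $carry = $overlap == 0               ? ''
               : length($window) > $overlap  ? substr($window, -$overlap)
               :                               $window;
        $offset += length $buf;
    }
    return @hits;
}

# Demo: one hit straddling the 1024-byte boundary, one at offset 16384.
my $data = 'x' x 65536;
substr($data, 1022, 4)  = 'test';
substr($data, 16384, 4) = 'test';

my @offsets = chunked_find($data, 'test', 1024);
print "hit at offset $_\n" for @offsets;
```

Because the carried tail is shorter than the pattern, no complete match can sit entirely inside it, so each occurrence is reported once. With real rules the overlap has to cover the longest string a rule can match, which is not always known in advance.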
Callback Events
YaraFFI exposes matches to your Perl code via a small, friendly event object (the YaraFFI::Event class). The object is blessed but intentionally minimal: it stringifies to the rule name, so simple test scripts can interpolate $event and get readable output. It also behaves like a hashref for more detailed inspection in tests or tooling.
Typical events you’ll observe with this minimal binding:
- rule_match: indicates a rule matched. The object has at least { event => 'rule_match', rule => 'RuleName' }.
- string_match: a lightweight stand-in for when a string inside a rule matched; it carries { event => 'string_match', rule => 'RuleName', string_id => '$...' }.
The goal of YaraFFI is simplicity and predictability. Instead of exposing the full complex YR_RULE struct and offsets (which differ across libyara versions), YaraFFI maps the most useful information into a stable Perl object you can inspect or stringify.
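One way to consume these events is a dispatch table keyed by event type. The sketch below is a hedged illustration: it feeds the callback plain hashrefs shaped like the events described above rather than real YaraFFI::Event objects, so it runs without libyara. The %handlers and %tally names are hypothetical.

```perl
use strict;
use warnings;

# Per-rule counters filled in by the handlers below.
my %tally;

# Dispatch table keyed by event type; unknown types fall through to 'default'.
my %handlers = (
    rule_match   => sub { $tally{ $_[0]{rule} }{rules}++ },
    string_match => sub { $tally{ $_[0]{rule} }{strings}++ },
    default      => sub { },   # ignore event types we do not handle
);

# The kind of callback you would pass to scan_buffer or scan_file.
my $callback = sub {
    my ($event) = @_;
    my $handler = $handlers{ $event->{event} } || $handlers{default};
    $handler->($event);
};

# Stand-in events (plain hashrefs) in place of a real scan.
$callback->($_) for (
    { event => 'rule_match',   rule => 'TestRule' },
    { event => 'string_match', rule => 'TestRule', string_id => '$t1' },
    { event => 'scan_finished' },
);

for my $rule (sort keys %tally) {
    printf "%s: %d rule match(es), %d string match(es)\n",
        $rule, $tally{$rule}{rules} // 0, $tally{$rule}{strings} // 0;
}
```

A table like this keeps the callback short and makes it cheap to add a handler when you enable another event type.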
Advanced Event Types
YaraFFI now supports additional event types beyond the basic rule_match and string_match events. These advanced events provide more detailed scanning information and can be enabled on demand.
Available Event Types
rule_not_match
Emitted when a rule does not match the scanned data. This is useful for understanding which rules were evaluated but did not trigger.
$yara->scan_buffer($data, sub {
    my ($event) = @_;
    if ($event->{event} eq 'rule_not_match') {
        print "Rule $event->{rule} did not match\n";
    }
}, emit_not_match_events => 1);
import_module
Emitted when a YARA module is imported during rule compilation or scanning.
$yara->scan_buffer($data, sub {
    my ($event) = @_;
    if ($event->{event} eq 'import_module') {
        print "Module imported: $event->{module_name}\n";
    }
}, emit_import_events => 1);
scan_finished
Emitted when the scanning operation completes. This event is always the last one emitted and can be used to trigger post-scan actions.
$yara->scan_buffer($data, sub {
    my ($event) = @_;
    if ($event->{event} eq 'scan_finished') {
        print "Scanning completed\n";
    }
}, emit_finished_events => 1);
Event Configuration Options
All advanced event types are disabled by default to maintain backward compatibility. You can enable them individually:
$yara->scan_buffer($data, $callback,
    emit_string_events    => 1,  # default: 1
    emit_not_match_events => 1,  # default: 0
    emit_import_events    => 1,  # default: 0
    emit_finished_events  => 1,  # default: 0
);
Complete Example with All Event Types
File: ex-6.pl
use YaraFFI;
my $rules = <<'YARA';
rule MatchingRule {
    strings:
        $s = "malware"
    condition:
        $s
}
rule NonMatchingRule {
    strings:
        $x = "benign"
    condition:
        $x
}
YARA
my $yara = YaraFFI->new;
$yara->compile($rules) or die "compile failed";
my $data = "This contains malware signature";
$yara->scan_buffer($data, sub {
        my ($event) = @_;
        if ($event->{event} eq 'rule_match') {
            print "[MATCH] Rule: $event->{rule}\n";
        }
        elsif ($event->{event} eq 'rule_not_match') {
            print "[NO MATCH] Rule: $event->{rule}\n";
        }
        elsif ($event->{event} eq 'string_match') {
            print "[STRING] Rule: $event->{rule}, String: $event->{string_id}\n";
        }
        elsif ($event->{event} eq 'import_module') {
            print "[IMPORT] Module: $event->{module_name}\n";
        }
        elsif ($event->{event} eq 'scan_finished') {
            print "[FINISHED] Scan completed\n";
        }
    },
    emit_not_match_events => 1,
    emit_import_events    => 1,
    emit_finished_events  => 1
);
Output
$ perl ex-6.pl
[MATCH] Rule: MatchingRule
[STRING] Rule: MatchingRule, String: $
[NO MATCH] Rule: NonMatchingRule
[FINISHED] Scan completed
Event Order Guarantee
When multiple event types are enabled, events are emitted in the following order:
1. rule_match or rule_not_match (one per rule evaluated)
2. string_match (if emit_string_events is enabled, follows each rule_match)
3. import_module (if any modules are imported)
4. scan_finished (always last if enabled)
Limitations
This module is intentionally minimal. That keeps the API stable and easy to understand but it also means a number of features are not present yet. Be aware of these before you build on YaraFFI:
- Limited match offsets and metadata: by default, exact byte offsets and rule metadata are not extracted. These features are available experimentally via the enable_offsets and enable_metadata options, but they are disabled by default due to YARA version compatibility concerns.
- No modules / external variables: YARA modules (e.g. PE, ELF) and providing external variables to rules are not implemented.
- Assumes libyara ABI compatibility: the callback currently probes the YR_RULE structure at runtime to find the identifier pointer. This is fragile across very old/new YARA versions; test with your target libyara version.
- Scan flags hardcoded to 0: no mechanism is exposed yet to change YARA scan flags (e.g., SCAN_FLAGS_PROCESS_MEMORY in some deployments).
Why This Design Matters?
Embedding YARA with FFI is about speed and control. Calling the CLI in a subprocess works, but it costs process startup time and complicates embedding in long-running daemons. The small, well-defined surface area of YaraFFI keeps it practical for automation tasks like pre-commit scans, CI checks or evidence triage.
Troubleshooting
- If yr_initialize() fails, ensure libyara is installed and the shared library (libyara.so, libyara.dylib) is on your library path.
- If compile returns false, double-check your rule syntax with the yara CLI to rule out syntax errors. The CLI expects a target as well as a rules file, so scanning an empty target is enough to trigger compilation, e.g. yara myrules.yar /dev/null.
- On mismatched libyara versions you may see the warning DEBUG: Could not find valid rule name. This is the callback failing to locate the identifier field in the YR_RULE structure.
Example
File: ex-7.pl
use YaraFFI;
use File::Slurp qw(read_file);
die "Usage: $0 <file>" unless @ARGV == 1;
my ($file) = @ARGV;
my $rules = <<'YARA';
rule DemoRule {
    strings:
        $s1 = "test" ascii
    condition:
        $s1
}
YARA
my $y = YaraFFI->new;
$y->compile($rules) or die "Failed to compile rules";
my $content = read_file($file, binmode => ':raw');
my $matches = 0;
$y->scan_buffer($content, sub {
    my ($event) = @_;
    print "Match: $event\n";
    $matches++;
});
print "Total matches: $matches\n";
Output
$ perl ex-7.pl malicious.bin
Match: DemoRule
Match: DemoRule
Total matches: 2
Happy Hacking !!!