RAG with Perl

DISCLAIMER: Image is generated using `ChatGPT`.

The term RAG has bothered me for a long time. I have heard many people around me talking about it. I am not an AI enthusiast, but last June I wrote a blog post about AWS Bedrock. A friend of mine at work was working with it at that time and it piqued my interest.

I looked online for an easy definition of RAG that I could understand.

This is the best, I found out.

Imagine you are a student taking an open-book exam. Normally, if you have to answer a question, you have to rely only on what you have memorised in your head. If the information isn’t in your memory, you cannot answer the question. This is how a standard AI model works, it relies solely on the vast amount of information it “memorised” during its training. RAG (Retrieval-Augmented Generation) changes this by letting the AI use a textbook during the exam.

As the name suggests, it is a three steps process.

[R]etrieval

When you ask the AI a question, it doesn’t just guess from memory. Instead, it first searches through a specific set of documents to find the exact pages that contain the information relevant to your question.

[A]ugmentation

Once it finds those relevant passages, it gathers them up and hands them to the AI, essentially saying, “Here is some extra information I found on this topic; use this to help formulate your answer”.

[G]eneration

Finally, the AI reads those specific passages and writes a natural, accurate response based on what it just learned, rather than relying only on its own training.

In short, RAG turns an AI from someone who is just “guessing from memory” into an expert researcher who always has the latest facts right in front of them.

It stops the AI from making things up because it is forced to look at your provided facts first. Also you don’t have to retrain the AI every time you have new information. You can simply add a new document to your database, and the AI will be able to “read” it and answer questions about it immediately.

Now I am feeling better as far as RAG is concerned.

Can I see it in action?

Can I use Perl?

There are plenty of videos available on the subject but none for someone who just steps in.

I hope this post would help you understand RAG inside out.

I am using Docker this time as I want to keep my system clean and also because I love Docker.

To demo the full life cycle, I am going to use the following two models:

Llama 3.2 is a collection of open-weight artificial intelligence models developed by Meta.

nomic-embed-text is a high-performance text embedding model.

And I would also need a vector database. ChromaDB is the best for what I aim to achieve.

Time to create docker compose configuration file, docker-compose.yml like below:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    ports:
      - 11434:11434
    entrypoint: >
      sh -c "ollama serve &
             sleep 5 &&
             ollama pull llama3.2 &&
             ollama pull nomic-embed-text &&
             wait"
    volumes:
      - ollama_data:/root/.ollama

  chroma:
    image: chromadb/chroma:latest
    container_name: chroma
    ports:
      - 8000:8000
    volumes:
      - chroma_data:/chroma/chroma

volumes:
  ollama_data:
  chroma_data:

As you see, we have defined two services: ollama and chroma

Inside the ollama container, I am pulling the two models llama3.2 and nomic-embed-text

Let’s start the container now:

$ docker compose up -d

For demo purpose, I am downloading The Art of War book in PDF format.

$ curl -L -o art_of_war.pdf https://sites.ualberta.ca/~enoch/Readings/The_Art_Of_War.pdf

We need to install the package poppler-utils as we are using tool pdftotext.

$ sudo apt install -y poppler-utils

Also some CPAN modules if not already installed:

$ cpanm -vS JSON HTTP::Tiny Digest::SHA Getopt::Long Pod::Usage

Finally we create the Perl script to simulate RAG operations.

Filename: rag-engine

#!/usr/bin/env perl

use strict;
use warnings;

use JSON;
use HTTP::Tiny;
use Pod::Usage;
use Getopt::Long;
use Digest::SHA qw(sha256_hex);

my $ua   = HTTP::Tiny->new();
my $json = JSON->new->utf8->canonical;

my $OLLAMA_URL = 'http://localhost:11434';
my $CHROMA_URL = 'http://localhost:8000/api/v2/tenants/default_tenant/databases/default_database';

my ($help);
my ($collection_name, $collection_id);
my ($search, $query_text, $document, $ingest_document);
my ($list_collections, $delete_collection, $create_collection);

GetOptions(
    'document=s'        => \$document,
    'ingest'            => \$ingest_document,
    'query=s'           => \$query_text,
    'search'            => \$search,
    'collection-id=s'   => \$collection_id,
    'collection-name=s' => \$collection_name,
    'list-collections'  => \$list_collections,
    'create-collection' => \$create_collection,
    'delete-collection' => \$delete_collection,
    'help|h'            => \$help,
) or pod2usage(2);

pod2usage(1) if $help;

if ($ingest_document) {
    ingest_document($document, $collection_id);
}
elsif ($search) {
    query_collection($collection_id, $query_text);
}
elsif ($list_collections) {
    list_collections();
}
elsif ($create_collection) {
    create_collection($collection_name);
}
elsif ($delete_collection) {
    delete_collection($collection_name);
}

Ingest Document

sub ingest_document {
    my ($document, $collection_id) = @_;

    die "ERROR: Missing document.\n"      unless defined $document;
    die "ERROR: Missing collection id.\n" unless defined $collection_id;

    my $full_text = ($document =~ /\.pdf$/)
                    ? `pdftotext "$document" -`
                    : do { local $/; open my $f, '<', $document; <$f> };

    my @chunks = $full_text =~ /.{1,500}/gs;

    foreach my $chunk (@chunks) {
        my $emb = get_embedding($chunk);
        $ua->post("$CHROMA_URL/collections/$collection_id/add", {
            headers => { 'Content-Type' => 'application/json' },
            content => $json->encode({ ids        => [sha256_hex($chunk)],
                                       embeddings => [[@$emb]],
                                       documents  => [$chunk],
                                    })
        });
    }

    print "Document ingested in " . scalar(@chunks) . " chunks.\n";
}

List Collections

sub list_collections {
    my $response = $ua->get("$CHROMA_URL/collections");

    if ($response->{success}) {
        my $collections = decode_json($response->{content});

        if ($collections) {
            foreach my $collection (@$collections) {
                print "Id: ", $collection->{id}, "\n";
                print "Name:", $collection->{name}, "\n";
                print "---\n";
            }
        }
    } else {
        print "Failed to retrieve collections: $response->{status} $response->{reason}";
    }
}

Create Collection

sub create_collection {
    my ($collection_name) = @_;

    my $url = "$CHROMA_URL/collections";
    my $payload = encode_json({ name => $collection_name });

    my $response = $ua->request('POST', $url, {
        headers => { 'Content-Type' => 'application/json' },
        content => $payload,
    });

    if ($response->{success}) {
        my $data = decode_json($response->{content});
        print "Successfully created collection: $collection_name (ID: $data->{id})\n";
    } else {
        print "Failed to create collection: $response->{status} $response->{reason}";
    }
}

Delete Collection

sub delete_collection {
    my ($collection_name) = @_;

    if (!$collection_name) {
        die "Error: No collection name provided for deletion.";
    }

    my $url = "$CHROMA_URL/collections/$collection_name";
    my $response = $ua->request('DELETE', $url);

    if ($response->{success}) {
        print "Successfully deleted collection: $collection_name\n";
    } else {
        print "Failed to delete collection: $response->{status} $response->{reason}";
    }
}

Query Collection

sub query_collection {
    my ($collection_id, $query_text) = @_;

    die "ERROR: Missing collection id.\n" unless defined $collection_id;
    die "ERROR: Missing query text.\n"    unless defined $query_text;

    my $q_emb = get_embedding($query_text);
    my $res   = $ua->post("$CHROMA_URL/collections/$collection_id/query", {
        headers => { 'Content-Type' => 'application/json' },
        content => $json->encode({ query_embeddings => [[@$q_emb]],
                                   n_results        => 3,
                                   include          => ["documents"],
                                })
    });

    my $data = $json->decode($res->{content});

    if (defined $data->{documents} && @{$data->{documents}}) {
        print generate_with_llm(join("\n", @{$data->{documents}[0]}), $query_text) . "\n";
    }
    else {
        print "No relevant documents found.\n";
    }
}

sub get_embedding {
    my ($text) = @_;

    my $res = $ua->post("$OLLAMA_URL/api/embed", {
        headers => { 'Content-Type' => 'application/json' },
        content => $json->encode({ model => 'nomic-embed-text', input => $text })
    });

    if (!$res->{success}) {
        die "Embedding API failed: $res->{status} - $res->{content}\n";
    }

    my $data = $json->decode($res->{content});

    if (defined $data->{embeddings} && ref($data->{embeddings}) eq 'ARRAY') {
        return $data->{embeddings}[0];
    }
    else {
        die "Unexpected API response structure: " . $res->{content} . "\n";
    }
}

sub generate_with_llm {
    my ($context, $question) = @_;

    my $payload = {
        model  => 'llama3.2',
        prompt => "Context: $context\n\nQuestion: $question\n\nAnswer:",
        stream => JSON::false
    };

    my $res = $ua->post("$OLLAMA_URL/api/generate", {
        headers => { 'content-type' => 'application/json' },
        content => $json->encode($payload)
    });

    if (!$res->{success}) {
        die "LLM generation request failed: $res->{status} $res->{reason}";
    }

    my $data = $json->decode($res->{content});

    return $data->{response};
}

Finally some usage pod.

__END__

=head1 NAME

rag-engine - A CLI tool for RAG managing ChromaDB collections.

=head1 SYNOPSIS

./rag-engine [options]

 Options:
   --document <full path> --collection-id <id> --ingest Ingest document in collection id
   --query <text> --collection-id <id> --search         Search text in collection id
   --list-collections                                   List all collections
   --create-collection --collection-name <name>         Create a new collection name
   --delete-collection --collection-name <name>         Delete the named collection
   --help                                               Show this help message

=head1 DESCRIPTION

This tool allows you to manage vector database collections for your RAG engine.

=cut

The application looks like this:

$ ./rag-engine --help
Usage:
    ./rag-engine [options]

     Options:
       --document <full path> --collection-id <id> --ingest Ingest document in collection id
       --query <text> --collection-id <id> --search         Search text in collection id
       --list-collections                                   List all collections
       --create-collection --collection-name <name>         Create a new collection name
       --delete-collection --collection-name <name>         Delete the named collection
       --help                                               Show this help message

Time for some action now:

$ ./rag-engine --list-collections
Id: a9261eca-0106-4464-a78c-66ae20787ce9
Name:my_docs
---
Id: ec0869a5-355e-48f6-b0d2-f1e912e11469
Name:test_1
---

$ ./rag-engine --document art_of_war.pdf \
               --collection-id ec0869a5-355e-48f6-b0d2-f1e912e11469 \
               --ingest

$ ./rag-engine --query "What is the best way to win a war according to Sun Tzu?" \
               --collection-id ec0869a5-355e-48f6-b0d2-f1e912e11469 \
               --search

$ ./rag-engine --create-collection --collection-name test_1

$ ./rag-engine --delete-collection --collection-name test_1

Stop Services

$ docker compose stop

Happy Hacking !!!