
Comparing Four LightRAG Variants — Same Root, Different Production Strategies

Source-level comparison of RAG-Anything, ApeRAG, and EdgeQuake as LightRAG derivatives

LightRAG is a Graph RAG framework released by the HKUDS research group in 2024. It extracts entities and relationships from documents into a graph, then combines graph traversal and vector search at query time to ground LLM answers. After the open-source release, three derivative projects emerged: RAG-Anything, ApeRAG, and EdgeQuake. All three share the same core algorithm. What differs is the philosophy and scope of what was built on top.


What LightRAG Actually Does

Before comparing what each project changed, it helps to understand the algorithm they share.

Indexing:

Document input
  → Chunking (split into segments)
  → LLM extracts entities and relationships per chunk
      e.g. "Samsung expanded semiconductor investment in 2024"
      → Entities: {Samsung(ORG), 2024(DATE), semiconductor(TECH)}
      → Relationship: (Samsung) -[expanded investment]→ (semiconductor)
  → Store entities and relationships in graph DB
  → Store entity description text in vector index

Query:

User query
  → Encode query as vector
  → Find entry entities via vector similarity
  → Collect 1–2 hop neighbors of entry entities
  → Assemble entities / relationships / chunks as context
  → LLM generates answer
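
The query steps above can be sketched in a few lines of Python. This is a minimal illustration, not LightRAG's code: the three-dimensional vectors, the in-memory edge list, and every function name are assumptions made up for the example, and a real deployment would use an embedding model and a graph store.

```python
import math

# Toy data, invented for illustration only.
ENTITY_VECTORS = {
    "Samsung": [0.9, 0.1, 0.0],
    "semiconductor": [0.7, 0.6, 0.1],
    "2024": [0.0, 0.2, 0.9],
}
EDGES = [("Samsung", "expanded investment", "semiconductor")]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_entry_entities(query_vec, top_k=1):
    """Rank entities by similarity to the query vector and keep the best top_k."""
    ranked = sorted(
        ENTITY_VECTORS,
        key=lambda name: cosine(query_vec, ENTITY_VECTORS[name]),
        reverse=True,
    )
    return ranked[:top_k]


def one_hop(entity):
    """Return (relation, neighbor) pairs touching `entity` in either direction."""
    pairs = []
    for src, rel, dst in EDGES:
        if src == entity:
            pairs.append((rel, dst))
        elif dst == entity:
            pairs.append((rel, src))
    return pairs


def build_context(query_vec):
    """Assemble the entry entities and their 1-hop neighborhood as LLM context."""
    return {entity: one_hop(entity) for entity in find_entry_entities(query_vec)}
```

Calling `build_context([1.0, 0.0, 0.0])` selects "Samsung" as the entry entity and returns its single relationship to "semiconductor"; the resulting dictionary is what would be serialized into the LLM prompt.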

All six query modes (Naive / Local / Global / Hybrid / Mix / Bypass) are present in all four frameworks. The retrieval flow is the same in outline across them. The divergence is in extraction details, storage, and everything layered on top.

LightRAG (original, Python library)
    ├── RAG-Anything  ── multimodal layer added on top, core unchanged
    ├── ApeRAG        ── core deeply modified + production platform
    └── EdgeQuake     ── algorithm rewritten in Rust from scratch

RAG-Anything: The Additive Approach

RAG-Anything imports LightRAG without modifying it and adds five modality processors as mixins. The GraphRAG core is LightRAG exactly.

from lightrag import LightRAG  # unchanged import

class MultiModalRAG:
    def __init__(self):
        self.rag = LightRAG(...)  # core unchanged
        self.processors = {
            'image':    ImageModalProcessor(),    # Vision LLM → description + entities
            'table':    TableModalProcessor(),    # LLM structural analysis
            'equation': EquationModalProcessor(), # equation recognition
            'office':   OfficeModalProcessor(),   # LibreOffice conversion pipeline
            'generic':  GenericModalProcessor(),  # fallback
        }

The distinctive feature is how image content enters the graph. Rather than simply converting images to descriptive text, image-extracted entities become independent nodes in the knowledge graph.

PDF document parse
  → Text chunks → standard LightRAG pipeline
  → Image detected → Vision LLM → description + entity extraction
      → creates "image entity node" in graph
      → connects to related text entities
  → Table detected → LLM structural analysis → "table entity node"

A query about "the trend shown in this chart" can traverse from the image node to connected text entities that discuss the same trend. The modalities become part of the same graph fabric.

Production features: none. No authentication, multi-tenancy, rate limiting, conversation history, or background task management. Install via pip and embed in existing Python applications.

Right fit for: documents containing images, equations, or tables (academic papers, technical diagrams, financial reports) where modality-specific entity extraction matters, and integration into an existing Python codebase.


ApeRAG: Deep Modification + Platform

ApeRAG forks LightRAG and modifies the core algorithm. On top of that it adds Celery distributed task queues, a React WebUI, Kubernetes deployment, and RBAC.

Core Change: Extraction Format

LightRAG extracts entities in JSON. ApeRAG switches to tuples.

LightRAG original:
{"entity": "Samsung", "type": "ORG", "description": "South Korean electronics company"}

ApeRAG modified:
("Samsung", "ORG", "South Korean electronics company")

Tuple parsing is more robust against LLM output variance. JSON leaves more surface area for formatting errors (missing quotes, trailing commas) that break parsing. ApeRAG also adds entity merging: "Samsung," "Samsung Electronics," and "삼성전자" (Samsung Electronics in Korean) consolidate into the same entity node.
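
The robustness argument can be illustrated with a small sketch. The record shape `("name", "type", "description")` follows the example above; the regular expression and both function names are assumptions for illustration, not ApeRAG's actual parser.

```python
import json
import re

# Assumed record shape: ("name", "type", "description").
RECORD = re.compile(r'\(\s*"([^"]*)"\s*,\s*"([^"]*)"\s*,\s*"([^"]*)"\s*\)')


def parse_tuples(llm_output):
    """Extract every well-formed tuple and ignore any text around it."""
    return [match.groups() for match in RECORD.finditer(llm_output)]


def parse_json_lines(llm_output):
    """Strict alternative: one JSON object per line; malformed lines are dropped."""
    records = []
    for line in llm_output.splitlines():
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue
    return records
```

With this pattern, stray prose before the tuples or a trailing comma after one does not prevent the surrounding records from being recovered, whereas a trailing comma inside a JSON object makes `json.loads` reject that whole line. The sketch has its own limit: a description containing a double quote would not match.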

Graph Storage: Pure Relational, No AGE

A notable design choice: ApeRAG does not use Apache AGE. The graph is implemented as plain PostgreSQL relational tables with SQLAlchemy ORM.

-- ApeRAG graph schema (abstracted)
CREATE TABLE entities (id UUID PRIMARY KEY, name TEXT, type TEXT, ...);
CREATE TABLE relationships (
    source_id UUID REFERENCES entities(id),
    target_id UUID REFERENCES entities(id),
    relation_type TEXT,
    weight FLOAT,
    ...
);

-- 1-hop traversal: single query, no N+1
WITH neighbor_ids AS (
    SELECT target_id AS id FROM relationships WHERE source_id = $entity_id
    UNION ALL
    SELECT source_id AS id FROM relationships WHERE target_id = $entity_id
)
SELECT DISTINCT e.*
FROM entities e
JOIN neighbor_ids n ON n.id = e.id;

CTE + UNION ALL for 1-hop traversal in a single round trip. No Cypher, no graph extension, no AGE overhead. For 1-hop patterns this is straightforward and efficient.
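
The CTE + UNION ALL pattern can be tried end to end with Python's built-in sqlite3 module. This is a sketch under stated assumptions: SQLite stands in for PostgreSQL, the schema is the abstracted one shown above with TEXT ids instead of UUIDs, and the sample rows and the `one_hop_neighbors` helper are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE relationships (
    source_id TEXT REFERENCES entities(id),
    target_id TEXT REFERENCES entities(id),
    relation_type TEXT,
    weight REAL
);
INSERT INTO entities VALUES
    ('e1', 'Samsung', 'ORG'),
    ('e2', 'semiconductor', 'TECH'),
    ('e3', '2024', 'DATE');
INSERT INTO relationships VALUES
    ('e1', 'e2', 'expanded investment', 1.0),
    ('e3', 'e1', 'year of', 0.5);
""")

# Collect neighbor ids from both edge directions, then join back to entities.
ONE_HOP = """
WITH neighbor_ids AS (
    SELECT target_id AS id FROM relationships WHERE source_id = :entity_id
    UNION ALL
    SELECT source_id AS id FROM relationships WHERE target_id = :entity_id
)
SELECT DISTINCT e.name
FROM entities e
JOIN neighbor_ids n ON n.id = e.id
ORDER BY e.name
"""


def one_hop_neighbors(entity_id):
    """Return the names of all entities one edge away, in a single round trip."""
    rows = conn.execute(ONE_HOP, {"entity_id": entity_id}).fetchall()
    return [name for (name,) in rows]
```

Here `one_hop_neighbors('e1')` returns both "2024" and "semiconductor", because the two halves of the UNION ALL cover outgoing and incoming edges, and the seed entity itself is not included in the result.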

The Undocumented Structural Limit: Multi-Hop Not Implemented

Although absent from the user-facing documentation, the limit is stated in a docstring in the source code:

# pg_ops_sync_graph_storage.py:289
"""
For now, it only supports getting nodes by label pattern
and their immediate connections.
Full graph traversal with max_depth would require
additional Repository methods.
"""

The max_depth parameter exists in the function signature but is not used. Every traversal is 1-hop only.

Since LightRAG's standard retrieval pattern is 1-hop dominant, most queries work without hitting this limit. But it is a hard constraint, not a soft performance limit. Any query pattern requiring traversal beyond 1 hop — multi-step entity chain reasoning, path-based context assembly — is structurally blocked in ApeRAG's current implementation.
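
To make the difference concrete, here is a minimal sketch of what honoring a max_depth parameter involves. It is a generic breadth-first search over an in-memory adjacency list, not ApeRAG code; the graph contents and the `neighbors_within` name are invented for illustration.

```python
from collections import deque

# Toy adjacency list, invented for illustration.
GRAPH = {
    "Samsung": ["semiconductor"],
    "semiconductor": ["Samsung", "TSMC"],
    "TSMC": ["semiconductor", "Taiwan"],
    "Taiwan": ["TSMC"],
}


def neighbors_within(start, max_depth):
    """Breadth-first search that stops expanding nodes at `max_depth` hops.

    Returns a dict mapping each reachable node to its hop distance from `start`.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_depth:
            continue  # do not expand beyond the depth limit
        for neighbor in GRAPH.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    distance.pop(start)
    return distance
```

With `max_depth=1` only "semiconductor" is reachable from "Samsung"; with `max_depth=2` "TSMC" appears as well. A traversal fixed at one hop can never surface that second entity, whatever value the caller passes.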

Production Infrastructure

ApeRAG's strength is its production feature set:

| Feature | Implementation |
| --- | --- |
| Distributed indexing | Celery worker queue |
| Deployment | Docker Compose / Kubernetes Helm |
| Authentication | API key based |
| Multi-tenancy | Collection isolation |
| Agent workflows | React WebUI flow editor |
| MCP server | Supported |
| External services | PG + Qdrant + ES + Redis (4 services) |

The four external services each have a defined role: PostgreSQL for graph and KV storage, Qdrant for chunk vector search, Elasticsearch for full-text search, Redis for caching. Clean separation of concerns at the cost of operational overhead.


EdgeQuake: Rust Rewrite

EdgeQuake rewrites the LightRAG algorithm from Python to Rust — 11 crates. This is not a port; it is a reimplementation with algorithmic changes.

Why Rust

Python's GIL (Global Interpreter Lock) allows only one thread to execute Python bytecode at a time. I/O can happen concurrently, but CPU-bound computation serializes. For a high-concurrency inference service, the GIL caps single-instance throughput.

Rust has no GIL. With the tokio async runtime, EdgeQuake can keep many requests in flight on a single instance. The project claims 1,000+ concurrent users, a number that ApeRAG would need horizontal Celery scaling to reach.

LightRAG, RAG-Anything, and ApeRAG all share this GIL constraint. ApeRAG's Celery scaling compensates at the infrastructure level, but a single-node limit remains.

Entity Extraction Quality: Three Improvements

1. Fixed entity types

LightRAG lets the LLM freely assign entity types. The same entity might appear as "company," "corporation," "organization," "기업," or "회사" (Korean words for "enterprise" and "company") across different chunks. EdgeQuake constrains the type to one of seven values:

PERSON | ORG | LOCATION | CONCEPT | EVENT | TECH | PRODUCT

Fixed types eliminate type vocabulary drift. Every ORG is an ORG. Downstream filtering and graph analytics become more reliable.
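
A closed type vocabulary can be sketched with an enum. The seven members come from the list above; the alias table, the fallback to CONCEPT, and the `coerce_type` helper are hypothetical and only show one way free-form labels could be folded into the fixed set. EdgeQuake itself is written in Rust and may enforce the constraint in the prompt instead.

```python
from enum import Enum


class EntityType(Enum):
    PERSON = "PERSON"
    ORG = "ORG"
    LOCATION = "LOCATION"
    CONCEPT = "CONCEPT"
    EVENT = "EVENT"
    TECH = "TECH"
    PRODUCT = "PRODUCT"


# Hypothetical alias table for labels an unconstrained LLM might emit.
ALIASES = {
    "company": EntityType.ORG,
    "corporation": EntityType.ORG,
    "organization": EntityType.ORG,
    "기업": EntityType.ORG,
    "회사": EntityType.ORG,
}


def coerce_type(raw, default=EntityType.CONCEPT):
    """Map a free-form label onto the fixed vocabulary, falling back to `default`."""
    key = raw.strip()
    if key.upper() in EntityType.__members__:
        return EntityType[key.upper()]
    return ALIASES.get(key.lower(), default)
```

Every label in the drift example above ("company," "corporation," "organization," "기업," "회사") resolves to `EntityType.ORG`, so downstream filters only ever see one spelling.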

2. UPPERCASE_UNDERSCORE normalization

"Samsung"             → "SAMSUNG"
"Samsung Electronics" → "SAMSUNG_ELECTRONICS"
"삼성전자"             → "SAMSUNG_ELECTRONICS" (translated + normalized)
"samsung"             → "SAMSUNG"

This single normalization reduces entity duplicates by 36–40%. Without it, "samsung," "Samsung," "SAMSUNG," and "삼성전자" accumulate as four separate nodes representing the same entity. The graph shrinks substantially, traversal is faster, and context assembly retrieves cleaner results.
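
The string-level part of this normalization is small enough to sketch. The function below is an assumption about the rule's shape (uppercase, separators collapsed to underscores), not EdgeQuake's implementation, and it deliberately leaves out the translation step shown for "삼성전자", which would need an LLM or a dictionary.

```python
import re


def normalize_entity_name(name):
    """Uppercase the name and collapse each run of separators into one underscore."""
    # [\W_]+ matches whitespace, punctuation, and existing underscores;
    # letters and digits in any script are left intact.
    collapsed = re.sub(r"[\W_]+", "_", name.strip())
    return collapsed.strip("_").upper()


def count_distinct(names):
    """Number of graph nodes the names would occupy after normalization."""
    return len({normalize_entity_name(n) for n in names})
```

For example, `count_distinct(["samsung", "Samsung", "SAMSUNG", "Samsung Electronics"])` is 2 rather than 4, since the three case variants collapse to "SAMSUNG" and the fourth becomes "SAMSUNG_ELECTRONICS".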

3. Multi-pass gleaning

LLMs sometimes miss entities on the first extraction pass — particularly in long chunks or specialized domains where the prompt template doesn't align well with the content structure.

Gleaning is a re-prompting strategy: after initial extraction, the LLM is asked "are there entities in this chunk that you missed?" This iterates until no new entities are found or a maximum pass count is reached. LightRAG performs gleaning once. EdgeQuake performs it across multiple passes. The result: 15–25% improvement in entity recall. Indexing cost increases proportionally, but graph coverage improves.
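
The gleaning loop described above can be sketched as follows. The `extract` and `glean` callables are hypothetical stand-ins for LLM calls, and `max_passes` is an assumed parameter name; the stubs exist only to make the control flow runnable without a model.

```python
def extract_with_gleaning(chunk, extract, glean, max_passes=3):
    """Run one extraction pass, then re-prompt until nothing new appears.

    `extract(chunk)` and `glean(chunk, found)` are hypothetical callables that
    wrap the LLM and return an iterable of entity names.
    """
    found = set(extract(chunk))
    for _ in range(max_passes):
        new = set(glean(chunk, found)) - found
        if not new:
            break  # the model reports nothing it missed
        found |= new
    return found


def fake_extract(chunk):
    """Stub first pass that finds only one entity."""
    return ["SAMSUNG"]


def fake_glean(chunk, found):
    """Stub re-prompt that reveals one missed entity per pass."""
    hidden = ["SEMICONDUCTOR", "2024"]
    return [e for e in hidden if e not in found][:1]
```

With the stubs, a single gleaning pass (`max_passes=1`) recovers two entities, while the default of three passes recovers all three and stops early once a pass returns nothing new. Each extra pass is another LLM call, which is where the added indexing cost comes from.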

Community Detection Timing

Community detection timing varies across the four frameworks:

| Framework | When community detection runs |
| --- | --- |
| LightRAG | At query time (when Global mode requested) |
| RAG-Anything | Delegates to LightRAG (query time) |
| ApeRAG | Delegates to LightRAG (query time) |
| EdgeQuake | At ingestion time (pre-computed Louvain) |

Query-time community detection means the first Global-mode query on a large graph can take tens of seconds while Louvain runs over the full node set. Latency is unpredictable and spiky.

EdgeQuake runs Louvain community detection (plus Label Propagation and Connected Components) during indexing. Each node gets a community_id stored as a column. At query time, community lookup is a column scan — fast and predictable. Indexing cost is higher, but production query latency is consistent.
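
The precompute-then-look-up idea can be shown with the simplest of the three algorithms named above, Connected Components. This is a sketch, not EdgeQuake's Rust code: Louvain is omitted because it needs a modularity optimizer, and the function name and sample nodes are invented for illustration.

```python
def assign_communities(nodes, edges):
    """Label each node with a community id using connected components (union-find)."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, shortening the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Give each distinct root a small integer id, in first-seen order.
    roots = {}
    return {n: roots.setdefault(find(n), len(roots)) for n in nodes}


# Ingestion time: compute once and store alongside each node.
COMMUNITY_ID = assign_communities(
    ["SAMSUNG", "SEMICONDUCTOR", "TSMC", "SEOUL_MARATHON"],
    [("SAMSUNG", "SEMICONDUCTOR"), ("TSMC", "SEMICONDUCTOR")],
)
```

At query time, `COMMUNITY_ID["SAMSUNG"]` is a dictionary lookup, the in-memory analogue of reading a stored community_id column, and no graph algorithm runs on the request path.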

Graph Storage: PostgreSQL + AGE

EdgeQuake stores the graph in Apache AGE (a PostgreSQL extension for property graphs) and traverses using Cypher queries parallelized with tokio::join!.

Worth noting: in our eight-engine graph database benchmark, AGE measured 78 RPS at 50 concurrent users (u=50) under LightRAG-style 1-hop workloads. For single-user or low-concurrency scenarios this is adequate, but production deployments expecting more than 20 concurrent graph-querying users should run independent load tests before committing.

EdgeQuake's 1,000+ concurrent user claim is grounded in Rust's async runtime handling the HTTP and application layers — the graph query throughput ceiling is still subject to the AGE + PostgreSQL layer underneath.

Single-Stack Production Features

EdgeQuake consolidates all storage into PostgreSQL:

| Feature | Implementation |
| --- | --- |
| Graph | PostgreSQL + AGE |
| Vector search | pgvector + HNSW |
| Full-text | PostgreSQL tsvector + GIN |
| KV / cache | PostgreSQL |
| Audit logging | PostgreSQL |
| Authentication | JWT + Argon2 |
| Multi-tenancy | Workspace isolation, ADMIN/EDITOR/VIEWER |
| Rate limiting | Per-user and per-workspace |
| Cost tracking | Per-job LLM token costs |
| SDKs | Python, TS, Rust, Java, Go, C#, Ruby, Swift, PHP |

Where ApeRAG separates concerns across four external services, EdgeQuake collapses everything into PostgreSQL. Operationally simpler, but PostgreSQL becomes a single point of failure for all functionality.


Summary

| Criterion | LightRAG | RAG-Anything | ApeRAG | EdgeQuake |
| --- | --- | --- | --- | --- |
| Language | Python | Python | Python + Celery | Rust |
| LightRAG relationship | Original | Added on top, unchanged | Deeply modified | Reimplemented |
| Multimodal | None | 5 processors | 5 index types | PDF Vision only |
| Graph storage | Neo4j / NetworkX | (delegates) | PG relational tables | PG + AGE |
| Multi-hop traversal | Supported | Supported | Not implemented | Supported |
| Entity normalization | None | None | Synonym merging | UPPERCASE + gleaning |
| Community detection | Query-time | Query-time | Query-time | Ingestion-time |
| Concurrent users | Low | Low | Celery horizontal scale | 1,000+ |
| External services | 0–1 | 0–1 | 4 | 1 (PG) |
| Production features | None | None | RBAC + K8s | JWT + cost tracking |
| Agent workflow editor | None | None | WebUI | None |
| MCP server | None | None | Supported | Supported |

Selection Guide

| Use case | Choice |
| --- | --- |
| Research, prototyping, minimal dependencies | LightRAG |
| Embed in existing Python application | LightRAG or RAG-Anything |
| Documents with images, tables, equations | RAG-Anything |
| Agent workflow editor + Kubernetes horizontal scaling | ApeRAG |
| Single PostgreSQL stack + high throughput | EdgeQuake |
| Multi-hop graph traversal required | EdgeQuake (ApeRAG does not implement it) |
| Entity extraction quality optimization | EdgeQuake (gleaning + normalization) |
| Enterprise features (cost tracking, rate limiting) | EdgeQuake |
| Broad local model support (vLLM, LM Studio, LOLLMS) | LightRAG or RAG-Anything |