AI May 21, 2026 10 min read

AI assisted

Comparing Four LightRAG Variants — Same Root, Different Production Strategies

Source-level comparison of RAG-Anything, ApeRAG, and EdgeQuake as LightRAG derivatives


LightRAG is a Graph RAG framework released by the HKUDS research group in 2024. It extracts entities and relationships from documents into a graph, then combines graph traversal and vector search at query time to ground LLM answers. After the open-source release, three derivative projects emerged: RAG-Anything, ApeRAG, and EdgeQuake. All three share the same core algorithm. What differs is the philosophy and scope of what was built on top.

What LightRAG Actually Does

It helps to understand the shared algorithm before comparing what each project changed.

Indexing:

Document input
  → Chunking (split into segments)
  → LLM extracts entities and relationships per chunk
      e.g. "Samsung expanded semiconductor investment in 2024"
      → Entities: {Samsung(ORG), 2024(DATE), semiconductor(TECH)}
      → Relationship: (Samsung) -[expanded investment]→ (semiconductor)
  → Store entities and relationships in graph DB
  → Store entity description text in vector index
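
The indexing flow above can be sketched in a few lines of Python. This is a toy illustration, not LightRAG's code: `GraphIndex`, `chunk_text`, `fake_extract`, and `index_document` are hypothetical names, and the extractor is a hard-coded stub standing in for the LLM call.

```python
from dataclasses import dataclass, field


@dataclass
class GraphIndex:
    """Toy in-memory stand-in for the graph DB plus the vector index."""
    entities: dict = field(default_factory=dict)       # name -> {"type", "description"}
    relationships: list = field(default_factory=list)  # (source, relation, target)


def chunk_text(text: str, max_words: int = 50) -> list:
    """Split a document into fixed-size word windows (real chunkers count tokens)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def fake_extract(chunk: str):
    """Hypothetical stub for the per-chunk LLM extraction call."""
    if "Samsung" in chunk and "semiconductor" in chunk:
        entities = [("Samsung", "ORG"), ("2024", "DATE"), ("semiconductor", "TECH")]
        relations = [("Samsung", "expanded investment", "semiconductor")]
        return entities, relations
    return [], []


def index_document(text: str, index: GraphIndex, extract=fake_extract) -> GraphIndex:
    for chunk in chunk_text(text):
        entities, relations = extract(chunk)
        for name, etype in entities:
            # Keep the first description seen; a real system would merge them
            # and also embed the description into the vector index.
            index.entities.setdefault(name, {"type": etype, "description": chunk})
        index.relationships.extend(relations)
    return index


index = index_document("Samsung expanded semiconductor investment in 2024", GraphIndex())
```

The stub makes the data flow visible without an API key: one chunk in, three entity records and one relationship edge out.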

Query:

User query
  → Encode query as vector
  → Find entry entities via vector similarity
  → Collect 1–2 hop neighbors of entry entities
  → Assemble entities / relationships / chunks as context
  → LLM generates answer
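
A minimal sketch of the query side, under loud assumptions: `embed` is a bag-of-words stand-in for a real embedding model, the graph is a plain adjacency dict, and every function name here is hypothetical rather than taken from LightRAG.

```python
import math
from collections import Counter, deque


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def neighbors_within(graph: dict, start: str, max_hops: int) -> set:
    """Breadth-first expansion up to max_hops from an entry entity."""
    seen, queue = {start}, deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return seen


def retrieve(query: str, descriptions: dict, graph: dict, top_k: int = 1, max_hops: int = 1) -> set:
    """Pick entry entities by similarity, then gather their graph neighborhood."""
    q = embed(query)
    ranked = sorted(descriptions, key=lambda name: cosine(q, embed(descriptions[name])), reverse=True)
    context = set()
    for entry in ranked[:top_k]:
        context |= neighbors_within(graph, entry, max_hops)
    return context


descriptions = {
    "Samsung": "samsung south korean electronics company",
    "semiconductor": "chips and memory devices",
    "TSMC": "taiwanese foundry",
}
graph = {"Samsung": ["semiconductor"], "semiconductor": ["Samsung", "TSMC"], "TSMC": ["semiconductor"]}
```

Calling `retrieve("what did samsung invest in", descriptions, graph)` enters at Samsung and returns its 1-hop neighborhood; passing `max_hops=2` also pulls in TSMC.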

All six query modes (Naive / Local / Global / Hybrid / Mix / Bypass) are present in all four frameworks, and the retrieval flow above is common to all of them. The divergence is in extraction details, storage, and everything layered on top.

LightRAG (original, Python library)
    ├── RAG-Anything  ── multimodal layer added on top, core unchanged
    ├── ApeRAG        ── core deeply modified + production platform
    └── EdgeQuake     ── algorithm rewritten in Rust from scratch

RAG-Anything: The Additive Approach

RAG-Anything imports LightRAG without modifying it and adds five modality processors as mixins. The GraphRAG core is LightRAG exactly.

from lightrag import LightRAG  # unchanged import

class MultiModalRAG:
    def __init__(self):
        self.rag = LightRAG(...)  # core unchanged
        self.processors = {
            'image':    ImageModalProcessor(),    # Vision LLM → description + entities
            'table':    TableModalProcessor(),    # LLM structural analysis
            'equation': EquationModalProcessor(), # equation recognition
            'office':   OfficeModalProcessor(),   # LibreOffice conversion pipeline
            'generic':  GenericModalProcessor(),  # fallback
        }

The distinctive feature is how image content enters the graph. Rather than simply converting images to descriptive text, image-extracted entities become independent nodes in the knowledge graph.

PDF document parse
  → Text chunks → standard LightRAG pipeline
  → Image detected → Vision LLM → description + entity extraction
      → creates "image entity node" in graph
      → connects to related text entities
  → Table detected → LLM structural analysis → "table entity node"

A query about "the trend shown in this chart" can traverse from the image node to connected text entities that discuss the same trend. The modalities become part of the same graph fabric.
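
As a rough sketch of what "image as a graph node" means in data terms (not RAG-Anything's implementation): `add_image_node` is a hypothetical helper, the description string stands in for Vision LLM output, and substring matching stands in for real entity extraction.

```python
def add_image_node(graph: dict, image_id: str, description: str, known_entities: set) -> list:
    """Insert an image as its own node and link it to text entities it mentions."""
    graph.setdefault(image_id, set())
    linked = []
    for entity in sorted(known_entities):
        if entity.lower() in description.lower():
            # Undirected edge between the image node and the text entity.
            graph[image_id].add(entity)
            graph.setdefault(entity, set()).add(image_id)
            linked.append(entity)
    return linked


graph = {"Revenue": {"Q3 report"}, "Q3 report": {"Revenue"}}
linked = add_image_node(
    graph,
    image_id="image:fig-2",
    description="Bar chart showing revenue growth by quarter",
    known_entities={"Revenue", "Q3 report"},
)
```

After this call the chart is reachable from "Revenue" in one hop, and "Q3 report" in two, which is the traversal path the paragraph above describes.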

Production features: none. No authentication, multi-tenancy, rate limiting, conversation history, or background task management. Install via pip and embed in existing Python applications.

Right fit for: documents containing images, equations, or tables (academic papers, technical diagrams, financial reports) where modality-specific entity extraction matters and the pipeline has to integrate into an existing Python codebase.

ApeRAG: Deep Modification + Platform

ApeRAG forks LightRAG and modifies the core algorithm. On top of that it adds Celery distributed task queues, a React WebUI, Kubernetes deployment, and RBAC.

Core Change: Extraction Format

LightRAG extracts entities in JSON. ApeRAG switches to tuples.

LightRAG original:
{"entity": "Samsung", "type": "ORG", "description": "South Korean electronics company"}

ApeRAG modified:
("Samsung", "ORG", "South Korean electronics company")

The tuple format is meant to be more robust against LLM output variance: JSON leaves more surface area for formatting errors (missing quotes, trailing commas) that break parsing. ApeRAG also adds entity merging, so "Samsung," "Samsung Electronics," and "삼성전자" (Samsung Electronics in Korean) consolidate into the same entity node.
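
A sketch of tolerant tuple parsing plus alias merging. This is not ApeRAG's parser: using `ast.literal_eval` is an assumption made for brevity, and `ALIASES`, `parse_entity_tuple`, and `merge_entities` are hypothetical names.

```python
import ast

# Hypothetical alias table; a real system would build this during merging.
ALIASES = {
    "Samsung": "Samsung Electronics",
    "삼성전자": "Samsung Electronics",
}


def parse_entity_tuple(line: str):
    """Parse one ("name", "type", "description") line; return None if malformed."""
    try:
        value = ast.literal_eval(line.strip())
    except (ValueError, SyntaxError):
        return None
    if isinstance(value, tuple) and len(value) == 3 and all(isinstance(v, str) for v in value):
        return value
    return None


def merge_entities(lines):
    """Collapse aliases onto one canonical node, keeping every description."""
    merged = {}
    for line in lines:
        parsed = parse_entity_tuple(line)
        if parsed is None:
            continue  # skip the malformed line instead of failing the whole batch
        name, etype, description = parsed
        canonical = ALIASES.get(name, name)
        node = merged.setdefault(canonical, {"type": etype, "descriptions": []})
        node["descriptions"].append(description)
    return merged


raw_output = [
    '("Samsung", "ORG", "South Korean electronics company")',
    '("삼성전자", "ORG", "Maker of memory chips")',
    '("Samsung Electronics", "ORG", "unterminated description)',
]
merged = merge_entities(raw_output)
```

Because each tuple sits on its own line, the broken third line is dropped on its own and the two valid lines still merge into a single "Samsung Electronics" node.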

Graph Storage: Pure Relational, No AGE

A notable design choice: ApeRAG does not use Apache AGE. The graph is implemented as plain PostgreSQL relational tables with SQLAlchemy ORM.

-- ApeRAG graph schema (abstracted)
CREATE TABLE entities (id UUID PRIMARY KEY, name TEXT, type TEXT, ...);
CREATE TABLE relationships (
    source_id UUID REFERENCES entities(id),
    target_id UUID REFERENCES entities(id),
    relation_type TEXT,
    weight FLOAT,
    ...
);

-- 1-hop traversal — single query, no N+1
SELECT DISTINCT e.*
FROM entities e
JOIN relationships r
  ON r.source_id = e.id OR r.target_id = e.id
WHERE r.source_id = $entity_id
   OR r.target_id = $entity_id;

The production implementation uses a CTE with UNION ALL to run the 1-hop traversal in a single round trip; the query above abstracts that to a single join. No Cypher, no graph extension, no AGE overhead. For 1-hop patterns this is straightforward and efficient.
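
To try the 1-hop pattern locally, here is a sketch that uses Python's built-in `sqlite3` in place of PostgreSQL. Assumptions: TEXT ids instead of UUIDs, invented sample rows, a hypothetical `one_hop` helper, and an extra `e.id != :id` filter (not in the abstracted schema query) so that the start entity is not returned as its own neighbor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id TEXT PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE relationships (
    source_id TEXT REFERENCES entities(id),
    target_id TEXT REFERENCES entities(id),
    relation_type TEXT,
    weight REAL
);
INSERT INTO entities VALUES
    ('a', 'Samsung', 'ORG'), ('b', 'semiconductor', 'TECH'), ('c', 'TSMC', 'ORG');
INSERT INTO relationships VALUES
    ('a', 'b', 'invests_in', 1.0), ('c', 'b', 'manufactures', 1.0);
""")


def one_hop(conn: sqlite3.Connection, entity_id: str) -> list:
    """Return the names of an entity's direct neighbors in one query."""
    rows = conn.execute(
        """
        SELECT DISTINCT e.name
        FROM entities e
        JOIN relationships r ON r.source_id = e.id OR r.target_id = e.id
        WHERE (r.source_id = :id OR r.target_id = :id) AND e.id != :id
        ORDER BY e.name
        """,
        {"id": entity_id},
    ).fetchall()
    return [name for (name,) in rows]
```

`one_hop(conn, "b")` returns both Samsung and TSMC because the join matches edges in either direction.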

The Undocumented Structural Limit: Multi-Hop Not Implemented

The limit is not called out in the project documentation, but it is stated in a docstring in the source code itself:

# pg_ops_sync_graph_storage.py:289
"""
For now, it only supports getting nodes by label pattern
and their immediate connections.
Full graph traversal with max_depth would require
additional Repository methods.
"""

The max_depth parameter exists in the function signature but is not used. Every traversal is 1-hop only.

Since LightRAG's standard retrieval pattern is 1-hop dominant, most queries work without hitting this limit. But it is a hard constraint, not a soft performance limit. Any query pattern requiring traversal beyond 1 hop — multi-step entity chain reasoning, path-based context assembly — is structurally blocked in ApeRAG's current implementation.
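
For contrast, this is roughly what a depth-bounded traversal over relational tables could look like. It is a sketch on SQLite via Python's `sqlite3`, not ApeRAG code and not a claim about how the missing Repository methods would be written; the table contents, the `MULTI_HOP` query, and the `reachable` helper are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relationships (source_id TEXT, target_id TEXT);
INSERT INTO relationships VALUES ('a', 'b'), ('b', 'c'), ('c', 'd');
""")

# Recursive CTE: start at one node, follow edges in either direction,
# and stop expanding once max_depth is reached.
MULTI_HOP = """
WITH RECURSIVE reach(id, depth) AS (
    SELECT :start, 0
    UNION
    SELECT CASE WHEN r.source_id = reach.id THEN r.target_id ELSE r.source_id END,
           reach.depth + 1
    FROM reach
    JOIN relationships r ON r.source_id = reach.id OR r.target_id = reach.id
    WHERE reach.depth < :max_depth
)
SELECT id, MIN(depth) FROM reach GROUP BY id ORDER BY MIN(depth), id
"""


def reachable(conn: sqlite3.Connection, start: str, max_depth: int) -> list:
    """Return (node_id, shortest_depth) for every node within max_depth hops."""
    return conn.execute(MULTI_HOP, {"start": start, "max_depth": max_depth}).fetchall()
```

With `max_depth=1` this reproduces the 1-hop behavior; with `max_depth=2` node `c` appears, which is exactly the kind of result a 1-hop-only implementation cannot return.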

Production Infrastructure

ApeRAG's strength is its production feature set:

Feature	Implementation
Distributed indexing	Celery worker queue
Deployment	Docker Compose / Kubernetes Helm
Authentication	API key based
Multi-tenancy	Collection isolation
Agent workflows	React WebUI flow editor
MCP server	Supported
External services	PG + Qdrant + ES + Redis (4 services)

The four external services each have a defined role: PostgreSQL for graph and KV storage, Qdrant for chunk vector search, Elasticsearch for full-text search, Redis for caching. Clean separation of concerns at the cost of operational overhead.

EdgeQuake: Rust Rewrite

EdgeQuake rewrites the LightRAG algorithm in Rust, split across 11 crates. This is not a port; it is a reimplementation with algorithmic changes.

Why Rust

Python's GIL (Global Interpreter Lock) allows only one thread to execute Python bytecode at a time. I/O can happen concurrently, but CPU-bound computation serializes. For a high-concurrency inference service, the GIL caps single-instance throughput.

Rust has no GIL. With the tokio async runtime, EdgeQuake is designed to handle thousands of concurrent requests on a single instance. The project claims support for 1,000+ concurrent users, a number that would require horizontal Celery scaling to reach in ApeRAG.

LightRAG, RAG-Anything, and ApeRAG all share this GIL constraint. ApeRAG's Celery scaling compensates at the infrastructure level, but a single-node limit remains.

Entity Extraction Quality: Three Improvements

1. Fixed entity types

LightRAG lets the LLM freely assign entity types. The same entity might appear as "company," "corporation," "organization," "기업," or "회사" (Korean for "enterprise" and "company") across different chunks. EdgeQuake constrains extraction to seven types:

PERSON | ORG | LOCATION | CONCEPT | EVENT | TECH | PRODUCT

Fixed types eliminate type vocabulary drift. Every ORG is an ORG. Downstream filtering and graph analytics become more reliable.
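
A sketch of a closed type vocabulary in Python (EdgeQuake itself is Rust; this only illustrates the idea). The `SYNONYMS` table, the `coerce_type` helper, and the fallback to CONCEPT are assumptions, not EdgeQuake behavior.

```python
from enum import Enum


class EntityType(Enum):
    PERSON = "PERSON"
    ORG = "ORG"
    LOCATION = "LOCATION"
    CONCEPT = "CONCEPT"
    EVENT = "EVENT"
    TECH = "TECH"
    PRODUCT = "PRODUCT"


# Hypothetical synonym table; in practice the constraint is usually
# enforced in the extraction prompt rather than after the fact.
SYNONYMS = {
    "company": EntityType.ORG,
    "corporation": EntityType.ORG,
    "organization": EntityType.ORG,
    "기업": EntityType.ORG,
    "회사": EntityType.ORG,
}


def coerce_type(raw: str, default: EntityType = EntityType.CONCEPT) -> EntityType:
    """Map a free-form type label onto the fixed vocabulary."""
    key = raw.strip()
    if key.upper() in EntityType.__members__:
        return EntityType[key.upper()]
    return SYNONYMS.get(key.lower(), default)
```

Anything outside the vocabulary lands on the default instead of creating an eighth type, which is what keeps downstream filters stable.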

2. UPPERCASE_UNDERSCORE normalization

"Samsung"             → "SAMSUNG"
"Samsung Electronics" → "SAMSUNG_ELECTRONICS"
"삼성전자"             → "SAMSUNG_ELECTRONICS" (translated + normalized)
"samsung"             → "SAMSUNG"

This single normalization reduces entity duplicates by 36–40%. Without it, "samsung," "Samsung," and "SAMSUNG" accumulate as three separate nodes for one entity, and "삼성전자" stays disconnected from "Samsung Electronics." With it, the graph shrinks substantially, traversal is faster, and context assembly retrieves cleaner results.
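
The mapping above can be sketched as a small Python function. The `TRANSLATIONS` table is hypothetical: the source says non-English names are translated before normalization but does not say how, so a lookup table stands in for that step.

```python
import re

# Hypothetical translation table standing in for the unspecified translation step.
TRANSLATIONS = {"삼성전자": "Samsung Electronics"}


def normalize_entity_name(name: str) -> str:
    """Normalize an entity name to UPPERCASE_UNDERSCORE form."""
    name = name.strip()
    name = TRANSLATIONS.get(name, name)
    # Collapse every run of non-word characters (and underscores) into one underscore.
    key = re.sub(r"[\W_]+", "_", name).strip("_")
    return key.upper()
```

All four inputs from the listing above collapse onto two keys, `SAMSUNG` and `SAMSUNG_ELECTRONICS`.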

3. Multi-pass gleaning

LLMs sometimes miss entities on the first extraction pass — particularly in long chunks or specialized domains where the prompt template doesn't align well with the content structure.

Gleaning is a re-prompting strategy: after initial extraction, the LLM is asked "are there entities in this chunk that you missed?" This iterates until no new entities are found or a maximum pass count is reached. LightRAG performs gleaning once. EdgeQuake performs it across multiple passes. The result: 15–25% improvement in entity recall. Indexing cost increases proportionally, but graph coverage improves.
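
The gleaning loop described above, as a sketch. `extract(chunk, already_found)` is a hypothetical callable wrapping the LLM, and `make_stub_llm` is a test double that reveals one batch of entities per call; neither name comes from LightRAG or EdgeQuake.

```python
def extract_with_gleaning(chunk: str, extract, max_passes: int = 3) -> set:
    """Run one extraction, then re-prompt until a pass finds nothing new
    or max_passes gleaning passes have been made."""
    found = set(extract(chunk, frozenset()))
    for _ in range(max_passes):
        new = set(extract(chunk, frozenset(found))) - found
        if not new:
            break
        found |= new
    return found


def make_stub_llm(passes):
    """Stub LLM: returns the next batch on each call and counts the calls."""
    calls = {"n": 0}

    def extract(chunk, already_found):
        batch = passes[calls["n"]] if calls["n"] < len(passes) else []
        calls["n"] += 1
        return batch

    return extract, calls


stub, calls = make_stub_llm([["Samsung"], ["semiconductor"], []])
entities = extract_with_gleaning("Samsung expanded semiconductor investment in 2024", stub)
```

The call counter makes the cost side visible: every gleaning pass is another LLM call on the same chunk, including the final pass that finds nothing.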

Community Detection Timing

Community detection timing varies across the four frameworks:

Framework	When community detection runs
LightRAG	At query time (when Global mode requested)
RAG-Anything	Delegates to LightRAG (query time)
ApeRAG	Delegates to LightRAG (query time)
EdgeQuake	At ingestion time (pre-computed Louvain)

Query-time community detection means the first Global-mode query on a large graph can take tens of seconds while Louvain runs over the full node set. Latency is unpredictable and spiky.

EdgeQuake runs Louvain community detection (plus Label Propagation and Connected Components) during indexing. Each node gets a community_id stored as a column. At query time, community lookup is a column scan — fast and predictable. Indexing cost is higher, but production query latency is consistent.
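
A sketch of the precompute-then-look-up pattern, using connected components because it is the simplest of the three algorithms named above and needs no graph library (Louvain would). The function names and sample graph are hypothetical, and a dict stands in for the stored `community_id` column.

```python
def assign_community_ids(nodes, edges) -> dict:
    """Precompute a community_id per node using connected components (union-find)."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Attach the larger root under the smaller one for a stable result.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    roots = sorted({find(n) for n in nodes})
    root_to_id = {root: i for i, root in enumerate(roots)}
    return {n: root_to_id[find(n)] for n in nodes}


nodes = ["SAMSUNG", "SEMICONDUCTOR", "TSMC", "LOUVRE", "PARIS"]
edges = [("SAMSUNG", "SEMICONDUCTOR"), ("TSMC", "SEMICONDUCTOR"), ("LOUVRE", "PARIS")]
community_id = assign_community_ids(nodes, edges)  # computed once, at ingestion


def community_members(cid: int) -> list:
    """Query-time lookup: a plain filter on the precomputed value."""
    return sorted(n for n, c in community_id.items() if c == cid)
```

The expensive step runs once when documents are ingested; the query path only filters on a stored integer, which is why its latency does not depend on graph size in the same way.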

Graph Storage: PostgreSQL + AGE

EdgeQuake stores the graph in Apache AGE (a PostgreSQL extension for property graphs) and traverses using Cypher queries parallelized with tokio::join!.

Worth noting: in our GraphDB 8-engine benchmark, AGE measured 78 RPS at 50 concurrent users (u=50) under LightRAG-style 1-hop workloads. For single-user or low-concurrency scenarios this is adequate, but production deployments expecting more than 20 concurrent graph-querying users should run independent load tests before committing.

EdgeQuake's 1,000+ concurrent user claim is grounded in Rust's async runtime handling the HTTP and application layers — the graph query throughput ceiling is still subject to the AGE + PostgreSQL layer underneath.

Single-Stack Production Features

EdgeQuake consolidates all storage into PostgreSQL:

Feature	Implementation
Graph	PostgreSQL + AGE
Vector search	pgvector + HNSW
Full-text	PostgreSQL tsvector + GIN
KV / cache	PostgreSQL
Audit logging	PostgreSQL
Authentication	JWT + Argon2
Multi-tenancy	Workspace isolation, ADMIN/EDITOR/VIEWER
Rate limiting	Per-user and per-workspace
Cost tracking	Per-job LLM token costs
SDKs	Python, TS, Rust, Java, Go, C#, Ruby, Swift, PHP

Where ApeRAG separates concerns across four external services, EdgeQuake collapses everything into PostgreSQL. Operationally simpler, but PostgreSQL becomes a single point of failure for all functionality.

Summary

Criterion	LightRAG	RAG-Anything	ApeRAG	EdgeQuake
Language	Python	Python	Python + Celery	Rust
LightRAG relationship	Original	Added on top, unchanged	Deeply modified	Reimplemented
Multimodal	None	5 processors	5 index types	PDF Vision only
Graph storage	Neo4j / NetworkX	(delegates)	PG relational tables	PG + AGE
Multi-hop traversal	Supported	Supported	Not implemented	Supported
Entity normalization	None	None	Synonym merging	UPPERCASE_UNDERSCORE + fixed types
Community detection	Query-time	Query-time	Query-time	Ingestion-time
Concurrent users	Low	Low	Celery horizontal scale	1,000+ (claimed)
External services	0–1	0–1	4	1 (PG)
Production features	None	None	RBAC + K8s	JWT + cost tracking
Agent workflow editor	None	None	WebUI	None
MCP server	None	None	Supported	Supported

Selection Guide

Use case	Choice
Research, prototyping, minimal dependencies	LightRAG
Embed in existing Python application	LightRAG or RAG-Anything
Documents with images, tables, equations	RAG-Anything
Agent workflow editor + Kubernetes horizontal scaling	ApeRAG
Single PostgreSQL stack + high throughput	EdgeQuake
Multi-hop graph traversal required	EdgeQuake (ApeRAG does not implement it)
Entity extraction quality optimization	EdgeQuake (gleaning + normalization)
Enterprise features (cost tracking, rate limiting)	EdgeQuake
Broad local model support (vLLM, LM Studio, LOLLMS)	LightRAG or RAG-Anything