Comparing Four LightRAG Variants — Same Root, Different Production Strategies
LightRAG is a Graph RAG framework released by the HKUDS research group in 2024. It extracts entities and relationships from documents into a graph, then combines graph traversal and vector search at query time to ground LLM answers. After the open-source release, three derivative projects emerged: RAG-Anything, ApeRAG, and EdgeQuake. All three share the same core algorithm. What differs is the philosophy and scope of what was built on top.
What LightRAG Actually Does
It helps to understand the shared algorithm before comparing what each project changed.
Indexing:
Document input
→ Chunking (split into segments)
→ LLM extracts entities and relationships per chunk
e.g. "Samsung expanded semiconductor investment in 2024"
→ Entities: {Samsung(ORG), 2024(DATE), semiconductor(TECH)}
→ Relationship: (Samsung) -[expanded investment]→ (semiconductor)
→ Store entities and relationships in graph DB
→ Store entity description text in vector index
Query:
User query
→ Encode query as vector
→ Find entry entities via vector similarity
→ Collect 1–2 hop neighbors of entry entities
→ Assemble entities / relationships / chunks as context
→ LLM generates answer
All six query modes (Naive / Local / Global / Hybrid / Mix / Bypass) are present in all four frameworks, and the retrieval flow above is the same in outline across them. The divergence is in extraction details, traversal depth, and everything layered on top.
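The query flow above can be sketched in a few lines. This is a toy illustration, not LightRAG's code: the bag-of-words `embed` function stands in for a real embedding model, and the `ENTITIES` and `EDGES` data and every function name are assumptions made for the example.

```python
import math
from collections import Counter

# Toy graph: entity -> description, plus relationship triples.
# All names and data here are illustrative assumptions.
ENTITIES = {
    "Samsung": "South Korean electronics company",
    "semiconductor": "chip manufacturing technology",
    "2024": "calendar year 2024",
}
EDGES = [("Samsung", "expanded investment", "semiconductor")]

def embed(text):
    """Stand-in embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def entry_entity(query):
    """Pick the entity whose name + description is most similar to the query."""
    q = embed(query)
    return max(ENTITIES, key=lambda n: cosine(q, embed(n + " " + ENTITIES[n])))

def one_hop(entity):
    """Return the (source, relation, target) triples that touch the entity."""
    return [e for e in EDGES if entity in (e[0], e[2])]

def build_context(query):
    """Assemble the entry entity and its 1-hop relationships as LLM context."""
    entry = entry_entity(query)
    return {"entry": entry, "relationships": one_hop(entry)}

context = build_context("south korean electronics company")
```

The resulting `context` dictionary is what would be serialized into the prompt before the LLM generates an answer.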
LightRAG (original, Python library)
├── RAG-Anything ── multimodal layer added on top, core unchanged
├── ApeRAG ── core deeply modified + production platform
└── EdgeQuake ── algorithm rewritten in Rust from scratch
RAG-Anything: The Additive Approach
RAG-Anything imports LightRAG without modifying it and adds five modality processors as mixins. The GraphRAG core is LightRAG exactly.
from lightrag import LightRAG  # unchanged import
# Abstracted sketch: the processor classes are provided by RAG-Anything (imports omitted)
class MultiModalRAG:
    def __init__(self):
        self.rag = LightRAG(...)  # core unchanged
        self.processors = {
            'image': ImageModalProcessor(),        # Vision LLM → description + entities
            'table': TableModalProcessor(),        # LLM structural analysis
            'equation': EquationModalProcessor(),  # equation recognition
            'office': OfficeModalProcessor(),      # LibreOffice conversion pipeline
            'generic': GenericModalProcessor(),    # fallback
        }
The distinctive feature is how image content enters the graph. Rather than simply converting images to descriptive text, RAG-Anything turns image-extracted entities into independent nodes in the knowledge graph.
PDF document parse
→ Text chunks → standard LightRAG pipeline
→ Image detected → Vision LLM → description + entity extraction
→ creates "image entity node" in graph
→ connects to related text entities
→ Table detected → LLM structural analysis → "table entity node"
A query about "the trend shown in this chart" can traverse from the image node to connected text entities that discuss the same trend. The modalities become part of the same graph fabric.
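A minimal sketch of that linking step, under stated assumptions: the article does not say how RAG-Anything decides which text entities an image node connects to, so exact name matching is used here purely for illustration. The `add_image_node` helper, the node id format, and the sample data are all hypothetical.

```python
from collections import defaultdict

graph = defaultdict(set)  # node id -> set of neighbor node ids

def link(a, b):
    """Add an undirected edge between two nodes."""
    graph[a].add(b)
    graph[b].add(a)

def add_image_node(image_id, extracted_entities, known_text_entities):
    """Attach an image node to every text entity it shares a name with.

    Name matching is an assumption for this sketch, not RAG-Anything's rule.
    """
    linked = []
    for name in extracted_entities:
        if name in known_text_entities:
            link(image_id, name)
            linked.append(name)
    return linked

# The text pipeline has already produced these entities and one relationship.
text_entities = {"Samsung", "semiconductor"}
link("Samsung", "semiconductor")

# Hypothetical vision-model output for a chart image.
linked = add_image_node("image:page3-chart", ["semiconductor", "revenue"], text_entities)
```

After this step a traversal starting at `image:page3-chart` reaches `semiconductor` in one hop and `Samsung` in two, which is the cross-modal path the query example relies on.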
Production features: none. No authentication, multi-tenancy, rate limiting, conversation history, or background task management. Install via pip and embed in existing Python applications.
Right fit for: documents containing images, equations, or tables (academic papers, technical diagrams, financial reports) where modality-specific entity extraction matters, and integration into an existing Python codebase.
ApeRAG: Deep Modification + Platform
ApeRAG forks LightRAG and modifies the core algorithm. On top of that it adds Celery distributed task queues, a React WebUI, Kubernetes deployment, and RBAC.
Core Change: Extraction Format
LightRAG extracts entities in JSON. ApeRAG switches to tuples.
LightRAG original:
{"entity": "Samsung", "type": "ORG", "description": "South Korean electronics company"}
ApeRAG modified:
("Samsung", "ORG", "South Korean electronics company")
Tuple parsing is more robust against LLM output variance. JSON leaves more surface area for formatting errors (missing quotes, trailing commas) that break parsing. ApeRAG also adds entity merging: "Samsung," "Samsung Electronics," and "삼성전자" consolidate into the same entity node.
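The robustness argument can be shown with a small sketch. It is not ApeRAG's parser: `parse_tuple_record`, the `ALIASES` table, and the sample strings are assumptions for illustration, and the regex deliberately ignores edge cases such as quotes embedded inside a field.

```python
import json
import re

def parse_tuple_record(line):
    """Pull quoted fields out of a tuple-style record, ignoring stray punctuation."""
    fields = re.findall(r'"([^"]*)"', line)
    if len(fields) != 3:
        raise ValueError(f"expected 3 fields, got {len(fields)}")
    return tuple(fields)

# Hypothetical alias table; a real system would derive this from data.
ALIASES = {"Samsung Electronics": "Samsung", "삼성전자": "Samsung"}

def canonical(name):
    """Map a known alias to its canonical entity name."""
    return ALIASES.get(name, name)

# LLM output with a trailing comma: invalid JSON, still a parseable tuple.
bad_json = '{"entity": "Samsung", "type": "ORG", "description": "South Korean electronics company",}'
messy_tuple = '("삼성전자", "ORG", "South Korean electronics company",)'

try:
    json.loads(bad_json)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

name, entity_type, description = parse_tuple_record(messy_tuple)
merged_name = canonical(name)
```

The same trailing comma that makes `json.loads` fail is simply skipped by the field-level regex, and the alias lookup then folds the Korean name into the shared node.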
Graph Storage: Pure Relational, No AGE
A notable design choice: ApeRAG does not use Apache AGE. The graph is implemented as plain PostgreSQL relational tables with SQLAlchemy ORM.
-- ApeRAG graph schema (abstracted)
CREATE TABLE entities (id UUID PRIMARY KEY, name TEXT, type TEXT, ...);
CREATE TABLE relationships (
source_id UUID REFERENCES entities(id),
target_id UUID REFERENCES entities(id),
relation_type TEXT,
weight FLOAT,
...
);
-- 1-hop traversal — single query, no N+1
SELECT DISTINCT e.*
FROM entities e
JOIN relationships r
ON r.source_id = e.id OR r.target_id = e.id
WHERE r.source_id = $entity_id
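OR r.target_id = $entity_id;
ApeRAG does this 1-hop traversal in a single round trip (the project uses a CTE with UNION ALL; the abstracted query above shows the same idea as a plain join). No Cypher, no graph extension, no AGE overhead. For 1-hop patterns this is straightforward and efficient.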
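To make the CTE + UNION ALL shape concrete, here is a runnable sketch using Python's built-in `sqlite3` as a stand-in for PostgreSQL. The table layout follows the abstracted schema above, with integer ids instead of UUIDs; the sample rows and the `one_hop_neighbors` helper are assumptions, not ApeRAG code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, type TEXT);
CREATE TABLE relationships (
    source_id INTEGER REFERENCES entities(id),
    target_id INTEGER REFERENCES entities(id),
    relation_type TEXT
);
INSERT INTO entities VALUES
    (1, 'Samsung', 'ORG'), (2, 'semiconductor', 'TECH'), (3, 'TSMC', 'ORG');
INSERT INTO relationships VALUES
    (1, 2, 'expanded investment'), (3, 2, 'manufactures');
""")

# One statement: outgoing and incoming edges are unioned, then joined to entities.
ONE_HOP = """
WITH neighbors AS (
    SELECT target_id AS id FROM relationships WHERE source_id = :eid
    UNION ALL
    SELECT source_id AS id FROM relationships WHERE target_id = :eid
)
SELECT DISTINCT e.name
FROM entities e JOIN neighbors n ON n.id = e.id
ORDER BY e.name
"""

def one_hop_neighbors(entity_id):
    """Return the names of all entities one edge away, in a single query."""
    rows = conn.execute(ONE_HOP, {"eid": entity_id}).fetchall()
    return [name for (name,) in rows]
```

Splitting the two edge directions into separate branches lets each branch use its own index on `source_id` or `target_id`, which is the usual reason to prefer this shape over a join with `OR`.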
The Undocumented Structural Limit: Multi-Hop Not Implemented
The limit is stated only in a docstring in the source code:
# pg_ops_sync_graph_storage.py:289
"""
For now, it only supports getting nodes by label pattern
and their immediate connections.
Full graph traversal with max_depth would require
additional Repository methods.
"""
The max_depth parameter exists in the function signature but is not used. Every traversal is 1-hop only.
Since LightRAG's standard retrieval pattern is 1-hop dominant, most queries work without hitting this limit. But it is a hard constraint, not a soft performance limit. Any query pattern requiring traversal beyond 1 hop — multi-step entity chain reasoning, path-based context assembly — is structurally blocked in ApeRAG's current implementation.
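For contrast, this is what a traversal that honors `max_depth` looks like in principle. It is a generic breadth-first search over a hypothetical in-memory edge list, not a patch for ApeRAG's repository layer; `neighbors_within` and the sample edges are assumptions.

```python
from collections import deque

# Hypothetical undirected edge list forming a chain of four entities.
EDGES = [("Samsung", "semiconductor"), ("semiconductor", "TSMC"), ("TSMC", "Taiwan")]

def neighbors_within(start, max_depth):
    """Breadth-first search returning {node: hop distance} up to max_depth hops."""
    adjacency = {}
    for a, b in EDGES:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_depth:
            continue  # do not expand past the depth limit
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    del seen[start]  # report neighbors only, not the start node
    return seen
```

With `max_depth=1` a query starting at `Samsung` sees only `semiconductor`. The chain through `TSMC` to `Taiwan` appears only at depth 2 and 3, which is the kind of path-based context a 1-hop-only implementation cannot assemble.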
Production Infrastructure
ApeRAG's strength is its production feature set:
| Feature | Implementation |
|---|---|
| Distributed indexing | Celery worker queue |
| Deployment | Docker Compose / Kubernetes Helm |
| Authentication | API key based |
| Multi-tenancy | Collection isolation |
| Agent workflows | React WebUI flow editor |
| MCP server | Supported |
| External services | PG + Qdrant + ES + Redis (4 services) |
The four external services each have a defined role: PostgreSQL for graph and KV storage, Qdrant for chunk vector search, Elasticsearch for full-text search, Redis for caching. Clean separation of concerns at the cost of operational overhead.
EdgeQuake: Rust Rewrite
EdgeQuake rewrites the LightRAG algorithm from Python to Rust — 11 crates. This is not a port; it is a reimplementation with algorithmic changes.
Why Rust
Python's GIL (Global Interpreter Lock) allows only one thread to execute Python bytecode at a time. I/O can happen concurrently, but CPU-bound computation serializes. For a high-concurrency inference service, the GIL caps single-instance throughput.
Rust has no GIL. With the tokio async runtime, EdgeQuake is built to serve many concurrent requests from a single instance. The project claims a ceiling of 1,000+ concurrent users, a number that ApeRAG would need horizontal Celery scaling to reach.
LightRAG, RAG-Anything, and ApeRAG all share this GIL constraint. ApeRAG's Celery scaling compensates at the infrastructure level, but a single-node limit remains.
Entity Extraction Quality: Three Improvements
1. Fixed entity types
LightRAG lets the LLM freely assign entity types. The same entity might appear as "company," "corporation," "organization," "기업," or "회사" across different chunks. EdgeQuake constrains to seven types:
PERSON | ORG | LOCATION | CONCEPT | EVENT | TECH | PRODUCT
Fixed types eliminate type vocabulary drift. Every ORG is an ORG. Downstream filtering and graph analytics become more reliable.
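One way to picture a closed type vocabulary is an enum plus a coercion step. This is a sketch in Python rather than EdgeQuake's Rust, and it is not EdgeQuake's implementation (which may enforce the types in the prompt instead). The `SYNONYMS` table, the `coerce_type` helper, and the fallback to `CONCEPT` are all assumptions.

```python
from enum import Enum

class EntityType(Enum):
    PERSON = "PERSON"
    ORG = "ORG"
    LOCATION = "LOCATION"
    CONCEPT = "CONCEPT"
    EVENT = "EVENT"
    TECH = "TECH"
    PRODUCT = "PRODUCT"

# Hypothetical synonym table for free-form labels an LLM might emit.
SYNONYMS = {
    "company": EntityType.ORG,
    "corporation": EntityType.ORG,
    "organization": EntityType.ORG,
    "기업": EntityType.ORG,
    "회사": EntityType.ORG,
}

def coerce_type(label, default=EntityType.CONCEPT):
    """Map a free-form label onto the fixed vocabulary, falling back to a default."""
    key = label.strip()
    if key.upper() in EntityType.__members__:
        return EntityType[key.upper()]
    return SYNONYMS.get(key.lower(), default)
```

Because every stored type is an `EntityType` member, a downstream filter such as "all ORG nodes" becomes an equality check rather than a fuzzy string match.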
2. UPPERCASE_UNDERSCORE normalization
"Samsung" → "SAMSUNG"
"Samsung Electronics" → "SAMSUNG_ELECTRONICS"
"삼성전자" → "SAMSUNG_ELECTRONICS" (translated + normalized)
"samsung" → "SAMSUNG"
This single normalization step reportedly reduces entity duplicates by 36–40%. Without it, "samsung," "Samsung," and "SAMSUNG" accumulate as separate nodes for one entity, as do "Samsung Electronics" and "삼성전자". With it, the graph shrinks substantially, traversal is faster, and context assembly retrieves cleaner results.
3. Multi-pass gleaning
LLMs sometimes miss entities on the first extraction pass — particularly in long chunks or specialized domains where the prompt template doesn't align well with the content structure.
Gleaning is a re-prompting strategy: after initial extraction, the LLM is asked "are there entities in this chunk that you missed?" This repeats until no new entities are found or a maximum pass count is reached. LightRAG performs gleaning once; EdgeQuake performs multiple passes. The reported result is a 15–25% improvement in entity recall. Indexing cost grows with the number of passes, but graph coverage improves.
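The normalization and gleaning steps can be sketched together, since the gleaning loop needs a notion of "new entity" and the normalized name supplies one. This is an illustrative Python sketch, not EdgeQuake's Rust code. The `extract` callable is a hypothetical wrapper around the LLM call, `fake_extract` is a scripted stand-in for it, and the translation step (Korean to English) is left out entirely.

```python
import re

def normalize_entity_name(name):
    """Collapse case and punctuation differences into UPPERCASE_UNDERSCORE form."""
    collapsed = re.sub(r"[\W_]+", "_", name.strip())
    return collapsed.strip("_").upper()

def extract_with_gleaning(chunk, extract, max_passes=3):
    """Re-prompt until a pass adds no new normalized entity or max_passes is hit.

    `extract(chunk, already_found)` is a hypothetical callable wrapping the LLM;
    it returns an iterable of raw entity names.
    """
    found = {}  # normalized key -> first raw spelling seen
    passes = 0
    while passes < max_passes:
        passes += 1
        added = False
        for raw in extract(chunk, sorted(found)):
            key = normalize_entity_name(raw)
            if key not in found:
                found[key] = raw
                added = True
        if not added:
            break  # the last pass surfaced nothing new
    return found, passes

# Scripted stand-in for an LLM: each pass returns a different batch of names.
SCRIPT = [["Samsung", "samsung"], ["Samsung Electronics", "SAMSUNG"], ["samsung electronics"]]

def fake_extract(chunk, already_found):
    return SCRIPT[min(len(already_found), len(SCRIPT) - 1)]

entities, passes_used = extract_with_gleaning("...", fake_extract, max_passes=5)
```

Five raw spellings collapse into two nodes here, and the loop stops on the third pass because that pass yields only a spelling that normalizes to an existing key.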
Community Detection Timing
Community detection timing varies across the four frameworks:
| Framework | When community detection runs |
|---|---|
| LightRAG | At query time (when Global mode requested) |
| RAG-Anything | Delegates to LightRAG (query time) |
| ApeRAG | Delegates to LightRAG (query time) |
| EdgeQuake | At ingestion time (pre-computed Louvain) |
Query-time community detection means the first Global-mode query on a large graph can take tens of seconds while Louvain runs over the full node set. Latency is unpredictable and spiky.
EdgeQuake runs Louvain community detection (plus Label Propagation and Connected Components) during indexing. Each node gets a community_id stored as a column. At query time, community lookup is a column scan — fast and predictable. Indexing cost is higher, but production query latency is consistent.
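The ingestion-time pattern can be sketched as "compute once, look up later". To stay dependency-free, this example uses connected components (one of the algorithms listed above) via union-find as a stand-in for Louvain. It is a Python illustration, not EdgeQuake's code, and the node and edge data and function names are assumptions.

```python
def assign_communities(nodes, edges):
    """Union-find connected components; returns {node: community_id}."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    roots = {}
    community_id = {}
    for n in nodes:
        # Number communities in order of first appearance.
        community_id[n] = roots.setdefault(find(n), len(roots))
    return community_id

# Ingestion time: compute once and store the id alongside each node.
NODES = ["Samsung", "semiconductor", "TSMC", "Louvre", "Paris"]
EDGES = [("Samsung", "semiconductor"), ("TSMC", "semiconductor"), ("Louvre", "Paris")]
COMMUNITY = assign_communities(NODES, EDGES)

def same_community(node):
    """Query time: a plain lookup over stored ids, no graph algorithm runs."""
    cid = COMMUNITY[node]
    return sorted(n for n, c in COMMUNITY.items() if c == cid)
```

The query-time function touches only the stored ids, so its cost does not depend on how expensive the community algorithm was at ingestion.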
Graph Storage: PostgreSQL + AGE
EdgeQuake stores the graph in Apache AGE (a PostgreSQL extension for property graphs) and traverses using Cypher queries parallelized with tokio::join!.
Worth noting: in our 8-engine GraphDB benchmark, AGE measured 78 RPS at 50 concurrent users (u=50) under LightRAG-style 1-hop workloads. For single-user or low-concurrency scenarios this is adequate, but production deployments expecting more than 20 concurrent graph-querying users should run independent load tests before committing.
EdgeQuake's 1,000+ concurrent user claim is grounded in Rust's async runtime handling the HTTP and application layers — the graph query throughput ceiling is still subject to the AGE + PostgreSQL layer underneath.
Single-Stack Production Features
EdgeQuake consolidates all storage into PostgreSQL:
| Feature | Implementation |
|---|---|
| Graph | PostgreSQL + AGE |
| Vector search | pgvector + HNSW |
| Full-text | PostgreSQL tsvector + GIN |
| KV / cache | PostgreSQL |
| Audit logging | PostgreSQL |
| Authentication | JWT + Argon2 |
| Multi-tenancy | Workspace isolation, ADMIN/EDITOR/VIEWER |
| Rate limiting | Per-user and per-workspace |
| Cost tracking | Per-job LLM token costs |
| SDKs | Python, TS, Rust, Java, Go, C#, Ruby, Swift, PHP |
Where ApeRAG separates concerns across four external services, EdgeQuake collapses everything into PostgreSQL. Operationally simpler, but PostgreSQL becomes a single point of failure for all functionality.
Summary
| Criterion | LightRAG | RAG-Anything | ApeRAG | EdgeQuake |
|---|---|---|---|---|
| Language | Python | Python | Python + Celery | Rust |
| LightRAG relationship | Original | Added on top, unchanged | Deeply modified | Reimplemented |
| Multimodal | None | 5 processors | 5 index types | PDF Vision only |
| Graph storage | Neo4j / NetworkX | (delegates) | PG relational tables | PG + AGE |
| Multi-hop traversal | Supported | Supported | Not implemented | Supported |
| Entity normalization | None | None | Synonym merging | UPPERCASE_UNDERSCORE + fixed types |
| Community detection | Query-time | Query-time | Query-time | Ingestion-time |
| Concurrent users | Low | Low | Celery horizontal scale | 1,000+ |
| External services | 0–1 | 0–1 | 4 | 1 (PG) |
| Production features | None | None | RBAC + K8s | JWT + cost tracking |
| Agent workflow editor | None | None | WebUI | None |
| MCP server | None | None | Supported | Supported |
Selection Guide
| Use case | Choice |
|---|---|
| Research, prototyping, minimal dependencies | LightRAG |
| Embed in existing Python application | LightRAG or RAG-Anything |
| Documents with images, tables, equations | RAG-Anything |
| Agent workflow editor + Kubernetes horizontal scaling | ApeRAG |
| Single PostgreSQL stack + high throughput | EdgeQuake |
| Multi-hop graph traversal required | EdgeQuake (ApeRAG does not implement it) |
| Entity extraction quality optimization | EdgeQuake (gleaning + normalization) |
| Enterprise features (cost tracking, rate limiting) | EdgeQuake |
| Broad local model support (vLLM, LM Studio, LOLLMS) | LightRAG or RAG-Anything |