LlamaIndex
A data framework for LLM apps, focused on ingesting, indexing, and querying documents for retrieval-augmented generation.
Category
LLM & Agent Frameworks
Difficulty
Intermediate
When to use
You're building a RAG system over a corpus of documents and want ready-made loaders, indexers, and query engines.
When not to use
Your use case is a single-call LLM feature with no retrieval.
Alternatives
LangChain, Haystack, custom retrieval + SDK
At a glance
| Field | Value |
|---|---|
| Category | RAG / data framework for LLMs |
| Difficulty | Intermediate |
| When to use | Document-heavy RAG systems |
| When not to use | Non-retrieval LLM features |
| Alternatives | LangChain, Haystack, custom |
What it is
LlamaIndex provides readers (PDF, Notion, Google Drive, SQL, and hundreds more), node parsers (chunking strategies), indexes (vector, tree, keyword, knowledge graph), and query engines that tie them together. The abstraction is explicitly data-first: “how do I ingest, index, and query my documents” rather than LangChain’s more general “how do I chain LLM calls”.
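That reader → node parser → index → query engine composition can be pictured with a toy, dependency-free sketch (all names here are illustrative stand-ins, not LlamaIndex's actual classes; scoring is naive word overlap rather than embeddings):

```python
# Toy illustration of the reader -> parser -> index -> query engine shape.
# Pure stdlib; not real LlamaIndex code.

def read_documents():
    # Stand-in for a reader (PDF, Notion, SQL, ...): returns raw text docs.
    return [
        "Refunds are accepted within 30 days of purchase.",
        "Shipping takes 5 business days within the EU.",
    ]

def parse_nodes(docs, chunk_size=8):
    # Stand-in for a node parser: split each doc into fixed-size word chunks.
    nodes = []
    for doc in docs:
        words = doc.split()
        for i in range(0, len(words), chunk_size):
            nodes.append(" ".join(words[i:i + chunk_size]))
    return nodes

class ToyIndex:
    # Stand-in for a vector index: stores nodes, retrieves by word overlap.
    def __init__(self, nodes):
        self.nodes = nodes

    def query(self, question, top_k=1):
        q = set(question.lower().split())
        scored = sorted(self.nodes,
                        key=lambda n: len(q & set(n.lower().split())),
                        reverse=True)
        return scored[:top_k]

index = ToyIndex(parse_nodes(read_documents()))
print(index.query("within how many days are refunds accepted")[0])
# -> Refunds are accepted within 30 days of purchase.
```

In LlamaIndex each stage is a swappable component behind the same interfaces, which is what makes the framework data-first rather than chain-first.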
When we reach for it at Ephizen
- Heterogeneous document sets (PDFs, Markdown, Confluence) where its loaders save us days of work.
- Trying different indexing strategies (hierarchical summarization, auto-retriever) without rewriting the pipeline.
- Quickly benchmarking retrieval approaches against the same evaluation set.
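The last point — one fixed evaluation set, several retrieval strategies — is a harness shape worth sketching. This is a framework-free toy (word-overlap retrieval, invented corpus); in practice each strategy would be a real LlamaIndex retriever:

```python
# Compare two chunking strategies against one shared evaluation set.
# Retrieval here is naive word overlap; swap in real retrievers in practice.

CORPUS = ("Refunds are accepted within 30 days. "
          "Shipping takes 5 business days. "
          "Support answers within 24 hours.")

# Shared eval set: (query, substring the retrieved chunk must contain).
EVAL_SET = [
    ("refunds accepted within days", "Refunds"),
    ("support answers within hours", "Support"),
]

def chunk(text, size):
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def retrieve(chunks, query):
    q = set(query.lower().split())
    return max(chunks, key=lambda c: len(q & set(c.lower().split())))

def hit_rate(chunks):
    hits = sum(expected in retrieve(chunks, q) for q, expected in EVAL_SET)
    return hits / len(EVAL_SET)

for size in (5, 10):
    print(f"chunk_size={size}: hit_rate={hit_rate(chunk(CORPUS, size)):.2f}")
```

Because the eval set is held fixed, the hit rates are directly comparable across strategies — the same discipline applies when the strategies are real indexes.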
Getting started
```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every file in ./data using the default readers.
docs = SimpleDirectoryReader("./data").load_data()

# Chunk, embed, and index the documents in one step.
index = VectorStoreIndex.from_documents(docs)

# Retrieve the 5 most similar chunks and synthesize an answer.
engine = index.as_query_engine(similarity_top_k=5)
print(engine.query("What's the refund window?"))
```
Gotchas
- Defaults use OpenAI for both embeddings and LLM. Configure a local embedding model and a provider of your choice before the first run, or your bill will surprise you.
- The chunking defaults are generic — tune chunk size and overlap for your corpus.
- LlamaIndex and LangChain overlap a lot. Pick one and stick with it in a single codebase.
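For the first two gotchas, the global defaults can be overridden via `Settings` before any index is built. A configuration sketch, assuming llama-index >= 0.10 with the `llama-index-embeddings-huggingface` and `llama-index-llms-ollama` integration packages installed; the model names are examples, not recommendations:

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Local embeddings instead of the OpenAI default.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Any LLM provider; here a local Ollama model as an example.
Settings.llm = Ollama(model="llama3")

# Chunking defaults are also global -- tune these for your corpus.
Settings.chunk_size = 512
Settings.chunk_overlap = 64
```

Set these once at startup and every subsequent `VectorStoreIndex.from_documents(...)` call picks them up, so nothing hits OpenAI by accident.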
Related tools
- DSPy: A framework for programming (not prompting) LLMs — declare signatures and modules, then let an optimizer compile prompts and few-shot examples for you.
- HuggingFace Transformers: The library that made pretrained transformers trivially loadable — from BERT to Llama — with a consistent API across tasks.
- LangChain: A Python/JS framework for composing LLM calls, prompts, tools, and memory into end-to-end applications.