LlamaIndex

A data framework for LLM apps, focused on ingesting, indexing, and querying documents for retrieval-augmented generation.

Category
LLM & Agent Frameworks
Difficulty
Intermediate
When to use
You're building a RAG system over a corpus of documents and want ready-made loaders, indexers, and query engines.
When not to use
Your use case is a single-call LLM feature with no retrieval.
Alternatives
LangChain, Haystack, custom retrieval + SDK

At a glance

Field            Value
Category         RAG / data framework for LLMs
Difficulty       Intermediate
When to use      Document-heavy RAG systems
When not to use  Non-retrieval LLM features
Alternatives     LangChain, Haystack, custom

What it is

LlamaIndex provides readers (PDF, Notion, Google Drive, SQL, and hundreds more), node parsers (chunking strategies), indexes (vector, tree, keyword, knowledge graph), and query engines that tie them together. The abstraction is explicitly data-first: “how do I ingest, index, and query my documents” rather than LangChain’s more general “how do I chain LLM calls”.
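A node parser is, at heart, just a rule for windowing text into chunks. As a toy illustration of the idea (plain Python, not the LlamaIndex API), a character-level sliding window with overlap looks like:

```python
def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Naive character-window chunker: each chunk shares `overlap`
    characters with the previous one so context isn't cut off blind.
    LlamaIndex's node parsers do the same at the sentence/token level."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

This is the same size/overlap trade-off that LlamaIndex's SentenceSplitter exposes via its chunk_size and chunk_overlap parameters.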

When we reach for it at Ephizen

  • Heterogeneous document sets (PDFs, Markdown, Confluence) where its loaders save us days of work.
  • Trying different indexing strategies (hierarchical summarization, auto-retriever) without rewriting the pipeline.
  • Quickly benchmarking retrieval approaches against the same evaluation set.
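The benchmarking point is mostly about holding the metric fixed while the retriever varies. A minimal sketch of that idea (a hypothetical helper, not a LlamaIndex API — its evaluation module offers richer tooling):

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / len(relevant_ids)
```

Run each candidate retriever over the same query set and compare scores; the pipeline changes, the yardstick doesn't.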

Getting started

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every file under ./data with the default readers.
docs = SimpleDirectoryReader("./data").load_data()
# Embed the documents and build an in-memory vector index.
index = VectorStoreIndex.from_documents(docs)
# Retrieve the 5 most similar chunks per query.
engine = index.as_query_engine(similarity_top_k=5)
print(engine.query("What's the refund window?"))

Gotchas

  • The defaults call OpenAI for both embeddings and the LLM. Configure a local embedding model and an LLM provider of your choice before the first run, or your bill will surprise you.
  • The chunking defaults are generic — tune chunk size and overlap for your corpus.
  • LlamaIndex and LangChain overlap a lot. Pick one and stick with it in a single codebase.
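The first gotcha is worth making concrete. Assuming the llama-index-embeddings-huggingface and llama-index-llms-ollama integration packages are installed, swapping in local models looks roughly like this (a sketch, not the only way):

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Local embedding model — no API calls during indexing.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
# Any provider works here; an Ollama server running llama3 is one local option.
Settings.llm = Ollama(model="llama3", request_timeout=60.0)
```

Set these before building the index so both indexing and querying pick them up.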

Related tools