LlamaIndex

v1.0.0

A LlamaIndex RAG framework assistant, covering document indexing, retrieval-augmented generation, vector stores, and query engines

License: MIT-0 · Free to use, modify, and redistribute. No attribution required.
Security Scan
VirusTotal: Benign
OpenClaw: Benign (high confidence)
Purpose & Capability
The name and description match the SKILL.md content: the skill teaches how to install and use LlamaIndex, vector stores, loaders, and query engines. All commands and examples relate to building RAG pipelines.
Instruction Scope
Instructions legitimately show reading local data (./data), creating a local Chroma DB (./chroma_db), and optionally loading web pages. These are expected for indexing/RAG, but they do mean the agent following these instructions could read and write local files and fetch remote pages — users should only point it at data they want indexed.
Install Mechanism
There is no install spec in the registry (instruction-only). The SKILL.md suggests pip install commands (standard PyPI packages). That is normal, but running them installs third-party packages into the environment, so prefer a virtualenv and verify package names/versions before installing.
Credentials
The skill declares no required environment variables, which is consistent. The docs recommend OpenAI-related packages (llama-index-llms-openai, llama-index-embeddings-openai) — using those will require provider API keys (e.g., OPENAI_API_KEY) though the skill doesn't declare them. Users should only supply credentials for services they intend to use.
Persistence & Privilege
always:false and no install hooks; the skill does not request persistent platform privileges or override other skills. It only provides instructions the agent could follow when invoked.
Assessment
This skill is an instructional guide for using LlamaIndex and is internally consistent. Before installing or following the instructions: (1) run pip installs inside a virtualenv/container and verify package names and versions; (2) be aware the examples read local files (./data) and create a local Chroma DB (./chroma_db) — only point the agent at data you want indexed; (3) if you plan to use OpenAI (or other LLM/embedding providers), you will need to provide API keys (e.g., OPENAI_API_KEY) — do not share unrelated credentials; (4) review and understand any code/examples before executing them, especially web loaders that fetch remote pages.

Like a lobster shell, security has layers — review code before you run it.



SKILL.md

LlamaIndex RAG Framework Assistant

You are an expert in LlamaIndex (formerly GPT Index), helping users build high-quality retrieval-augmented generation (RAG) systems.

Core Concepts

| Concept | Description |
| --- | --- |
| Document | Abstract representation of a raw data source (PDF, web page, database, etc.) |
| Node | A chunk of text split from a Document; the basic unit of indexing |
| Index | An organizing structure over Nodes; supports vector, summary, knowledge-graph, and other types |
| QueryEngine | Retrieves relevant content from an Index and generates an answer |
| Retriever | Fetches relevant Nodes from an Index |
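To make the Document → Node relationship concrete, here is a minimal plain-Python sketch of fixed-size chunking with overlap. It is an illustration only, not LlamaIndex's actual node parser (the real `SentenceSplitter` respects sentence boundaries and metadata); the function and dictionary fields are hypothetical.

```python
# Illustrative only: a toy version of the Document -> Node splitting step.
def split_into_nodes(text, chunk_size=100, overlap=20):
    """Split a document's text into overlapping chunks ('nodes')."""
    nodes = []
    step = chunk_size - overlap  # each chunk starts `step` chars after the last
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            nodes.append({"text": chunk, "start": start})
    return nodes

doc = ("LlamaIndex turns raw documents into nodes, indexes the nodes, "
       "and answers queries over them. ") * 3
nodes = split_into_nodes(doc, chunk_size=80, overlap=16)
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one node, which is the same reason real node parsers expose a `chunk_overlap` parameter.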

Installation

pip install llama-index
pip install llama-index-llms-openai          # OpenAI LLM
pip install llama-index-embeddings-openai    # OpenAI embeddings
pip install llama-index-vector-stores-chroma # Chroma vector store

Quick Start: RAG in 5 Lines of Code

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()
response = query_engine.query("What is the main content of this document?")

Data Loading

from llama_index.core import SimpleDirectoryReader

# Generic file loader; supports PDF, DOCX, TXT, CSV, etc.
documents = SimpleDirectoryReader(
    input_dir="./data",
    recursive=True,
    required_exts=[".pdf", ".md"],
    filename_as_id=True
).load_data()

# Specialized loaders (LlamaHub ecosystem)
from llama_index.readers.web import SimpleWebPageReader
docs = SimpleWebPageReader().load_data(["https://example.com"])

Index Types

| Index Type | Use Case | Notes |
| --- | --- | --- |
| VectorStoreIndex | Semantic search (most common) | Converts Nodes to vectors; cosine-similarity retrieval |
| SummaryIndex | Full-document summarization | Iterates over all Nodes to build a summary |
| TreeIndex | Hierarchical summarization | Builds a summary tree bottom-up |
| KnowledgeGraphIndex | Knowledge graphs | Extracts entity relations |
| KeywordTableIndex | Keyword retrieval | Keyword-based matching |
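VectorStoreIndex is the workhorse here. At its core, retrieval is just cosine similarity between a query vector and node vectors. The following plain-Python sketch illustrates that idea with toy 3-dimensional vectors (real embeddings come from an embedding model and have hundreds of dimensions); it is not LlamaIndex's implementation.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, node_vecs, k=2):
    """Return indices of the k nodes most similar to the query, best first."""
    scored = [(cosine(query_vec, v), i) for i, v in enumerate(node_vecs)]
    scored.sort(reverse=True)
    return [i for _, i in scored[:k]]

# Toy node embeddings; in practice these come from an embedding model.
node_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
best = top_k([1.0, 0.0, 0.0], node_vecs, k=2)  # -> [0, 2]
```

This is also what the `similarity_top_k` parameter shown later controls: how many of the highest-scoring nodes are handed to the LLM.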

Vector Store Integration

import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext

chroma_client = chromadb.PersistentClient(path="./chroma_db")
collection = chroma_client.get_or_create_collection("my_docs")
vector_store = ChromaVectorStore(chroma_collection=collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(documents, storage_context=storage_context)

Supported Vector Databases

| Vector Store | Strengths | Best For |
| --- | --- | --- |
| Chroma | Lightweight, embedded, zero config | Local development, small scale |
| Qdrant | High performance, rich filtering | Recommended for production |
| Pinecone | Fully managed cloud service | Zero-ops deployments |
| Milvus | Large-scale, distributed | Billion-scale vector data |
| FAISS | From Meta; purely in-memory | High-performance local retrieval |

Advanced Query Engine Configuration

query_engine = index.as_query_engine(
    similarity_top_k=5,           # retrieve the top-k most relevant chunks
    response_mode="compact",      # compact / refine / tree_summarize
    streaming=True                # stream the response
)

Comparison with LangChain

| Feature | LlamaIndex | LangChain |
| --- | --- | --- |
| Core focus | RAG-focused: data indexing and retrieval | General-purpose LLM application framework |
| Data processing | Rich built-in document loading and splitting | Requires more manual configuration |
| Indexing | Multiple index types, out of the box | Direct vector-store integrations |
| Query optimization | Built-in rerankers, routing, sub-question decomposition | Chains must be composed manually |
| Best for | Knowledge-base Q&A, document analysis | Agents, workflows, general applications |
| Combined use | Can serve as a LangChain Retriever | Can integrate LlamaIndex indexes |

