LangChain

RAG (Retrieval-Augmented Generation)

LangChain + InternLM: RAG with vector retrieval over an external knowledge base

Package dependencies

pip install langchain==0.0.292
pip install gradio==4.4.0
pip install chromadb==0.4.15
pip install sentence-transformers==2.2.2
pip install unstructured==0.10.30
pip install markdown==3.3.7
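
Because every package above is pinned to an exact version, it can help to confirm what is actually installed before going further. The following is a minimal sketch using only the standard library's `importlib.metadata`; the helper names `installed_version` and `find_mismatches` are made up for illustration and are not part of any of these packages.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINNED = {
    "langchain": "0.0.292",
    "gradio": "4.4.0",
    "chromadb": "0.4.15",
    "sentence-transformers": "2.2.2",
    "unstructured": "0.10.30",
    "markdown": "3.3.7",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin
    to an (expected, found) tuple; found is None when not installed."""
    mismatches = {}
    for name, expected in pinned.items():
        found = lookup(name)
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Running the script prints one line per package that is missing or at a different version, and prints nothing when the environment matches the pins.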

Embedding model: Sentence Transformer

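Retrieval is based on embedding vectors: the passages most similar to the query are retrieved and used to enrich the prompt.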

<https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2>
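
The retrieve-then-enrich idea can be sketched without any model at all. The example below is an illustration only: the 3-dimensional vectors are toy stand-ins for real embeddings, and the function names and the prompt template are hypothetical, not taken from LangChain or InternLM.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, indexed_chunks, top_k=2):
    """Return the top_k chunk texts ranked by similarity to the query.

    indexed_chunks is a list of (text, vector) pairs.
    """
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]


def build_prompt(question, context_chunks):
    """Prepend the retrieved chunks to the question (hypothetical template)."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Use the context to answer.\nContext:\n{context}\nQuestion: {question}"


# Toy 3-dimensional vectors stand in for real embeddings.
chunks = [
    ("InternLM is a large language model.", [0.9, 0.1, 0.0]),
    ("Chroma stores embedding vectors.", [0.1, 0.9, 0.1]),
    ("Gradio builds web demos.", [0.0, 0.1, 0.9]),
]
query_vec = [0.8, 0.2, 0.0]  # pretend embedding of the question
top = retrieve(query_vec, chunks, top_k=1)
prompt = build_prompt("What is InternLM?", top)
print(prompt)
```

In a real pipeline the vectors would come from the embedding model linked above, for example through the `encode` method of `SentenceTransformer` in the sentence-transformers package, and the similarity search would be delegated to the vector database.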

Sentence Transformer requires NLTK resources:

https://gitee.com/yzy0612/nltk_data.git

By default, these resources are looked up in /root/nltk_data, so clone the repository there:

cd /root
git clone https://gitee.com/yzy0612/nltk_data.git --branch gh-pages
cd nltk_data
mv packages/*  ./
cd tokenizers
unzip punkt.zip
cd ../taggers
unzip averaged_perceptron_tagger.zip
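
After the commands above, a quick check can confirm that the resources landed where they are expected. This is a minimal sketch: the two directory names are an assumption about how the archives unpack (a `punkt` and an `averaged_perceptron_tagger` folder), and `missing_resources` is a hypothetical helper, not an NLTK function.

```python
import os

# Directories expected after the unzip steps above (an assumption about
# how the two archives unpack).
EXPECTED = [
    os.path.join("tokenizers", "punkt"),
    os.path.join("taggers", "averaged_perceptron_tagger"),
]


def missing_resources(base_dir, expected=EXPECTED):
    """Return the expected subdirectories that are absent under base_dir."""
    return [
        rel for rel in expected
        if not os.path.isdir(os.path.join(base_dir, rel))
    ]


if __name__ == "__main__":
    missing = missing_resources("/root/nltk_data")
    print("missing:", missing if missing else "none")
```

An empty result means both folders exist under /root/nltk_data; any path that is listed points at the step to redo.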

Building the vector database