Prerequisites: a working Docker environment and a valid OpenAI API key.
1. Install Qdrant
docker pull qdrant/qdrant
2. Run Qdrant
docker run -p 6333:6333 \
    -v $(pwd)/path/to/data:/qdrant/storage \
    qdrant/qdrant
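Before moving on, it can be useful to confirm that the container is actually listening on the published port. The following is a minimal sketch using only the standard library; `port_is_open` is a hypothetical helper name, and the check only tests that a TCP connection succeeds, not that the service behind it is Qdrant.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 6333 is the port published by the docker run command above.
    print("Qdrant reachable:", port_is_open("127.0.0.1", 6333))
```

If this prints False, check that the container is still running and that the -p 6333:6333 mapping was passed.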
3. Install the Python clients
pip install qdrant_client
pip install openai
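4. Create the collection test
from qdrant_client import QdrantClient
from qdrant_client.http.models import Distance, VectorParams

client = QdrantClient("127.0.0.1", port=6333)

# recreate_collection drops any existing collection named 'test' first.
# size=1536 matches the default output size of text-embedding-3-small.
client.recreate_collection(
    collection_name='test',
    vectors_config=VectorParams(size=1536, distance=Distance.COSINE),
)
5. Insert the embedding data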
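Data file db.txt (one sentence per line):
Xiao Ming is a primary school student
Xiao Ming loves swimming and often goes to the swimming pool
Xiao Ming's mother loves cleanliness, and their home is spotless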
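Each line of db.txt becomes one stored point, so it can help to strip surrounding whitespace and skip blank lines before embedding anything. A minimal sketch of that step follows; `load_lines` is a hypothetical helper name, not part of qdrant_client or openai.

```python
from pathlib import Path
from typing import List


def load_lines(path: str) -> List[str]:
    """Read a UTF-8 text file and return its non-empty lines, stripped."""
    text = Path(path).read_text(encoding="utf-8")
    # Strip each line and drop the ones that are empty after stripping.
    return [line.strip() for line in text.splitlines() if line.strip()]
```

Calling load_lines("db.txt") on the file above would return a list of the three sentences without trailing newlines.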
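Python script:
from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.http.models import PointStruct

# openai>=1.0 client interface; replace "apikey" with a valid OpenAI API key
openai_client = OpenAI(api_key="apikey")

def getembedding(txt):
    # Request one embedding vector for the given text
    resp = openai_client.embeddings.create(
        input=[txt], model="text-embedding-3-small")
    return resp.data[0].embedding

client = QdrantClient("127.0.0.1", port=6333)

with open('db.txt', 'r', encoding='utf-8') as f:
    for index, line in enumerate(f.readlines()):
        line = line.strip()  # drop the trailing newline before embedding
        if not line:
            continue  # skip blank lines
        print(line)
        embedding = getembedding(line)
        # Store the vector together with the original text as payload
        client.upsert(
            collection_name='test',
            wait=True,
            points=[
                PointStruct(id=index+1, vector=embedding, payload={"text": line}),
            ],
        )
6. Fuzzy (similarity) matching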
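from openai import OpenAI
from qdrant_client import QdrantClient
from qdrant_client.http.models import SearchParams

# openai>=1.0 client interface; replace "apikey" with a valid OpenAI API key
openai_client = OpenAI(api_key="apikey")

def getembedding(txt):
    # Request one embedding vector for the given text
    resp = openai_client.embeddings.create(
        input=[txt], model="text-embedding-3-small")
    return resp.data[0].embedding

client = QdrantClient("127.0.0.1", port=6333)

question = "What habits does Xiao Ming's mother have?"
search_results = client.search(
    collection_name='test',  # must match the collection created in step 4
    query_vector=getembedding(question),
    limit=3,
    # exact=False allows approximate (HNSW) search; hnsw_ef tunes its accuracy
    search_params=SearchParams(exact=False, hnsw_ef=128),
)
# Each hit carries a similarity score and the payload stored at insert time
for hit in search_results:
    print(hit.score, hit.payload["text"])
You can also refer to the tutorial on the official site: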
https://qdrant.tech/documentation/embeddings/openai/
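To make the ranking in step 6 less of a black box, here is a rough sketch of what a cosine-distance search computes, written as a brute-force scan in plain Python. It corresponds to an exact search; with exact=False Qdrant uses its approximate HNSW index instead, so this is an illustration and not Qdrant's implementation. The function names and the 2-dimensional vectors are hypothetical stand-ins for the 1536-dimensional embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    points: List[Tuple[int, Sequence[float]]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Score every stored point against the query and keep the k best."""
    scored = [(pid, cosine_similarity(query, vec)) for pid, vec in points]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy (id, vector) pairs; real points would hold OpenAI embeddings.
points = [(1, [1.0, 0.0]), (2, [0.6, 0.8]), (3, [0.0, 1.0])]
ranked = top_k([0.0, 2.0], points, k=2)
print(ranked)  # point 3 ranks first, then point 2
```

The query's length does not affect the result, because cosine similarity depends only on direction; that is why [0.0, 2.0] matches point 3 exactly.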