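With the recent surge in AI-backed APIs and web development tools, many technology companies are integrating chatbots into their applications.
LangChain is a popular new framework that has recently attracted wide attention. It aims to simplify the development of applications that interact with language models, external data, and compute resources. It offers clear, modular abstractions for all the building blocks such applications need, and it ships commonly used "chains", which are combinations of those building blocks. For example, the conversational retrieval chain lets users hold a natural conversation with data held in an external store.
How does LangChain achieve this? OpenAI's language models are not trained or fine-tuned on any particular company's data. If your chatbot relies on such data, you need to supply it to OpenAI at runtime. In the retrieval step, we use vector similarity search (VSS) to fetch data relevant to the user's query from Redis and feed it into the language model together with the original question. The model is instructed to answer using only the supplied information (called "context" in AI parlance).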
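The retrieval step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the three-dimensional vectors are made up for the example rather than produced by OpenAI, an in-memory list stands in for Redis, and `cosine_similarity`, `retrieve`, and `build_prompt` are hypothetical helper names, not LangChain or Redis APIs.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query_vec, store, k=2):
    """Return the texts of the k stored items most similar to the query."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item["vector"]),
        reverse=True,
    )
    return [item["text"] for item in ranked[:k]]


def build_prompt(question, context_docs):
    """Combine the retrieved context with the original question."""
    context = "\n".join(context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Toy "embeddings" (made-up numbers, assumed for illustration only)
store = [
    {"text": "Gold-plated stud earrings", "vector": [0.9, 0.1, 0.0]},
    {"text": "Men's leather formal shoes", "vector": [0.0, 0.2, 0.9]},
    {"text": "Sterling silver drop earrings", "vector": [0.8, 0.3, 0.1]},
]
query_vec = [1.0, 0.2, 0.0]  # pretend embedding of the user's query

top_docs = retrieve(query_vec, store, k=2)
prompt = build_prompt("Do you have earrings?", top_docs)
print(prompt)
```

In the real application, the embedding model and Redis do the work that the toy vectors and the `sorted` call do here; only the overall shape (embed, rank by similarity, paste the winners into the prompt) carries over.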
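Most of the complexity in this chain lies in the retrieval step. That is why we chose to integrate LangChain with Redis Enterprise as the vector database. The combination bridges the gap between complex AI and product development.
In this short tutorial, we show how to build a conversational retail shopping assistant that helps customers discover interesting items buried in a product catalog. Readers can follow along with the complete code provided.
01
Build your chatbot
First, install everything the project needs.
1. Install Python dependencies
This project needs a few Python libraries. They are listed in a file in the GitHub repository.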
pip install langchain==0.0.123
pip install openai==0.27.2
pip install redis==4.5.3
pip install numpy
pip install pandas
pip install gdown
2. Prepare the product dataset
For the retail chatbot, we use the Amazon Berkeley Objects dataset. It contains a large number of Amazon products well suited to building a retail assistant.
Use Python's pandas library to load and preprocess the dataset. While loading, we truncate the longer text fields, which keeps the dataset lean and saves memory and compute time.
import pandas as pd

MAX_TEXT_LENGTH = 1000  # Maximum number of text characters to use

def auto_truncate(val):
    """Truncate the given text."""
    return val[:MAX_TEXT_LENGTH]

# Load product data and truncate long text fields
all_prods_df = pd.read_csv("product_data.csv", converters={
    'bullet_point': auto_truncate,
    'item_keywords': auto_truncate,
    'item_name': auto_truncate
})
3. Once the product dataset is fully loaded, apply a few final preprocessing steps to clean up the keywords field and drop missing values.
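# Replace empty strings with a missing-value marker, then drop those rows.
# Assigning the result back avoids relying on in-place edits of a column
# selection, and pd.NA avoids the version-dependent handling of value=None.
all_prods_df['item_keywords'] = all_prods_df['item_keywords'].replace('', pd.NA)
all_prods_df.dropna(subset=['item_keywords'], inplace=True)

# Reset pandas dataframe index
all_prods_df.reset_index(drop=True, inplace=True)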
4. If you are following along with the code on GitHub, you can call all_prods_df.head() to view the first few rows of the dataframe. The full dataset contains more than 100,000 products, but for this chatbot we restrict it to a subset of 2,500.
# Number of products to use (subset)
NUMBER_PRODUCTS = 2500

# Get the first 2500 products
product_metadata = (
    all_prods_df
    .head(NUMBER_PRODUCTS)
    .to_dict(orient='index')
)

# Check one of the products
product_metadata[0]
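02
Set up Redis as a vector database
1. LangChain provides a simple wrapper for Redis that can load text data and create embeddings that capture its "meaning". In the following code, we prepare the product text and metadata, set up the text embedding provider (OpenAI), assign a name to the search index, and supply a Redis URL for the connection.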
import os

from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores.redis import Redis as RedisVectorStore

# set your OpenAI API key as an environment variable
os.environ['OPENAI_API_KEY'] = "YOUR OPENAI API KEY"

# data that will be embedded and converted to vectors
texts = [
    v['item_name'] for k, v in product_metadata.items()
]

# product metadata that we'll store along our vectors
metadatas = list(product_metadata.values())

# we will use OpenAI as our embeddings provider
embedding = OpenAIEmbeddings()

# name of the Redis search index to create
index_name = "products"

# assumes you have a Redis Stack server running on localhost
redis_url = "redis://localhost:6379"
2. Then we bring it all together to create the Redis vector store.
# create and load redis with documents
vectorstore = RedisVectorStore.from_texts(
    texts=texts,
    metadatas=metadatas,
    embedding=embedding,
    index_name=index_name,
    redis_url=redis_url
)
03
Create the LangChain conversational chain
Now we are ready to create a chatbot that converses using the product data stored in Redis. Chatbots are popular because they are so useful. In the scenario we build below, we assume the user needs fashion advice.
1. To bring in more LangChain functionality, we need to import several LangChain tools.
from langchain.callbacks.base import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import (
    ConversationalRetrievalChain,
    LLMChain
)
from langchain.chains.question_answering import load_qa_chain
from langchain.llms import OpenAI
from langchain.prompts.prompt import PromptTemplate
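2. As mentioned in the introduction, this project uses a ConversationalRetrievalChain to simplify chatbot development.
Redis serves as our store for the complete product catalog, including the metadata and the OpenAI-generated embeddings that capture the semantic properties of the product content. Using Redis Vector Similarity Search (VSS) under the hood, our chatbot queries the catalog directly to find the products most similar or relevant to what the user is shopping for. This means you do not need tedious keyword searches or manual filtering; VSS handles that for you.
The ConversationalRetrievalChain behind the chatbot works in three stages:
Question creation: the chatbot evaluates the incoming question and uses an OpenAI GPT model to combine it with knowledge from earlier turns of the conversation (if any). This helps the bot understand the shopper's question and gives the retrieval step accurate context.
Retrieval: the chatbot searches Redis for the best available products based on the items the shopper has expressed interest in. Redis Vector Similarity Search (VSS) lets it find products matching the shopper's needs quickly and accurately.
Question answering: the chatbot takes the product results from the vector search query and uses an OpenAI GPT model to help the shopper browse the options. The bot can generate suitable answers covering details such as product features, price, and reviews, where the retrieved catalog entries provide them, to help the shopper decide.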
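The three stages above can be sketched as ordinary functions. This is a minimal stand-in under loud assumptions, not how LangChain implements the chain: the two LLM calls are replaced by string stubs, a naive keyword overlap takes the place of vector search, and every function name here is hypothetical.

```python
def condense_question(question, chat_history):
    """Stage 1 (stub): fold the previous turn into a standalone question."""
    if not chat_history:
        return question
    last_question, _last_answer = chat_history[-1]
    return f"{question} (follow-up to: {last_question})"


def retrieve_products(standalone_question, catalog):
    """Stage 2 (stub): naive keyword overlap standing in for vector search."""
    words = set(standalone_question.lower().split())
    return [item for item in catalog if words & set(item.lower().split())]


def answer_question(products):
    """Stage 3 (stub): a fixed sentence standing in for the LLM's answer."""
    if not products:
        return "Sorry, I could not find a match."
    return "You might like: " + ", ".join(products)


def run_chain(question, chat_history, catalog):
    """Run the three stages in order and return a result dictionary."""
    standalone = condense_question(question, chat_history)
    products = retrieve_products(standalone, catalog)
    return {"question": question, "answer": answer_question(products)}


catalog = ["gold earrings", "leather shoes"]

first = run_chain("gold jewelry", [], catalog)
history = [(first["question"], first["answer"])]

# The follow-up mentions no product on its own; stage 1 restores the topic.
second = run_chain("anything cheaper?", history, catalog)
print(first["answer"])
print(second["answer"])
```

The second call shows why the question-creation stage matters: "anything cheaper?" alone matches nothing in the catalog, but once the earlier turn is folded in, retrieval can find the earrings again.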
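3. Although LangChain and Redis greatly streamline this workflow, interacting with a large language model (LLM) such as GPT requires a "prompt". We write a set of instructions as the prompt to steer the model's behavior toward the desired outcome. Getting the best results from the chatbot takes further refinement of the prompts.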
# Prompt for the question-creation stage
template = """Given the following chat history and a follow up question, rephrase the follow up input question to be a standalone question.
Or end the conversation if it seems like it's done.
Chat History:\"""
{chat_history}
\"""
Follow Up Input: \"""
{question}
\"""
Standalone question:"""

condense_question_prompt = PromptTemplate.from_template(template)

# Prompt for the question-answering stage; {context} receives the retrieved
# product documents and {question} receives the standalone question
template = """You are a friendly, conversational retail shopping assistant. Use the following context including product names, descriptions, and keywords to show the shopper what's available, help find what they want, and answer any questions.
It's ok if you don't know the answer.
Context:\"""
{context}
\"""
Question:\"""
{question}
\"""
Helpful Answer:"""

qa_prompt = PromptTemplate.from_template(template)
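To make the placeholders concrete, here is a small sketch of how a template like the first one gets filled in. It uses Python's built-in `str.format` as a stand-in for `PromptTemplate`, and the `format_history` helper with its "Human:"/"Assistant:" labels is an assumption made for illustration, not a guarantee of how LangChain renders the history.

```python
CONDENSE_TEMPLATE = (
    "Given the following chat history and a follow up question, "
    "rephrase the follow up input question to be a standalone question.\n"
    'Chat History:\n"""\n{chat_history}\n"""\n'
    'Follow Up Input: """{question}"""\n'
    "Standalone question:"
)


def format_history(chat_history):
    """Render (question, answer) tuples as alternating speaker lines."""
    lines = []
    for human, assistant in chat_history:
        lines.append(f"Human: {human}")
        lines.append(f"Assistant: {assistant}")
    return "\n".join(lines)


history = [
    ("gold-plated earrings", "Do you prefer yellow gold or platinum plating?"),
]

# Fill both placeholders; the result is the text the model would receive
filled = CONDENSE_TEMPLATE.format(
    chat_history=format_history(history),
    question="yellow gold please",
)
print(filled)
```

Printing the filled prompt like this is a cheap way to check wording changes before spending API calls on them.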
4. Next, we define two OpenAI LLMs and wrap each in a chain, one for question generation and one for question answering. The streaming_llm lets us pipe the chatbot's response to stdout token by token, giving the user a chatbot-like experience.
# define two LLM models from OpenAI
llm = OpenAI(temperature=0)

streaming_llm = OpenAI(
    streaming=True,
    callback_manager=CallbackManager([
        StreamingStdOutCallbackHandler()
    ]),
    verbose=True,
    max_tokens=150,
    temperature=0.2
)

# use the LLM Chain to create a question creation chain
question_generator = LLMChain(
    llm=llm,
    prompt=condense_question_prompt
)

# use the streaming LLM to create a question answering chain
doc_chain = load_qa_chain(
    llm=streaming_llm,
    chain_type="stuff",
    prompt=qa_prompt
)
5. Finally, we tie all three steps together with the ConversationalRetrievalChain.
chatbot = ConversationalRetrievalChain(
    retriever=vectorstore.as_retriever(),
    combine_docs_chain=doc_chain,
    question_generator=question_generator
)
04
Your virtual shopping assistant is ready
1. Keep in mind that this is not an all-purpose chat AI. With the help of Redis, however, which stores knowledge of the complete product catalog, we can shape a pretty good experience.
# create a chat history buffer
chat_history = []

# gather user input for the first question to kick off the bot
question = input("Hi! What are you looking for today?")

# keep the bot running in a loop to simulate a conversation
while True:
    result = chatbot(
        {"question": question, "chat_history": chat_history}
    )
    print("\n")
    chat_history.append((result["question"], result["answer"]))
    question = input()
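The loop above runs until the process is interrupted. A possible variant, sketched here under assumptions, stops on "quit", "exit", or end of input and skips blank lines. The `chat_loop` function and `echo_bot` stub are hypothetical names; the stub stands in for the real chain so the example runs without Redis or OpenAI, and scripted inputs replace the keyboard.

```python
def chat_loop(bot, read_input=input, write=print):
    """Run a conversation until the user quits; return the chat history."""
    chat_history = []
    write("Hi! What are you looking for today?")
    while True:
        try:
            question = read_input().strip()
        except EOFError:
            break  # input stream closed
        if question.lower() in {"quit", "exit"}:
            break
        if not question:
            continue  # ignore empty lines instead of querying the bot
        result = bot({"question": question, "chat_history": chat_history})
        write(result["answer"])
        chat_history.append((question, result["answer"]))
    return chat_history


def echo_bot(inputs):
    """Stub standing in for the chatbot chain (assumption for this sketch)."""
    return {"answer": f"You asked about: {inputs['question']}"}


# Scripted inputs replace the keyboard so the sketch runs unattended
scripted = iter(["gold earrings", "", "quit"])
history = chat_loop(
    echo_bot,
    read_input=lambda: next(scripted),
    write=lambda _text: None,
)
print(history)
```

Passing `read_input` and `write` as parameters keeps the loop testable; with the defaults it behaves like an ordinary terminal chat.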
2. The bot interacts with you in real time and helps you narrow down your choices based on the items in the catalog. Here is a simple example:
Hi! What are you looking for today?
>> gold-plated earrings
Hi there! I'm happy to help you find the perfect earrings. Do you have a preference for yellow gold plated sterling silver or platinum or gold-plated sterling silver?
>> My preference is the yellow gold plated sterling silver
Hi there! Are you looking for yellow gold-plated sterling silver earrings with Swarovski Zirconia or Topaz gemstones? We have a few options that might fit the bill. We have yellow gold-plated sterling silver Swarovski Zirconia fancy green stud earrings, yellow gold-plated sterling silver honey topaz stud earrings made with Swarovski Topaz gemstones, and yellow gold-plated sterling silver antique drop earrings set.
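3. After the chatbot greets you with "Hi! What are you looking for today?", try some of the sample prompts, or make up your own.
05
Customize your chain for better performance
1. One of the best parts of LangChain is that every class abstraction can be extended or given your own presets. Here we customize the BaseRetriever class to do some document preprocessing before returning results.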
import json
from langchain.schema import BaseRetriever
from langchain.vectorstores import VectorStore
from langchain.schema import Document
from pydantic import BaseModel


class RedisProductRetriever(BaseRetriever, BaseModel):
    vectorstore: VectorStore

    class Config:
        arbitrary_types_allowed = True

    def combine_metadata(self, doc) -> str:
        # merge name, description, and keywords into one text block
        metadata = doc.metadata
        return (
            "Item Name: " + metadata["item_name"] + ". " +
            "Item Description: " + metadata["bullet_point"] + ". " +
            "Item Keywords: " + metadata["item_keywords"] + "."
        )

    def get_relevant_documents(self, query):
        docs = []
        for doc in self.vectorstore.similarity_search(query):
            content = self.combine_metadata(doc)
            docs.append(Document(
                page_content=content,
                metadata=doc.metadata
            ))
        return docs
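The `combine_metadata` method above concatenates three fields directly, so it assumes each one is present and is a string. A more defensive variant could look like the sketch below. It is written as a standalone function over a plain dictionary so it runs without LangChain, and the choice to skip empty fields is an assumption of this example, not something the original code does.

```python
def combine_metadata(metadata):
    """Build the document text, skipping missing or empty fields."""
    fields = [
        ("Item Name", "item_name"),
        ("Item Description", "bullet_point"),
        ("Item Keywords", "item_keywords"),
    ]
    parts = []
    for label, key in fields:
        value = metadata.get(key)
        # only keep non-empty strings; anything else is treated as missing
        if isinstance(value, str) and value.strip():
            parts.append(f"{label}: {value.strip()}.")
    return " ".join(parts)


example = {
    "item_name": "Stud earrings",
    "bullet_point": "",  # empty description is left out of the result
    "item_keywords": "gold earrings",
}
combined = combine_metadata(example)
print(combined)
```

Leaving empty fields out keeps labels such as "Item Description: ." from reaching the prompt, where they would only take up tokens.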
2. We need to update the retriever and the chatbot to use the custom implementation above.
redis_product_retriever = RedisProductRetriever(vectorstore=vectorstore)

chatbot = ConversationalRetrievalChain(
    retriever=redis_product_retriever,
    combine_docs_chain=doc_chain,
    question_generator=question_generator
)
3. And that's it! Your chatbot can now bring more product information into the conversation. Here is another short example conversation:
Hi! What are you looking for today?
>>> fancy footwear for going out
Hi there! We have a few great options for women's shoes and sandals. We have the Amazon Brand - The Fix Women's Giana Open Toe Bootie with Pearl Buckle, bright white leather, 9.5 B US, Flavia Women's Beige Fashion Sandals-7 UK (39 EU) (8 US) (FL/236/BEG), Flavia Women's Blue Fashion Sandals-8 UK (40 EU) (9 US) (FL/211/BLU), and The Fix Women's Faris Flat Slide Sandal with Pearls. All of these shoes feature a variety of styles and colors to choose from. Let me know if you have any questions about any of these items!
>>> These are nice. However, I am looking for men's shoes. Can you help me?
Hi there! We have a great selection of men's formal shoes available. We have Amazon Brand - Symbol Men's Formal Shoes, Amazon Brand - Symbol Men's Leather Formal Shoes, and more. All of our formal shoes are made from high quality materials and feature a variety of closure types, toe styles, and heel types. They also come with a manufacturer's warranty and care instructions to ensure they last. Let me know if you have any questions or need help finding the perfect pair of shoes for you!
>>> Can you show me some more men's options?
Hi there! I'm here to help you find the perfect item for you. We have a few options available for men's formal shoes. We have the Men's Stainless Steel Link Bracelet, the Amazon Brand - Arthur Harvey Men's Leather Formal Shoes, and the Amazon Brand - Symbol Men's Formal Derby shoes. All of these items feature a variety of features such as leather material, lace-up closure, pointed toe, block heel, and more. If you have any questions about any of these items, please let me know. I'm happy to help!
>>> Ok this looks great, thanks!