Reading the LangChain Quickstart (3): Using Conversation History

This series introduces how to use LangChain, based on the contents of the LangChain Quickstart.

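The chains built so far all generated an answer to a single, standalone question.
This time, the chain from the previous article, which answers by referring to documents,
is extended so that it also takes the conversation history into account.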

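The following components from the previous article are used:

  • retriever: a retriever that returns a list of documents relevant to an input string.
  • document_chain: a chain that generates the LLM's answer from the user's question and a list of documents.
  • create_retrieval_chain: a function that uses retriever and document_chain to create a chain that answers by referring to documents.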

The code below prepares the LLM and the retriever:

import os
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import FAISS
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Set the API key as an environment variable
with open('.openai') as f:
    os.environ['OPENAI_API_KEY'] = f.read().strip()

# Load the LLM
llm = ChatOpenAI()

# Load the content of a web page as documents
loader = WebBaseLoader("https://python.langchain.com/docs/get_started/introduction")
docs = loader.load()

# Load the embeddings model
embeddings = OpenAIEmbeddings()

# Split the documents
text_splitter = RecursiveCharacterTextSplitter()
documents = text_splitter.split_documents(docs)

# Vectorize the split documents with the embeddings and build a vector store
vector = FAISS.from_documents(documents, embeddings)

# Create the retriever
retriever = vector.as_retriever()

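To take the conversation history into account, the following two adjustments are made.

  1. Change retriever so that it refers to the conversation history.
  2. Change document_chain so that it refers to the conversation history.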

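To extract relevant documents with the conversation history in mind, we create a retriever that works as follows.

  1. Use the LLM to generate a search query from the conversation history.
  2. Use the generated query to fetch the matching documents from the vector store.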
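
The two steps above can be sketched without any LangChain dependency. This is only a minimal illustration: `history_aware_retrieve`, `fake_llm`, `keyword_retriever`, and `CORPUS` are hypothetical stand-ins that do not appear in the Quickstart, and a real setup would call the LLM and the vector store instead.

```python
from typing import Callable, List, Tuple

Message = Tuple[str, str]  # (role, content)


def history_aware_retrieve(
    generate_query: Callable[[List[Message]], str],
    retrieve: Callable[[str], List[str]],
    chat_history: List[Message],
    user_input: str,
) -> List[str]:
    # Step 1: build a search query from the history plus the new input
    messages = chat_history + [("user", user_input)]
    query = generate_query(messages)
    # Step 2: fetch documents with the generated query
    return retrieve(query)


def fake_llm(messages: List[Message]) -> str:
    # Hypothetical stand-in for the LLM: reuse the topic word of the
    # first user message and append the latest input
    first_user = next(content for role, content in messages if role == "user")
    topic = first_user.split()[-1].rstrip("?")
    return f"{topic} {messages[-1][1]}"


# Hypothetical toy corpus standing in for the vector store
CORPUS = ["LangChain usage guide", "FAISS index tuning", "LangChain installation"]


def keyword_retriever(query: str) -> List[str]:
    # Return every document that shares at least one word with the query
    words = {w.lower() for w in query.split()}
    return [d for d in CORPUS if words & {w.lower() for w in d.split()}]


history = [("user", "How do I install LangChain?"), ("ai", "pip install langchain")]
docs = history_aware_retrieve(fake_llm, keyword_retriever, history, "usage")
print(docs)
```

The point of the sketch is that the follow-up input alone ("usage") says nothing about LangChain; the query only becomes specific once the history is folded into it.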

First, create a prompt template that uses the conversation history.
MessagesPlaceholder makes it possible to pass a list of messages as input.

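from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# The chat history comes first, followed by the latest user input
# and an instruction to produce a search query.
template = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    # "Generate one search query to obtain information related to the conversation above."
    ("user", "上記会話に関連する情報を得るための検索クエリを1つ生成してください。")
])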
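
What the placeholder does can be pictured as splicing a whole list of messages into the template at that position. The following dependency-free sketch mimics this idea; `expand_template` is a hypothetical helper written for illustration, not the way LangChain implements it.

```python
from typing import Dict, List, Tuple, Union

Message = Tuple[str, str]  # (role, content)


def expand_template(
    template: List[Union[Message, str]],
    variables: Dict[str, object],
) -> List[Message]:
    # Plain-string variables fill "{name}" slots in message text
    text_vars = {k: v for k, v in variables.items() if isinstance(v, str)}
    messages: List[Message] = []
    for item in template:
        if isinstance(item, str):
            # A bare string acts as a placeholder: splice in the whole list
            messages.extend(variables[item])
        else:
            role, text = item
            messages.append((role, text.format(**text_vars)))
    return messages


sketch_template = [
    "chat_history",
    ("user", "{input}"),
    ("user", "Generate one search query for the conversation above."),
]
expanded = expand_template(sketch_template, {
    "chat_history": [
        ("human", "How do I install LangChain?"),
        ("ai", "pip install langchain"),
    ],
    "input": "How do I use it?",
})
for role, content in expanded:
    print(f"{role}: {content}")
```

The two history messages end up ahead of the formatted user input, which is the ordering the template above asks for.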

As shown below, a list of messages is passed as chat_history.

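from langchain_core.messages import HumanMessage, AIMessage

# Chain that turns the conversation into a search query
query_chain = template | llm

chat_history = [
    # "How do I install LangChain?"
    HumanMessage(content="LangChainのインストール方法は?"),
    # "You can install it with pip install langchain."
    AIMessage(content="pip install langchain でインストール出来ます。")
]
search_query = query_chain.invoke({
    "chat_history": chat_history,
    "input": "使い方は?"  # "How do I use it?"
})
search_query
  • Output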
AIMessage(content='LangChainの使い方やデモンストレーションに関する情報を得るための検索クエリ:\n\n- "LangChain usage examples"\n- "LangChain demonstration tutorial"\n- "How to use LangChain for language processing"', response_metadata={'token_usage': {'completion_tokens': 61, 'prompt_tokens': 76, 'total_tokens': 137}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-xxxxx')
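
Note that the recorded output above contains an introductory sentence and three quoted queries, even though the prompt asked for one. If a single query string is wanted, one option is to post-process the text before searching. The helper `first_quoted_query` below is hypothetical (it is not part of LangChain) and assumes that queries are wrapped in double quotes, as they happen to be in this output.

```python
import re


def first_quoted_query(text: str) -> str:
    """Return the first double-quoted phrase, or the stripped text if there is none."""
    match = re.search(r'"([^"]+)"', text)
    return match.group(1) if match else text.strip()


# Shortened, English stand-in for the content of the AIMessage above
content = (
    'Search queries:\n\n'
    '- "LangChain usage examples"\n'
    '- "LangChain demonstration tutorial"'
)
query = first_quoted_query(content)
print(query)
```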

Next, the generated search query is used to fetch relevant documents. This is easily done with the retriever created in the previous article.

retriever.invoke(search_query.content)
  • Output

    [Document(page_content="Skip to main contentComponentsIntegrationsGuidesAPI ReferenceMorePeopleVersioningContributingTemplatesCookbooksTutorialsYouTube🦜️🔗LangSmithLangSmith DocsLangServe GitHubTemplates GitHubTemplates HubLangChain HubJS/TS Docs💬SearchGet startedIntroductionQuickstartInstallationUse casesQ&A with RAGExtracting structured outputChatbotsTool use and agentsQuery analysisQ&A over SQL + CSVMoreExpression LanguageGet startedRunnable interfacePrimitivesAdvantages of LCELStreamingAdd message history (memory)MoreEcosystem🦜🛠️ LangSmith🦜🕸️LangGraph🦜️🏓 LangServeSecurityGet startedOn this pageIntroductionLangChain is a framework for developing applications powered by large language models (LLMs).LangChain simplifies every stage of the LLM application lifecycle:Development: Build your applications using LangChain's open-source building blocks and components. Hit the ground running using third-party integrations and Templates.Productionization: Use LangSmith to inspect, monitor and evaluate your chains, so that you can continuously optimize and deploy with confidence.Deployment: Turn any chain into an API with LangServe.Concretely, the framework consists of the following open-source libraries:langchain-core: Base abstractions and LangChain Expression Language.langchain-community: Third party integrations.Partner packages (e.g. 
langchain-openai, langchain-anthropic, etc.): Some integrations have been further split into their own lightweight packages that only depend on langchain-core.langchain: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.langgraph: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.langserve: Deploy LangChain chains as REST APIs.The broader ecosystem includes:LangSmith: A developer platform that lets you debug, test, evaluate, and monitor LLM applications and seamlessly integrates with LangChain.Get started\u200bWe recommend following our Quickstart guide to familiarize yourself with the framework by building your first LangChain application.See here for instructions on how to install LangChain, set up your environment, and start building.noteThese docs focus on the Python LangChain library. Head here for docs on the JavaScript LangChain library.Use cases\u200bIf you're looking to build something specific or are more of a hands-on learner, check out our use-cases.", metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'}),
     Document(page_content='Introduction | 🦜️🔗 LangChain', metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'}),
     Document(page_content="They're walkthroughs and techniques for common end-to-end tasks, such as:Question answering with RAGExtracting structured outputChatbotsand more!Expression Language\u200bLangChain Expression Language (LCEL) is the foundation of many of LangChain's components, and is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.Get started: LCEL and its benefitsRunnable interface: The standard interface for LCEL objectsPrimitives: More on the primitives LCEL includesand more!Ecosystem\u200b🦜🛠️ LangSmith\u200bTrace and evaluate your language model applications and intelligent agents to help you move from prototype to production.🦜🕸️ LangGraph\u200bBuild stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain primitives.🦜🏓 LangServe\u200bDeploy LangChain runnables and chains as REST APIs.Security\u200bRead up on our Security best practices to make sure you're developing safely with LangChain.Additional resources\u200bComponents\u200bLangChain provides standard, extendable interfaces and integrations for many different components, including:Integrations\u200bLangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. 
Check out our growing list of integrations.Guides\u200bBest practices for developing with LangChain.API reference\u200bHead to the reference section for full documentation of all classes and methods in the LangChain and LangChain Experimental Python packages.Contributing\u200bCheck out the developer's guide for guidelines on contributing and help getting your dev environment set up.Help us out by providing feedback on this documentation page:NextIntroductionGet startedUse casesExpression LanguageEcosystem🦜🛠️ LangSmith🦜🕸️ LangGraph🦜🏓 LangServeSecurityAdditional resourcesComponentsIntegrationsGuidesAPI referenceContributingCommunityDiscordTwitterGitHubPythonJS/TSMoreHomepageBlogYouTubeCopyright © 2024 LangChain, Inc.", metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'})]
    

With create_history_aware_retriever, you can create a retriever that performs the steps so far automatically.

from langchain.chains import create_history_aware_retriever
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Prompt used to turn the conversation into a search query
template = ChatPromptTemplate.from_messages([
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
    # "Generate a search query to obtain information related to the conversation above."
    ("user", "上記会話に関連する情報を得るための検索クエリを生成してください。")
])
retriever_chain = create_history_aware_retriever(llm, retriever, template)
  • Example

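    chat_history = [
        # "How do I install LangChain?"
        HumanMessage(content="LangChainのインストール方法は?"),
        # "You can install it with pip install langchain."
        AIMessage(content="pip install langchain でインストール出来ます。")
    ]
    context = retriever_chain.invoke({
        "chat_history": chat_history,
        "input": "もう少し詳しく教えて下さい。"  # "Please explain in a little more detail."
    })
    context
    
    • Output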

      [Document(page_content="Skip to main contentLangChain v0.2 is coming soon! Preview the new docs here.ComponentsIntegrationsGuidesAPI ReferenceMorePeopleVersioningContributingTemplatesCookbooksTutorialsYouTube🦜️🔗LangSmithLangSmith DocsLangServe GitHubTemplates GitHubTemplates HubLangChain HubJS/TS Docs💬SearchGet startedIntroductionQuickstartInstallationUse casesQ&A with RAGExtracting structured outputChatbotsTool use and agentsQuery analysisQ&A over SQL + CSVMoreExpression LanguageGet startedRunnable interfacePrimitivesAdvantages of LCELStreamingAdd message history (memory)MoreEcosystem🦜🛠️ LangSmith🦜🕸️LangGraph🦜️🏓 LangServeSecurityGet startedOn this pageIntroductionLangChain is a framework for developing applications powered by large language models (LLMs).LangChain simplifies every stage of the LLM application lifecycle:Development: Build your applications using LangChain's open-source building blocks and components. Hit the ground running using third-party integrations and Templates.Productionization: Use LangSmith to inspect, monitor and evaluate your chains, so that you can continuously optimize and deploy with confidence.Deployment: Turn any chain into an API with LangServe.Concretely, the framework consists of the following open-source libraries:langchain-core: Base abstractions and LangChain Expression Language.langchain-community: Third party integrations.Partner packages (e.g. 
langchain-openai, langchain-anthropic, etc.): Some integrations have been further split into their own lightweight packages that only depend on langchain-core.langchain: Chains, agents, and retrieval strategies that make up an application's cognitive architecture.langgraph: Build robust and stateful multi-actor applications with LLMs by modeling steps as edges and nodes in a graph.langserve: Deploy LangChain chains as REST APIs.The broader ecosystem includes:LangSmith: A developer platform that lets you debug, test, evaluate, and monitor LLM applications and seamlessly integrates with LangChain.Get started\u200bWe recommend following our Quickstart guide to familiarize yourself with the framework by building your first LangChain application.See here for instructions on how to install LangChain, set up your environment, and start building.noteThese docs focus on the Python LangChain library. Head here for docs on the JavaScript LangChain library.Use cases\u200bIf you're looking to build something specific or are more of a hands-on learner, check out our use-cases.", metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'}),
      Document(page_content='Introduction | 🦜️🔗 LangChain', metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'}),
      Document(page_content="They're walkthroughs and techniques for common end-to-end tasks, such as:Question answering with RAGExtracting structured outputChatbotsand more!Expression Language\u200bLangChain Expression Language (LCEL) is the foundation of many of LangChain's components, and is a declarative way to compose chains. LCEL was designed from day 1 to support putting prototypes in production, with no code changes, from the simplest “prompt + LLM” chain to the most complex chains.Get started: LCEL and its benefitsRunnable interface: The standard interface for LCEL objectsPrimitives: More on the primitives LCEL includesand more!Ecosystem\u200b🦜🛠️ LangSmith\u200bTrace and evaluate your language model applications and intelligent agents to help you move from prototype to production.🦜🕸️ LangGraph\u200bBuild stateful, multi-actor applications with LLMs, built on top of (and intended to be used with) LangChain primitives.🦜🏓 LangServe\u200bDeploy LangChain runnables and chains as REST APIs.Security\u200bRead up on our Security best practices to make sure you're developing safely with LangChain.Additional resources\u200bComponents\u200bLangChain provides standard, extendable interfaces and integrations for many different components, including:Integrations\u200bLangChain is part of a rich ecosystem of tools that integrate with our framework and build on top of it. 
Check out our growing list of integrations.Guides\u200bBest practices for developing with LangChain.API reference\u200bHead to the reference section for full documentation of all classes and methods in the LangChain and LangChain Experimental Python packages.Contributing\u200bCheck out the developer's guide for guidelines on contributing and help getting your dev environment set up.Help us out by providing feedback on this documentation page:NextIntroductionGet startedUse casesExpression LanguageEcosystem🦜🛠️ LangSmith🦜🕸️ LangGraph🦜🏓 LangServeSecurityAdditional resourcesComponentsIntegrationsGuidesAPI referenceContributingCommunityDiscordTwitterGitHubPythonJS/TSMoreHomepageBlogYouTubeCopyright © 2024 LangChain, Inc.", metadata={'source': 'https://python.langchain.com/docs/get_started/introduction', 'title': 'Introduction | 🦜️🔗 LangChain', 'description': 'LangChain is a framework for developing applications powered by large language models (LLMs).', 'language': 'en'})]
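
To my understanding of the library, the retriever returned by create_history_aware_retriever skips query generation when chat_history is empty and passes input straight to the underlying retriever; otherwise it runs the prompt and the LLM first. The sketch below mimics that branching in plain Python. `make_history_aware_retriever`, `stub_rewrite`, and `stub_retrieve` are hypothetical names used only for illustration.

```python
from typing import Callable, Dict, List


def make_history_aware_retriever(
    rewrite: Callable[[Dict], str],
    retrieve: Callable[[str], List[str]],
) -> Callable[[Dict], List[str]]:
    def run(inputs: Dict) -> List[str]:
        if not inputs.get("chat_history"):
            # No history: search with the raw user input
            return retrieve(inputs["input"])
        # History present: rewrite the query first (stands in for prompt + LLM)
        return retrieve(rewrite(inputs))
    return run


calls: List[str] = []


def stub_rewrite(inputs: Dict) -> str:
    # Records that the (stubbed) LLM step was used
    calls.append("rewrite")
    return "LangChain installation details"


def stub_retrieve(query: str) -> List[str]:
    return [f"doc for: {query}"]


chain = make_history_aware_retriever(stub_rewrite, stub_retrieve)
without_history = chain({"input": "What is LangChain?"})
with_history = chain({
    "chat_history": [("human", "How do I install LangChain?")],
    "input": "Tell me more.",
})
print(without_history)
print(with_history)
```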
      

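Next, document_chain, which generates the LLM's answer from the user input and a list of documents,
is changed so that it refers to the conversation history.

This can be done by modifying the template as follows.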

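from langchain.chains.combine_documents import create_stuff_documents_chain

# {context} receives the retrieved documents; the chat history is
# inserted before the latest user input.
template = ChatPromptTemplate.from_messages([
    # "Answer the user's question based on the following context:"
    ("system", "次の文脈を元にユーザの質問に回答してください:\n\n{context}"),
    MessagesPlaceholder(variable_name="chat_history"),
    ("user", "{input}"),
])
document_chain = create_stuff_documents_chain(llm, template)
  • Example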

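    chat_history = [
        # "How do I install LangChain?"
        HumanMessage(content="LangChainのインストール方法は?"),
        # "You can install it with pip install langchain."
        AIMessage(content="pip install langchain でインストール出来ます。")
    ]
    document_chain.invoke({
        "context": context,
        "chat_history": chat_history,
        "input": "もう少し詳しく教えて下さい。"  # "Please explain in a little more detail."
    })
    
    • Output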

      'LangChainをインストールするためには、以下の手順を実行してください:\n\n1. pipを使用してLangChainをインストールします:\n```\npip install langchain\n```\n\n2. 環境をセットアップするために、LangChainの開発環境を構築します。これには、Pythonの開発環境や必要な依存関係のインストールが含まれます。\n\n3. LangChainを使用してアプリケーションを構築する準備が整いました。Quickstartガイドを参照して、最初のLangChainアプリケーションを構築してください。\n\n詳細な手順や環境構築の詳細については、LangChainのドキュメントを参照してください。'
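
The "stuff" approach simply places the text of every document into the {context} slot of the prompt. As far as I know, the default is to join each document's page_content with a blank line. The sketch below shows the resulting message order; `Doc` and `build_messages` are hypothetical stand-ins, and the English system text is a translation of the template above.

```python
from dataclasses import dataclass
from typing import List, Tuple

Message = Tuple[str, str]  # (role, content)


@dataclass
class Doc:
    # Minimal stand-in for a LangChain Document
    page_content: str


def build_messages(
    docs: List[Doc],
    chat_history: List[Message],
    user_input: str,
    separator: str = "\n\n",
) -> List[Message]:
    # "Stuff" all documents into a single context string
    context = separator.join(d.page_content for d in docs)
    system = (
        "Answer the user's question based on the following context:"
        f"\n\n{context}"
    )
    # Order: system prompt with context, then history, then the new input
    return [("system", system), *chat_history, ("user", user_input)]


sample_docs = [Doc("LangChain is a framework."), Doc("Install it with pip.")]
sample_history = [
    ("human", "How do I install LangChain?"),
    ("ai", "pip install langchain"),
]
messages = build_messages(sample_docs, sample_history, "Tell me more.")
for role, content in messages:
    print(f"[{role}] {content}")
```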
      

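Finally, retriever_chain and document_chain are combined into a chain
that generates the LLM's answer from the conversation history and the retrieved documents.
As in the previous article, create_retrieval_chain can be used for this.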

from langchain.chains import create_retrieval_chain


retrieval_chain = create_retrieval_chain(retriever_chain, document_chain)
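
To my understanding, the dictionary returned by the chain keeps the original inputs and adds the retrieved documents under context and the generated text under answer. The sketch below mimics that data flow in plain Python; `make_retrieval_chain` and the two stub functions are hypothetical and only illustrate the shape of the result.

```python
from typing import Callable, Dict, List


def make_retrieval_chain(
    retrieve: Callable[[Dict], List[str]],
    answer: Callable[[Dict], str],
) -> Callable[[Dict], Dict]:
    def run(inputs: Dict) -> Dict:
        # First add the retrieved documents under "context"
        result = {**inputs, "context": retrieve(inputs)}
        # Then let the document chain see inputs + context and add "answer"
        result["answer"] = answer(result)
        return result
    return run


def stub_retriever_chain(inputs: Dict) -> List[str]:
    return ["LangChain is a framework for LLM applications."]


def stub_document_chain(inputs: Dict) -> str:
    return f"Based on {len(inputs['context'])} document(s): see the Quickstart."


sketch_chain = make_retrieval_chain(stub_retriever_chain, stub_document_chain)
sketch_response = sketch_chain({"chat_history": [], "input": "How do I use it?"})
print(sorted(sketch_response))
print(sketch_response["answer"])
```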

As shown below, an answer can be generated by passing in the conversation history and the user input.

chat_history = [
    # "How do I install LangChain?"
    HumanMessage(content="LangChainのインストール方法は?"),
    # "You can install it with pip install langchain."
    AIMessage(content="pip install langchain でインストール出来ます。")
]
response = retrieval_chain.invoke({
    "chat_history": chat_history,
    "input": "使い方は?"  # "How do I use it?"
})
print(response["answer"])
  • Output

    LangChainの使い方については、以下の手順を参考にしてください:
    
    1. LangChainのクイックスタートガイドに従って、最初のLangChainアプリケーションを構築してください。
    2. LangChainをインストールし、環境を設定してビルドを開始する方法については、こちらを参照してください。
    3. Python LangChainライブラリに関するドキュメントはこちらで提供されています。
    4. より具体的なユースケースや手順を知りたい場合は、利用事例をご覧ください。
    
    以上の手順に従うことで、LangChainの使用方法を理解し、開発を進めることができます。
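
To keep a conversation going over several turns, each question and its answer would be appended to chat_history before the next call. The loop below is a sketch under that assumption. `echo_chain` is a hypothetical stub that replaces retrieval_chain so the example runs without an API key, and plain tuples stand in for the HumanMessage and AIMessage objects a real chain would receive.

```python
from typing import Callable, Dict, List, Tuple

Message = Tuple[str, str]  # (role, content)


def echo_chain(inputs: Dict) -> Dict:
    # Hypothetical stub: numbers the turn based on the history length
    turn = len(inputs["chat_history"]) // 2 + 1
    return {**inputs, "answer": f"answer {turn} to: {inputs['input']}"}


def chat(chain: Callable[[Dict], Dict], questions: List[str]) -> List[Message]:
    chat_history: List[Message] = []
    for question in questions:
        response = chain({"chat_history": chat_history, "input": question})
        # Record both sides of the turn so the next call can see them
        chat_history = chat_history + [
            ("human", question),
            ("ai", response["answer"]),
        ]
    return chat_history


final_history = chat(echo_chain, ["How do I install LangChain?", "How do I use it?"])
for role, content in final_history:
    print(f"{role}: {content}")
```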
    
