File:Retrieval-Augmented Generation chatbot, part 1- LangChain, Hugging Face, FAISS, AWS.webm

From Wikimedia Commons, the free media repository
Jump to navigation Jump to search

Original file(WebM audio/video file, VP9/Opus, length 24 min 7 s, 1,920 × 1,080 pixels, 531 kbps overall, file size: 91.49 MB)

Captions

Captions

Add a one-line explanation of what this file represents

Summary

[edit]
Description
English: In this video, I'll guide you through the process of creating a Retrieval-Augmented Generation (RAG) chatbot using open-source tools and AWS services, such as LangChain, Hugging Face, FAISS, Amazon SageMaker, and Amazon TextTract.

Part 2: scaling indexing and search with Amazon OpenSearch Serverless!

⭐️⭐️⭐️ Don't forget to subscribe to be notified of future videos. Follow me on Medium at https://julsimon.medium.com or Substack at https://julsimon.substack.com. ⭐️⭐️⭐️

We begin by working with PDF files in the Energy domain. Our first step involves leveraging Amazon TextTract to extract valuable information from these PDFs. Following the extraction, we break down the text into smaller, more manageable chunks. These chunks are then enriched using a Hugging Face feature extraction model before being organized and stored within a FAISS index for efficient retrieval.

To ensure a seamless workflow, we employ LangChain to orchestrate the entire process. With LangChain as our backbone, we query a Mistral Large Language Model (LLM) deployed on Amazon SageMaker. These queries include semantically relevant context retrieved from our FAISS index, enabling our chatbot to provide accurate and context-aware responses.

- Notebook: https://gitlab.com/juliensimon/huggingface-demos/-/tree/main/langchain/rag-demo-sagemaker-textract - LangChain: https://www.langchain.com/ - FAISS: https://github.com/facebookresearch/faiss - Embedding leaderboard: https://huggingface.co/spaces/mteb/leaderboard - Embedding model: https://huggingface.co/BAAI/bge-small-en-v1.5

- LLM: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
Date
Source YouTube: Retrieval-Augmented Generation chatbot, part 1: LangChain, Hugging Face, FAISS, AWS – View/save archived versions on archive.org and archive.today
Author Julien Simon

Licensing

[edit]
This video, screenshot or audio excerpt was originally uploaded on YouTube under a CC license.
Their website states: "YouTube allows users to mark their videos with a Creative Commons CC BY license."
To the uploader: You must provide a link (URL) to the original file and the authorship information if available.
w:en:Creative Commons
attribution
This file is licensed under the Creative Commons Attribution 3.0 Unported license.
Attribution: Julien Simon
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
This file, which was originally posted to an external website, has not yet been reviewed by an administrator or reviewer to confirm that the above license is valid. See Category:License review needed for further instructions.

File history

Click on a date/time to view the file as it appeared at that time.

Date/TimeThumbnailDimensionsUserComment
current15:47, 11 September 202424 min 7 s, 1,920 × 1,080 (91.49 MB)Prototyperspective (talk | contribs)Imported media from https://www.youtube.com/watch?v=7kDaMz3Xnkw

The following page uses this file:

Transcode status

Update transcode status
Format Bitrate Download Status Encode time
VP9 1080P 924 kbps Completed 17:20, 11 September 2024 1 h 22 min 42 s
VP9 720P 542 kbps Completed 16:57, 11 September 2024 1 h 1 min 37 s
VP9 480P 316 kbps Completed 16:36, 11 September 2024 40 min 57 s
VP9 360P 199 kbps Completed 16:12, 11 September 2024 18 min 28 s
VP9 240P 138 kbps Completed 16:05, 11 September 2024 13 min 30 s
WebM 360P 728 kbps Completed 16:14, 11 September 2024 18 min 33 s
Streaming 144p (MJPEG) 1 Mbps Completed 15:53, 11 September 2024 1 min 38 s
Stereo (Opus) 75 kbps Completed 15:59, 11 September 2024 22 s
Stereo (MP3) 128 kbps Completed 15:59, 11 September 2024 29 s

Metadata