Curriculum for the course How to Benchmark Embedding Models On Your Own Data
Learn how to benchmark embedding models on your own data in this course for beginners.

In this course, you will learn:
- The limitations of extracting text from PDF files with Python libraries, and how to overcome them with the help of VLMs (Vision Language Models).
- How to divide the extracted text into chunks that preserve context.
- How to generate questions for each chunk using LLMs (Large Language Models).
- How to use embedding models to create vector representations of the chunks and questions.
- How to use both open source and proprietary embedding models.
- How to use llama.cpp to run models in the GGUF format locally on your machine.
- How to benchmark different embedding models using various metrics and statistical tests with the help of ranx.
- How to plot the vector representations to see whether clusters are forming.
- How to interpret the p-value that a statistical test provides.
- And much more!

You can find the slides, notebook, and scripts in this GitHub repository: https://github.com/ImadSaddik/Benchmark_Embedding_Models

The dataset is available here: https://huggingface.co/datasets/ImadSaddik/BenchmarkEmbeddingModelsCourse

To connect with Imad Saddik, check out his social accounts:
LinkedIn: https://www.linkedin.com/in/imadsaddik/
YouTube: https://www.youtube.com/@3CodeCampers
Website: https://imadsaddik.com/

⭐️ Course Contents ⭐️
(0:00:00) About the course
(0:06:05) Introduction
(0:17:58) Extracting text from PDF documents
(1:01:08) Divide text into coherent chunks
(1:23:10) Generate question-answer pairs from text chunks
(1:38:48) Embed text chunks and questions
(2:17:06) Statistical tests and metrics
(3:12:01) Expanding the dataset and adding more languages
(3:45:24) Conclusion

Watch Online Full Course: How to Benchmark Embedding Models On Your Own Data
Click here to watch on YouTube: How to Benchmark Embedding Models On Your Own Data
This video was first published on YouTube by freeCodeCamp. If the video does not appear here, you can always watch it on YouTube.