What is the best embedding model for OpenWebUI?
I am currently using Alibaba-NLP/gte-base-en-v1.5 but it's not very good.
As I understand it, embedding models are used to retrieve the relevant parts of PDFs, text documents, etc. according to the user's prompt. So I imported some Harry Potter books (.txt files) and asked the AI (Qwen2.5 32b) "can you recall the first paragraph of chapter 10?", but it says: "The provided context does not contain the text from Chapter 10, so I cannot recall the first paragraph. Could you provide more details or clarify your request based on available information?" When I checked the retrievals, they were completely different from what I wanted.
The settings I used for the "Top K" value and the "RAG Template" are from this article.
These are the retrievals. Not a single one of them includes a paragraph; all of them are just "CHAPTER (chapter number)" and "(chapter name)".
Update (for anyone who finds this post): I went into my .txt file and removed the newlines between the chapter name/number and the paragraphs, and now it retrieves the right paragraph, so I think the retrieval problem is solved! But my QwenTile 2.5 still isn't able to get the paragraph, even though the paragraph is the top retrieval (about 53% relevance). I think I'll try different LLMs.
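For anyone who would rather not edit the file by hand, here is a minimal sketch of the same cleanup in Python. It assumes the layout described above: a line starting with "CHAPTER", then the chapter name, then the text, separated by blank lines. The function name join_chapter_headings is just something made up for this example, and your files may be laid out differently.

```python
def join_chapter_headings(text: str) -> str:
    """Drop the blank lines between a chapter heading, its title, and the first paragraph."""
    out = []
    pending = 0  # how many upcoming non-blank lines should have the blank lines before them dropped
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: headings are upper-case lines that start with "CHAPTER".
        if stripped.startswith("CHAPTER"):
            out.append(stripped)
            pending = 2  # the chapter name line and the first paragraph line
            continue
        if pending > 0:
            if not stripped:
                continue  # skip a blank line that follows the heading or the chapter name
            pending -= 1
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "CHAPTER TEN\n\nTHE TITLE\n\nFirst paragraph.\n\nSecond paragraph."
    print(join_chapter_headings(sample))
```

This way the heading and the first paragraph are more likely to end up in the same chunk, so a question about "chapter 10" can match the text that actually contains the paragraph. Blank lines between ordinary paragraphs are left alone.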
Update 2: I have solved it. The solution was to remove the lines between the chapter name and the paragraph in my .txt file; once I did that, the retrieval works. But my LLM still wasn't able to use the retrievals, so I had to go to the model settings in the admin panel and change the context length in the advanced parameters from 2048 to 32k. (I don't think it needs to be 32k, it just makes my model slow, so I think I'll change it to 10k.) Now everything is working perfectly. Here are my settings and results:
IMPORTANT NOTE: Make sure to first start the chat with anything (e.g. hi, hello), because otherwise the LLM sometimes doesn't answer, doesn't work, or takes forever.
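About the context length change in Update 2: my guess is that with a 2048-token context the retrieved chunks plus the question did not all fit, so part of the prompt was cut off. Below is a rough sketch for checking that. The 4-characters-per-token rule is only a rule of thumb for English text (not the model's real tokenizer), and the chunk sizes and the 512-token reserve for the answer are made-up example values, not my actual settings.

```python
def estimate_tokens(text: str) -> int:
    # Rule of thumb (assumption): roughly 4 characters per token for English text.
    return len(text) // 4 + 1


def fits_in_context(chunks, question, context_length, reserve_for_answer=512):
    """Return True if the retrieved chunks, the question, and room for the answer fit."""
    prompt_tokens = sum(estimate_tokens(c) for c in chunks) + estimate_tokens(question)
    return prompt_tokens + reserve_for_answer <= context_length


if __name__ == "__main__":
    # Hypothetical example: 3 retrieved chunks of about 4000 characters each.
    chunks = ["x" * 4000] * 3
    question = "can you recall the first paragraph of chapter 10?"
    for limit in (2048, 10240, 32768):
        print(limit, fits_in_context(chunks, question, limit))
```

If the estimate says the prompt does not fit, either raise the context length or lower the "Top K" value so that fewer chunks are sent to the model.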