Description
llm_answer.GPTAnswer._format_reference: If relevant_docs_list contains fewer than retriever.EmbeddingRetriever.TOP_K relevant docs, an IndexError is raised while formatting the references.
https://github.com/Wilson-ZheLin/GPT-4-Web-Browsing/blob/038b74ba3ab76f7e3ba7c1d9f33250120f376735/src/llm_answer.py#L24-L33
To Reproduce
- In main.py: Provide a Google search query that yields fewer than retriever.EmbeddingRetriever.TOP_K total documents.
  - E.g., "CS 224" harvard computer science ext:html inurl:index. Google/Serper returns only one result for this query. When scraped, the document text is shorter than 1000 characters (the default chunk size), so it produces a single chunk.
- In serper_service.SerperClient.serper, change serper_settings["page"] to 1.
- Run main.py
Expected Behavior
If relevant_docs_list contains fewer than retriever.EmbeddingRetriever.TOP_K relevant docs, then all of the relevant documents should be used in the formatted reference, and no IndexError should be raised.
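
One possible fix, sketched under assumptions: the snippet below is not the repository's code. `Doc`, `format_reference`, and the returned dictionary keys are hypothetical stand-ins, and the module-level `TOP_K` only mimics retriever.EmbeddingRetriever.TOP_K. It assumes each retrieved document exposes `page_content` and a `metadata` dict with a `url` key. The idea is to slice the list rather than index it with `range(TOP_K)`, since slicing a short list just returns fewer items.

```python
from dataclasses import dataclass, field

TOP_K = 3  # assumed stand-in for retriever.EmbeddingRetriever.TOP_K


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document chunk."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_reference(relevant_docs_list, top_k=TOP_K):
    """Format at most top_k references without indexing past the end of the list."""
    # Slicing never raises IndexError: a short list simply yields fewer items.
    selected = relevant_docs_list[:top_k]
    return [
        {"idx": i, "url": doc.metadata.get("url", ""), "content": doc.page_content}
        for i, doc in enumerate(selected, start=1)
    ]


# A single short document, as in the reproduction steps above.
docs = [Doc("Only one short chunk.", {"url": "https://example.com"})]
refs = format_reference(docs)
print(len(refs))
```

An equivalent option would be to keep the index-based loop and bound it with `min(len(relevant_docs_list), TOP_K)`.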