Large Language Models and RAG
What makes LLMs special is that they can be asked questions in natural language and respond in natural language. The knowledge required for this was stored in the model's parameters during extensive training on huge amounts of data (“parametric memory”). The sometimes inadequate quality of the answers, especially in specific domains, can be improved through “in-context learning” approaches using prompt engineering methods, or through the more complex fine-tuning of the LLM. These options span two dimensions of improvement: changing the capabilities of the LLM itself and/or improving the quality of the context provided to it; the second dimension is sketched in code below.
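To make the second dimension concrete, here is a minimal sketch of in-context learning via prompt engineering: the same question is asked once without context (relying on parametric memory alone) and once with domain context placed directly in the prompt. The prompt wording and the example data are illustrative assumptions, not taken from the article.

```python
# Minimal sketch: improving answer quality by providing context in
# the prompt ("in-context learning") instead of retraining the model.
from typing import Optional

def build_prompt(question: str, context: Optional[str] = None) -> str:
    """Assemble a prompt; optionally prepend domain-specific context."""
    if context:
        return (
            "Answer the question using only the context below.\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}\nAnswer:"
        )
    # Without context, the model must rely on its parametric memory.
    return f"Question: {question}\nAnswer:"

question = "What fee applies to SEPA instant transfers?"
context = "Fee schedule (2024): SEPA instant transfers cost EUR 0.50 each."

print(build_prompt(question))           # parametric memory only
print(build_prompt(question, context))  # context-augmented prompt
```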
The previous article in this series on generative artificial intelligence highlighted the characteristics and weaknesses of LLMs as well as approaches for possible improvements. One dimension of LLM improvement, however, has not yet been discussed in detail: contextual enhancement by means of Retrieval-Augmented Generation (RAG), a process that enables the LLM to draw on external knowledge and thus provide up-to-date, domain-specific answers. In this blog post, Wilhelm Niehoff presents the RAG method and current development approaches.
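As a preview of the RAG process presented in the rest of the post, the following sketch shows its two core steps: retrieve the document most relevant to a query from an external knowledge source, then place it into the prompt as context. The toy bag-of-words retriever and the sample documents are assumptions made for illustration; real systems use learned embedding models and a vector database.

```python
# Toy RAG pipeline: retrieve the best-matching document, then augment
# the prompt with it. Data and scoring are illustrative only.
import math
import re
from collections import Counter

# Stand-in for an external knowledge base (in practice: a vector store).
documents = [
    "SEPA instant transfers cost EUR 0.50 per transaction as of 2024.",
    "The bank's savings account pays 1.75 percent interest per year.",
]

def vectorize(text: str) -> Counter:
    """Crude bag-of-words vector; real RAG uses embedding models."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) \
         * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str) -> str:
    """Return the document most similar to the query (the 'R' in RAG)."""
    q = vectorize(query)
    return max(documents, key=lambda d: cosine(q, vectorize(d)))

query = "What do instant transfers cost?"
context = retrieve(query)                                   # retrieval
prompt = f"Context: {context}\n\nQuestion: {query}\nAnswer:"  # augmentation
print(prompt)  # this assembled prompt would then be sent to the LLM
```

The design point the sketch illustrates: the LLM itself is unchanged; only the context it receives is enriched, which is why RAG can deliver up-to-date and subject-specific answers without retraining.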