After several years of research and teaching on probability theory, mathematical statistics and information theory at the University of Münster, Wilhelm Niehoff (Dipl.-Math.) headed numerous banking committees at the Association of German Banks dealing with the standardization of digital business and with European legislative measures, in particular as project manager for the introduction of euro book money for German banks in Frankfurt, Brussels and Paris. He then moved to HypoVereinsbank/UniCredit in Munich, where he led a global e-business project for the bank in New York and worked on cross-bank initiatives in Germany. At the bank he dealt with technological innovations and payment transactions and implemented regulatory projects as project manager, including the group-wide MiFID project, before taking over the management of Bank-Verlag in mid-2007, serving as spokesman of the management board until the end of 2022. Since the beginning of 2023 he has devoted himself to highly innovative technology topics and their impact on banking business models at synscale GmbH, which he founded. His current focus is on artificial intelligence methods, in particular foundation models, large language models and generative AI.

Large Language Models and RAG

The distinctive feature of LLMs is that they can be asked questions in natural language and respond in natural language. The knowledge required for this was stored in the parameters of the LLM during extensive training on huge amounts of data (“parametric memory”). The sometimes inadequate quality of the answers, especially in specific domains, can be improved through in-context learning approaches using prompt engineering methods or through the more complex fine-tuning of the LLM. These address one of two dimensions of improvement: changing the capabilities of the LLM itself and/or improving the quality of the context provided to it.
The previous article in this series on generative artificial intelligence highlighted the characteristics and weaknesses of LLMs as well as approaches for possible improvements. One dimension of LLM improvement has not yet been discussed in detail: contextual enhancement by means of a Retrieval-Augmented Generation (RAG) process, which enables the LLM to draw on external knowledge to provide up-to-date and subject-specific answers. In this blog post, Wilhelm Niehoff presents the RAG method and current development approaches.
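
To make the pattern concrete, here is a minimal sketch of the RAG idea in Python. The toy corpus, the word-overlap retriever and the prompt template are illustrative assumptions; a production system would use vector embeddings, a document store and an actual LLM call instead.

```python
# Minimal sketch of the Retrieval-Augmented Generation (RAG) pattern.
# Toy corpus and word-overlap scoring stand in for a real vector store;
# real systems rank documents by embedding similarity.

CORPUS = [
    "The ECB raised its key interest rate in September 2023.",
    "MiFID II governs investor protection in EU securities trading.",
    "RAG combines a retriever with a generative language model.",
]

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda doc: len(q_words & set(doc.lower().split())),
        reverse=True,
    )[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Augment the user question with the retrieved passages."""
    ctx = "\n".join(f"- {doc}" for doc in context)
    return f"Answer using only the context below.\n\nContext:\n{ctx}\n\nQuestion: {query}"

query = "What does RAG combine?"
prompt = build_prompt(query, retrieve(query, CORPUS))
print(prompt)  # this augmented prompt would then be sent to the LLM
```

The key point is that the LLM's parametric memory is supplemented at query time: the retrieved passages enter the prompt as context, so the model can answer from current, domain-specific material it was never trained on.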

Continue reading »

Properties of LLMs: Weaknesses and Improvement Measures for Domain Adaptation

Transferring the AI paradigm of foundation models to the language domain leads to Large Language Models (LLMs), which make it possible to communicate in natural language and, thanks to their broad training, can be applied to a wide variety of tasks. However, this requires adapting the models to the specific application domains. In this second part of his blog series, Wilhelm Niehoff presents the three method areas used for this purpose: in-context learning (ICL), prompt engineering and fine-tuning. When LLMs are queried and used, however, design-related weaknesses become apparent, such as hallucinations, a lack of up-to-dateness and limited expertise on detailed topics. In addition to the three method areas, more recent approaches such as DSPy and TextGrad aim to relieve the user of the task of constructing prompts; here, the weaknesses are addressed by adding further components that are coordinated by LLMs.
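
To illustrate the in-context learning idea, the following is a minimal few-shot prompt sketch in Python; the sentiment task and the example pairs are invented for demonstration and do not come from the article.

```python
# Minimal sketch of in-context learning (ICL) via a few-shot prompt.
# No model weights change; the task is conveyed entirely through
# labeled examples placed in the prompt. Task and examples are invented.

FEW_SHOT_EXAMPLES = [
    ("The fees were refunded within a day.", "positive"),
    ("The app crashed during the transfer.", "negative"),
]

def few_shot_prompt(query: str) -> str:
    """Prepend labeled examples so the model can infer the task pattern."""
    blocks = [f"Text: {text}\nSentiment: {label}" for text, label in FEW_SHOT_EXAMPLES]
    blocks.append(f"Text: {query}\nSentiment:")
    return "\n\n".join(blocks)

print(few_shot_prompt("Customer service answered immediately."))
# An LLM completing this prompt continues the pattern and emits a
# label, without any fine-tuning of its parameters.
```

Approaches such as DSPy and TextGrad automate the construction and optimization of prompts like this one, so that the user no longer has to craft them by hand.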

Continue reading »

Large Language Models: Origin, Use and Further Development

Large language models (LLMs) have brought about a quantum leap in natural language processing (NLP) over the last five years, both in understanding (natural language understanding, NLU) and in generation (natural language generation, NLG), advancing communication with computers. With ChatGPT, the general public has also become aware of this development, and possible uses in companies are becoming increasingly relevant. A short series of articles will describe the emergence of LLMs, the possibilities for integrating them into processes, and their effects. This article outlines the history of the development of LLMs up to their current status, with the aim of presenting their strengths, weaknesses and manifestations in the next article on this basis.

Continue reading »