Properties of LLMs: weak points and improvement measures for domain adaptation of applications

Transferring the AI paradigm of foundation models to the language domain yields Large Language Models (LLMs), which communicate in natural language and, thanks to their broad training, can be applied to a wide variety of tasks. Productive use, however, requires adapting the models to the specific application domain. In this second part of his blog series, Wilhelm Niehoff presents the three method areas used for this purpose: in-context learning (ICL), prompt engineering, and fine-tuning. When LLMs are queried and deployed, design-related weaknesses become apparent, such as hallucinations, a lack of up-to-date knowledge, and missing expertise in specialized topics. In addition to the three method areas, there are recent approaches such as DSPy and TextGrad, which aim to relieve the user of the task of constructing prompts manually. Accordingly, the weaknesses are mitigated by adding further components that are orchestrated by LLMs.
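
To make the first of these method areas concrete, the following minimal Python sketch shows the core idea of in-context learning: the model is steered toward a domain task purely through labeled examples placed in its context, without any change to its weights. The banking-support classification task, the category labels, and the helper name build_icl_messages are invented for illustration; the resulting message list could be passed to any chat-style LLM API.

# Minimal sketch of in-context learning (ICL): the model is adapted to a
# domain task purely through examples in the prompt, not through training.
# The banking-support task and all category labels are hypothetical.

FEW_SHOT_EXAMPLES = [
    ("My transfer was booked twice.", "payments"),
    ("How do I reset the PIN of my debit card?", "cards"),
    ("The login to online banking fails with error 403.", "online_access"),
]

def build_icl_messages(query: str) -> list[dict]:
    """Assemble a chat-style few-shot prompt for a domain classifier."""
    messages = [{
        "role": "system",
        "content": "Classify customer requests into exactly one category: "
                   "payments, cards, or online_access. "
                   "Answer with the category only.",
    }]
    # Each labeled example becomes a user/assistant turn the model imitates.
    for text, label in FEW_SHOT_EXAMPLES:
        messages.append({"role": "user", "content": text})
        messages.append({"role": "assistant", "content": label})
    messages.append({"role": "user", "content": query})
    return messages

if __name__ == "__main__":
    # The resulting list would be sent to a chat-completion endpoint;
    # the API call itself is omitted here.
    for m in build_icl_messages("My credit card was rejected at the ATM."):
        print(f"{m['role']:>9}: {m['content']}")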

Large Language Models: Origin, Use, Further Development

Large language models (LLMs) have brought about a quantum leap in natural language processing (NLP) over the last five years, both in understanding (natural language understanding, NLU) and in generation (natural language generation, NLG), advancing the development of communication with computers. With ChatGPT, the general public has also become aware of this development, and the possible uses in companies are becoming increasingly relevant. A short series of articles will describe the emergence of LLMs, the possibilities for integrating them into processes, and their effects. This first article outlines the history of LLM development up to its current state, laying the groundwork for presenting strengths, weaknesses, and manifestations in the next article.
