This study examines the suitability of large pre-trained language models for analyzing and evaluating legal texts in education, showing that they are not yet mature in this domain.
Multilingual local models possess prior knowledge of the basic concepts of the "Gutachtenstil" used for teaching purposes. Given carefully selected examples, they can classify components of this argumentative style, but they still lag behind simpler models that are not based on language models.
Large language models are particularly well-suited for evaluating and grading free-text assignments, as they already contain extensive domain knowledge and do not need to be trained separately. In our experiments, they outperform simpler methods when evaluating English texts. However, this performance has not yet carried over to the evaluation of complex German-language legal essays.
In this article by recode.law e.V., Olesja Kaltenecker and Jeremias Forssman take a closer look at the DeepWrite project. Based on an interview with Christian Braun, Simon Alexander Nonn and Sarah Großkopf from the DeepWrite research project at the University of Passau, they examine the project itself, its strengths and opportunities, as well as the challenges that remain. With careful preparation, AI can provide appropriate feedback, especially for shorter student solutions. Both the accuracy of the content and the appraisal style ("Gutachtenstil"), as well as grammatical and lexical correctness, are taken into account. Assessing long solutions, however, remains a challenge, not least with regard to the coherence of the argumentation across an entire exam.
This paper by Yujin Kang in the Korean Design Forum (한국디자인포럼) is based on a "Survey to determine the UX needs of law students", conducted in the winter semester 2023/24 by the Department of Law at the University of Passau. The study analyzes the participants' responses with regard to both the user experience and the integration of artificial intelligence into such a learning platform. Concerning the user experience, the focus lies on the user interface and the design system, which reflect the requirements and preferences of future users. An appealing appearance of the learning platform encourages users to work with it over longer periods and helps to maintain their attention. In addition, the article deals with the theoretical consideration of the AI and design process and highlights the importance of Human-Computer Interaction (HCI) from the perspective of User Experience Design by comparing the user interfaces of the large language model ChatGPT in the GPT-2 and GPT-3.5 versions.
In her blog post on fiete.ai, project collaborator Veronika Hackl outlines the basics of prompting for AI-generated feedback in the educational context. The feedback-prompting process consists of three steps: defining the objectives, formulating the prompt, and evaluating the output. The post introduces various prompting techniques: Zero-Shot Prompting for simple feedback generation, Few-Shot Prompting for example-based learning, Chain-of-Thought Prompting for transparent evaluations, and Tree-of-Thoughts Prompting for multiple perspectives. Additionally, advanced concepts such as hyperparameter tuning, for example adjusting the temperature, and RAG systems (Retrieval-Augmented Generation) are explained; RAG enables the integration of proprietary documents, such as teaching materials, into the feedback process. The post concludes with an overview of current developments and challenges in AI-generated feedback, including integration into learning management systems and handling technical requirements.
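To make the Few-Shot Prompting and temperature settings described above concrete, here is a minimal sketch using the OpenAI Python SDK. The model name, rubric, and example feedback pairs are illustrative assumptions, not material from the blog post.

```python
# Minimal Few-Shot Prompting sketch for feedback generation.
# Model name, rubric, and example pairs are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Step 1: definition of objectives -- encoded as a system message.
system_msg = (
    "You are a writing tutor. Give short, constructive feedback on the "
    "student's answer, covering content accuracy and style."
)

# Step 2: prompt formulation -- two worked examples make this 'few-shot'.
few_shot = [
    {"role": "user", "content": "Student answer: Inflation is when prices rise."},
    {"role": "assistant", "content": "Feedback: Correct in essence, but define "
     "inflation as a sustained rise in the general price level."},
]

response = client.chat.completions.create(
    model="gpt-4",   # placeholder model name
    messages=[{"role": "system", "content": system_msg},
              *few_shot,
              {"role": "user", "content": "Student answer: Demand curves slope up."}],
    temperature=0.2,  # low temperature -> more deterministic feedback
)

# Step 3: output evaluation -- here simply printed for manual review.
print(response.choices[0].message.content)
```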
In this article in JuS (Juristische Schulung), research assistants Christian Braun, Sarah Großkopf and Simon A. Nonn discuss whether ChatGPT can replace university lecturers, especially with regard to teaching legal reasoning skills and the "Gutachtenstil" by means of AI feedback.
In recent years, the Covid-19 pandemic and the rapid advances in digitalization that accompanied it have shown that higher-education didactics is undergoing change, and that this process can and should be actively shaped in order to maintain the future viability and competitiveness of universities. A large part of this is the use of innovative technologies and tools, such as artificial intelligence (AI), in particular large language models (LLMs) and natural language processing (NLP), to create digital teaching and learning spaces for students of future generations.
This study reports the intraclass correlation coefficients (ICC) of feedback ratings produced by OpenAI's GPT-4, a large language model (LLM), across various iterations, time frames, and stylistic variations. The model was used to rate responses to macroeconomics tasks in higher education (HE) based on their content and style. Statistical analysis was performed to determine the absolute agreement and consistency of the ratings across all iterations, as well as the correlation between the content and style ratings. The findings reveal high interrater reliability, with ICC scores ranging from 0.94 to 0.99 for the different time periods, indicating that GPT-4 is capable of producing consistent ratings. The prompt used in the study is also presented and explained.
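As an illustration of how such an interrater-reliability analysis can be computed, the following is a minimal sketch using the pingouin library, treating repeated GPT-4 iterations as "raters" of the same responses. The ratings, column names, and number of iterations are invented placeholders, not the study's data or code.

```python
# Sketch of an interrater-reliability check with pingouin's ICC;
# the data and column names are illustrative, not the study's data.
import pandas as pd
import pingouin as pg

# Long format: one row per (response, rating iteration) pair.
df = pd.DataFrame({
    "response":  ["r1", "r1", "r1", "r2", "r2", "r2", "r3", "r3", "r3"],
    "iteration": ["run1", "run2", "run3"] * 3,
    "rating":    [8.0, 8.5, 8.0, 5.0, 5.5, 5.0, 9.0, 9.0, 9.5],
})

# The result table includes both absolute-agreement and consistency ICCs.
icc = pg.intraclass_corr(data=df, targets="response",
                         raters="iteration", ratings="rating")
print(icc[["Type", "ICC", "CI95%"]])
```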
In this study we (Abdullah Al Zubaer, Michael Granitzer and Jelena Mitrović) investigate the effectiveness of GPT-3.5 and GPT-4 for argument mining in the legal domain, focusing on prompt formulation and example selection using state-of-the-art embedding models from OpenAI and sentence transformers. Our experiments demonstrate that relatively small domain-specific models outperform GPT-3.5 and GPT-4 in classifying premises and conclusions, indicating a gap in these models' performance on complex legal texts. We also observe comparable performance between the two embedding models, with the local model showing a slight advantage when selecting examples for prompts. Our results indicate that the structure of prompts significantly impacts the performance of GPT models and should be taken into account when designing them.
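The example-selection step can be sketched with the sentence-transformers library: candidate sentences are embedded, and those most similar to the query sentence are picked as few-shot examples for the prompt. The model name, sentences, and labels below are illustrative assumptions, not the paper's actual pipeline.

```python
# Sketch of embedding-based few-shot example selection;
# model name, sentences, and labels are placeholders.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder local model

# Labelled candidate sentences from which few-shot examples are drawn.
candidates = [
    ("The defendant signed the contract on 1 May.",  "premise"),
    ("Therefore, a valid contract was concluded.",   "conclusion"),
    ("The claimant paid the agreed purchase price.", "premise"),
]
query = "Hence, the claim for damages is well-founded."

cand_emb  = model.encode([s for s, _ in candidates], convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

# Pick the k candidates most similar to the query as few-shot examples.
scores = util.cos_sim(query_emb, cand_emb)[0]
top_k = scores.argsort(descending=True)[:2]
for idx in top_k:
    sent, label = candidates[int(idx)]
    print(f"{label}: {sent}  (similarity={scores[idx].item():.2f})")
```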
In this interview series, the Federal Agency for Civic Education presents three projects funded by the Federal Ministry of Education and Research. As part of this series, Veronika Hackl introduces the DeepWrite project to readers.
From 24 to 26 June 2022, ELSA-Passau organised its second June conference under the motto "Smart Law". The conference set out to address nothing less than the future of law and the digitalisation of the legal profession. The attached conference report discusses, among other things, the DeepWrite project.