Testing tools for AI systems
The media ubiquity of OpenAI's new AI application ChatGPT shows that artificial intelligence has reached an impressive level of maturity. The chatbot, which has been trained with data and texts from all over the Internet, responds to questions with answers that are difficult if not impossible to distinguish from texts created by humans. But what about the quality testing of AI systems?
ChatGPT has triggered new hype around artificial intelligence, and the possibilities of AI are impressive. At the same time, quality assurance and control of AI systems are becoming increasingly important, especially when these systems take on responsible tasks. Chatbot results are based on huge amounts of text data from the Internet, yet systems such as ChatGPT only calculate the most probable answer to a question and output it as if it were a fact. So what testing tools exist to measure the quality of, for example, the texts generated by ChatGPT?
AI test catalog
Testing tools in use
Researchers from Fraunhofer IAIS will present various testing tools and procedures at the Fraunhofer joint booth in Hall 16, Booth A12, at Hannover Messe 2023 from April 17 to 21. These tools and procedures can be used to systematically examine AI systems for vulnerabilities along their lifecycle and to safeguard against AI risks. They support developers and testing institutes in systematically evaluating the quality of AI systems and thus ensuring their trustworthiness. One example is the "ScrutinAI" tool, which enables testers to systematically search for weak points in neural networks and thus test the quality of AI applications. A concrete example is an AI application that detects anomalies and diseases on CT images. The question here is whether all types of anomalies are detected equally well, or whether some are detected better and others worse. This analysis helps testers assess whether an AI application is suitable for its intended context of use. Developers benefit as well: they can identify shortcomings in their AI systems at an early stage and take appropriate improvement measures, such as enriching the training data with specific examples.
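The question of whether all anomaly types are detected equally well can be illustrated with a small per-category comparison. The following is only a minimal sketch in Python and does not reflect how ScrutinAI works internally. The category names, the example records, the function names, and the 10-percentage-point tolerance are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


def recall_per_category(records):
    """Compute the detection rate (recall) for each anomaly category.

    `records` is an iterable of (category, detected) pairs, one pair per
    image that truly contains an anomaly of that category.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for category, detected in records:
        totals[category] += 1
        if detected:
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


def weak_categories(recalls, tolerance=0.10):
    """Return categories whose recall trails the best category by more
    than `tolerance` (an assumed, illustrative threshold)."""
    best = max(recalls.values())
    return sorted(c for c, r in recalls.items() if best - r > tolerance)


# Hypothetical evaluation results, invented for this example only.
records = [
    ("nodule", True), ("nodule", True), ("nodule", True), ("nodule", False),
    ("fracture", True), ("fracture", True), ("fracture", True), ("fracture", True),
    ("hemorrhage", True), ("hemorrhage", False), ("hemorrhage", False), ("hemorrhage", False),
]

recalls = recall_per_category(records)
weak = weak_categories(recalls)
print(recalls)
print("Categories needing more training examples:", weak)
```

Categories flagged in this way would be natural candidates for the improvement measure mentioned above, namely enriching the training data with specific examples.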
Source and further information: Fraunhofer IAIS