AI quality assurance: verification and validation at the cutting edge

Legal regulations for artificial intelligence (AI) are being drafted worldwide, and engineers who develop AI-enabled systems must comply with the resulting specifications and standards. The EU Commission presented the first proposal for a legal framework to regulate AI in April 2021. Under this proposal, AI systems are assessed and regulated according to the risk they pose to users. In May 2024, the EU member states adopted the AI Act, the first comprehensive set of AI regulations worldwide.

The W-shaped development process is a non-linear V&V workflow that ensures the accuracy and reliability of AI models. (Image: The MathWorks, Inc.)

These rules and regulations have a significant impact on safety-critical systems with AI components in particular. Verification and validation (V&V) techniques are used to ensure that the outputs of an AI model meet requirements and specifications. Verification checks whether an AI model has been designed and developed in accordance with the specified requirements. Validation checks whether the product meets the customer's requirements and expectations. V&V methods also enable the early detection of bugs and provide strategies for dealing with biased data.

Christoph Stockhammer, Senior Application Engineer at MathWorks, explains how engineers can set up such V&V processes and the benefits they bring to the development of AI models in safety-critical systems.

Verification and validation: why AI systems benefit

One advantage of using AI in safety-critical systems is that AI models can approximate physical systems and help validate the design. Engineers simulate systems with AI components and use the resulting data to test their behavior in different scenarios, including outlier events. Performing V&V helps ensure that an AI-supported safety-critical system maintains the required level of performance and functionality under varying conditions.

Most industries that develop products with AI components require their engineers to comply with safety standards before the products are launched on the market. These include the automotive industry as well as the aviation, aerospace and defense industries. Certification processes are used to ensure that certain elements are integrated into these products. Engineers use V&V to test the functionality of these elements, which makes obtaining certification easier or, in some cases, possible in the first place.

From planning to practice: the building blocks of V&V processes

When performing V&V, engineers must ensure that the AI component meets the specified requirements, remains reliable under all operating conditions and is safe and therefore ready for use. The V&V process for AI involves performing software assurance activities. These consist of a combination of static and dynamic analyses, tests, formal methods and operational monitoring in real use. V&V processes may differ slightly depending on the industry, but always include the same general steps:

  • Analyzing the decision-making process to address the black box problem: The black box problem arises when engineers cannot understand how an AI model reaches its decisions. Feature importance analysis evaluates which input variables (e.g. environmental factors in safety-critical systems) influence the model's outputs most strongly. Explainability techniques help to understand the decision logic of a model, e.g. by identifying the image regions that contribute most to its output. Both approaches increase transparency and strengthen the confidence of engineers and scientists in AI systems.
  • Testing the model with representative data sets: Engineers test AI models with representative data sets to identify limitations and increase the reliability of the model. The data is cleaned and test cases are developed to evaluate aspects such as accuracy and reproducibility. The model is then applied to the data sets, and the results are recorded and compared with the expected output. The model design is improved based on these test results.
  • Simulating the AI system: Simulations allow engineers to evaluate the performance of an AI system in a controlled virtual environment. Tools such as Simulink® help to analyze the system behavior under different scenarios, parameters and environmental factors. As with the data tests, the simulation results are compared with expected or known results and the model is iteratively improved.
  • Keeping model operation within acceptable limits: To operate AI models safely and reliably, limits must be defined and the model's behavior monitored. One of the most common problems occurs when a model trained on a specific data set receives inputs at runtime that lie outside the distribution of that data set. To reduce bias and improve generalization, models are trained with data augmentation (e.g. added variability such as different viewing perspectives in images) and data balancing (an even distribution of data classes). To make neural networks more robust and less prone to misclassification, rigorous mathematical methods can be integrated into the development and validation process to demonstrate certain desirable properties of the networks.
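
Feature importance analysis can be illustrated with a minimal Python sketch of permutation importance, one common technique among several and not necessarily the one used in any particular toolchain. The idea is to shuffle one input variable at a time and measure how much the model error grows. The model `toy_model`, the helper names and the synthetic data are hypothetical and serve only as an illustration.

```python
import random


def mean_squared_error(model, rows, targets):
    """Average squared difference between model outputs and targets."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(model, rows, targets, n_repeats=20, seed=0):
    """Per feature: mean increase in error after shuffling that feature's column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle column j only; all other features keep their values.
            column = [r[j] for r in rows]
            rng.shuffle(column)
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical stand-in for a trained model: feature 2 is ignored."""
    return 3.0 * row[0] + 0.5 * row[1]


# Synthetic data (assumption): three input features drawn uniformly from [-1, 1].
data_rng = random.Random(42)
rows = [[data_rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
for index, value in enumerate(importances):
    print(f"feature {index}: error increase {value:.3f}")
```

In this sketch, feature 0 should receive the largest score and the ignored feature 2 a score of zero. A large score only indicates that the model relies on a variable; it does not show that this reliance is correct.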

These steps of the V&V process are iterative and allow for continuous refinement and improvement of the AI system as engineers gather new data, gain new insights and integrate feedback from operations.
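
Monitoring for inputs outside the training distribution can be sketched in a few lines of Python. The example below is a deliberately simple assumption rather than a prescribed method: it stores the mean and standard deviation of each input feature from the training data and rejects runtime inputs that deviate by more than a chosen number of standard deviations. The class `RangeMonitor`, the function `guarded_predict`, the threshold of 4 and the sensor values are all hypothetical.

```python
import random
import statistics


class RangeMonitor:
    """Flags inputs whose features lie far from the training distribution."""

    def __init__(self, training_rows, max_z=4.0):
        columns = list(zip(*training_rows))
        self.means = [statistics.fmean(c) for c in columns]
        # A small floor avoids division by zero for constant features.
        self.stdevs = [max(statistics.stdev(c), 1e-12) for c in columns]
        self.max_z = max_z

    def z_scores(self, row):
        """Absolute distance of each feature from its training mean, in standard deviations."""
        return [abs(x - m) / s for x, m, s in zip(row, self.means, self.stdevs)]

    def in_distribution(self, row):
        return all(z <= self.max_z for z in self.z_scores(row))


def guarded_predict(model, monitor, row):
    """Return the model output, or None so the caller can fall back to a safe default."""
    if not monitor.in_distribution(row):
        return None
    return model(row)


def toy_model(row):
    """Hypothetical stand-in for a trained model."""
    return 0.1 * row[0] + 2.0 * row[1]


# Synthetic training data (assumption): temperature around 20, pressure around 1.
data_rng = random.Random(0)
training_rows = [
    [data_rng.gauss(20.0, 2.0), data_rng.gauss(1.0, 0.05)] for _ in range(500)
]
monitor = RangeMonitor(training_rows)

print(guarded_predict(toy_model, monitor, [20.0, 1.0]))  # typical input: prediction
print(guarded_predict(toy_model, monitor, [80.0, 1.0]))  # far outside training range: None
```

A per-feature check of this kind treats every input variable independently and therefore misses unusual combinations of individually plausible values. It is best read as a first line of defense alongside the other measures described above.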

Conclusion: V&V as the key to the responsible use of AI

In the age of AI-powered safety-critical systems, V&V procedures play a crucial role in obtaining industry certifications and complying with legal regulations. Building and maintaining trustworthy systems requires the use of verification techniques that provide explainability and transparency for the AI models on which these systems run. This ensures the transparent and responsible use of AI in safety-critical systems.

Author

Christoph Stockhammer - Senior Application Engineer at MathWorks

Source: www.mathworks.com
