With VDE-AR-E 2842-61, the VDE has developed a whole family of standards for trustworthy autonomous/cognitive systems, such as AI systems. Even though these “application rules” are not specific to a particular domain, e.g., medical devices, they are still a treasure trove for many medical device manufacturers.
This article explains what autonomous/cognitive systems are, which specific risks they pose, and how VDE-AR-E 2842-61 helps control these risks.
AI systems are a subset of autonomous/cognitive systems that use artificial intelligence techniques.
Autonomous/cognitive systems are defined as follows:
A system is described as autonomous (cognitive) if it can achieve a specified goal independently and in a manner adapted to the situation without human control or detailed programming.
Source: Fachforum Autonome Systeme im Hightech-Forum: Autonome Systeme – Chancen und Risiken für Wirtschaft, Wissenschaft und Gesellschaft. Long version, final report, Berlin, April 2017
The definition in the application rule is:
“is a technical system that is able to generate autonomous and cognitive behavior. Within the context of this AR an A/C-system is part of a solution.
The term autonomous/cognitive system and especially the German variant “autonom/kognitives System” is a new term, a made-up word coined by this AR. It denotes the special characteristic of complex systems in complex environments (covered by the solution level), trustworthiness aspects and potentially but not necessarily the use of AI in one or more elements of the system. Furthermore it takes into account the common use of “autonomous” in the public along with the expectations on complex behavior of such systems (e.g. in a shop one would rather order an “autonomous car” than a “fully automated car”).”
VDE-AR-E 2842-61-1, section 3.1.8
The standard does not define the property “autonomous” in quite the same way. But, in the context of the standard, “autonomous” means “without human control.” The standard also uses the terms “cognitive” and “cognitive loop” to describe situation-specific behavior.
As there is a very large number of situations that such systems have to be able to react to, “detailed programming” is usually not possible. Therefore, many autonomous/cognitive systems use artificial intelligence techniques.
Conversely, however, not every system that uses AI is an AI system in this sense, and thus not every such system is an autonomous/cognitive system. For example, software that uses AI to detect cancer on a CT image is not an AI system according to this definition.
Read the article on autonomous systems to find out what the specific advantages and risks of such systems are and what regulatory requirements have to be complied with.
This article uses the term “AI system” from here on, as this is the more popular term and “AI systems” fall within the scope of VDE-AR-E 2842-61.
Examples of AI systems include autonomous vehicles, surgical robots, and collaborative robots (cobots).
Trustworthiness should be understood here as a meta-term that covers safety, cybersecurity, effectiveness, usability, etc.
Trustworthiness […] combines several aspects of trustworthiness in a quite generic way: for every product the set of aspects can be suitably selected and remains unchanged throughout the project. Aspects of trustworthiness include but are not limited to system safety, functional safety, safety of use, security, usability, ethical and legal compliance, reliability, availability, maintainability, and (intended) functionality.
VDE-AR-E 2842-61-1 section 3.1.43
The article on autonomous systems has already detailed some of the risks that are specific to this class of system. These include risks resulting from:
There are also additional risks specific to AI systems, which are described in the following sections.
Manufacturers must define an intended purpose for the AI system that specifies aspects such as the intended users, use scenarios, and use environment.
If the intended purpose is not clear with regard to these aspects, the foundations for all subsequent development phases will be lacking. For example, it must be clear whether the surgical robot can also be used to operate on a knee that already contains implants.
Cobots (collaborative robots), in particular, can be used for different purposes with simple reprogramming or “teaching.” However, this does not relieve the manufacturer of the obligation to define the intended purpose for each of these individual use cases.
Because it is very difficult to predict every situation, manufacturers do not always manage to produce complete specifications.
Even when manufacturers anticipate a situation, it is often difficult to specify the optimal system behavior for each situation.
Without these precise specifications and product requirements, development departments and data scientists will find it difficult to derive specific requirements for AI models and for collecting data for their training.
If a situation was not anticipated during the product specification and development phases, the behavior of the product in this situation is not always predictable.
The requirements must be clearly and specifically documented at all abstraction layers in development. This also applies to AI systems and the AI components they contain.
During the development phase in particular, self-learning AI components tempt developers to write unclear specifications as these AI components will “learn what they have to do.” But this is a misunderstanding: AI components can learn “how” to do something, but not the “what.”
AI components cannot be used as a catch-all for unclear or inaccurate specification. This would doom the development to failure. Therefore, the requirements, including the trustworthiness attributes (performance, safety, security, usability, etc.), must be clearly traceable.
Technical errors made during the development of AI models, for example errors in data collection and labeling, unrepresentative training data, or overfitting, can also lead to risks from AI systems.
A comprehensive collection of best practices for minimizing these risks can be found in the Johner Institute’s AI Guidelines, a modified version of which is used by notified bodies.
In the case of AI systems especially, poor usability can lead to particularly high risks. This “human factor” aspect can even lead to “ironies of automation,” for example when users trust the system too much and lose the very skills they need to intervene when it fails.
Deep fakes are just one example of how AI can be abused. In the case of medical devices that use AI, it has been shown, at least in the laboratory, how systems for the classification of images can be fooled or caused to divulge sensitive (training) data.
Application rule VDE-AR-E 2842-61 aims to contribute towards controlling these risks and thus ensuring the trustworthiness of AI systems.
In this context, VDE-AR-E 2842-61 claims to cover the entire product life cycle from the product idea through to the phase known to the medical device world as “post-market surveillance.”
VDE-AR-E 2842-61 offers several approaches to handling the risks posed by AI systems in the event of incomplete or unclear intended purposes.
| Problem | Solutions in VDE-AR-E 2842-61 |
| --- | --- |
| For generic systems such as cobots (see above), no specific intended purpose is defined. These systems can be easily reprogrammed or adapted for new intended purposes. | Concepts for generic proofs of safety (“trustworthiness out of context”) based on the automotive standard ISO 26262 |
| No clear expectations, but a belief that “the AI will somehow solve everything intelligently” | Definition of use cases and the intended benefit. These are backed up with a clear ontology, which is also used in the requirements description (incl. traceability) and is refined and made usable in later phases, up to data set coverage metrics. |
| The AI system is not considered over its entire life cycle. For example, aspects of the intended use, such as the update and maintenance of the system, are missing. | Modeling of the product lifecycle using a customer journey map or UX/experience map |
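The data set coverage metrics mentioned above can be illustrated with a small sketch. The situation ontology, the tags, and the `coverage` helper below are invented for this example and are not taken from the application rule:

```python
from collections import Counter

# Hypothetical illustration: measure how well a training data set covers
# the operational situations defined in a (simplified) use-case ontology.
# The ontology terms and sample tags are invented for this sketch.
ONTOLOGY_SITUATIONS = {
    "daylight", "night", "rain", "occlusion", "crowded_scene",
}

def coverage(samples):
    """Return the fraction of ontology situations represented at least
    once in the samples, plus a per-situation sample count."""
    counts = Counter(
        tag for sample in samples for tag in sample
        if tag in ONTOLOGY_SITUATIONS
    )
    return len(counts) / len(ONTOLOGY_SITUATIONS), counts

training_samples = [
    {"daylight"}, {"daylight", "rain"}, {"night"}, {"night", "occlusion"},
]

covered, counts = coverage(training_samples)
print(f"coverage: {covered:.0%}")                     # 4 of 5 situations
print("missing:", ONTOLOGY_SITUATIONS - counts.keys())  # {'crowded_scene'}
```

Traceability from the ontology to such a metric makes the gap visible: the metric immediately names the situations for which no training data has been collected yet.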
VDE-AR-E 2842-61 also offers solutions for the risks described above resulting from specification gaps and unknown situations.
| Problem | Solutions in VDE-AR-E 2842-61 |
| --- | --- |
| Imprecise and ambiguous specifications | Modeling and notations (see above) |
| The division of tasks and responsibilities between users and the AI system is not precisely defined (e.g., between the surgeon and surgical robot) | Defined notation (e.g., BPMN/SysML) to model a “solution concept” for the system black box |
| Specific risks caused by different situations, e.g., availability of system components | |
| AI is used as a placeholder for unclear functionality or technical implementation | Formulate a functional model of the AI system based on sense-plan-act or another cognitive theory. This also provides the “white box” model of the [text missing] in the next phase. |
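The sense-plan-act pattern named in the application rule can be sketched as a minimal control loop. The sensor values, planning rule, and actuator commands below are invented placeholders, not part of the standard:

```python
# Minimal sense-plan-act loop: the system repeatedly perceives its
# environment, decides on an action, and applies it. The simulated
# "world" and its single distance value are illustrative assumptions.

def sense(world):
    """Read the (simulated) environment into a percept."""
    return {"obstacle_distance_m": world["obstacle_distance_m"]}

def plan(percept):
    """Decide an action from the percept."""
    if percept["obstacle_distance_m"] < 1.0:
        return "stop"
    return "move_forward"

def act(action, world):
    """Apply the action to the (simulated) environment."""
    if action == "move_forward":
        world["obstacle_distance_m"] -= 0.5
    return action

world = {"obstacle_distance_m": 2.0}
trace = [act(plan(sense(world)), world) for _ in range(5)]
print(trace)  # moves forward until the obstacle is close, then stops
```

Such a functional model gives the “white box” view its structure: each of the three stages can later be refined into concrete components, some of which may be AI-based.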
VDE-AR-E 2842-61 also offers solutions for typical risks during the development of AI systems.
| Problem | Solutions in VDE-AR-E 2842-61 |
| --- | --- |
| The output of data-driven models is subject to uncertainty. | |
| The design does not sufficiently take into account the requirements of the specification. | |
| Selected architecture is not the “best” one | Use of design patterns to demonstrate trustworthiness and achieve certain AI-relevant product properties (continuous learning, explainability, etc.) |
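One common way to make the uncertainty of data-driven outputs explicit is to quantify the disagreement of an ensemble of models. This is a generic technique, not one prescribed by the application rule; the three toy “models” below stand in for real trained models:

```python
import statistics

# Sketch: estimate the uncertainty of a data-driven prediction via the
# spread of an ensemble. The three models are illustrative stand-ins.

def model_a(x):
    return 0.9 * x

def model_b(x):
    return 0.8 * x

def model_c(x):
    return 1.0 * x

def predict_with_uncertainty(x):
    """Return the ensemble mean and the sample standard deviation,
    which serves as a simple uncertainty estimate."""
    predictions = [m(x) for m in (model_a, model_b, model_c)]
    return statistics.mean(predictions), statistics.stdev(predictions)

mean, spread = predict_with_uncertainty(10.0)
print(f"prediction {mean:.2f} +/- {spread:.2f}")
```

Reporting a prediction together with such a spread lets downstream components (or the risk analysis) treat low-confidence outputs differently from high-confidence ones.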
| Problem | Solutions in VDE-AR-E 2842-61 |
| --- | --- |
| Proofs of safety are harder to produce | |
| Suboptimal model chosen | |
VDE-AR-E 2842-61 describes solutions for all levels of verification and validation.
| Problem | Solutions in VDE-AR-E 2842-61 |
| --- | --- |
| Incorrect conclusions from test results, e.g., because a claim is made in the trustworthiness assurance case based on test results but this claim is not valid, for example because specifics of the actual use context (target application scope) were not taken into account | |
| Incomplete proofs | Clear definition of the objectives based on the trustworthiness analysis; traceability; and proof in the “trustworthiness assurance case” with suitable, structured argumentation (e.g., with GSN) that uses appropriate tangibles (test reports, analyses, etc.) as evidence |
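A GSN-style trustworthiness assurance case is essentially a tree of goals, strategies, and solutions (evidence), which makes incomplete proofs mechanically detectable. The goals and evidence names below are illustrative assumptions, not content of the standard:

```python
from dataclasses import dataclass, field

# Sketch of a Goal Structuring Notation (GSN) style argument tree with a
# completeness check: nodes without supporting children are flagged as
# "undeveloped" (GSN's own term for argument branches that still lack
# sub-arguments or evidence).

@dataclass
class Node:
    text: str
    kind: str                      # "goal", "strategy", or "solution"
    children: list = field(default_factory=list)

def undeveloped(node):
    """Return goals/strategies that have no supporting children."""
    if node.kind == "solution":    # evidence leaves are complete
        return []
    found = [] if node.children else [node.text]
    for child in node.children:
        found += undeveloped(child)
    return found

case = Node("System is acceptably trustworthy", "goal", [
    Node("Argue over each trustworthiness aspect", "strategy", [
        Node("Safety risks are controlled", "goal",
             [Node("Test report TR-01", "solution")]),
        Node("Security risks are controlled", "goal"),  # no evidence yet
    ]),
])

print(undeveloped(case))  # ['Security risks are controlled']
```

A check like this does not validate the argument itself, but it ensures that no claim silently remains without evidence.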
| Problem | Solutions in VDE-AR-E 2842-61 |
| --- | --- |
| The assurance cases contain many assumptions about the use context. These could prove to be inaccurate in practice. | |
The VDE-AR-E 2842-61 is applicable to all industries and all applications that fall into the class of autonomous/cognitive systems, especially AI systems. It also refers to these systems as “systems of systems.” Therefore, in the context of medical device law, “system” would mean “device,” not a system in the sense of Article 22 (“Systems and procedure packs”).
However, VDE-AR-E 2842-61 does not make any specific reference to medical devices.
Nevertheless, the standard is also recommended for medical devices to help build safety arguments to use with authorities and notified bodies. It adds new aspects, such as the handling of uncertainty, to existing regulations.
The application rule VDE-AR-E 2842-61 consists of a whole family of standards.
| Part | Title | Status |
| --- | --- | --- |
| VDE-AR-E 2842-61-1 | Terms and concepts | Available |
| VDE-AR-E 2842-61-2 | Management | Available |
| VDE-AR-E 2842-61-3 | Development at Solution Level | Completed, in approval |
| VDE-AR-E 2842-61-4 | Development at System Level | Expected for 2021-Q2 |
| VDE-AR-E 2842-61-5 | Development at Technology Level | Expected for 2021-Q2 |
| VDE-AR-E 2842-61-6 | After Release of the Solution | Available |
| VDE-AR-E 2842-61-7 | Application Guide | Postponed |
| Standard concept | Example |
| --- | --- |
| The standards divide the requirements along the life cycle of the device into sections and subsections. | The 3rd part of the standard contains sections 7, “Solution Concept,” and 8, “Trustworthiness Concept.” |
| Every section requires the person responsible to set objectives. | The objectives of the “trustworthiness concept” include, for example, determining the relevant aspects of trustworthiness. |
| In order to achieve the respective objective, certain tasks must be completed. | The standard pairs the objective “determining the relevant aspects of trustworthiness” with the corresponding task. |
| Each task consists of a set of activities. | The 10 activities include identifying relevant standards (e.g., IEC 61508) and defining the trustworthiness aspects (e.g., those of ISO 25020). |
| For some of these activities, the standard specifies the inputs that have to be taken into account. | To define the trustworthiness aspects, the person(s) responsible must take the user requirements and the regulatory requirements into account. |
| For these activities, means such as tools, templates, or other resources can be defined. | These means include, for example, the aforementioned standards and literature references. |
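The hierarchy described above (section, objectives, tasks, activities with inputs and means) lends itself to a simple project-tracking structure. The sketch below models it as plain data; the field names and the `open_activities` helper are hypothetical, and the entries merely paraphrase the examples from the table:

```python
# Sketch: the standard's section -> objectives -> tasks -> activities
# hierarchy as plain data, so a project can track which activities are
# still open. Structure and field names are illustrative assumptions.

section = {
    "name": "Trustworthiness Concept",
    "objectives": [{
        "text": "Determine the relevant aspects of trustworthiness",
        "tasks": [{
            "activities": [
                {"text": "Identify relevant standards (e.g., IEC 61508)",
                 "done": True},
                {"text": "Define the trustworthiness aspects",
                 "inputs": ["user requirements", "regulatory requirements"],
                 "done": False},
            ],
        }],
    }],
}

def open_activities(section):
    """List the activities that have not been completed yet."""
    return [activity["text"]
            for objective in section["objectives"]
            for task in objective["tasks"]
            for activity in task["activities"]
            if not activity["done"]]

print(open_activities(section))  # ['Define the trustworthiness aspects']
```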
VDE-AR-E 2842-61 is not a harmonized standard. No harmonization is planned either. The likelihood of an auditor or reviewer at a notified body requiring this family of standards as the state of the art is (still) low.
The concepts of the family of standards complement those of ISO 14971, IEC 60601, and IEC 62304 well.
The VDE-AR-E 2842-61 family of standards takes a very systematic approach: it is built on a data model and uses a clear and largely comprehensive terminology.
It is also good that it covers the entire lifecycle of autonomous/cognitive systems, such as AI systems, and aligns its structure with these lifecycle phases. This makes it easier to assign requirements and activities to the individual phases.
The authors are obviously experts in AI systems who think in logical structures and concepts.
VDE-AR-E 2842-61 is not a sector-specific standard. Therefore, the reader is sometimes not quite sure how to implement the concepts presented in accordance with ISO 14971. This is also due to the fact that the family of standards does not (yet) define key terms such as “risk” and “hazard,” and does not (yet) specify which solution should be used for which risk.
The seven parts (which have not all been published yet) are, together, several hundred pages long. And, in some places, it all seems a bit academic. Some sentences leave you wishing for more precision:
The person responsible for the solution level shall use the knowledge gained on hazards and degraded modes to define trustworthiness measures to cover all trustworthiness goals and further constraints given by their attributes (e.g. timely detection and control of relevant functional – consider safe state).
Source: VDE-AR-E 2842-61-3, section 8
But the annexes do contain examples. Nevertheless, it would have been better if the seventh part (the “Application Guide”, of all things) had not been put on hold.
Anyone who, for example, like McKinsey consultants, uses the MECE principle and the “pyramid principle” will wonder whether the hierarchy of concepts is sufficiently clear-cut. The following example is from the eighth section of the third part (section 3-8):
| Element | Example from VDE-AR-E 2842-61 | Comment |
| --- | --- | --- |
| Objectives | The objectives of this section are: (1) to define the applicable trustworthiness aspects and to integrate relevant analysis methods from other standards; | That is more of a task than an objective. The actual objective of this section is to develop a “trustworthy solution concept.” |
| Tasks | to define the applicable trustworthiness aspects and to integrate relevant analysis methods from other standards; | This is exactly the same wording as one of the objectives. |
| Activities | The person responsible for the solution level shall define the applicable trustworthiness aspects. | This wording is almost exactly the same as the previous wording. |
Anyone who works in AI system development should not only be familiar with VDE-AR-E 2842-61, they should use it. It provides a good overview of the state of the art and helps to ensure that no relevant life cycle activities are forgotten.
Accordingly, the family of standards covers not only development but all lifecycle phases: from defining the intended purpose through to post-market surveillance.
The family of standards combines well with the concepts in ISO 14971, IEC 60601-1 and IEC 61508. It also provides valuable guidance to manufacturers of devices that fall within the scope of IEC 60601-1, even if these medical devices are not autonomous/cognitive systems.
Users of VDE-AR-E 2842-61 must be able to think in abstract concepts and apply the best practices it contains to specific use cases. This requires a high level of skill, but these are skills that should be expected from anyone who develops AI systems for medicine.
VDE-AR-E-2842-61 is available from VDE.
Dr. Rasmus Adler of the Fraunhofer IESE and Dr. Henrik Putzer of fortiss, the research institute of the Free State of Bavaria for software-intensive systems, and Cogitron contributed to this article. Both will be happy to answer any questions.