
Journal of Software Engineering for Autonomous Systems

In Press, Uncorrected Proof, Available Online: 2 April 2026
AI Safety Requirements: A Perspective From the Automotive World

1. INTRODUCTION

Artificial Intelligence (AI), especially Generative AI (Gen-AI), is being adopted rapidly across domains, while its societal impact, especially on safety, is not well studied [1]. One domain that has strict safety standards and has seen adoption of AI at scale is automotive. With an estimated 1.4 billion cars on the road and a direct market capitalization of 3 trillion dollars¹, the automotive industry adopted AI before GPT showed its disruptiveness². This article focuses on how the safety of automotive software, and of the electronics that run it, was handled in the pre-GPT era. In the automotive world, these safety aspects are referred to as functional safety [2] and safety of intended functionality [3].

The safety of automotive software and the electronics that run it remains inadequate, as evidenced by software and related defects leading to the recall of 7.5 million and 5.5 million vehicles in the United States in 2020 and 2021, respectively. These defects topped the list of recall reasons, both in the number of recalls and in the number of affected vehicles³. Another example is the infamous Uber automated driving vehicle crash, which led to the death of a pedestrian⁴. The NTSB’s⁵ investigation found that software and an inadequate safety culture in developing software and related systems were among the reasons for this crash⁶. Such inadequacies in ensuring safety can cause fatalities, economic losses, brand damage, destruction of traffic infrastructure, and indirect (economic) losses. A fundamental step toward safer automotive software systems is incorporating safety into the automotive product life cycle right from the requirement elicitation stage.

There are many kinds of safety requirements, distinguished by the underlying cause of the requirement. Examples include safety requirements caused by deficiencies in specified driving behavior [4], safety requirements relating to failure or malfunction of components (also referred to as functional safety [2]), and safety requirements resulting from functional insufficiencies of the intended functionality (also referred to as safety of intended functionality [3]). This article focuses on the latter two types of safety requirements, in the context of systems, software, and the hardware that runs the software. In the rest of this article, safety requirements refer to these two types.

Our goal is to characterize the state-of-the-art in safety requirement elicitation processes and techniques in the automotive context via a Systematic Literature Review (SLR) in the pre-GPT era (until mid-2021).

Secondary studies on safety requirement elicitation focused on: (a) the broader safety-critical systems domain [5]; (b) practices and challenges in the development of embedded systems [6]; (c) integration between requirement and safety engineering [7]; (d) managing safety in mobile robotic systems [8]; and (e) security in the context of product lines [9]. However, to our knowledge, there are no secondary studies on safety requirement elicitation for automotive systems.

A secondary study on safety requirement elicitation for automotive systems can consolidate dispersed knowledge and provide structured insights into which processes and techniques are most effective for capturing safety needs in this uniquely complex and high-stakes domain. Unlike other safety-critical fields, automotive systems depend on an operator (the driver) who is not trained to handle safety-critical issues. Other safety-critical domains like aviation, space, and nuclear always rely on highly trained operators. Furthermore, the industry is moving towards automated driving, which eliminates the operator from the loop and introduces the scenario of autonomous systems working among humans and non-automated vehicles.

The secondary study can also help current industry practice because without synthesized evidence on best-known elicitation practices, organizations risk repeating mistakes, ultimately impacting safety assurance, regulatory compliance, and timely development of reliable vehicle software.

This article presents the first SLR on safety requirement elicitation for automotive software and systems. Specifically, we focus on the technical aspects of safety requirement elicitation, including different processes, their steps, techniques to perform these steps, and use cases for each process.

We translate our goal into the following research questions:

  • RQ1: What processes are used for or applicable to safety requirement elicitation in the automotive domain?

    By answering RQ1, we intend to identify, summarize, and compare the various safety requirement elicitation processes.

  • RQ2: What techniques are used for safety requirement elicitation in the automotive domain?

    In RQ2, we dive deeper into the techniques (used for different steps in the processes) and compare and taxonomize them.

Our primary contributions are:

  • A summary, analysis and synthesis of the body of knowledge in safety requirement elicitation for the automotive domain through an SLR over 102 primary studies.

  • Empirical validation of the need for such an SLR via a systematic qualitative analysis.

  • Taxonomies of techniques for safety requirement elicitation in the automotive domain.

This study targets practitioners and researchers alike. For researchers, this study outlines the automotive safety requirements elicitation research field, research gaps and future research opportunities. For practitioners, this article provides a concise guide toward choosing one or more processes and techniques for safety requirement elicitation in their projects.

The rest of this article is structured as follows. Section 2 presents this study’s planning phase and design choices. Section 3 describes an overview of the research landscape via qualitative metrics. Sections 4 and 5 answer our research questions and discuss our findings. Section 6 elaborates implications of this study for research and practice, while Section 7 presents validity threats. Related work is outlined in Section 8. Finally, Section 9 concludes the article. Note that in the rest of the article the terms AI and ML are used interchangeably to denote the same underlying set of technologies.

2. STUDY DESIGN

The different steps in conducting an SLR can be organized into three phases: planning, conducting and reporting [10]. This section primarily discusses the study’s planning phase and our design choices in detail. Reporting includes drawing conclusions, considering threats and disseminating results which are covered in later sections of the article.

The planning phase of this study is divided into the following 5 stages (based on various guidelines [10,11,12,13,14,15]):

  • Evaluate the need for this SLR (Garner et al. [11]);

  • Form a search strategy (Kitchenham et al. [10], Petticrew et al. [12], and Petersen et al. [13]);

  • Create a primary studies’ selection procedure (Kitchenham et al. [10]);

  • Identify quality assessment criteria for the primary studies (Kitchenham et al. [10], Tiwari et al. [14], and Wieringa et al. [15]); and

  • Extract and synthesize data and insights from primary studies (Kitchenham et al. [10]).

The rest of this section explains each of these stages in detail.

2.1. Need for SLR

To evaluate the need for this study, first we conduct an initial search for secondary studies, followed by a qualitative empirical evaluation.

For the initial search, we use the search string: “functional safety” AND (automotive* OR vehic*) AND (“systematic map” OR “systematic mapping” OR “systematic literature”). The search string is created by combining the topic (“functional safety”), the domain (automotive* OR vehic*), and the kind of studies we are looking for (“systematic map” OR “systematic mapping” OR “systematic literature”). Note that we used the term “functional safety” instead of “safety” since the latter resulted in a low signal-to-noise ratio. Functional safety refers to safety concerning (malfunction of) software and related systems. The term is universally adopted in the automotive domain [2,3].
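As an illustration only, the composition of the initial search string from its three parts can be sketched as follows. The helper names (`any_of`, `build_query`) are hypothetical, and straight quotes are used instead of typographic ones; the actual query syntax accepted by a given search engine may differ.

```python
# Sketch: compose a boolean search string from topic, domain, and study type.
# Helper names are illustrative and not part of any search engine's API.

def any_of(terms):
    """Join alternative terms with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(topic, domain_terms, study_terms):
    """Combine the topic with the two OR-groups using AND."""
    return " AND ".join([topic, any_of(domain_terms), any_of(study_terms)])


TOPIC = '"functional safety"'
DOMAIN_TERMS = ["automotive*", "vehic*"]
STUDY_TERMS = ['"systematic map"', '"systematic mapping"', '"systematic literature"']

initial_query = build_query(TOPIC, DOMAIN_TERMS, STUDY_TERMS)
print(initial_query)
```

Keeping the three parts separate makes it easy to swap one group (for example, the domain terms) while piloting alternative search strings.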

Our Google Scholar search (on 10 November 2020) did not identify any secondary study that focuses on safety requirement elicitation in the automotive domain. However, we found an SLR by Martins et al. [5] on approaches to elicit, model, specify and validate safety requirements in the broader context of safety-critical systems. In this SLR, automotive formed 5.29% (8 out of 151) of all the primary studies [5]. Given the study’s broader scope and rigor, we assumed that primary studies until 2013 are covered (since the primary studies search was conducted in 2014). We consider this study by Martins et al. [5] as a basis to evaluate the need for our SLR.

We evaluate the need in two steps: (1) An initial analysis of the trend in the number of relevant publications since 2014 as shown in Fig. 1, which indicates a positive trend; and (2) A systematic approach for qualitative empirical evaluation, presented in the rest of this section.

Figure 1

Number of articles in the result of Google Scholar search of query: “functional safety” AND (automotive* OR vehic* OR driv* OR automobile* OR AUTOSAR OR car). The query was performed for articles until and including July 2021.

We use the 3PDF framework [11] to validate the relevance of conducting this SLR empirically. The 3PDF framework is an empirical framework consisting of three sequential phases. A positive outcome in all three phases establishes the need for an SLR. Note that the 3PDF framework [11] was originally designed to address the relevance of repeating an existing SLR. Here, we use it to identify the significance of this study, with the SLR by Martins et al. [5] as a basis. In the rest of this section, we explain the execution of the three phases of the 3PDF framework.

  • Phase 1: Assess currency. This phase aims to identify whether the research questions are already answered by available evidence or are no longer considered relevant [11,16]. It consists of the following three yes/no questions and is passed if and only if all of them are answered positively.

    1. Does the published SLR still address a current question?

       Yes. The study by Martins et al. [5] focuses on safety requirement elicitation for safety-critical systems with ≈5% of their primary studies from the automotive domain.

    2. Has the SLR had good access or use?

       Yes. The prior SLR [5] has a yearly average citation count of 15.5 (with a total citation count of 62 when checked via Google Scholar on 10 November 2020). Citation count is a measure of access, use and relevance [17,18]. In the software engineering domain, a yearly average citation count of 6.82 or above is judged as having had good access or use [17].

    3. Has the SLR used valid methods and was it well-conducted?

       Yes. Martins et al. [5] followed the well-established guidelines of Kitchenham [10] for conducting the SLR and the work has been published in a top-tier peer-reviewed journal from the domain.

    Since all the questions are answered yes, we proceed to the next phase.

  • Phase 2: Identify relevant new methods, studies and other information. This phase examines whether there is new information that is not covered by the existing study, including study design, evidence synthesis and new primary studies. The phase consists of two yes/no questions, and proceeding to the next phase requires at least one question to be answered positively.

    1. Are there any new relevant methods (in conducting the SLR)?

       Yes. In addition to the approaches used in the study design of the prior SLR [5], we add the following new methods: (a) full-text search; and (b) both forward and backward snowballing.

    2. Are there any new studies or new information?

       Yes. The period of our initial search is between the end of Martins et al.’s study (2014) and 2020. This ensures that no primary study we considered is included in their research. Furthermore, this period experienced landscape shifts in the automotive industry with automated and connected driving enabled by software-intensive sub-systems [19].

    Both questions are answered with yes; therefore, we advance to the next phase.

  • Phase 3: Assess the effect of updating the review. This phase aims to assess whether the information from the new primary studies influences the conclusions compared to the base SLR. A yes answer to either of the following two questions empirically validates the need for a new study.

    1. Will the adoption of new methods (for conducting the SLR) change the findings, conclusions, or credibility?

       Maybe. The new methods that we adopted were full-text search and snowballing. Our initial set of primary studies is disjoint from the set of primary studies considered in the prior SLR [5]. Also, snowballing has resulted in identifying new techniques and wider adoption of some of the processes and techniques for safety requirement elicitation, which were not evident otherwise.

    2. Will the inclusion of new studies, information or data change findings, conclusions, or credibility?

       Yes. Only 5.29% of primary studies in Martins et al. [5] are from the automotive domain. We scope our work specifically to the automotive domain. This makes a direct comparison of our study with theirs inaccurate. Also, their study does not differentiate or taxonomize processes and techniques or present challenges specific to the automotive context.

Thus, the execution of the 3PDF framework empirically establishes the need for our SLR.
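The pass rules of the three phases described above can be summarized in a short sketch. This is our own illustrative encoding, not an artifact of the 3PDF framework [11]: the `Answer` type and function names are hypothetical, and treating a “Maybe” as not positive is a conservative assumption on our part.

```python
from enum import Enum


class Answer(Enum):
    YES = "yes"
    MAYBE = "maybe"
    NO = "no"


def phase_passes(answers, require_all):
    """A phase passes if all (or at least one) of its answers are YES.

    Assumption: MAYBE is conservatively treated as not positive.
    """
    positives = [answer is Answer.YES for answer in answers]
    return all(positives) if require_all else any(positives)


def slr_needed(phase1, phase2, phase3):
    """Phases are sequential: phase 1 needs all YES, phases 2 and 3 need one."""
    return (
        phase_passes(phase1, require_all=True)
        and phase_passes(phase2, require_all=False)
        and phase_passes(phase3, require_all=False)
    )


# Phase 1, question 2: compare the reported yearly average with the threshold.
YEARLY_AVERAGE_CITATIONS = 15.5   # reported for the prior SLR [5]
GOOD_ACCESS_THRESHOLD = 6.82      # threshold from [17]
good_access = (
    Answer.YES if YEARLY_AVERAGE_CITATIONS >= GOOD_ACCESS_THRESHOLD else Answer.NO
)

phase1 = [Answer.YES, good_access, Answer.YES]
phase2 = [Answer.YES, Answer.YES]
phase3 = [Answer.MAYBE, Answer.YES]

print(slr_needed(phase1, phase2, phase3))
```

Even under the conservative reading of “Maybe”, the second question of Phase 3 is answered yes, so the overall outcome does not depend on that assumption.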

2.2. Search Strategy

Our search strategy consists of identifying search keywords and multiple iterations to compose a search string followed by automatic search. We perform both forward and backward snowballing to widen the set of primary studies beyond the initial search.

The search string was constructed from two sets of strings, one for scoping: (automotive, vehicle, vehicular, drive, driving) and another related to the intervention: (functional safety, hazard, accident, risk). The final search string was formed by iterative refining (via piloting) to reduce the amount of noise while covering as much relevant literature as possible. The following search string was formed after several iterations:

The term “functional safety” was searched within keywords, title, and abstract, AND the following terms were searched in the full text of publications:

(automotive OR vehicle OR vehicular OR drive OR driving) AND (hazard OR risk OR accident)
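To make the scope of each part of the search string concrete, the following sketch applies the same logic as a filter over already-retrieved records. It is only an approximation: the record fields (`title`, `abstract`, `keywords`, `full_text`) are hypothetical, and whole-word, case-insensitive matching is an assumption, since each database applies its own matching and stemming rules.

```python
import re

SCOPE_TERMS = ["automotive", "vehicle", "vehicular", "drive", "driving"]
INTERVENTION_TERMS = ["hazard", "risk", "accident"]


def contains_any(text, terms):
    """True if any term occurs in the text as a whole word (case-insensitive)."""
    pattern = r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def matches_search_string(record):
    """Topic must appear in the metadata; both term groups in the full text."""
    metadata = " ".join(
        [record["title"], record["abstract"], " ".join(record["keywords"])]
    )
    return (
        "functional safety" in metadata.lower()
        and contains_any(record["full_text"], SCOPE_TERMS)
        and contains_any(record["full_text"], INTERVENTION_TERMS)
    )


# Hypothetical record, for illustration only.
example = {
    "title": "Functional safety of a brake controller",
    "abstract": "We study a brake-by-wire system.",
    "keywords": ["ISO 26262"],
    "full_text": "We perform a hazard analysis for an automotive brake-by-wire system.",
}
print(matches_search_string(example))
```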

To identify and compose the search string, we use the PICOC (Population, Intervention, Comparison, Outcome, Context) method [10,12,13] as detailed below.

  • Population: peer-reviewed publications describing safety requirement elicitation processes and techniques as well as application of such approaches to the automotive domain [10,12,13].

  • Intervention: processes or techniques for safety requirement elicitation [10,12].

  • Comparison: compare the different processes and techniques of safety requirement elicitation by means of identifying the different strategies used, and their context of application [10,12,13].

  • Outcome: safety requirement elicitation processes, the different stages in the processes, the techniques used to conduct each stage, and the following aspects of each process/technique: context of use, abstraction level of application, and applicable components [10,12,13].

  • Context: any phase in the automotive product life cycle that comprises safety requirement elicitation [10].

We use the same search databases as in the related studies [5,7]. The databases included are IEEE Xplore, ACM Guide to Computing Literature, ScienceDirect, and SpringerLink. Except for SpringerLink, automatic search is directly performed in the corresponding database. Since SpringerLink does not have a feature for the intended search string, we used broader search criteria (searching in the whole body of publications rather than title, tags, and abstract), with which a bibliography is retrieved. A further refined search is performed within this bibliography using the reference manager Mendeley.

Once the initial set of primary studies was finalized based on full-text reading, we applied snowballing to gather additional studies. We followed the guidelines by Wohlin et al. [20] to conduct backward and forward snowballing. For backward snowballing, we checked the references of primary studies (using titles only) until no new studies were found. Likewise, for forward snowballing, we looked at the articles (titles only) citing our primary studies until no further studies were found. For forward snowballing, we used the ‘cited by’ feature of Google Scholar and went through the citations to every primary study. The number of studies considered in each of the steps mentioned above is presented in Fig. 2.
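The “until no new studies were found” stopping rule amounts to a fixed-point iteration over the citation graph. The sketch below is a simplified illustration under stated assumptions: it merges backward and forward snowballing into one loop, the callables `references_of`, `cited_by`, and `is_relevant` are hypothetical stand-ins for manual lookups and screening, and the toy citation data is invented.

```python
def snowball(seeds, references_of, cited_by, is_relevant):
    """Expand a seed set via references and citations until nothing new is found."""
    selected = set(seeds)
    screened = set(seeds)   # every study screened so far, accepted or not
    frontier = set(seeds)   # studies whose neighbours still need checking
    while frontier:
        candidates = set()
        for study in frontier:
            candidates.update(references_of(study))  # backward snowballing
            candidates.update(cited_by(study))       # forward snowballing
        unseen = candidates - screened
        screened |= unseen
        frontier = {study for study in unseen if is_relevant(study)}
        selected |= frontier
    return selected


# Toy citation graph (invented): each key cites the studies in its list.
REFERENCES = {"A": ["B", "X"], "B": ["C"], "X": ["Y"], "D": ["A"]}
CITED_BY = {}
for citing, cited_list in REFERENCES.items():
    for cited in cited_list:
        CITED_BY.setdefault(cited, []).append(citing)

final_set = snowball(
    seeds={"A"},
    references_of=lambda study: REFERENCES.get(study, []),
    cited_by=lambda study: CITED_BY.get(study, []),
    is_relevant=lambda study: study != "X",  # pretend X fails the criteria
)
print(sorted(final_set))
```

In this toy run, study Y is never examined because it is reachable only through the rejected study X, which mirrors how snowballing follows links from accepted primary studies only.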

Figure 2

Procedure to select primary studies.

2.3. Study Selection

We created inclusion-exclusion criteria tailored to the technical aspects of safety requirement elicitation, which is the focus of this study. We excluded studies on other dimensions of safety requirement elicitation. Some examples are: (a) social and human-related factors that play a role in eliciting requirements, like meetings, reviews, and communication among different parties; (b) a combination of technical, social, and human-related factors; (c) processes and techniques that merge safety requirement elicitation with requirement elicitation for other quality attributes like security; and (d) influences of agile and DevOps processes on safety requirements elicitation.

Our inclusion and exclusion criteria are as follows:

Inclusion criteria:

  • I1 Any study that presents, compares, or discusses approaches (techniques, models, frameworks, methods, processes, or methodologies) to (help) elicit safety requirements, either used in or usable for the automotive domain.

  • I2 Studies relating to safety requirements in the context of safety analysis, hazard analysis, or safety-critical standards from the automotive domain.

The exclusion criteria cover secondary studies; articles that are not written in English; non-peer-reviewed articles (gray literature); short articles (< 4 pages); studies below a quality threshold of 50% according to the quality assessment criteria detailed in Section 2.4 below; and studies that do not explicitly specify or are not from the automotive domain.

To ensure reproducibility, we measured inter-rater agreement. Before the first author selected primary studies, the other authors reviewed the selection criteria. To assess the quality of the selection process, the second and third authors independently applied the selection criteria to two disjoint random samples of the initial set of studies. The first author independently evaluated these two samples using the same criteria, resulting in an inter-rater agreement of 0.8 and 1 with the second and third authors, respectively, according to Cohen’s kappa statistic [21]. This shows the highest level of inter-rater agreement in both cases⁷. Each disagreement between two researchers was discussed and resolved with the intervention of a third researcher.
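For readers unfamiliar with the statistic, Cohen’s kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). A minimal sketch follows; the include/exclude decisions in the example are invented for illustration and are not the data from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' label frequencies, summed over labels.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented decisions ("I" = include, "E" = exclude) on ten studies.
first_author = ["I", "I", "I", "I", "I", "E", "E", "E", "E", "E"]
other_author = ["I", "I", "I", "I", "E", "E", "E", "E", "E", "E"]
kappa = cohens_kappa(first_author, other_author)
print(round(kappa, 2))
```

In this invented sample the raters disagree on one of ten studies, giving p_o = 0.9 and p_e = 0.5, and hence a kappa of about 0.8.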

We followed a five-step study selection procedure, as presented in Fig. 2. We incrementally applied the inclusion-exclusion criteria in steps 1, 3, 4 and 5.

2.4. Quality Assessment

Quality assessment of primary studies is crucial (i) “to investigate whether quality differences provide an explanation for differences in study results”; (ii) as a “means of weighting the importance of individual studies when results are being synthesised”; and (iii) “to guide recommendations for further research” [10].

We derived the quality assessment criteria from the guidelines of Kitchenham et al. [10], Tiwari et al. [14], and Wieringa et al. [15], as summarized in the first column of Table 1. We chose those questions from the guidelines that apply to our list of primary studies. For example, the question “Was there any control group present with which the treatments can be compared” [14] does not apply to studies that present the application of a requirement elicitation technique to an automotive component.

Criterion (percentage of primary studies)       Yes     Partially   No
Aim clearly presented                           100%    0%          0%
Approach clearly explained                      83%     17%         0%
Clarity of application context                  31%     63%         6%
Threats to validity taken into consideration    6%      1%          93%
Presence of discussion on results               46%     31%         23%
Limitations/scope discussed                     6%      9%          85%
Related work discussed                          64%     27%         9%
Table 1

Quality assessment summary of primary studies. The bold face part shows two aspects that only a few articles report.

All the questions that form our assessment criteria have three possible answers “Yes”, “Partially,” and “No,” with a score of 1, 0.5, and 0, respectively. The score of each primary study is the sum of scores for every pertinent question (see Table 1). We use this scoring method to gauge the primary studies’ credibility, completeness, and relevance.
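The scoring scheme, together with the 50% quality threshold from the exclusion criteria in Section 2.3, can be sketched as follows. Normalizing by the number of answered (pertinent) questions is our assumption about how the percentage is obtained, and the example study is invented.

```python
SCORES = {"Yes": 1.0, "Partially": 0.5, "No": 0.0}


def quality_score(answers):
    """Sum the per-question scores of one primary study."""
    return sum(SCORES[answer] for answer in answers.values())


def passes_threshold(answers, threshold=0.5):
    """Assumption: the threshold applies to the score divided by the
    number of pertinent (answered) questions."""
    return quality_score(answers) / len(answers) >= threshold


# Invented assessment of a single study against the criteria in Table 1.
example_study = {
    "Aim clearly presented": "Yes",
    "Approach clearly explained": "Yes",
    "Clarity of application context": "Partially",
    "Threats to validity taken into consideration": "No",
    "Presence of discussion on results": "Partially",
    "Limitations/scope discussed": "No",
    "Related work discussed": "Yes",
}

print(quality_score(example_study), passes_threshold(example_study))
```

The invented study scores 4 out of 7 (about 57%) and would therefore clear the 50% threshold under this reading.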

2.5. Data Extraction and Synthesis

We iteratively created a digital data extraction form. We created an initial set of attributes to be extracted from the primary studies and applied it to 10% of the initial set of primary studies. The form was then revised at 30%, 50%, and 70% (of the initial set of primary studies), based on the categories and information emerging from data extraction and synthesis. Whenever extra attributes were added to the data extraction form during the process, we made a backward pass over the previously processed studies. Finally, the following categories of data were extracted from primary studies:

  • Administrative information: Article ID; Title; and Source.

  • Literature characteristics: Authors; Year; Venue type (journal, symposium, conference, or workshop); and Venue.

  • Data pertaining to RQ1: What processes are used for or applicable to safety requirement elicitation in the automotive domain?

    • What process(es) are employed, compared, or discussed?

    • What is the motivation or reason to choose a specific process (or how a process compares with other processes)?

    • What are the steps followed in executing the process?

    • Which component or setting is the process applied to?

  • Data pertaining to RQ2: What techniques are used for safety requirement elicitation in the automotive domain?

    • What are the techniques employed, compared, or discussed?

    • Which high-level step (of a process) is accomplished using the specific techniques used or discussed?

    • What is the motivation or reason to choose a specific technique (comparison among techniques)?

    • Which component or setting is the technique applied to?

The data extraction form is a table with one column for each of the above categories and a row for each article.

Data synthesis was performed after aggregating the information collected using digital forms. The data synthesis is achieved by cross-reading each column to answer the research questions.

3. CHARACTERISTICS OF PRIMARY STUDIES

Our study selection process resulted in 102 primary studies [P1–P102]. This section presents an overview of the publication landscape, quality, and a preliminary analysis of the primary studies. We analyze the publication trend across venues. We also classify the studies based on their domain and on whether the study is a contribution from industry, academia, or both.

Quality assessment of primary studies shows that most existing studies do not report the limitations and scope of their solution. We assessed the quality of each study according to the criteria summarized in Table 1. On application of our quality assessment criteria, we found that only a few articles (at least partially) report the following two aspects (highlighted with boldface in Table 1): (1) threats to validity, taken into account in 7% (7 out of 102) of primary studies; and (2) limitations, discussed in 15% (15 out of 102) of primary studies. The first observation is not surprising since only a few studies (4 out of 102) report empirical analyses. However, because only a few studies report scope or limitations, reuse and replication of the majority of studies is difficult. Our advice for future articles is to explicitly state the limits and scope of the proposed solution.

Quality assessment of primary studies: Most existing studies do not report limitations and scope, which limits their reuse. We advise future studies to report them.

Most studies in safety requirement elicitation are published at conferences. We classified studies based on the type of venue as shown in Fig. 3a. Most of the primary studies (66 out of 102) are published at conferences and symposiums. Twenty studies are published in workshops and sixteen in journals.

Figure 3

An overview of the publication, domain, and contribution landscapes of the primary studies.

The publication landscape of safety requirement elicitation has substantial contributions from industry. We classified the studies according to industry, academia, or industry–academia collaboration contributions. Two intuitive means to identify the source of contributions are (1) the affiliation of authors and (2) the source of case studies and data. The latter is not feasible in our context since many primary studies do not explicitly specify the source of the case study, data, or the examples they use. Therefore, we choose to classify the source of the study based on the author’s affiliation into three categories: (1) industry; (2) academia; and (3) industry–academia collaboration. If all the authors are affiliated with the industry, then the study is considered from industry and similarly for academia. If a study has author affiliations from both industry and academia, it is classified as industry–academia collaboration. The resulting classification is presented in Fig. 3b. This categorization shows that most of the primary studies (54 out of 102) have some contributions from industry, making the safety requirement elicitation publication landscape one with a solid industrial contribution.
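The affiliation-based classification rule described above is simple enough to state as a short sketch. The function name and the affiliation labels are hypothetical; in practice, deciding whether a given institution counts as industry or academia is a manual judgment.

```python
def classify_source(author_affiliations):
    """Classify a study by the kinds of affiliations of its authors.

    Each entry is assumed to be either "industry" or "academia".
    """
    kinds = set(author_affiliations)
    if not kinds or not kinds <= {"industry", "academia"}:
        raise ValueError("affiliations must be 'industry' or 'academia'")
    if kinds == {"industry"}:
        return "industry"
    if kinds == {"academia"}:
        return "academia"
    return "industry-academia collaboration"


print(classify_source(["academia", "industry", "academia"]))
```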

Publication landscape: A majority of studies are published in conferences and symposiums with 53% of all studies having contributions from industry.

Safety requirement elicitation is a multi-disciplinary field of research. We examined the domain of each primary study. We identified the domains by research areas listed on the web page of the publication venue of each primary study. When a venue represented one domain, publications were classified to that domain. There were also cases where a venue represented more than one domain. Here, we chose to report the domains together if most of the venues listed two or more areas together, like reliability engineering and safety engineering. Otherwise, we counted the study in both areas. For this analysis, we excluded venues like the International Conference for Convergence in Technology, which does not have specific research areas, or venues like the International Conference on Networks, Communication, and Computing, which specify a wide variety of unrelated domains. This does not affect our classification since less than 1% of the primary studies are from such venues. All the domains that cumulatively covered less than 5% of the publications were classified as ‘other’ in the final list of domains. The resultant set of domains is presented in Fig. 3c. The set of domains depicted shows the multi-disciplinary nature of safety requirement elicitation research.

Multi-disciplinarity: Primary studies span different disciplines and are dominated by software, systems, reliability and safety engineering.

4. SAFETY REQUIREMENT ELICITATION PROCESSES (RQ1)

In this section, we identify, summarize, and compare processes (also referred to as methodologies [P32,P59]) used or discussed in the primary studies for safety requirement elicitation in the automotive context. In our context, processes or methodologies refer to the high level steps that describe the elicitation of safety requirements [2,3,24,P20,P40,P46,P59,P77,P86]. For example, the safety requirement elicitation process prescribed by the industry-standard ISO 26262 consists of four high-level steps as shown in Fig. 4.

Figure 4

Different safety requirement elicitation processes from the automotive domain. The ISO 26262 process (highlighted in red) forms the basis for safety requirement elicitation in the domain. The circled numbers (①, ②, ⑤-⑦) point to the part of Section 4.1 in which the corresponding methodology is discussed. Similar steps across different processes are color-coded the same. The methodologies proposed by Schönemann et al. [P59], Saberi et al. [P86], and Kochanthara et al. [P32], described in ③ and ④ in Section 4.1 use the entire ISO 26262 process without any modification (with the primary difference of partitioning the scenarios space before starting the ISO 26262 process). Therefore, we have not presented them in the figure.

We organize the rest of this section into two parts. Section 4.1 compiles the safety requirement elicitation processes proposed, used, or discussed in the primary studies. Section 4.2 analyzes the application of the processes reported in the literature, their temporal trend, the associated research gaps, and the upcoming domain trends.

4.1. Findings

The primary studies mention 9 unique processes for safety requirement elicitation. We organize them into the following 7 categories (① – ⑦).

There are 9 unique processes for safety requirement elicitation in the automotive domain.

① ISO 26262: The ISO 26262 standard [2] forms the basis for safety requirement elicitation in the automotive domain. One focus area of the standard is risks from systematic and random hardware failures in automotive electronic systems and the software that runs them (also termed functional safety). The standard is an adaptation of IEC 61508 [25] to the automotive domain, the latter being a generic standard that outlines safety guidelines for developing any electronic and programmable systems that carry out safety functions.

The safety requirement elicitation process outlined by the standard (ISO 26262: part-3 [2]), highlighted by the dashed red rectangle in Fig. 4, comprises four steps: item definition, hazard analysis, risk assessment, and safety analysis. The item definition step specifies the system and its boundaries on which the rest of the steps will be performed. The hazard analysis step first identifies possible situations that can be a safety risk, followed by safety goals to prevent harm in those situations. The risk assessment step assesses the level of risk for each safety goal based on three factors:

  (a) Exposure (frequency or probability of the safety goal violation);

  (b) Controllability (by the driver, in situations of safety goal violation); and

  (c) Severity (of harm on violation of the safety goal).

It then allocates a score from A through D, indicating the importance of covering a safety goal during further development steps, with D indicating the highest risk level and A the lowest. This score is called the Automotive Safety Integrity Level (ASIL) score. Note that these safety goals have a system-wide scope. Finally, the safety analysis step maps these safety goals to safety requirements for individual sub-systems or components, which can then be used in developing individual sub-systems.
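To illustrate how the three factors combine into an ASIL, the sketch below uses the class labels of ISO 26262-3 (severity S1 to S3, exposure E1 to E4, controllability C1 to C3), which are not spelled out in the text above. It relies on a commonly used shortcut, namely that the standard's ASIL determination table can be reproduced by summing the three class indices; the outcome QM (quality management, i.e. no ASIL assigned) is likewise taken from the standard. This is a reading aid under those assumptions, not a substitute for the standard's table.

```python
def asil(severity, exposure, controllability):
    """Return the ASIL for classes S1-S3, E1-E4, C1-C3 (given as integers).

    Shortcut assumption: the ASIL follows from the sum of the class indices
    (10 -> D, 9 -> C, 8 -> B, 7 -> A, lower -> QM).
    """
    if severity not in (1, 2, 3):
        raise ValueError("severity must be 1..3")
    if exposure not in (1, 2, 3, 4):
        raise ValueError("exposure must be 1..4")
    if controllability not in (1, 2, 3):
        raise ValueError("controllability must be 1..3")
    total = severity + exposure + controllability
    return {10: "D", 9: "C", 8: "B", 7: "A"}.get(total, "QM")


# Highest severity, highest exposure, hardest to control: highest risk level.
print(asil(severity=3, exposure=4, controllability=3))
```

Lowering any single factor by one class lowers the result by one level, which shows why all three factors matter equally in this scheme.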

The above process, prescribed by the ISO 26262 standard, forms the basis of safety requirement elicitation in the automotive domain. Note that we do not differentiate between the 2018 and 2011 versions of ISO 26262 or its regional variants (for example, the Chinese GB/T 34590 safety standard, used by Zhou et al. [P85]) since the fundamental concepts and high-level steps do not differ among them. The other processes from the primary studies that are described in this section build on or complement ISO 26262 in various ways. They fundamentally differ from ISO 26262’s process in concept, model, or high-level steps.

Basis: The process outlined by ISO 26262 (part-3) forms the basis of all safety requirement elicitation processes in the automotive domain.

The rest of this section is organized as follows. Parts ② – ④ describe five processes that extend ISO 26262 requirement elicitation. After those, we elaborate on three methodologies proposed as replacements for ISO 26262 in parts ⑤ – ⑥. Finally, we present a process (⑦) that complements the ISO 26262 elicitation process.

② Social safety requirements: Gharib et al. [P21] argue that ISO 26262 does not consider safety requirements specific to driver behavior, which they term “social safety requirements”. An example of a social safety requirement is “identify available information concerning the driver state (e.g., head pose and motion, hands and foot location and motions) at any point in time” [P21]. Gharib et al. [P21] propose adding a step to elicit social safety requirements at the end of ISO 26262 requirement elicitation, shown as a white box below the ISO 26262 process flow in Fig. 4. They also present an example of the proposed extension on a maneuver assistance system that detects and responds to drivers’ unintended maneuvers. However, they do not describe any systematic method to elicit social safety requirements.

Social safety requirements are safety requirements specific to driver behavior, proposed by Gharib et al. [P21]. Details on how to elicit these requirements are not available.

③ Safety requirements for connected driving: Connected driving refers to multiple vehicular systems and traffic infrastructure communicating and working as a single system. This forms a system of systems and enables collective traffic optimizations. Such a system of systems perspective is missing in ISO 26262 [P32,P86], thus making ISO 26262 unsuited for use in the connected driving context. Saberi et al. [P86] and Kochanthara et al. [P32] propose augmentations to the ISO 26262 process for this context.

The process by Saberi et al. [P86] uses the steps for safety requirement elicitation from ISO 26262 but executes them separately, from a system of systems perspective. Their extension to ISO 26262 is inspired by hazard analysis for systems of systems from other domains [26]. Saberi et al. [P86] also partially outline the usage of their method in a connected driving use case of trucks, where a lead truck is driven by a driver and a sequence of other trucks autonomously follows the leader.

Kochanthara et al. [P32] suggest a two-step extension to the ISO 26262 process that accounts for the heterogeneity of connected systems, i.e., traffic infrastructure and multiple kinds of vehicles (for example, trucks and cars, or different types of cars) forming a connected system. The first step partitions the scenarios for connected driving operations into sets of scenarios specific to each distinct system (vehicle) and a set of scenarios common to or involving the entire connected system. Thus, a connected driving system consisting of n distinct types of vehicles will have n + 1 sets of scenarios. The ISO 26262 process is then applied to each of the n distinct types of vehicles separately, which completes the first step. The second step considers the scenarios specific to the connected system, i.e., those common to the entire connected system or involving more than one vehicle. A connected system architecture is created from the components of the individual systems (the distinct types of vehicles and the traffic infrastructure). This architecture should show the possible interactions across the individual systems’ components. The ISO 26262 process is then applied to the entire connected system using this architecture and the scenarios specific to the connected system. They also show the applicability of their approach in a use case similar to that of Saberi et al. [P86].
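The partitioning into n + 1 scenario sets can be sketched as follows. This is only an illustration under simplifying assumptions, not the authors’ implementation: a scenario is reduced to a name and a list of participating vehicle types, traffic infrastructure is left out, and the function names `partition_scenarios` and `analysis_plan` as well as the example scenarios are hypothetical.

```python
def partition_scenarios(scenarios, vehicle_types):
    """Split scenarios into n per-type sets plus one connected-system set.

    `scenarios` maps a scenario name to the list of vehicle types taking part.
    A scenario with a single participant is specific to that vehicle type;
    any scenario with several participants belongs to the connected system.
    """
    per_type = {vehicle_type: [] for vehicle_type in vehicle_types}
    connected = []
    for name, participants in scenarios.items():
        unknown = set(participants) - set(vehicle_types)
        if not participants or unknown:
            raise ValueError(f"scenario {name!r} has invalid participants")
        if len(participants) == 1:
            per_type[participants[0]].append(name)
        else:
            connected.append(name)
    return per_type, connected


def analysis_plan(scenarios, vehicle_types):
    """List the n + 1 applications of the ISO 26262 process, in order."""
    per_type, connected = partition_scenarios(scenarios, vehicle_types)
    plan = [(f"step 1: {vt}", names) for vt, names in per_type.items()]
    plan.append(("step 2: connected system", connected))
    return plan


if __name__ == "__main__":
    example = {  # hypothetical scenarios, for illustration only
        "truck brakes on a downhill slope": ["truck"],
        "car changes lane": ["car"],
        "follower truck closes the gap to the lead truck": ["truck", "truck"],
        "car cuts in between two platooning trucks": ["car", "truck", "truck"],
    }
    for label, names in analysis_plan(example, ["truck", "car"]):
        print(label, names)
```

With two vehicle types, the plan contains three entries: one ISO 26262 pass per vehicle type in the first step and one pass over the connected system in the second.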

Connected driving refers to multiple vehicular systems and traffic infrastructure communicating with each other and working as a single connected system. To elicit safety requirements for connected driving, Saberi et al. [P86] and Kochanthara et al. [P32] extend the ISO 26262 process to cover both individual-vehicle perspective(s) and a connected system perspective.

④ Scenario-based safety analysis for automated driving: Schönemann et al. [P59] argue that the number of operating situations to be considered for safety requirement elicitation for automated driving might be unlimited, making the ISO 26262 process infeasible. Their proposal extends the ISO 26262 process for automated driving with the abstraction of the system’s behavior to limit the number of parameters. The proposed process starts with decomposing the system’s functional behavior into a higher abstraction of functional scenarios and then performing ISO 26262 safety requirement elicitation specifically for each functional scenario. Thus, the steps (other than decomposing the system’s functional behavior into functional scenarios) are the same as in ISO 26262.

⑤ New requirement elicitation process for automated driving: Warg et al. [P77] argue that ISO 26262 assumes a driver for fallback in case of a malfunction or failure of a component, whereas there is no driver to fall back to in the case of entirely automated driving. To address this gap, Warg et al. [P77] present an approach based on the idea of adequately specifying the system and its functions iteratively using their version of hazard analysis. This process ensures that the automated driving functions are adequately specified and that all the relevant hazardous events related to all functions that enable automated driving are covered. Their process differs from the one by Schönemann et al. [P59] (see ④ above) in two aspects: (1) Schönemann et al. [P59] use the ISO 26262 process as is, while Warg et al. [P77] propose a new iterative process to replace it; and (2) Schönemann et al.’s [P59] process aims to scale the ISO 26262 process to an infeasibly high number of operating situations via abstraction, whereas Warg et al.’s approach is based on iteratively defining a function and a minimal representative set of hazardous conditions.

Warg et al. proposed a new process, shown as the rightmost flow in Fig. 4. It differs from the ISO 26262 process in the first two steps: preliminary feature description and new hazard analysis. The preliminary feature description step describes the proposed feature’s anticipated benefits and initial scope. The hazard analysis step consists of two sub-steps: finding dimensioning hazardous events and function refinement. The dimensioning hazardous events are the sets of hazardous events that are sufficient to identify all critical safety goals. The function refinement sub-step aims to elicit requirements for the intended nominal functionality and define the scope of each function. Therefore, the hazard analysis step results in (ideally) a minimal set of hazardous events and functionality refined from the preliminary feature description.

Another significant difference between this process and ISO 26262 is that the definition of the system and its boundaries (item definition) is an intermediate step in Warg et al.’s [P77] process, yet it is the starting step in ISO 26262 (see Fig. 4).

Automated driving refers to a vehicle taking over (part of the) driving and eliminating the need for a human for (some) driving tasks. Schönemann et al. [P59] and Warg et al. [P77] propose processes to address two limitations of the ISO 26262 process in an automated driving context, respectively: (a) the possibility of an unlimited number of conditions and (b) the assumption of driver fallback, which leads to inadequate specification of functionalities.

⑥ Systems Theoretic Process Analysis (STPA): The primary difference between STPA and ISO 26262 is that STPA treats accidents as a control problem rather than a failure problem.

STPA is proposed for safety requirement elicitation in the general context of safety-critical systems from a system theoretic point of view [24]. In the literature, STPA has been used in the automotive domain in two contexts: (1) as a methodology for safety requirement elicitation that complies with the ISO 26262 standard [P33,P40,P94]; and (2) as a technique for the steps (similar to) hazard analysis and safety analysis [3,P89], from the ISO 26262 and ISO 21448 (described in ⑦) processes. In this section, we discuss the former context.

STPA intends to prevent accidents by enforcing constraints on component behavior and interactions. The underlying model of STPA promises to cover more causes of accidents than component failures, like design errors and flawed requirements [24]. Therefore, two dominant reasons to advocate STPA over the ISO 26262 process are: (1) the wide variety of error types it covers, including component failures, inadequate component interactions, software failures, human error, and system failure, whereas ISO 26262 only covers random hardware failures and systematic failures [P33]; and (2) less dependence on expert knowledge, since expert knowledge is not always required for requirement elicitation [P33,P89].

However, the original STPA process [24] lacks a risk assessment step. Mallya et al. [P40] showed that the STPA process could be augmented to include the risk assessment step from ISO 26262, as shown in flow ⑥ (STPA) in Fig. 4. The STPA safety requirement elicitation process itself consists of four steps, as shown in Fig. 4. The first step (identify systemic hazards) identifies the possible accidents and the hazards that can cause these accidents, and is similar to the hazard analysis part of ISO 26262. The second step, defining a functional architecture (control structure in STPA’s original terminology), includes defining the system’s interaction with the environment and stakeholders in addition to defining the system and its boundaries as is done in the ISO 26262 process. The third step, identifying unsafe control actions using the functional architecture and hazards from the previous steps, is similar to a partial merge of the hazard analysis and safety analysis steps in ISO 26262. We call this a partial merge because identifying hazards (possible accidents), which belongs to the hazard analysis step of ISO 26262, and identifying safety requirements, which belongs to its safety analysis step, are not part of identifying unsafe control actions in STPA. These two parts come in the first and last steps of STPA, respectively.

After the third step in STPA, Mallya et al.’s augmentation [P40] adds the risk assessment from ISO 26262. The final step is similar to the last step from ISO 26262 (safety analysis). It identifies safety requirements based on the unsafe control actions (and their corresponding risk level in Mallya et al.’s augmentation [P40]).
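The third STPA step, identifying unsafe control actions, can be sketched as a systematic enumeration over the control actions in the functional architecture. The four ways in which a control action can be unsafe used below follow the commonly published STPA guidance and are not listed in the text above, so they should be read as an assumption. The `ControlAction` type, the function name, and the example control actions are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

# Assumed categories of unsafe control actions (from general STPA guidance).
UCA_TYPES = (
    "not provided",
    "provided",
    "provided too early, too late, or out of order",
    "stopped too soon or applied too long",
)


@dataclass(frozen=True)
class ControlAction:
    controller: str
    action: str
    controlled_process: str


def candidate_unsafe_control_actions(control_actions):
    """Pair every control action with every assumed way it can be unsafe.

    The result is a checklist for analysts: each pair still has to be judged
    against the identified hazards before it counts as an unsafe control action.
    """
    return list(product(control_actions, UCA_TYPES))


if __name__ == "__main__":
    actions = [  # hypothetical control structure, for illustration only
        ControlAction("cruise controller", "brake request", "brake system"),
        ControlAction("cruise controller", "torque request", "powertrain"),
    ]
    for control_action, uca_type in candidate_unsafe_control_actions(actions):
        print(f"{control_action.action} ({uca_type})")
```

The enumeration only produces candidates; linking a candidate to a hazard and, in Mallya et al.’s augmentation, assigning it a risk level remain analyst tasks.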

Systems Theoretic Process Analysis (STPA) originates from systems theory and treats accidents as a control problem, rather than as the failure problem on which ISO 26262 focuses. STPA promises inclusion of requirements beyond systematic and hardware failures (which is the scope of ISO 26262), including inadequate component interactions and human errors.

⑦ ISO 21448: The standard ISO 21448 [3], alternatively known as Safety of Intended Functionality (SOTIF), is introduced to complement ISO 26262 in contexts “where proper situational awareness is essential to safety and where such situational awareness is derived from complex sensors and processing algorithms” [3]. The SOTIF process focuses on hazards that do not exist because of failures, but instead because of insufficiency of specification, performance limitations, insufficient situational awareness, incorrect and inadequate Human-Machine Interface design, impact from active infrastructure and vehicle to vehicle communication, and external systems [3]. Note that attacks exploiting vehicle security vulnerabilities, and intentional actions that violate the system’s intended use, like a substitute hand to fool a “hands-on wheel” detection safety measure, are out of the scope of SOTIF. Also, the standard is in its development stage, and a complete version is yet to be released to the public.

The SOTIF safety requirement elicitation process consists of five steps, as shown in Fig. 4. The first step, specification and design, is analogous to the item definition step from ISO 26262. This step consists of defining the operating environment (such as road surface and climatic conditions), the system’s interactions with users and traffic participants, and the system and its boundaries. The second step, SOTIF-related hazard identification, is analogous to the hazard analysis step from ISO 26262 but scoped to intended functionality rather than failure scenarios. The third step, SOTIF-related risk evaluation, is analogous to risk assessment from ISO 26262 and is based on three aspects: occurrence frequency of a given scenario, the severity of a triggering condition (same as in ISO 26262), and the effectiveness of the measures taken for the safety of the intended functionality. The fourth step identifies insufficiencies of specification, performance limitations, and conditions that trigger the limitations that could initiate hazardous behavior. The final step is deriving safety requirements to avoid or prevent hazardous behavior.

Safety of Intended Functionality (SOTIF, ISO 21448) complements the ISO 26262 process for automated driving and includes requirements relating to insufficiencies of specification, performance limitations, insufficient situational awareness, incorrect and inadequate Human-Machine Interface design, impact from active infrastructure and vehicle to vehicle communication, and external systems.

4.2. Analysis

We analyze the processes presented in the prior section based on their temporal adoption trend and context of usage. Based on this information and the comparison presented in the preceding section, we present research gaps and upcoming domain trends.

The temporal trend of a process’s adoption is an indicator of its use. A positive trend points to tooling and community support, both of which are essential for applying the process to large-scale systems. The adoption of processes across years does not reveal any apparent overall pattern (Fig. 5). However, the number of publications that mention the usage of ISO 26262’s process peaked in 2019, with a substantial decrease in 2020. We present three hypotheses for this decline: (1) the 2020 drop in ISO 26262 usage may reflect disruption from COVID-19, which slowed certification workflows, audits, and formal safety assessments across automotive development pipelines; (2) some organizations may have shifted attention to rapid deployment of ADAS and automated driving features, emphasizing agile experimentation over strict standards compliance; and (3) the rise of AI-centric components and novel architectures may have pushed researchers toward alternative frameworks or extensions (e.g., ISO 21448), temporarily reducing explicit ISO 26262 references.

Figure 5

Number of primary studies that discuss safety requirement elicitation processes plotted across years.

The usage of the upcoming ISO 21448’s process shows a steadily increasing trend. The number of publications that report usage of STPA first peaked in 2016, followed by a decline and then a small increase. Adoption of the remaining processes is reported primarily in the articles that describe them or in articles by the same authors. The only exception is one article by Stolte et al. [P66], which reports usage of Warg et al.’s [P77] process (described in ⑤). In summary, the application of processes proposed by industry-specific standards dominates across years, while the other processes proposed to extend or replace the standards do not show wide adoption, except for STPA. Note that the adoption estimates are based on the primary studies, and real industry adoption might be different. Since industrial usage might not always be apparent from published articles, it will have to be substantiated via future research, for instance, using interviews or surveys.

High adoption of industry standard processes: ISO 26262’s safety requirement elicitation process dominates in adoption across years, while the adoption of ISO 21448’s process shows a steady increase. Other processes proposed to extend or replace ISO 26262’s process do not show wide adoption, with the exception of STPA.

The practicality and limitations of processes in different use-cases will only be known by the application of the process in those use-cases. Therefore, prior application contexts are important for researchers, practitioners, educators, and beginners to identify avenues for future research, the starting point for similar use-cases, and to act as a guide for the application of processes. For this, we first group the components and use cases where the processes have been applied into the following six categories based on existing classification schemes in literature [19,27].

  1. Vehicle-centric functional components are the components necessary for the functioning of the vehicle. Examples are powertrain, battery management system, braking and steering system (including their safety systems such as anti-lock braking system), and fuel injection system.

  2. User-centric functional components, which in our context is the Human-Machine Interface (HMI), for interacting with the vehicle.

  3. Advanced driver assistance systems like adaptive cruise control, traffic jam assist, and lane management system, which can assist a human driver in a driving task but cannot accomplish a complete driving task on their own.

  4. Advanced active safety systems like advanced emergency braking system, collision warning and avoidance systems, and maneuver assistance system that use multiple sensors and sensor fusion techniques to actively ensure the safety of traffic participants.

  5. Highly automated driving use-cases where the vehicle is capable of handling a complete use-case without driver intervention like automated valet parking, automated shuttles that can operate without a human driver on specific routes, and completely automated driving that can drive ideally anywhere or at least in a geofenced area without the active intervention of a human driver.

  6. Connected driving use-cases like platooning where multiple vehicles form a vehicle train with only the lead vehicle driven by a human and the rest autonomously following the vehicle in front.

The number of articles that report safety requirement elicitation in each of the above categories and the processes they use is plotted in Fig. 6. ISO 26262’s process is used in primary studies for every category except User-centric functional components (HMI). STPA and ISO 21448 processes and the processes proposed for automated driving have been reported in more complex use-cases and for HMI, with the exception of one study on battery management systems. HMI is not traditionally seen as a safety-critical component; however, with the advent of advanced driver assistance systems, and highly automated and connected driving, HMI is becoming safety-critical.

Figure 6

Number of primary studies presenting the usage of different safety requirement elicitation processes.

Essential versus Advanced functions: ISO 26262’s process has been shown to be usable for safety requirement elicitation of essential automotive functions that do not require complex situational awareness. STPA and ISO 21448’s processes cumulatively dominate in the use-cases that need complex situational awareness.

Informed by the adoption, temporal trend, and comparison of processes, we point to the following three dimensions that need further exploration. One, components and use-cases to which the current processes can be applied, but have not been presented in the literature. Two, new contexts and emerging technologies like neural processing units and Machine Learning (ML) based software, which might require new processes for safety requirement elicitation. Three, the introduction of new systems outside the vehicle like intelligent traffic infrastructure and use of the cloud for (assisting) automated driving. The scope and criticality of these external systems will have to be quantified to identify (1) whether they need a separate or integrated requirement elicitation and (2) whether the existing process can be used. We elaborate on each of these aspects in the rest of this section.

The literature lacks case studies on categories such as advanced active safety systems (for example, blind-spot detection) and advanced driver assistance systems (for example, adaptive cruise control), as evident from Fig. 6. These features have matured from research and development to production and are standard features in today’s high-end cars. Upcoming technologies currently in research and development, like highly automated and connected driving, also lack case studies compared to other categories. The adoption of processes other than the industry-standard processes is still low. This low adoption makes iterative improvements and the scope of applicability of these processes harder to identify and hinders further research. The majority of existing case studies on advanced systems for automated and connected driving are rudimentary examples. The practicality of the new processes presented within the scope of these advanced systems is yet to be shown in practice. To summarize, new processes proposed in primary studies need more case studies in their respective context for a conclusive verdict on the feasibility of their application in real-world settings.

Case studies: There are few to no case studies that use requirement elicitation processes other than ISO 26262 or that address advanced components and use-cases.

All the advanced driver-assist and safety features as well as automated and connected driving are enabled by special-purpose hardware and software that utilizes ML. Yet neither of the two standards, ISO 26262 and ISO 21448, mentions any requirements or processes for developing neural networks for use in such systems [28]. The basis and underlying model on which the requirement elicitation processes were built belong to an era before ML-based software was mainstream. In ML-based systems, the system’s behavior is dictated by the input data and not by human-written logic. The newly proposed processes build on the basic safety requirement elicitation framework proposed by ISO 26262 and mainly focus on the increased complexity, the absence of a driver (controllability aspect), sufficient specification of (safety) requirements, and foreseeable misuses in the context of automated and connected driving, rather than on the above-mentioned fundamental changes.

ML-based sub-systems: Whether the current processes are sufficient to elicit safety requirements for special purpose hardware and ML-based software is an open question.

Connected driving builds on collective traffic optimization by a set of vehicles communicating with each other and external entities (roadside infrastructure) using peer-to-peer networking or via the cloud. With the intelligent traffic infrastructure being a part of a driving task and the cloud acting as a data intermediary or a sensor, both of them become safety-critical components. However, the new methods proposed to elicit safety requirements for the connected driving context focus on the vehicles and do not consider the safety requirements for the infrastructure involved.

Smart infrastructure and intermediaries: How to elicit safety requirements for traffic infrastructure and communication intermediaries like the cloud in the context of connected driving is another open question.

The landscape shifts in the automotive industry brought by automated driving will also impact safety requirement elicitation. Just as the smartphone replaced three separate devices for music, taking photos, and communication, automated driving bundles advanced tasks together. Therefore, many of the categories specified in Fig. 6 might merge into a single one in the future. This changes prior assumptions on driver fallback and turns many non-safety-critical systems into safety-critical ones. For example, maps, previously a non-safety-critical part used at human discretion, have an active role in automated driving and are sometimes considered a sensor. Previously, the driver was assumed to be vigilant and ready all the time for fallback. Now, HMI can be used to decide on a driving task and for emergency takeover, making HMI a critical safety component. The impact of such changes on safety requirements and their elicitation is yet to be seen.

Blurring lines: No more separation of advanced driver assistance and safety systems; these all are bundled together in highly automated driving.

5. SAFETY REQUIREMENT ELICITATION TECHNIQUES (RQ2)

In this section we identify, review, taxonomize, and compare techniques (also referred to as methods [2,24,P20,P46,P78]) for safety requirement elicitation in the automotive domain. Techniques or methods refer to the alternative ways to conduct each high-level step [3,7,24,P40,P46,P59,P77] in safety requirement elicitation processes as presented in Section 4. For example, two widely used techniques to conduct the safety analysis step in the ISO 26262 process (the last step in the red highlighted part of Fig. 4) [2] are fault tree analysis [29] (FTA) and failure mode effect analysis [30] (FMEA).

We organize the rest of this section into two parts. Section 5.1 presents a summary and taxonomies of techniques for safety requirement elicitation. Section 5.2 analyzes the techniques, their scope of use, the associated research gaps, and the upcoming domain trends.

5.1. Findings

There are 38 distinct safety requirement elicitation techniques discussed in the primary studies. Our taxonomy is inspired by stages in automotive product development and the reference product life cycle from industry standards [2,3]. In automotive product development, the norm is to design and evaluate at the system level, then subsystem level, followed by the design and development of the individual hardware and software components. We follow the same structure for creating our taxonomy. Our taxonomy is organized based on the level of application (system, software, and hardware), the usage context (general, automated driving, and connected driving), and the scope of application of the techniques, as shown in Tables 2 to 4. This allows easy identification of the choice of techniques by both researchers and practitioners. We organize the entire taxonomy of techniques based on the steps in safety requirement elicitation processes (since the techniques are alternate ways to conduct these steps).

Table 2

Hazard analysis techniques from primary studies. All the above techniques are reported in the corresponding primary studies with the scope of identifying/deriving hazardous events or situations and thus deriving safety goals.

To group techniques based on the safety requirement elicitation steps, we must adhere to the definitions of what each step is intended to achieve in the safety requirement elicitation processes. However, in the literature, there is little consensus on this subject. For example, Vilela et al. [7] use the term safety analysis to denote any technique used in safety requirement elicitation in the larger context of safety-critical systems; while Kolln et al. [P33], and Abdulkhaleq et al. [P89] use both the terms hazard analysis and safety analysis to compare the same set of techniques. Since this study focuses on the automotive domain, we follow the terminology from the industry standards ISO 26262 and ISO 21448. These standards have industry consensus and wide adoption (as is evident from the discussion in Section 4.2 above).

For consistency and brevity, we group techniques for similar steps of safety requirement elicitation processes (shown with the same color in Fig. 4) together and refer to each group with the terms (for steps) from the ISO 26262 process. However, in our taxonomy, we refer back to the original processes (refer to the fourth column of Tables 2 to 4 that shows our taxonomy). Note that the social safety requirement elicitation step [P20], shown in the white box in Fig. 4, is not similar to any of the four steps from the ISO 26262 process. We did not find any technique for this step in primary studies including the original study itself [P20]. Therefore, we do not consider this step for our taxonomy. Also, our taxonomy does not have the first step of the ISO 26262 process (item definition) nor similar steps in other processes. This step (formally or informally) defines the system, its boundaries, the environment it operates in, the traffic participants it interacts with, and the stakeholders. Such a definition uses languages for modeling, which is a topic in itself for another study and beyond the scope of this work. Therefore, we group techniques based on their usage for each of the other three steps (and corresponding similar steps in other processes shown in Fig. 4).

We present the techniques in the following three parts (corresponding to the three steps in the ISO 26262 process): Hazard analysis, Risk assessment, and Safety analysis. Tables 2 to 4 show our taxonomy, one for each of the three steps. The techniques are presented in the third column of each table. The primary studies that mention these techniques along with the corresponding process (identified from each primary study) are presented in the fourth column. The first two columns present two dimensions of our taxonomy, and a third dimension is specified in the caption of each table. Note that we only considered explicitly mentioned techniques. For example, Wang et al. [P75] use the method of brainstorming but do not mention it explicitly; thus, the study is not considered for the technique. Also, if a method is used for more than one step, it is presented in all steps.

Techniques: There are 38 distinct techniques used in primary studies for safety requirement elicitation. We group them into three categories: hazard analysis, risk assessment, and safety analysis. We taxonomize techniques in each category based on the application level, usage context, and scope, as shown in Tables 2 to 4.

5.1.1. Hazard Analysis

The objective of hazard analysis (and similar steps) is to identify (a) (minimal set of) hazardous events either caused by a system’s malfunctioning behavior or related to the system’s intended functionality, (b) unsafe control actions, (c) possible accidents, and (d) hazards that can cause the accidents. These can then be used to derive safety goals or to refine the system’s functionality (see Section 4.1 for more details). We taxonomize hazard analysis techniques as shown in Table 2 and discuss each of the techniques starting with the simplest one.

Brainstorming (see Table 2) is a technique to identify hazardous events and eventually safety goals for hardware, system, and software specific requirements [P34,P68,P96]. The idea is to first think of possible scenarios and situations that can lead to potential hazards, and then use these to identify safety goals to avoid harm in those hazardous situations.

Guide-word based brainstorming (see Table 2) is a more structured, systematic form of brainstorming. It uses guide-words to explore the hazardous situation space [P32,P34,P96,P97,P98]. Guide-words are predefined words like No and Late, which, when combined with scenarios, can be used to identify potentially hazardous situations. This structured method has been applied in general contexts [P98] as well as in automated [P34,P96] and connected driving [P32,P97] contexts.
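As an illustration, the combination of guide-words with scenarios can be sketched as a simple enumeration. This is a minimal sketch and not a method taken from the primary studies: only the guide-words No and Late come from the text above, while the other guide-words, the function names, and the scenarios are hypothetical placeholders.

```python
from itertools import product

# "No" and "Late" come from the text; the remaining guide-words, functions,
# and scenarios are hypothetical placeholders for illustration.
GUIDE_WORDS = ["No", "Late", "Early", "More", "Less"]
FUNCTIONS = ["braking request", "steering torque"]
SCENARIOS = ["highway cruising", "urban intersection"]


def candidate_deviations(guide_words, functions, scenarios):
    """Enumerate every guide-word x function x scenario prompt for review."""
    return [
        f"{word} {function} during {scenario}"
        for word, function, scenario in product(guide_words, functions, scenarios)
    ]


prompts = candidate_deviations(GUIDE_WORDS, FUNCTIONS, SCENARIOS)
for prompt in prompts[:3]:
    print(prompt)
```

Each generated prompt is only a candidate for discussion; deciding whether it describes a hazardous situation remains an expert judgment.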

HAZard and OPerability analysis or HAZOP (see Table 2) is the technique from which the usage of guide words stems. In its original form, HAZOP is a structured and systematic method with a specific documentation style. HAZOP also recommends a specific team composition (to conduct the analysis) with certain roles inside the team [31]. HAZOP is reportedly used in its traditional form at the hardware level [P15,P27].

Three extensions to HAZOP (see Table 2) were proposed in the context of automated driving [P5,P34,P42]. Martin et al. [P42] showed that HAZOP, based on limitations of sensors like cameras and LiDARs, can be used to identify hazardous situations related to the safety of intended functionality. Another extension of HAZOP uses a skill graph as a functional model of the system with scenarios for automated driving to find hazardous events [P5]. Kramer et al. [P34] use a HAZOP-inspired brainstorming approach with guide-words for identifying hazards in the context of automated driving. They also use it in the context of safety of intended functionality, focusing on environment triggers that can cause hazards.

Failure Mode and Effect Analysis or FMEA (see Table 2) is another traditional and systematic technique [30]. FMEA is a bottom-up method, starting from the failure of one or more components and then identifying the potentially hazardous situations it can lead to. FMEA is reported to be used for hazard analysis at system, software, and hardware levels [P68,P76,P84,P85].

Undesired combination state templates (see Table 2) is a technique proposed by Aceituna et al. [P1] for identifying hazards which can be caused by a combination of system components’ (and environmental) states. They argue that traditional hazard analysis techniques like FMEA focus on hazards relating to the state of (failure of) one component or event. The proposed method identifies hazards relating to multiple (failure) events while reducing the number of state permutations needed in combinatorial approaches. The technique uses templates to identify the combination of events/states (rather than a single event/state in FMEA), which can lead to a potential hazard. Thus, it forms a complementary method to the traditional hazard analysis techniques.

Iterative hazard analysis and function refinement (see Table 2) is a technique proposed by Warg et al. [P77] that can help reach completeness of safety goals for the completely automated driving functionality. They argue that in completely automated driving settings, the traditional methods used for hazard analysis are insufficient to ensure safety goals’ completeness. They suggest that hazard analysis should have a broader scope to ensure that a vehicle’s function fulfills its specifications for completely automated operation. Their technique uses an iterative procedure using trees of hazards and operational situations that can help reach a more complete and minimal set of safety goals than other techniques. They use hazard analysis as an aid rather than an afterthought when defining the scope of vehicular functions. In contrast, other techniques define the functions before hazard analysis.

Shared and multi-level hazard analysis (see Table 2) is a technique proposed based on the idea of shared responsibility in automated driving. The hazard analysis techniques discussed above hold only the vehicle (and its driver) responsible for the potential hazards and associated safety goals. Monkhouse et al. [P47] suggest that automated driving comes with shared responsibility between multiple traffic participants to avoid hazardous situations. The study recommends performing hazard analysis considering the division of responsibilities, and handling hazard analysis at various levels, for instance, one level for each participant.

Hazard analysis using vehicle level simulator and item functional model (see Table 2) is proposed to increase the accuracy and reliability of hazard analysis techniques. Most methods mentioned above (except FMEA) do not use a detailed system model for hazard analysis. Sini et al. [P62] and Tao et al. [P68] use simulators to model the system and its environment (item functional model) to aid hazard analysis.

Hazard analysis techniques: There are 12 distinct techniques for hazard analysis and similar steps discussed across 17 primary studies. Brainstorming forms the basis for a majority of the techniques in Table 2. Some techniques in Table 2 are proposed by the primary studies, while the rest are existing techniques used in the automotive context. Most of the techniques are reported in the context of the safety requirement elicitation processes of industry standards ISO 26262 and ISO 21448.

5.1.2. Risk Assessment

Risk assessment intends to allocate scores for safety goals such that the score indicates the urgency and reliability level needed to address the safety goal. Table 3 shows a taxonomy of techniques for risk assessment derived from the primary studies. Unlike hazard analysis techniques, the risk assessment techniques have different scopes of application, as listed in the second column of Table 3. We elaborate on each group of techniques based on their scope of application.

Table 3

Risk assessment techniques from primary studies. Except for the quantitative risk norm (fourth row in the third column), the rest of the methods are used in general context. The quantitative risk norm is proposed and used specifically to replace the ASIL levels in the context of automated and connected driving.

Identification of ASILs (see Table 3) is the problem of identifying the risk level of a safety goal. ASIL is the risk scoring scheme from the ISO 26262 standard (see Section 4.1 under ① for more details). The standard also provides a metric that combines the three individual ratings (severity, exposure, and controllability) to identify the ASIL.

These metrics are used by a majority of the studies [P6,P15,P27,P30,P32,P35,P36,P38,P39,P40,P41,P60,P62,P63,P68,P69,P72,P76,P82,P85,P94,P96] (first row of Table 3). These studies use the metrics with the three assumed individual ratings derived from operational scenarios.
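To make the combination of the three ratings concrete, the following minimal sketch maps severity (S), exposure (E), and controllability (C) classes to an ASIL. It assumes the widely used class-sum shorthand for the ISO 26262 determination table rather than the table itself, which remains the normative source; the function name `determine_asil` is hypothetical.

```python
def determine_asil(severity: int, exposure: int, controllability: int) -> str:
    """Return the ASIL for S (0-3), E (0-4), and C (0-3) class indices.

    Assumption: uses the class-sum shorthand instead of the normative table.
    """
    if not (0 <= severity <= 3 and 0 <= exposure <= 4 and 0 <= controllability <= 3):
        raise ValueError("expected S0-S3, E0-E4, C0-C3")
    # Class 0 in any dimension means no ASIL is assigned (QM).
    if 0 in (severity, exposure, controllability):
        return "QM"
    total = severity + exposure + controllability
    # Sums of 7 to 10 map to ASIL A to D; anything lower is QM.
    return {7: "A", 8: "B", 9: "C", 10: "D"}.get(total, "QM")


print(determine_asil(3, 4, 3))
print(determine_asil(2, 3, 2))
```

The sketch only automates the lookup; the subjectivity discussed in the primary studies lies in choosing the three input classes.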

All the studies use the metrics for their case studies except Khastgir et al. [P30]. They explore how to improve inter-rater reliability of risk assessment while using the metrics in ISO 26262. They look at the two following settings: one, a rerun of the same techniques by different teams using the same data; two, with no restrictions on techniques and data (i.e. not necessarily the same methods and data) by different teams but with the same analysis scope and objectives [32]. They propose a rule set for risk assessment to improve the reliability of risk assessment (the severity and controllability ratings) in the automotive context. They conclude that subjective interpretation and resulting unreliability and variation could be reduced using an exhaustive and explanatory rule-set [P30].

Augmenting the ISO 26262 metrics with plant8 and fault modeling (second row of Table 3) is another method to reduce subjectivity of the ISO 26262 ASIL determination. The method stems from control theory and is proposed by Zhang et al. [P84]. They propose the use of mathematical modeling of the system and faults. Such modeling enables quantitative analysis via simulation. This can provide sound reasoning for exposure and controllability ratings, and evidence for the ratings can be provided by fault simulations.

Identifying severity ratings using simulation (third row of Table 3) refers to the usage of formal models and physics-based simulations for risk assessment. This contrasts with the abovementioned studies, where the severity level is computed based on various assumptions. Duracz et al. [P17] propose a method that computes severity levels for specific operational scenarios with accurate bounds on all the modeled parameters, such as the pre-collision and post-collision velocities, which contribute to a hazardous event.

Quantitative Risk Norm or QRN to replace ASIL ratings for automated and connected driving (fourth row of Table 3) is a risk assessment scoring system proposed by Warg et al. [P78]. They suggest that the automated driving systems’ safe behavior results from a combination of tactical and operational decisions. Therefore, safety guarantees can be achieved by adjusting the proactive decision-making, in addition to addressing random and systematic failures. These two kinds of failures are independent of the traffic situations and are the only kinds of failures the ASIL ratings take into account [P78]. QRN is proposed to substitute the fixed risk assessment criteria of ISO 26262. Warg et al. [P78] suggest defining what is regarded as ‘sufficiently safe’ at design time. This definition is then used to identify discrete risk levels called consequence classes. Each consequence class receives a total norm frequency stating how often, at most, this kind of consequence is allowed to occur. The consequence classes together with their norm frequencies constitute the QRN. The QRN can be used to classify incidents into a set of incident types, and each safety goal is then associated with one incident type. Warg et al. [P78] demonstrate the applicability of QRN for automated driving, while Bergenhem et al. [P9] show its applicability for connected driving.
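The budgeting idea behind consequence classes and norm frequencies can be sketched as follows. This is a simplified illustration under our own assumptions, not the procedure from [P78]: every class name, incident type, and frequency value is hypothetical, and each incident type is assumed to contribute to exactly one consequence class.

```python
# All class names, incident types, and frequencies (events per operating hour)
# are hypothetical; they are not taken from [P78].
NORM_FREQUENCY = {"light_damage": 1e-5, "severe_injury": 1e-7, "fatality": 1e-9}

# Each incident type is assigned a consequence class and a frequency budget.
INCIDENT_TYPES = {
    "low_speed_rear_end": ("light_damage", 6e-6),
    "side_swipe": ("light_damage", 3e-6),
    "pedestrian_collision": ("severe_injury", 8e-8),
    "high_speed_frontal": ("fatality", 5e-10),
}


def budget_per_class(incident_types):
    """Sum the frequency budgets of all incident types per consequence class."""
    totals = {}
    for consequence_class, budget in incident_types.values():
        totals[consequence_class] = totals.get(consequence_class, 0.0) + budget
    return totals


def violated_classes(norm_frequency, incident_types):
    """List the consequence classes whose summed budget exceeds the norm."""
    totals = budget_per_class(incident_types)
    return [c for c, total in totals.items() if total > norm_frequency[c]]


print(violated_classes(NORM_FREQUENCY, INCIDENT_TYPES))
```

An empty result means that the incident-type budgets stay within the norm frequency of every consequence class in this hypothetical setup.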

ASIL decomposition (see Table 3) is the process of decomposing a higher ASIL rating of a functional component by implementing the functional component with redundancy. Here, each redundant component will have a lower ASIL rating than the original functional component, while combined, the functionality they achieve will still have a higher ASIL rating. The idea is that redundant components, each with an individually higher probability of failure, have a lower combined failure rate because all of them must fail for the function to fail. Readers can refer to Park et al. [P51] (first row in Table 3) for an example implementation of the guideline.
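The decomposition idea can be illustrated with a small lookup, shown here as a minimal sketch. The pairs below follow the decomposition options as they are commonly tabulated for ISO 26262, and the standard itself should be consulted for the normative scheme; the function names are hypothetical, and the probability helper assumes that the redundant elements fail independently.

```python
# Decomposition options as commonly tabulated for ISO 26262.
# QM means "quality managed", i.e. no ASIL-specific requirements.
DECOMPOSITIONS = {
    "D": [("C", "A"), ("B", "B"), ("D", "QM")],
    "C": [("B", "A"), ("C", "QM")],
    "B": [("A", "A"), ("B", "QM")],
    "A": [("A", "QM")],
}


def is_valid_decomposition(original: str, first: str, second: str) -> bool:
    """Check whether two redundant elements may jointly replace `original`."""
    options = DECOMPOSITIONS.get(original, [])
    return (first, second) in options or (second, first) in options


def joint_failure_probability(p_first: float, p_second: float) -> float:
    """Probability that both redundant elements fail, assuming independence."""
    return p_first * p_second


print(is_valid_decomposition("D", "B", "B"))
print(is_valid_decomposition("D", "A", "A"))
print(joint_failure_probability(1e-3, 1e-3))
```

The independence assumption in `joint_failure_probability` is exactly what can break in practice when the redundant elements share a root cause.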

Merging ASIL allocation method from ISO 26262 with ASIL decomposition (second row of Table 3) is proposed by Lidström et al. [P37]. They argue that the allocation and decomposition of ASILs should not be separate as specified in ISO 26262 and instead should be merged into one. They point out that separating the two processes applies only to systems with a specific kind of redundancy where two or more safety mechanisms check some state and take action sequentially. Such separation is not applicable in cases of redundancy where only one of the redundant components is operating and the other is on standby. They propose a method to combine the allocation and decomposition of ASILs. They apply it in the context of an actuation system for automated driving.

Dependent Failure Analysis or DFA (see Table 3) is a part of ASIL decomposition, where a higher ASIL rating is split into lower ASILs among a set of components. DFA [33] is performed to ensure that a single root cause cannot lead to the failure of all the lower ASIL components, that is, that splitting the ASIL rating across multiple components does not introduce dependent failures among them. Young et al. [P83] propose a new scoring system for root causes of dependent failures. This scoring system can be used to compare failures’ root causes, which can aid ASIL decomposition. They claim that their scoring system is more exhaustive and compelling than that of the predecessor of the ISO 26262 standard, IEC 61508.

The optimal ASIL allocation problem (see Table 3) arises when there are multiple ways to allocate ASIL ratings to individual components which together perform a function (forming a functional component). According to Azevedo et al., the ASIL allocation problem is a complex optimization problem with a vast search space of possible ASIL allocations to individual components [P3]. They present a genetic algorithm and a tabu search algorithm with cost heuristics to find a strategy that minimizes the total cost of development and production while meeting the desired ASIL with the least effort [P3]. Following Azevedo et al.’s work, Sorokos et al. [P102] also showed the application of tabu search to ASIL allocation in the context of a braking system. Another work in the direction of optimal ASIL allocation uses the penguin search algorithm, by Gheraibia et al. [P93]. They claim that the penguin search algorithm can produce optimal or near-optimal results with the least computation time and resources. Detailing these search algorithms is beyond the scope of this article.
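While the search algorithms themselves are out of scope, the shape of the optimization problem can be shown with a small exhaustive search. This is a sketch under stated assumptions and not any of the cited algorithms: it enumerates all allocations instead of using genetic, tabu, or penguin search, it assumes that the ASIL values (QM=0 to D=4) of the basic events in each minimal cut set must add up to at least the required ASIL, and the cost values and component names are hypothetical.

```python
from itertools import product

ASIL_VALUE = {"QM": 0, "A": 1, "B": 2, "C": 3, "D": 4}
# Hypothetical relative development cost per ASIL level.
COST = {"QM": 0, "A": 10, "B": 20, "C": 40, "D": 50}


def satisfies(allocation, cut_sets, required):
    """Assumed rule: each minimal cut set must jointly reach the required ASIL."""
    return all(
        sum(ASIL_VALUE[allocation[event]] for event in cut_set) >= ASIL_VALUE[required]
        for cut_set in cut_sets
    )


def cheapest_allocation(components, cut_sets, required):
    """Exhaustively search all allocations and return the cheapest valid one."""
    best, best_cost = None, float("inf")
    for levels in product(ASIL_VALUE, repeat=len(components)):
        allocation = dict(zip(components, levels))
        if not satisfies(allocation, cut_sets, required):
            continue
        cost = sum(COST[level] for level in levels)
        if cost < best_cost:
            best, best_cost = allocation, cost
    return best, best_cost


components = ["sensor_a", "sensor_b", "controller"]
cut_sets = [{"sensor_a", "sensor_b"}, {"controller"}]
allocation, cost = cheapest_allocation(components, cut_sets, "D")
print(allocation, cost)
```

With five levels per component, the number of candidate allocations grows as 5 to the power of the number of components, which is why exhaustive enumeration stops being practical for realistic systems.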

Reducing the ASIL allocation search space (see Table 3) is important since, in any practical case, the search for an optimal ASIL allocation that meets the safety and cost requirements faces a huge solution space due to the combinatorial nature of the problem [P23]. Searching through this space may become impracticable in large and complex systems [P23]. Therefore, Gheraibia et al. [P23] propose two solutions that can be performed sequentially to reduce the solution space: (1) Cut-set based reduction, where cut sets9 with different orders are formed as trees with their roots and nodes as the cut sets and leaves as basic events. The idea is to limit the possible ASIL range for a basic event (leaf node) and thus reduce the search space by reducing the possible ASIL allocations; (2) Heuristic cost-based reduction10, which reduces the solution space by grouping the ASIL allocations into equivalence classes and creating a priority list of the allocations in each equivalence class. Gheraibia et al. [P24] further build on their earlier work [P23] and add an ant colony optimization algorithm for further solution space reduction.

Note that techniques like FMEA and its augmentation [P11] are used to assess whether a component or system adheres to a specific ASIL level (ASIL evaluation). This is out of our scope since it does not belong to eliciting safety requirements but instead assesses the fulfillment of safety requirements.

Risk assessment techniques: There are 13 distinct techniques for risk assessment and similar steps discussed across 34 primary studies. These are techniques to (1) identify/allocate risk scores to safety goals; (2) decompose risk scores to multiple components; (3) identify optimal allocation of risk scores; or (4) reduce risk score allocation space. Most studies discuss or use techniques belonging to the first category. The only process specified for risk assessment in all the primary studies is ISO 26262’s process.

5.1.3. Safety Analysis

Safety analysis is the process of deriving functional safety requirements from safety goals and allocating them to the individual components, system, software, or hardware architecture11. A taxonomy of techniques used for safety analysis in the primary studies is presented in Table 4. Now we summarize each method in the order of predominance of usage and simplicity.

Table 4

Safety analysis techniques from primary studies. All the above techniques are reported in the corresponding primary studies with the scope of identifying/deriving safety requirements from safety goals using architecture of the system or the specific component.

Fault Tree Analysis or FTA (see Table 4) is a top-down, tree-based safety analysis technique that starts from a safety goal and leads to safety requirements and their allocation to architecture components [29]. The safety goal (or a potential hazard) is taken as the root node or top event of a tree made of logic gates as intermediate nodes. The safety requirements form the leaf nodes of the tree. The tree is constructed top-down from the root to the leaves. The fault tree can be qualitative (without any labels on edges connecting the nodes) or quantitative (with the edges labelled with failure probabilities). FTA has been applied for all levels and all usage contexts that we considered in this article [P10,P14,P15,P27,P32,P35,P36,P38,P45,P48,P58,P60,P82,P86].
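A quantitative fault tree can be evaluated with a short recursive function, sketched below. The sketch assumes independent basic events and only AND and OR gates, attaches the probabilities to the leaf nodes for simplicity, and uses a hypothetical tree and hypothetical probability values.

```python
from math import prod


def gate_probability(node):
    """Failure probability of a node, assuming independent basic events."""
    if "probability" in node:  # leaf node (basic event)
        return node["probability"]
    child_probs = [gate_probability(child) for child in node["children"]]
    if node["gate"] == "AND":  # fails only if every child fails
        return prod(child_probs)
    if node["gate"] == "OR":  # fails if at least one child fails
        return 1.0 - prod(1.0 - p for p in child_probs)
    raise ValueError(f"unknown gate: {node['gate']}")


# Hypothetical tree: the top event occurs if the controller fails OR both
# redundant wheel-speed sensors fail.
tree = {
    "gate": "OR",
    "children": [
        {"probability": 1e-4},
        {"gate": "AND", "children": [{"probability": 1e-2}, {"probability": 1e-2}]},
    ],
}
top_event = gate_probability(tree)
print(f"{top_event:.3e}")
```

Removing the probabilities and keeping only the gate structure yields the qualitative form of the same tree.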

FTA with fault classification (see Table 4) is proposed by Dajsuren et al. [P13] for identifying the relative contributions of different groups of faults (fault classes) to the safety goals. They use fault classification, a key-value structure indicating the frequency of different faults, rather than failure probabilities to label the fault tree, starting with the leaf nodes. They use this method to identify the percentage of total potential failures caused by vehicle-to-vehicle communication faults in connected driving.
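The key-value structure and the resulting percentages can be illustrated as follows. This is a minimal sketch of the final aggregation step only, not the labelling procedure from [P13]; the fault classes and counts are hypothetical.

```python
def class_contributions(fault_counts):
    """Map each fault class to its percentage share of all counted faults."""
    total = sum(fault_counts.values())
    if total == 0:
        raise ValueError("no faults recorded")
    return {name: 100.0 * count / total for name, count in fault_counts.items()}


# Hypothetical counts of leaf-level faults per fault class.
fault_counts = {"v2v_communication": 12, "sensor": 18, "actuator": 10}
shares = class_contributions(fault_counts)
print(shares["v2v_communication"])
```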

Dynamic Fault Trees or DFTs (see Table 4) are proposed by Ghadhab et al. [P92] to augment fault trees for faithful representation of the vehicle system model. They suggest that the traditional fault trees are not sufficiently expressive for this purpose. DFTs extend fault trees with the following four specific gates: sequence-enforcers, for restricting the sequence between children of a node; priority-and, for indicating priority between children of a node; spare-gates, for supporting reduced or zero failure rate; and functional dependencies, for modeling feedback loops and triggers. They show the applicability of DFTs in the case study of a vehicle guidance system.

Environment Fault Tree or EFT (see Table 4), proposed by Kramer et al. [P34], extends fault trees to specify environmental conditions using special gates. In EFT, environmental conditions are modeled as leaf nodes that trigger higher-level faults. EFTs classify the causes for deviation from correct behavior into (random) hardware faults, (systemic) design faults in hardware or software, and (systemic) specification faults either due to incorrect assumptions or lacking a structural approach. This method is specified in the context of safety of intended functionality in automated driving.

Common Cause Fault analysis or CCF (see Table 4) is a safety analysis technique that uses fault trees to identify faults caused by the same set of causes or conditions. Such common cause faults can be fatal in case of redundancy, especially in cases where a higher risk level is addressed by using redundant components rated with lower risk levels. Here a common cause fault can lead to the failure of all redundant units at once, potentially leading to a sub-system or system-wide failure. Frigerio et al. [P18] suggest that such faults should be avoided across individual components contributing to redundancy. An application of CCF is presented by Huang et al. [P27] in the context of the steer-by-wire system’s hardware.

Failure Mode and Effect Analysis or FMEA (see Table 4), in contrast to FTA, is a bottom-up safety analysis technique [30]. The analysis starts with the possible malfunction or failure of individual components. It then works upward to identify the effects of each failure in the system and which safety goals the failure violates. The potential failure modes are typically derived from experience with similar products and processes. In our primary studies, FMEA is reported in the contexts of the hardware part of steer-by-wire system [P27], brake-by-wire system [P35], and part of ignition system [P53]. Further optimizations for streamlining FMEA are proposed in [P48].

Aging FMEA (see Table 4) tailors FMEA to focus on aging effects in automotive circuits. Aging FMEA, proposed by Scharfenberg et al. [P57], analyzes the change in electrical properties due to aging and identifies aging-dependent critical hardware components that can lead to a potential hazard or safety goal violation. The method is an adaptation of FMEA assisted with simulation. They show the feasibility of the method using a fuel injection system case study.

Failure Mode Effects and Diagnostic Analysis or FMEDA (see Table 4) builds on FMEA [34] by adding three aspects to each failure mode that affects safety goals: (1) failure rate or the rate at which the component experiences faults; (2) whether there is a safety mechanism to detect the failure mode or probability to detect internal failures; and (3) the effectiveness of the safety mechanism at detecting faults. The end product of this analysis consists of the hardware parts associated with each failure or safety goal and different hardware metrics that show the level of safety readiness. In primary studies, FMEDA has been applied in the contexts of hardware of anti-lock braking system [P45,P50], system-on-chip [P12], FPGA [P28], and powertrain electronics [P95].
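The three aspects can be combined into a hardware metric, as in the following minimal sketch. It computes a simplified single-point fault metric in the spirit of the ISO 26262 hardware architectural metrics, under the assumption that the undetected share of every safety-goal-violating failure mode counts as residual; the class name, field names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    failure_rate_fit: float      # aspect (1): failures per 1e9 hours
    violates_safety_goal: bool   # whether the mode can violate a safety goal
    diagnostic_coverage: float   # aspects (2) and (3): 0.0 if no mechanism


def residual_rate(mode: FailureMode) -> float:
    """Failure rate that escapes the safety mechanism (simplified)."""
    if not mode.violates_safety_goal:
        return 0.0
    return mode.failure_rate_fit * (1.0 - mode.diagnostic_coverage)


def single_point_fault_metric(modes) -> float:
    """1 minus the share of the total failure rate that remains uncovered."""
    total = sum(m.failure_rate_fit for m in modes)
    residual = sum(residual_rate(m) for m in modes)
    return 1.0 - residual / total


# Hypothetical worksheet rows.
modes = [
    FailureMode("adc_stuck", 100.0, True, 0.99),
    FailureMode("ram_bit_flip", 50.0, True, 0.90),
    FailureMode("led_open", 50.0, False, 0.0),
]
spfm = single_point_fault_metric(modes)
print(f"{spfm:.2%}")
```

A real FMEDA distinguishes more fault categories, such as latent faults, which this sketch leaves out.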

FMEDA augmented with simulation (see Table 4) is presented by Sini et al. [P61] to help designers, especially in cases where the system’s behavior is highly coupled with the vehicle’s behavior. They simulate the system/component and generate possible failures or misbehaviors using fault injection and then propagate these misbehaviors to the vehicle level using a vehicle dynamics simulator. They use it for failure effect classification by taking the predicted effects on dynamics and drivability of the vehicle.

Component Integrated Component Fault Trees or C2FTs (see Table 4) are proposed as a combination of FTA and FMEA [35]. The resulting tree structure’s root nodes represent safety goal violations or hazards of a system, leaf nodes represent basic failure modes, and the intermediate nodes present the relation between failure modes and hazards with Boolean gates. Domis et al. [P16] use C2FT in the context of product lines.

Dependent Failure Analysis or DFA (see Table 4) focuses on safety goal violations due to possible common causes and cascading failures between elements [33]. DFA aims to identify the common causes that can violate the required independence or freedom from interference between elements and, in turn, cause a safety goal violation. Nardi et al. [P50] mention the usage of this method in the case study of an anti-lock braking system.

Model based safety analysis or MBSA (see Table 4) is an umbrella term that is used to denote any safety analysis that uses a (formal) system model created ideally using a model-based development process, extended with a fault model [36]. Using a system model in the safety analysis can minimize subjectivity and be more complete, consistent, and error-free than using an informal system model or no model. The underlying safety analysis technique(s) can be any of the above-discussed techniques. MBSA, in conjunction with FTA and FMEA, is reported in primary studies [P48,P53,P84]. In addition, a study by Tlig et al. [P70] extends MBSA with simulation for safety analysis of automated driving systems.

Safety analysis techniques: There are 13 distinct techniques for safety analysis and similar steps discussed across 28 primary studies. These techniques build on two base techniques: (1) the top-down Fault-Tree Analysis or FTA; and (2) the bottom-up Failure Mode and Effect Analysis or FMEA. FMEA and its extensions are predominantly reported in hardware-specific contexts, while FTA and its extensions dominate usage in systems and software. While ISO 26262’s process dominates the underlying process for which these techniques are used, a good number of primary studies (5 out of 28) do not specify any process.

5.2. Analysis

We analyze the techniques presented in the prior section regarding their scope and context of usage. Based on this information and the comparison presented in the preceding section, we present research gaps and upcoming domain trends.

Each step in safety requirement elicitation can be conducted using multiple kinds of techniques. Especially for beginners, it is important to understand which techniques can be employed in a given context; for educators, which techniques to teach. Our study can aid in these directions. We find that no one technique fits all levels of an application or all use-cases for any step in the safety requirement elicitation. For any real-life use case, we believe it is best to use a combination of techniques to identify safety requirements. Each technique has its strengths and drawbacks. For instance, FTA can easily be applied for the safety analysis at system, software, and hardware levels; however, it might not be able to find dependent failures. To the best of our knowledge, no cheat sheet lists the strengths and drawbacks of individual techniques. Our taxonomy is a mere first step in this direction. However, an in-depth comparison of the techniques for their use in the automotive context is out of the scope of this study and is an essential future research direction.

No silver bullets: No one technique fits all application contexts; it is best to use a combination of techniques to identify safety requirements.

The repeatability of safety requirement elicitation techniques is a key factor in ensuring safety from a safety engineering and requirements engineering perspective. For instance, one element in the safety certification in almost all domains is based on an assessment by an independent team by repeating or assessing the safety cases. Such activities require objective safety requirement elicitation. The primary studies show that most methods rely on expert knowledge, rendering them subjective. Surprisingly there is little effort to quantify the subjectivity. We found only one study on the subjectivity of a technique, which focused on a specific risk-assessment technique [P30]. Even though methods like model-based safety analysis are proposed to increase reliability and repeatability of the safety analysis, (1) their adoption rate is low (only one of the primary studies uses model-based safety analysis in their case study), and (2) there are fewer such methods for hazard analysis and risk assessment.

Repeatability: Most methods rely on expert knowledge rendering them subjective and thus hampering repeatability. We emphasize the need for standardized, tool-supported techniques to reduce subjectivity.

Informed by our taxonomy of techniques presented in the prior section, we foresee four aspects that need further exploration: (1) comparison among techniques; (2) completeness and coverage of safety requirements; (3) lack of techniques to support steps of newer processes; and (4) whether the current techniques are a match for the new application contexts arising along with the automated and connected driving. We elaborate on each of these aspects below.

Comparison among techniques is necessary to identify the most suited techniques for use by both practitioners and researchers. There is little empirical evidence on which method to choose among the available options. Current studies only compare the techniques FTA, FMEA, and STPA, where STPA is considered as a technique rather than a process [P33,P89]. Existing studies argue that STPA is better than FTA and FMEA. However, no studies specify the scope of the techniques which could allow practitioners to make an informed decision on which technique to choose for a specific use case. Also, most of the studies that compare the methods are on toy case studies, which might not represent methods’ real-world efficacy.

Comparison: There is a lack of studies that systematically compare the requirement elicitation techniques.

Ensuring the completeness and coverage of safety requirements is essential for the rest of the development process. Failing to ensure this can lead to high costs, delay product development, and cause potentially catastrophic consequences during operation [37,38]. However, we did not find any studies that look in this direction for any of the specified techniques.

Completeness & coverage: Completeness and coverage of safety requirements, especially in the context of automated and connected driving, are seldom explored.

Any process for safety requirements is ineffective unless there are systematic techniques to support or perform the individual steps in the process. Even though the steps in newer processes like STPA are similar to the steps from more established processes like the one from ISO 26262, the way to conduct them and their intended outcomes are either different or have different scopes. Thus, the techniques used to conduct the steps in traditional processes do not apply to the newer processes, and we did not find any other systematic techniques in the primary studies to conduct the individual steps. For example, no systematic method to perform the individual steps of STPA is used in any of the primary studies that apply STPA [P27,P40,P42,P87,P91,P98]. Rather, these studies use informal guidance and previous examples to conduct STPA. The only exception is a technique for hazard analysis, namely iterative hazard analysis (see Table 2), which is specified to conduct a step similar to hazard analysis in a new process for automated driving (⑤ in Fig. 4).

New processes versus old techniques: The newer processes, especially based on fundamentally different approaches, lack systematic techniques to support their steps.

One primary enabler for current innovations in automated and connected driving is the use of ML-based sub-systems, especially for perception and planning [19]. Most of the techniques mentioned in this section were developed in and for an era prior to the development of these technologies. The special-purpose hardware like Neural Processing Units uses a different style of instruction execution than traditional processing units [39]. The applicability and adequacy of current techniques, which might be built on the assumptions for traditional general-purpose hardware, need further studies. It is also yet to be seen whether the current techniques can be applied in the context of safety requirements for developing neural networks, which are increasingly used in perception and planning subsystems of automated driving stacks.

ML-based systems: The adequacy of current techniques to elicit requirements for special purpose hardware and ML-based software is yet to be seen.

6. IMPLICATIONS

We presented a taxonomy and comparison between different processes. One use-case of this work is as a cheat sheet or guidebook for practitioners and educators. This work outlines what exists in the peer-reviewed literature in almost every technical dimension of automotive safety requirement elicitation.

This study has broad implications from research to practice. We present the implications in the rest of this section in the following five parts:

  • current state of safety requirement elicitation;

  • a trend of creating islands of knowledge;

  • the changing landscape of the automotive domain;

  • the lessons that can be learned and reused beyond automotive software engineering; and

  • education

6.1. Current State

We discuss two aspects of the current state of safety requirement elicitation research: maturity of the research field and transparency & replicability of the primary studies.

Maturity: The safety requirement elicitation for older technologies has matured while the newer concepts need further research. Four ways to empirically measure the maturity of a research area are (1) author divergence [40], where a diverse set of authors indicates a mature research field; (2) prevalence of case studies [41], where the bulk of case studies point to maturity; (3) relation between academic work and what is applied in the field [40,42], where more evidence of industry/practitioner participation forms an indicator of maturity; and (4) convergence of best practices [42], where a majority of studies showing adoption of a set of similar practices indicates maturity. Based on these four parameters, we classify the safety requirement elicitation into two contexts: for traditional components and the newer age concepts and components.

We define traditional components as the components in production for more than a decade. Examples are the steering system and most items belonging to vehicle-centric functional components as shown in Fig. 6. We define newer age concepts as those that have not entered or are currently entering the production stage, together with the components that enable the implementation of these concepts. Example concepts are highly automated driving and example components are ML-based perception systems that enable highly automated driving. Safety requirement elicitation for the former context (traditional components) is more mature than for the latter, based on the four above-mentioned metrics. The evidence is high author divergence, a wide range of case studies, and increased industry participation (based on author affiliation). For the fourth metric, the convergence of best practices, we have two angles: processes and techniques. In the context of processes, we can see convergence to ISO 26262. In the context of techniques, we can see convergence to brainstorming and related techniques for hazard analysis, usage of metrics from ISO 26262 for risk assessment, and FTA and FMEA-related techniques for safety analysis.

Transparency & replicability: A majority of the primary studies do not specify details on techniques that the studies employ, hampering transparency and replicability. In the case of hazard analysis, many studies do not specify which technique they use for hazard analysis; instead, they directly present the results of hazard analysis. For risk assessment, the assumptions on coming up with a specific value for exposure, controllability, and severity are often not specified. For safety analysis, the intermediate results are often not presented, making it hard to understand and replicate the final result. For future studies, we recommend reporting the techniques, intermediate results from these techniques, associated assumptions, and their scope.
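As an illustration of what reporting the assumptions behind a risk assessment could look like, the following minimal Python sketch stores the severity, exposure, and controllability classes of one hazardous event together with the rationale for each choice, and derives the ASIL from them. The class name `RiskRating`, its fields, and the example hazardous event are hypothetical. The additive shortcut used in `asil` is assumed here to reproduce the ASIL determination table of ISO 26262; the table in the standard remains the authority.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskRating:
    """One hazardous event's rating together with the reasoning behind it."""

    hazardous_event: str
    severity: int          # S0..S3
    exposure: int          # E0..E4
    controllability: int   # C0..C3
    assumptions: tuple     # free-text rationale for each chosen class

    def asil(self) -> str:
        # Reject values outside the class ranges S0-S3, E0-E4, C0-C3.
        if not (0 <= self.severity <= 3
                and 0 <= self.exposure <= 4
                and 0 <= self.controllability <= 3):
            raise ValueError("S must be 0-3, E 0-4, C 0-3")
        # A class of 0 in any dimension means no ASIL is assigned (QM).
        if 0 in (self.severity, self.exposure, self.controllability):
            return "QM"
        # Assumed shortcut: S+E+C of 10 -> D, 9 -> C, 8 -> B, 7 -> A, else QM.
        total = self.severity + self.exposure + self.controllability
        return {10: "D", 9: "C", 8: "B", 7: "A"}.get(total, "QM")


# Hypothetical example: the event and the rationale texts are illustrative only.
rating = RiskRating(
    hazardous_event="unintended steering lock at highway speed",
    severity=3,
    exposure=4,
    controllability=3,
    assumptions=(
        "S3: life-threatening injuries assumed at highway speed",
        "E4: highway driving assumed to occur on most trips",
        "C3: most drivers assumed unable to avoid the harm",
    ),
)
print(rating.asil())  # D
```

Publishing such records alongside the final ASIL would let a reader see which assumption drives each class and re-run the assessment under different assumptions.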

6.2. Islands of Knowledge

From our 102 primary studies, we noticed a systematic trend of creating islands of knowledge inside companies, where the detailed knowledge on safety requirement elicitation stays inside the companies and is not available via peer-reviewed publications. We present two related aspects below.

Sharing specifications, intermediate results, and related data of research: Continuing our prior discussion on transparency & replicability, the majority of the studies, except for a few (e.g., [P32]), do not share the details of their case studies. For example, an operational design domain (ODD) definition is essential for making an informed judgment on any result of hazard analysis of a system. However, most studies do not provide details but only discuss the final result. The assumptions (e.g., excluding snowy conditions) made during the process and the intermediate steps (e.g., fault trees; hazardous events) are essential to judge the results produced in the articles and, most importantly, to build on for future studies. In their current form, these unknown details make it difficult for newer researchers and researchers from related disciplines to enter this field. The sharing of related research data should be the norm, as it is in other software engineering fields like mining software repositories.

Sharing real-case studies: Given the existence of myriad vehicle types with various capabilities, it is safe to assume that considerable safety requirement elicitation has been performed for their development. Yet, to our knowledge, no real-world, industry-scale case studies have been published on this topic in peer-reviewed literature. So far, publications in collaboration with industry either use toy case studies (e.g., [P1]) or very high-level abstractions (e.g., [P86]), hampering any real-world reproducibility. The publication of real-world case studies can help researchers and the automotive community to identify current issues facing the industry and to contribute by rectifying and proposing methods and techniques, rather than creating islands of knowledge inside companies. This can reduce the effort spent on re-inventing knowledge and can help companies obtain better talent and suggestions from academia. Additionally, with the higher reliance on software and related components for automated and connected driving, openness to independent safety verification/certification (in contrast to self-certification) should be made a norm.

6.3. Changing Landscape

Is safety requirement elicitation catching up? The last decade marks arguably the most significant paradigm shift in the automotive industry since its inception. Four dimensions of this shift are the transition from internal combustion to electrification, automated and connected driving, ongoing shift to open source software development, and start-ups entering the field and finding success bringing disruptive ideas and new business models. This means profound changes in almost all dimensions of automotive software and the electronics that run them. Whether the safety requirements elicitation is catching up is still an open question. There is relatively more research on transition to electrification (especially on battery and power-train) as evident from Fig. 6. The open-sourcing trend is still unfolding [19] and its safety requirement elicitation side might be too early to research. Automated and connected driving is achieved by combining special purpose hardware, ML-based software, and traditional software to make them work seamlessly. We discuss three dimensions of this topic further: (a) functional safety; (b) safety of intended functionality in the context of ML-based components; and (c) connected driving-specific issues. All the above aspects are particularly challenging to safety certification bodies across the globe.

Another paradigm change happening in the automotive industry is over-the-air (OTA) updates. Through OTA updates, new features are introduced and bug fixes for existing features are delivered. When safety-critical functionalities are either introduced or updated via OTA, it is unclear how to evaluate the implications for safety requirements and safety analysis. The current methodologies do not consider such updates.

ML-based components & functional safety. Both software and hardware components specific to ML (mainly neural networks) function fundamentally differently from traditional components. For example, the software components are built from data rather than human logic. The hardware components are based on multi-threaded execution, parallelism, and many levels of optimization, compared to the non-parallel execution units otherwise used. Given that ML-based components are now becoming part of safety-critical applications, we need (functional) safety requirement elicitation methods that take the peculiar nature of these components into account, which is not the case currently. This might require thinking from a different perspective altogether than the traditional one.

When we look at the case studies (on functional safety), the vast majority use traditional processes and techniques. Even though newer methods and techniques for automated and connected driving have been proposed, there is little evidence for their applicability and suitability in the real world. Also, the sufficiency of current processes and techniques to address ML-based systems is another daunting question for industries and safety certification bodies. Multiple processes and techniques might be applicable to ML-based systems. While our study presents a comparison among processes, no study offers an in-depth comparison of techniques in the context of the automotive domain. To summarize, there are many potential directions for research in the context of ML-based components, including case studies on newer processes and techniques and in-depth comparison among existing techniques for their suitability, sufficiency, and scope.

ML-based components & safety of intended functionality. Safety of intended functionality, especially in the context of removing the human operator fallback, is out of the scope of traditional processes and techniques. Even though methods like STPA have been developed to cover some aspects, their real-world adoption is low, and so far, the scope of STPA is at the system level. The standard ISO 21448 was proposed to tackle the safety of intended functionality. It is still in its beginning stage and gives only conceptual directions rather than concrete guidelines. Safety of intended functionality at both the hardware and software levels still needs processes and techniques that can be applied to special purpose hardware like neural processing units. These processes and techniques could be entirely new or extensions of current ones, or studies could show that existing techniques are applicable in such settings.

Formal modeling and verification for functional safety and safety of intended functionality. While formal modeling and verification is a mature field that can be used to demonstrate and ensure safety guarantees, only a small fraction of the primary studies [P17,P48,P53,P70,P84] even mention some form of modeling or verification being employed. From both a functional safety and a safety of intended functionality perspective, there is a research and practice gap for a more comprehensive treatment. This could be formal modeling of (parts of) the system combined with model checking and conformance testing. We believe that the limitations and feasibility of applying such formal methods to completely automated driving in real-life settings need to be explored further.
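To make the idea of formal modeling combined with model checking concrete, the following is a minimal Python sketch of explicit-state model checking of a safety invariant. It is a toy illustration and does not come from any of the primary studies: the braking controller, its states, and the names `find_violation`, `make_successors`, and `OBSTACLE_NEXT` are all hypothetical. The checker explores every reachable state breadth-first and returns a shortest counterexample trace if the invariant can be violated.

```python
from collections import deque


def find_violation(initial, successors, is_safe):
    """Breadth-first search over all states reachable from `initial`.

    Returns a shortest path to a state violating `is_safe`,
    or None if every reachable state satisfies it.
    """
    parent = {initial: None}
    queue = deque([initial])
    while queue:
        state = queue.popleft()
        if not is_safe(state):
            # Rebuild the counterexample by walking back to the initial state.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in successors(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


# Hypothetical environment model: how the obstacle status may evolve per step.
OBSTACLE_NEXT = {
    "none": ("none", "far"),
    "far": ("far", "near"),
    "near": ("near", "none"),
}


def make_successors(brake_on):
    """Controller model: brake in the next step if the obstacle is in `brake_on`."""
    def successors(state):
        _mode, obstacle = state
        next_mode = "brake" if obstacle in brake_on else "drive"
        return [(next_mode, nxt) for nxt in OBSTACLE_NEXT[obstacle]]
    return successors


def is_safe(state):
    # Safety invariant: never keep driving while an obstacle is near.
    return state != ("drive", "near")


INITIAL = ("drive", "none")

# Controller that brakes as soon as an obstacle is detected far away: safe.
print(find_violation(INITIAL, make_successors({"far", "near"}), is_safe))
# None

# Controller that reacts only to near obstacles: the invariant is violated.
print(find_violation(INITIAL, make_successors({"near"}), is_safe))
# [('drive', 'none'), ('drive', 'far'), ('drive', 'near')]
```

Exhaustive exploration of this kind is only feasible for small, abstracted models, which is one reason why the feasibility of such methods for complete automated driving stacks remains open.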

Connected driving specific issues. There can be three kinds of communication in connected driving: vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I), and vehicle-to-road users (V2R). The first two kinds are more established than the third. As we mentioned in Section 4, there is a lack of methods that integrate V2I with traditional automotive safety requirement elicitation. There are at least two challenges here: (1) safety requirement elicitation for communication intermediaries like cloud for connected driving12 and (2) requirements spanning multiple vendors (e.g., the manufacturers of smart traffic infrastructure and of vehicles might be different). Further research is needed in these two directions.

While vulnerable road users form about a third of fatalities in road accidents in the USA13, safety requirement elicitation for vehicle-to-road user communication is not mentioned in any of the primary studies. It is still an open question how to complement the information on road users (cyclists and pedestrians) beyond vehicle sensors (like camera and LiDAR) and remove their blind spots. Safety requirement elicitation regarding the same is also in its nascent stages and needs further research.

6.4. Multi-Disciplinarity: Beyond Automotive Software Engineering

This article explored the technical dimension of safety requirement elicitation. However, there are other dimensions, e.g. combining safety requirement elicitation with aspects like security, tool support, and the human and social factors of safety requirement elicitation. These directions are yet to be explored.

While this article considered a system and software view of safety requirement elicitation, the area is primarily multi-disciplinary, as shown in Fig. 3c. Also, there are similar domains like advanced robotics (e.g., robots from Boston Dynamics)14, which are multi-disciplinary, with similar challenges in perception systems and unsupervised deployment (in the future). It might be interesting to see how these domains compare and what can be learned from each discipline.

6.5. Education

The next generation of engineers who will have to build more of these new systems is being educated today. In the context of software engineering, most curricula do not focus on safety, let alone the safety of the newer types of systems. Since software systems are becoming more safety-critical, this trend needs to change, and safety has to be incorporated as a topic. We believe it is equally important to educate on the caveats and limitations of current techniques since it is crucial to understand the scope of existing processes and techniques for their correct usage in real-life. Also, in typical software engineering curricula, a systems view is seldom included but is crucial in the context of safety requirements. We strongly encourage including this in the curricula for educating future software engineers.

From a practitioner’s perspective, it might be difficult to have a definitive idea on which processes and techniques to use in different contexts, especially for ML-based systems. While this article can be used as a cheat-sheet, there is still room for an in-depth comparison of the techniques. Some directions are: (a) the scope and possible application contexts of each technique; and (b) objective criteria on when to and when not to choose a technique.

7. THREATS TO VALIDITY

This section presents the threats inherent in our study despite our best attempts at mitigating them during design, implementation and interpretation. Below we use the framework by Cook and Campbell [43] to discuss the threats to validity.

Construct validity: For a systematic literature review to be useful, the articles analyzed should be representative, presenting an overview. To identify representative scientific articles, we employed multiple methods, including:

  1. the PICO method (see Section 2.2) to create the search string;

  2. both forward and backward snowballing on the primary studies to improve the coverage of relevant research articles and to make the process as insensitive to the choice of the search string and databases as possible;

  3. documentation of our search process, inclusion-exclusion criteria and information extracted from the selected articles for transparency and replicability of our research; and

  4. independently assessing the intermediate steps as authors and resolving disagreements with discussions.

Still, we might have missed relevant articles, for instance, non-peer-reviewed articles.

Conclusion validity: In comprehending the scientific literature for presenting insights, we foresee two threats: one, characteristics of the available literature, and two, researcher bias in interpretation. For the first part, we have limited control over the availability of scientific literature. For instance, we observed that case studies on real-life vehicles could have added deeper insights for investigation. Unfortunately, such studies are limited. To mitigate researcher bias in interpretation, we followed the definitions from the literature and did not make assumptions or choices if the text was not clear. Also, we explicitly defined the criteria we followed to ensure objectivity.

Internal validity: As indicated above, our findings are susceptible to publication bias since we only analyze published scientific literature. Our choices ensure that our analysis rests on systematic scientific findings but implies that non-peer-reviewed articles (e.g., negative results) are systematically removed. It is possible that a study including other sources of information can present a different picture than the one depicted here.

External validity: The representativeness of our study hinges on the domains we considered for our analysis and the representativeness of the studies within these domains. We search for relevant articles on a broader range: using multiple databases and applying techniques such as snowballing. But our investigation is also limited by our choices to study articles written in English, non-grey literature, and published in reputed venues. All of the above factors influence our findings’ generalizability and define the scope within which our results apply.

8. RELATED WORK

We classify the secondary studies that cover safety requirements during product development into three categories: (i) studies applicable to multiple domains; (ii) studies on safety assurance; and (iii) studies that focus on the automotive domain. Since our scope is limited to safety, we do not discuss secondary studies on the intersection of safety and security. This section discusses relevant secondary studies and excludes primary studies as this article is a secondary study.

8.1. Cross Domain Studies

Pereira et al. conducted an SLR on requirements to be considered and on requirement engineering practices and challenges in developing embedded systems [6]. They summarize open issues that can be potential research directions, which include: “Requirements Specification for automotive system”; “Improve the development process for ensure functional safety requirements”; “Handling of non-functional requirements such as, safety,”; “Specification of safety requirements”; and “Analysis of hazard and threats,, and safety”.

Martins et al. present an SLR on approaches to elicit, model, specify and validate safety requirements in the context of safety-critical systems and usage of these approaches in industrial settings [5]. This work forms a precursor to our study, as detailed in Section 2.2.

Vilela et al. performed an SLR to explore methods to improve the integration between RE and safety engineering [7]. They identified techniques and tools used for hazard analysis and safety analysis (to be) used by requirement engineering and safety engineering teams. They also generated a taxonomy of these techniques along with a separate taxonomy of the safety information used or generated by these techniques. Vilela et al. also performed a systematic mapping study on approaches to improve the integration of specification, analysis, and exchanging information on all artifacts involved in the requirements engineering process among stakeholders [44]. Most of the approaches found by this study are domain-independent (78%), while 5% of them are specific to the automotive domain. The study also found that over 76% of the approaches do not follow any safety standard. These studies are complementary to our focus.

Bozhinoski et al. [8] present a systematic mapping study on managing safety in mobile robotic systems from a software engineering perspective. Their primary studies on safety management in the context of self-driving cars conclude that “self-driving vehicles lack a standardized platform, processes, and tools for designing and analyzing safety approaches” [8]. However, they did not consider safety standards from the automotive domain.

Another direction pursued in literature is functional safety in product line engineering. Baumgart et al. conducted a systematic mapping study to summarize topics covered and evaluation approaches used in the literature on the functional safety of product lines [9]. Gadelha Queiroz et al. performed an SLR for identifying approaches, methods, and methodologies from the intersection of the product line engineering and model-driven engineering for safety-critical systems [45]. We consider this direction of research out of scope for this work.

8.2. On Safety Assurance

Safety assurance is about providing evidence that safety is ensured. Safety requirements are a part of safety assurance, and in this SLR, we focus on eliciting functional safety requirements. Nair et al. conducted an SLR towards establishing evidence for system safety [46]. They focused on three aspects: the information that constitutes evidence; structuring of evidence; and evidence assessment. There are also model-based methods based on goal structured notation and conceptual models for establishing evidence for system safety [47,48,49].

Bolbot et al. examined different sources of complexity introduced to cyber-physical systems and explained different methods for safety assurance in those contexts [50]. This work neither specifies the methods used in the automotive domain nor uses any systematic method to identify the different techniques in the literature for safety assurance.

Gleirscher et al. performed a rigorous literature survey on design and argument patterns for the assurance of system safety [51]. They primarily focus on reusable design patterns and categorize them into software, electrical/electronic, and mechanical hardware.

Ingibergsson et al. [52] conducted a systematic mapping study on (coding) practices in developing safety-critical software for field robots. Field robots are machinery for outdoor tasks like agriculture. They concluded that most approaches propose solutions to attain safety, focus on behavior modeling, and do not stress reliable software development e.g. involving formal verification. Half of the literature they considered uses non-standardized methods to develop software. They also note an increase in safety-related issues with the introduction of computer vision.

In this work, we concentrate on approaches for generating safety requirements rather than safety assurance. However, conclusions from these studies, for example, the nascence of technologies like computer vision that do not follow traditional development methods, apply equally in the context of our work.

8.3. On Safety in the Automotive Domain

Some secondary studies from the last decade discuss safety and are specific to the automotive domain. A few studies ([53,54]) outline safety-related methods across the entire product life cycle, while others focus on a specific stage of product development [55], a specific technology [56] or a specific use-case [57]. In addition to the secondary studies, there is a proliferation of experience reports [58,59]. We do not cover experience reports in our related work since they form a different class of studies than secondary studies.

Studies by Kannan et al. and Gosavi et al. consider the entire product life cycle [53,54]. Kannan et al. [53] present a gap analysis between the objectives of the ISO 26262 standard and safety-related techniques to achieve these objectives. They conclude that there is a need for tooling for conducting HARA, combining subphases of product development, integration among the product development phases of design, requirement management, and validation and verification. They also find a lack of methods for ASIL decomposition for large-scale systems.

Gosavi et al. [54] summarize four primary studies from the product development phase at the system, hardware, and software levels covered by parts 4, 5, and 6 of ISO 26262. They noted the lack of a standard and inadequacy of ISO 26262 to address functional safety in the development of autonomous and semi-autonomous vehicles.

Gheraibia et al. [55] reviewed different approaches for the specific sub-phase of product development, ASIL allocation. ASIL allocation methods find optimal allocations of ASILs (QM, A, B, C, D, with QM being negligible risk and D being highest risk) to system architecture components such that safety requirements are guaranteed to be met at the lowest possible cost. They classify approaches for ASIL allocation into two categories – exact and optimization approaches – and present the pros and cons of each method.
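The idea behind an exact ASIL allocation approach can be sketched with a brute-force search in Python. This is an illustrative sketch under stated assumptions, not the method of any reviewed study: it assumes the commonly used encoding QM=0, A=1, B=2, C=3, D=4 and the simplified constraint that the ASILs of the components in each cut set must sum to at least the ASIL of the hazard. The function `allocate_asils`, the component names, the cut sets, and the cost values are all hypothetical.

```python
from itertools import product

# Assumed integer encoding of the ASILs.
ASIL_VALUE = {"QM": 0, "A": 1, "B": 2, "C": 3, "D": 4}


def allocate_asils(components, cut_sets, hazard_asil, cost):
    """Exhaustively search for a cheapest allocation that satisfies all cut sets.

    Assumed constraint: for every cut set, the ASIL values of its
    components must sum to at least the value of the hazard's ASIL.
    """
    required = ASIL_VALUE[hazard_asil]
    best, best_cost = None, None
    for levels in product(ASIL_VALUE, repeat=len(components)):
        allocation = dict(zip(components, levels))
        satisfied = all(
            sum(ASIL_VALUE[allocation[c]] for c in cut_set) >= required
            for cut_set in cut_sets
        )
        if not satisfied:
            continue
        total = sum(cost[allocation[c]] for c in components)
        if best_cost is None or total < best_cost:
            best, best_cost = allocation, total
    return best, best_cost


# Hypothetical system: two redundant sensors and a single controller.
components = ["sensor_a", "sensor_b", "controller"]
cut_sets = [{"sensor_a", "sensor_b"}, {"controller"}]
# Hypothetical cost heuristic: development cost per ASIL.
cost = {"QM": 0, "A": 10, "B": 20, "C": 40, "D": 50}

print(allocate_asils(components, cut_sets, "D", cost))
# ({'sensor_a': 'B', 'sensor_b': 'B', 'controller': 'D'}, 90)
```

The search visits 5^n allocations for n components, which illustrates why exact approaches do not scale to large architectures and why optimization approaches trade the guarantee of optimality for tractability.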

Borg et al. present a short review of validation and verification (V&V) methods for the safety of the specific technology, machine learning and deep learning methods, to be used in the automotive domain [56]. They classify support for V&V of machine learning and deep learning-based systems as formal methods, control theory, probabilistic methods, process guidelines and simulated test cases. They conclude that the V&V methods for machine learning and deep learning lag compared to other areas. Also, those V&V methods suggested by ISO 26262 do not apply to developing components that use machine learning and deep learning methods. They stress the need for a new standard for them.

Axelsson [57] presented an SLR and a gap analysis on safety in the specific use-case of platooning. Platooning is a coordinated movement of vehicles as a train with minimal distance between vehicles in the pack. Specifically, the study focuses on safety analysis methods, hazards and failures, and solutions to improve safety for platooning. The author also notes a lack of studies from commercial settings and observes that current research stems primarily from research prototypes or technology demonstrators.

Even though many of these studies overlap with our work, we did not find any systematic study on eliciting safety requirements for the automotive domain.

9. CONCLUSIONS

This work characterizes safety requirement elicitation in the automotive domain and presents a comprehensive overview of the pre-GPT era landscape through a systematic literature review. This article is a first step in the direction of summarizing, taxonomizing and comparing processes and techniques for safety requirement elicitation from the automotive domain.

We identified nine safety requirement elicitation processes and 38 distinct techniques. Out of the nine processes, we observed that the process outlined in the ISO 26262 standard forms the basis for requirements elicitation in the automotive industry. Other processes have been proposed to complement, extend or replace the process outlined in the standard. This article offers an overview, comparison and taxonomy of such processes and corresponding techniques for safety requirement elicitation in the automotive domain. Based on this information, temporal adoption trend, usage context and scope, we presented research gaps and discussed current domain trends.

We empirically showed the immaturity of the safety requirement elicitation concerning newer concepts and components like automated and connected driving. For instance, the majority of processes and techniques for safety requirement elicitation were proposed more than two decades ago. At the same time, the upcoming perception and decision systems for automated driving rely on ML-based components whose feasibility of usage in real-world scenarios (e.g., neural networks) was demonstrated only a decade ago. The applicability and scope of the processes and techniques to these new-age components is still an open question. Another example is the lack of studies on safety requirements regarding traffic infrastructure and communication with vulnerable road users, which are critical to accomplishing connected driving. Other dimensions, e.g. combining safety requirement elicitation with aspects like security, tool support, and the human and social factors of safety requirement elicitation, are yet to be explored.

We emphasize the importance of safety requirement elicitation of software and related systems and show the importance of a systems and multi-disciplinary perspective. This work opens up many future avenues for research and provides a concise and comprehensive guide to practitioners and educators.

Conflict of Interest

The authors declare that they have no conflicts of interest. Note that because the third and fourth authors of this article are Editors-in-Chief of the journal, the peer review process has been managed without any involvement from these Editors-in-Chief.

Data Availability

This article is self-contained and any further data that support the findings of this study are available on request.

Funding

This work is part of the i-CAVE research programme (14897 P14–18) funded by the NWO (Netherlands Organisation for Scientific Research).

Authors’ Contribution

S. Kochanthara contributed to study conceptualization, study design, study implementation and writing of the manuscript. L. Cleophas, Y. Dajsuren and M. van den Brand supervised the study and contributed to the writing (reviewing & editing) of the manuscript.

Footnotes

National Transportation Safety Board (NTSB) is a U.S. government agency for civil transportation accident investigation.

A Cohen’s Kappa statistic of at least 0.80 is considered the best level of agreement [22,23].

Plant (in control theory) refers to a machine or system. Plant modeling refers to specifying the system as a relation between output and input signals. Plant and fault modeling refers to modeling both the system and probable faults.

“A cut-set is a minimal combination of failures of components, which, if they occur in conjunction, lead to a hazard” and fault tree analysis is used to find cut sets [P23].

“Cost heuristics are functions that determine the cost of associating the ASILs to each component of the system” [P23].

In some processes, the intermediate safety goal step is skipped, and safety requirements are directly derived from hazardous events.

Connected driving refers to a set of vehicles and traffic infrastructure communicating various parameters to optimize the collective traffic behavior. This communication is often achieved by using intermediaries such as cloud.

REFERENCES

ISO, ISO/PAS 21448:2019, Road Vehicles — Safety of the Intended Functionality, International Organization for Standardization (ISO), Geneva, Switzerland, 2019.
K Czarnecki, Automated Driving System (ADS) High-Level Quality Requirements Analysis. Driving Behavior Comfort, Waterloo Intelligent Systems Engineering (WISE) Lab Report, University of Waterloo, Waterloo, Canada, 2018.
T Pereira, D Albuquerque, A Sousa, FM Alencar, and J Castro, Retrospective and Trends in Requirements Engineering for Embedded Systems: A Systematic Literature Review, Proceedings of the CIbSE Conference, CIbSE, #SLR, IEEE, Buenos Aires, Argentina, 2017, pp. 427-440.
B Kitchenham and S Charters, Guidelines for Performing Systematic Literature Reviews in Software Engineering, EBSE Technical Report, Keele University, Keele, UK, 2007.
NG Leveson and JP Thomas, STPA Handbook, MIT, Cambridge, MA, USA, 2018, pp. 1-188.
IEC, IEC Functional Safety and IEC 61508, Standard, International Electrotechnical Commission, IEC, Geneva, Switzerland, 2010.
DH Stamatis, Failure Mode and Effect Analysis: FMEA From Theory to Execution, ASQ Quality Press, Milwaukee, WI, USA, 2003, pp. 455.
D Domis, Integrating Fault Tree Analysis and Component-Oriented Model-Based Design of Embedded Systems, Verlag Dr. Hut, München, Germany, 2012.
A Joshi, M Whalen, MP Heimdahl, and SP Miller, Model-Based Safety Analysis: Final Report. Technical Report, University of Minnesota, Minneapolis, MN, USA, 2005.
SM Kannan, Y Dajsuren, Y Luo, and I Barosan, Analysis of ISO 26262 Compliant Techniques for the Automotive Domain, Proceedings of the International Workshop on Modelling in Automotive Software Engineering (MASE) co-located with 2015 ACM/IEEE 18th International Conference on Model Driven Engineering Languages and Systems (MODELS), ACM/IEEE, Ottawa, Canada, 2015, pp. 33-42.

PRIMARY STUDIES

Cite This Article

ris
TY  - JOUR
AU  - Sangeeth Kochanthara
AU  - Loek Cleophas
AU  - Yanja Dajsuren
AU  - Mark van den Brand
PY  - 2026
DA  - 2026/04/02
TI  - AI Safety Requirements: A Perspective From the Automotive World
JO  - Journal of Software Engineering for Autonomous Systems
SN  - 2949-9372
UR  - https://doi.org/10.55060/j.jseas.260402.001
DO  - https://doi.org/10.55060/j.jseas.260402.001
ID  - Kochanthara 2026
ER  -