Managing the Risk of Industrial Incidents
Traditional risk models overlook key factors. A consequence-based method offers a clearer path to practical mitigation.
Industrial Control Systems (ICS) cybersecurity risk is traditionally calculated with the formula R ($) = Impact ($) x Probability of occurrence (%). While this formula is broadly used, experts agree that its applicability is problematic. It does not account for all the applicable risk factors, such as known vulnerabilities that were disclosed but not mitigated, or the effectiveness of deployed defense measures.
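To illustrate why the traditional formula is hard to apply, the following minimal Python sketch evaluates R = Impact x Probability for a single impact figure and several guessed probabilities. The function name `traditional_risk` and all dollar and probability values are hypothetical and chosen only for illustration; they are not taken from any real assessment.

```python
def traditional_risk(impact_usd: float, probability: float) -> float:
    """Return R ($) = Impact ($) x Probability, with probability as a fraction (0..1)."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return impact_usd * probability


# Hypothetical figures, for illustration only.
impact = 2_000_000.0  # assumed loss in dollars if the incident occurs
for p in (0.01, 0.05, 0.10):  # three equally defensible guesses at likelihood
    print(f"p={p:.2f} -> R=${traditional_risk(impact, p):,.0f}")
```

With the same impact, the three guesses produce results that differ by a factor of ten, so the outcome depends almost entirely on a probability that is difficult to estimate for a cyber-attack.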
Consequence-based Risk Analysis
Consequence-based risk analysis may be considered a more practical approach than the conventional “probability of loss” calculation, because it is difficult to assess both the likelihood of a cyber-attack and the probability that it will cause damage.
Step 1: Consider possible causes of an industrial incident
Consider these distinct factors that might lead to an incident:
a. Unexpected hardware or software failures – These failures might happen at any moment, even after years of regular operation, without anyone acting or making a mistake.
b. A mistaken action of an authorized employee – When an unusual event occurs, the operator in the control room might get confused and act wrongly.
c. Internal sabotage – This might be caused by a disgruntled employee or initiated by a criminal organization, a hired adversary motivated by a hostile nation, etc.
d. Internally or externally initiated cyber-attacks – These might begin with the insertion of an infected USB stick or with a compromise of the barrier between the infected IT zone and the ICS-OT network.
e. Supply-chain-originated attack – This is considered among the most severe risks, as a broad range of attackers might use this vector.
Step 2: Assess the type of damage
Incidents that reach the ICS-OT zones may cause severe damage to the organization, resulting in financial losses and disrupting business continuity. The types of damage include:
a. Operation outage – The outage might last hours, days, or weeks. If the organization is well prepared, its team can react quickly, stop the harmful process, and shorten the outage.
b. Damage to production machinery – Adversaries may cause an extended outage by damaging the production machinery. Repairs may be expensive and last weeks to several months.
c. Harming lives – In extreme cases, damaged machinery might harm people working nearby through destruction, release of dangerous fumes, fire, or smoke.
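The damage types above can be combined into a rough consequence estimate. The sketch below is one possible way to do so, assuming that the financial loss is the outage duration multiplied by an hourly loss, plus machinery repair costs, with the risk to people recorded separately rather than priced. The class name `ConsequenceEstimate`, its fields, and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ConsequenceEstimate:
    outage_hours: float            # assumed duration of the operation outage
    loss_per_hour_usd: float       # assumed loss per hour of outage
    repair_cost_usd: float = 0.0   # assumed cost of repairing damaged machinery
    safety_impact: bool = False    # True if people working nearby could be harmed

    def financial_loss(self) -> float:
        """Outage loss plus repair cost; the safety impact is not monetized."""
        return self.outage_hours * self.loss_per_hour_usd + self.repair_cost_usd


# Hypothetical scenario: a three-day outage with damaged production machinery.
scenario = ConsequenceEstimate(
    outage_hours=72, loss_per_hour_usd=5_000, repair_cost_usd=250_000
)
print(f"Estimated loss: ${scenario.financial_loss():,.0f}")
print(f"Safety impact flagged: {scenario.safety_impact}")
```

Keeping the safety flag apart from the dollar figure is a design choice of this sketch: it avoids hiding a possible harm to people inside a single financial number.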
Step 3: Decide on the response
The combined figure of estimated impact and cost may provide a basis for determining an appropriate response. Organizations may consider this guideline a more practical and granular method for calculating the risks that must be managed.
a. Accept Risk – An organization may be prepared to absorb a certain level of damage and cover it from available internal funds.
b. Transfer Risk – An organization may purchase cyber insurance or another applicable type of insurance covering the expected losses.
c. Reduce Risk – An organization may redesign its operations to lower productivity, thereby reducing the consequences and damages in case of an incident.
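The three responses above can be expressed as a simple decision rule. The following sketch assumes one possible ordering: accept the risk when the estimated loss fits within internal funds, transfer it when insurance would cover the loss, and otherwise reduce it. The function `choose_response`, its parameters, and the thresholds are hypothetical; a real organization would also weigh safety and regulatory factors.

```python
from enum import Enum


class Response(Enum):
    ACCEPT = "accept"
    TRANSFER = "transfer"
    REDUCE = "reduce"


def choose_response(
    estimated_loss_usd: float,
    internal_reserve_usd: float,
    insurance_cover_usd: float,
) -> Response:
    """Pick a response under the assumed ordering: accept, then transfer, then reduce."""
    if estimated_loss_usd <= internal_reserve_usd:
        return Response.ACCEPT    # damage can be absorbed from internal funds
    if estimated_loss_usd <= insurance_cover_usd:
        return Response.TRANSFER  # insurance would cover the expected loss
    return Response.REDUCE        # loss exceeds both, so operations need redesign


# Hypothetical figures, for illustration only.
for loss in (50_000, 400_000, 3_000_000):
    decision = choose_response(loss, internal_reserve_usd=100_000, insurance_cover_usd=1_000_000)
    print(f"loss=${loss:,} -> {decision.value}")
```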
Conclusion
The methodology proposed above might not be accurate for all verticals and might not match all regulations. However, it can help organizations be better prepared for unexpected incidents, thereby minimizing consequential damage.
---------------------------------------------------------------------------------------------------
Daniel Ehrenreich, BSc., is a vendor-independent consultant and lecturer at Secure Communications and Control Experts (SCCE). He periodically teaches and presents at industry conferences on the integration of cyber defense with industrial control systems, and has over 34 years of engineering experience with ICS and OT systems for electricity, water, gas, and power plants.