Technical Assessments for ICS—Know the Risks
Although value can be derived from offline methods such as paper-based framework assessments, many critical discoveries can only be uncovered through a technical assessment using online, active assessment techniques.
Three reasons why you need a technical assessment
- Understanding your attack surface. In software security, attack surface represents the sum of the points with which an unauthorized user can interact. We can broaden that concept beyond just software to the larger system encompassing the networks, servers, workstations, infrastructure, embedded devices, etc. The concept of attack surface then becomes incredibly useful for understanding the opportunities a malicious actor or malware has to interface with the system in question.
- Understanding your attack path. Once we understand attack surface, we can also begin to identify logical attack paths: the opportunities an attacker or malware has to move or advance from one network or system to the next. Very often in ICS we are looking at how the attacker could make it from an Internet-connected environment, such as the business network, into the control environment. Perhaps the easiest logical path is through a jump server in a DMZ, a multi-homed historian, or an ICS DMZ host with vulnerable software, to name just a few examples.
- Understanding your vulnerability. Understanding the attack surface and attack paths, we can now look at specific component vulnerabilities through a realistic lens (i.e., one that takes into consideration the ICS impact) and begin to formulate an understanding of the vulnerability of the system and the process as a whole.
None of these discoveries are possible without an online, technical assessment. While other techniques like documentation review and stakeholder interviews are useful and add to the picture, they rarely give you the full picture.
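As a rough illustration of the attack surface and attack path ideas above, the sketch below models a small network as a directed graph and enumerates the paths from a business workstation to a controller. The topology, the host names, and the function names `reachable_hosts` and `attack_paths` are all hypothetical, chosen only for illustration. In practice the edges would come from what the technical assessment actually observes, not from a hand-written table.

```python
from collections import deque

# Hypothetical topology (an assumption for illustration only): each key can
# initiate a connection to the hosts in its list.
REACHABLE = {
    "internet": ["business_workstation"],
    "business_workstation": ["dmz_jump_server", "historian"],
    "dmz_jump_server": ["engineering_workstation"],
    "historian": ["engineering_workstation"],  # multi-homed host
    "engineering_workstation": ["plc_1"],
    "plc_1": [],
}


def reachable_hosts(graph, source):
    """Return every host reachable from source: a crude proxy for the
    attack surface exposed to someone positioned at source."""
    seen = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen and nxt != source:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def attack_paths(graph, source, target):
    """Return every loop-free path from source to target, fewest hops first."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip hosts already on this path
                queue.append(path + [nxt])
    return paths


paths = attack_paths(REACHABLE, "business_workstation", "plc_1")
for path in paths:
    print(" -> ".join(path))
```

With this made-up topology, both the jump server and the multi-homed historian show up as routes into the control environment, which is the kind of finding a diagram review alone can miss when the documented connectivity is out of date.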
Three perceived challenges to using online tools
Simply put, the challenge is performing the online assessment in a manner that is safe for production operations—safe in the sense that, depending on the function of your ICS, human health and safety is not adversely affected and there is no system outage in the process of conducting the assessment. It can be done but there are some perceived challenges to overcome.
No one would dispute that understanding potential risks within your ICS environment is important. Yet there continues to be resistance within the industry to embracing technical assessment tools and techniques. Many reason that the risk of jeopardizing the safety and/or reliability of production environments is too great, and not without cause.
Traditional IT/automated scanning and testing techniques certainly can cause system interruption if not performed correctly. Additionally, legacy systems and poorly designed, fragile technology stacks can present a challenge.
There have been many cases of assessors or well-meaning security professionals running an assessment tool in an ICS environment and causing a negative outcome. Examples of such ‘negative outcomes’ range from bricking a device, to overwriting or destroying data, to forcing a reboot of controllers or PLCs to the point where you have to reload a configuration. And if you don’t have a backup of that configuration, then you're in real trouble! So, caution is justified.
But without some level of technical assessment, you're left with a paper-based approach to assess your risk. A paper-based assessment often relies on interviewing stakeholders, reviewing documentation and architecture diagrams, system walkdowns, and assessment and testing in offline, backup systems. Such methods do provide value (and are very often done in conjunction with technical assessment), but in and of themselves they fall short of telling a complete story.
Three components to a solid plan
Assessment success starts with a plan, and the plan starts with understanding the purpose.
- Determine the primary objective (and cadence) of the assessment. Is the primary objective asset identification, asset management, vulnerability management, risk assessment, or something else? Often the primary purpose is asset discovery. Because of configuration drift and cases where devices are added without documentation, you don't always have an accurate representation of what's there. Understanding your asset base is a critical step to assessing risk in ICS environments. That initial asset inventory can then set the foundation for other security functions such as vulnerability management, configuration and change management, etc. Understanding the long-term objective of the assessment will help inform the plan.
- Consider the impact to operations. Regardless of the purpose, during this planning phase we need to work with people like instrumentation and control (I&C) technicians and engineers to make sure that we understand potential impacts. Across multiple levels (system, network, device, and component), we need to understand potential impacts to the production process if a device or service is taken offline.
- Perform a physical walk-through. Pay attention to things like the vintage of the system, which must be considered when deciding what level of online assessment is going to work for a particular environment. Vendors have matured in the way they build these systems over time, so a system from the 1990s likely will not have communication stacks as robust as one deployed this year.
All of this legwork will inform what level of online assessment is going to be successful in your environment.
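To make the asset discovery objective concrete, here is a minimal sketch of comparing a documented inventory against what an assessment actually observes, which is one way configuration drift and undocumented devices surface. The addresses, descriptions, and the function name `inventory_drift` are hypothetical. How the observed data is collected safely in a given environment is a separate decision that the planning steps above are meant to inform.

```python
def inventory_drift(documented, observed):
    """Compare documented assets against assets observed on the network.

    Both arguments map an asset identifier (here an IP address) to a
    description such as a firmware or software version.
    """
    documented_ids = set(documented)
    observed_ids = set(observed)
    return {
        # Seen on the network but absent from the documentation.
        "undocumented": sorted(observed_ids - documented_ids),
        # Documented but not seen (decommissioned, offline, or unreachable).
        "missing": sorted(documented_ids - observed_ids),
        # Present in both, but the recorded description no longer matches.
        "changed": sorted(
            asset
            for asset in documented_ids & observed_ids
            if documented[asset] != observed[asset]
        ),
    }


# Illustrative data only; none of these values come from a real system.
documented = {
    "10.0.1.10": "PLC firmware 2.1",
    "10.0.1.11": "HMI build 7",
    "10.0.1.12": "Historian 4.0",
}
observed = {
    "10.0.1.10": "PLC firmware 2.3",
    "10.0.1.11": "HMI build 7",
    "10.0.1.20": "Unknown switch",
}

report = inventory_drift(documented, observed)
for category, assets in report.items():
    print(category, assets)
```

Each category points to a different follow-up: undocumented devices need to be identified, missing ones confirmed as retired or merely unreachable, and changed ones reconciled through change management.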
Three reasons you “can’t” that shouldn't stop you
Here's what you need to know to be able to discuss risk in ICS. The consequence of not performing any active scanning or technical assessment is an indirect acceptance of an unknown risk. Here are some examples:
- Fragile technology implementation may be creating a fragile operational process or one prone to failure.
- Lack of redundancy creates risk. If lack of redundancy is a challenge such that you cannot accept any potential impact, obviously this is its own risk.
- If the system can't handle a low-impact scan, will it be able to handle something as simple as a misconfiguration or an equipment failure that's sending stray traffic onto the network?
If your system is so fragile and, at the same time, so critical that it cannot undergo any kind of online assessment, that’s a risk in and of itself. If there's no redundancy, if there's no backup plan, if there's no tolerance for any kind of downtime, that is a risk that needs to be addressed. These are important counterpoints for system owners who may be hesitant to use online assessment techniques. Although fragile environments are becoming less common, the historical reactions and attitudes persist.
Top level takeaway? Don’t fall victim to indirect risk acceptance. You need to be doing some level of online assessment of your ICS. If not, you're accepting some unknown risk.
Going to S4x19 next week?
Catch Jason's session for the follow-on discussion to this post, “High Value, Low Impact Assessment Tools for ICS.”
About the Author
Jason leads Revolutionary Security’s Industrial Control Systems (ICS) practice. He has been actively involved in helping secure SCADA, DCS, and other Operations Technology (OT) for over 15 years with experience spanning the utility, oil and gas, chemical, and manufacturing industries.