Pattern: Safely Importing Data

Overview

Few computer systems operate in isolation: most must exchange data with external environments to be useful. It is therefore vital to be able to import information into your organization’s systems without unintentionally bringing in harmful code.

Unfortunately, the same pathways that facilitate the import of legitimate data also serve as potential entry points for attackers seeking to infiltrate your systems with malware. This vulnerability applies to all forms of data import, including network connections and removable media. Every external system may pose a risk, making it essential to assess all sources critically.

This guidance outlines a series of technical controls that can help manage risks associated with data imports conducted over networks. It is especially crucial for systems where integrity or confidentiality is key, such as those managing sensitive information, classified data, significant transactions, or controlling industrial operations.


Content Structure

The guidance is organized into the following sections:

  1. Defensive measures for network attacks
  2. Preventing the import of harmful content
  3. Suggested practices for data import
  4. Monitoring for breach attempts


A. Defensive Measures for Network Attacks

Data is typically introduced into a system via network connections and suitable transport protocols, which might include generic protocols like SFTP or SMTP, or system-specific APIs.

Unfortunately, both network connections and transport protocols can be vulnerable to exploitation. A successful breach of either could allow an attacker to gain control over destination servers or network devices.

Nature of the Attack

A network-based attack may focus on any layer within the OSI model.

Media layers (physical, data link, network)

Possible attack methods include:

  • Sending a malformed Ethernet frame to exploit vulnerabilities in the Ethernet driver of the target device
  • Crafting a malformed IPv4 or IPv6 header that exploits vulnerabilities in the device’s IP stack, firmware, or operating system

Host layers (transport, session, presentation, application)

Possible attack methods include:

  • Sending malformed protocol headers that exploit vulnerabilities in protocol libraries (e.g., TCP, UDP, SIP).
  • Submitting messages designed to exploit vulnerabilities in transport compression libraries (e.g., ZIP or GZIP).
  • Creating malformed messages that target vulnerabilities in transport layer encryption (e.g., TLS).
  • Attacking applications or services that manage application layer protocols (such as HTTP, SMTP, or SFTP).

Defensive Techniques

The following controls, when combined, can lower the likelihood of a successful network attack:

  • Rapid patching of software and firmware within network and computing infrastructure, encompassing both operating systems and applications. Effective patching mitigates the risk posed by known vulnerabilities that have been addressed by vendors or the broader community, but it does not safeguard against attackers who exploit undisclosed vulnerabilities.
  • One-way flow control. Implemented using techniques like data diodes, this ensures that data only moves in a single direction through the channel. While flow control does not eliminate vulnerabilities within the destination system, it complicates an attacker’s ability to carry out command and control operations, export data, or glean further information from protected services.
  • Application of a simplified transfer protocol with a protocol break. A protocol break terminates the network connection and the application protocol, after which the payload is sent via a simplified protocol to a receiving process that reconstructs the connection and forwards the data. A well-designed protocol break significantly reduces the risk of a protocol-based attack on the destination system.

A protocol break is typically used alongside flow control measures for enhanced security.

Furthermore, with appropriate implementation, the combination of a protocol break and flow control can significantly reduce the risk of a successful network attack. Testing these implementations is critical to ensure they function as intended.
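As an illustrative sketch (not a reference implementation), a simplified transfer protocol can be as minimal as a length-prefixed byte stream: the sender terminates the original connection and application protocol, and only raw payload bytes cross the one-way channel. The function names and the size cap below are assumptions for illustration.

```python
import struct

MAX_PAYLOAD = 10 * 1024 * 1024  # hypothetical per-frame size cap

def send_frames(payloads, pipe):
    # Protocol-break sender: the original network connection and application
    # protocol terminate here; only raw payload bytes cross the one-way
    # channel, each prefixed with a fixed 4-byte big-endian length.
    for payload in payloads:
        if len(payload) > MAX_PAYLOAD:
            raise ValueError("payload exceeds simplified-protocol limit")
        pipe.write(struct.pack(">I", len(payload)) + payload)

def recv_frames(pipe):
    # Protocol-break receiver: reconstructs payloads from the simplified
    # protocol so they can be forwarded on a fresh connection to the
    # destination system.
    while True:
        header = pipe.read(4)
        if len(header) < 4:
            return  # clean end of stream
        (length,) = struct.unpack(">I", header)
        if length > MAX_PAYLOAD:
            raise ValueError("frame length exceeds simplified-protocol limit")
        payload = pipe.read(length)
        if len(payload) < length:
            raise ValueError("truncated frame")
        yield payload
```

In a real deployment the `pipe` would be a one-way transport such as a data diode; here any file-like object stands in for it.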


B. Preventing Import of Malicious Content

Attackers often attempt to compromise computer systems by embedding malicious code within files or data objects that the target system processes.

This malicious content is designed to execute the attacker’s code, which may be contained within the content or downloaded as part of the attack.

Nature of the Attack

There are various ways an attacker could compromise a target system using malicious content.

Possible attack methods include:

  • Sending malformed compressed content aimed at exploiting vulnerabilities in decompression algorithms
  • Delivering malformed encrypted content targeted at vulnerabilities in decryption processes
  • Sending syntactically incorrect content that exploits vulnerabilities in parsers used by the target system, such as JSON or XML parsers
  • Sending semantically incorrect content that takes advantage of flaws in how the target system processes data
  • Sending content that exploits logical errors in the target system, allowing an attacker to perform unauthorized actions
  • Embedding active code (e.g., scripts or macros) into editable formats like PDFs or Word documents, aiming to execute malicious code on the target system

Complex data formats make it difficult for developers to create and test parsers effectively. This complexity increases the likelihood of vulnerabilities being inadvertently introduced into their software.

For instance, consider productivity documents such as PDFs and Microsoft Word files. These can comprise thousands of interlinked data objects, each with different ‘type’ definitions, presenting a considerable attack surface for discovering vulnerabilities.

Defensive Techniques

The following controls can be implemented together to lower the risk of malicious content successfully breaching a system:

  • Rapid patching of all services and applications utilized to open or handle content, inclusive of any library dependencies, middleware, and underlying operating systems. Patching diminishes the risks associated with known vulnerabilities but does not protect against attackers who exploit undisclosed vulnerabilities.
  • Thorough engineering and testing of components involved in handling or processing external content to uncover and mitigate the risk of vulnerabilities during development.
  • Syntactic and semantic verification of content to ensure its correctness before it reaches the system responsible for interpretation. Verification components should be robustly designed and positioned after protocol breaks. Syntactic verification assures that the object structure and syntax are valid (e.g., valid XML or JSON conforming to a specified schema). Semantic verification ensures that the content is meaningful within the context of the operation or business process. Verification mechanisms should effectively remove any potentially active content.
  • Conversion of complex file formats into simpler formats. For intricate formats, creating a robust verification engine can be impractical since vulnerabilities can exist within verification functions as well. In such cases, transformation can be designed to neutralize potentially harmful code present in the content. However, the transformation engine processes complex data formats and remains susceptible to vulnerabilities. Therefore, transformation should occur prior to the protocol break sender component. This phase may also involve removing unwanted content, such as active code (e.g., macros). Post-verification, it may be necessary to reconstruct the object in its original or alternate format to meet destination system expectations.
  • Non-persistence and sandboxing of applications used for rendering content. These techniques can confine the impact of any compromise to specific sessions or limited durations, making it harder for attackers to establish persistent access within a network. Note that designing non-persistence and sandboxing requires careful consideration to ensure effectiveness.
  • Prevent execution of active code on destination systems. For instance, disabling macros (see our guidance on Macro Security in Microsoft Office).
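The syntactic and semantic verification described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical import format in which each object is a JSON transaction record; the schema, field names, and allowed values are invented for the example.

```python
import json

# Hypothetical schema: each imported object must be a JSON document
# with exactly these fields and types.
REQUIRED_FIELDS = {"account": str, "amount": int, "currency": str}
ALLOWED_CURRENCIES = {"GBP", "EUR", "USD"}

def verify(raw: bytes) -> dict:
    # Syntactic check: the bytes must parse as JSON and match the schema.
    # Semantic check: field values must make sense for the business process.
    # Raises ValueError on any failure, so nothing malformed reaches the
    # destination system.
    try:
        obj = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError(f"syntactic failure: {exc}")
    if not isinstance(obj, dict) or set(obj) != set(REQUIRED_FIELDS):
        raise ValueError("syntactic failure: unexpected structure")
    for field, expected in REQUIRED_FIELDS.items():
        if type(obj[field]) is not expected:
            raise ValueError(f"syntactic failure: bad type for {field}")
    # Semantic checks: values must be meaningful for the operation
    if obj["amount"] <= 0:
        raise ValueError("semantic failure: non-positive amount")
    if obj["currency"] not in ALLOWED_CURRENCIES:
        raise ValueError("semantic failure: unknown currency")
    return obj
```

A real verification engine would enforce a schema agreed with the destination system; the point of the sketch is that it accepts only content that is both well-formed and meaningful, and rejects everything else.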

Handling Nested Content

Special care should be given to formats containing embedded content. Nested content should be unpacked, transformed if needed, and verified.

To avoid recursion-related vulnerabilities, limits should be established on nesting and recursion.
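A nesting limit can be sketched like this. The container type and the limit are stand-ins: in practice the container would be an archive or compound document format, and the limit would be set by policy.

```python
MAX_DEPTH = 3  # hypothetical nesting limit set by policy

def unpack(obj, depth=0):
    # Recursively unpack nested content, refusing anything nested deeper
    # than MAX_DEPTH so that crafted, deeply nested objects (e.g. an
    # archive-in-archive "bomb") cannot exhaust the unpacker or verifier.
    if depth > MAX_DEPTH:
        raise ValueError("nesting limit exceeded; rejecting content")
    if isinstance(obj, list):  # stand-in for a container format
        return [unpack(child, depth + 1) for child in obj]
    return obj  # leaf content: transform and verify here
```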

For systems that accept multiple content types, there exists a possibility for one format to be disguised as another to mislead verification engines. To avert this, content formats should be verified thoroughly and consistently at every stage of the system architecture.
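One common defence against such masquerading is to check a file's leading "magic" bytes against its declared type at each stage, rather than trusting a file extension or metadata. The signatures below are real, but the function is an illustrative sketch.

```python
# Leading "magic" bytes for a few common formats
MAGIC = {
    "pdf": b"%PDF-",
    "zip": b"PK\x03\x04",
    "png": b"\x89PNG\r\n\x1a\n",
    "gzip": b"\x1f\x8b",
}

def matches_declared_type(data: bytes, declared: str) -> bool:
    # Reject content whose actual leading bytes disagree with the type it
    # claims to be, so one format cannot be smuggled past a verifier that
    # expects another.
    signature = MAGIC.get(declared)
    return signature is not None and data.startswith(signature)
```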


C. Suggested Practices for Data Import

The controls discussed earlier, encompassing transformation, verification, and flow control, can be integrated to establish a robust pattern for data import. Our recommended structure is illustrated below:

Ordering of the components

The components within this pattern are specifically arranged to optimize protection. In particular, the transformation phase occurs before data is processed through the protocol break and flow control, with verification as the final step.

We regard the Flow Control as the demarcation between the less trusted ‘low side’ of the gateway and the more trusted ‘high side.’ The transformation process should be conducted on the low side, as parsing and handling untrusted content inherently carries risk. We acknowledge that the transformation engine might be compromised; thus, identification and mitigation of such issues are imperative.

If an attacker gains complete control over the transformation engine, multiple additional challenges must be overcome to affect the destination system.

In an effectively designed import gateway, the verification engine has a much simpler task than the transformation engine: its purpose is to verify that the data output from the transformation engine aligns syntactically and semantically with expectations.

Unlike the transformation engine, which may need to accommodate diverse formats from various sources, the verification engine might need only to validate content in a single, straightforward format. Because the gateway controls the format the transformation stage produces, the verification engine can enforce strict compliance when checking data correctness.

Upon successful verification, data can be forwarded to the destination system, potentially being transformed back to its original (or a different) format for compatibility.
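The ordering described above can be sketched end to end. Every function here is a trivial, hypothetical stand-in; the point is the sequence, not the individual steps.

```python
# Trivial stand-ins so the sketch runs; real components are far richer.
def transform(raw: bytes) -> bytes:
    return raw.strip().upper()          # e.g. complex document -> plain text

def one_way_transfer(data: bytes) -> bytes:
    return data                         # stands in for diode + protocol break

def verify(data: bytes) -> bytes:
    if not data.isalpha():              # strict check on the simple format
        raise ValueError("verification failure: unexpected content")
    return data

def deliver(data: bytes) -> bytes:
    return data                         # optionally re-encode for destination

def import_pipeline(raw: bytes) -> bytes:
    # Recommended ordering: transform on the low side, pass through the
    # protocol break and flow control, then verify on the high side
    # before delivery to the destination system.
    simple = transform(raw)
    received = one_way_transfer(simple)
    verified = verify(received)
    return deliver(verified)
```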

Component Removal

Not every component in the pattern is always necessary. For instance, transformation might not be essential if the content format is straightforward enough to be verified directly. Additionally, the required level of verification can be balanced against how robust the destination system is, as well as the ramifications of a system breach.

Architectural Considerations

Here are some critical security considerations pertaining to the overall architecture:

  • Management and administration should not compromise gateway security. Specifically, low-side components should be managed independently of high-side components, ensuring that a failure of any low-side component cannot permit bypassing the gateway.
  • Non-persistence and sandboxing should be utilized to limit compromise impact. Components that handle data processing, such as the transformation engine, verification engine, and destination system, may benefit from one or both techniques. While they won’t prevent component compromise, they can constrict impact to specific processes and hinder an attacker from gaining a permanent foothold.
  • Steps should be executed sequentially. Network design must ensure that the procedures within the pattern cannot be circumvented, as doing so would diminish the security provided by the end-to-end solution.


D. Monitoring for Breach Attempts

Implementing protective monitoring across every element of the gateway is beneficial; however, the verification engine and destination system are critical components that require close surveillance for potential breach attempts.

Surveillance of the Verification Engine and Destination System

The verification engine is a key security enforcing element within the gateway and should be monitored rigorously. For import processes employing the transformation method, a correctly functioning transformation engine should yield no verification failures. A verification failure indicates that the transformation component may have malfunctioned or been compromised, warranting immediate alerts for action.
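The alerting behaviour described can be as simple as treating every verification failure as a high-priority signal that the transformation stage may have malfunctioned or been compromised. The logger name and alert routing below are hypothetical.

```python
import logging

alerts = logging.getLogger("gateway.verification")  # hypothetical alert channel

def verify_or_alert(verify_fn, payload: bytes):
    # Wrap the verification engine: a correctly functioning transformation
    # stage should yield no failures, so any ValueError here is escalated
    # as an immediate, high-priority alert rather than silently dropped.
    try:
        return verify_fn(payload)
    except ValueError as exc:
        alerts.critical(
            "verification failure - possible transformation compromise: %s",
            exc,
        )
        raise
```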

The destination system in our recommended pattern represents the target an attacker seeks to infiltrate. Monitoring for compromise indications here is crucial, especially focusing on components and processes that manage externally sourced content. Based on the destination system’s integrity requirements, it may be wise to separate components dealing with external content from other internal content, allowing monitoring systems to concentrate more on higher-risk activities.

Monitoring Additional Components

For deployments with higher risk levels, applying protective monitoring to the remaining components of the gateway is valuable. Below are some suggestions on what to watch for:

External Network Connection

For import gateways that exclusively accept connections from designated source systems, implementing strict guidelines can restrict connections to authorized sources and notify of any attempts from unrecognized sources.

For gateways allowing connections from numerous sources, relying solely on a ‘known good’ strategy may be infeasible, necessitating a ‘known bad’ approach instead.
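A "known good" connection check of the kind described might be sketched as follows. The allow-listed networks and the logger are assumptions for the example.

```python
import ipaddress
import logging

logger = logging.getLogger("import-gateway")

# Hypothetical allow-list of designated source systems
AUTHORISED_SOURCES = {
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.17/32"),
}

def connection_allowed(source: str) -> bool:
    # "Known good" check: accept only designated source systems, and raise
    # an alert for any connection attempt from an unrecognised address.
    addr = ipaddress.ip_address(source)
    if any(addr in net for net in AUTHORISED_SOURCES):
        return True
    logger.warning("connection attempt from unauthorised source %s", source)
    return False
```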

Transformation Engine

Because the transformation engine processes complex, unfiltered information, it is prudent to assume it could be relatively easy to compromise. While our model aims to prevent such a compromise from affecting the verification engine or the destination system, awareness that a compromise has occurred is critical so that it can be remediated.

Monitoring the transformation engine for signs of compromise may encompass observing attempts to initiate outbound network connections or to modify the system. It could also involve tracking atypical behavior in the underlying server, virtual machine, or container responsible for transformation, such as crashes of processes that handle external data, or calls to unusual libraries or binaries.

Protocol Break and Flow Control

If an optical flow control device is utilized, it’s vital to maintain monitoring within the protocol break receiver for any connection drops and to ensure the integrity of the flow control connection. Any anomalies in the receiver’s components could imply a compromise of the sender, requiring immediate reporting and action.

Internal Network

In enclosed or isolated networks, where communication patterns are predictable and well understood, a ‘known good’ network monitoring approach should be feasible, alerting on unusual communications. If a comprehensive internal network solution is not possible, specific attention should be placed on unexpected connection attempts to or from the verification engine.

Integrating Logs from Various Components

To facilitate monitoring of the gateway and enhance security understanding, it’s advisable to aggregate and correlate logs or alerts from different components into a suitable analysis platform.

To prevent any monitoring flows from acting as bypasses for components in the gateway, we recommend validating logs or alerts for accuracy utilizing the methods outlined in this guidance before analysis—failure to do so could enable attackers to generate falsified logs as a means of compromising the destination system.

Logs or alerts from external, low-side components (in figure 2, the transformation engine(s), external network, and protocol break sender) should undergo appropriate protocol breaks and flow control prior to being introduced into the destination system for monitoring.

For additional details on log collection, please refer to Introduction to logging for security purposes.


Final Thoughts

This pattern has been formulated through practical experience. While it does not guarantee absolute safety, proper implementation can provide a substantial defense against attacks and can be adapted for various systems.

As always when implementing security controls, it’s crucial to understand the complete system from a holistic perspective, considering both technology and human processes. The practices of transformation and verification may influence user experiences, so thorough testing is advised to ensure the system performs efficiently for users prior to deployment.

Based on an article from www.ncsc.gov.uk: https://www.ncsc.gov.uk/guidance/pattern-safely-importing-data
