2.1 Executable Design Principles and Basis

Effective OT–IT isolation design is grounded in a set of executable principles that translate security intent into engineering decisions. These principles are not abstract guidelines — each one specifies when it applies, what the technical basis is, and how compliance is verified. The following twelve principles form the design foundation for every OT–IT boundary implementation covered by this guide.

No. | Principle | When Applied | Technical Basis | Verification Method
1 | Business Continuity First: security must not become the top outage cause | Always in process control | Availability risk assessment, safety requirements | Uptime metrics; security-caused incident count
2 | Minimum Interconnection: only connect what is required by a documented business need | Any OT–IT link request | Least privilege principle, audit findings | Flow matrix review; orphaned rule audit
3 | Zoning & Conduits: segment by function/criticality; connect zones through controlled conduits | Any new line/station | IEC 62443 zoning concept | VLAN audit; traceroute between zones
4 | Default Deny at Boundaries: start from deny-all, then whitelist precisely | Firewall/diode policies | Secure-by-default control | Negative test: blocked flows confirmed
5 | Protocol Break and Content Inspection: use proxies/gateways to break direct sessions and scan content | File transfer, APIs, historian replication | Malware prevention practice | EICAR test; proxy log review
6 | Strong Identity and Session Accountability: named accounts, MFA, session recording, JIT approvals | Remote O&M, engineering changes | PAM best practice | PAM report; recording sample audit
7 | Offline/Staged Updates: patches and AV signatures pass through staging + verification before OT deployment | Limited windows and legacy systems | OT operational constraints | Patch manifest and hash validation
8 | Deterministic Performance: enforce QoS; avoid deep inspection where it breaks latency constraints | Control loops sensitive to delay/jitter | Control engineering constraints | RTT/jitter measurement under load
9 | Defense-in-Depth: firewall + hardening + monitoring + backup + response plan | All tiers | Layered security model | Control coverage matrix
10 | Auditable Change Control: every ruleset change is ticketed, reviewed, tested, and reversible | Rule updates, new conduits | ISO-style operational controls | Change ticket completeness audit
11 | Fail Secure, Not Fail Open: boundary failure should not expose OT to IT (unless safety requires otherwise and is explicitly approved) | HA design, bypass switches | Risk governance | Failover drill; bypass policy review
12 | Separate Management Plane: manage firewalls/DMZ hosts from dedicated admin networks with strict access | Medium/large sites | Attack surface reduction | Mgmt VLAN isolation test
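
Principle 4 and its verification method (a negative test confirming that blocked flows stay blocked) can be illustrated with a small sketch. The names `Flow`, `ALLOWLIST`, `is_permitted`, and `negative_test`, as well as the zone names and ports, are hypothetical and chosen only for illustration; a real boundary device would enforce this in its own policy language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    src_zone: str
    dst_zone: str
    protocol: str
    port: int


# Hypothetical allowlist: each entry should trace to a documented business need.
ALLOWLIST = frozenset({
    Flow("OT-Core", "DMZ", "tcp", 4840),  # e.g. OPC UA export to a DMZ broker
    Flow("IT", "DMZ", "tcp", 443),        # e.g. IT clients reading from the DMZ
})


def is_permitted(flow, allowlist=ALLOWLIST):
    """Default deny: a flow passes only if it matches an explicit entry."""
    return flow in allowlist


def negative_test(should_be_blocked, allowlist=ALLOWLIST):
    """Return the flows that were expected to be blocked but would pass."""
    return [f for f in should_be_blocked if is_permitted(f, allowlist)]


if __name__ == "__main__":
    probes = [Flow("IT", "OT-Core", "tcp", 502), Flow("IT", "OT-Core", "tcp", 3389)]
    leaks = negative_test(probes)
    print("negative test passed" if not leaks else f"unexpectedly permitted: {leaks}")
```

An empty result from `negative_test` is the evidence that the listed flows are denied; any returned flow points to an allowlist entry that is broader than intended.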

2.2 Failure Causes and Recommendations

Many OT security incidents can be traced to a small set of recurring design or operational failures. Understanding these failure patterns and their mechanisms enables designers to build preventive controls into the architecture from the outset, rather than discovering vulnerabilities during incidents. The table below documents eight high-frequency failure groups with their mechanisms, recommended avoidance strategies, and operational checks.

Common Failure Cause | Failure Mechanism | Recommendation (Avoidance) | Operational Check
Flat network "for convenience" | Any IT compromise reaches controllers directly | Enforce zones; no L3 route IT↔OT | Route table & traceroute audit
"Temporary" any-any rules | Rules remain forever, become permanent backdoors | Time-bound rules + auto-expiry enforcement | Weekly rules aging report
Direct VPN into OT | Credential theft gives full OT access | VPN terminates in DMZ only; bastion required | Verify VPN split-tunnel disabled
Shared vendor accounts | No attribution, password reuse across vendors | Named accounts + MFA + JIT approvals | PAM report and account review
No protocol understanding | Wrong DPI settings block process traffic | Pilot test in lab; staged rollout with monitoring | Pre-prod acceptance tests
Patch from Internet directly | Introduces malware or breaks system stability | Offline staging + checksum + rollback plan | Patch manifest & hash validation
Logs not time-synced | Correlation fails; delayed incident response | NTP/PTP relays; drift monitoring dashboard | Time drift dashboard alerts
Under-sized firewalls | Drops cause control disruptions during IT incidents | Size by CPS/sessions/latency with headroom | Load test and headroom policy enforcement
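
The "weekly rules aging report" check for temporary rules could look roughly like the sketch below. The `Rule` record and its fields are assumptions made for illustration; in practice the data would come from an export of the firewall ruleset or the change-ticket system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Rule:
    rule_id: str
    description: str
    temporary: bool
    expires: Optional[date]  # None means no expiry date was recorded


def aging_report(rules, today):
    """List temporary rules that are past expiry or were never given one."""
    expired = [r.rule_id for r in rules
               if r.temporary and r.expires is not None and r.expires < today]
    missing = [r.rule_id for r in rules
               if r.temporary and r.expires is None]
    return {"expired": expired, "missing_expiry": missing}


if __name__ == "__main__":
    ruleset = [
        Rule("R1", "vendor debug access", True, date(2024, 1, 10)),
        Rule("R2", "temporary any-any for commissioning", True, None),
        Rule("R3", "historian export to DMZ", False, None),
    ]
    print(aging_report(ruleset, today=date(2024, 2, 1)))
```

Both lists deserve follow-up: expired rules should already have been removed by auto-expiry, and temporary rules without an expiry date are the ones most likely to become permanent.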

2.3 Core Design / Selection Logic

The isolation level selection process follows a structured decision tree that maps the functional requirements of each OT–IT interconnect to the appropriate security mechanism. The key driver is the directionality of the required data flow: one-way export requirements favor unidirectional gateways (data diodes), while bidirectional requirements must be carefully evaluated to determine whether they can be safely brokered through a DMZ. Any requirement that cannot be safely brokered in a DMZ should trigger an application re-architecture review rather than a direct OT–IT connection.

Decision Tree for Isolation Level Selection

Figure 2.1: Decision Tree for Isolation Level Selection — from root question "Need OT-IT Communication?" through directionality and DMZ feasibility checks to the four outcome paths: Physical Isolation, One-Way Gateway, DMZ Controlled Interconnect, or Re-architect/Reject.
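
The decision tree can also be written as a small function. This is a sketch that follows the branch order described in the caption and the preceding paragraph; the function name, parameter names, and the `Isolation` enum are illustrative, and the actual figure may contain intermediate checks not reproduced here.

```python
from enum import Enum


class Isolation(Enum):
    PHYSICAL = "Physical Isolation"
    ONE_WAY = "One-Way Gateway"
    DMZ = "DMZ Controlled Interconnect"
    REJECT = "Re-architect/Reject"


def select_isolation(needs_comm, one_way_only, dmz_brokerable):
    """Map the three questions of the decision tree to an outcome path."""
    if not needs_comm:
        return Isolation.PHYSICAL   # no OT-IT communication required
    if one_way_only:
        return Isolation.ONE_WAY    # export-only flow: data diode
    if dmz_brokerable:
        return Isolation.DMZ        # bidirectional, but safely brokered
    return Isolation.REJECT         # never fall back to a direct link


if __name__ == "__main__":
    # Historian replication that only exports data from OT to IT.
    print(select_isolation(True, True, False).value)
```

Note that the final branch returns a review outcome rather than a connection type, which mirrors the rule that a requirement that cannot be brokered in a DMZ triggers re-architecture instead of a direct OT–IT connection.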

Step-by-Step Design Sequence

The following nine-step sequence provides a structured methodology for designing an OT–IT isolation solution from initial assessment through operational readiness. Each step builds on the outputs of the previous step, creating a traceable design record that supports both acceptance testing and ongoing operations.

  1. Build asset inventory and classify by criticality (Safety, Production, Quality, Visibility).
  2. Build communication flow matrix — who talks to whom, protocol, ports, direction, frequency, latency.
  3. Decide zoning — Field/Control/Core/DMZ/O&M/Data Uplink zones with VLAN/subnet assignments.
  4. For each OT–IT flow, choose isolation level using directionality, integrity needs, and operational need.
  5. Select devices — industrial firewall vs. diode vs. isolation gateway; determine HA requirements.
  6. Design whitelist rules and inspection points; define logging and alerting requirements.
  7. Define O&M model — bastion, MFA, JIT approvals, recording, file transfer controls.
  8. Define update model — offline staging, scanning, release ring, rollback procedures.
  9. Define acceptance tests and O&M runbooks; establish change control procedures.
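
As an illustration of how the flow matrix from step 2 feeds the whitelist design in step 6, the sketch below models one matrix row and derives two outputs: flows that would cross OT and IT directly (candidates for the isolation-level decision in step 4) and vendor-neutral allow lines. The `FlowEntry` fields, the zone names, and the rule text format are assumptions for illustration, not a real firewall syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowEntry:
    source: str      # asset name from the inventory (step 1)
    src_zone: str    # zone assigned in step 3
    dest: str
    dst_zone: str
    protocol: str
    port: int
    direction: str   # "one-way" or "bidirectional"


def direct_ot_it_flows(matrix, ot_zones, it_zones):
    """Flows whose endpoints sit in OT and IT zones with no DMZ in between."""
    return [f for f in matrix
            if (f.src_zone in ot_zones and f.dst_zone in it_zones)
            or (f.src_zone in it_zones and f.dst_zone in ot_zones)]


def to_allow_rules(matrix):
    """Render one allow line per documented flow; all else stays denied."""
    return [f"allow {f.protocol} {f.source}({f.src_zone}) -> "
            f"{f.dest}({f.dst_zone}) port {f.port}"
            for f in matrix]


if __name__ == "__main__":
    matrix = [
        FlowEntry("historian", "Core", "replica", "DMZ", "tcp", 4840, "one-way"),
        FlowEntry("erp", "IT", "plc-7", "Control", "tcp", 502, "bidirectional"),
    ]
    for flow in direct_ot_it_flows(matrix, {"Field", "Control", "Core"}, {"IT"}):
        print("needs isolation decision:", flow.source, "->", flow.dest)
```

Keeping the matrix as structured data means the same record can drive rule generation, the orphaned-rule audit, and the acceptance tests defined in step 9.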

2.4 Key Design Dimensions

A complete OT–IT isolation design must address eight key dimensions that span technical performance, operational sustainability, regulatory compliance, and lifecycle economics. Neglecting any dimension creates gaps that may not surface until acceptance testing or, worse, during an incident. The table below provides a structured overview of each dimension with its key considerations and typical metrics.

Dimension | Key Considerations | Typical Metrics / Targets | Design Impact
Performance / Experience | Latency, jitter, throughput, CPS; inspection must not break control | RTT < 5 ms added by FW; CPS headroom ≥ 4× | DPI placement, bypass for time-critical paths
Stability / Reliability | HA, link redundancy, failover behavior, deterministic operation | HA failover ≤ 30 s; no fail-open behavior | Active/standby FW pairs, dual uplinks, state sync
Maintain / Replace | Modular design; spare parts; version control; rollback | MTTR < 4 h for boundary devices | Spare inventory, golden images, staged upgrades
Compatibility / Extension | Protocol support (Modbus/TCP, OPC UA, DNP3, IEC 104), vendor interoperability | ≥ 6 OT protocols supported by DPI engine | Protocol testing in lab before production
LCC / TCO | Licensing, support, lifecycle, spares, training | 5-year TCO comparison across vendors | Avoid proprietary lock-in; require exportable configs
Energy & Environment | Power draw, heat, cabinet cooling, EMI tolerance | Industrial-grade: -40 to +70 °C, DIN rail mount | Cabinet thermal design, UPS sizing
Compliance / Certification | IEC 62443 component claims, security hardening guides | IEC 62443-4-2 SL2 component certification | Vendor certification documentation required
Operability | Logs, dashboards, runbooks, change workflow | Alert-to-ticket time < 5 min; runbook coverage 100% | SIEM integration, change management tooling

Design Reminder: The most common dimension neglected during initial design is operability. A technically sound architecture that lacks clear runbooks, alert thresholds, and change procedures will degrade over time as staff turnover and configuration drift accumulate. Invest in operational documentation as part of the design deliverable, not as an afterthought.
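
The performance targets in the table can be checked with simple arithmetic during load testing. The sketch below assumes that "CPS headroom" means the ratio of the device's rated connections per second to the peak observed rate, and that "added RTT" is the difference between round-trip time measured with and without the firewall in the path; both interpretations and all function names are assumptions for illustration.

```python
def cps_headroom(rated_cps, peak_cps):
    """Ratio of rated connections per second to the observed peak."""
    if peak_cps <= 0:
        raise ValueError("peak_cps must be positive")
    return rated_cps / peak_cps


def added_rtt_ms(rtt_with_fw_ms, rtt_baseline_ms):
    """Latency the firewall adds, from paired measurements under load."""
    return rtt_with_fw_ms - rtt_baseline_ms


def meets_performance_targets(rated_cps, peak_cps, rtt_with_fw_ms,
                              rtt_baseline_ms, min_headroom=4.0,
                              max_added_rtt_ms=5.0):
    """True when headroom is at least 4x and added RTT is below 5 ms."""
    return (cps_headroom(rated_cps, peak_cps) >= min_headroom
            and added_rtt_ms(rtt_with_fw_ms, rtt_baseline_ms) < max_added_rtt_ms)


if __name__ == "__main__":
    # Illustrative numbers only: 20,000 rated CPS against a 4,000 CPS peak.
    print(meets_performance_targets(20_000, 4_000, 3.2, 1.0))
```

The thresholds are parameters rather than constants so that a site with tighter control-loop constraints can apply stricter limits.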