
Inline pass/fail decisions stay resident on the edge: deterministic, PLC-integrated, and cloud-independent. The cloud handles model lifecycle, rollout orchestration, and governance asynchronously, never touching the control path. This article compares edge-only and edge-plus-cloud hybrid patterns across their operational consequences, covering latency budgets, PLC integration, CI/CD with hardware-in-the-loop validation, canary rollout, typed rollback triggers, and resilient OTA updates.
In Part 1, we established that cloud-based inference is ruled out for inline pass/fail decisions by physics, not preference. The actuation window on a high-speed automotive line is too narrow, WAN jitter too unpredictable, and the consequences of a missed rejection too severe.
That brings us to the real architectural question: how much intelligence should live at the edge, and how much belongs in the cloud? There are two realistic answers, edge-only and edge-plus-cloud hybrid, and the right choice depends on your fleet size, governance requirements, and operational maturity. In this part, we define both architectures precisely, walk through their strengths and limitations, and give you a decision framework you can apply to your own programme.
For automotive quality control, there are two realistic deployment patterns. Before comparing them, it helps to define two terms that will appear throughout this article:

- Real-time data plane: the local loop that acquires a frame, runs inference, makes the pass/fail decision, and actuates the PLC within a hard latency budget.
- Control plane: the asynchronous machinery that manages that loop over time: model lifecycle, rollout orchestration, configuration, monitoring, and governance.
These two planes have fundamentally different latency, reliability, and availability requirements, and keeping them architecturally separate is the core design principle that makes reliable edge AI possible at scale.
With that framing in place, the two deployment patterns become clear:
In an edge-only setup, each inspection station is a fully self-contained cyber-physical system. The real-time data plane operates entirely locally, but so does every other operational responsibility that a cloud control plane would otherwise handle.
This is the trade-off that makes edge-only architectures operationally demanding at scale: the station must own its own lifecycle, not just its inference loop.
Without a cloud control plane, every lifecycle operation that would normally be centrally managed must be handled locally or via manual intervention. This includes:
Each edge node must maintain a local versioned store of model artifacts (e.g., /opt/models/defect-detector-v17.onnx, with v16.onnx as fallback). Without a central registry, the station itself must know which version is active, which is the last-known-good, and how to revert, typically managed via a local config file and a simple rollback script that re-points the active-model symlink. Without this, a bad model update has no recovery path short of manual re-imaging.
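A rollback script of this kind can be very small. The sketch below assumes a hypothetical layout: a state file recording the active and last-known-good artifact names, and an `active.onnx` symlink that the inference service loads. None of these names come from a specific product; they illustrate the pattern.

```python
import json
import os

def rollback(model_dir: str, state_path: str) -> str:
    """Re-point the active-model symlink to the last-known-good artifact.

    Reads a small local state file, e.g.:
      {"active": "defect-detector-v17.onnx",
       "last_known_good": "defect-detector-v16.onnx"}
    """
    with open(state_path) as f:
        state = json.load(f)
    fallback = state["last_known_good"]

    # Atomically replace the symlink: create-then-rename avoids a window
    # where no active model exists if the process is interrupted.
    link = os.path.join(model_dir, "active.onnx")
    tmp = link + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(os.path.join(model_dir, fallback), tmp)
    os.replace(tmp, link)

    # Record the rollback so the station knows its own version history.
    state["active"] = fallback
    with open(state_path, "w") as f:
        json.dump(state, f, indent=2)
    return fallback
```

The atomic create-then-rename matters on an edge node: a power loss mid-rollback must not leave the station with no active model at all.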
Metrics, inference logs, and image samples must be retained locally for a meaningful window (e.g., 30–90 days) to support incident investigation, model performance review, and audit. This requires deliberate disk capacity planning and a retention policy (log rotation, compression, and archival schedules) that does not exist by default and must be explicitly provisioned per station.
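The deletion half of such a retention policy can be sketched in a few lines; function and path names here are illustrative. A production version would typically compress recent files and archive to a local NAS before anything is removed.

```python
import os
import time

def prune_old_artifacts(root: str, retention_days: int = 90) -> list[str]:
    """Delete inference logs / image samples older than the retention window.

    A minimal sketch: walks the artifact tree and removes every file whose
    modification time falls outside the retention window, returning the
    list of removed paths for the station's own audit log.
    """
    cutoff = time.time() - retention_days * 86400
    removed = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```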
In the absence of a central config management system, each station's configuration (model path, thresholds, camera parameters, PLC I/O mappings) can drift independently over time through ad-hoc manual changes. Edge-only setups must compensate with a local config-as-code discipline: all parameters stored in version-controlled files, with change history, and deployed via a defined process rather than direct edits. Without this, reproducing a known-good state after an incident becomes guesswork.
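One lightweight way to make drift visible is to fingerprint the config file at load time and log that fingerprint with every inference batch: any ad-hoc manual edit then shows up as a hash mismatch against the version in the repo. The key names below are assumptions for illustration, not a defined schema.

```python
import hashlib
import json

# Illustrative required parameters; map these to your actual station schema.
REQUIRED_KEYS = {"model_path", "reject_threshold", "camera", "plc_io_map"}

def load_station_config(path: str) -> dict:
    """Load a station config from a version-controlled file and fingerprint it.

    Validates that all required keys are present, then attaches a SHA-256
    fingerprint of the raw bytes so drift from the committed version is
    detectable in the inference logs.
    """
    with open(path, "rb") as f:
        raw = f.read()
    config = json.loads(raw)
    missing = REQUIRED_KEYS - config.keys()
    if missing:
        raise ValueError(f"config missing required keys: {sorted(missing)}")
    config["_fingerprint"] = hashlib.sha256(raw).hexdigest()
    return config
```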
When an edge node fails and must be replaced (hardware fault, thermal damage, or end of life), the replacement device must be restored to an identical operational state: same OS image, same container versions, same model artifacts, same config, same local metric history where possible. Without a documented and rehearsed recovery playbook (ideally an automated bootstrap script that provisions a replacement node from a known-good image and pulls current artifacts from a local NAS or USB staging store), device replacement becomes a multi-hour manual operation that takes the inspection cell offline.
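The restore-from-staging step of such a playbook can be sketched as follows. The staging layout (`models/`, `config/`, `containers/`) is an assumption for illustration; a real bootstrap would also verify checksums against a manifest and restart the inference service afterwards.

```python
import os
import shutil

# Hypothetical layout of the local NAS/USB staging store: each directory
# holds one class of artifact a replacement node needs.
ARTIFACT_DIRS = ["models", "config", "containers"]

def bootstrap_replacement_node(staging_root: str, target_root: str) -> list[str]:
    """Provision a replacement edge node from the local staging store.

    Copies each artifact class from staging into place and reports what
    was restored, failing loudly if the staging store is incomplete.
    """
    restored = []
    for name in ARTIFACT_DIRS:
        src = os.path.join(staging_root, name)
        dst = os.path.join(target_root, name)
        if not os.path.isdir(src):
            raise FileNotFoundError(f"staging store missing {name!r}")
        shutil.copytree(src, dst, dirs_exist_ok=True)
        restored.append(name)
    return restored
```

The point of scripting this at all is rehearsal: a playbook that has never been run against a blank node is the 4-hour manual path in disguise.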
Most mature programs evolve toward a hybrid edge-plus-cloud architecture, where the edge is the data plane and the cloud is the control plane. This separation is only meaningful, however, if it is structurally enforced, not just intended.
The single most common failure mode in hybrid deployments is the cloud gradually becoming a hidden runtime dependency: a model-lookup call added here, a remote threshold fetch added there, a control decision routed through a cloud API for convenience.
Each of these individually seems harmless, but collectively they reintroduce exactly the latency, jitter, and availability risks that the hybrid architecture was designed to eliminate.
The rule must be explicit and non-negotiable: the edge data plane must be fully self-sufficient at runtime.
The cloud must never be in the critical path of a pass/fail decision. Specifically:

- No model lookups or artifact downloads at inference time; the active model is always local.
- No remote threshold or configuration fetches inside the decision loop; config is deployed ahead of time.
- No control decision routed through a cloud API, however convenient.
At any point in your architecture, ask "If the cloud becomes unreachable right now, what happens to the next part on the line?" The only acceptable answer is: "The edge inspects it normally using the last deployed model and config." If any part of your design produces a different answer, the cloud has become a hidden runtime dependency and the architecture must be revised.
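Structurally, self-sufficiency means the inference loop touches only local state, and anything cloud-bound is fire-and-forget. The sketch below is illustrative (class and field names are assumptions, not a real product API): telemetry goes into a bounded queue that a background uploader would drain, and when the uplink is down long enough for the queue to fill, samples are dropped rather than ever blocking the line.

```python
import queue

class EdgeStation:
    """Data-plane self-sufficiency sketch: pass/fail never waits on the cloud.

    `inspect` is purely local; cloud sync is a best-effort, non-blocking
    enqueue that drops telemetry rather than stall the line when the
    uplink is unreachable.
    """

    def __init__(self, model, threshold: float):
        self.model = model                       # last deployed local model
        self.threshold = threshold               # last deployed local config
        self.uplink = queue.Queue(maxsize=1000)  # bounded: no backpressure

    def inspect(self, frame) -> bool:
        score = self.model(frame)                # local inference only
        verdict = score >= self.threshold
        try:
            # Telemetry is fire-and-forget; a full queue (cloud unreachable
            # for long enough) silently drops samples instead of blocking.
            self.uplink.put_nowait({"score": score, "pass": verdict})
        except queue.Full:
            pass
        return verdict
```

Note what is absent: no network call, no config fetch, no registry lookup anywhere in `inspect`. That absence is the architecture.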

For manufacturers operating under IATF 16949, ASPICE, or internal quality management systems, the ability to answer "which model version made this pass/fail decision, trained on which dataset, deployed by whom, and when?" is not optional — it is an audit requirement. This is the single strongest operational argument for hybrid architectures in mature automotive programmes, and it is independent of any latency or performance consideration.
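In practice this means every pass/fail decision is written as a record that carries full model provenance. The field names below are illustrative, to be mapped to your QMS schema; the point is that the record answers the audit question on its own, without a join against some other system.

```python
import datetime

def decision_record(part_id: str, verdict: bool, score: float,
                    model_meta: dict) -> dict:
    """Build an audit-ready record answering: which model version made this
    decision, trained on which dataset, deployed by whom, and when."""
    return {
        "part_id": part_id,
        "verdict": "pass" if verdict else "fail",
        "score": score,
        "decided_at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        # Provenance fields come from the deployment manifest, not from
        # the runtime; the edge only copies them into each record.
        "model_version": model_meta["version"],
        "training_dataset": model_meta["dataset"],
        "deployed_by": model_meta["deployed_by"],
        "deployed_at": model_meta["deployed_at"],
    }
```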
Device replacement is an inevitable operational reality on a factory floor: hardware fails, thermal damage occurs, and nodes reach end of life. The difference between a 20-minute automated recovery and a 4-hour manual re-imaging exercise has direct line-availability implications. At fleet scale (20+ stations), the cumulative impact of slow, manual recovery procedures becomes a significant operational cost that is rarely accounted for during initial architecture selection.
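The arithmetic is simple but worth writing down. The failure rate below is an assumed illustrative number, not measured data; substitute your own fleet's figures.

```python
def annual_recovery_hours(stations: int,
                          failures_per_station_per_year: float,
                          recovery_hours: float) -> float:
    """Expected inspection-cell downtime per year spent on node recovery."""
    return stations * failures_per_station_per_year * recovery_hours

# Assumed: 20 stations, 0.5 hardware failures per station per year.
manual = annual_recovery_hours(20, 0.5, 4.0)        # 4 h manual re-imaging
automated = annual_recovery_hours(20, 0.5, 20 / 60)  # 20 min automated restore
```

Under these assumptions the fleet spends roughly 40 hours a year on manual recovery versus about 3.3 hours automated; the gap widens linearly with fleet size.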
Both edge-only and edge-plus-cloud architectures can meet real-time pass/fail requirements provided the edge data plane is kept fully self-sufficient at runtime. The difference comes down to operational scale and governance. Edge-only is the right starting point for small fleets with strong data residency requirements; hybrid is the right long-term architecture for programmes operating at 10+ stations, subject to IATF 16949 or ASPICE audits, or requiring centralized retraining and fleet-wide analytics.
The most dangerous hybrid deployment is not a poorly designed one; it is a well-designed one that gradually accumulated hidden cloud runtime dependencies over time. Architectural discipline, not just architecture, is what separates a reliable hybrid from a fragile one.
Choosing the right architecture pattern is the strategic decision. Part 3 is where we get into the engineering: how to build a deterministic, sub-30 ms inference pipeline from frame acquisition through preprocessing, model inference, decision logic, and PLC actuation, with explicit jitter budgets, safe failure modes, and a clean adapter boundary to the control system.