Clients/SSLOCSD/files/Comm Errors in Alarms/Alarm_Investigation_Report.docx.md

Source: gdrive, Chunks: 2, Entities: 5, Type: Doc

Content

# Alarm_Investigation_Report.docx

> [Open in Google Drive](https://drive.google.com/file/d/18YFmZppz9vb7vvumBU08vJ9CuSRXAIK0/view)

## Summary

This document is an alarm investigation report for the WWTP Primary Sludge System at SSLOCSD, detailing a spurious alarm event caused by a momentary communication loss between the SCADA server and the PLC. It analyzes alarms related to drive faults on sludge pumps and a low-low sump level alarm, concluding that the root cause was a brief SCADA-to-PLC communication failure rather than actual equipment faults. The report includes a timeline, hypotheses evaluation, and recommendations for network infrastructure inspection and alarm configuration review.

## Content

<details><summary>Full extracted text</summary>

ALARM INVESTIGATION REPORT

WWTP Primary Sludge System: Spurious Alarm Event

### 1. Incident Summary

At 1:53:18 AM on March 2, 2026, the SCADA system generated three simultaneous Priority 900 (Urgent) alarms in the WWTP Primary area. All three alarms were in a TRIP condition and cleared shortly after, with the system returning to normal operation by approximately 2:02 AM.

#### Alarms Generated

### 2. Investigation & Analysis

#### 2.1 Alarm Detail Review

Detailed examination of the alarm properties for all three alarms revealed that each carried a Condition Quality of "Bad Quality – Communication Failure." This designation indicates that the SCADA system flagged the alarm data as unreliable due to a loss of communication, rather than representing a genuine field condition.

#### 2.2 Drive Fault Alarms (P2312 & P2322)

The Primary Sludge Pump 2 and Pump 4 drive fault alarms are triggered by a drive fault status bit communicated from the variable frequency drives (VFDs) to the PLC over Ethernet. When the SCADA system lost communication with the PLC, the drive fault status values defaulted to their fail-safe state, which the alarm system interpreted as active faults.
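The mechanism described in sections 2.1 and 2.2 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the site's SCADA configuration: the `TagRead` type, the `read_tag` and `naive_alarm_active` helpers, and the choice of `True` as the fail-safe value for a drive fault bit are all hypothetical. It shows how an evaluator that looks only at the value trips when a communication loss substitutes a fail-safe default, while the quality field still records that the data is unreliable.

```python
from dataclasses import dataclass
from typing import Optional

GOOD = "Good"
# Quality label copied from the report; how the real system encodes it is not known.
BAD_COMM = "Bad Quality – Communication Failure"


@dataclass(frozen=True)
class TagRead:
    value: bool
    quality: str


def read_tag(plc_value: Optional[bool], failsafe: bool) -> TagRead:
    """Return the PLC value when reachable, else the fail-safe default flagged as bad."""
    if plc_value is None:  # None models "SCADA could not reach the PLC"
        return TagRead(value=failsafe, quality=BAD_COMM)
    return TagRead(value=plc_value, quality=GOOD)


def naive_alarm_active(read: TagRead) -> bool:
    """Value-only evaluation: ignores quality, so a default looks like a real fault."""
    return read.value


# Hypothetical drive fault bit: healthy drive, first with and then without communication.
normal = read_tag(plc_value=False, failsafe=True)
comm_loss = read_tag(plc_value=None, failsafe=True)

print(naive_alarm_active(normal), normal.quality)
print(naive_alarm_active(comm_loss), comm_loss.quality)
```

In this sketch the second read raises the alarm even though the drive never faulted, which matches the pattern the report observed: a TRIP condition paired with a bad-quality flag.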

#### 2.3 Secondary Sump Low-Low Level Alarm (LT_3210)

The sump level transmitter (LT_3210) is a hardwired 4–20 mA analog instrument connected directly to the PLC input module. Under normal circumstances, a communication loss between SCADA and the PLC would not affect the actual field signal. However, during the communication disruption, the SCADA system was unable to read the level value from the PLC. The value momentarily defaulted to zero (or was flagged as bad quality) on the SCADA side, which triggered the Low-Low alarm. Trend data confirms the level returned to its normal value immediately after communication was restored, indicating the actual sump level never dropped.

#### 2.4 Hypotheses Considered & Eliminated

- **PLC Power Loss:** If the PLC had experienced a power interruption, all I/O values from that controller would have dropped to zero simultaneously. Review of trend data confirmed that other values associated with the same PLC remained stable throughout the event. This hypothesis was eliminated.
- **Ethernet Communication Loss (Drive Network Only):** A communication failure on the drive Ethernet network could explain the VFD fault alarms, as the drive fault status is transmitted over Ethernet. However, this would not account for the simultaneous level transmitter anomaly, since LT_3210 is hardwired to the PLC and does not rely on Ethernet communication. This hypothesis was eliminated as the sole cause.
- **SCADA-to-PLC Communication Loss:** A momentary loss of communication between the SCADA server and the PLC would explain all three alarms simultaneously. Both the drive fault status (read from the PLC, which receives it over Ethernet from the VFDs) and the sump level value (read from the PLC, which receives it via hardwired analog input) would become unavailable to SCADA at the same time. The Bad Quality – Communication Failure tag on all three alarms confirms this as the root cause.

### 3. Root Cause Determination

The root cause of all three alarms is a momentary communication loss between the SCADA system (Server: WWTP_AE) and the PLC controlling the WWTP Primary Sludge area. This communication disruption lasted only seconds and caused the SCADA system to temporarily lose visibility of process values from the PLC, resulting in spurious alarms as default/fail-safe values triggered alarm conditions.

### 4. Timeline of Events

### 5. Recommendations

- **Network Path Investigation:** Inspect the network infrastructure between the SCADA server (WWTP_AE) and the affected PLC, including managed switches, fiber connections, and any intermediate network devices, for signs of intermittent failures or errors.
- **Network Switch Log Review:** Review switch logs and port diagnostics for the timeframe of 1:50–2:05 AM on 3/2/2026 to identify any port flaps, CRC errors, or link state changes.
- **Alarm Configuration Review:** Consider implementing alarm suppression or shelving during communication quality events (Bad Quality) to prevent spurious alarms from generating operator distractions.

</details>
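The alarm configuration recommendation can be illustrated with a minimal Python sketch of quality-gated alarm evaluation. It is one possible approach under stated assumptions, not the configuration of the SCADA platform in use: `Sample`, `AlarmState`, `evaluate_low_low`, and the 10.0 setpoint are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class AlarmState(Enum):
    NORMAL = "normal"
    ACTIVE = "active"
    SUPPRESSED = "suppressed"  # limit violated, but the reading has bad quality


@dataclass(frozen=True)
class Sample:
    value: float
    good_quality: bool


def evaluate_low_low(sample: Sample, setpoint: float) -> AlarmState:
    """Raise a low-low alarm only when the violating value has good quality."""
    if sample.value > setpoint:
        return AlarmState.NORMAL
    if not sample.good_quality:
        # The value is likely a default substituted during a communication loss.
        return AlarmState.SUPPRESSED
    return AlarmState.ACTIVE


LOW_LOW_SETPOINT = 10.0  # hypothetical setpoint, in percent of span

# Hypothetical level readings: normal, comm-loss default, recovered, genuine low level.
readings = [
    Sample(55.0, True),
    Sample(0.0, False),
    Sample(55.0, True),
    Sample(5.0, True),
]
for reading in readings:
    print(reading, evaluate_low_low(reading, LOW_LOW_SETPOINT).value)
```

A design note on this sketch: suppressing process alarms on bad quality hides the symptom, so it would normally be paired with a separate communication-failure alarm so that operators still learn the data path was lost.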

Extracted Entities

| Type | Key | Value | Confidence | Evidence |
| --- | --- | --- | --- | --- |
| server | SCADA server hostname | WWTP_AE | 100% | communication loss between the SCADA system (Server: WWTP_AE) and the PLC |
| site | plant name | WWTP Primary Sludge System | 100% | alarm investigation report for the WWTP Primary Sludge System at SSLOCSD |
| task | Network Path Investigation | Inspect network infrastructure between SCADA server WWTP_AE and PLC | 90% | Inspect the network infrastructure between the SCADA server (WWTP_AE) and the affected PLC |
| task | Network Switch Log Review | Review switch logs and port diagnostics for 1:50–2:05 AM on 3/2/2026 | 90% | Review switch logs and port diagnostics for the timeframe of 1:50–2:05 AM on 3/2/2026 |
| task | Alarm Configuration Review | Implement alarm suppression or shelving during communication quality events | 90% | Consider implementing alarm suppression or shelving during communication quality events |
File: Clients/SSLOCSD/files/Comm Errors in Alarms/Alarm_Investigation_Report.docx.md
Updated: 2026-03-06 05:50:20.285015