
Clients/SSLOCSD/emails/.raw/2026/march/RE_Sludge_Pump_Alarm_Investigation_Report.md

Source: gmail
Chunks: 3
Entities: 13
Type: Doc

Content

Good Morning Mason,

Thank you for the detailed analysis of the alarm/callout, the layout of the report made it very easy to digest the information.

One of the recommendations was alarm configuration review. Would a slightly longer delay in alarm conditions address this type of issue while still creating the alarm if the communication error persists? If this is a viable solution, would we want to review the system as a whole or address issues as they arise? I would be hesitant to automatically shelve alarms but maybe a system wide review would be valuable.

Kind regards,
Michael Arias

Michael J. Arias
Operations Supervisor
Grade III Operator
South San Luis Obispo County Sanitation District
1600 Aloha Place, Oceano, Ca.
805-489-6666

From: Mason Radke <mason@autosysnet.com>
Sent: Tuesday, March 3, 2026 8:12 PM
To: Mike Arias <Arias@sslocsd.us>; Kevin Seifert <kevin@autosysnet.com>
Subject: Sludge Pump Alarm Investigation Report

ALARM INVESTIGATION REPORT
WWTP Primary Sludge System - Spurious Alarm Event

Date of Event: March 2, 2026
Time of Event: 1:53:18 AM
System: WWTP Primary Sludge Pumping
Server: WWTP_AE
Area: WWTP.Primary (RNA://SGlobal/SSLOCSD_HMI)
Prepared By: Mason

### 1. Incident Summary

At 1:53:18 AM on March 2, 2026, the SCADA system generated three simultaneous Priority 900 (Urgent) alarms in the WWTP Primary area. All three alarms were in a TRIP condition and cleared shortly after, with the system returning to normal operation by approximately 2:02 AM.

Alarms Generated

| Alarm Tag | Description | Alarm Class | Condition Quality |
| --- | --- | --- | --- |
| P2312_Alm_DriveFault | Primary Sludge Pump 2 Drive Fault | P_PF52x | Bad Quality - Communication Failure |
| P2322_Alm_DriveFault | Primary Sludge Pump 4 Drive Fault | P_PF52x | Bad Quality - Communication Failure |
| LT_3210_Alm_LoLo | Secondary Sump Low-Low Alarm | P_PF52x | Bad Quality - Communication Failure |

### 2. Investigation & Analysis

#### 2.1 Alarm Detail Review

Detailed examination of the alarm properties for all three alarms revealed that each carried a Condition Quality of "Bad Quality - Communication Failure." This designation indicates that the SCADA system flagged the alarm data as unreliable due to a loss of communication, rather than representing a genuine field condition.

#### 2.2 Drive Fault Alarms (P2312 & P2322)

The Primary Sludge Pump 2 and Pump 4 drive fault alarms are triggered by a drive fault status bit communicated from the variable frequency drives (VFDs) to the PLC over Ethernet. When the SCADA system lost communication with the PLC, the drive fault status values defaulted to their fail-safe state, which the alarm system interpreted as active faults.

#### 2.3 Secondary Sump Low-Low Level Alarm (LT_3210)

The sump level transmitter (LT_3210) is a hardwired 4-20 mA analog instrument connected directly to the PLC input module. Under normal circumstances, a communication loss between SCADA and the PLC would not affect the actual field signal. However, during the communication disruption, the SCADA system was unable to read the level value from the PLC. The value momentarily defaulted to zero (or was flagged as bad quality) on the SCADA side, which triggered the Low-Low alarm. Trend data confirms the level returned to its normal value immediately after communication was restored, indicating the actual sump level never dropped.

#### 2.4 Hypotheses Considered & Eliminated

1. PLC Power Loss: If the PLC had experienced a power interruption, all I/O values from that controller would have dropped to zero simultaneously. Review of trend data confirmed that other values associated with the same PLC remained stable throughout the event. This hypothesis was eliminated.
2. Ethernet Communication Loss (Drive Network Only): A communication failure on the drive Ethernet network could explain the VFD fault alarms, as the drive fault status is transmitted over Ethernet. However, this would not account for the simultaneous level transmitter anomaly, since LT_3210 is hardwired to the PLC and does not rely on Ethernet communication. This hypothesis was eliminated as the sole cause.
3. SCADA-to-PLC Communication Loss: A momentary loss of communication between the SCADA server and the PLC would explain all three alarms simultaneously. Both the drive fault status (read from the PLC, which receives it over Ethernet from the VFDs) and the sump level value (read from the PLC, which receives it via hardwired analog input) would become unavailable to SCADA at the same time. The Bad Quality - Communication Failure tag on all three alarms confirms this as the root cause.

### 3. Root Cause Determination

The root cause of all three alarms is a momentary communication loss between the SCADA system (Server: WWTP_AE) and the PLC controlling the WWTP Primary Sludge area. This communication disruption lasted only seconds and caused the SCADA system to temporarily lose visibility of process values from the PLC, resulting in spurious alarms as default/fail-safe values triggered alarm conditions.

### 4. Timeline of Events

| Time | Event |
| --- | --- |
| 1:53:18 AM | SCADA-PLC communication loss occurs. Three alarms triggered simultaneously: P2312 Drive Fault, P2322 Drive Fault, LT_3210 Low-Low Level. All tagged Bad Quality - Communication Failure. |
| 1:55:35 AM | Secondary Sump Low-Low alarm acknowledged by operator. |
| 1:55:52 AM | Pump 4 Drive Fault acknowledged. |
| 1:56:09 AM | Pump 2 Drive Fault acknowledged. |
| 2:00:17 AM | CCT Mid Channel ORP Low Alarm triggered (P_AE110B_ORP, Val=455.4). Separate issue, unrelated to communication event. |
| ~2:01-2:02 AM | All drive fault and sump level alarms clear. Communication restored. System returns to normal. |

### 5. Recommendations

1. Network Path Investigation: Inspect the network infrastructure between the SCADA server (WWTP_AE) and the affected PLC, including managed switches, fiber connections, and any intermediate network devices, for signs of intermittent failures or errors.
2. Network Switch Log Review: Review switch logs and port diagnostics for the timeframe of 1:50-2:05 AM on 3/2/2026 to identify any port flaps, CRC errors, or link state changes.
3. Alarm Configuration Review: Consider implementing alarm suppression or shelving during communication quality events (Bad Quality) to prevent spurious alarms from generating operator distractions.

-Mason Radke
-Autosys, LLC

## Attachments

- ![[20260304_19cb9aaaf812_image001.png]]
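
The longer-delay idea raised in the reply at the top of this thread can be illustrated with a minimal on-delay filter. This is only a sketch under stated assumptions, not the SCADA system's actual alarm configuration: the `OnDelayAlarm` class, the `first_annunciation` helper, the 1-second scan interval, the 30-second delay, and both test durations are hypothetical and are not taken from the report. It shows that a short dropout never annunciates, while a condition that persists past the delay still does.

```python
from typing import Iterable, Optional, Tuple


class OnDelayAlarm:
    """Annunciate only after the condition has been continuously active for delay_s."""

    def __init__(self, delay_s: float) -> None:
        self.delay_s = delay_s  # hypothetical setting, not a value from the report
        self._active_since: Optional[float] = None

    def update(self, t: float, condition_active: bool) -> bool:
        """Feed one scan; return True if the alarm should be annunciated at time t."""
        if not condition_active:
            # Condition cleared: reset the timer so a later event starts fresh.
            self._active_since = None
            return False
        if self._active_since is None:
            self._active_since = t
        return t - self._active_since >= self.delay_s


def first_annunciation(
    samples: Iterable[Tuple[float, bool]], delay_s: float
) -> Optional[float]:
    """Return the time of the first annunciation, or None if it never fires."""
    alarm = OnDelayAlarm(delay_s)
    for t, active in samples:
        if alarm.update(t, active):
            return t
    return None


# Hypothetical 1-second scans: a 5 s dropout and a 60 s outage, both starting at t = 10 s.
blip = [(float(t), 10 <= t < 15) for t in range(120)]
outage = [(float(t), 10 <= t < 70) for t in range(120)]

print(first_annunciation(blip, delay_s=30.0))    # None: dropout is shorter than the delay
print(first_annunciation(outage, delay_s=30.0))  # 40.0: condition persisted for 30 s
```

The trade-off in this sketch is that every alarm using the filter, including a genuine one, is annunciated `delay_s` later, so any delay would need to be chosen per alarm against how quickly an operator must respond.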

Extracted Entities

| Type | Key | Value | Confidence | Evidence |
| --- | --- | --- | --- | --- |
| contact | Operations Supervisor | Michael J. Arias | 100% | Michael J. Arias Operations Supervisor Grade III Operator |
| contact | Mason Radke Email | mason@autosysnet.com | 100% | From: Mason Radke <mason@autosysnet.com> |
| contact | Mike Arias Email | Arias@sslocsd.us | 100% | To: Mike Arias <Arias@sslocsd.us> |
| contact | Kevin Seifert Email | kevin@autosysnet.com | 100% | To: Kevin Seifert <kevin@autosysnet.com> |
| contact | Operations Supervisor Phone | 805-489-6666 | 100% | 805-489-6666 |
| server | SCADA Server | WWTP_AE | 100% | Server: WWTP_AE |
| site | Client Address | 1600 Aloha Place, Oceano, Ca. | 100% | 1600 Aloha Place, Oceano, Ca. |
| site | Client Name | South San Luis Obispo County Sanitation District | 100% | South San Luis Obispo County Sanitation District |
| site | Area | WWTP.Primary | 100% | Area: WWTP.Primary (RNA://SGlobal/SSLOCSD_HMI) |
| system | SCADA System | WWTP Primary Sludge Pumping | 100% | System: WWTP Primary Sludge Pumping |
| task | Network Path Investigation | Inspect network infrastructure between SCADA server WWTP_AE and PLC | 90% | Network Path Investigation: Inspect the network infrastructure between the SCADA server (WWTP_AE) and the affected PLC |
| task | Network Switch Log Review | Review switch logs and port diagnostics for 1:50-2:05 AM on 3/2/2026 | 90% | Network Switch Log Review: Review switch logs and port diagnostics for the timeframe of 1:50-2:05 AM on 3/2/2026 |
| task | Alarm Configuration Review | Consider alarm suppression or shelving during communication quality events | 90% | Alarm Configuration Review: Consider implementing alarm suppression or shelving during communication quality events |

File: Clients/SSLOCSD/emails/.raw/2026/march/RE_Sludge_Pump_Alarm_Investigation_Report.md
Updated: 2026-03-11 23:56:54.198918