Clients/Laguna/Plant/slack/2026/04/2026-04-21_laguna.md

Source: slack
Chunks: 4
Entities: 15
Type: Doc
Content

# #laguna — 2026-04-21

**10:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776791597514509):** ![[F0AU909NS2H_IMG_1380.jpg]] ![[F0ATWUET4AK_IMG_1379.jpg]] ![[F0AV6LBPP4G_IMG_1378.jpg]] ![[F0AUAAKLAP8_IMG_1377.jpg]]

**10:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776791605688999):** @Mason Radke

**10:14 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776791653107719):** something isnt happy. so far the logs are being translated as a download to a PLC

**10:14 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776791657107709):** that is breaking things.

**10:14 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776791659535859):** still digging

**10:16 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776791797178479):** It wasn't Cannon per David at Cannon

**10:16 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776791817085909):** ok

**10:17 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776791864653789):** Aerzen has remote access. I'm having Jerry check with them next

**10:18 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776791887892489):** im trying to find out which plc and tags are causing the problems

**10:19 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776791963859109):** which VM is GR? that first picture says it's almost out of memory

**10:24 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776792274740209):** just coming into this blind, but I find it interesting these APPSRVR redundant DI objects. there are 720 errors on APPSRVR1 for the UF plc (I'm just looking at the pictures he sent)

**10:24 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776792293475089):** yes.
**10:25 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776792304680809):** things are not happy

**10:25 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776792306328219):** something changed

**10:26 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776792384632669):** checking with Jesse to see if there were any power outages or process hiccups yesterday

**10:36 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776793011867319):** none that they are aware of

**10:37 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776793024870209):** its a mess

**10:37 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776793053662809):** be prepared for me to not have enough time to finish this

**10:40 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776793253946019):** [[F0AURFN1857_AVEVA_SCADA_Information_Request.docx]] [[F0AURFNT1U1_AVEVA_SCADA_Customer_Explanation.docx]] [[F0AU96JGEER_AVEVA_SCADA_Troubleshooting_Guide.docx]]

**10:40 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776793256779609):** if you want to start reading up

**11:01 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776794474442249):** I will reach out to David at Cannon if you can't finish. I'm of no use with this system.
**11:02 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776794520761109):** i just hope he doesnt try to make it sound like its something we did

**11:03 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776794582023809):** i may have found it

**11:03 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776794585508439):** fingers crossed

**11:03 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776794586419469):** standby

**11:03 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776794614436309):** :hand_with_index_finger_and_thumb_crossed:

**11:09 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776794987458119):** can you have them close/reopen the clients

**11:09 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776794993407289):** and check trending

**11:09 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776794998567589):** standby

**11:10 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795022418679):** just close/open not reboot?

**11:10 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795031477219):** might as well reboot.

**11:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795212524209):** trying to reach someone. neither jesse nor richardo answered. texted jerry

**11:15 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795300284919):** this AVEVA_SCADA_Information_Request doc is interesting. Is this Claude requesting all this or is it AVEVA

**11:15 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795336564239):** No I’m directing it

**11:16 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795391128129):** so you are asking for these things?

**11:16 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795415272699):** standby im trying to get this done before i have to leave

**11:17 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795432729509):** yep

**11:18 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795522959179):** findings are changing as we get deeper into this.. maybe disregard those documents until i can get a final version together

**11:18 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795533306189):** copy

**11:21 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795661359109):** think it's the redundant DH+ servers causing this?

**11:21 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795673193339):** i wont know until its fixed

**11:21 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795676547029):** copy

**11:21 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795687324889):** just throwing out ideas

**11:21 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795689948779):** now things are pointing to a windows update crash and loop

**11:21 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795703404229):** but i just dont know.. there are so many symptoms

**11:21 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795717016949):** how can windows update? there's no internet

**11:23 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795780544539):** showing a windows installer package running on both machines. started at the same time.

**11:23 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795782436959):** i dont know

**11:23 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795785363999):** feels like some sort of update

**11:23 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776795795180759):** hmmm weird

**11:24 [Reno Hiltermann](https://slack.com/archives/C08JS6KDLBD/p1776795851794929):** Any chance someone used a wifi dongle and phone hotspot? Then Microsoft implemented the malware :sweat_smile:

**11:25 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795900457449):** if our first hack was on our most secure client im gonna scream

**11:25 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776795922327519):** oh.. haha. i get it

**11:25 [Reno Hiltermann](https://slack.com/archives/C08JS6KDLBD/p1776795946827449):** Yes. Microsoft knows all.

**11:28 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776796133924219):** when you say it's running on both machines, are these the servers or the workstations

**11:28 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776796133994429):**

```
Bad news: the reboot did not fix it. Both servers are still crash-looping at nearly the same rate.

What the logs show

APP-SRVR1 (57 minutes since the reboot):
- 146 crashes already — ABCIP.exe × 40, MSIBC81.tmp × 5, plus the same cascade as before
- First crash came 13 minutes after the server booted — identical pattern to Apr 19

APP-SRVR2 (last 24 hours):
- Still crashing, most recent crash at 18:08 UTC today (17 min before you exported) — GDIWebServer × 277, mmc.exe × 184, aaCALWrapperEx × 184 in the last 24h. Same storm.

Reboot alone doesn't clear this because something is re-launching MSIBC81.tmp within minutes of boot. That "something" is almost always one of:
1. A Windows service set to Auto-start that hosts the installer
2. A Scheduled Task that runs every few minutes
3. A Group Policy software-installation assignment that retries on every policy refresh
4. A stuck Windows Installer rollback that the MSI service keeps retrying
```

**11:29 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776796146180369):** reading

**11:32 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776796344199559):** are you wrapping this up for hand-off?

**11:32 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776796361064649):** im still trying to figure out whats going on

**11:32 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776796372946079):** but i guess i better wrap it up for hand off

**11:39 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776796777380329):** thanks for giving it your all

**11:40 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776796814714579):** they dont need this platform.. its for like running a city

**11:40 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776796821803359):** they need ignition

**11:42 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776796972774229):** of course it's overkill. it was built by engineers

**11:49 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776797367555349):** [[F0AUD432W4A_AVEVA_SCADA_Handoff_Summary.docx]] [[F0AUB29DLP8_AVEVA_SCADA_Handoff_Technical.docx]]

**11:52 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776797552692639):** OK. I gotta jam.

**11:52 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776797565196739):** thanks Mason.

**12:15 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776798956671179):** David says the background info is very good and he agrees that a repair install is likely needed.

**12:16 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776799006756849):** I see "redundant" servers as a problem though.

**12:16 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799012938379):** That’s cool! I’m glad the work I did was helpful

**12:17 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799041656049):** It’s very much designed for redundant servers

**12:17 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799052983239):** How do you feel that’s causing issues?
**12:17 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776799069647129):** what's the point of a redundant server if they are both going to fail at the same time

**12:18 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799082602179):** I believe they’re on different hardware

**12:18 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776799109379889):** yes indeed, but if the software borks on both then worthless

**12:19 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799177900229):** Yeah, well the Software is a seven headed behemoth. But if a server fails, the other one should pick up

**12:20 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776799243405499):** funny how Wonderware was #1, then Rockwell came along and wanted to be like them, now no one wants either because they are so full of problems

**12:21 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799266210409):** Feels like such a mess in there, trying to troubleshoot these things

**12:21 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776799276939159):** So many dependencies so many moving parts

**13:58 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776805120016189):** David says the system is back to normal.

**14:13 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776806026741159):** Good!

**14:14 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776806054356849):** see if you can get some notes or details on what he did to stop the sync and restart things, etc.

**14:14 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776806095144759):** copy. I asked him for a service report. we'll see how detailed he'll be

**15:01 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776808903285079):** service report from David at Cannon:

*Service Report – SCADA Application Server Instability*

*Date:* April 21, 2026
*System:* AVEVA System Platform (Galaxy: LCSD)
*Servers:* APP-SRVR1, APP-SRVR2

*Issue Summary*
Operators reported slow or unresponsive SCADA graphics and trend displays beginning April 20, 2026. Investigation identified that APP-SRVR1 was experiencing a repeated application engine restart condition. One AppEngine was continuously cycling between _Syncing_ and _Stopped_, causing repeated reinitialization of PLC subscriptions and Historian client connections. This resulted in delayed graphic loading, intermittent blank trend data, and overall system sluggishness.

*Findings*
Log review indicated a corrupted AVEVA System Platform component on APP-SRVR1 triggering a Windows Installer self-repair loop. The repair process repeatedly failed, causing the affected AppEngine to continuously restart and preventing stable object hosting. This created excessive resubscription traffic to PLC data sources and repeated Historian client reconnections, degrading system performance. APP-SRVR1 was isolated from the Galaxy, allowing APP-SRVR2 to temporarily operate the SCADA system independently. An AVEVA Change Installer repair was then performed on APP-SRVR1 to correct the corrupted component installation.

*Resolution*
The repair installation completed successfully. Application engines on APP-SRVR1 were brought back online sequentially and verified stable. Object hosting was transferred back to the primary server without recurrence of the synchronization issue. Historian trending and client performance returned to normal.

*Current Status*
SCADA system operating normally with full redundancy restored between APP-SRVR1 and APP-SRVR2. No further application engine instability observed.
**15:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776809583349949):** The MBR process data is not responding. PLC (addr 30) and HMI (addr 31) show red X in RSLinx from old SCADA machine. Jesse reports the process is running and the HMI is responsive.

**15:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776809592947559):** This is on the DH+ network

**15:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776809603445049):** Will go out tomorrow to investigate

**15:21 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776810087075149):** the DH+ MSG from Chem_Rm plc to MBR is failed as well.

**15:25 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776810308958909):** @AutoBot site visit for Kevin to Laguna Sanitation Investigate comm loss to MBR PLC 4/22 at 0800

**15:25 [AutoBot](https://slack.com/archives/C08JS6KDLBD/p1776810314694389):** :white_check_mark: Added: *On Site: Laguna County Sanitation — Kevin to Laguna Sanitation Investigate comm loss to MBR PLC 4/22*, Wed Apr 22, 8:00 AM – 9:00 AM @ Laguna County Sanitation
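The log summary Mason pasted tallies crashes per faulting process (ABCIP.exe × 40, GDIWebServer × 277, and so on). As a minimal sketch of how such a tally can be produced from an exported crash-event log: the CSV layout, column names, and sample rows below are illustrative assumptions, not taken from the actual APP-SRVR exports.

```python
from collections import Counter
import csv
import io

# Hypothetical excerpt of a crash-event export. A real Event Viewer /
# Get-WinEvent export would carry many more fields; only the faulting
# application column matters for the tally.
SAMPLE_CSV = """TimeCreated,Source,FaultingApplication
2026-04-21T17:40:00Z,Application Error,ABCIP.exe
2026-04-21T17:41:12Z,Application Error,ABCIP.exe
2026-04-21T17:42:30Z,Application Error,MSIBC81.tmp
2026-04-21T17:43:05Z,Application Error,GDIWebServer
2026-04-21T17:44:20Z,Application Error,ABCIP.exe
"""

def crash_tally(csv_text):
    """Count crash events per faulting application, most frequent first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["FaultingApplication"] for row in rows).most_common()

if __name__ == "__main__":
    for app, count in crash_tally(SAMPLE_CSV):
        print(f"{app} x {count}")
```

A spike concentrated in a single installer-related process (here MSIBC81.tmp) shortly after each boot is what points at the self-repair loop rather than at the SCADA applications themselves.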

Extracted Entities

| Type | Key | Value | Confidence | Evidence |
| --- | --- | --- | --- | --- |
| contact | Person | Kevin | 100% | 10:13 [Kevin](https://slack.com/archives/C08JS6KDLBD/p1776791597514509) |
| contact | Person | Mason Radke | 100% | 10:14 [Mason Radke](https://slack.com/archives/C08JS6KDLBD/p1776791653107719) |
| contact | Person | David at Cannon | 90% | It wasn't Cannon per David at Cannon |
| contact | Person | Jerry | 80% | I'm having Jerry check with them next |
| contact | Person | Jesse | 80% | checking with Jesse to see if there were any power outages |
| contact | Person | Richardo | 70% | neither jesse nor richardo answered |
| server | Server Name | APP-SRVR2 | 100% | APP-SRVR2 (last 24 hours): - Still crashing |
| server | Server Name | APP-SRVR1 | 100% | APP-SRVR1 (57 minutes since the reboot): - 146 crashes already |
| server | Server Name | APPSRVR2 | 90% | APP-SRVR2 (last 24 hours): Still crashing |
| server | Server Name | APPSRVR1 | 90% | there are 720 errors on APPSRVR1 for the UF plc |
| site | Client Name | Laguna | 100% | Client: Laguna |
| system | Product | AVEVA SCADA | 100% | AVEVA_SCADA_Information_Request.doc |
| system | Product | Ignition | 90% | they need ignition |
| task | Action Item | Wrap up troubleshooting and prepare for hand-off | 90% | im still trying to figure out whats going on but i guess i better wrap it up for hand off |
| task | Action Item | Check if clients can close/reopen or reboot to fix trending issues | 80% | can you have them close/reopen the clients and check trending |
File: Clients/Laguna/Plant/slack/2026/04/2026-04-21_laguna.md
Updated: 2026-04-21 22:30:10.011441