[#INSTRM-790] ion pumps communication issue

[INSTRM-790] ion pumps communication issue Created: 07/Oct/19 Updated: 17/Oct/19
Status:	Open
Project:	Instrument control development
Component/s:	ics_xcuActor
Affects Version/s:	None
Fix Version/s:	None

Type:	Bug
Priority:	Normal
Reporter:	fmadec
Assignee:	cloomis
Resolution:	Unresolved
Votes:
Labels:	SPS
Remaining Estimate:	Not Specified
Time Spent:	Not Specified
Original Estimate:	Not Specified

Issue Links:

Relates
relates to	~~INSTRM-791~~	For a single SM, add either an ionpum...	Done

Sprint:	SM1-2019 P

Description

This is not new, but it happened often this time.

It can happen when the ion pumps are started, as shown below, or during operation, from the monitoring.

It would be "nice" to have a way to handle that before starting operation at Subaru.

2019-10-03 15:23:19 xcu_r1 w text="failed to create connect or send to ion pump: [Errno 111] Connection refused"
2019-10-03 15:23:19 xcu_r1 f text="command failed: ConnectionRefusedError(111, 'Connection refused') in sendOneCommand() at /software/mhs/products/Linux64/ics_xcuActor/1.11.1/python/xcuActor/Controllers/ionpump.py:65"

Comments

Comment by cloomis [ 08/Oct/19 ]

The chained pair of ionpump controllers that controls all six ion pumps for the three cryostats is accessible as two RS-485 nodes behind a single MOXA RS-485 port. This cluster of things is commanded from three independent actors, which cannot really be reliable. One of the earlier tickets discussing ion pump communication problems suggested having either an ionpumpActor or at least a single program to sequence all commands to the controllers. The traffic is very simple, so either seems reasonable. Will think about which is best.
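As a rough illustration of the "single program to sequence all commands" idea, the sketch below funnels every command through one lock so that only one caller talks to the shared port at a time. Everything here is hypothetical: `CommandSequencer`, `FakeSharedPort`, and `run_demo` are invented names, and the fake port merely imitates a link that refuses a second simultaneous connection. It is not the ics_xcuActor code or the MOXA behaviour.

```python
import threading
import time


class CommandSequencer:
    """Serialize commands from several callers onto one shared link."""

    def __init__(self, transport):
        # transport: any callable taking a command string and returning a reply
        # (hypothetical stand-in for the real connect/send/close cycle).
        self._transport = transport
        self._lock = threading.Lock()

    def send(self, cmd):
        # Only one caller at a time may use the shared port.
        with self._lock:
            return self._transport(cmd)


class FakeSharedPort:
    """Stand-in for a single-connection port: refuses overlapping use."""

    def __init__(self):
        self._busy = threading.Lock()
        self._stats = threading.Lock()
        self.served = 0
        self.refused = 0

    def __call__(self, cmd):
        if not self._busy.acquire(blocking=False):
            with self._stats:
                self.refused += 1
            raise ConnectionRefusedError(111, "Connection refused")
        try:
            time.sleep(0.001)  # pretend the exchange takes a moment
            with self._stats:
                self.served += 1
            return "ok " + cmd
        finally:
            self._busy.release()


def run_demo(n_actors=3, n_cmds=5):
    """Three 'actors' send status commands through one sequencer."""
    port = FakeSharedPort()
    sequencer = CommandSequencer(port)

    def worker(name):
        for i in range(n_cmds):
            sequencer.send(f"{name} status {i}")

    threads = [threading.Thread(target=worker, args=(f"actor{k}",))
               for k in range(n_actors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return port.served, port.refused
```

Because all callers share one lock, the fake port never sees two connections at once. In a real deployment the three actors are separate processes, so the sequencing would have to live in one place they all talk to, which is the ionpumpActor option mentioned above.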

I looked more carefully at the MOXA configuration options, and do not believe we can use it to sequence TCP connections: MOXAs can allow multiple concurrent connections, but that would be much worse.

An inventory of all the Connection refused failures in 2019 does suggest that simply retrying after some small random delay would significantly improve matters. But that cannot entirely fix the real problem.
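A minimal sketch of "retrying after some small random delay" could look like the following. The function name `send_with_retry`, its parameters, and the default values are assumptions made for illustration; `send` stands for whatever callable actually opens the connection and sends one command.

```python
import random
import time


def send_with_retry(send, cmd, retries=3, max_delay=0.5,
                    sleep=time.sleep, rng=random.random):
    """Call send(cmd), retrying refused connections after a random delay.

    send      -- hypothetical callable that connects, sends cmd, returns a reply
    retries   -- number of extra attempts after the first one
    max_delay -- upper bound, in seconds, of the random wait between attempts
    """
    for attempt in range(retries + 1):
        try:
            return send(cmd)
        except ConnectionRefusedError:
            if attempt == retries:
                # Out of attempts: let the caller see the failure.
                raise
            # Random wait so that two colliding callers are unlikely
            # to collide again on the next attempt.
            sleep(rng() * max_delay)
```

The `sleep` and `rng` arguments exist only so the behaviour can be checked without real waiting. As the comment notes, this reduces collisions but does not remove the underlying contention for the shared port.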

For r1 and b1 together, 2019 showed 1_260_000 commands, all but 49 of them status commands; 11_000 of those connections were refused. All but 136 of the refused connections occurred when the periodic status queries for the two cryostats synced up (07-1* and 09-1*). The periodic tasks are re-scheduled based on the system clock and not from the end of a command, so it is very easy to get locked into a hopeless failure. That is a different ticket, but naively retrying to fix this ticket's problems would probably make things worse without addressing that scheduling problem. I'd rather add a new actor than change that.
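To make the scheduling point concrete, here is a small sketch contrasting the two policies. Both functions are hypothetical and only model the idea; how the actors actually compute their next wake-up time is not shown in this ticket.

```python
import math


def next_clock_aligned(now, period):
    """Next multiple of `period` strictly after `now` (system-clock based)."""
    return (math.floor(now / period) + 1) * period


def next_after_completion(finished_at, period):
    """Next run measured from the moment the previous command finished."""
    return finished_at + period


# Two status queries with the same 10 s period whose last commands
# ended at slightly different times (illustrative numbers only).
period = 10
a_done, b_done = 100.3, 100.7

# Clock-aligned: both wake at the same instant again, so once the two
# queries have synced up they collide on every cycle.
aligned = (next_clock_aligned(a_done, period),
           next_clock_aligned(b_done, period))

# Completion-based: whatever offset the previous commands had is kept.
relative = (next_after_completion(a_done, period),
            next_after_completion(b_done, period))
```

With clock alignment both entries of `aligned` are the same time, which is the locked-in collision described above. With completion-based scheduling the two entries of `relative` stay apart by the same offset the commands finished with.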

Generated at Thu Jul 03 12:05:02 JST 2025 using Jira 8.3.4#803005-sha1:1f96e09b3c60279a408a2ae47be3c745f571388b.
