[INSTRM-1050] DCB PDU 'off imme' command not echoed; dcb command fails Created: 12/Aug/20  Updated: 02/Dec/20  Resolved: 03/Nov/20

Status: Done
Project: Instrument control development
Component/s: ics_dcbActor, ics_enuActor
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Normal
Reporter: cloomis Assignee: cloomis
Resolution: Done Votes: 0
Labels: SPS
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Relates
relates to INSTRM-1049 DCB PDU off command silently failed Won't Fix
Sprint: SM1PD-2020 M
Reviewers: arnaud.lefur

 Description   

This has happened several times while running hexapodStability tests, which want a 12x12 grid of three argon=60 exposures. When turning the lamp off after the 60s, the PDU command sw o05 off imme is evidently not echoed, and the DCB command then fails out. As far as I can see, lamp control actually worked: the lamp was turned on, and after 60s the lamp was successfully turned off.

Note that there was reportedly an immediate empty response, with no timeout and no time elapsed. Something was awry.

Just to get some clarity and avoid missing data, I changed the encoding from 'utf8' to 'latin-1'. That did not fix the problem nor uncover any immediate hints on the particular command failure, but did expose the fact that we need to handle telnet negotiations correctly.
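
For reference, a minimal sketch of what handling the telnet negotiations can look like. This is not the code in ics_dcbActor; strip_telnet and decode_reply are hypothetical names, and the sketch assumes a complete reply buffer (it does not handle a telnet sequence split across two reads). It drops RFC 854 IAC sequences before decoding, and decodes as 'latin-1' because that codec maps every byte value to a character and so cannot raise on stray bytes the way 'utf8' can.

```python
# Telnet command bytes from RFC 854.
IAC = 255                            # "Interpret As Command" escape byte
DONT, DO, WONT, WILL = 254, 253, 252, 251
SB, SE = 250, 240                    # subnegotiation begin / end


def strip_telnet(data: bytes) -> bytes:
    """Return data with telnet IAC command sequences removed.

    Hypothetical helper: assumes `data` is a whole reply, not a partial read.
    """
    out = bytearray()
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b != IAC:
            out.append(b)            # ordinary payload byte
            i += 1
            continue
        if i + 1 >= n:
            break                    # dangling IAC at end of buffer: drop it
        cmd = data[i + 1]
        if cmd == IAC:
            out.append(IAC)          # IAC IAC is an escaped literal 0xFF
            i += 2
        elif cmd in (DO, DONT, WILL, WONT):
            i += 3                   # IAC <verb> <option>
        elif cmd == SB:
            # Skip to the closing IAC SE; drop the rest if it is missing.
            end = data.find(bytes([IAC, SE]), i + 2)
            i = n if end < 0 else end + 2
        else:
            i += 2                   # other two-byte commands (NOP, GA, ...)
    return bytes(out)


def decode_reply(raw: bytes) -> str:
    """Filter telnet codes, then decode without risk of a decode error."""
    return strip_telnet(raw).decode('latin-1')
```

With this, a reply such as b'\xff\xfb\x01sw o05 off imme\r\n' yields just the echoed command text, so an echo comparison is not thrown off by injected negotiation bytes.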

I added a telnet protocol filter, which will certainly fix another problem this uncovered.
Also added a spin-until-off step like the existing spin-until-on step. This helps, but also highlights problems with variable PDU overheads.
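
The spin-until-off idea can be sketched roughly as below. This is an illustration only, not the actor's implementation: spin_until, read_state, and the default timeout and poll interval are all assumptions. The function polls a caller-supplied state reader until the outlet reports the wanted state, and returns the elapsed time so that the variable PDU overhead can be logged.

```python
import time


def spin_until(read_state, wanted, timeout=10.0, poll_interval=0.1):
    """Poll read_state() until it returns `wanted`; return seconds elapsed.

    Hypothetical helper. read_state is any callable that reports the
    current outlet state (for example 'on' or 'off').
    Raises TimeoutError if the state is not reached within `timeout`.
    """
    start = time.monotonic()
    while True:
        state = read_state()
        elapsed = time.monotonic() - start
        if state == wanted:
            return elapsed
        if elapsed >= timeout:
            raise TimeoutError(
                f'outlet still {state!r} after {elapsed:.2f}s, wanted {wanted!r}')
        time.sleep(poll_interval)
```

Returning the measured time rather than a bare success flag makes it possible to record how long each on or off transition actually took.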

I am pretty sure that exposures at/near 60s are a problem, having to do with the PDU connection closing silently just before the off imme must be sent. Looking more closely, it looks like a broken pipe, after which a new telnet session is needed.
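
One way to cope with a silently closed session is to reconnect and resend when the write fails. The sketch below is an assumption about how that could look, not what was implemented: send_with_reconnect and the connect callable are hypothetical, and conn is anything with socket-like sendall() and close() methods. Resending is only reasonable for commands that are safe to repeat, such as turning an outlet off.

```python
def send_with_reconnect(conn, connect, payload, retries=1):
    """Send payload on conn; on a broken pipe, reconnect and try again.

    Hypothetical helper. `connect` is a zero-argument callable returning a
    fresh connection. Returns the connection that finally carried the
    payload, so the caller can keep using it.
    """
    for attempt in range(retries + 1):
        try:
            conn.sendall(payload)
            return conn
        except (BrokenPipeError, ConnectionResetError):
            if attempt == retries:
                raise                # out of retries: let the caller see it
            try:
                conn.close()         # discard the dead session
            except OSError:
                pass
            conn = connect()         # open a new telnet session
```

Note that a write to a connection the peer has closed may not fail immediately, so an empty or missing reply should probably also be treated as a reason to reconnect.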

Following that and including the other bug fixes, I am restarting with argon=50 instead of argon=60.

I think the Bug part of this ticket is taken care of; I'll chat with arnaud.lefur about the fixes and about what other tickets to open.

As an aside, this PDU is one of the nastiest things I have ever worked with. Besides using real RFC854 telnet and injecting spurious telnet codes into its output, it takes quite variable and sometimes surprisingly long times to turn an outlet on or off. This is a problem if we really want to use lamp-controlled exposures, and certainly if we want to control several lamps in one exposure.



 Comments   
Comment by cloomis [ 14/Aug/20 ]

I think this fixes:

  • problems from the telnet junk
  • the worst of the problems when using multiple lamps.

An administrative fix, extending the auto-logout time on the PDU, will ameliorate the 60s issue.

Comment by cloomis [ 03/Nov/20 ]

Whatever has been done and is in use is all that will be done on this. The new PDU obviates all this.

Generated at Sat Feb 10 16:31:22 JST 2024 using Jira 8.3.4#803005-sha1:1f96e09b3c60279a408a2ae47be3c745f571388b.