[INSTRM-2046] Add tron watchdog Created: 27/Jul/23 Updated: 10/Nov/23 |
|
| Status: | Open |
| Project: | Instrument control development |
| Component/s: | tron_tron |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Normal |
| Reporter: | cloomis | Assignee: | cloomis |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||
| Description |
|
tron_tron stopped, which it essentially never does. When it was restarted, the actors reconnected as we would have wished. But the hub could simply have been restarted automatically. Write some cron/at/whatever script that looks for a missing runhub.py process and starts the hub if it has been down for a minute or two. |
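A minimal sketch of what such a watchdog could look like, meant to be run once a minute from cron. Only the runhub.py process name comes from this ticket. START_CMD is a placeholder (the real hub start command is not given here), and the marker-file path, the 90 s grace period, and the function names are all assumptions for illustration.

```python
"""Hypothetical watchdog sketch for the tron hub; run once per minute from cron."""
import os
import subprocess
import time

PATTERN = "runhub.py"               # process to look for (named in this ticket)
START_CMD = ["/path/to/start-hub"]  # placeholder: the real start command is not given here
STAMP = "/tmp/tron_watchdog.down"   # assumed marker file: when the hub was first seen down
GRACE_SECONDS = 90                  # "a minute or two"


def hub_running(pattern=PATTERN):
    """Return True if some process command line matches pattern (via pgrep -f)."""
    result = subprocess.run(["pgrep", "-f", pattern], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def check(running, now, stamp=STAMP, grace=GRACE_SECONDS):
    """Decide what one tick should do; returns "ok", "waiting", or "restart"."""
    if running:
        # Hub is up: forget any earlier outage.
        if os.path.exists(stamp):
            os.remove(stamp)
        return "ok"
    if not os.path.exists(stamp):
        # First tick that sees the hub down: record the time and wait.
        with open(stamp, "w") as f:
            f.write(str(now))
        return "waiting"
    with open(stamp) as f:
        down_since = float(f.read())
    if now - down_since < grace:
        return "waiting"
    # Down for longer than the grace period: clear the marker and restart.
    os.remove(stamp)
    return "restart"


def main():
    """One watchdog tick: start the hub if it has been missing long enough."""
    if check(hub_running(), time.time()) == "restart":
        subprocess.Popen(START_CMD)
```

Calling main() from a per-minute cron job would give the "down for a minute or two" behaviour. The marker file keeps the script stateless between runs, so a hub that is merely mid-restart is not started twice.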
| Comments |
| Comment by cloomis [ 10/Nov/23 ] |
|
tron_actorcore does use a Twisted ReconnectingClientFactory, but that seems to stop trying after a while. There are several knobs for delays, retries, and limits; those need to be checked. |
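To get a feel for those knobs, here is a rough pure-Python model of the retry delays. It does not import Twisted. The parameter names mirror the ReconnectingClientFactory attributes initialDelay, factor, maxDelay and maxRetries as I recall them, and the default values below are my recollection of the Twisted defaults, so they should be checked against the installed version. Jitter is left out to keep the output deterministic.

```python
import math


def backoff_schedule(initial_delay=1.0, factor=math.e, max_delay=3600.0,
                     max_retries=None, attempts=10):
    """Return the delay in seconds before each retry, ignoring jitter.

    Models exponential backoff: each delay is the previous one times
    `factor`, capped at `max_delay`. Retrying stops once `max_retries`
    is exceeded (None means no limit).
    """
    delays = []
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        if max_retries is not None and attempt > max_retries:
            break  # a factory with this limit would give up here
        delay = min(delay * factor, max_delay)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    # Show how quickly the gap between attempts grows under the assumed defaults.
    for n, d in enumerate(backoff_schedule(attempts=12), start=1):
        print("retry %2d after %8.1f s" % (n, d))
```

If the assumed defaults are right, the gap between attempts reaches the one-hour cap within about ten retries, which could look like the client having stopped. That is only a guess. A set maxRetries, or something else clearing the retry state, would need to be ruled out in tron_actorcore.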