[INSTRM-322] Switch actor logging to rsyslog. Created: 10/Apr/18 Updated: 11/Nov/23 |
|
| Status: | Open |
| Project: | Instrument control development |
| Component/s: | tron_actorcore |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Task | Priority: | Normal |
| Reporter: | cloomis | Assignee: | arnaud.lefur |
| Resolution: | Unresolved | Votes: | 0 |
| Labels: | SPS | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Issue Links: |
|
||||||||||||||||
| Story Points: | 6 | ||||||||||||||||
| Sprint: | SM1-2019 E, SM1-2019 J, SM1PD-2020 F, SM1PD-2020 J | ||||||||||||||||
| Description |
|
All logging for actors goes via opscore.sdss3logging, which implements a StreamHandler to a RotatingLogHandler. This will not scale well to PFS: we should replace the final handler with a SyslogHandler to the instrument's rsyslog server. |
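A minimal sketch of the proposed change, using only the Python standard library. The function name `make_syslog_logger`, the default address and the `LOG_LOCAL0` facility are illustrative assumptions, not the existing opscore.sdss3logging API; the real rsyslog host is still to be chosen.

```python
import logging
import logging.handlers


def make_syslog_logger(actor_name, address=("localhost", 514)):
    """Return a logger whose only handler forwards records to a syslog server.

    `address` is a hypothetical (host, port) pair; by default SysLogHandler
    sends UDP datagrams, so constructing it does not need a live server.
    """
    logger = logging.getLogger(actor_name)
    logger.setLevel(logging.DEBUG)

    handler = logging.handlers.SysLogHandler(
        address=address,
        facility=logging.handlers.SysLogHandler.LOG_LOCAL0,  # assumed facility
    )
    # Tag each message with the actor name so the server can route on it.
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))

    logger.addHandler(handler)
    logger.propagate = False  # do not also write through root-logger handlers
    return logger
```

With this shape the actor no longer touches the local disk when it logs; any buffering and file rotation would be the rsyslog server's job.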
| Comments |
| Comment by hassan [ 05/Feb/19 ] |
|
arnaud.lefur to ask cloomis for clarification. |
| Comment by hassan [ 07/Mar/19 ] |
|
No further inputs needed from cloomis. |
| Comment by hassan [ 11/Jul/19 ] |
|
Important for temperature boards - logging goes to a local file. |
| Comment by cloomis [ 02/Oct/19 ] |
|
rsyslog is as disgusting an accretion of idiosyncratic features as I have ever seen. But it is well tested and does provide the features we need. Specifically, python logging writes to "syslog" handlers, rsyslog lets you add memory or disk queues on outputs (solving Just to corral the craziness, I'll always use RFC5424 ("new syslog" format; the RFC is written by the rsyslog author, sigh), and will always use the "advanced" configuration/scripting language. And will write to per-day log files saved in per-actor directories. Rsyslog allows writing to every imaginable output; we might well add PostgreSQL at some point, but not for now. |
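On the python side, always using RFC5424 could look roughly like the formatter below. This is a sketch under assumptions: the class name `RFC5424Formatter` is hypothetical, the MSGID and STRUCTURED-DATA fields are left as the nil value `-`, and timestamps are emitted in UTC. The standard `SysLogHandler` prepends the `<PRI>` field itself, so the formatter starts at the version field.

```python
import logging
import socket
from datetime import datetime, timezone


class RFC5424Formatter(logging.Formatter):
    """Format records as: VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID SD MSG."""

    def __init__(self, app_name, hostname=None):
        super().__init__()
        self.app_name = app_name  # e.g. the actor name (assumption)
        self.hostname = hostname or socket.gethostname()

    def format(self, record):
        # RFC 5424 timestamps are RFC 3339; use UTC with millisecond precision.
        stamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        timestamp = stamp.isoformat(timespec="milliseconds").replace("+00:00", "Z")
        procid = record.process if record.process is not None else "-"
        # MSGID and STRUCTURED-DATA are not used here, so both are "-".
        return (
            f"1 {timestamp} {self.hostname} {self.app_name} "
            f"{procid} - - {record.getMessage()}"
        )
```

To use it, pass an instance to `setFormatter()` on a `logging.handlers.SysLogHandler`; setting the handler's `append_nul` attribute to `False` avoids a trailing NUL byte in each message.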
| Comment by cloomis [ 29/May/20 ] |
|
Bump, damnit. At least for the temperature boards: the disk filled up on temps-b1. We don't need to record all command traffic, but I think we should. |
| Comment by cloomis [ 09/Jul/20 ] |
|
bump bump on the temps boards. There is no workaround besides not logging, and we are wedging systems. |
| Comment by hassan [ 14/Aug/20 ] |
|
Arnaud to look at Craig's proposal. |
| Comment by hassan [ 01/Jun/22 ] |
|
Following today's SMx discussion, this could be the cause of the recent problem where the STS lost contact with SM1. |
| Comment by cloomis [ 27/Jul/23 ] |
|
Bump. On 2023-07-26 the /data NFS mount got bounced on all the PFS hosts, and python logging from all the PFS actors to /data/logs stopped. Behind the scenes the failure might have been worse than mere silence: where were all the messages going, I wonder? Presumably they were getting queued, but is there a limit, etc.? As for the original question of whether the ugly but proven rsyslog is the right tool: I will not look further but invite others to do so. It is proven, and we have proven that we can use it. We would need to choose a logging host, select the output (files vs db vs xxx) and write the exploder. Thoughts on the host, Yoshida, Hiroshige? |
| Comment by cloomis [ 08/Nov/23 ] |
|
Bump again, at least for the SPS temp hosts, which do not use NFS at all and have one tiny disk. |
| Comment by arnaud.lefur [ 10/Nov/23 ] |
|
Yoshida, Hiroshige we would need an rsyslog host; is that something you / CDM could easily provide? |
| Comment by Yoshida, Hiroshige [ 10/Nov/23 ] |
|
Can't we use the existing logger-ics? It currently runs rsyslog (mainly for collecting server/host messages). |
| Comment by cloomis [ 11/Nov/23 ] |
|
Is there a VM at IRx? Would be nice to maintain writes for SPS during outages. Could just use mhs-ics I guess. Maybe that's the right move. Also, does anyone have thoughts about storage? The obvious implementation is to explode back into the same /data/logs tree, but I'm wondering if anyone has any experience/preference with any other. Are we running any log-friendly backends? |
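For discussion, the "obvious implementation" (per-actor directories, per-day files) amounts to a mapping like the one sketched below. The function name `log_path`, the `YYYY-MM-DD.log` file name and the UTC day boundary are assumptions for illustration only; the actual layout under /data/logs is not specified here, and in practice rsyslog itself would do this routing.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def log_path(actor, when, root="/data/logs"):
    """Return the per-actor, per-day file a message would be exploded into.

    `when` must be a timezone-aware datetime; the day is taken in UTC
    (an assumption, local time may be preferred).
    """
    if not actor or "/" in actor or actor in (".", ".."):
        raise ValueError(f"unsafe actor name: {actor!r}")
    day = when.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return PurePosixPath(root) / actor / f"{day}.log"
```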