[INSTRM-1424] Make rerun directories writeable on the Hilo PFS cluster Created: 28/Oct/21 Updated: 07/Dec/22 Resolved: 07/Dec/22 |
|
| Status: | Done |
| Project: | Instrument control development |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Story | Priority: | Normal |
| Reporter: | rhl | Assignee: | Kiyoto Yabe |
| Resolution: | Done | Votes: | 0 |
| Labels: | EngRun | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Story Points: | 1 |
| Sprint: | PreEngRun4 |
| Description |
|
Can we sort out permissions on /data/drp/repo/rerun? I can't write to it, as we mounted the entire file system R/O (which is reasonable). I think we need to replace that rerun directory with a symbolic link both at the summit and in Hilo, and make the Hilo one point to a writeable file system (e.g. /work/rerun). The issue is that running the pipelines with a --rerun a/b/c option is extremely useful, and this expands to the directory /data/drp/repo/rerun/a/b/c, so the user code needs to be able to write to /data/drp/repo/rerun. I don't think that it need be the same directory as on the summit (although that would be nice), so probably the easiest solution is to make it a symbolic link in both places, pointing to local disk. |
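For illustration only, here is a minimal Python sketch of the symbolic-link proposal in the description. It is not the procedure that was used on the cluster. The helper names `link_rerun` and `rerun_dir` are hypothetical, and the demo runs against throwaway temporary directories that stand in for /data/drp/repo and /work. On the real system the link would have to be created by someone with write access to the repo file system, since it is mounted R/O for ordinary users.

```python
import tempfile
from pathlib import Path


def link_rerun(repo: Path, writable_root: Path) -> Path:
    """Make <repo>/rerun a symlink to <writable_root>/rerun (hypothetical helper)."""
    target = writable_root / "rerun"
    target.mkdir(parents=True, exist_ok=True)
    link = repo / "rerun"
    if link.is_symlink():
        # Replace a stale link rather than failing.
        link.unlink()
    elif link.exists():
        # Never clobber a real directory that may hold earlier reruns.
        raise FileExistsError(f"{link} is a real directory; move it aside first")
    link.symlink_to(target, target_is_directory=True)
    return link


def rerun_dir(repo: Path, rerun: str) -> Path:
    """Directory that a rerun name such as 'a/b/c' expands to under the repo."""
    return repo / "rerun" / rerun


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-ins for /data/drp/repo and /work (assumed layout).
        repo = Path(tmp) / "data" / "drp" / "repo"
        work = Path(tmp) / "work"
        repo.mkdir(parents=True)

        link_rerun(repo, work)
        out = rerun_dir(repo, "a/b/c")
        out.mkdir(parents=True)
        (out / "probe.txt").write_text("ok")

        # The path under the repo resolves to the writable location.
        print(out.resolve())
```

Because user code only ever sees the path under the repo, the same rerun name works at both sites even if the links point at different local disks.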
| Comments |
| Comment by Kiyoto Yabe [ 11/Nov/21 ] |
|
Maybe I don't fully understand the demand, but why do we need a symbolic link at summit? Without changing the configuration at summit, I think there are two options:
Does either option solve the problem? |
| Comment by rhl [ 11/Nov/21 ] |
|
If you can make /data/drp/repo/rerun writeable that'd be great, but I assumed that you mounted the /data/drp/repo filesystem read-only. My proposal was to get around that limit, but if we don't need to, that'd be better. |
| Comment by cloomis [ 11/Nov/21 ] |
|
It was not obvious when we started using the Hilo machines how much could be NFS-shared between the two sites, and how useful the summit DRP machines would be. Experience suggests that we generally should not share: the performance hit from the link is real, and the summit machines are too slow to be interesting. On the Hilo side, since we must read all images at least once before getting anything from DRP, we might as well do that on ingest. Make the entire /data/drp local to Hilo, run ingestPfsImages --mode=copy, then operate independently. So I vote for making the entire /data/drp Hilo-local and writable there. Sadly we might still need/want summit reductions for operational reasons, but I think for now splitting Hilo from the summit is the sensible move. |
| Comment by Kiyoto Yabe [ 12/Nov/21 ] |
|
In the Sep. run, we actually operated /work/drp as a locally isolated DRP directory, while mounting /data/drp as RO. Is it OK to use the same mechanism in the Nov. run? Or should we rename the local directory to /data/drp and remove the RO mount? |
| Comment by rhl [ 12/Nov/21 ] |
|
I think you should choose any mechanism that works for you (cloomis may have helpful thoughts); all I care about is that the path is the same in the two places and that I can write to a reruns directory. The reruns needn't be shared with the summit. |
| Comment by Kiyoto Yabe [ 24/Dec/21 ] |
|
In the Nov. run, we ended up operating the rerun directory independently at the summit (/data/drp/repo/rerun) and in Hilo (/work/drp/rerun). I think we should sort out the location again before the next run. Can we use only the Hilo location, or do we still need to use the summit location in the next run? |
| Comment by Kiyoto Yabe [ 24/Dec/21 ] |
|
If we use a single location in future, how do we merge the previous data into the new location? Is it OK to just mv/copy (or make symbolic links), or do we need to take special care in terms of DRP (e.g. registry consistency)? |
| Comment by Kiyoto Yabe [ 14/Oct/22 ] |
|
I think we are still using `/data/drp/sm1-5.2/rerun`, but we are mostly and actively using the Hilo location `/work/drp/rerun`, and I don't think we have seen any problems for a while. So, can we close this ticket and file another if needed, or should this still be open? cloomis? rhl? |
| Comment by Kiyoto Yabe [ 07/Dec/22 ] |
|
I'll close this and file another if necessary. |