PIPE2D-1058: Avoid/fix registry sqlite deadlocks
We're going to move the registry to postgresql. I've asked CSES for a new database on tiger2-sumire.
Testing on my laptop while waiting for CSES:
createuser -h localhost pfs
createdb -h localhost --owner=pfs pfs_gen2
sqlite3 registry.sqlite3 .dump | sed -e 's|^PRAGMA.*$||' -e 's|integer primary key autoincrement|serial primary key|g' -e 's|double|double precision|g' -e 's|^.*sqlite.*$||' -e 's|pfsDesignId int|pfsDesignId bigint|g' | tee registry.sql | psql -h localhost -U pfs pfs_gen2
Here's how to delete all the tables in a database in postgresql:
DO $$ DECLARE
r RECORD;
BEGIN
FOR r IN (SELECT tablename FROM pg_tables WHERE schemaname = current_schema()) LOOP
EXECUTE 'DROP TABLE IF EXISTS ' || quote_ident(r.tablename) || ' CASCADE';
END LOOP;
END $$;
pfs_gen2=> select field, pfsDesignId, count(*) from raw group by field, pfsDesignId order by field, pfsDesignId;
Results look identical (at least for the first page!) between postresql and sqlite.
Yabe-san created a database at Hilo:
psql -h 133.40.152.102 -p 5433 -U pfs -d registry_gen2
The password is XXX
Save that password:
price@pfsa-usr01-gb:~ $ cat .pgpass
*:*:registry_gen2:pfs:XXX
I'm working on pfsa-usr01-gb (via "ssh pfs" with the VPN), which I'm told is at Hilo. Looks like there's a data repo in /work/drp, and raw data in /data/raw, and the stack is /work/stack.
price@pfsa-usr01-gb:~ $ sqlite3 /work/drp/registry.sqlite3 .dump | sed -e 's|^PRAGMA.*$||' -e 's|integer primary key autoincrement|serial primary key|g' -e 's|double|double precision|g' -e 's|^.*sqlite.*$||' -e 's|pfsDesignId int|pfsDesignId bigint|g' | tee registry.sql | psql -h 133.40.152.102 -p 5433 -U pfs registry_gen2
Well, that worked just fine.
To use it, we need to put a "registry.pgsql" file in the repo:
price@pfsa-usr01-gb:~ $ cat /work/drp/registry.pgsql
host: "133.40.152.102"
port: 5433
user: "pfs"
database: "registry_gen2"
(Password will be supplied via ~/.pgpass, which is better protected.)
price@pfsa-usr01-gb:~ $ . /work/stack/loadLSST.bash
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~ $ setup pfs_pipe2d
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~ $ reduceExposure.py /work/drp --calib /work/drp/CALIB-20220630 --rerun price/pipe2d-1058 --id visit=72016 arm=r
root INFO: Loading config overrride file '/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/obs_pfs/w.2022.17/config/reduceExposure.py'
CameraMapper INFO: Loading exposure registry from /work/drp/registry.pgsql
CameraMapper INFO: Loading calib registry from /work/drp/CALIB-20220630/calibRegistry.sqlite3
CameraMapper INFO: Loading calib registry from /work/drp/CALIB-20220630/calibRegistry.sqlite3
root WARN: No data found for dataId=OrderedDict([('visit', 72016), ('arm', 'r')])
root INFO: Running: /work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/drp_stella/w.2022.17/bin/reduceExposure.py /work/drp --calib /work/drp/CALIB-20220630 --rerun price/pipe2d-1058 --id visit=72016 arm=r
root WARN: Not running the task because there is no data to process; you may preview data using "--show data"
Well, that worked, sort of. It went to the correct registry, but there's no visit=72016 in there. I guess that means we can test the ingest!
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~ $ ingestPfsImages.py /work/drp "/data/raw/2022-02-22/sps/PFSA07*.fits" --mode=copy
CameraMapper INFO: Loading exposure registry from /work/drp/registry.pgsql
CameraMapper INFO: Loading calib registry from /work/drp/CALIB/calibRegistry.sqlite3
Traceback (most recent call last):
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/obs_pfs/w.2022.17/bin/ingestPfsImages.py", line 3, in <module>
PfsIngestTask.parseAndRun()
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/18.1.0/python/lsst/pipe/tasks/ingest.py", line 416, in parseAndRun
task.run(args)
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/18.1.0/python/lsst/pipe/tasks/ingest.py", line 542, in run
context = self.register.openRegistry(root, create=args.create, dryrun=args.dryrun)
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/18.1.0/python/lsst/pipe/tasks/ingest.py", line 298, in openRegistry
context = RegistryContext(registryName, self.createTable, create, self.config.permissions)
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/obs_pfs/w.2022.17/python/lsst/obs/pfs/ingest.py", line 53, in __init__
os.chmod(registryName, permissions)
PermissionError: [Errno 1] Operation not permitted: '/work/drp/registry.sqlite3'
So we need to fix the ingest. Basing off w.2022.17 to be consistent with what's in use at Hilo.
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~ $ ingestPfsImages.py /work/drp "/data/raw/2022-02-22/sps/PFSA07*.fits" --mode=copy --pfsConfigDir /work/drp/pfsDesign -c clobber=True
Success! But the visits are not in the psql registry... and it's ended up in the sqlite registry!
Oh! There's a separate script for ingesting into a psql registry. It has a different RegisterTask with slightly different SQL commands for creating the tables, and such.
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~/pfs/obs_pfs[tickets/PIPE2D-1058] $ ingestPfsImagesPgsql.py /work/drp "/data/raw/2022-02-22/sps/PFSA07*.fits" --mode=copy --pfsConfigDir /work/drp/pfsDesign -c clobber=True
CameraMapper INFO: Loading exposure registry from /work/drp/registry.pgsql
CameraMapper INFO: Loading calib registry from /work/drp/CALIB/calibRegistry.sqlite3
ingestPfs INFO: /data/raw/2022-02-22/sps/PFSA07206314.fits --<copy>--> /work/drp/2022-02-22/PFSA072063m1.fits
File "/home/price/pfs/obs_pfs/bin/ingestPfsImagesPgsql.py", line 3, in <module>
PfsPgsqlIngestTask.parseAndRun()
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/18.1.0/python/lsst/pipe/tasks/ingest.py", line 416, in parseAndRun
task.run(args)
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/18.1.0/python/lsst/pipe/tasks/ingest.py", line 553, in run
self.register.addRow(registry, info, dryrun=args.dryrun, create=args.create)
File "/work/stack_INFRA-290/stack/miniconda3-4.5.12-1172c30/Linux64/pipe_tasks/18.1.0/python/lsst/pipe/tasks/ingest.py", line 368, in addRow
conn.cursor().execute(sql, values)
psycopg2.IntegrityError: duplicate key value violates unique constraint "raw_pkey"
DETAIL: Key (id)=(2) already exists.
We need to reset the auto increment start point.
registry_gen2=# select pg_get_serial_sequence('raw', 'id');
pg_get_serial_sequence
------------------------
public.raw_id_seq
(1 row)
registry_gen2=# select max(id) from raw;
max
-------
94666
(1 row)
registry_gen2=# ALTER SEQUENCE raw_id_seq RESTART WITH 94667;
ALTER SEQUENCE
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~/pfs/obs_pfs[tickets/PIPE2D-1058] $ ingestPfsImagesPgsql.py /work/drp "/data/raw/2022-02-22/sps/PFSA07*.fits" --mode=copy --pfsConfigDir /work/drp/pfsDesign -c clobber=True
(lsst-scipipe-1172c30) price@pfsa-usr01-gb:~/pfs/obs_pfs[tickets/PIPE2D-1058] $ reduceExposure.py /work/drp --calib /work/drp/CALIB-20220630 --rerun price/pipe2d-1058 --id visit=72016 arm=r -c isr.doFlat=False
That works!