[INFRA-4] Define format for persistence of 1-D spectra Created: 16/Jul/14  Updated: 08/Jul/16  Resolved: 08/Jul/16

Status: Won't Fix
Project: Software Development Infrastructure
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Story Priority: Major
Reporter: rhl Assignee: rhl
Resolution: Won't Fix Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Issue Links:
Blocks
blocks SIM2D-28 Simulate arc spectra Done
blocks SIM2D-29 Simulate static OH emission spectra Done
blocks SIM2D-31 Simulate flat field lamp spectra Done
blocks SIM2D-32 Simulate galaxy spectra Done
blocks SIM2D-33 Simulate stellar spectra Done
Relates
relates to SIM2D-35 Water vapour absorption Won't Fix
Epic Link: Data Model
Sprint: 2014-12, 2014-13, 2014-14
Reviewers: cloomis

 Description   

I expect that this will be a multi-extension FITS file containing data, noise, and flags. We should take a careful look at the SDSS BOSS spectro files and see what we can learn.

I expect that the same file format will be used for injection of at least some of the input spectra that the simulator needs. I say "some" because, e.g., arc spectra are probably better described in terms of (lambda, intensity, width) than as a 1-D spectrum.
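
To make the (lambda, intensity, width) alternative concrete, here is a minimal sketch of how such a line list could be expanded onto a wavelength grid when a sampled 1-D spectrum is needed. It assumes that each line is a Gaussian, that "width" is the Gaussian sigma in wavelength units, and that "intensity" is the integrated line flux; none of these conventions are fixed by this ticket, and the function name and line values are hypothetical.

```python
import math


def render_line_list(lines, wavelengths):
    """Sum one Gaussian profile per (lambda, intensity, width) entry.

    Assumptions (not from the ticket): width is a Gaussian sigma in the same
    units as the wavelengths, and intensity is the integrated line flux.
    """
    flux = [0.0] * len(wavelengths)
    for centre, intensity, sigma in lines:
        # Normalise so that the profile integrates to `intensity`.
        norm = intensity / (sigma * math.sqrt(2.0 * math.pi))
        for i, wl in enumerate(wavelengths):
            flux[i] += norm * math.exp(-0.5 * ((wl - centre) / sigma) ** 2)
    return flux


# Hypothetical two-line arc list; wavelengths in nm.
arc_lines = [(650.0, 100.0, 0.2), (652.0, 40.0, 0.2)]
grid = [645.0 + 0.05 * i for i in range(241)]  # 645 nm to 657 nm in 0.05 nm steps
spectrum = render_line_list(arc_lines, grid)
```

The line list stays three numbers per line and is independent of any pixel grid, which is the reason for preferring it over a sampled spectrum for arcs.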



 Comments   
Comment by rhl [ 19/Jul/14 ]

You've already thought about this.

Comment by bick [ 11/Dec/14 ]

The experience I've had so far has been with the SDSS spPlate file format, and I'm leaning toward a modified version of it. The short explanation is that it's multi-extension FITS with an extension each for flux, inverse variance (we could use variance instead), the subtracted sky spectrum (which may or may not be suitable in our case), bit-flags, and metadata.
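
As a rough illustration of that layout, the sketch below models one spectrum as a plain Python container with one field per extension. It is an in-memory stand-in only, not a FITS reader or writer; the class name, the field names, and the rule that an inverse variance of 0 marks a pixel with no information are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FiberSpectrum:
    """Hypothetical stand-in for one spectrum; each field mirrors one extension."""

    flux: list
    ivar: list  # inverse variance; 0 is assumed to mark a pixel with no information
    sky: list   # the sky spectrum that was subtracted
    mask: list  # integer bit-flags, one per pixel
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Every per-pixel layer must cover the same pixels.
        n = len(self.flux)
        if not (len(self.ivar) == len(self.sky) == len(self.mask) == n):
            raise ValueError("all per-pixel layers must have the same length")

    def variance(self):
        """Convert inverse variance to variance; ivar == 0 maps to infinity."""
        return [1.0 / w if w > 0 else float("inf") for w in self.ivar]
```

Storing inverse variance lets a bad pixel be given zero weight directly, whereas a variance layer needs an explicit convention for the same case.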

Here it is described in its modern form for SDSS3:

http://data.sdss3.org/datamodel/files/SPECTRO_REDUX/RUN2D/PLATE4/spPlate.html

And here's the earlier variant from SDSS2:

http://classic.sdss.org/dr7/dm/flatFiles/spPlate.html

For the work I did on spectroscopic variability with Carlos Badenes, we modified the spPlate format to split the spectra into separate files, each containing the co-added spectrum plus the individual spectra which were stacked to produce it. The MEF structure was the same. The result was the so-called spFiber file. I've pinged a few people who've worked with both spFiber files and spPlate files to get some feedback, and by and large the preference is for 1 source per file. The caveat that's been mentioned is that it's often necessary to check neighboring fibers as sources of contamination, and that's easier when all data are in the same file.

In bullets, here's why I like the spFiber:

  • It's based on SDSS, which is:
    • well tested, and
    • already familiar to the community.
  • It's a per-source format, which is:
    • more easily run in parallel,
    • easier to handle for users who want specific objects.

One concern I have with it is that there's some metadata which is common to all fibers in the focal plane ... exposure time, observing conditions, etc. In our spFiber files, we just repeated that info in each file, which is clearly inefficient (though not particularly badly so ... it's just not that much info, after all).
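
For what it's worth, here is a small sketch of how the shared part could be identified if we ever wanted to store it once rather than per file. It treats headers as plain dictionaries and calls a key "common" when every fiber carries it with the same value; the function name and the header keys are hypothetical and not taken from any SDSS file.

```python
def split_common_metadata(per_fiber_headers):
    """Separate keys with identical values in every header from per-fiber keys.

    Expects a non-empty list of dict-like headers (an assumption of this sketch).
    """
    first = per_fiber_headers[0]
    common = {
        key: value
        for key, value in first.items()
        if all(key in hdr and hdr[key] == value for hdr in per_fiber_headers)
    }
    unique = [
        {k: v for k, v in hdr.items() if k not in common}
        for hdr in per_fiber_headers
    ]
    return common, unique


# Hypothetical headers for two fibers from the same exposure.
headers = [
    {"EXPTIME": 900.0, "SEEING": 0.8, "FIBERID": 1, "OBJTYPE": "GALAXY"},
    {"EXPTIME": 900.0, "SEEING": 0.8, "FIBERID": 2, "OBJTYPE": "STAR"},
]
common, unique = split_common_metadata(headers)
```

Here `common` holds the exposure-level values and `unique` holds what differs per fiber.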

In any case, I have an example of an spFiber, and a short Python script which loads it to make a trivial plot of the coadd and its sub-spectra. Where is a suitable place to check these in, or to otherwise share them?

Also, although I'm leaning towards the spFiber model, I don't consider it a done deal. If others have suggestions/ideas/concerns, please speak.

Comment by Anonymous [ 12/Dec/14 ]

Let me channel a pair of questions which I have been asked about using SDSS reductions. The proposed per-fiber model (which I like) opens some choices up.

  • Do we provide pre-coadd "spFiber"s, where pixel wavelengths have not been interpolated onto a common vector? I would expect 3 rows or 3 HDUs for the three cameras. Basically: can people get "well-calibrated" detector pixels with unambiguous bitmasks?
  • For the coadd spFibers, do we lock the wavelength CRVAL1 to be int*CD1_1, where CD1_1 is a constant for all reductions? Even for coadds of one exposure? Basically: can spFibers be directly compared/grouped without interpolation, a step that no one gets right in the face of complex bitmasks?
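
To spell out what the second question buys us, here is a minimal sketch of locking CRVAL1 to an integer multiple of a fixed CD1_1. The step value, the choice of log10(wavelength) as the coordinate, and the function names are assumptions for illustration; this ticket does not fix any of them.

```python
CD1_1 = 1.0e-4  # assumed constant step; neither value nor unit is set by the ticket


def snap_crval1(first_coord, cd1_1=CD1_1):
    """Return (k, CRVAL1) with CRVAL1 = k * cd1_1 and k the nearest integer."""
    k = round(first_coord / cd1_1)
    return k, k * cd1_1


def pixel_offset(k_a, k_b):
    """Whole-pixel shift between two grids that were both snapped with the same step."""
    return k_b - k_a


# Hypothetical starting coordinates of two coadds, in log10(wavelength).
k_a, crval_a = snap_crval1(3.58012)
k_b, crval_b = snap_crval1(3.58097)
shift = pixel_offset(k_a, k_b)
```

With both grids snapped, pixel i of the second spectrum lines up with pixel i + shift of the first, so fluxes and bitmasks can be compared by index with no interpolation. Keeping the integer k alongside CRVAL1 avoids comparing floating-point values.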

Comment by rhl [ 08/Jul/16 ]

This work has been moved into the data model project (DAMD), and is tracked in git at https://github.com/Subaru-PFS/datamodel/blob/master/datamodel.txt
