[PIPE2D-1060] Tune hyperparameters of fluxmodel interpolation Created: 13/Jul/22  Updated: 05/Jun/23  Resolved: 17/Aug/22

Status: Done
Project: DRP 2-D Pipeline
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Task Priority: Normal
Reporter: sogo.mineo Assignee: sogo.mineo
Resolution: Done Votes: 0
Labels: flux-calibration
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: PNG File interpolation.error.full.png     PNG File interpolation.error.png     PNG File max.relative.error (bad zones masked).png     PNG File max.relative.error.png     PNG File rms.relative.error (bad zones masked).png     PNG File rms.relative.error.png    
Issue Links:
Relates
relates to PIPE2D-1231 Reduce memory usage of fitPfsReferenc... Done
Reviewers: hassan

 Description   

In PIPE2D-1053, I made a set of flux models by interpolating spectra on a fine grid of parameters (Teff, log(g), metallicity, alpha). The interpolator (RBF) has hyperparameters such as the kernel type and the kernel size. Because the kernel size (epsilon) is isotropic, I fixed epsilon to a constant and instead introduced the hyperparameters teffScale, loggScale, mScale and alphaScale, by which the four parameters (Teff, log(g), metallicity, alpha) are multiplied respectively. In PIPE2D-1053, I experimented a little and set kernel="multiquadric", teffScale=0.5e-3, loggScale=0.5, mScale=2 and alphaScale=0.5 with epsilon=2.
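
A minimal sketch of the per-axis scaling described above: each parameter is multiplied by its own scale, and an isotropic multiquadric kernel is then applied to the scaled coordinates. The helper names (scale_params, fit_rbf, eval_rbf) are hypothetical, the kernel convention sqrt(1 + (r/epsilon)^2) is an assumption (this ticket does not say which RBF implementation was used), and the grid and spectra are toy data, not AMBRE.

```python
import numpy as np

# Scale values quoted in the description (the PIPE2D-1053 settings).
SCALES = np.array([0.5e-3, 0.5, 2.0, 0.5])  # teffScale, loggScale, mScale, alphaScale
EPSILON = 2.0


def scale_params(params):
    """Multiply (Teff, log(g), metallicity, alpha) by the per-axis scales."""
    return np.asarray(params, dtype=float) * SCALES


def multiquadric(r, epsilon=EPSILON):
    # Assumed convention: epsilon acts as a length scale.
    return np.sqrt(1.0 + (r / epsilon) ** 2)


def pairwise_dist(a, b):
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)


def fit_rbf(params, spectra):
    """Solve K w = spectra for the weights; one column per wavelength."""
    x = scale_params(params)
    kernel = multiquadric(pairwise_dist(x, x))
    return x, np.linalg.solve(kernel, spectra)


def eval_rbf(model, params):
    """Evaluate the fitted model at one or more parameter points."""
    x, weights = model
    q = scale_params(np.atleast_2d(params))
    return multiquadric(pairwise_dist(q, x)) @ weights


# Toy grid: 3 x 2 x 2 x 2 = 24 nodes, 5 wavelength samples each.
rng = np.random.default_rng(0)
grid = np.array([[t, g, m, a]
                 for t in (4000.0, 5000.0, 6000.0)
                 for g in (3.0, 4.0)
                 for m in (-1.0, 0.0)
                 for a in (0.0, 0.4)])
spectra = 1.0 + 0.1 * rng.random((len(grid), 5))

model = fit_rbf(grid, spectra)
reproduced = eval_rbf(model, grid)  # an RBF model passes through its nodes
```

Because the scales only stretch the axes before distances are taken, a single isotropic epsilon is enough; the anisotropy lives entirely in the four scale factors.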

The hyperparameters are, however, not optimal. For low temperature and low log(g), I know that the max relative error max_{λ} |(rbf(λ) - f(λ)) / f(λ)| is sometimes above 100%, where f(λ) is the AMBRE spectrum at (Teff, log(g), metal, alpha), and rbf(λ) is the RBF interpolation at that same point, built from all the other parameter combinations (that is, with the point itself left out). I must tune the hyperparameters more carefully.

The smaller the scale parameters are, the better the RBF interpolation is. As the scale parameters get smaller, however, the condition number of the matrix of the linear equations to be solved gets worse (according to Wikipedia). In searching for the best scale parameters, I set the initial guess to somewhat large values, assuming that the condition number was small there, and hoping that the optimizer would move the parameters gradually toward smaller values as long as the linear equations remained solvable. The optimization program has been running for a week, and it now seems to have found the minimum.
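
The conditioning trade-off mentioned above can be illustrated with a small sketch. It assumes a multiquadric kernel of the form sqrt(1 + (r/epsilon)^2) with epsilon=2 and uses a toy 1-D grid of six nodes; multiquadric_matrix is a hypothetical helper, and the numbers say nothing about the actual AMBRE grid.

```python
import numpy as np


def multiquadric_matrix(points, scale, epsilon=2.0):
    """Kernel matrix for 1-D points after multiplying them by `scale`."""
    x = np.asarray(points, dtype=float) * scale
    r = np.abs(x[:, None] - x[None, :])
    return np.sqrt(1.0 + (r / epsilon) ** 2)


nodes = np.arange(6.0)  # toy 1-D grid, not the AMBRE grid
cond_by_scale = {
    scale: np.linalg.cond(multiquadric_matrix(nodes, scale))
    for scale in (2.0, 0.5, 0.1)
}
for scale, cond in cond_by_scale.items():
    print(f"scale={scale:g}  condition number={cond:.3e}")
```

As the scale shrinks, all nodes move closer together in kernel units, the rows of the matrix become nearly identical, and the condition number grows.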

The optimization problem was as follows: minimize
sum_{i = 1}^{6000} loss(i),
where i indexes the AMBRE spectra. We compute loss(i) as follows. We put i aside and make an RBF model from all the other AMBRE spectra. We interpolate the spectrum at i and compare the guess with the truth. (We take care that this interpolation is not an extrapolation; if it is, i is excluded from the sum.) Let loss(i) = sum_{λ} (guess(λ) - truth(λ))^2 / truth(λ)^2. Because computing this last sum in full would take far too long, we use only the shortest-wavelength 1/3 of the entire wavelength range (where absorption lines are dense) and discard 9/10 of the samples in this range.
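
The leave-one-out objective above can be sketched as follows. This is an illustration under stated assumptions, not the program that was actually run: fit_predict, is_interior and loo_loss are hypothetical names, the kernel convention sqrt(1 + (r/epsilon)^2) is assumed, the bounding-box test is only a crude stand-in for the extrapolation check (the ticket does not say how that check was done), and the data are a toy 5 x 5 grid.

```python
import numpy as np


def fit_predict(train_x, train_y, query_x, epsilon=2.0):
    """Isotropic multiquadric RBF fitted on the training nodes, evaluated at query_x."""
    def kernel(a, b):
        r = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
        return np.sqrt(1.0 + (r / epsilon) ** 2)

    weights = np.linalg.solve(kernel(train_x, train_x), train_y)
    return (kernel(query_x[None, :], train_x) @ weights)[0]


def is_interior(point, others):
    """Crude extrapolation guard (an assumption): strictly inside the bounding box."""
    return bool(np.all(point > others.min(axis=0)) and np.all(point < others.max(axis=0)))


def loo_loss(params, spectra, scales, wavelength_slice):
    """Sum over held-out nodes of sum_lambda ((guess - truth) / truth)^2."""
    x = params * scales
    y = spectra[:, wavelength_slice]
    total, used = 0.0, 0
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        if not is_interior(x[i], x[keep]):
            continue  # would be extrapolation: excluded from the sum
        guess = fit_predict(x[keep], y[keep], x[i])
        total += np.sum(((guess - y[i]) / y[i]) ** 2)
        used += 1
    return total, used


# Toy data: 5 x 5 grid of two parameters, 300 wavelength samples.
t, g = np.meshgrid(np.linspace(0.0, 4.0, 5), np.linspace(0.0, 4.0, 5))
params = np.column_stack([t.ravel(), g.ravel()])
wave = np.linspace(380.0, 1260.0, 300)  # nm, toy wavelength grid
spectra = 1.0 + 0.05 * np.sin(params[:, :1] + wave / 200.0) + 0.02 * params[:, 1:]

# Shortest third of the wavelength range, keeping every 10th sample.
subset = slice(0, len(wave) // 3, 10)
loss, n_used = loo_loss(params, spectra, np.array([1.0, 1.0]), subset)
print(f"leave-one-out loss over {n_used} interior nodes: {loss:.3e}")
```

An optimizer would then vary `scales` to minimize the returned loss; each evaluation refits one model per held-out node, which is why the full-size problem is expensive.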

The new fluxmodeldata package, when made, will be compatible with the previous one (PIPE2D-1053). No programs need to be changed.

Edit: Uploaded the new version of fluxmodeldata here: https://hscdata.mtk.nao.ac.jp/hsc_bin_dist/pfs/fluxmodeldata-ambre-20220714-full.tar.xz
Some quality assessments of this package are shown in the comments of this issue.



 Comments   
Comment by sogo.mineo [ 14/Jul/22 ]


Histogram of e_{i} (i = 1,...,6000), where e_{i} = max_{λ} |rbf_{i}(λ) - f_{i}(λ)| / f_{i}(λ), f_{i} is the AMBRE spectrum at i, and rbf_{i} is the RBF interpolation guessed from the other points. A small number of e_{i} are above 5%, but I don't think they are problematic, as I am going to explain later.

Comment by sogo.mineo [ 14/Jul/22 ]


Histogram of e'_{i} (i = 1,...,6000), where e'_{i} = root mean square of (rbf_{i}(λ) - f_{i}(λ)) / f_{i}(λ). The RMS error is below 0.3% for all points.

Comment by sogo.mineo [ 14/Jul/22 ]


Two bad examples whose interpolation errors are large. The spikes of relative errors are always at ~ 400nm. The strange error at the right edge (>1200nm) of the lower panel is caused by wavelength extrapolation (not related to this issue).

Comment by sogo.mineo [ 14/Jul/22 ]


When I look into the spikes in detail, I see that the absorption is so deep there that the denominator of the relative error (guess - truth) / truth is very small, which is why the relative error is large there. Therefore I don't think the spikes will affect the flux calibration.
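
The effect of a small denominator can be reproduced with a toy example. Everything here is made up for illustration: a flat continuum with one deep Gaussian absorption line placed near 393.5 nm, and a "guess" that differs from the "truth" by the same small absolute amount everywhere.

```python
import numpy as np

wave = np.linspace(380.0, 420.0, 401)  # nm, toy wavelength grid

# Toy "truth": flat continuum with one deep Gaussian absorption line.
depth, center, width = 0.995, 393.5, 0.5
truth = 1.0 - depth * np.exp(-0.5 * ((wave - center) / width) ** 2)

# Toy "guess": the same spectrum shifted by a constant absolute offset.
guess = truth + 0.002

rel = (guess - truth) / truth
max_rel = np.max(np.abs(rel))          # analogue of e_i
rms_rel = np.sqrt(np.mean(rel ** 2))   # analogue of e'_i
peak_wave = wave[np.argmax(np.abs(rel))]

print(f"max relative error: {max_rel:.1%} at {peak_wave:.1f} nm")
print(f"RMS relative error: {rms_rel:.2%}")
```

The absolute error is identical at every wavelength, yet the relative error spikes at the line core simply because the true flux there is close to zero.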

Comment by sogo.mineo [ 15/Jul/22 ]

I confirmed that drp_stella's unit tests pass with the new fluxmodeldata package.

Comment by sogo.mineo [ 15/Jul/22 ]

Could you review this task? The product of this task has been uploaded to the URL shown at the bottom of the description. There are no changes to programs. Only the full fluxmodeldata package (PIPE2D-1053) is superseded.

Comment by hassan [ 11/Aug/22 ]

I think this is fine. Would it be possible to test the situation where the absorption at ~393.5 nm is masked out during the interpolation process?

Comment by sogo.mineo [ 16/Aug/22 ]

I started the computation just now. It should finish in about a day.

Comment by sogo.mineo [ 17/Aug/22 ]


These are the error histograms with bad zones masked. Maximum errors are reduced but average errors don't decrease very much.

> the absorption at ~393.5 nm is masked out during the interpolation process
Masking a wavelength zone does not affect interpolation in other zones because the interpolation at one wavelength is independent of the interpolation at another. In drawing the histograms above, I masked the bad zones not during the interpolation process but during the computation of errors.
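
A sketch of masking at the error-computation stage rather than during interpolation. The function error_stats, the zone limits (392.5 to 394.5 nm) and the toy spectra are all hypothetical; the bad zones actually used for the histograms are not listed in this ticket.

```python
import numpy as np


def error_stats(guess, truth, wave, bad_zones=()):
    """Max and RMS relative error, ignoring samples inside `bad_zones`."""
    good = np.ones(wave.shape, dtype=bool)
    for lo, hi in bad_zones:
        good &= ~((wave >= lo) & (wave <= hi))
    rel = (guess[good] - truth[good]) / truth[good]
    return np.max(np.abs(rel)), np.sqrt(np.mean(rel ** 2))


# Toy spectrum with one deep line; the guess is off by a small constant.
wave = np.linspace(380.0, 420.0, 401)  # nm
truth = 1.0 - 0.995 * np.exp(-0.5 * ((wave - 393.5) / 0.5) ** 2)
guess = truth + 0.002

max_full, rms_full = error_stats(guess, truth, wave)
max_masked, rms_masked = error_stats(guess, truth, wave, bad_zones=[(392.5, 394.5)])

print(f"unmasked: max={max_full:.2%} rms={rms_full:.3%}")
print(f"masked:   max={max_masked:.2%} rms={rms_masked:.3%}")
```

The interpolated spectrum itself is untouched; only the samples entering the statistics change, which is consistent with each wavelength being interpolated independently. The size of the improvement in this toy case is not representative of the real histograms.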

Comment by sogo.mineo [ 17/Aug/22 ]

I am closing this issue. Thank you for reviewing.

Generated at Mon Apr 07 08:42:28 JST 2025 using Jira 8.3.4#803005-sha1:1f96e09b3c60279a408a2ae47be3c745f571388b.