[PIPE2D-1231] Reduce memory usage of fitPfsReferenceFlux Created: 05/Jun/23 Updated: 16/Jun/23 Resolved: 09/Jun/23 |
|
Status: | Done |
Project: | DRP 2-D Pipeline |
Component/s: | None |
Affects Version/s: | None |
Fix Version/s: | None |
Type: | Task | Priority: | Normal |
Reporter: | sogo.mineo | Assignee: | sogo.mineo |
Resolution: | Done | Votes: | 0 |
Labels: | flux-calibration | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Attachments: |
Issue Links: |
|
Reviewers: | price |
Description |
I want to use PCA to reduce memory usage of fitPfsReferenceFlux. Though it takes several days with 500GB of memory

The interpolator of flux models has been a function from the 4-dimensional parameter space

I decided to use RBF to fit this function R^4 → R^1024. I use the following procedure to estimate interpolation errors:

For each x[i] of the ~6000 input spectra:
    Make an RBF from the ~6000 input spectra except x[i]
    Interpolate a spectrum y at the same parameters as x[i]
    rms[i] = sqrt(mean(square((y - x[i]) / x[i])))
error = sqrt(mean(square(rms)))

Below is the histogram of "rms" with the best hyperparameters. The RMS of "rms" is 2.6e-4

I can see that the increase of errors from

I want this new PCA-based interpolation merged to the master branch. The new code requires a new version of fluxmodeldata, which I have uploaded here: [-https://hscdata.mtk.nao.ac.jp/hsc_bin_dist/pfs/fluxmodeldata-ambre-20230602.tar.gz-]

To use the new fluxmodeldata, users have to run ./install.sh --prefix=/path/to/pfs-packages [--set=small] to |
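The leave-one-out error estimate described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the code on the ticket branch: the Gaussian kernel, the `epsilon` and `smoothing` values, the number of PCA components, the choice to redo the PCA inside every leave-one-out iteration, all function names, and the small synthetic data set are hypothetical.

```python
import numpy as np


def pca_compress(spectra, n_components):
    """Truncated PCA: return (mean, components, coefficients)."""
    mean = spectra.mean(axis=0)
    centered = spectra - mean
    # Rows of vt are the principal axes of the centered spectra.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    return mean, components, centered @ components.T


def rbf_fit(params, values, epsilon, smoothing=1e-10):
    """Solve for Gaussian RBF weights (kernel choice is an assumption)."""
    d2 = ((params[:, None, :] - params[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-(epsilon**2) * d2)
    # A tiny diagonal term keeps the linear system well posed.
    kernel[np.diag_indices_from(kernel)] += smoothing
    return np.linalg.solve(kernel, values)


def rbf_eval(params, weights, epsilon, query):
    """Evaluate the fitted RBF at a single parameter vector."""
    d2 = ((query[None, :] - params) ** 2).sum(axis=-1)
    return np.exp(-(epsilon**2) * d2) @ weights


def leave_one_out_error(params, spectra, n_components, epsilon):
    """Return (rms, error) following the procedure in the description."""
    n = len(params)
    rms = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i  # every spectrum except x[i]
        mean, comps, coeffs = pca_compress(spectra[keep], n_components)
        weights = rbf_fit(params[keep], coeffs, epsilon)
        # Interpolate PCA coefficients at x[i]'s parameters, then expand.
        y = mean + rbf_eval(params[keep], weights, epsilon, params[i]) @ comps
        rms[i] = np.sqrt(np.mean(np.square((y - spectra[i]) / spectra[i])))
    return rms, np.sqrt(np.mean(np.square(rms)))


# Synthetic stand-in data: 40 "spectra" of 32 bins on a 4-D parameter space.
rng = np.random.default_rng(0)
params = rng.uniform(size=(40, 4))
mixing = rng.normal(size=(4, 32))
spectra = 1.0 + 0.1 * np.sin(params @ mixing)

rms, error = leave_one_out_error(params, spectra, n_components=5, epsilon=2.0)
print(f"error = {error:.3g}")
```

With the real ~6000 spectra, a library interpolator such as `scipy.interpolate.RBFInterpolator` would probably be a more practical choice than the hand-written linear solve used here.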
Comments |
Comment by sogo.mineo [ 05/Jun/23 ] |
Could you review this PR? |
Comment by price [ 07/Jun/23 ] |
The packaging for fluxmodeldata doesn't make much sense to me. install.py doesn't actually do an installation, but it precomputes the data. Moreover, it puts the products in the same directory, not actually installing it anywhere. That means that if I want both small and full packages, I have to untar the tarball, rename the directory, and run the script in both. It's probably not worth fixing now, but for the next iteration it would be helpful to solve. |
Comment by sogo.mineo [ 07/Jun/23 ] |
I am making a new fluxmodeldata package whose install.py actually installs files to PREFIX/fluxmodeldata-ambre-20230602-small or PREFIX/fluxmodeldata-ambre-20230602-full according to the --set option (I will upload it tomorrow). I have made changes to the ticket branch (and reverted makeFluxModelInterpolator.py) to keep supporting old fluxmodeldata packages. I forgot to revise the commit message, and I don't have time to amend it today. Please give me comments, if any, on anything except the commit message.

A question: I put the @deprecated decorator on the makeFluxModelInterpolator() function (in makeFluxModelInterpolator.py), but no DeprecationWarning is shown when the program is run. This seems to be Python's default behavior (not that of the deprecated package). What is the best way to inform the user of the deprecation? Should I use print()? |
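On the deprecation question: Python's default warning filters ignore DeprecationWarning unless it is triggered by code running in `__main__`, whereas FutureWarning is shown by default and is meant for end users. One possible approach, sketched below with only the standard library, is to emit a FutureWarning. The function name and the message text are hypothetical stand-ins, not the code on the ticket branch.

```python
import warnings


def make_flux_model_interpolator_sketch():
    """Hypothetical stand-in for a deprecated entry point."""
    warnings.warn(
        "This function is deprecated (illustrative message only).",
        FutureWarning,  # displayed by default, unlike DeprecationWarning
        stacklevel=2,   # attribute the warning to the caller
    )


# Record the warning instead of printing it, to inspect its category.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    make_flux_model_interpolator_sketch()

print(caught[0].category.__name__, "-", caught[0].message)
```

If the decorator in use accepts a warning category, passing FutureWarning there would presumably have the same effect without resorting to print().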
Comment by sogo.mineo [ 08/Jun/23 ] |
I uploaded the new fluxmodeldata package (https://hscdata.mtk.nao.ac.jp/hsc_bin_dist/pfs/fluxmodeldata-ambre-20230608.tar.gz) and wrote this specific version into the deprecation messages in the sources. (For example, "NaiveFluxModelInterpolator has been replaced by PCAFluxModelInterpolator, which requires fluxmodeldata >= ambre-20230608. See

I decided to let makeFluxModelInterpolator.py exit immediately if fluxmodeldata is new, because the script is not compatible with the new fluxmodeldata. I keep makeFluxModelInterpolator.py only for old versions of fluxmodeldata. |
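An early-exit check of this kind could look roughly like the sketch below. How the real script detects the fluxmodeldata version is not stated in this ticket, so parsing the date out of a package name such as "fluxmodeldata-ambre-20230608" is an assumption made purely for illustration, and every function name here is hypothetical.

```python
import re
import sys

# Threshold taken from the deprecation message quoted above.
FIRST_NEW_VERSION = 20230608


def fluxmodeldata_date(package_name):
    """Extract the YYYYMMDD stamp from a name like 'fluxmodeldata-ambre-20230608'."""
    match = re.search(r"ambre-(\d{8})", package_name)
    if match is None:
        raise ValueError(f"cannot parse version from {package_name!r}")
    return int(match.group(1))


def is_new_fluxmodeldata(package_name):
    """True if the package is at or past the threshold version."""
    return fluxmodeldata_date(package_name) >= FIRST_NEW_VERSION


def exit_if_new(package_name):
    """Stop immediately when run against a new-format package."""
    if is_new_fluxmodeldata(package_name):
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(f"{package_name} is not supported by this script.")
```

Comparing the stamps as integers works because they are fixed-width dates.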
Comment by sogo.mineo [ 09/Jun/23 ] |
Merged. Thanks for reviewing. |
Comment by price [ 15/Jun/23 ] |
I just tried installing the new fluxmodeldata package, and it was much smoother, thanks! |
Comment by price [ 15/Jun/23 ] |
Ah, no! I wasn't aware that the package name was added to the prefix, so everything was installed down a level. |