[PIPE2D-127] Profile the pipeline Created: 19/Nov/16  Updated: 23/Jun/17  Resolved: 23/Jun/17

Status: Done
Project: DRP 2-D Pipeline
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Story Priority: Major
Reporter: price Assignee: price
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified

Attachments: Text File profile.txt    
Sprint: 2014-16
Reviewers: rhl

 Description   

Profile the pipeline to identify hotspots and low-hanging fruit.



 Comments   
Comment by price [ 03/Dec/16 ]

I'll post the details soon, but here are the main results:

General gains could be made by looking at ISR:

  • calcEffectiveGain and setGain aren't useful, in my opinion
  • Statistics gets called many times for the same quantity (the background level); the result could be computed once and reused
  • Need to investigate the remaining parts (CCD assembly, interpolation) to see if we can squeeze out some performance

pfs::drp::stella::findCenterPositionsOneTrace is calling pfs::drp::stella::math::insertSorted, which in turn calls std::vector::insert. In general, you should never use std::vector::insert to insert into the middle of a large vector. cplusplus.com says:

Because vectors use an array as their underlying storage, inserting elements in positions other than the vector end causes the container to relocate all the elements that were after position to their new positions. This is generally an inefficient operation compared to the one performed for the same operation by other kinds of sequence containers (such as list or forward_list).

Instead, push everything onto the end of the std::vector and then use std::sort once at the end. Or use a std::list instead of a std::vector, but that may have performance implications in other places.
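A minimal sketch of the suggested pattern, assuming the sorted order is all that insertSorted is maintaining (the function name appendThenSort and the free-standing signature are illustrative, not the actual pipeline code):

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for the accumulation done in
// findCenterPositionsOneTrace. Instead of keeping the output sorted
// with repeated middle inserts (O(n^2) total, because each insert
// shifts the tail of the array), append everything and sort once at
// the end (O(n log n)).
std::vector<float> appendThenSort(std::vector<float> const& samples) {
    std::vector<float> out;
    out.reserve(samples.size());
    for (float value : samples) {
        out.push_back(value);          // amortized O(1): no shifting
    }
    std::sort(out.begin(), out.end()); // single sort at the end
    return out;
}
```

If the code needs the container to be sorted while it is still being filled, this transformation doesn't apply directly; but if the sorted order is only consumed after the loop, the single std::sort wins.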

Cache the end() iterator in loops over ndarray (especially 2D arrays, which have more complicated iterators) because that's showing up in the profiles.
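The pattern looks like this; a sketch with std::vector standing in for an ndarray view, since ndarray's strided 2-D iterators are the ones that are expensive to construct (std::vector's end() is cheap, so here the hoisting is only illustrative):

```cpp
#include <vector>

// Hoist end() out of the loop condition so it is constructed once,
// rather than on every iteration. For ndarray's 2-D iterators,
// building end() repeatedly is what shows up in the profile.
double sumCachedEnd(std::vector<double> const& values) {
    double total = 0.0;
    for (auto it = values.begin(), end = values.end(); it != end; ++it) {
        total += *it;
    }
    return total;
}
```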

In ReduceArcTask, this snippet is using the lion's share of the runtime:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   165  30412812     67429737      2.2     30.5                  for j in range(traceIds.shape[0]):
   166  30412800     72818425      2.4     32.9                      if traceIds[j] != l:
   167      7200        16710      2.3      0.0                          l = traceIds[j]
   168  30412800     74841313      2.5     33.8                      if traceIds[j] == traceIdsUnique[traceId]:
   169     50688       125401      2.5      0.1                          wLenTemp[k] = wavelengths[j]
   170     50688       116528      2.3      0.1                          k = k+1
  • l isn't used for anything, so dropping it immediately saves 33% of the runtime.
  • Use an iterator for clarity and to avoid dereferencing everything, e.g., for trace in traceIds
  • Iterating over individual vector (or image) values in Python is something to avoid like the plague.

The code is clearer and faster (18.774 sec --> 6.077 sec) if written as:

                    indices = np.where(traceIds == traceIdsUnique[traceId])[0]
                    assert len(indices) == len(wLenTemp)
                    wLenTemp[:] = wavelengths[indices]

(The assert is there because it wasn't obvious to me that the two vectors should have the same length; I suggest checking the original assignment of wLenTemp. In fact, you could drop the assert, the original assignment of wLenTemp, and the [:] above, and it should work just fine, even a tad faster.)

Some additional thoughts:

  • fiberTrace files are being read using pyfits. Can we use lsst::afw::table instead? pyfits is notoriously slow.
  • Why is there a single fiberTrace for all time, rather than versioning it like a bias or flat?
Comment by price [ 03/Dec/16 ]

Oh, and I forgot to mention, constructDark is taking a long time hunting cosmics (190 of 224 sec), which seems excessive.

Comment by price [ 03/Dec/16 ]

Details are attached in profile.txt.

Comment by price [ 03/Dec/16 ]

Code changes made as part of this work:

price@price-laptop:~/PFS/pfs_pipe2d (tickets/PIPE2D-127=) $ git sub
commit 116be6508ce3f6f0f201fce6e458b279aadfaa3e
Author: Paul Price <price@astro.princeton.edu>
Date:   Thu Dec 1 15:06:37 2016 -0500

    add support for python profiling in integration test

 bin/pfs_integration_test.sh | 45 ++++++++++++++++++++++++++++++++++++---------
 1 file changed, 36 insertions(+), 9 deletions(-)

commit 79bd9dd1d94526023d5d07f25884ef221dc74bef
Author: Paul Price <price@astro.princeton.edu>
Date:   Fri Dec 2 08:46:07 2016 -0500

    travis: build with 2 cores

 bin/pfs_travis.sh | 2 ++
 1 file changed, 2 insertions(+)
Comment by price [ 03/Dec/16 ]

And:

price@price-laptop:~/PFS/obs_pfs (tickets/PIPE2D-127=) $ git sub
commit 2377391fc9c0875d8914eee93611b8154e7e2ebf
Author: Paul Price <price@astro.princeton.edu>
Date:   Thu Dec 1 17:19:25 2016 -0500

    ingest: clean up for style

 python/lsst/obs/pfs/ingest.py | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)
Comment by price [ 06/Dec/16 ]

rhl, could you please verify this is the level of detail you were after? I haven't yet made tickets to implement these suggestions. Would you like multiple tickets or one ticket to rule them all?

Comment by rhl [ 25/Feb/17 ]

That's useful. I don't know how swinbank wants to track this, but creating the tickets would probably be helpful

Comment by swinbank [ 25/Feb/17 ]

Please go ahead and create multiple tickets.

Comment by rhl [ 23/Jun/17 ]

I think we need to close this – you did the work. I'll open a new ticket to repeat the work with multiprocessing turned off (--batch-type None) and hope/pray that some of the hot spots have been eliminated.

Also, we should profile the runPipeline script.

Generated at Sat Feb 10 15:48:43 JST 2024 using Jira 8.3.4#803005-sha1:1f96e09b3c60279a408a2ae47be3c745f571388b.