[PIPE2D-127] Profile the pipeline Created: 19/Nov/16 Updated: 23/Jun/17 Resolved: 23/Jun/17 |
|
| Status: | Done |
| Project: | DRP 2-D Pipeline |
| Component/s: | None |
| Affects Version/s: | None |
| Fix Version/s: | None |
| Type: | Story | Priority: | Major |
| Reporter: | price | Assignee: | price |
| Resolution: | Done | Votes: | 0 |
| Labels: | None | ||
| Remaining Estimate: | Not Specified | ||
| Time Spent: | Not Specified | ||
| Original Estimate: | Not Specified | ||
| Attachments: |
|
| Sprint: | 2014-16 |
| Reviewers: | rhl |
| Description |
|
Profile the pipeline to identify hotspots and low-hanging fruit. |
| Comments |
| Comment by price [ 03/Dec/16 ] |
|
I'll post the details soon, but here are the main results:

- General gains could be made by looking at ISR.
- pfs::drp::stella::findCenterPositionsOneTrace calls pfs::drp::stella::math::insertSorted, which in turn calls std::vector::insert. In general, you should never use std::vector::insert: as cplusplus.com notes, inserting anywhere other than the end forces relocation of all subsequent elements. Instead, push everything onto the end of the std::vector and run std::sort once at the end. Alternatively, use a list instead of an array, but that may have performance implications in other places.
- Cache the end() iterator in loops over ndarray (especially 2-D arrays, which have more complicated iterators), because that's showing up in the profiles.
- In ReduceArcTask, this snippet is using the lion's share of the runtime:

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   165  30412812     67429737      2.2     30.5  for j in range(traceIds.shape[0]):
   166  30412800     72818425      2.4     32.9      if traceIds[j] != l:
   167      7200        16710      2.3      0.0          l = traceIds[j]
   168  30412800     74841313      2.5     33.8      if traceIds[j] == traceIdsUnique[traceId]:
   169     50688       125401      2.5      0.1          wLenTemp[k] = wavelengths[j]
   170     50688       116528      2.3      0.1          k = k+1
The code is clearer and faster (18.774 sec --> 6.077 sec) if written as:
indices = np.where(traceIds == traceIdsUnique[traceId])[0]
assert len(indices) == len(wLenTemp)
wLenTemp[:] = wavelengths[indices]
(The assert is there because it wasn't obvious to me that the two vectors should have the same length; I suggest checking the original assignment of wLenTemp. In fact, you could drop the assert, the original assignment of wLenTemp, and the [:] above, and it should work just fine, even a tad faster.) Some additional thoughts:
|
| Comment by price [ 03/Dec/16 ] |
|
Oh, and I forgot to mention, constructDark is taking a long time hunting cosmics (190 of 224 sec), which seems excessive. |
| Comment by price [ 03/Dec/16 ] |
|
Details are attached in profile.txt. |
| Comment by price [ 03/Dec/16 ] |
|
Code changes made as part of this work:
price@price-laptop:~/PFS/pfs_pipe2d (tickets/PIPE2D-127=) $ git sub
commit 116be6508ce3f6f0f201fce6e458b279aadfaa3e
Author: Paul Price <price@astro.princeton.edu>
Date: Thu Dec 1 15:06:37 2016 -0500
add support for python profiling in integration test
bin/pfs_integration_test.sh | 45 ++++++++++++++++++++++++++++++++++++---------
1 file changed, 36 insertions(+), 9 deletions(-)
commit 79bd9dd1d94526023d5d07f25884ef221dc74bef
Author: Paul Price <price@astro.princeton.edu>
Date: Fri Dec 2 08:46:07 2016 -0500
travis: build with 2 cores
bin/pfs_travis.sh | 2 ++
1 file changed, 2 insertions(+)
|
| Comment by price [ 03/Dec/16 ] |
|
And:
price@price-laptop:~/PFS/obs_pfs (tickets/PIPE2D-127=) $ git sub
commit 2377391fc9c0875d8914eee93611b8154e7e2ebf
Author: Paul Price <price@astro.princeton.edu>
Date: Thu Dec 1 17:19:25 2016 -0500
ingest: clean up for style
python/lsst/obs/pfs/ingest.py | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
|
| Comment by price [ 06/Dec/16 ] |
|
rhl, could you please verify this is the level of detail you were after? I haven't yet made tickets to implement these suggestions. Would you like multiple tickets or one ticket to rule them all? |
| Comment by rhl [ 25/Feb/17 ] |
|
That's useful. I don't know how swinbank wants to track this, but creating the tickets would probably be worthwhile. |
| Comment by swinbank [ 25/Feb/17 ] |
|
Please go ahead and create multiple tickets. |
| Comment by rhl [ 23/Jun/17 ] |
|
I think we need to close this; you did the work. I'll open a new ticket to repeat the work with multiprocessing turned off (--batch-type None) and hope/pray that some of the hot spots have been eliminated. Also, we should profile the runPipeline script. |