[mus422] Coder runtime

Marina Bosi mbosi at stanford.edu
Tue Feb 23 10:17:48 PST 2010


I have heard from several of you that your coders are running very slowly.
I tested the coder framework on a few SQAM samples of about 20 seconds
each and found that it encoded/decoded each sample in roughly 2 minutes.
If your coder is running much more slowly than this, you should probably
run the Python profiler on your code to find and remove the bottlenecks
slowing it down.

 

Profiling is pretty easy in Python and will quickly show you where your
coder is spending most of its time. (The answer is sometimes quite
surprising.)  To run the profiler, you import the library cProfile and
then pass a string containing the Python code you'd like profiled to
cProfile.run().  For example (from the Python documentation): 

  _____  

To profile an application with a main entry point of foo(), you would add
the following to your module:

import cProfile
cProfile.run('foo()')

(Use profile instead of cProfile if the latter is not available on your
system.)

  _____  
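
As a side note, the default profiler output can be long, so it often helps
to sort it.  Here is a small self-contained sketch (foo() is just a
stand-in workload, not part of the coder framework) that uses cProfile
together with the standard pstats module to print the top entries sorted
by cumulative time:

```python
import cProfile
import io
import pstats

def foo():
    # Stand-in workload (hypothetical example): sum of squares
    return sum(i * i for i in range(100000))

# Collect a profile of foo() ...
pr = cProfile.Profile()
pr.enable()
result = foo()
pr.disable()

# ... then print the 5 most expensive entries, sorted by cumulative time
buf = io.StringIO()
pstats.Stats(pr, stream=buf).sort_stats('cumulative').print_stats(5)
print(buf.getvalue())
```

Sorting by 'cumulative' brings the functions where your coder actually
spends its time to the top, which is usually what you want to see first.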

 

It is my impression that some of you are still including too many loops in
your code.  To illustrate how important it is to vectorize any long loops,
I've enclosed a simple quantization example that profiles 3 different ways
of implementing uniform midrise quantization on a vector of input signal
values.  (To keep it simple, I've ignored the sign bit and am only passing
unsigned signal values between 0.0 and 1.0.  However, I think even this
simplified example will give you an idea of what to look for in your
code.)  

 

The first case loops over signal values and quantizes each one.  The
second case does the same through a list comprehension (i.e. code of the
form "[ f(x) for x in input_vec]").  Finally, the third case does a real
vectorized calculation (as can be seen by the fact that there is no Python
code looping over vector elements).  
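
In case the attachment does not come through for you, here is roughly what
the three cases look like for 8-bit quantization of unsigned values in
[0.0, 1.0].  (This is a simplified sketch, not the attached file itself;
only the names Q1, Q2, and Q3 match the profiler output below.)

```python
import numpy as np

NBITS = 8
LEVELS = 1 << NBITS  # 256 quantization levels

def Q1(x):
    # Case 1: explicit Python loop over samples
    out = []
    for v in x:
        code = int(v * LEVELS)
        if code > LEVELS - 1:   # clip v == 1.0 into the top bin
            code = LEVELS - 1
        out.append(code)
    return out

def Q2(x):
    # Case 2: list comprehension -- still a Python-level loop
    return [min(int(v * LEVELS), LEVELS - 1) for v in x]

def Q3(x):
    # Case 3: vectorized NumPy -- no Python code loops over elements
    return np.minimum((x * LEVELS).astype(np.int64), LEVELS - 1)
```

All three return the same codes; only Q3 hands the per-element work to
compiled NumPy routines.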

 

I have seen several of you do list comprehensions on the quantization
homework where vectorizing would have been better.  The profiler will tell
us how much of a runtime penalty comes from doing a list comprehension
instead of vectorizing.  Here is what the profiler output looks like when
running this example on my machine:

  _____  

        410602 function calls in 5.638 CPU seconds

 

   Ordered by: standard name

 

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.009    0.009    5.638    5.638 <string>:2(<module>)
      200    2.881    0.014    3.698    0.018 vectorizing and profiling example.py:11(Q1)
      200    1.892    0.009    1.892    0.009 vectorizing and profiling example.py:18(Q2)
      200    0.007    0.000    0.012    0.000 vectorizing and profiling example.py:23(Q3)
   409600    0.816    0.000    0.816    0.000 {method 'append' of 'list' objects}
      200    0.005    0.000    0.005    0.000 {method 'astype' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
      200    0.028    0.000    0.028    0.000 {method 'uniform' of 'mtrand.RandomState' objects}

  _____  

In this run, the Q1 function doing a loop spent a cumulative time of 3.698
s running (2.881 s of time in the function itself and 0.816 s in the
append function adding a new item to the current list).  The Q2 function
doing list comprehension took 1.892 s of runtime - about half the time of
the loop code.  Finally, the vectorized Q3 version took 0.012 s -- over
100 times faster than the list comprehension version.  Vectorizing pays
off!
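
You can reproduce this gap on your own machine without the full example.
Here is a quick timeit sketch (the array size and bit depth are arbitrary
choices for illustration) comparing a list-comprehension quantizer against
the vectorized equivalent:

```python
import timeit
import numpy as np

# Random unsigned signal values in [0.0, 1.0), as in the example
x = np.random.RandomState(0).uniform(0.0, 1.0, 2048)

# List comprehension: a Python-level loop, one int() call per sample
t_listcomp = timeit.timeit(
    lambda: [min(int(v * 256), 255) for v in x], number=200)

# Vectorized: the same quantization done by compiled NumPy routines
t_vector = timeit.timeit(
    lambda: np.minimum((x * 256).astype(np.int64), 255), number=200)

print('list comp: %.3f s   vectorized: %.3f s' % (t_listcomp, t_vector))
```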

 

Best,

Marina  

 

 

Marina Bosi

Consulting Professor, Department of Music

Stanford University

 

Center for Computer Research in Music and Acoustics

The Knoll,  660 Lomita Court

Stanford, California 94305-8180, USA

http://ccrma.stanford.edu

mbosi at stanford.edu

 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: vectorizing and profiling example.py
Type: application/octet-stream
Size: 1696 bytes
Desc: not available
Url : http://cm-mail.stanford.edu/pipermail/422/attachments/20100223/680a73e0/attachment.obj 
