Following up on my task to make it easier to benchmark memory usage in Python, I updated Fabian’s memory_profiler to include a couple of useful IPython magics. While in my last post I used the new IPython 0.13 syntax for defining magics, this time I used the backwards-compatible API from earlier versions.
You can find this work in progress as a pull request on memory_profiler, from where you can trace it back to my GitHub repo. Here’s what you can do with it:
Copying the spirit of
%lprun, since imitation is the most sincere form of flattery, you can use %mprun to easily view line-by-line memory usage reports, without having to go in and add the @profile decorator by hand:
```
In : import numpy as np

In : from sklearn.linear_model import ridge_regression

In : X, y = np.array([[1, 2], [3, 4], [5, 6]]), np.array([2, 4, 6])

In : %mprun -f ridge_regression ridge_regression(X, y, 1.0)
(...)
   109  41.6406 MB   0.0000 MB       if n_features > n_samples or \
   110  41.6406 MB   0.0000 MB               isinstance(sample_weight, np.ndarray) or \
   111  41.6406 MB   0.0000 MB               sample_weight != 1.0:
   112
   113                                    # kernel ridge
   114                                    # w = X.T * inv(X X^t + alpha*Id) y
   115                                    A = np.dot(X, X.T)
   116                                    A.flat[::n_samples + 1] += alpha * sample_weight
   117                                    coef = np.dot(X.T, _solve(A, y, solver, tol))
   118                                else:
   119                                    # ridge
   120                                    # w = inv(X^t X + alpha*Id) * X.T y
   121  41.6484 MB   0.0078 MB           A = np.dot(X.T, X)
   122  41.6875 MB   0.0391 MB           A.flat[::n_features + 1] += alpha
   123  41.7344 MB   0.0469 MB           coef = _solve(A, np.dot(X.T, y), solver, tol)
   124
   125  41.7344 MB   0.0000 MB       return coef.T
```
As described in my previous post, this is a
%timeit-like magic for quickly seeing how much memory a Python command uses.
Unlike %timeit, however, the command needs to be executed in a fresh process. I still have to dig in and debug this, but if the command is run in the current process, the measured difference in memory usage is very often insignificant, presumably because preallocated memory gets reused. The downside of running in a new process is that some of the functions I tried to benchmark crash with
SIGSEGV. For a lot of stuff, though,
%memit is currently usable:
```
In : import numpy as np

In : X = np.ones((1000, 1000))

In : %memit X.T
worst of 3: 0.242188 MB per loop

In : %memit np.asfortranarray(X)
worst of 3: 15.687500 MB per loop

In : Y = X.copy('F')

In : %memit np.asfortranarray(Y)
worst of 3: 0.324219 MB per loop
```
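The fresh-process approach described above can be sketched with only the standard library. Here `memit_sketch` is a hypothetical helper, not part of memory_profiler, and note that `ru_maxrss` units are platform-dependent (kilobytes on Linux, bytes on macOS):

```python
import resource
from multiprocessing import Process, Queue

def _child(stmt, q):
    # Runs in a fresh process, so the parent's preallocated
    # memory cannot mask the allocation being measured.
    ns = {}
    exec(stmt, ns)
    # Peak resident set size of this child process.
    q.put(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)

def memit_sketch(stmt):
    """Hypothetical helper: peak RSS after running stmt in a child process."""
    q = Queue()
    p = Process(target=_child, args=(stmt, q))
    p.start()
    result = q.get()
    p.join()
    return result

peak = memit_sketch("x = [0] * 10**6")
print(peak)  # peak RSS of the child; kilobytes on Linux
```

This only reports a peak, not the per-statement delta that %memit computes, but it shows why a child process is needed: the child starts from a clean memory baseline.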
It is very easy, using this small tool, to see what forces memory copying and what does not.
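The same conclusions can be cross-checked directly in NumPy, without measuring memory at all: `np.shares_memory` (available since NumPy 1.11) reports whether an operation returned a view or a copy. A small sketch:

```python
import numpy as np

X = np.ones((1000, 1000))

# X.T is a view: no data is copied, matching the tiny %memit figure.
assert np.shares_memory(X, X.T)

# asfortranarray on a C-ordered array has to copy (~8 MB here).
assert not np.shares_memory(X, np.asfortranarray(X))

# If the array is already Fortran-ordered, no copy is made.
Y = X.copy('F')
assert np.shares_memory(Y, np.asfortranarray(Y))
```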
First, you have to get the source code of this version of memory_profiler. The rest depends on your version of IPython. If you have 0.10, you have to edit
~/.ipython/ipy_user_conf.py like this: (once again, instructions borrowed from line_profiler)
```python
# These two lines are standard and probably already there.
import IPython.ipapi
ip = IPython.ipapi.get()

# These two are the important ones.
import memory_profiler
ip.expose_magic('mprun', memory_profiler.magic_mprun)
ip.expose_magic('memit', memory_profiler.magic_memit)
```
If you’re using IPython 0.11 or newer, the steps are different. First create a configuration profile:
$ ipython profile create
Then create a file named
~/.ipython/extensions/memory_profiler_ext.py with the following content:
```python
import memory_profiler

def load_ipython_extension(ip):
    ip.define_magic('mprun', memory_profiler.magic_mprun)
    ip.define_magic('memit', memory_profiler.magic_memit)
```
Then register it in
~/.ipython/profile_default/ipython_config.py, like this. Of course, if you already have other extensions such as
line_profiler_ext, just add the new one to the list.
```python
c.TerminalIPythonApp.extensions = [
    'memory_profiler_ext',
]
c.InteractiveShellApp.extensions = [
    'memory_profiler_ext',
]
```
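Alternatively, if you'd rather not touch the configuration file, IPython 0.11 and newer can load the extension on demand inside a session:

```
In : %load_ext memory_profiler_ext
```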
Now launch IPython and you can use the new magics as in the examples above.