Skip to content
Projects
Groups
Snippets
Help
Loading...
Help
Support
Keyboard shortcuts
?
Submit feedback
Contribute to GitLab
Sign in / Register
Toggle navigation
C
cython
Project overview
Project overview
Details
Activity
Releases
Repository
Repository
Files
Commits
Branches
Tags
Contributors
Graph
Compare
Labels
Merge Requests
0
Merge Requests
0
Analytics
Analytics
Repository
Value Stream
Snippets
Snippets
Members
Members
Collapse sidebar
Close sidebar
Activity
Graph
Commits
Open sidebar
nexedi
cython
Commits
fc37e45b
Commit
fc37e45b
authored
Mar 19, 2018
by
gabrieldemarmiesse
Browse files
Options
Browse Files
Download
Email Patches
Plain Diff
Finished v1.
parent
1d86d5fe
Changes
2
Hide whitespace changes
Inline
Side-by-side
Showing
2 changed files
with
64 additions
and
40 deletions
+64
-40
docs/examples/userguide/convolve_typed.pyx
docs/examples/userguide/convolve_typed.pyx
+1
-1
docs/src/userguide/numpy_tutorial.rst
docs/src/userguide/numpy_tutorial.rst
+63
-39
No files found.
docs/examples/userguide/convolve_typed.pyx
View file @
fc37e45b
...
...
@@ -40,7 +40,7 @@ def naive_convolve(f, g):
cdef
int
value
for
x
in
range
(
xmax
):
for
y
in
range
(
ymax
):
# Cython has built-in C functions for min and max
# Cython has built-in C functions for min and max
.
# This makes the following lines very fast.
s_from
=
max
(
smid
-
x
,
-
smid
)
s_to
=
min
((
xmax
-
x
)
-
smid
,
smid
+
1
)
...
...
docs/src/userguide/numpy_tutorial.rst
View file @
fc37e45b
...
...
@@ -163,9 +163,9 @@ run a Python session to test both the Python version (imported from
In [11]: N = 300
In [12]: f = np.arange(N*N, dtype=np.int).reshape((N,N))
In [13]: g = np.arange(81, dtype=np.int).reshape((9, 9))
In [19]: %timeit
-n2 -r3
convolve_py.naive_convolve(f, g)
In [19]: %timeit convolve_py.naive_convolve(f, g)
3.9 s ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [20]: %timeit
-n2 -r3
convolve_cy.naive_convolve(f, g)
In [20]: %timeit convolve_cy.naive_convolve(f, g)
3.12 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
There's not such a huge difference yet; because the C code still does exactly
...
...
@@ -201,11 +201,7 @@ After building this and continuing my (very informal) benchmarks, I get:
.. sourcecode:: ipython
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
3.9 s ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [20]: %timeit -n2 -r3 convolve_cy.naive_convolve(f, g)
3.12 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [22]: %timeit -n2 -r3 convolve_typed.naive_convolve(f, g)
In [22]: %timeit convolve_typed.naive_convolve(f, g)
13.8 s ± 122 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So in the end, adding types make the Cython code slower?
...
...
@@ -263,13 +259,7 @@ Let's see how much faster accessing is now.
.. sourcecode:: ipython
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
3.9 s ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [20]: %timeit -n2 -r3 convolve_cy.naive_convolve(f, g)
3.12 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [21]: %timeit -n2 -r3 convolve_typed.naive_convolve(f, g)
13.8 s ± 122 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [22]: %timeit -n2 -r3 convolve_memview.naive_convolve(f, g)
In [22]: %timeit convolve_memview.naive_convolve(f, g)
13.5 ms ± 455 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Note the importance of this change.
...
...
@@ -307,17 +297,11 @@ information.
.. sourcecode:: ipython
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
3.9 s ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [20]: %timeit -n2 -r3 convolve_cy.naive_convolve(f, g)
3.12 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [21]: %timeit -n2 -r3 convolve_typed.naive_convolve(f, g)
13.8 s ± 122 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [22]: %timeit -n2 -r3 convolve_memview.naive_convolve(f, g)
13.5 ms ± 455 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [23]: %timeit -n2 -r3 convolve_index.naive_convolve(f, g)
In [23]: %timeit convolve_index.naive_convolve(f, g)
7.57 ms ± 151 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now 515 times faster than the interpreted Python version.
.. Warning::
Speed comes with some cost. Especially it can be dangerous to set typed
...
...
@@ -356,43 +340,83 @@ get by declaring the memoryviews as contiguous:
.. sourcecode:: ipython
In [19]: %timeit -n2 -r3 convolve_py.naive_convolve(f, g)
3.9 s ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [20]: %timeit -n2 -r3 convolve_cy.naive_convolve(f, g)
3.12 s ± 15.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [21]: %timeit -n2 -r3 convolve_typed.naive_convolve(f, g)
13.8 s ± 122 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [22]: %timeit -n2 -r3 convolve_memview.naive_convolve(f, g)
13.5 ms ± 455 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [23]: %timeit -n2 -r3 convolve_index.naive_convolve(f, g)
7.57 ms ± 151 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [23]: %timeit -n2 -r3 convolve_contiguous.naive_convolve(f, g)
In [23]: %timeit convolve_contiguous.naive_convolve(f, g)
7.2 ms ± 40.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now 541 times faster than the interpreted Python version.
Making the function cleaner
===========================
Declaring types can make your code quite verbose. If you don't mind
Cython inferring the C types of your variables, you can use
the `infer_types=True` compiler directive. It will save you quite a bit
of typing.
the ``infer_types=True`` compiler directive at the top of the file.
It will save you quite a bit of typing.
Note that since type declarations must happen at the top indentation level,
Cython won't infer the type of variable declared for the first time
in other indentation levels. It would change too much the meaning of
our code. This is why, we must still declare manually the type of the
``value`` variable.
# explain here why value must be typed
And actually, manually giving the type of the ``value`` variable will
be useful when using fused types.
.. literalinclude:: ../../examples/userguide/convolve_infer_types.pyx
:linenos:
# explain here why it is faster.
We now do a speed test:
.. sourcecode:: ipython
In [24]: %timeit convolve_infer_types.naive_convolve(f, g)
5.33 ms ± 72.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now 731 times faster than the interpreted Python version.
# Explain the black magic of why it's faster.
More generic code
==================
# Explain here templated
All those speed gains are nice, but adding types constrains our code.
At the moment, it would mean that our function only work with
NumPy arrays with the ``np.intc`` type. Is it possible to make our
code work for multiple NumPy data types?
Yes, with the help of a new feature called fused types.
You can learn more about it at :ref:`this section of the documentation
<fusedtypes>`.
It is similar to C++ 's templates. It generates mutiple function declarations
at compile time, and then chooses the right one at run-time based on the
types of the arguments provided. It is also possible to check with
``if-else`` statements what is the value of the fused type.
In our example, since we don't have access anymore to the NumPy's dtype
of our input arrays, we use those ``if-else`` statements to
know what NumPy data type we should use for our output array.
In this case, our function now works for ints, doubles and floats.
.. literalinclude:: ../../examples/userguide/convolve_fused_types.pyx
:linenos:
We can check that the output type is the right one::
>>>naive_convolve_fused_types(f, g).dtype
dtype('int32')
>>>naive_convolve_fused_types(f.astype(np.double), g.astype(np.double)).dtype
dtype('float64')
We now do a speed test:
.. sourcecode:: ipython
In [25]: %timeit convolve_fused_types.naive_convolve(f, g)
5.08 ms ± 173 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
We're now 767 times faster than the interpreted Python version.
# Explain the black magic of why it's faster.
Where to go from here?
...
...
Write
Preview
Markdown
is supported
0%
Try again
or
attach a new file
Attach a file
Cancel
You are about to add
0
people
to the discussion. Proceed with caution.
Finish editing this message first!
Cancel
Please
register
or
sign in
to comment