.. _using_c_libraries:

******************
Using C libraries
******************

Apart from writing fast code, one of the main use cases of Cython is
to call external C libraries from Python code.  As Cython code
compiles down to C code itself, it is actually trivial to call C
functions directly in the code.  The following gives a complete
example for using (and wrapping) an external C library in Cython code,
including appropriate error handling and considerations about
designing a suitable API for Python and Cython code.

Imagine you need an efficient way to store integer values in a FIFO
queue.  Since memory really matters, and the values are actually
coming from C code, you cannot afford to create and store Python
``int`` objects in a list or deque.  So you look out for a queue
implementation in C.

After some web search, you find the C-algorithms library [CAlg]_ and
decide to use its double ended queue implementation.  To make the
handling easier, however, you decide to wrap it in a Python extension
type that can encapsulate all memory management.

.. [CAlg] Simon Howard, C Algorithms library, http://c-algorithms.sourceforge.net/


Defining external declarations
==============================

You can download CAlg `here <https://codeload.github.com/fragglet/c-algorithms/zip/master>`_.

The C API of the queue implementation, which is defined in the header
file ``c-algorithms/src/queue.h``, essentially looks like this:

.. literalinclude:: ../../examples/tutorial/clibraries/c-algorithms/src/queue.h
    :language: C

To get started, the first step is to redefine the C API in a ``.pxd``
file, say, ``cqueue.pxd``:

.. literalinclude:: ../../examples/tutorial/clibraries/cqueue.pxd

Note how these declarations are almost identical to the header file
declarations, so you can often just copy them over.  However, you do
not need to provide *all* declarations as above, just those that you
use in your code or in other declarations, so that Cython gets to see
a sufficient and consistent subset of them.  Then, consider adapting
them somewhat to make them more comfortable to work with in Cython.

Specifically, you should take care of choosing good argument names
for the C functions, as Cython allows you to pass them as keyword
arguments.  Changing them later on is a backwards incompatible API
modification.  Choosing good names right away will make these
functions more pleasant to work with from Cython code.

One noteworthy difference to the header file that we use above is the
declaration of the ``Queue`` struct in the first line.  ``Queue`` is
in this case used as an *opaque handle*; only the library that is
called knows what is really inside.  Since no Cython code needs to
know the contents of the struct, we do not need to declare its
contents, so we simply provide an empty definition (as we do not want
to declare the ``_Queue`` type which is referenced in the C header)
[#]_.

.. [#] There's a subtle difference between ``cdef struct Queue: pass``
       and ``ctypedef struct Queue: pass``.  The former declares a
       type which is referenced in C code as ``struct Queue``, while
       the latter is referenced in C as ``Queue``.  This is a C
       language quirk that Cython is not able to hide.  Most modern C
       libraries use the ``ctypedef`` kind of struct.

Another exception is the last line.  The integer return value of the
``queue_is_empty()`` function is actually a C boolean value, i.e. the
only interesting thing about it is whether it is non-zero or zero,
indicating if the queue is empty or not.  This is best expressed by
Cython's ``bint`` type, which is a normal ``int`` type when used in C
but maps to Python's boolean values ``True`` and ``False`` when
converted to a Python object.  This way of tightening declarations in
a ``.pxd`` file can often simplify the code that uses them.

It is good practice to define one ``.pxd`` file for each library that
you use, and sometimes even for each header file (or functional group)
if the API is large.  That simplifies their reuse in other projects.
Sometimes, you may need to use C functions from the standard C
library, or want to call C-API functions from CPython directly.  For
common needs like this, Cython ships with a set of standard ``.pxd``
files that provide these declarations in a readily usable way that is
adapted to their use in Cython.  The main packages are ``cpython``,
``libc`` and ``libcpp``.  The NumPy library also has a standard
``.pxd`` file ``numpy``, as it is often used in Cython code.  See
Cython's ``Cython/Includes/`` source package for a complete list of
provided ``.pxd`` files.


Writing a wrapper class
=======================

After declaring our C library's API, we can start to design the Queue
class that should wrap the C queue.  It will live in a file called
``queue.pyx``. [#]_

.. [#] Note that the name of the ``.pyx`` file must be different from
       the ``cqueue.pxd`` file with declarations from the C library,
       as both do not describe the same code.  A ``.pxd`` file next to
       a ``.pyx`` file with the same name defines exported
       declarations for code in the ``.pyx`` file.  As the
       ``cqueue.pxd`` file contains declarations of a regular C
       library, there must not be a ``.pyx`` file with the same name
       that Cython associates with it.

Here is a first start for the Queue class:

.. literalinclude:: ../../examples/tutorial/clibraries/queue.pyx

Note that it says ``__cinit__`` rather than ``__init__``.  While
``__init__`` is available as well, it is not guaranteed to be run (for
instance, one could create a subclass and forget to call the
ancestor's constructor).  Because not initializing C pointers often
leads to hard crashes of the Python interpreter, Cython provides
``__cinit__`` which is *always* called immediately on construction,
before CPython even considers calling ``__init__``, and which
therefore is the right place to initialise ``cdef`` fields of the new
instance.  However, as ``__cinit__`` is called during object
construction, ``self`` is not fully constructed yet, and one must
avoid doing anything with ``self`` but assigning to ``cdef`` fields.

Note also that the above method takes no parameters, although subtypes
may want to accept some.  A no-arguments ``__cinit__()`` method is a
special case here that simply does not receive any parameters that
were passed to a constructor, so it does not prevent subclasses from
adding parameters.  If parameters are used in the signature of
``__cinit__()``, they must match those of any declared ``__init__``
method of classes in the class hierarchy that are used to instantiate
the type.


Memory management
=================

Before we continue implementing the other methods, it is important to
understand that the above implementation is not safe.  In case
anything goes wrong in the call to ``queue_new()``, this code will
simply swallow the error, so we will likely run into a crash later on.
According to the documentation of the ``queue_new()`` function, the
only reason why the above can fail is due to insufficient memory.  In
that case, it will return ``NULL``, whereas it would normally return a
pointer to the new queue.

The Python way to get out of this is to raise a ``MemoryError`` [#]_.
We can thus change the init function as follows:

.. literalinclude:: ../../examples/tutorial/clibraries/queue2.pyx

.. [#] In the specific case of a ``MemoryError``, creating a new
   exception instance in order to raise it may actually fail because
   we are running out of memory.  Luckily, CPython provides a C-API
   function ``PyErr_NoMemory()`` that safely raises the right
   exception for us.  Cython automatically
   substitutes this C-API call whenever you write ``raise
   MemoryError`` or ``raise MemoryError()``.  If you use an older
   version of Cython, you have to cimport the C-API function from the
   standard package ``cpython.exc`` and call it directly.

The next thing to do is to clean up when the Queue instance is no
longer used (i.e. all references to it have been deleted).  To this
end, CPython provides a callback that Cython makes available as a
special method ``__dealloc__()``.  In our case, all we have to do is
to free the C Queue, but only if we succeeded in initialising it in
the init method::

        def __dealloc__(self):
            if self._c_queue is not NULL:
                cqueue.queue_free(self._c_queue)


Compiling and linking
=====================

At this point, we have a working Cython module that we can test.  To
compile it, we need to configure a ``setup.py`` script for setuptools.
Here is the most basic script for compiling a Cython module::

    from setuptools import Extension, setup
    from Cython.Build import cythonize

    setup(
        ext_modules = cythonize([Extension("queue", ["queue.pyx"])])
    )

To build against the external C library, we need to make sure Cython finds the necessary libraries.
There are two ways to achieve this. First, we can tell setuptools where to find
the C source to compile the :file:`queue.c` implementation automatically. Alternatively,
we can build and install C-Alg as a system library and link it dynamically. The latter is useful
if other applications also use C-Alg.


Static Linking
---------------

To build the C code automatically, we need to include compiler directives in ``queue.pyx``::

    # distutils: sources = c-algorithms/src/queue.c
    # distutils: include_dirs = c-algorithms/src/

    cimport cqueue

    cdef class Queue:
        cdef cqueue.Queue* _c_queue
        def __cinit__(self):
            self._c_queue = cqueue.queue_new()
            if self._c_queue is NULL:
                raise MemoryError()

        def __dealloc__(self):
            if self._c_queue is not NULL:
                cqueue.queue_free(self._c_queue)

The ``sources`` compiler directive gives the path of the C
files that the build system is going to compile and
link (statically) into the resulting extension module.
In general, all relevant header files should be found in ``include_dirs``.
Now we can build the project using::

    $ python setup.py build_ext -i

And test whether our build was successful::

    $ python -c 'import queue; Q = queue.Queue()'


Dynamic Linking
---------------

Dynamic linking is useful if the library we are going to wrap is already
installed on the system. To perform dynamic linking, we first need to
build and install C-Alg.

To build c-algorithms on your system::

    $ cd c-algorithms
    $ sh autogen.sh
    $ ./configure
    $ make

To install CAlg, run::

    $ make install

Afterwards the file :file:`/usr/local/lib/libcalg.so` should exist.

.. note::

    This path applies to Linux systems and may be different on other platforms,
    so you will need to adapt the rest of the tutorial depending on the path
    where ``libcalg.so`` or ``libcalg.dll`` is on your system.

In this approach, we need to tell the setup script to link with an external library.
To do so, we need to change the extension setup in the setup script from

::

    ext_modules = cythonize([Extension("queue", ["queue.pyx"])])

to

::

    ext_modules = cythonize([
        Extension("queue", ["queue.pyx"],
                  libraries=["calg"])
        ])

Now we should be able to build the project using::

    $ python setup.py build_ext -i

If the `libcalg` is not installed in a 'normal' location, users can provide the
required parameters externally by passing appropriate C compiler
flags, such as::

    CFLAGS="-I/usr/local/otherdir/calg/include"  \
    LDFLAGS="-L/usr/local/otherdir/calg/lib"     \
        python setup.py build_ext -i



Before we run the module, we also need to make sure that `libcalg` is in
the `LD_LIBRARY_PATH` environment variable, e.g. by setting::

   $ export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

Once we have compiled the module for the first time, we can now import
it and instantiate a new Queue::

    $ export PYTHONPATH=.
    $ python -c 'import queue; Q = queue.Queue()'

However, this is all our Queue class can do so far, so let's make it
more usable.


Mapping functionality
---------------------

Before implementing the public interface of this class, it is good
practice to look at what interfaces Python offers, e.g. in its
``list`` or ``collections.deque`` classes.  Since we only need a FIFO
queue, it's enough to provide the methods ``append()``, ``peek()`` and
``pop()``, and additionally an ``extend()`` method to add multiple
values at once.  Also, since we already know that all values will be
coming from C, it's best to provide only ``cdef`` methods for now, and
to give them a straight C interface.
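
For orientation, the FIFO behaviour that the wrapper is meant to mirror can
be sketched in plain Python with ``collections.deque``.  Note that ``deque``
has no ``peek()`` method, so the ``peek()`` helper below is a hypothetical
stand-in that simply reads index 0; it is an illustration, not part of the
wrapper.

.. code-block:: python

    from collections import deque


    def peek(queue):
        """Hypothetical helper: return the head without removing it."""
        if not queue:
            raise IndexError("Queue is empty")
        return queue[0]


    fifo = deque()
    fifo.append(1)          # add a single value at the tail
    fifo.extend([2, 3])     # add several values at once
    head = peek(fifo)       # read-only access to the head
    first = fifo.popleft()  # destructive read: removes the head

The wrapper's ``pop()`` corresponds to ``popleft()`` here, since values
leave the queue at the head.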

In C, it is common for data structures to store data as a ``void*`` to
whatever data item type.  Since we only want to store ``int`` values,
which usually fit into the size of a pointer type, we can avoid
additional memory allocations through a trick: we cast our ``int`` values
to ``void*`` and vice versa, and store the value directly as the
pointer value.

Here is a simple implementation for the ``append()`` method::

        cdef append(self, int value):
            cqueue.queue_push_tail(self._c_queue, <void*>value)

Again, the same error handling considerations as for the
``__cinit__()`` method apply, so that we end up with this
implementation instead::

        cdef append(self, int value):
            if not cqueue.queue_push_tail(self._c_queue,
                                          <void*>value):
                raise MemoryError()

Adding an ``extend()`` method should now be straightforward::

    cdef extend(self, int* values, size_t count):
        """Append all ints to the queue.
        """
        cdef int value
        for value in values[:count]:  # Slicing pointer to limit the iteration boundaries.
            self.append(value)

This becomes handy when reading values from a C array, for example.

So far, we can only add data to the queue.  The next step is to write
the two methods to get the first element: ``peek()`` and ``pop()``,
which provide read-only and destructive read access respectively.
To avoid compiler warnings when casting ``void*`` to ``int`` directly,
we use an intermediate data type that is big enough to hold a ``void*``.
Here, ``Py_ssize_t``::

    cdef int peek(self):
        return <Py_ssize_t>cqueue.queue_peek_head(self._c_queue)

    cdef int pop(self):
        return <Py_ssize_t>cqueue.queue_pop_head(self._c_queue)

Normally, in C, we risk losing data when we convert a larger integer type
to a smaller integer type without checking the boundaries, and ``Py_ssize_t``
may be a larger type than ``int``.  But since we control how values are added
to the queue, we already know that all values that are in the queue fit into
an ``int``, so the above conversion from ``void*`` to ``Py_ssize_t`` to ``int``
(the return type) is safe by design.
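
The round trip from ``int`` to ``void*`` and back can be tried out from
plain Python with ``ctypes``, which serves here only as a stand-in for the C
casts.  This is a minimal sketch; the helper names ``int_to_pointer()`` and
``pointer_to_int()`` are hypothetical and do not appear in the wrapper.

.. code-block:: python

    import ctypes


    def int_to_pointer(value):
        """Store a C int directly in a pointer-sized slot, like ``<void*>value``."""
        return ctypes.c_void_p(ctypes.c_int(value).value)


    def pointer_to_int(pointer):
        """Recover the int in two steps: void* -> Py_ssize_t -> int."""
        raw = pointer.value or 0                   # ctypes reports NULL as None
        as_ssize_t = ctypes.c_ssize_t(raw).value   # pointer-sized signed integer
        return ctypes.c_int(as_ssize_t).value      # narrow to the int we stored

Because every value that enters through ``int_to_pointer()`` started out as
a C ``int``, the final narrowing step cannot lose information.  The sketch
also shows why a stored ``0`` is indistinguishable from a ``NULL`` pointer.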


Handling errors
---------------

Now, what happens when the queue is empty?  According to the
documentation, the functions return a ``NULL`` pointer, which is
typically not a valid value.  But since we are simply casting to and
from ints, we cannot distinguish anymore if the return value was
``NULL`` because the queue was empty or because the value stored in
the queue was ``0``.  In Cython code, we want the first case to
raise an exception, whereas the second case should simply return
``0``.  To deal with this, we need to special case this value,
and check if the queue really is empty or not::

    cdef int peek(self) except? -1:
        cdef int value = <Py_ssize_t>cqueue.queue_peek_head(self._c_queue)
        if value == 0:
            # this may mean that the queue is empty, or
            # that it happens to contain a 0 value
            if cqueue.queue_is_empty(self._c_queue):
                raise IndexError("Queue is empty")
        return value

Note how we have effectively created a fast path through the method in
the hopefully common cases that the return value is not ``0``.  Only
that specific case needs an additional check if the queue is empty.

The ``except? -1`` declaration in the method signature falls into the
same category.  If the function was a Python function returning a
Python object value, CPython would simply return ``NULL`` internally
instead of a Python object to indicate an exception, which would
immediately be propagated by the surrounding code.  The problem is
that the return type is ``int`` and any ``int`` value is a valid queue
item value, so there is no way to explicitly signal an error to the
calling code.  In fact, without such a declaration, there is no
obvious way for Cython to know what to return on exceptions and for
calling code to even know that this method *may* exit with an
exception.

The only way calling code can deal with this situation is to call
``PyErr_Occurred()`` when returning from a function to check if an
exception was raised, and if so, propagate the exception.  This
obviously has a performance penalty.  Cython therefore allows you to
declare which value it should implicitly return in the case of an
exception, so that the surrounding code only needs to check for an
exception when receiving this exact value.

We chose to use ``-1`` as the exception return value as we expect it
to be an unlikely value to be put into the queue.  The question mark
in the ``except? -1`` declaration indicates that the return value is
ambiguous (there *may* be a ``-1`` value in the queue, after all) and
that an additional exception check using ``PyErr_Occurred()`` is
needed in calling code.  Without it, Cython code that calls this
method and receives the exception return value would silently (and
sometimes incorrectly) assume that an exception has been raised.  In
any case, all other return values will be passed through almost
without a penalty, thus again creating a fast path for 'normal'
values.
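
The protocol behind ``except? -1`` can be imitated in plain Python to make
the fast path visible.  This is only a conceptual sketch: the dictionary
``_error_state`` stands in for CPython's error indicator, and the function
names are hypothetical.  Cython generates the equivalent checks in C.

.. code-block:: python

    # Stand-in for CPython's error indicator (an assumption of this sketch).
    _error_state = {"pending": None}


    def c_level_peek(items):
        """Mimics a ``cdef int peek(self) except? -1`` function."""
        if not items:
            _error_state["pending"] = IndexError("Queue is empty")
            return -1              # the declared exception return value
        return items[0]


    def checked_peek(items):
        """Mimics the check that calling code performs after the call."""
        value = c_level_peek(items)
        # Only the value -1 triggers the extra check, which plays the
        # role of PyErr_Occurred() here.
        if value == -1 and _error_state["pending"] is not None:
            error = _error_state["pending"]
            _error_state["pending"] = None
            raise error
        return value               # any other value, or a genuine -1

A queue that really contains ``-1`` still works, because the error state is
consulted before an exception is assumed.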

Now that the ``peek()`` method is implemented, the ``pop()`` method
also needs adaptation.  Since it removes a value from the queue,
however, it is not enough to test if the queue is empty *after* the
removal.  Instead, we must test it on entry::

    cdef int pop(self) except? -1:
        if cqueue.queue_is_empty(self._c_queue):
            raise IndexError("Queue is empty")
        return <Py_ssize_t>cqueue.queue_pop_head(self._c_queue)

The return value for exception propagation is declared exactly as for
``peek()``.

Lastly, we can provide the Queue with an emptiness indicator in the
normal Python way by implementing the ``__bool__()`` special method
(note that Python 2 calls this method ``__nonzero__``, whereas Cython
code can use either name)::

    def __bool__(self):
        return not cqueue.queue_is_empty(self._c_queue)

Note that this method returns either ``True`` or ``False`` as we
declared the return type of the ``queue_is_empty()`` function as
``bint`` in ``cqueue.pxd``.


Testing the result
------------------

Now that the implementation is complete, you may want to write some
tests for it to make sure it works correctly.  Especially doctests are
very nice for this purpose, as they provide some documentation at the
same time.  To enable doctests, however, you need a Python API that
you can call.  C methods are not visible from Python code, and thus
not callable from doctests.
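
To give an idea of what such doctests could look like, here is a minimal
sketch against a pure-Python stand-in class.  ``QueueStandIn`` and
``run_doctests()`` are hypothetical names; the stand-in only borrows the
method names of the wrapper, so the same doctest style could be applied once
the wrapper has a Python API.

.. code-block:: python

    import doctest
    from collections import deque


    class QueueStandIn:
        """Pure-Python stand-in with the same method names as the wrapper.

        >>> q = QueueStandIn()
        >>> bool(q)
        False
        >>> q.append(5)
        >>> q.extend([6, 7])
        >>> q.peek()
        5
        >>> q.pop()
        5
        >>> bool(q)
        True
        """

        def __init__(self):
            self._items = deque()

        def append(self, value):
            self._items.append(value)

        def extend(self, values):
            for value in values:
                self.append(value)

        def peek(self):
            if not self._items:
                raise IndexError("Queue is empty")
            return self._items[0]

        def pop(self):
            if not self._items:
                raise IndexError("Queue is empty")
            return self._items.popleft()

        def __bool__(self):
            return bool(self._items)


    def run_doctests():
        """Run the class docstring examples; return (failed, attempted)."""
        parser = doctest.DocTestParser()
        test = parser.get_doctest(QueueStandIn.__doc__,
                                  {"QueueStandIn": QueueStandIn},
                                  "QueueStandIn", None, 0)
        results = doctest.DocTestRunner(verbose=False).run(test)
        return results.failed, results.attempted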

A quick way to provide a Python API for the class is to change the
methods from ``cdef`` to ``cpdef``.  This will let Cython generate two
entry points, one that is callable from normal Python code using the
Python call semantics and Python objects as arguments, and one that is
callable from C code with fast C semantics and without requiring
intermediate argument conversion from or to Python types. Note that ``cpdef``
methods ensure that they can be appropriately overridden by Python
methods even when they are called from Cython. This adds a tiny overhead
compared to ``cdef`` methods.

Now that we have both a C-interface and a Python interface for our
class, we should make sure that both interfaces are consistent.
Python users would expect an ``extend()`` method that accepts arbitrary
iterables, whereas C users would like to have one that allows passing
C arrays and C memory.  Both signatures are incompatible.

We will solve this issue by considering that in C, the API could also
want to support other input types, e.g. arrays of ``long`` or ``char``,
which is usually supported with differently named C API functions such as
``extend_ints()``, ``extend_longs()``, ``extend_chars()``, etc.  This allows
us to free the method name ``extend()`` for the duck typed Python method,
which can accept arbitrary iterables.

The following listing shows the complete implementation that uses
``cpdef`` methods where possible:

.. literalinclude:: ../../examples/tutorial/clibraries/queue3.pyx

Now we can test our Queue implementation using a Python script,
for example :file:`test_queue.py`:

.. literalinclude:: ../../examples/tutorial/clibraries/test_queue.py

As a quick test with 10000 numbers on the author's machine indicates,
using this Queue from Cython code with C ``int`` values is about five
times as fast as using it from Cython code with Python object values,
almost eight times faster than using it from Python code in a Python
loop, and still more than twice as fast as using Python's highly
optimised ``collections.deque`` type from Cython code with Python
integers.


Callbacks
---------

Let's say you want to provide a way for users to pop values from the
queue until a certain user-defined event occurs.  To this end, you
want to allow them to pass a predicate function that determines when
to stop, e.g.::

    def pop_until(self, predicate):
        while not predicate(self.peek()):
            self.pop()

Now, let us assume for the sake of argument that the C queue
provides such a function that takes a C callback function as
predicate.  The API could look as follows::

    /* C type of a predicate function that takes a queue value and returns
     * -1 for errors
     *  0 for reject
     *  1 for accept
     */
    typedef int (*predicate_func)(void* user_context, QueueValue data);

    /* Pop values as long as the predicate evaluates to true for them,
     * returns -1 if the predicate failed with an error and 0 otherwise.
     */
    int queue_pop_head_until(Queue *queue, predicate_func predicate,
                             void* user_context);

It is normal for C callback functions to have a generic :c:type:`void*`
argument that allows passing any kind of context or state through the
C-API into the callback function.  We will use this to pass our Python
predicate function.

First, we have to define a callback function with the expected
signature that we can pass into the C-API function::

    cdef int evaluate_predicate(void* context, cqueue.QueueValue value):
        "Callback function that can be passed as predicate_func"
        try:
            # recover Python function object from void* argument
            func = <object>context
            # call function, convert result into 0/1 for True/False
            return bool(func(<int>value))
        except:
            # catch any Python errors and return error indicator
            return -1

The main idea is to pass a pointer (a.k.a. borrowed reference) to the
function object as the user context argument. We will call the C-API
function as follows::

    def pop_until(self, python_predicate_function):
        result = cqueue.queue_pop_head_until(
            self._c_queue, evaluate_predicate,
            <void*>python_predicate_function)
        if result == -1:
            raise RuntimeError("an error occurred")

The usual pattern is to first cast the Python object reference into
a :c:type:`void*` to pass it into the C-API function, and then cast
it back into a Python object in the C predicate callback function.
The cast to :c:type:`void*` creates a borrowed reference.  On the cast
to ``<object>``, Cython increments the reference count of the object
and thus converts the borrowed reference back into an owned reference.
At the end of the predicate function, the owned reference goes out
of scope again and Cython discards it.

The error handling in the code above is a bit simplistic. Specifically,
any exceptions that the predicate function raises will essentially be
discarded and only result in a plain ``RuntimeError()`` being raised
after the fact.  This can be improved by storing away the exception
in an object passed through the context parameter and re-raising it
after the C-API function has returned ``-1`` to indicate the error.
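
The idea of storing the exception in the context can be sketched in plain
Python.  Everything here is an assumption made for illustration: the
``PredicateContext`` class is hypothetical, and
``fake_queue_pop_head_until()`` is a pure-Python stand-in for the assumed C
function, following the semantics given in its comment above.

.. code-block:: python

    from collections import deque


    class PredicateContext:
        """Hypothetical context object passed through ``user_context``."""

        def __init__(self, func):
            self.func = func     # the Python predicate
            self.error = None    # exception raised by the predicate, if any


    def evaluate_predicate(context, value):
        """Callback: returns 1/0 for accept/reject and -1 on error."""
        try:
            return 1 if context.func(value) else 0
        except BaseException as exc:
            context.error = exc  # store the exception instead of discarding it
            return -1


    def fake_queue_pop_head_until(queue, predicate, user_context):
        """Pure-Python stand-in for the assumed C function."""
        while queue:
            result = predicate(user_context, queue[0])
            if result == -1:
                return -1        # the predicate failed
            if result == 0:
                break            # the predicate rejected the head value
            queue.popleft()
        return 0


    def pop_until(queue, python_predicate_function):
        context = PredicateContext(python_predicate_function)
        result = fake_queue_pop_head_until(
            queue, evaluate_predicate, context)
        if result == -1:
            raise context.error  # re-raise the original exception

In the Cython version, the context object would be cast to ``void*`` and
back in the same way as the predicate function above, and it must stay
referenced by the caller for the duration of the C call.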