Kirill Smelkov / cython
Commit be803255, authored Jul 19, 2012 by Stefan Behnel
string docs: moved comments on C++ string into their own section
parent 6d9f94d4
Showing 1 changed file with 43 additions and 17 deletions
docs/src/tutorial/strings.rst
...
...
@@ -163,23 +163,6 @@ null bytes. Text encoded in UTF-8 or one of the ISO-8859 encodings is
usually a good candidate. If in doubt, it's better to pass indices
that are 'obviously' correct than to rely on the data to be as expected.
-When wrapping a C++ library, strings will usually come in the form of
-the :c:type:`std::string` class. Efficient decoding support is
-available in Cython 0.17 and later::
-
-    # distutils: language = c++
-
-    from libcpp.string cimport string
-
-    cdef string s = string('abcdefg')
-
-    ustring1 = s.decode('UTF-8')
-    ustring2 = s[2:-2].decode('UTF-8')
-
-For C++ strings, decoding slices will always take the proper length
-of the string into account and apply Python slicing semantics (e.g.
-return empty strings for out-of-bounds indices).
-
It is common practice to wrap string conversions (and non-trivial type
conversions in general) in dedicated functions, as this needs to be
done in exactly the same way whenever receiving text from C. This
...
...
@@ -231,6 +214,49 @@ assignment. Later access to the invalidated pointer will read invalid
memory and likely result in a segfault. Cython will therefore refuse
to compile this code.
+C++ strings
+-----------
+
+When wrapping a C++ library, strings will usually come in the form of
+the :c:type:`std::string` class. As with C strings, Python byte strings
+automatically coerce from and to C++ strings::
+
+    # distutils: language = c++
+
+    from libcpp.string cimport string
+
+    cdef string s = py_bytes_object
+    try:
+        s.append('abc')
+        py_bytes_object = s
+    finally:
+        del s
+
+The memory management situation is different than in C because the
+creation of a C++ string makes an independent copy of the string
+buffer which the string object then owns. It is therefore possible
+to convert temporarily created Python objects directly into C++
+strings. A common way to make use of this is when encoding a Python
+unicode string into a C++ string::
+
+    cdef string cpp_string = py_unicode_string.encode('UTF-8')
+
+Note that this involves a bit of overhead because it first encodes
+the Unicode string into a temporarily created Python bytes object
+and then copies its buffer into a new C++ string.
+
+For the other direction, efficient decoding support is available
+in Cython 0.17 and later::
+
+    cdef string s = string('abcdefg')
+
+    ustring1 = s.decode('UTF-8')
+    ustring2 = s[2:-2].decode('UTF-8')
+
+For C++ strings, decoding slices will always take the proper length
+of the string into account and apply Python slicing semantics (e.g.
+return empty strings for out-of-bounds indices).
+
Source code encoding
--------------------
...
...
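The behaviour described in the new "C++ strings" section can be approximated in plain Python, which may help when reading the diff without a Cython build at hand. The sketch below is only an analogue and not Cython code: a `bytearray` stands in for `std::string` (an assumption made for illustration), and the names `temp_bytes`, `cpp_like` and `out_of_bounds` are hypothetical. It shows the two-step encode-then-copy path, the independence of the copy, and the slicing semantics that the decoding paragraph refers to.

```python
# Plain-Python analogue only: bytes plays the Python byte string and
# bytearray plays the C++ std::string. No Cython or C++ is involved.

py_unicode_string = u'abcdefg'

# Step 1: encoding creates a temporary bytes object.
temp_bytes = py_unicode_string.encode('UTF-8')

# Step 2: its buffer is copied into an independent, owned container.
cpp_like = bytearray(temp_bytes)
cpp_like.extend(b'abc')  # analogous to s.append('abc') in the diff

# The copy can change without affecting the bytes object it came from.
print(temp_bytes)
print(bytes(cpp_like))

# Decoding a whole value and a slice, following Python slicing semantics.
ustring1 = temp_bytes.decode('UTF-8')
ustring2 = temp_bytes[2:-2].decode('UTF-8')

# Indices past the end give an empty string rather than an error.
out_of_bounds = temp_bytes[10:20].decode('UTF-8')
print(ustring1, ustring2, repr(out_of_bounds))
```

In real Cython code the copy in step 2 happens implicitly on assignment to a `cdef string` variable, as the diff states; the explicit `bytearray(...)` call here only makes that step visible.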