Merge pull request #2389 from gabrieldemarmiesse/test_string_14

Adding tests for "Unicode and passing strings" part 14

Commit 87813c9f, authored Jun 22, 2018 by scoder; committed by GitHub Jun 22, 2018
4 changed files
--- a/docs/examples/tutorial/string/for_bytes.pyx
+++ b/docs/examples/tutorial/string/for_bytes.pyx
+cdef bytes bytes_string = b'hello world'
+
+cdef char c
+for c in bytes_string:
+    if c == 'A':
+        print("Found the letter A")
--- a/docs/examples/tutorial/string/for_unicode.pyx
+++ b/docs/examples/tutorial/string/for_unicode.pyx
+cdef unicode ustring = u'Hello world'
+
+# NOTE: no typing required for 'uchar' !
+for uchar in ustring:
+    if uchar == u'A':
+        print("Found the letter A")
--- a/docs/examples/tutorial/string/if_char_in.pyx
+++ b/docs/examples/tutorial/string/if_char_in.pyx
+cpdef void is_in(Py_UCS4 uchar_val):
+    if uchar_val in u'abcABCxY':
+        print("The character is in the string.")
+    else:
+        print("The character isn't in the string")
--- a/docs/src/tutorial/strings.rst
+++ b/docs/src/tutorial/strings.rst
@@ -620,22 +620,14 @@ C code::
    for c in c_string[:100]:
        if c == 'A': ...

-The same applies to bytes objects::
+The same applies to bytes objects:

-    cdef bytes bytes_string = ...
-
-    cdef char c
-    for c in bytes_string:
-        if c == 'A': ...
+.. literalinclude:: ../../examples/tutorial/string/for_bytes.pyx

 For unicode objects, Cython will automatically infer the type of the
-loop variable as :c:type:`Py_UCS4`::
-
-    cdef unicode ustring = ...
+loop variable as :c:type:`Py_UCS4`:

-    # NOTE: no typing required for 'uchar' !
-    for uchar in ustring:
-        if uchar == u'A': ...
+.. literalinclude:: ../../examples/tutorial/string/for_unicode.pyx

 The automatic type inference usually leads to much more efficient code
 here.  However, note that some unicode operations still require the
@@ -648,11 +640,9 @@ loop to enforce one-time coercion before running Python operations on
 it.

 There are also optimisations for ``in`` tests, so that the following
-code will run in plain C code, (actually using a switch statement)::
+code will run in plain C code, (actually using a switch statement):

-    cdef Py_UCS4 uchar_val = get_a_unicode_character()
-    if uchar_val in u'abcABCxY':
-        ...
+.. literalinclude:: ../../examples/tutorial/string/if_char_in.pyx

 Combined with the looping optimisation above, this can result in very
 efficient character switching code, e.g. in unicode parsers.
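
Note on the three new example files: a rough plain-Python analogue is sketched below to show the comparisons they perform. This is not the Cython code from the commit and has none of the C-level typing or optimisations described in strings.rst. The function names `bytes_contains_a` and `str_contains_a` are hypothetical, and `is_in` here returns a bool instead of printing. One difference worth keeping in mind: in plain Python 3, iterating over a bytes object yields ints, so the comparison has to be made against `ord("A")` rather than a character literal.

```python
# Plain-Python analogue of for_bytes.pyx, for_unicode.pyx and if_char_in.pyx.
# Illustration only: no cdef typing, no Py_UCS4 inference, no C switch statement.


def bytes_contains_a(data: bytes) -> bool:
    """Return True if the byte string contains the ASCII letter 'A'."""
    # In Python 3, iterating over bytes yields ints, so compare with ord().
    for c in data:
        if c == ord("A"):
            return True
    return False


def str_contains_a(text: str) -> bool:
    """Return True if the unicode string contains the letter 'A'."""
    # Iterating over str yields one-character strings.
    for uchar in text:
        if uchar == "A":
            return True
    return False


def is_in(uchar_val: str) -> bool:
    """Return True if a single character is one of 'abcABCxY'."""
    # Python's `in` on str is a substring test, so restrict it to one character
    # to mirror the single-character argument of the Cython example.
    return len(uchar_val) == 1 and uchar_val in "abcABCxY"
```

With the literals used in the commit (`b'hello world'` and `u'Hello world'`), neither loop finds an `A`, so the "Found the letter A" branch in the example files is not reached for those inputs.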