Commit 3b04d75a authored by scoder, committed by GitHub

Merge pull request #2379 from gabrieldemarmiesse/test_string_5

Adding tests for "Unicode and passing strings" part 5
parents cbdca7c2 41dc52e6
from to_unicode cimport _ustring

def api_func(s):
    text = _ustring(s)
    # ...
# to_unicode.pyx
from cpython.version cimport PY_MAJOR_VERSION

cdef unicode _ustring(s):
    if type(s) is unicode:
        # fast path for most common case(s)
        return <unicode>s
    elif PY_MAJOR_VERSION < 3 and isinstance(s, bytes):
        # only accept byte strings in Python 2.x, not in Py3
        return (<bytes>s).decode('ascii')
    elif isinstance(s, unicode):
        # an evil cast to <unicode> might work here in some(!) cases,
        # depending on what the further processing does. to be safe,
        # we can always create a copy instead
        return unicode(s)
    else:
        raise TypeError("Could not convert to unicode.")
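
The helper above accepts byte strings only under Python 2 and copies `unicode` subclasses into plain `unicode` objects. As a rough illustration of the same normalisation logic, here is a plain Python 3 sketch. The names `to_text` and `MyStr` are hypothetical and not part of this commit, and the Python 2 byte-string branch is dropped, so `bytes` input is always rejected.

```python
def to_text(s):
    """Hypothetical Python 3 analogue of the _ustring() helper (illustration only)."""
    if type(s) is str:
        # fast path: an exact str instance is returned unchanged
        return s
    elif isinstance(s, str):
        # str subclass: build a plain str copy instead of passing the subclass on
        return str(s)
    else:
        # bytes and all other types are rejected, as on Python 3 in the original
        raise TypeError("Could not convert to unicode.")


class MyStr(str):
    """Hypothetical str subclass used to exercise the copy branch."""


if __name__ == "__main__":
    print(type(to_text(MyStr("abc"))).__name__)
```

Funnelling every public entry point through one such function keeps the accepted input types in a single place, which is the adaptation benefit the documentation text in the diff below refers to.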
@@ -236,30 +236,13 @@ way to go, since it allows for easy adaptation of the input normalisation
 process later.

 This kind of input normalisation function will commonly look similar to
-the following::
+the following:

-    from cpython.version cimport PY_MAJOR_VERSION
-
-    cdef unicode _ustring(s):
-        if type(s) is unicode:
-            # fast path for most common case(s)
-            return <unicode>s
-        elif PY_MAJOR_VERSION < 3 and isinstance(s, bytes):
-            # only accept byte strings in Python 2.x, not in Py3
-            return (<bytes>s).decode('ascii')
-        elif isinstance(s, unicode):
-            # an evil cast to <unicode> might work here in some(!) cases,
-            # depending on what the further processing does. to be safe,
-            # we can always create a copy instead
-            return unicode(s)
-        else:
-            raise TypeError(...)
-
-And should then be used like this::
-
-    def api_func(s):
-        text = _ustring(s)
-        ...
+.. literalinclude:: ../../examples/tutorial/string/to_unicode.pyx
+
+And should then be used like this:
+
+.. literalinclude:: ../../examples/tutorial/string/api_func.pyx

 Similarly, if the further processing happens at the byte level, but Unicode
 string input should be accepted, then the following might work, if you are
...