  1. 13 Jan, 2018 1 commit
  2. 11 Feb, 2017 1 commit
  3. 02 Sep, 2015 1 commit
  4. 08 Aug, 2015 1 commit
  5. 26 Jul, 2015 1 commit
  6. 25 Jul, 2015 1 commit
  7. 17 Jun, 2014 1 commit
  8. 15 Mar, 2013 1 commit
  9. 07 Mar, 2013 1 commit
  10. 06 Mar, 2013 1 commit
  11. 05 Mar, 2013 1 commit
  12. 03 Mar, 2013 2 commits
  13. 10 Jan, 2013 2 commits
  14. 06 Jan, 2013 1 commit
  15. 03 Oct, 2011 1 commit
  16. 12 Jan, 2011 1 commit
  17. 04 Sep, 2010 1 commit
  18. 09 Aug, 2010 1 commit
  19. 03 Jul, 2010 2 commits
  20. 12 Feb, 2010 1 commit
  21. 05 Feb, 2010 1 commit
  22. 10 Oct, 2009 1 commit
  23. 21 Aug, 2009 3 commits
  24. 08 Jul, 2009 1 commit
  25. 06 Jul, 2009 3 commits
  26. 05 Jul, 2009 2 commits
  27. 16 Aug, 2008 1 commit
  28. 15 Aug, 2008 2 commits
      Rewrite of the string literal handling code · 2e8a0084
      Stefan Behnel authored
      String literals pass through the compiler as follows:
      - unicode string literals are stored as unicode strings and encoded to UTF-8 on the way out
      - byte string literals are stored as correctly encoded byte strings by unescaping the source string literal into the corresponding byte sequence. No further encoding is done later on!
      - char literals are stored as byte strings of length 1. This can be verified by the parser now, e.g. a non-ASCII char literal in UTF-8 source code will result in an error, as it would end up as two or more bytes in the C code, which can no longer be represented as a C char.
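The char literal rule above can be sketched in Python. This is only an illustration, not the parser code from this commit: `char_literal_to_byte` and its `source_encoding` parameter are hypothetical names, and the sketch handles only unescaped characters.

```python
def char_literal_to_byte(literal_text, source_encoding="utf-8"):
    """Hypothetical helper: turn the text of a char literal into one byte.

    Raises ValueError if the character needs more than one byte in the
    source encoding, since it could then no longer fit into a C char.
    """
    encoded = literal_text.encode(source_encoding)
    if len(encoded) != 1:
        raise ValueError(
            "char literal %r encodes to %d bytes, expected exactly 1"
            % (literal_text, len(encoded))
        )
    return encoded


# An ASCII character fits into a single byte.
ascii_result = char_literal_to_byte("A")

# A non-ASCII character in UTF-8 source needs two bytes and is rejected.
try:
    char_literal_to_byte("\u00e9")
    non_ascii_rejected = False
except ValueError:
    non_ascii_rejected = True
```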
      
      Storing byte strings is necessary as we would otherwise lose the ability to encode byte string literals on the way out. They do not necessarily contain only bytes that fit into the source code encoding, as the source can use escape sequences to represent them. Previously, ASCII encoded source code could not contain byte string literals with properly escaped non-ASCII bytes.
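As a rough sketch of what unescaping a source literal into its byte sequence means, the following Python function handles `\xHH` escapes only. The name `unescape_byte_literal` is hypothetical and this is not the code from this commit. A real implementation would also need to cover octal escapes, `\n`, escaped backslashes and so on.

```python
import re

# Matches a two-digit hex escape such as \xff in the literal's source text.
_HEX_ESCAPE = re.compile(r"\\x([0-9a-fA-F]{2})")


def unescape_byte_literal(source_text, source_encoding="ascii"):
    """Hypothetical helper: convert literal source text to its byte sequence.

    Plain characters are encoded with the source encoding, while each
    \\xHH escape is stored directly as the byte value it names.
    """
    result = bytearray()
    pos = 0
    for match in _HEX_ESCAPE.finditer(source_text):
        # Encode the plain text that precedes this escape.
        result += source_text[pos:match.start()].encode(source_encoding)
        # Store the escaped value as a raw byte, with no further encoding.
        result.append(int(match.group(1), 16))
        pos = match.end()
    result += source_text[pos:].encode(source_encoding)
    return bytes(result)


# ASCII-only source text can still describe non-ASCII bytes.
literal_bytes = unescape_byte_literal(r"abc\xff\x00")
```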
      
      Another bug that was fixed: in Python, escape sequences behave differently in unicode strings (where they represent the character code) and byte strings (where they represent a byte value). Previously, they resulted in the same byte value in Cython code. This is only a problem for non-ASCII escapes, since the character code and the byte value of ASCII characters are identical.
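The difference between the two kinds of escapes can be seen in plain Python 3. This is a small illustration, not a Cython test case, and the variable names are chosen only for this example.

```python
# In a unicode literal, \xe9 names the character with code point U+00E9.
text = u"\xe9"
# In a byte string literal, \xe9 names the single byte 0xE9.
data = b"\xe9"

# Encoding the unicode string to UTF-8 yields two bytes, not one.
utf8_text = text.encode("utf-8")

# For ASCII escapes, the character code and the byte value coincide.
ascii_text = u"\x41".encode("utf-8")
ascii_data = b"\x41"
```

Here `utf8_text` is `b"\xc3\xa9"` while `data` is the single byte `b"\xe9"`, so treating both escapes as the same byte value gives a wrong result for one of them. The ASCII pair is byte-for-byte identical.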