pcre-8.36

Commit 722dc78d authored Nov 20, 2014 by Sergei Golubchik
50 changed files
--- a/pcre/ChangeLog
+++ b/pcre/ChangeLog
 ChangeLog for PCRE
 ------------------

+Version 8.36 26-September-2014
+------------------------------
+
+1.  Got rid of some compiler warnings in the C++ modules that were shown up by
+    -Wmissing-field-initializers and -Wunused-parameter.
+
+2.  The tests for quantifiers being too big (greater than 65535) were being
+    applied after reading the number, and stupidly assuming that integer
+    overflow would give a negative number. The tests are now applied as the
+    numbers are read.
+
+3.  Tidy code in pcre_exec.c where two branches that used to be different are
+    now the same.
+
+4.  The JIT compiler did not generate match limit checks for certain
+    bracketed expressions with quantifiers. This may lead to exponential
+    backtracking, instead of returning with PCRE_ERROR_MATCHLIMIT. This
+    issue should be resolved now.
+
+5.  Fixed an issue that occurs when nested alternatives are optimized
+    with table jumps.
+
+6.  Inserted two casts and changed some ints to size_t in the light of some
+    reported 64-bit compiler warnings (Bugzilla 1477).
+
+7.  Fixed a bug concerned with zero-minimum possessive groups that could match
+    an empty string, which sometimes were behaving incorrectly in the
+    interpreter (though correctly in the JIT matcher). This pcretest input is
+    an example:
+
+      '\A(?:[^"]++|"(?:[^"]*+|"")*+")++'
+      NON QUOTED "QUOT""ED" AFTER "NOT MATCHED
+
+    the interpreter was reporting a match of 'NON QUOTED ' only, whereas the
+    JIT matcher and Perl both matched 'NON QUOTED "QUOT""ED" AFTER '. The test
+    for an empty string was breaking the inner loop and carrying on at a lower
+    level, when possessive repeated groups should always return to a higher
+    level as they have no backtrack points in them. The empty string test now
+    occurs at the outer level.
+
+8.  Fixed a bug that was incorrectly auto-possessifying \w+ in the pattern
+    ^\w+(?>\s*)(?<=\w) which caused it not to match "test test".
+
+9.  Give a compile-time error for \o{} (as Perl does) and for \x{} (which Perl
+    doesn't).
+
+10. Change 8.34/15 introduced a bug that caused the amount of memory needed
+    to hold a pattern to be incorrectly computed (too small) when there were
+    named back references to duplicated names. This could cause "internal
+    error: code overflow" or "double free or corruption" or other memory
+    handling errors.
+
+11. When named subpatterns had the same prefixes, back references could be
+    confused. For example, in this pattern:
+
+      /(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
+
+    the reference to 'Name' was incorrectly treated as a reference to a
+    duplicate name.
+
+12. A pattern such as /^s?c/mi8 where the optional character has more than
+    one "other case" was incorrectly compiled such that it would only try to
+    match starting at "c".
+
+13. When a pattern starting with \s was studied, VT was not included in the
+    list of possible starting characters; this should have been part of the
+    8.34/18 patch.
+
+14. If a character class started [\Qx]... where x is any character, the class
+    was incorrectly terminated at the ].
+
+15. If a pattern that started with a caseless match for a character with more
+    than one "other case" was studied, PCRE did not set up the starting code
+    unit bit map for the list of possible characters. Now it does. This is an
+    optimization improvement, not a bug fix.
+
+16. The Unicode data tables have been updated to Unicode 7.0.0.
+
+17. Fixed a number of memory leaks in pcregrep.
+
+18. Avoid a compiler warning (from some compilers) for a function call with
+    a cast that removes "const" from an lvalue by using an intermediate
+    variable (to which the compiler does not object).
+
+19. Incorrect code was compiled if a group that contained an internal recursive
+    back reference was optional (had quantifier with a minimum of zero). This
+    example compiled incorrect code: /(((a\2)|(a*)\g<-1>))*/ and other examples
+    caused segmentation faults because of stack overflows at compile time.
+
+20. A pattern such as /((?(R)a|(?1)))+/, which contains a recursion within a
+    group that is quantified with an indefinite repeat, caused a compile-time
+    loop which used up all the system stack and provoked a segmentation fault.
+    This was not the same bug as 19 above.
+
+21. Add PCRECPP_EXP_DECL declaration to operator<< in pcre_stringpiece.h.
+    Patch by Mike Frysinger.
+
+
 Version 8.35 04-April-2014
 --------------------------

@@ -27,9 +125,9 @@ Version 8.35 04-April-2014

 6.  Improve character range checks in JIT. Characters are read by an imprecise
    function now, which returns with an unknown value if the character code is
-    above a certain treshold (e.g: 256). The only limitation is that the value
-    must be bigger than the treshold as well. This function is useful, when
-    the characters above the treshold are handled in the same way.
+    above a certain threshold (e.g: 256). The only limitation is that the value
+    must be bigger than the threshold as well. This function is useful when
+    the characters above the threshold are handled in the same way.

 7.  The macros whose names start with RAWUCHAR are placeholders for a future
    mode in which only the bottom 21 bits of 32-bit data items are used. To

--- a/pcre/NEWS
+++ b/pcre/NEWS
 News about PCRE releases
 ------------------------

+Release 8.36 26-September-2014
+------------------------------
+
+This is primarily a bug-fix release. However, in addition, the Unicode data
+tables have been updated to Unicode 7.0.0.
+
+
 Release 8.35 04-April-2014
 --------------------------


--- a/pcre/README
+++ b/pcre/README
@@ -45,14 +45,16 @@ the 16-bit library, which processes strings of 16-bit values, and one for the
 32-bit library, which processes strings of 32-bit values. The distribution also
 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
-C++.
+C++. Other C++ wrappers have been created from time to time. See, for example:
+https://github.com/YasserAsmi/regexp, which aims to be simple and similar in
+style to the C API.

-In addition, there is a set of C wrapper functions (again, just for the 8-bit
-library) that are based on the POSIX regular expression API (see the pcreposix
-man page). These end up in the library called libpcreposix. Note that this just
-provides a POSIX calling interface to PCRE; the regular expressions themselves
-still follow Perl syntax and semantics. The POSIX API is restricted, and does
-not give full access to all of PCRE's facilities.
+The distribution also contains a set of C wrapper functions (again, just for
+the 8-bit library) that are based on the POSIX regular expression API (see the
+pcreposix man page). These end up in the library called libpcreposix. Note that
+this just provides a POSIX calling interface to PCRE; the regular expressions
+themselves still follow Perl syntax and semantics. The POSIX API is restricted,
+and does not give full access to all of PCRE's facilities.

 The header file for the POSIX-style functions is called pcreposix.h. The
 official POSIX name is regex.h, but I did not want to risk possible problems
@@ -988,4 +990,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 17 January 2014
+Last updated: 24 October 2014
--- a/pcre/configure.ac
+++ b/pcre/configure.ac
@@ -9,19 +9,19 @@ dnl The PCRE_PRERELEASE feature is for identifying release candidates. It might
 dnl be defined as -RC2, for example. For real releases, it should be empty.

 m4_define(pcre_major, [8])
-m4_define(pcre_minor, [35])
+m4_define(pcre_minor, [36])
 m4_define(pcre_prerelease, [])
-m4_define(pcre_date, [2014-04-04])
+m4_define(pcre_date, [2014-09-26])

 # NOTE: The CMakeLists.txt file searches for the above variables in the first
 # 50 lines of this file. Please update that if the variables above are moved.

 # Libtool shared library interface versions (current:revision:age)
-m4_define(libpcre_version, [3:3:2])
-m4_define(libpcre16_version, [2:3:2])
-m4_define(libpcre32_version, [0:3:0])
-m4_define(libpcreposix_version, [0:2:0])
-m4_define(libpcrecpp_version, [0:0:0])
+m4_define(libpcre_version, [3:4:2])
+m4_define(libpcre16_version, [2:4:2])
+m4_define(libpcre32_version, [0:4:0])
+m4_define(libpcreposix_version, [0:3:0])
+m4_define(libpcrecpp_version, [0:1:0])

 AC_PREREQ(2.57)
 AC_INIT(PCRE, pcre_major.pcre_minor[]pcre_prerelease, , pcre)
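
The libtool triples above are `current:revision:age`. In this commit only `revision` goes up by one for each library, which by libtool convention signals a changed implementation with an unchanged interface. As a rough illustration, and assuming the file naming that libtool conventionally uses on Linux/ELF systems (`current - age`, then `age`, then `revision`), a triple can be mapped to a shared-object version suffix. `struct so_version` and `libtool_to_linux` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Components of a Linux-style suffix, e.g. libfoo.so.MAJOR.MINOR.PATCH */
struct so_version {
    int major;
    int minor;
    int patch;
};

/* Hypothetical helper: maps libtool current:revision:age to the suffix
   conventionally produced on Linux/ELF. Returns 0 on success, -1 if the
   triple is not valid (negative fields, or age greater than current). */
int libtool_to_linux(int current, int revision, int age, struct so_version *out)
{
    assert(out != NULL);

    if (current < 0 || revision < 0 || age < 0 || age > current) return -1;

    out->major = current - age;  /* oldest interface still supported */
    out->minor = age;            /* number of older interfaces supported */
    out->patch = revision;       /* implementation revision */
    return 0;
}
```

Under that assumption, `3:4:2` for libpcre corresponds to the suffix `1.2.4`, and `0:1:0` for libpcrecpp to `0.0.1`.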

--- a/pcre/doc/html/README.txt
+++ b/pcre/doc/html/README.txt
@@ -45,14 +45,16 @@ the 16-bit library, which processes strings of 16-bit values, and one for the
 32-bit library, which processes strings of 32-bit values. The distribution also
 includes a set of C++ wrapper functions (see the pcrecpp man page for details),
 courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
-C++.
+C++. Other C++ wrappers have been created from time to time. See, for example:
+https://github.com/YasserAsmi/regexp, which aims to be simple and similar in
+style to the C API.

-In addition, there is a set of C wrapper functions (again, just for the 8-bit
-library) that are based on the POSIX regular expression API (see the pcreposix
-man page). These end up in the library called libpcreposix. Note that this just
-provides a POSIX calling interface to PCRE; the regular expressions themselves
-still follow Perl syntax and semantics. The POSIX API is restricted, and does
-not give full access to all of PCRE's facilities.
+The distribution also contains a set of C wrapper functions (again, just for
+the 8-bit library) that are based on the POSIX regular expression API (see the
+pcreposix man page). These end up in the library called libpcreposix. Note that
+this just provides a POSIX calling interface to PCRE; the regular expressions
+themselves still follow Perl syntax and semantics. The POSIX API is restricted,
+and does not give full access to all of PCRE's facilities.

 The header file for the POSIX-style functions is called pcreposix.h. The
 official POSIX name is regex.h, but I did not want to risk possible problems
@@ -988,4 +990,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 17 January 2014
+Last updated: 24 October 2014
--- a/pcre/doc/html/pcre_config.html
+++ b/pcre/doc/html/pcre_config.html
@@ -39,8 +39,10 @@ arguments are as follows:
  <i>where</i>    Points to where to put the data
 </pre>
 The <i>where</i> argument must point to an integer variable, except for
-PCRE_CONFIG_MATCH_LIMIT and PCRE_CONFIG_MATCH_LIMIT_RECURSION, when it must
-point to an unsigned long integer. The available codes are:
+PCRE_CONFIG_MATCH_LIMIT, PCRE_CONFIG_MATCH_LIMIT_RECURSION, and
+PCRE_CONFIG_PARENS_LIMIT, when it must point to an unsigned long integer,
+and for PCRE_CONFIG_JITTARGET, when it must point to a const char*.
+The available codes are:
 <pre>
  PCRE_CONFIG_JIT           Availability of just-in-time compiler
                              support (1=yes 0=no)

--- a/pcre/doc/html/pcre_fullinfo.html
+++ b/pcre/doc/html/pcre_fullinfo.html
@@ -57,6 +57,10 @@ The following information is available:
  PCRE_INFO_JITSIZE         Size of JIT compiled code
  PCRE_INFO_LASTLITERAL     Literal last data unit required
  PCRE_INFO_MINLENGTH       Lower bound length of matching strings
+  PCRE_INFO_MATCHEMPTY      Return 1 if the pattern can match an empty string,
+                               0 otherwise
+  PCRE_INFO_MATCHLIMIT      Match limit if set, otherwise PCRE_ERROR_UNSET
+  PCRE_INFO_MAXLOOKBEHIND   Length (in characters) of the longest lookbehind assertion
  PCRE_INFO_NAMECOUNT       Number of named subpatterns
  PCRE_INFO_NAMEENTRYSIZE   Size of name table entry
  PCRE_INFO_NAMETABLE       Pointer to name table
@@ -72,6 +76,7 @@ The following information is available:
                                  2 if the first character is at the start of the data
                                    string or after a newline, and
                                  0 otherwise
+  PCRE_INFO_RECURSIONLIMIT    Recursion limit if set, otherwise PCRE_ERROR_UNSET
  PCRE_INFO_REQUIREDCHAR      Literal last data unit required
  PCRE_INFO_REQUIREDCHARFLAGS Returns 1 if the last data character is set (which can then
                              be retrieved using PCRE_INFO_REQUIREDCHAR); 0 otherwise
@@ -79,14 +84,18 @@ The following information is available:
 The <i>where</i> argument must point to an integer variable, except for the
 following <i>what</i> values:
 <pre>
-  PCRE_INFO_DEFAULT_TABLES  const unsigned char *
-  PCRE_INFO_FIRSTTABLE      const unsigned char *
+  PCRE_INFO_DEFAULT_TABLES  const uint8_t *
+  PCRE_INFO_FIRSTCHARACTER  uint32_t
+  PCRE_INFO_FIRSTTABLE      const uint8_t *
+  PCRE_INFO_JITSIZE         size_t
+  PCRE_INFO_MATCHLIMIT      uint32_t
  PCRE_INFO_NAMETABLE       PCRE_SPTR16           (16-bit library)
  PCRE_INFO_NAMETABLE       PCRE_SPTR32           (32-bit library)
  PCRE_INFO_NAMETABLE       const unsigned char * (8-bit library)
  PCRE_INFO_OPTIONS         unsigned long int
  PCRE_INFO_SIZE            size_t
-  PCRE_INFO_FIRSTCHARACTER  uint32_t
+  PCRE_INFO_STUDYSIZE       size_t
+  PCRE_INFO_RECURSIONLIMIT  uint32_t
  PCRE_INFO_REQUIREDCHAR    uint32_t
 </pre>
 The yield of the function is zero on success or:
@@ -95,6 +104,7 @@ The yield of the function is zero on success or:
                            the argument <i>where</i> was NULL
  PCRE_ERROR_BADMAGIC       the "magic number" was not found
  PCRE_ERROR_BADOPTION      the value of <i>what</i> was invalid
+  PCRE_ERROR_UNSET          the option was not set
 </PRE>
 </P>
 <P>

--- a/pcre/doc/html/pcrepattern.html
+++ b/pcre/doc/html/pcrepattern.html
@@ -703,6 +703,7 @@ Armenian,
 Avestan,
 Balinese,
 Bamum,
+Bassa_Vah,
 Batak,
 Bengali,
 Bopomofo,
@@ -712,6 +713,7 @@ Buginese,
 Buhid,
 Canadian_Aboriginal,
 Carian,
+Caucasian_Albanian,
 Chakma,
 Cham,
 Cherokee,
@@ -722,11 +724,14 @@ Cypriot,
 Cyrillic,
 Deseret,
 Devanagari,
+Duployan,
 Egyptian_Hieroglyphs,
+Elbasan,
 Ethiopic,
 Georgian,
 Glagolitic,
 Gothic,
+Grantha,
 Greek,
 Gujarati,
 Gurmukhi,
@@ -746,40 +751,56 @@ Katakana,
 Kayah_Li,
 Kharoshthi,
 Khmer,
+Khojki,
+Khudawadi,
 Lao,
 Latin,
 Lepcha,
 Limbu,
+Linear_A,
 Linear_B,
 Lisu,
 Lycian,
 Lydian,
+Mahajani,
 Malayalam,
 Mandaic,
+Manichaean,
 Meetei_Mayek,
+Mende_Kikakui,
 Meroitic_Cursive,
 Meroitic_Hieroglyphs,
 Miao,
+Modi,
 Mongolian,
+Mro,
 Myanmar,
+Nabataean,
 New_Tai_Lue,
 Nko,
 Ogham,
+Ol_Chiki,
 Old_Italic,
+Old_North_Arabian,
+Old_Permic,
 Old_Persian,
 Old_South_Arabian,
 Old_Turkic,
-Ol_Chiki,
 Oriya,
 Osmanya,
+Pahawh_Hmong,
+Palmyrene,
+Pau_Cin_Hau,
 Phags_Pa,
 Phoenician,
+Psalter_Pahlavi,
 Rejang,
 Runic,
 Samaritan,
 Saurashtra,
 Sharada,
 Shavian,
+Siddham,
 Sinhala,
 Sora_Sompeng,
 Sundanese,
@@ -797,8 +818,10 @@ Thaana,
 Thai,
 Tibetan,
 Tifinagh,
+Tirhuta,
 Ugaritic,
 Vai,
+Warang_Citi,
 Yi.
 </P>
 <P>

--- a/pcre/doc/html/pcresyntax.html
+++ b/pcre/doc/html/pcresyntax.html
@@ -171,6 +171,7 @@ Armenian,
 Avestan,
 Balinese,
 Bamum,
+Bassa_Vah,
 Batak,
 Bengali,
 Bopomofo,
@@ -180,6 +181,7 @@ Buginese,
 Buhid,
 Canadian_Aboriginal,
 Carian,
+Caucasian_Albanian,
 Chakma,
 Cham,
 Cherokee,
@@ -190,11 +192,14 @@ Cypriot,
 Cyrillic,
 Deseret,
 Devanagari,
+Duployan,
 Egyptian_Hieroglyphs,
+Elbasan,
 Ethiopic,
 Georgian,
 Glagolitic,
 Gothic,
+Grantha,
 Greek,
 Gujarati,
 Gurmukhi,
@@ -214,40 +219,56 @@ Katakana,
 Kayah_Li,
 Kharoshthi,
 Khmer,
+Khojki,
+Khudawadi,
 Lao,
 Latin,
 Lepcha,
 Limbu,
+Linear_A,
 Linear_B,
 Lisu,
 Lycian,
 Lydian,
+Mahajani,
 Malayalam,
 Mandaic,
+Manichaean,
 Meetei_Mayek,
+Mende_Kikakui,
 Meroitic_Cursive,
 Meroitic_Hieroglyphs,
 Miao,
+Modi,
 Mongolian,
+Mro,
 Myanmar,
+Nabataean,
 New_Tai_Lue,
 Nko,
 Ogham,
+Ol_Chiki,
 Old_Italic,
+Old_North_Arabian,
+Old_Permic,
 Old_Persian,
 Old_South_Arabian,
 Old_Turkic,
-Ol_Chiki,
 Oriya,
 Osmanya,
+Pahawh_Hmong,
+Palmyrene,
+Pau_Cin_Hau,
 Phags_Pa,
 Phoenician,
+Psalter_Pahlavi,
 Rejang,
 Runic,
 Samaritan,
 Saurashtra,
 Sharada,
 Shavian,
+Siddham,
 Sinhala,
 Sora_Sompeng,
 Sundanese,
@@ -265,8 +286,10 @@ Thaana,
 Thai,
 Tibetan,
 Tifinagh,
+Tirhuta,
 Ugaritic,
 Vai,
+Warang_Citi,
 Yi.
 </P>
 <br><a name="SEC8" href="#TOC1">CHARACTER CLASSES</a><br>

--- a/pcre/doc/pcre.txt
+++ b/pcre/doc/pcre.txt
@@ -5326,21 +5326,25 @@ BACKSLASH
       Those  that are not part of an identified script are lumped together as
       "Common". The current list of scripts is:

-       Arabic, Armenian, Avestan, Balinese, Bamum, Batak,  Bengali,  Bopomofo,
-       Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma,
-       Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic,  Deseret,
-       Devanagari,   Egyptian_Hieroglyphs,   Ethiopic,  Georgian,  Glagolitic,
-       Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew,  Hira-
-       gana,   Imperial_Aramaic,  Inherited,  Inscriptional_Pahlavi,  Inscrip-
-       tional_Parthian,  Javanese,  Kaithi,   Kannada,   Katakana,   Kayah_Li,
-       Kharoshthi,  Khmer,  Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian,
-       Lydian,    Malayalam,    Mandaic,    Meetei_Mayek,    Meroitic_Cursive,
-       Meroitic_Hieroglyphs,   Miao,  Mongolian,  Myanmar,  New_Tai_Lue,  Nko,
-       Ogham,   Old_Italic,   Old_Persian,   Old_South_Arabian,    Old_Turkic,
-       Ol_Chiki,  Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samari-
-       tan, Saurashtra, Sharada, Shavian,  Sinhala,  Sora_Sompeng,  Sundanese,
-       Syloti_Nagri,  Syriac,  Tagalog,  Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet,
-       Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh,  Ugaritic,  Vai,
+       Arabic, Armenian, Avestan, Balinese, Bamum, Bassa_Vah, Batak,  Bengali,
+       Bopomofo,  Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Car-
+       ian, Caucasian_Albanian, Chakma, Cham, Cherokee, Common, Coptic, Cunei-
+       form, Cypriot, Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hiero-
+       glyphs,  Elbasan,  Ethiopic,  Georgian,  Glagolitic,  Gothic,  Grantha,
+       Greek,  Gujarati,  Gurmukhi,  Han,  Hangul,  Hanunoo, Hebrew, Hiragana,
+       Imperial_Aramaic,    Inherited,     Inscriptional_Pahlavi,     Inscrip-
+       tional_Parthian,   Javanese,   Kaithi,   Kannada,  Katakana,  Kayah_Li,
+       Kharoshthi, Khmer, Khojki, Khudawadi, Lao, Latin, Lepcha,  Limbu,  Lin-
+       ear_A,  Linear_B,  Lisu,  Lycian, Lydian, Mahajani, Malayalam, Mandaic,
+       Manichaean,     Meetei_Mayek,     Mende_Kikakui,      Meroitic_Cursive,
+       Meroitic_Hieroglyphs,  Miao,  Modi, Mongolian, Mro, Myanmar, Nabataean,
+       New_Tai_Lue,  Nko,  Ogham,  Ol_Chiki,  Old_Italic,   Old_North_Arabian,
+       Old_Permic, Old_Persian, Old_South_Arabian, Old_Turkic, Oriya, Osmanya,
+       Pahawh_Hmong,    Palmyrene,    Pau_Cin_Hau,    Phags_Pa,    Phoenician,
+       Psalter_Pahlavi,  Rejang,  Runic,  Samaritan, Saurashtra, Sharada, Sha-
+       vian, Siddham, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri,  Syriac,
+       Tagalog,  Tagbanwa,  Tai_Le,  Tai_Tham, Tai_Viet, Takri, Tamil, Telugu,
+       Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic,  Vai,  Warang_Citi,
       Yi.

       Each character has exactly one Unicode general category property, spec-
@@ -7777,21 +7781,25 @@ PCRE SPECIAL CATEGORY PROPERTIES FOR \p and \P

 SCRIPT NAMES FOR \p AND \P

-       Arabic, Armenian, Avestan, Balinese, Bamum, Batak,  Bengali,  Bopomofo,
-       Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Carian, Chakma,
-       Cham, Cherokee, Common, Coptic, Cuneiform, Cypriot, Cyrillic,  Deseret,
-       Devanagari,   Egyptian_Hieroglyphs,   Ethiopic,  Georgian,  Glagolitic,
-       Gothic, Greek, Gujarati, Gurmukhi, Han, Hangul, Hanunoo, Hebrew,  Hira-
-       gana,   Imperial_Aramaic,  Inherited,  Inscriptional_Pahlavi,  Inscrip-
-       tional_Parthian,  Javanese,  Kaithi,   Kannada,   Katakana,   Kayah_Li,
-       Kharoshthi,  Khmer,  Lao, Latin, Lepcha, Limbu, Linear_B, Lisu, Lycian,
-       Lydian,    Malayalam,    Mandaic,    Meetei_Mayek,    Meroitic_Cursive,
-       Meroitic_Hieroglyphs,   Miao,  Mongolian,  Myanmar,  New_Tai_Lue,  Nko,
-       Ogham,   Old_Italic,   Old_Persian,   Old_South_Arabian,    Old_Turkic,
-       Ol_Chiki,  Oriya, Osmanya, Phags_Pa, Phoenician, Rejang, Runic, Samari-
-       tan, Saurashtra, Sharada, Shavian,  Sinhala,  Sora_Sompeng,  Sundanese,
-       Syloti_Nagri,  Syriac,  Tagalog,  Tagbanwa, Tai_Le, Tai_Tham, Tai_Viet,
-       Takri, Tamil, Telugu, Thaana, Thai, Tibetan, Tifinagh,  Ugaritic,  Vai,
+       Arabic, Armenian, Avestan, Balinese, Bamum, Bassa_Vah, Batak,  Bengali,
+       Bopomofo,  Brahmi,  Braille, Buginese, Buhid, Canadian_Aboriginal, Car-
+       ian, Caucasian_Albanian, Chakma, Cham, Cherokee, Common, Coptic, Cunei-
+       form, Cypriot, Cyrillic, Deseret, Devanagari, Duployan, Egyptian_Hiero-
+       glyphs,  Elbasan,  Ethiopic,  Georgian,  Glagolitic,  Gothic,  Grantha,
+       Greek,  Gujarati,  Gurmukhi,  Han,  Hangul,  Hanunoo, Hebrew, Hiragana,
+       Imperial_Aramaic,    Inherited,     Inscriptional_Pahlavi,     Inscrip-
+       tional_Parthian,   Javanese,   Kaithi,   Kannada,  Katakana,  Kayah_Li,
+       Kharoshthi, Khmer, Khojki, Khudawadi, Lao, Latin, Lepcha,  Limbu,  Lin-
+       ear_A,  Linear_B,  Lisu,  Lycian, Lydian, Mahajani, Malayalam, Mandaic,
+       Manichaean,     Meetei_Mayek,     Mende_Kikakui,      Meroitic_Cursive,
+       Meroitic_Hieroglyphs,  Miao,  Modi, Mongolian, Mro, Myanmar, Nabataean,
+       New_Tai_Lue,  Nko,  Ogham,  Ol_Chiki,  Old_Italic,   Old_North_Arabian,
+       Old_Permic, Old_Persian, Old_South_Arabian, Old_Turkic, Oriya, Osmanya,
+       Pahawh_Hmong,    Palmyrene,    Pau_Cin_Hau,    Phags_Pa,    Phoenician,
+       Psalter_Pahlavi,  Rejang,  Runic,  Samaritan, Saurashtra, Sharada, Sha-
+       vian, Siddham, Sinhala, Sora_Sompeng, Sundanese, Syloti_Nagri,  Syriac,
+       Tagalog,  Tagbanwa,  Tai_Le,  Tai_Tham, Tai_Viet, Takri, Tamil, Telugu,
+       Thaana, Thai, Tibetan, Tifinagh, Tirhuta, Ugaritic,  Vai,  Warang_Citi,
       Yi.



--- a/pcre/doc/pcre_config.3
+++ b/pcre/doc/pcre_config.3
-.TH PCRE_CONFIG 3 "05 November 2013" "PCRE 8.34"
+.TH PCRE_CONFIG 3 "20 April 2014" "PCRE 8.36"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH SYNOPSIS
@@ -24,8 +24,10 @@ arguments are as follows:
  \fIwhere\fP    Points to where to put the data
 .sp
 The \fIwhere\fP argument must point to an integer variable, except for
-PCRE_CONFIG_MATCH_LIMIT and PCRE_CONFIG_MATCH_LIMIT_RECURSION, when it must
-point to an unsigned long integer. The available codes are:
+PCRE_CONFIG_MATCH_LIMIT, PCRE_CONFIG_MATCH_LIMIT_RECURSION, and
+PCRE_CONFIG_PARENS_LIMIT, when it must point to an unsigned long integer,
+and for PCRE_CONFIG_JITTARGET, when it must point to a const char*.
+The available codes are:
 .sp
  PCRE_CONFIG_JIT           Availability of just-in-time compiler
                              support (1=yes 0=no)

--- a/pcre/doc/pcre_fullinfo.3
+++ b/pcre/doc/pcre_fullinfo.3
-.TH PCRE_FULLINFO 3 "24 June 2012" "PCRE 8.30"
+.TH PCRE_FULLINFO 3 "21 April 2014" "PCRE 8.36"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH SYNOPSIS
@@ -43,6 +43,10 @@ The following information is available:
  PCRE_INFO_JITSIZE         Size of JIT compiled code
  PCRE_INFO_LASTLITERAL     Literal last data unit required
  PCRE_INFO_MINLENGTH       Lower bound length of matching strings
+  PCRE_INFO_MATCHEMPTY      Return 1 if the pattern can match an empty string,
+                               0 otherwise
+  PCRE_INFO_MATCHLIMIT      Match limit if set, otherwise PCRE_ERROR_UNSET
+  PCRE_INFO_MAXLOOKBEHIND   Length (in characters) of the longest lookbehind assertion
  PCRE_INFO_NAMECOUNT       Number of named subpatterns
  PCRE_INFO_NAMEENTRYSIZE   Size of name table entry
  PCRE_INFO_NAMETABLE       Pointer to name table
@@ -58,6 +62,7 @@ The following information is available:
                                  2 if the first character is at the start of the data
                                    string or after a newline, and
                                  0 otherwise
+  PCRE_INFO_RECURSIONLIMIT    Recursion limit if set, otherwise PCRE_ERROR_UNSET
  PCRE_INFO_REQUIREDCHAR      Literal last data unit required
  PCRE_INFO_REQUIREDCHARFLAGS Returns 1 if the last data character is set (which can then
                              be retrieved using PCRE_INFO_REQUIREDCHAR); 0 otherwise
@@ -65,14 +70,18 @@ The following information is available:
 The \fIwhere\fP argument must point to an integer variable, except for the
 following \fIwhat\fP values:
 .sp
-  PCRE_INFO_DEFAULT_TABLES  const unsigned char *
-  PCRE_INFO_FIRSTTABLE      const unsigned char *
+  PCRE_INFO_DEFAULT_TABLES  const uint8_t *
+  PCRE_INFO_FIRSTCHARACTER  uint32_t
+  PCRE_INFO_FIRSTTABLE      const uint8_t *
+  PCRE_INFO_JITSIZE         size_t
+  PCRE_INFO_MATCHLIMIT      uint32_t
  PCRE_INFO_NAMETABLE       PCRE_SPTR16           (16-bit library)
  PCRE_INFO_NAMETABLE       PCRE_SPTR32           (32-bit library)
  PCRE_INFO_NAMETABLE       const unsigned char * (8-bit library)
  PCRE_INFO_OPTIONS         unsigned long int
  PCRE_INFO_SIZE            size_t
-  PCRE_INFO_FIRSTCHARACTER  uint32_t
+  PCRE_INFO_STUDYSIZE       size_t
+  PCRE_INFO_RECURSIONLIMIT  uint32_t
  PCRE_INFO_REQUIREDCHAR    uint32_t
 .sp
 The yield of the function is zero on success or:
@@ -81,6 +90,7 @@ The yield of the function is zero on success or:
                            the argument \fIwhere\fP was NULL
  PCRE_ERROR_BADMAGIC       the "magic number" was not found
  PCRE_ERROR_BADOPTION      the value of \fIwhat\fP was invalid
+  PCRE_ERROR_UNSET          the option was not set
 .P
 There is a complete description of the PCRE native API in the
 .\" HREF

--- a/pcre/doc/pcrepattern.3
+++ b/pcre/doc/pcrepattern.3
@@ -708,6 +708,7 @@ Armenian,
 Avestan,
 Balinese,
 Bamum,
+Bassa_Vah,
 Batak,
 Bengali,
 Bopomofo,
@@ -717,6 +718,7 @@ Buginese,
 Buhid,
 Canadian_Aboriginal,
 Carian,
+Caucasian_Albanian,
 Chakma,
 Cham,
 Cherokee,
@@ -727,11 +729,14 @@ Cypriot,
 Cyrillic,
 Deseret,
 Devanagari,
+Duployan,
 Egyptian_Hieroglyphs,
+Elbasan,
 Ethiopic,
 Georgian,
 Glagolitic,
 Gothic,
+Grantha,
 Greek,
 Gujarati,
 Gurmukhi,
@@ -751,40 +756,56 @@ Katakana,
 Kayah_Li,
 Kharoshthi,
 Khmer,
+Khojki,
+Khudawadi,
 Lao,
 Latin,
 Lepcha,
 Limbu,
+Linear_A,
 Linear_B,
 Lisu,
 Lycian,
 Lydian,
+Mahajani,
 Malayalam,
 Mandaic,
+Manichaean,
 Meetei_Mayek,
+Mende_Kikakui,
 Meroitic_Cursive,
 Meroitic_Hieroglyphs,
 Miao,
+Modi,
 Mongolian,
+Mro,
 Myanmar,
+Nabataean,
 New_Tai_Lue,
 Nko,
 Ogham,
+Ol_Chiki,
 Old_Italic,
+Old_North_Arabian,
+Old_Permic,
 Old_Persian,
 Old_South_Arabian,
 Old_Turkic,
-Ol_Chiki,
 Oriya,
 Osmanya,
+Pahawh_Hmong,
+Palmyrene,
+Pau_Cin_Hau,
 Phags_Pa,
 Phoenician,
+Psalter_Pahlavi,
 Rejang,
 Runic,
 Samaritan,
 Saurashtra,
 Sharada,
 Shavian,
+Siddham,
 Sinhala,
 Sora_Sompeng,
 Sundanese,
@@ -802,8 +823,10 @@ Thaana,
 Thai,
 Tibetan,
 Tifinagh,
+Tirhuta,
 Ugaritic,
 Vai,
+Warang_Citi,
 Yi.
 .P
 Each character has exactly one Unicode general category property, specified by

--- a/pcre/doc/pcresyntax.3
+++ b/pcre/doc/pcresyntax.3
@@ -139,6 +139,7 @@ Armenian,
 Avestan,
 Balinese,
 Bamum,
+Bassa_Vah,
 Batak,
 Bengali,
 Bopomofo,
@@ -148,6 +149,7 @@ Buginese,
 Buhid,
 Canadian_Aboriginal,
 Carian,
+Caucasian_Albanian,
 Chakma,
 Cham,
 Cherokee,
@@ -158,11 +160,14 @@ Cypriot,
 Cyrillic,
 Deseret,
 Devanagari,
+Duployan,
 Egyptian_Hieroglyphs,
+Elbasan,
 Ethiopic,
 Georgian,
 Glagolitic,
 Gothic,
+Grantha,
 Greek,
 Gujarati,
 Gurmukhi,
@@ -182,40 +187,56 @@ Katakana,
 Kayah_Li,
 Kharoshthi,
 Khmer,
+Khojki,
+Khudawadi,
 Lao,
 Latin,
 Lepcha,
 Limbu,
+Linear_A,
 Linear_B,
 Lisu,
 Lycian,
 Lydian,
+Mahajani,
 Malayalam,
 Mandaic,
+Manichaean,
 Meetei_Mayek,
+Mende_Kikakui,
 Meroitic_Cursive,
 Meroitic_Hieroglyphs,
 Miao,
+Modi,
 Mongolian,
+Mro,
 Myanmar,
+Nabataean,
 New_Tai_Lue,
 Nko,
 Ogham,
+Ol_Chiki,
 Old_Italic,
+Old_North_Arabian,
+Old_Permic,
 Old_Persian,
 Old_South_Arabian,
 Old_Turkic,
-Ol_Chiki,
 Oriya,
 Osmanya,
+Pahawh_Hmong,
+Palmyrene,
+Pau_Cin_Hau,
 Phags_Pa,
 Phoenician,
+Psalter_Pahlavi,
 Rejang,
 Runic,
 Samaritan,
 Saurashtra,
 Sharada,
 Shavian,
+Siddham,
 Sinhala,
 Sora_Sompeng,
 Sundanese,
@@ -233,8 +254,10 @@ Thaana,
 Thai,
 Tibetan,
 Tifinagh,
+Tirhuta,
 Ugaritic,
 Vai,
+Warang_Citi,
 Yi.
 .
 .

--- a/pcre/pcre_compile.c
+++ b/pcre/pcre_compile.c
--- a/pcre/pcre_dfa_exec.c
+++ b/pcre/pcre_dfa_exec.c
@@ -3242,7 +3242,7 @@ md->callout_data = NULL;

 if (extra_data != NULL)
  {
-  unsigned int flags = extra_data->flags;
+  unsigned long int flags = extra_data->flags;
  if ((flags & PCRE_EXTRA_STUDY_DATA) != 0)
    study = (const pcre_study_data *)extra_data->study_data;
  if ((flags & PCRE_EXTRA_MATCH_LIMIT) != 0) return PCRE_ERROR_DFA_UMLIMIT;

--- a/pcre/pcre_exec.c
+++ b/pcre/pcre_exec.c
@@ -1167,11 +1167,16 @@ for (;;)
        if (rrc == MATCH_KETRPOS)
          {
          offset_top = md->end_offset_top;
-          eptr = md->end_match_ptr;
          ecode = md->start_code + code_offset;
          save_capture_last = md->capture_last;
          matched_once = TRUE;
          mstart = md->start_match_ptr;    /* In case \K changed it */
+          if (eptr == md->end_match_ptr)   /* Matched an empty string */
+            {
+            do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
+            break;
+            }
+          eptr = md->end_match_ptr;
          continue;
          }

@@ -1241,10 +1246,15 @@ for (;;)
      if (rrc == MATCH_KETRPOS)
        {
        offset_top = md->end_offset_top;
-        eptr = md->end_match_ptr;
        ecode = md->start_code + code_offset;
        matched_once = TRUE;
        mstart = md->start_match_ptr;   /* In case \K reset it */
+        if (eptr == md->end_match_ptr)  /* Matched an empty string */
+          {
+          do ecode += GET(ecode, 1); while (*ecode == OP_ALT);
+          break;
+          }
+        eptr = md->end_match_ptr;
        continue;
        }

@@ -1979,6 +1989,19 @@ for (;;)
        }
      }

+    /* OP_KETRPOS is a possessive repeating ket. Remember the current position,
+    and return the MATCH_KETRPOS. This makes it possible to do the repeats one
+    at a time from the outer level, thus saving stack. This must precede the
+    empty string test - in this case that test is done at the outer level. */
+
+    if (*ecode == OP_KETRPOS)
+      {
+      md->start_match_ptr = mstart;    /* In case \K reset it */
+      md->end_match_ptr = eptr;
+      md->end_offset_top = offset_top;
+      RRETURN(MATCH_KETRPOS);
+      }
+
    /* For an ordinary non-repeating ket, just continue at this level. This
    also happens for a repeating ket if no characters were matched in the
    group. This is the forcible breaking of infinite loops as implemented in
@@ -2001,18 +2024,6 @@ for (;;)
      break;
      }

-    /* OP_KETRPOS is a possessive repeating ket. Remember the current position,
-    and return the MATCH_KETRPOS. This makes it possible to do the repeats one
-    at a time from the outer level, thus saving stack. */
-
-    if (*ecode == OP_KETRPOS)
-      {
-      md->start_match_ptr = mstart;    /* In case \K reset it */
-      md->end_match_ptr = eptr;
-      md->end_offset_top = offset_top;
-      RRETURN(MATCH_KETRPOS);
-      }
-
    /* The normal repeating kets try the rest of the pattern or restart from
    the preceding bracket, in the appropriate order. In the second case, we can
    use tail recursion to avoid using another stack frame, unless we have an
@@ -5681,54 +5692,25 @@ for (;;)
        switch(ctype)
          {
          case OP_ANY:
-          if (max < INT_MAX)
+          for (i = min; i < max; i++)
            {
-            for (i = min; i < max; i++)
+            if (eptr >= md->end_subject)
              {
-              if (eptr >= md->end_subject)
-                {
-                SCHECK_PARTIAL();
-                break;
-                }
-              if (IS_NEWLINE(eptr)) break;
-              if (md->partial != 0 &&    /* Take care with CRLF partial */
-                  eptr + 1 >= md->end_subject &&
-                  NLBLOCK->nltype == NLTYPE_FIXED &&
-                  NLBLOCK->nllen == 2 &&
-                  UCHAR21(eptr) == NLBLOCK->nl[0])
-                {
-                md->hitend = TRUE;
-                if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
-                }
-              eptr++;
-              ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
+              SCHECK_PARTIAL();
+              break;
              }
-            }
-
-          /* Handle unlimited UTF-8 repeat */
-
-          else
-            {
-            for (i = min; i < max; i++)
+            if (IS_NEWLINE(eptr)) break;
+            if (md->partial != 0 &&    /* Take care with CRLF partial */
+                eptr + 1 >= md->end_subject &&
+                NLBLOCK->nltype == NLTYPE_FIXED &&
+                NLBLOCK->nllen == 2 &&
+                UCHAR21(eptr) == NLBLOCK->nl[0])
              {
-              if (eptr >= md->end_subject)
-                {
-                SCHECK_PARTIAL();
-                break;
-                }
-              if (IS_NEWLINE(eptr)) break;
-              if (md->partial != 0 &&    /* Take care with CRLF partial */
-                  eptr + 1 >= md->end_subject &&
-                  NLBLOCK->nltype == NLTYPE_FIXED &&
-                  NLBLOCK->nllen == 2 &&
-                  UCHAR21(eptr) == NLBLOCK->nl[0])
-                {
-                md->hitend = TRUE;
-                if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
-                }
-              eptr++;
-              ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
+              md->hitend = TRUE;
+              if (md->partial > 1) RRETURN(PCRE_ERROR_PARTIAL);
              }
+            eptr++;
+            ACROSSCHAR(eptr < md->end_subject, *eptr, eptr++);
            }
          break;

@@ -6519,7 +6501,7 @@ tables = re->tables;

 if (extra_data != NULL)
  {
-  register unsigned int flags = extra_data->flags;
+  unsigned long int flags = extra_data->flags;
  if ((flags & PCRE_EXTRA_STUDY_DATA) != 0)
    study = (const pcre_study_data *)extra_data->study_data;
  if ((flags & PCRE_EXTRA_MATCH_LIMIT) != 0)

--- a/pcre/pcre_internal.h
+++ b/pcre/pcre_internal.h
@@ -2281,7 +2281,7 @@ enum { ERR0,  ERR1,  ERR2,  ERR3,  ERR4,  ERR5,  ERR6,  ERR7,  ERR8,  ERR9,
       ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
       ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
       ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
-       ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERRCOUNT };
+       ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERR86, ERRCOUNT };

 /* JIT compiling modes. The function list is indexed by them. */


--- a/pcre/pcre_jit_compile.c
+++ b/pcre/pcre_jit_compile.c
--- a/pcre/pcre_scanner_unittest.cc
+++ b/pcre/pcre_scanner_unittest.cc
@@ -149,6 +149,8 @@ static void TestBigComment() {
 //       small stack size

 int main(int argc, char** argv) {
+  (void)argc;
+  (void)argv;
  TestScanner();
  TestBigComment();


--- a/pcre/pcre_stringpiece.h.in
+++ b/pcre/pcre_stringpiece.h.in
@@ -174,6 +174,7 @@ template<> struct __type_traits<pcrecpp::StringPiece> {
 #endif

 // allow StringPiece to be logged
-std::ostream& operator<<(std::ostream& o, const pcrecpp::StringPiece& piece);
+PCRECPP_EXP_DECL std::ostream& operator<<(std::ostream& o,
+                                          const pcrecpp::StringPiece& piece);

 #endif /* _PCRE_STRINGPIECE_H */
--- a/pcre/pcre_stringpiece_unittest.cc
+++ b/pcre/pcre_stringpiece_unittest.cc
@@ -142,6 +142,8 @@ static void CheckComparisonOperators() {
 }

 int main(int argc, char** argv) {
+  (void)argc;
+  (void)argv;
  CheckComparisonOperators();
  CheckSTLComparator();


--- a/pcre/pcre_study.c
+++ b/pcre/pcre_study.c
@@ -863,7 +863,6 @@ do
      case OP_NOTUPTOI:
      case OP_NOT_HSPACE:
      case OP_NOT_VSPACE:
-      case OP_PROP:
      case OP_PRUNE:
      case OP_PRUNE_ARG:
      case OP_RECURSE:
@@ -881,6 +880,31 @@ do
      case OP_THEN_ARG:
      return SSB_FAIL;

+      /* A "real" property test implies no starting bits, but the fake property
+      PT_CLIST identifies a list of characters. These lists are short, as they
+      are used for characters with more than one "other case", so there is no
+      point in recognizing them for OP_NOTPROP. */
+
+      case OP_PROP:
+      if (tcode[1] != PT_CLIST) return SSB_FAIL;
+        {
+        const pcre_uint32 *p = PRIV(ucd_caseless_sets) + tcode[2];
+        while ((c = *p++) < NOTACHAR)
+          {
+#if defined SUPPORT_UTF && defined COMPILE_PCRE8
+          if (utf)
+            {
+            pcre_uchar buff[6];
+            (void)PRIV(ord2utf)(c, buff);
+            c = buff[0];
+            }
+#endif
+          if (c > 0xff) SET_BIT(0xff); else SET_BIT(c);
+          }
+        }
+      try_next = FALSE;
+      break;
+
      /* We can ignore word boundary tests. */

      case OP_WORD_BOUNDARY:
@@ -1106,24 +1130,17 @@ do
      try_next = FALSE;
      break;

-      /* The cbit_space table has vertical tab as whitespace; we have to
-      ensure it is set as not whitespace. Luckily, the code value is the same
-      (0x0b) in ASCII and EBCDIC, so we can just adjust the appropriate bit. */
+      /* The cbit_space table has vertical tab as whitespace; we no longer
+      have to play fancy tricks because Perl added VT to its whitespace at
+      release 5.18. PCRE added it at release 8.34. */

      case OP_NOT_WHITESPACE:
      set_nottype_bits(start_bits, cbit_space, table_limit, cd);
-      start_bits[1] |= 0x08;
      try_next = FALSE;
      break;

-      /* The cbit_space table has vertical tab as whitespace; we have to not
-      set it from the table. Luckily, the code value is the same (0x0b) in
-      ASCII and EBCDIC, so we can just adjust the appropriate bit. */
-
      case OP_WHITESPACE:
-      c = start_bits[1];    /* Save in case it was already set */
      set_type_bits(start_bits, cbit_space, table_limit, cd);
-      start_bits[1] = (start_bits[1] & ~0x08) | c;
      try_next = FALSE;
      break;


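Reviewer note on the pcre_study.c hunks above: the starting-character map can be read as a 256-bit bitmap in which byte value c lives in byte c/8 under mask 1 << (c & 7). That would explain why the removed `start_bits[1]` adjustments with mask 0x08 targeted VT (0x0b), and why the new OP_PROP case records only the first UTF-8 byte of each listed character in 8-bit UTF mode and otherwise falls back to bit 0xff for values above 0xff. The sketch below is a minimal illustration under those assumptions; `start_map`, `set_start_bit`, `has_start_bit`, `utf8_first_byte`, and `add_start_char` are hypothetical names, not PCRE functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the 32-byte starting-character map. */
typedef struct { uint8_t bits[32]; } start_map;

/* Set the bit for byte value c (0..255): byte c/8, mask 1 << (c & 7). */
static void set_start_bit(start_map *m, unsigned c)
{
  assert(c <= 0xff);
  m->bits[c / 8] |= (uint8_t)(1u << (c & 7));
}

/* Test whether the bit for byte value c is set. */
static int has_start_bit(const start_map *m, unsigned c)
{
  assert(c <= 0xff);
  return (m->bits[c / 8] & (1u << (c & 7))) != 0;
}

/* First byte of the UTF-8 encoding of code point cp (cp <= 0x10ffff). */
static unsigned utf8_first_byte(uint32_t cp)
{
  if (cp < 0x80) return (unsigned)cp;
  if (cp < 0x800) return 0xc0u | (unsigned)(cp >> 6);
  if (cp < 0x10000) return 0xe0u | (unsigned)(cp >> 12);
  return 0xf0u | (unsigned)(cp >> 18);
}

/* Record one possible starting character. With utf8 != 0 only the lead
   byte is recorded; otherwise values above 0xff share the 0xff bit. */
static void add_start_char(start_map *m, uint32_t cp, int utf8)
{
  unsigned c;
  if (utf8) c = utf8_first_byte(cp);
  else c = (cp > 0xff) ? 0xffu : (unsigned)cp;
  set_start_bit(m, c);
}
```

With this layout 0x0b lands in byte 1 under mask 0x08, U+212A contributes lead byte 0xe2, and U+017F contributes 0xc5, which is consistent with the `Starting chars` lines in the testoutput16 hunk of this commit.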
--- a/pcre/pcre_tables.c
+++ b/pcre/pcre_tables.c
--- a/pcre/pcre_ucd.c
+++ b/pcre/pcre_ucd.c
--- a/pcre/pcrecpp.cc
+++ b/pcre/pcrecpp.cc
@@ -511,7 +511,7 @@ int RE::TryMatch(const StringPiece& text,
    return 0;
  }

-  pcre_extra extra = { 0, 0, 0, 0, 0, 0 };
+  pcre_extra extra = { 0, 0, 0, 0, 0, 0, 0, 0 };
  if (options_.match_limit() > 0) {
    extra.flags |= PCRE_EXTRA_MATCH_LIMIT;
    extra.match_limit = options_.match_limit();
@@ -660,6 +660,8 @@ int RE::NumberOfCapturingGroups() const {
 /***** Parsers for various types *****/

 bool Arg::parse_null(const char* str, int n, void* dest) {
+  (void)str;
+  (void)n;
  // We fail if somebody asked us to store into a non-NULL void* pointer
  return (dest == NULL);
 }

--- a/pcre/pcregrep.c
+++ b/pcre/pcregrep.c
@@ -455,7 +455,7 @@ exit(rc);
  s          pattern string to add
  after      if not NULL points to item to insert after

-Returns:     new pattern block
+Returns:     new pattern block or NULL on error
 */

 static patstr *
@@ -471,6 +471,7 @@ if (strlen(s) > MAXPATLEN)
  {
  fprintf(stderr, "pcregrep: pattern is too long (limit is %d bytes)\n",
    MAXPATLEN);
+  free(p);
  return NULL;
  }
 p->next = NULL;
@@ -2549,7 +2550,11 @@ while (fgets(buffer, PATBUFSIZE, f) != NULL)
  afterwards, as a precaution against any later code trying to use it. */

  *patlastptr = add_pattern(buffer, *patlastptr);
-  if (*patlastptr == NULL) return FALSE;
+  if (*patlastptr == NULL)
+    {
+    if (f != stdin) fclose(f);
+    return FALSE;
+    }
  if (*patptr == NULL) *patptr = *patlastptr;

  /* This loop is needed because compiling a "pattern" when -F is set may add
@@ -2561,7 +2566,10 @@ while (fgets(buffer, PATBUFSIZE, f) != NULL)
    {
    if (!compile_pattern(*patlastptr, pcre_options, popts, TRUE, filename,
        linenumber))
+      {
+      if (f != stdin) fclose(f);
      return FALSE;
+      }
    (*patlastptr)->string = NULL;            /* Insurance */
    if ((*patlastptr)->next == NULL) break;
    *patlastptr = (*patlastptr)->next;
@@ -2962,8 +2970,8 @@ if (locale == NULL)
  locale_from = "LC_CTYPE";
  }

-/* If a locale has been provided, set it, and generate the tables the PCRE
-needs. Otherwise, pcretables==NULL, which causes the use of default tables. */
+/* If a locale is set, use it to generate the tables the PCRE needs. Otherwise,
+pcretables==NULL, which causes the use of default tables. */

 if (locale != NULL)
  {
@@ -2971,7 +2979,7 @@ if (locale != NULL)
    {
    fprintf(stderr, "pcregrep: Failed to set locale %s (obtained from %s)\n",
      locale, locale_from);
-    return 2;
+    goto EXIT2;
    }
  pcretables = pcre_maketables();
  }
@@ -2986,7 +2994,7 @@ if (colour_option != NULL && strcmp(colour_option, "never") != 0)
    {
    fprintf(stderr, "pcregrep: Unknown colour setting \"%s\"\n",
      colour_option);
-    return 2;
+    goto EXIT2;
    }
  if (do_colour)
    {
@@ -3026,7 +3034,7 @@ else if (strcmp(newline, "anycrlf") == 0 || strcmp(newline, "ANYCRLF") == 0)
 else
  {
  fprintf(stderr, "pcregrep: Invalid newline specifier \"%s\"\n", newline);
-  return 2;
+  goto EXIT2;
  }

 /* Interpret the text values for -d and -D */
@@ -3039,7 +3047,7 @@ if (dee_option != NULL)
  else
    {
    fprintf(stderr, "pcregrep: Invalid value \"%s\" for -d\n", dee_option);
-    return 2;
+    goto EXIT2;
    }
  }

@@ -3050,7 +3058,7 @@ if (DEE_option != NULL)
  else
    {
    fprintf(stderr, "pcregrep: Invalid value \"%s\" for -D\n", DEE_option);
-    return 2;
+    goto EXIT2;
    }
  }

@@ -3251,7 +3259,8 @@ for (; i < argc; i++)
 if (jit_stack != NULL) pcre_jit_stack_free(jit_stack);
 #endif

-if (main_buffer != NULL) free(main_buffer);
+free(main_buffer);
+free((void *)pcretables);

 free_pattern_chain(patterns);
 free_pattern_chain(include_patterns);

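Reviewer note on the pcregrep.c hunks above: the changes all follow one idea, namely that every error path releases what was acquired before it (the `free(p)` before `return NULL`, the `fclose(f)` before `return FALSE`, and the jumps to a shared exit label instead of `return 2`). A minimal sketch of that pattern, assuming hypothetical names (`item`, `add_item`, `run`, `MAX_ITEM_LEN`) that only stand in for the pcregrep code and do not reproduce it:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ITEM_LEN 16  /* hypothetical limit, standing in for MAXPATLEN */

typedef struct item { struct item *next; char *string; } item;

/* Hypothetical helper: build a list item for s, or return NULL on error.
   Every early return releases whatever was allocated before it. */
static item *add_item(const char *s)
{
  item *p;
  assert(s != NULL);
  p = malloc(sizeof *p);
  if (p == NULL) return NULL;
  if (strlen(s) > MAX_ITEM_LEN)
    {
    free(p);               /* without this the block would leak */
    return NULL;
    }
  p->string = malloc(strlen(s) + 1);
  if (p->string == NULL)
    {
    free(p);
    return NULL;
    }
  strcpy(p->string, s);
  p->next = NULL;
  return p;
}

/* Hypothetical driver: all failures jump to one exit label, so cleanup is
   written once. Returns 0 on success and 2 on error. */
static int run(const char *s)
{
  int rc = 2;
  char *buffer = NULL;
  item *it = NULL;

  buffer = malloc(64);
  if (buffer == NULL) goto EXIT;
  it = add_item(s);
  if (it == NULL) goto EXIT;
  rc = 0;

EXIT:
  free(buffer);            /* free(NULL) is a no-op, so no guard is needed */
  if (it != NULL) free(it->string);
  free(it);
  return rc;
}
```

The unguarded `free(buffer)` mirrors the change from `if (main_buffer != NULL) free(main_buffer);` to a plain `free(main_buffer);`, since the C standard defines `free(NULL)` as doing nothing.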
--- a/pcre/pcreposix.c
+++ b/pcre/pcreposix.c
@@ -172,7 +172,8 @@ static const int eint[] = {
  REG_BADPAT,  /* invalid range in character class */
  REG_BADPAT,  /* group name must start with a non-digit */
  /* 85 */
-  REG_BADPAT   /* parentheses too deeply nested (stack check) */
+  REG_BADPAT,  /* parentheses too deeply nested (stack check) */
+  REG_BADPAT   /* missing digits in \x{} or \o{} */
 };

 /* Table of texts corresponding to POSIX error codes */

--- a/pcre/testdata/testinput1
+++ b/pcre/testdata/testinput1
@@ -111,7 +111,7 @@
    bababbc
    babababc

-/^\ca\cA\c[\c{\c:/
+/^\ca\cA\c[;\c:/
    \x01\x01\e;z

 /^[ab\]cde]/
@@ -4937,6 +4937,12 @@ however, we need the complication for Perl. ---/

 /((?(R1)a+|(?1)b))/
    aaaabcde
+    
+/((?(R)a|(?1)))*/
+    aaa
+
+/((?(R)a|(?1)))+/
+    aaa

 /a(*:any 
 name)/K
@@ -5666,4 +5672,52 @@ AbcdCBefgBhiBqz
 /(a\Kb)*/+
    ababc

+/(?:x|(?:(xx|yy)+|x|x|x|x|x)|a|a|a)bc/
+    acb
+
+'\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+
+'\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+
+'\A(?:[^\"]++|\"(?:[^\"]++|\"\")++\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+
+'\A([^\"1]++|[\"2]([^\"3]*+|[\"4][\"5])*+[\"6])++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+
+/^\w+(?>\s*)(?<=\w)/
+  test test
+
+/(?P<same>a)(?P<same>b)/gJ
+    abbaba
+
+/(?P<same>a)(?P<same>b)(?P=same)/gJ
+    abbaba
+
+/(?P=same)?(?P<same>a)(?P<same>b)/gJ
+    abbaba
+
+/(?:(?P=same)?(?:(?P<same>a)|(?P<same>b))(?P=same))+/gJ
+    bbbaaabaabb
+
+/(?:(?P=same)?(?:(?P=same)(?P<same>a)(?P=same)|(?P=same)?(?P<same>b)(?P=same)){2}(?P=same)(?P<same>c)(?P=same)){2}(?P<same>z)?/gJ
+    bbbaaaccccaaabbbcc
+
+/(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
+    acl
+    bdl
+    adl
+    bcl    
+
+/\sabc/
+    \x{0b}abc
+
+/[\Qa]\E]+/
+    aa]]
+
+/[\Q]a\E]+/
+    aa]]
+
 /-- End of testinput1 --/
--- a/pcre/testdata/testinput11
+++ b/pcre/testdata/testinput11
@@ -132,4 +132,6 @@ is required for these tests. --/

 /abc(d|e)(*THEN)x(123(*THEN)4|567(b|q)(*THEN)xx)/B

+/(((a\2)|(a*)\g<-1>))*a?/B
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testinput16
+++ b/pcre/testdata/testinput16
@@ -32,4 +32,10 @@

 /[[:blank:]]/WBZ

+/\x{212a}+/i8SI
+    KKkk\x{212a}
+
+/s+/i8SI
+    SSss\x{17f}
+
 /-- End of testinput16 --/
--- a/pcre/testdata/testinput19
+++ b/pcre/testdata/testinput19
@@ -19,4 +19,10 @@

 /[[:blank:]]/WBZ

+/\x{212a}+/i8SI
+    KKkk\x{212a}
+
+/s+/i8SI
+    SSss\x{17f}
+
 /-- End of testinput19 --/ 
--- a/pcre/testdata/testinput2
+++ b/pcre/testdata/testinput2
@@ -4035,6 +4035,8 @@ backtracking verbs. --/

 /(?(R&6yh)abc)/

+/(((a\2)|(a*)\g<-1>))*a?/BZ
+
 /-- Test the ugly "start or end of word" compatibility syntax --/

 /[[:<:]]red[[:>:]]/BZ
@@ -4062,4 +4064,18 @@ backtracking verbs. --/

 /(((((a)))))/Q

+/^\w+(?>\s*)(?<=\w)/BZ
+
+/\othing/
+
+/\o{}/
+
+/\o{whatever}/
+
+/\xthing/
+
+/\x{}/
+
+/\x{whatever}/
+
 /-- End of testinput2 --/
--- a/pcre/testdata/testinput6
+++ b/pcre/testdata/testinput6
@@ -421,8 +421,8 @@
 /^[\p{Arabic}]/8
    \x{06e9}
    \x{060b}
-    \x{061c}
    ** Failers
+    \x{061c}
    X\x{06e9}   

 /^[\P{Yi}]/8
@@ -1493,4 +1493,7 @@
 /[q-u]+/8iW 
    Ss\x{17f}

+/^s?c/mi8
+    scat
+
 /-- End of testinput6 --/
--- a/pcre/testdata/testinput7
+++ b/pcre/testdata/testinput7
@@ -835,4 +835,7 @@ of case for anything other than the ASCII letters. --/

 /[Q-U]+/8iWBZ 

+/^s?c/mi8I
+    scat
+
 /-- End of testinput7 --/
--- a/pcre/testdata/testinput8
+++ b/pcre/testdata/testinput8
@@ -4831,4 +4831,10 @@
 /[ab]{2,}?/
    aaaa    

+'\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+
+'\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+
 /-- End of testinput8 --/
--- a/pcre/testdata/testoutput1
+++ b/pcre/testdata/testoutput1
@@ -223,7 +223,7 @@ No match
    babababc
 No match

-/^\ca\cA\c[\c{\c:/
+/^\ca\cA\c[;\c:/
    \x01\x01\e;z
 0: \x01\x01\x1b;z

@@ -8234,6 +8234,16 @@ MK: M
    aaaabcde
 0: aaaab
 1: aaaab
+    
+/((?(R)a|(?1)))*/
+    aaa
+ 0: aaa
+ 1: a
+
+/((?(R)a|(?1)))+/
+    aaa
+ 0: aaa
+ 1: a

 /a(*:any 
 name)/K
@@ -9313,4 +9323,92 @@ No match
 0+ c
 1: ab

+/(?:x|(?:(xx|yy)+|x|x|x|x|x)|a|a|a)bc/
+    acb
+No match
+
+'\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+ 0: NON QUOTED "QUOT""ED" AFTER 
+
+'\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+ 0: NON QUOTED "QUOT""ED" AFTER 
+
+'\A(?:[^\"]++|\"(?:[^\"]++|\"\")++\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+ 0: NON QUOTED "QUOT""ED" AFTER 
+
+'\A([^\"1]++|[\"2]([^\"3]*+|[\"4][\"5])*+[\"6])++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+ 0: NON QUOTED "QUOT""ED" AFTER 
+ 1:  AFTER 
+ 2: 
+
+/^\w+(?>\s*)(?<=\w)/
+  test test
+ 0: tes
+
+/(?P<same>a)(?P<same>b)/gJ
+    abbaba
+ 0: ab
+ 1: a
+ 2: b
+ 0: ab
+ 1: a
+ 2: b
+
+/(?P<same>a)(?P<same>b)(?P=same)/gJ
+    abbaba
+ 0: aba
+ 1: a
+ 2: b
+
+/(?P=same)?(?P<same>a)(?P<same>b)/gJ
+    abbaba
+ 0: ab
+ 1: a
+ 2: b
+ 0: ab
+ 1: a
+ 2: b
+
+/(?:(?P=same)?(?:(?P<same>a)|(?P<same>b))(?P=same))+/gJ
+    bbbaaabaabb
+ 0: bbbaaaba
+ 1: a
+ 2: b
+ 0: bb
+ 1: <unset>
+ 2: b
+
+/(?:(?P=same)?(?:(?P=same)(?P<same>a)(?P=same)|(?P=same)?(?P<same>b)(?P=same)){2}(?P=same)(?P<same>c)(?P=same)){2}(?P<same>z)?/gJ
+    bbbaaaccccaaabbbcc
+No match
+
+/(?P<Name>a)?(?P<Name2>b)?(?(<Name>)c|d)*l/
+    acl
+ 0: acl
+ 1: a
+    bdl
+ 0: bdl
+ 1: <unset>
+ 2: b
+    adl
+ 0: dl
+    bcl    
+ 0: l
+
+/\sabc/
+    \x{0b}abc
+ 0: \x0babc
+
+/[\Qa]\E]+/
+    aa]]
+ 0: aa]]
+
+/[\Q]a\E]+/
+    aa]]
+ 0: aa]]
+
 /-- End of testinput1 --/
--- a/pcre/testdata/testoutput11-16
+++ b/pcre/testdata/testoutput11-16
@@ -709,4 +709,28 @@ Memory allocation (code space): 14
 62     End
 ------------------------------------------------------------------

+/(((a\2)|(a*)\g<-1>))*a?/B
+------------------------------------------------------------------
+  0  39 Bra
+  2     Brazero
+  3  32 SCBra 1
+  6  27 Once
+  8  12 CBra 2
+ 11   7 CBra 3
+ 14     a
+ 16     \2
+ 18   7 Ket
+ 20  11 Alt
+ 22   5 CBra 4
+ 25     a*
+ 27   5 Ket
+ 29  22 Recurse
+ 31  23 Ket
+ 33  27 Ket
+ 35  32 KetRmax
+ 37     a?+
+ 39  39 Ket
+ 41     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput11-32
+++ b/pcre/testdata/testoutput11-32
@@ -709,4 +709,28 @@ Memory allocation (code space): 28
 62     End
 ------------------------------------------------------------------

+/(((a\2)|(a*)\g<-1>))*a?/B
+------------------------------------------------------------------
+  0  39 Bra
+  2     Brazero
+  3  32 SCBra 1
+  6  27 Once
+  8  12 CBra 2
+ 11   7 CBra 3
+ 14     a
+ 16     \2
+ 18   7 Ket
+ 20  11 Alt
+ 22   5 CBra 4
+ 25     a*
+ 27   5 Ket
+ 29  22 Recurse
+ 31  23 Ket
+ 33  27 Ket
+ 35  32 KetRmax
+ 37     a?+
+ 39  39 Ket
+ 41     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput11-8
+++ b/pcre/testdata/testoutput11-8
@@ -709,4 +709,28 @@ Memory allocation (code space): 10
 76     End
 ------------------------------------------------------------------

+/(((a\2)|(a*)\g<-1>))*a?/B
+------------------------------------------------------------------
+  0  57 Bra
+  3     Brazero
+  4  48 SCBra 1
+  9  40 Once
+ 12  18 CBra 2
+ 17  10 CBra 3
+ 22     a
+ 24     \2
+ 27  10 Ket
+ 30  16 Alt
+ 33   7 CBra 4
+ 38     a*
+ 40   7 Ket
+ 43  33 Recurse
+ 46  34 Ket
+ 49  40 Ket
+ 52  48 KetRmax
+ 55     a?+
+ 57  57 Ket
+ 60     End
+------------------------------------------------------------------
+
 /-- End of testinput11 --/
--- a/pcre/testdata/testoutput15
+++ b/pcre/testdata/testoutput15
@@ -871,7 +871,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 5
-Starting chars: \x09 \x0a \x0c \x0d \x20 \xc2 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 \xc2 
    AB\x{85}xxx\x{a0}XYZ
 0: \x{85}xxx\x{a0}
    AB\x{a0}xxx\x{85}XYZ
@@ -883,15 +883,15 @@ Options: utf
 No first char
 Need char = ' '
 Subject length lower bound = 3
-Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 
-  \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 
-  \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 
-  \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 
-  \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f 
+  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e 
+  \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C 
+  D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h 
+  i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xc0 \xc1 \xc2 \xc3 \xc4 
+  \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 \xd2 \xd3 
+  \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 \xe1 \xe2 
+  \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef \xf0 \xf1 
+  \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe \xff 
    \x{a2} \x{84} 
 0: \x{a2} \x{84}
    A Z 

--- a/pcre/testdata/testoutput16
+++ b/pcre/testdata/testoutput16
@@ -118,4 +118,24 @@ Starting chars: \x0a \x0b \x0c \x0d \x85
        End
 ------------------------------------------------------------------

+/\x{212a}+/i8SI
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: K k \xe2 
+    KKkk\x{212a}
+ 0: KKkk\x{212a}
+
+/s+/i8SI
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: S s \xc5 
+    SSss\x{17f}
+ 0: SSss\x{17f}
+
 /-- End of testinput16 --/
--- a/pcre/testdata/testoutput18-16
+++ b/pcre/testdata/testoutput18-16
@@ -752,7 +752,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 5
-Starting chars: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0 
    AB\x{85}xxx\x{a0}XYZ
 0: \x{85}xxx\x{a0}
    AB\x{a0}xxx\x{85}XYZ
@@ -764,20 +764,20 @@ Options: utf
 No first char
 Need char = ' '
 Subject length lower bound = 3
-Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 
-  \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 
-  \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 
-  \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 
-  \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 
-  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
-  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
-  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
-  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
-  \xfe \xff 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f 
+  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e 
+  \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C 
+  D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h 
+  i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 
+  \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 
+  \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 \xa4 
+  \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 
+  \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 
+  \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 
+  \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 
+  \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef 
+  \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe 
+  \xff 
    \x{a2} \x{84}
 0: \x{a2} \x{84}
    A Z

--- a/pcre/testdata/testoutput18-32
+++ b/pcre/testdata/testoutput18-32
@@ -749,7 +749,7 @@ Options: utf
 No first char
 Need char = 'x'
 Subject length lower bound = 5
-Starting chars: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 
+Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 \x85 \xa0 
    AB\x{85}xxx\x{a0}XYZ
 0: \x{85}xxx\x{a0}
    AB\x{a0}xxx\x{85}XYZ
@@ -761,20 +761,20 @@ Options: utf
 No first char
 Need char = ' '
 Subject length lower bound = 3
-Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e 
-  \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d 
-  \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ 
-  A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e 
-  f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 
-  \x84 \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 
-  \x94 \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 
-  \xa4 \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 
-  \xb3 \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 
-  \xc2 \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 
-  \xd1 \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf 
-  \xe0 \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee 
-  \xef \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd 
-  \xfe \xff 
+Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0e \x0f 
+  \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e 
+  \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C 
+  D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h 
+  i j k l m n o p q r s t u v w x y z { | } ~ \x7f \x80 \x81 \x82 \x83 \x84 
+  \x86 \x87 \x88 \x89 \x8a \x8b \x8c \x8d \x8e \x8f \x90 \x91 \x92 \x93 \x94 
+  \x95 \x96 \x97 \x98 \x99 \x9a \x9b \x9c \x9d \x9e \x9f \xa1 \xa2 \xa3 \xa4 
+  \xa5 \xa6 \xa7 \xa8 \xa9 \xaa \xab \xac \xad \xae \xaf \xb0 \xb1 \xb2 \xb3 
+  \xb4 \xb5 \xb6 \xb7 \xb8 \xb9 \xba \xbb \xbc \xbd \xbe \xbf \xc0 \xc1 \xc2 
+  \xc3 \xc4 \xc5 \xc6 \xc7 \xc8 \xc9 \xca \xcb \xcc \xcd \xce \xcf \xd0 \xd1 
+  \xd2 \xd3 \xd4 \xd5 \xd6 \xd7 \xd8 \xd9 \xda \xdb \xdc \xdd \xde \xdf \xe0 
+  \xe1 \xe2 \xe3 \xe4 \xe5 \xe6 \xe7 \xe8 \xe9 \xea \xeb \xec \xed \xee \xef 
+  \xf0 \xf1 \xf2 \xf3 \xf4 \xf5 \xf6 \xf7 \xf8 \xf9 \xfa \xfb \xfc \xfd \xfe 
+  \xff 
    \x{a2} \x{84}
 0: \x{a2} \x{84}
    A Z

--- a/pcre/testdata/testoutput19
+++ b/pcre/testdata/testoutput19
@@ -85,4 +85,24 @@ No starting char list
        End
 ------------------------------------------------------------------

+/\x{212a}+/i8SI
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: K k \xff 
+    KKkk\x{212a}
+ 0: KKkk\x{212a}
+
+/s+/i8SI
+Capturing subpattern count = 0
+Options: caseless utf
+No first char
+No need char
+Subject length lower bound = 1
+Starting chars: S s \xff 
+    SSss\x{17f}
+ 0: SSss\x{17f}
+
 /-- End of testinput19 --/ 
--- a/pcre/testdata/testoutput2
+++ b/pcre/testdata/testoutput2
@@ -5821,13 +5821,13 @@ No match
 No match

 /a{11111111111111111111}/I
-Failed: number too big in {} quantifier at offset 22
+Failed: number too big in {} quantifier at offset 8

 /(){64294967295}/I
-Failed: number too big in {} quantifier at offset 14
+Failed: number too big in {} quantifier at offset 9

 /(){2,4294967295}/I
-Failed: number too big in {} quantifier at offset 15
+Failed: number too big in {} quantifier at offset 11

 "(?i:a)(?i:b)(?i:c)(?i:d)(?i:e)(?i:f)(?i:g)(?i:h)(?i:i)(?i:j)(k)(?i:l)A\1B"I
 Capturing subpattern count = 1
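Reviewer note on the offset changes above: they are consistent with the range check now running after each digit is consumed, so the reported position is just past the digit that pushed the value over 65535 rather than at the end of the quantifier. A minimal sketch of that idea, assuming a hypothetical helper `read_count` and constant `MAX_REPEAT` (not PCRE's own names):

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REPEAT 65535   /* largest repeat count accepted in this sketch */

/* Hypothetical helper: parse decimal digits starting at s[*pos].
   The limit is tested after every digit, so the running value never
   exceeds 655359 and cannot overflow an int. On success returns 0,
   stores the number in *value and leaves *pos after the last digit.
   On failure returns -1 with *pos just past the offending digit. */
static int read_count(const char *s, size_t *pos, int *value)
{
  int n = 0;
  assert(s != NULL && pos != NULL && value != NULL);
  while (s[*pos] >= '0' && s[*pos] <= '9')
    {
    n = n * 10 + (s[*pos] - '0');
    (*pos)++;
    if (n > MAX_REPEAT) return -1;
    }
  *value = n;
  return 0;
}
```

Under this sketch, parsing `a{11111111111111111111}` from index 2 fails with the position at 8, `(){64294967295}` from index 3 fails at 9, and the second number of `(){2,4294967295}` from index 5 fails at 11, matching the new offsets in the hunk.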
@@ -14093,6 +14093,30 @@ Failed: malformed number or name after (?( at offset 4
 /(?(R&6yh)abc)/
 Failed: group name must start with a non-digit at offset 5

+/(((a\2)|(a*)\g<-1>))*a?/BZ
+------------------------------------------------------------------
+        Bra
+        Brazero
+        SCBra 1
+        Once
+        CBra 2
+        CBra 3
+        a
+        \2
+        Ket
+        Alt
+        CBra 4
+        a*
+        Ket
+        Recurse
+        Ket
+        Ket
+        KetRmax
+        a?+
+        Ket
+        End
+------------------------------------------------------------------
+
 /-- Test the ugly "start or end of word" compatibility syntax --/

 /[[:<:]]red[[:>:]]/BZ
@@ -14149,4 +14173,37 @@ Failed: parentheses are too deeply nested (stack check) at offset 0
 /(((((a)))))/Q
 ** Missing 0 or 1 after /Q

+/^\w+(?>\s*)(?<=\w)/BZ
+------------------------------------------------------------------
+        Bra
+        ^
+        \w+
+        Once_NC
+        \s*+
+        Ket
+        AssertB
+        Reverse
+        \w
+        Ket
+        Ket
+        End
+------------------------------------------------------------------
+
+/\othing/
+Failed: missing opening brace after \o at offset 1
+
+/\o{}/
+Failed: digits missing in \x{} or \o{} at offset 1
+
+/\o{whatever}/
+Failed: non-octal character in \o{} (closing brace missing?) at offset 3
+
+/\xthing/
+
+/\x{}/
+Failed: digits missing in \x{} or \o{} at offset 3
+
+/\x{whatever}/
+Failed: non-hex character in \x{} (closing brace missing?) at offset 3
+
 /-- End of testinput2 --/
--- a/pcre/testdata/testoutput6
+++ b/pcre/testdata/testoutput6
@@ -719,9 +719,9 @@ No match
 0: \x{6e9}
    \x{060b}
 0: \x{60b}
-    \x{061c}
- 0: \x{61c}
    ** Failers
+No match
+    \x{061c}
 No match
    X\x{06e9}   
 No match
@@ -2457,4 +2457,8 @@ No match
    Ss\x{17f}
 0: Ss\x{17f}

+/^s?c/mi8
+    scat
+ 0: sc
+
 /-- End of testinput6 --/
--- a/pcre/testdata/testoutput7
+++ b/pcre/testdata/testoutput7
@@ -2287,4 +2287,12 @@ No match
        End
 ------------------------------------------------------------------

+/^s?c/mi8I
+Capturing subpattern count = 0
+Options: caseless multiline utf
+First char at start or follows newline
+Need char = 'c' (caseless)
+    scat
+ 0: sc
+
 /-- End of testinput7 --/
--- a/pcre/testdata/testoutput8
+++ b/pcre/testdata/testoutput8
@@ -7777,4 +7777,12 @@ Matched, but offsets vector is too small to show all matches
 1: aaa
 2: aa

+'\A(?:[^\"]++|\"(?:[^\"]*+|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+ 0: NON QUOTED "QUOT""ED" AFTER 
+
+'\A(?:[^\"]++|\"(?:[^\"]++|\"\")*+\")++'
+    NON QUOTED \"QUOT\"\"ED\" AFTER \"NOT MATCHED
+ 0: NON QUOTED "QUOT""ED" AFTER 
+
 /-- End of testinput8 --/
--- a/pcre/ucp.h
+++ b/pcre/ucp.h
@@ -192,7 +192,31 @@ enum {
  ucp_Miao,
  ucp_Sharada,
  ucp_Sora_Sompeng,
-  ucp_Takri
+  ucp_Takri,
+  /* New for Unicode 7.0.0: */
+  ucp_Bassa_Vah,
+  ucp_Caucasian_Albanian,
+  ucp_Duployan,
+  ucp_Elbasan,
+  ucp_Grantha,
+  ucp_Khojki,
+  ucp_Khudawadi,
+  ucp_Linear_A,
+  ucp_Mahajani,
+  ucp_Manichaean,
+  ucp_Mende_Kikakui,
+  ucp_Modi,
+  ucp_Mro,
+  ucp_Nabataean,
+  ucp_Old_North_Arabian,
+  ucp_Old_Permic,
+  ucp_Pahawh_Hmong,
+  ucp_Palmyrene,
+  ucp_Psalter_Pahlavi,
+  ucp_Pau_Cin_Hau,
+  ucp_Siddham,
+  ucp_Tirhuta,
+  ucp_Warang_Citi
 };

 #endif