1. 17 Nov, 2006 1 commit
    • Bug#19194 (Right recursion in parser for CASE causes excessive stack usage, · ce5a3fcc
      malff/marcsql@weblab.(none) authored
        limitation)
      
      Note to the reviewer
      ====================
      
      Warning: reviewing this patch is somewhat involved.
      Because several issues all affect the same area, fixing each issue
      separately is not practical: the fixes cannot be implemented and
      tested independently.
      In particular, the issues with
      - rule recursion
      - nested case statements
      - forward jump resolution (backpatch list)
      are tightly coupled (see below).
      
      Definitions
      ===========
      
      The expression
        CASE expr
        WHEN expr THEN expr
        WHEN expr THEN expr
        ...
        END
      is a "Simple Case Expression".
      
      The expression
        CASE
        WHEN expr THEN expr
        WHEN expr THEN expr
        ...
        END
      is a "Searched Case Expression".
      
      The statement
        CASE expr
        WHEN expr THEN stmts
        WHEN expr THEN stmts
        ...
        END CASE
      is a "Simple Case Statement".
      
      The statement
        CASE
        WHEN expr THEN stmts
        WHEN expr THEN stmts
        ...
        END CASE
      is a "Searched Case Statement".
      
      A "Left Recursive" rule is like
        list:
            element
          | list element
          ;
      
      A "Right Recursive" rule is like
        list:
            element
          | element list
          ;
      
      Left and right recursion produce the same language; the difference only
      affects the *order* in which the text is parsed.
      
      In a top-down (recursive descent) parser, usually written by hand,
      right recursion works very well, and is typically implemented with a
      while loop.
      In a bottom-up parser (yacc/bison), left recursion works very well,
      and is implemented naturally by the parser stack.
      In both cases, using the wrong type of recursion is very bad and should
      be avoided: left recursion makes a top-down parser recurse without
      consuming input, and right recursion makes the stack of a bottom-up
      parser grow with the length of the list.
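
      To illustrate the stack behavior described above, here is a minimal
      C++ sketch that simulates the shift and reduce steps of a bottom-up
      parser for the two "list" rules. It is not Bison output: the Symbol
      type and both function names are assumptions made for this example,
      and only the maximum stack depth is tracked.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical grammar symbols, for illustration only.
enum class Symbol { Element, List };

// list: element | list element   (left recursive)
// Returns the maximum parser stack depth while parsing n elements.
std::size_t max_depth_left_recursive(std::size_t n) {
  std::vector<Symbol> stack;
  std::size_t max_depth = 0;
  for (std::size_t i = 0; i < n; ++i) {
    stack.push_back(Symbol::Element);  // shift the next element
    max_depth = std::max(max_depth, stack.size());
    if (stack.size() == 2) {
      stack.pop_back();                // reduce "list element" to "list"
    } else {
      stack.back() = Symbol::List;     // reduce "element" to "list"
    }
  }
  assert(stack.size() <= 1);           // at most one "list" remains
  return max_depth;
}

// list: element | element list   (right recursive)
// Nothing can be reduced before the last element has been shifted.
std::size_t max_depth_right_recursive(std::size_t n) {
  std::vector<Symbol> stack;
  std::size_t max_depth = 0;
  for (std::size_t i = 0; i < n; ++i) {
    stack.push_back(Symbol::Element);  // shift, no reduction possible yet
    max_depth = std::max(max_depth, stack.size());
  }
  if (!stack.empty()) {
    stack.back() = Symbol::List;       // reduce the last "element" to "list"
  }
  while (stack.size() > 1) {
    stack.pop_back();                  // reduce "element list" to "list"
    stack.back() = Symbol::List;
  }
  assert(stack.size() <= 1);
  return max_depth;
}
```

      In this sketch, the left recursive rule never needs more than two
      stack entries, while the right recursive rule needs one entry per
      element of the list.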
      
      Before this change
      ==================
      
      The "Simple Case Expression" and "Searched Case Expression" were both
      implemented by the "when_list" and "when_list2" rules, which are left
      recursive (ok).
      
      These rules, however, used lex->when_list instead of using the parser stack,
      which is more complex than necessary, and potentially dangerous because
      of other rules using THD::reset_lex.
      
      The "Simple Case Statement" and "Searched Case Statement" were implemented
      by the "sp_case", "sp_whens" and in part by "sp_proc_stmt" rules.
      Both cases were right recursive (bad).
      
      The grammar involved was convoluted, and is assumed to be the result of
      tweaks to get the code generation to work, but is not what someone would
      naturally write.
      
      In addition, using a common rule for both "Simple" and "Searched" case
      statements was implemented with sp_head::m_flags |= IN_SIMPLE_CASE,
      which is a flag and not a stack, and therefore does not take into account
      *nested* case statements. This leads to incorrect generated code, and either
      a server crash or an incorrect result.
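
      The difference between a flag and a stack can be shown with a small
      C++ sketch. This is not the server code: FlagTracker, StackTracker and
      outer_kind_after_nested_case() are hypothetical names, and the way the
      flag is set and cleared here is an assumption made only to show how
      the state of the outer CASE is lost.

```cpp
#include <cassert>
#include <vector>

// Hypothetical tracker using a single flag, as a bit in m_flags would.
struct FlagTracker {
  bool in_simple_case = false;
  void enter(bool simple) { in_simple_case = simple; }  // overwrites the outer state
  void leave() { in_simple_case = false; }              // cannot restore the outer state
  bool current_is_simple() const { return in_simple_case; }
};

// Hypothetical tracker using a stack, one entry per open CASE statement.
struct StackTracker {
  std::vector<bool> kinds;
  void enter(bool simple) { kinds.push_back(simple); }
  void leave() {
    assert(!kinds.empty());
    kinds.pop_back();  // the enclosing CASE becomes current again
  }
  bool current_is_simple() const { return !kinds.empty() && kinds.back(); }
};

// Open an outer simple CASE, open and close a nested searched CASE,
// then report what the tracker believes about the outer CASE.
template <typename Tracker>
bool outer_kind_after_nested_case() {
  Tracker tracker;
  tracker.enter(true);   // CASE expr WHEN ... (simple)
  tracker.enter(false);  //   CASE WHEN ...    (searched, nested)
  tracker.leave();       //   END CASE
  return tracker.current_is_simple();
}
```

      With the stack, the outer statement is still seen as a simple CASE
      after the nested one ends. With the flag, that information is gone.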
      
      With regard to the backpatch mechanism, a *different* backpatch list was
      created for each jump from "WHEN expr THEN stmt" to "END CASE", which
      relied on the grammar being right recursive.
      This is a misuse of the backpatch list, since a single list can resolve
      multiple references to the same target at once.
      
      The optimizer algorithm used to detect dead code in the "assembly" SQL
      instructions, implemented by sp_head::opt_mark(uint ip), was recursive
      in some cases (a conditional jump pointing forward to another conditional
      jump).
      In case of specially crafted code, like
      - a long list of "IF expr THEN stmt END IF"
      - a long CASE statement
      this would actually cause a server crash with a stack overflow.
      In general, having a stack that grows proportionally with user data (the
      SQL code given by the client in a CREATE PROCEDURE) is to be avoided.
      
      In debug builds only, creating a SP / SF / Trigger which had a significant
      amount of code would spend --literally-- several minutes in sp_head::create,
      because of the debug code involved with DBUG_PRINT("info", ("Code %s ...
      There are several issues with this code:
      - in a CASE with 5 000 WHEN, there are 15 000 instructions generated,
        which create a string representation of the code which is 500 000 bytes
        long,
      - using a String instead of an io stream causes performance to degrade
        to a total server freeze, as time is spent doing realloc of a buffer
        that is always too short,
      - printing a 500 000 byte string in the debug log is too verbose,
      - generating this string even when DBUG_PRINT is off is useless,
      - guarding code that can potentially affect the server behavior with
        #ifdef / #endif is useful in some cases, but is also bad practice.
      
      After this change
      =================
      
      "Case Expressions" (both simple and searched) have been simplified to
      not use LEX::when_list, which has been removed.
      
      Considering all the issues affecting case statements, the grammar for these
      has been completely rewritten.
      
      The existing actions, used to generate "assembly" sp_inst* code, have been
      preserved but moved into the new grammar, with the following changes:
      
      a) Bison rules are no longer shared between "Simple" and "Searched" case
      statements, because a stack instead of a flag is required to handle them.
      Nested statements are handled naturally by the parser stack, which by
      definition uses the correct rule in the correct context.
      Nested statements of the opposite type (simple vs searched) work correctly.
      The flag sp_head::IN_SIMPLE_CASE is no longer used.
      This is a step towards resolution of WL#2999, which correctly identified
      that temporary parsing flags do not belong to sp_head.
      The code in the actions is shared by means of the case_stmt_action_xxx()
      helpers.
      
      b) The backpatch mechanism, used to resolve forward jumps in the generated
      code, has been changed to:
      - create a label for the instruction following 'END CASE',
      - register each jump at the end of a "WHEN expr THEN stmt" in a *unique*
        backpatch list associated with the 'END CASE' label
      - resolve all the forward jumps for this label at once.
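
      The three steps above can be sketched in C++ as follows. This is a
      simplified model, not the sp_head / sp_pcontext implementation: Jump,
      BackpatchList and compile_case() are hypothetical names, and each
      "WHEN expr THEN stmt" branch is reduced to its final jump.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical jump instruction whose destination is not known yet.
struct Jump {
  std::size_t dest;
  bool resolved;
};

// One backpatch list per label: all jumps waiting for the same target.
class BackpatchList {
 public:
  // Registering: the jump at address ip targets this (still unknown) label.
  void register_jump(std::size_t ip) { pending_.push_back(ip); }

  // Resolving: the label address is now known, patch every pending jump.
  void resolve(std::vector<Jump>& code, std::size_t target) {
    for (std::size_t ip : pending_) {
      assert(ip < code.size());
      code[ip].dest = target;
      code[ip].resolved = true;
    }
    pending_.clear();
  }

 private:
  std::vector<std::size_t> pending_;
};

// Emit one forward jump per WHEN branch, then resolve all of them at once
// to the address of the instruction following END CASE.
std::vector<Jump> compile_case(std::size_t when_count) {
  std::vector<Jump> code;
  BackpatchList end_case_label;
  for (std::size_t i = 0; i < when_count; ++i) {
    code.push_back(Jump{0, false});  // jump at the end of a WHEN branch
    end_case_label.register_jump(code.size() - 1);
  }
  end_case_label.resolve(code, code.size());  // address after END CASE
  return code;
}
```

      Since every jump is registered in the same list, the order in which
      the WHEN branches are parsed no longer matters for the resolution.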
      
      In addition, the code involving backpatch has been commented, so that a
      reader can now understand by reading matching "Registering" and "Resolving"
      comments how the forward jumps are resolved and what target they resolve to,
      as this is far from evident when reading the code alone.
      
      The implementation of sp_head::opt_mark() has been revised to avoid
      recursive calls from jump instructions, and instead add the jump location
      to the list of paths to explore during the flow analysis of the instruction
      graph, with a call to sp_head::add_mark_lead().
      In addition, the flow analysis will stop if an instruction has already
      been marked as reachable, which the previous code failed to do in the
      recursive case.
      sp_head::opt_mark() is now private, to prevent new calls to this method from
      being introduced.
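
      The non recursive flow analysis can be sketched in C++ as follows.
      This is only a model of the idea, not the code of sp_head::opt_mark():
      the Instr type and mark_reachable() are hypothetical, and each
      instruction is reduced to "falls through" and "may jump to a target".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction model, for illustration only.
struct Instr {
  bool falls_through;       // execution can continue at ip + 1
  bool has_jump;            // execution can continue at jump_target
  std::size_t jump_target;  // only meaningful when has_jump is true
};

// Mark every instruction reachable from address 0 without recursion.
// Jump targets are queued as "leads" and explored later, so the native
// call stack does not grow with the size of the code.
std::vector<bool> mark_reachable(const std::vector<Instr>& code) {
  std::vector<bool> marked(code.size(), false);
  std::vector<std::size_t> leads;
  if (!code.empty()) {
    leads.push_back(0);
  }
  while (!leads.empty()) {
    std::size_t ip = leads.back();
    leads.pop_back();
    // Follow straight-line code, stopping at already marked instructions.
    while (ip < code.size() && !marked[ip]) {
      marked[ip] = true;
      const Instr& instr = code[ip];
      if (instr.has_jump) {
        assert(instr.jump_target <= code.size());
        leads.push_back(instr.jump_target);  // explore this path later
      }
      if (!instr.falls_through) {
        break;  // unconditional jump: nothing follows on this path
      }
      ++ip;
    }
  }
  return marked;
}
```

      The list of leads lives on the heap, so a long chain of conditional
      jumps only costs memory proportional to the code size, and each
      instruction is marked at most once.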
      
      The debug code present in sp_head::create() has been removed.
      Considering that SHOW PROCEDURE CODE is also available in debug builds,
      and can be used anytime regardless of the trace level, as opposed to
      "CREATE PROCEDURE" time and only if the trace was on,
      removing the code actually makes debugging easier (usable trace).
      
      Tests have been written to cover the parser overflow (big CASE),
      and to cover nested CASE statements.
  2. 12 Oct, 2006 3 commits
    • Fix after manual merge. · e7c31e81
      kroki/tomash@moonlight.intranet authored
    • Merge moonlight.intranet:/home/tomash/src/mysql_ab/mysql-5.0 · 9e942999
      kroki/tomash@moonlight.intranet authored
      into  moonlight.intranet:/home/tomash/src/mysql_ab/mysql-5.0-bug20953
    • BUG#20953: create proc with a create view that uses local vars/params · 591c06d4
      kroki/tomash@moonlight.intranet authored
                 should fail to create
      
      The problem was that this type of error was checked during view
      creation, which doesn't happen when CREATE VIEW is a statement of
      a created stored routine.
      
      The solution is to perform the checks at parse time.  The idea of the
      fix is that the parser checks if a construction just parsed is allowed
      in current circumstances by testing certain flags, and these flags are
      reset for VIEWs.
      
      The side effect of this change is that if a user already has
      such bogus routines, they will now get an error when trying to do
      
        SHOW CREATE PROCEDURE proc;
      
      (and some other statements), and when trying to execute such a routine they will get
      
        ERROR 1457 (HY000): Failed to load routine test.p5. The table mysql.proc is missing, corrupt, or contains bad data (internal code -6)
      
      However there should be very few such users (if any), and they may
      (and should) drop these bogus routines.
  3. 10 Oct, 2006 10 commits
  4. 09 Oct, 2006 1 commit
    • Bug#21462 (Stored procedures with no arguments require parenthesis) · 6e809b24
      malff/marcsql@weblab.(none) authored
      The syntax of the CALL statement, to invoke a stored procedure, has been
      changed to make the use of parentheses optional in the argument list.
      With this change, "CALL p;" is equivalent to "CALL p();".
      
      While the SQL spec does not explicitly mandate this syntax, supporting it
      is needed for practical reasons, for integration with JDBC / ODBC connectors.
      
      Also, warnings in the sql/sql_yacc.yy file, which were not reported by Bison 2.1
      but are now reported by Bison 2.2, have been fixed.
      
      The warnings found were:
      bison -y -p MYSQL  -d --debug --verbose sql_yacc.yy
      sql_yacc.yy:653.9-18: warning: symbol UNLOCK_SYM redeclared
      sql_yacc.yy:656.9-17: warning: symbol UNTIL_SYM redeclared
      sql_yacc.yy:658.9-18: warning: symbol UPDATE_SYM redeclared
      sql_yacc.yy:5169.11-5174.11: warning: unused value: $2
      sql_yacc.yy:5208.11-5220.11: warning: unused value: $5
      sql_yacc.yy:5221.11-5234.11: warning: unused value: $5
      conflicts: 249 shift/reduce
      
      "unused value: $2" corresponds to the $$=$1 assignment in the 1st {} block
      in table_ref -> join_table {} {},
      which does not produce a result ($$) for the rule but an intermediate $2
      value for the action instead.
      "unused value: $5" warnings are similar, with $$ assignments in {} action
      blocks which are not for the final reduce.
  5. 08 Oct, 2006 3 commits
  6. 06 Oct, 2006 8 commits
  7. 05 Oct, 2006 5 commits
  8. 04 Oct, 2006 1 commit
  9. 03 Oct, 2006 8 commits