1. 21 May, 2002 9 commits
  2. 20 May, 2002 15 commits
    • Tim Peters authored · 315bcde9
      Since I did the work to write the inner Okapi scoring loop in C, may as
      well check it in.  This yields an overall 133% speedup on a "hot" search
      for 'python' in my python-dev archive (a word that appears in all but
      2 documents).  For those who read the email, turned out it was a
      significant speedup to iterate over an IIBTree's items rather than to
      materialize the items into an explicit list first.
      
      This is now within 20% of simply doing "IIBucket(the_IIBTree)" (i.e.,
      no arithmetic at all), so there's no significant possibility remaining
      for speeding the inner score loop.
    • Guido van Rossum authored · 53b46dc9
      setUp(): assign the lexicon to self.lexicon directly rather than
      creating it anonymously and then pulling it out of the zc_index
      object.
    • Guido van Rossum authored · 0ff6d33b
      Always have a splitter. (We'll change this to a choice of splitters
      once we have more than one on the menu.)
    • Guido van Rossum authored · d53e1580
      pt_changePrefs(): the dtprefs_cols/rows arguments could be expressed
      in percentages; strip the percent sign to avoid a traceback calling
      int() when these variables are used.
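The failure mode behind d53e1580 is that int("80%") raises ValueError. A minimal sketch of the fix follows; it is not the actual pt_changePrefs() code. The helper name to_int_pref and the fallback default are assumptions made for illustration; the commit message only says that the percent sign is stripped.

```python
def to_int_pref(value, default=0):
    """Convert a preference such as '80' or '80%' to an int.

    Hypothetical helper: the name and the fallback behaviour are
    illustrative, not taken from the Zope source.
    """
    text = str(value).strip()
    if text.endswith("%"):
        # Drop the trailing percent sign so int() does not raise.
        text = text[:-1].strip()
    try:
        return int(text)
    except ValueError:
        # Assumption: fall back to a default rather than propagate.
        return default
```

Without the strip step, a value like "80%" would reach int() unchanged and produce the traceback the commit message mentions.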
    • Guido van Rossum authored · 130af9ce
      _apply_index(): return None when the query string is empty.
      I'm unclear whether this is really the right thing, but at least this
      prevents crashes when nothing is entered in the search box.
    • Guido van Rossum authored · 68957496
      index_object(): don't die if obj doesn't have an attribute named
      _fieldname; simply return 0 in this case.
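The guard described in 68957496 can be sketched as below. This is an illustration under stated assumptions, not the ZCTextIndex implementation: the class FieldIndexSketch, its _docs dictionary, and the return value 1 on success are all hypothetical. Only "return 0 when the attribute is missing" comes from the commit message.

```python
_MISSING = object()  # sentinel so that an attribute set to None still counts


class FieldIndexSketch:
    """Hypothetical stand-in for an index keyed on one attribute."""

    def __init__(self, fieldname):
        self._fieldname = fieldname
        self._docs = {}

    def index_object(self, docid, obj):
        # Look the attribute up defensively instead of letting
        # AttributeError escape.
        value = getattr(obj, self._fieldname, _MISSING)
        if value is _MISSING:
            return 0  # nothing indexed, as in the commit message
        if callable(value):
            value = value()
        self._docs[docid] = value
        return 1  # assumption: non-zero signals that something was indexed
```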
    • Guido van Rossum authored · 0a97b655
    • Guido van Rossum authored
    • Guido van Rossum authored · 90bae6a7
      Add Zope Copyright notice.
    • Guido van Rossum authored · 53c5d967
      Add Zope Copyright notice.
      Fix typo in docstring.
    • Guido van Rossum authored · 47bb995d
      QueryParser.py:

      - Rephrased the description of the grammar, pointing out that the
        lexicon decides on globbing syntax.
      
      - Refactored term and atom parsing (moving atom parsing into a
        separate method).  The previously checked-in version accidentally
        accepted some invalid forms like ``foo AND -bar''; this is fixed.
      
      tests/testQueryParser.py:
      
      - Each test is now in a separate method; this produces more output
        (alas) but makes pinpointing the errors much simpler.
      
      - Added some tests catching ``foo AND -bar'' and similar.
      
      - Added an explicit test class for the handling of stopwords.  The
        "and/" test no longer has to check self.__class__.
      
      - Some refactoring of the TestQueryParser class; the utility methods
        are now in a base class TestQueryParserBase, in a different order;
        compareParseTrees() now shows the parse tree it got when raising an
        exception.  The parser is now self.parser instead of self.p (see
        below).
      
      tests/testZCTextIndex.py:
      
      - setUp() no longer needs to assign to self.p; the parser is
        consistently called self.parser now.
    • Guido van Rossum authored
    • Guido van Rossum authored
    • Guido van Rossum authored · b82b2746
      Refactor the query parser to rely on the lexicon for parsing terms.

      ILexicon.py:
      
        - Added parseTerms() and isGlob().
      
        - Added get_word(), get_wid() (get_word() is old; get_wid() for symmetry).
      
        - Reflowed some text.
      
      IQueryParser.py:
      
        - Expanded docs for parseQuery().
      
        - Added getIgnored() and parseQueryEx().
      
      IPipelineElement.py:
      
        - Added processGlob().
      
      Lexicon.py:
      
        - Added parseTerms() and isGlob().
      
        - Added get_wid().
      
        - Some pipeline elements now support processGlob().
      
      ParseTree.py:
      
        - Clarified the error message for calling executeQuery() on a
          NotNode.
      
      QueryParser.py (lots of changes):
      
        - Change private names __tokens etc. into protected _tokens etc.
      
        - Add getIgnored() and parseQueryEx() methods.
      
        - The atom parser now uses the lexicon's parseTerms() and isGlob()
          methods.
      
        - Query parts that consist only of stopwords (as determined by the
          lexicon), or of stopwords and negated terms, yield None instead of
          a parse tree node; the ignored term is added to self._ignored.
          None is ignored when combining terms for AND/OR/NOT operators, and
          when an operator has no non-None operands, the operator itself
          returns None.  When this None percolates all the way to the top,
          the parser raises a ParseError exception.
      
      tests/testQueryParser.py:
      
        - Changed test expressions of the form "a AND b AND c" to "aa AND bb
          AND cc" so that the terms won't be considered stopwords.
      
        - The test for "and/" can only work for the base class.
      
      tests/testZCTextIndex.py:
      
        - Added copyright notice.
      
        - Refactor testStopWords() to have two helpers, one for success, one
          for failures.
      
        - Change testStopWords() to require parser failure for those queries
          that have only stopwords or stopwords plus negated terms.
      
        - Improve compareSet() to sort the sets of keys, and use a more
          direct way of extracting the keys.  This wasn't strictly needed
          (nothing fails without this), but the old approach of copying the
          keys into a dict in a loop depends on the dict hashing to always
          return keys in the same order.
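The None-propagation rule described for QueryParser.py in b82b2746 can be sketched roughly as follows. This is not the real parser: STOPWORDS, atom(), combine(), and parse_and_query() are hypothetical names, the tuple-based tree is a stand-in for the ParseTree node classes, and collapsing a single surviving operand to itself is an assumption. The sketch also leaves out the "stopwords plus negated terms" case.

```python
class ParseError(Exception):
    """Raised when nothing searchable is left in the query."""


STOPWORDS = {"and", "the", "a"}  # hypothetical stop list


def atom(word, ignored):
    # A stopword yields None instead of a tree node and is recorded.
    if word.lower() in STOPWORDS:
        ignored.append(word)
        return None
    return ("ATOM", word)


def combine(op, operands):
    # None operands are ignored; an operator with no operands left
    # returns None itself.
    kept = [node for node in operands if node is not None]
    if not kept:
        return None
    if len(kept) == 1:
        return kept[0]  # assumption: a lone operand needs no operator node
    return (op, kept)


def parse_and_query(words):
    ignored = []
    tree = combine("AND", [atom(w, ignored) for w in words])
    if tree is None:
        # None percolated all the way to the top.
        raise ParseError("query contains only stopwords: %r" % ignored)
    return tree, ignored
```

With this stop list, parse_and_query(["the", "python"]) returns the single atom for "python" and reports "the" as ignored, while parse_and_query(["the", "a"]) raises ParseError. That is also why the tests switched from "a AND b AND c" to "aa AND bb AND cc".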
    • Matt Behrens authored · 5f66a3ce
      revert stopper setup.py-age; stopper is not in the Zope module. ok
      guido@.

      when/if merge day comes for the installer this will make for less
      confusion :-)
  3. 19 May, 2002 6 commits
  4. 18 May, 2002 5 commits
  5. 17 May, 2002 5 commits