Commits · f4c2c29ba49f74d1eb147d677d98bcde12de6be0 · Kirill Smelkov / Zope

15 May, 2002 2 commits

Use the new SetOps for mass union/intersection. · f4c2c29b
Tim Peters authored May 15, 2002

f4c2c29b

Squash bug duplication by moving the clever mass-union and mass- · 08fe38f4

Tim Peters authored May 15, 2002

intersection gimmicks into their own functions with their own test suite.

This turned up two bugs:

1. The mass weighted union gimmick was incorrect when passed a list with
   a single mapping.  In that case, it neglected to multiply the mapping
   by the given weight.

2. The underlying weighted{Intersection, Union} code does something crazy
   if you pass it weights less than 0.  I had vaguely hoped to be able
   to subtract scores by passing 1 and -1 as weights, but this doesn't
   work.  It's hard to say exactly what it does then.  The line
       weightedUnion(IIBTree(), mapping, 1, -2)
   seems to return a mapping with the same keys, but *all* of whose
   values are -2, regardless of the original mapping's values.
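
Bug 1 can be illustrated with a minimal sketch. It uses plain dicts in place of BTree mappings, and both function names and the `(mapping, weight)` pair format are hypothetical stand-ins, not the SetOps API from this commit.

```python
def mass_weighted_union(weighted_mappings):
    """Union a list of (mapping, weight) pairs, summing weight * score per key.

    Plain dicts stand in for IIBTree mappings; this is an illustrative
    sketch, not the code from the commit.
    """
    result = {}
    for mapping, weight in weighted_mappings:
        for key, score in mapping.items():
            result[key] = result.get(key, 0) + weight * score
    return result


def buggy_mass_weighted_union(weighted_mappings):
    """Same as above, but with the single-mapping shortcut described in bug 1."""
    if len(weighted_mappings) == 1:
        # Returns a copy of the mapping and forgets to apply the weight.
        return dict(weighted_mappings[0][0])
    return mass_weighted_union(weighted_mappings)


single = [({1: 10, 2: 20}, 3)]
correct = mass_weighted_union(single)      # every score multiplied by 3
buggy = buggy_mass_weighted_union(single)  # scores left unweighted
```

A test suite that includes the one-element case, as the commit describes, is what exposes the difference between the two results.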

08fe38f4

14 May, 2002 21 commits

add some examples of what google and ultraseek return for queries of · 8927f982
Jeremy Hylton authored May 14, 2002
```
www.python.org.

next step is to add queries using ZCTextIndex
```
8927f982
Remove _ in call to a string's split() method. · cf64b682
Jeremy Hylton authored May 14, 2002
```
A little overzealous in the last checkin.
```
cf64b682

Make OkapiIndex the default index. · 154154ed

Jeremy Hylton authored May 14, 2002

ZCTextIndex has grown a new argument with a default value that can be
used to specify an Index class to use.  The default is OkapiIndex.Index.

There is a little kludge to make the test succeed.
testZCTextIndex.IndexTests uses the Index.Index tests instead of
OkapiIndex.Index.  Tim will probably fix this.

154154ed

Coding convention update: avoid use of "__" prefix for instance vars. · b3cb1b87
Fred Drake authored May 14, 2002

b3cb1b87
Consistently use a single leading underscore for instance variable · 9db492f1
Guido van Rossum authored May 14, 2002
```
names.
```
9db492f1
Use underscore for internal methods · 769fad63
Jeremy Hylton authored May 14, 2002

769fad63

Some cosmetic changes · dbdffd61

Jeremy Hylton authored May 14, 2002

Re-order imports so that all Zope imports go together and are separate
from all the ZCTextIndex imports.

Reformat _apply_index() doc string to use std Python style, which is
one-line summary followed by paragraphs of text that start at the same
offset as the function name.

Do comparison of None using is instead of ==.
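
A short sketch of why `is` is the safer test for None: `==` can be answered by the other object's `__eq__`, while identity cannot be overridden. The `AlwaysEqual` class is a hypothetical example, not code from ZCTextIndex.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


value = AlwaysEqual()
equal_to_none = (value == None)   # __eq__ is consulted and answers True
is_none = (value is None)         # identity check ignores __eq__
```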

dbdffd61

Add a comment about some of the data structures. · 1d0d9654
Fred Drake authored May 14, 2002

1d0d9654
Added clear method to comply with plug-in index API. · be21d3ca
Casey Duncan authored May 14, 2002

be21d3ca
Added ZMI icons for index and lexicon objects. · a33a96b9
Casey Duncan authored May 14, 2002

a33a96b9
Removed nbest query option, since it is not supported. · fce7f51a
Casey Duncan authored May 14, 2002

fce7f51a

Integration with Zope complete. ZCTextIndex is now a bona fide Plug-in index. · 0226c34d

Casey Duncan authored May 14, 2002

Some additional plug-in index APIs were added to ZCTextIndex and support APIs added to Index and Lexicon.

_apply_index does not use NBest since ZCatalog has an incompatible strategy for finding the top results. NBest might be abstracted from this product for general consumption in application code.

0226c34d

Remove an obsolete comment. · e5cbcd43
Tim Peters authored May 14, 2002

e5cbcd43
Add mechanics of compiling the Products.ZCTextIndex.stopper module. · c800a471
Fred Drake authored May 14, 2002

c800a471
Add test cases for the C version of StopWordRemover. · 5069d2f2
Fred Drake authored May 14, 2002

5069d2f2
Fix _union() -- it wasn't getting what it expected from pop_smallest() · ca38cb41
Guido van Rossum authored May 14, 2002
```
inside the while loop either.
```
ca38cb41
Simplify code to allow multiple "false" end tags in CDATA content. · 59b09255
Fred Drake authored May 14, 2002

59b09255
Add double end tag to test cdata ignore · faa28298
matt@zope.com authored May 14, 2002

faa28298

There's no point in encoding the number of continuation bytes in the · 5ee2de80

Guido van Rossum authored May 14, 2002

first byte -- we always find the end of a particular encoded number by
searching for the next byte with the high bit set. This simplifies
the encoding and gives us more space for small encodings: 128 values
can now be encoded in 1 byte, and 16K in 2 bytes.
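
The scheme described here can be sketched as follows. The function names, the most-significant-group-first byte order, and the use of Python 3 `bytes` are assumptions made for illustration; the commit only states that a set high bit marks where the next number begins.

```python
def encode(n):
    """Encode a non-negative int using 7 payload bits per byte.

    Only the first byte of each number has its high bit set.
    """
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    chunks = [n & 0x7F]
    n >>= 7
    while n:
        chunks.append(n & 0x7F)
        n >>= 7
    chunks.reverse()       # most significant 7-bit group first (assumed order)
    chunks[0] |= 0x80      # mark the start of the number
    return bytes(chunks)


def decode_all(data):
    """Decode a concatenation of encoded numbers into a list of ints."""
    numbers = []
    for byte in data:
        if byte & 0x80:
            # High bit set: a new number starts here.
            numbers.append(byte & 0x7F)
        else:
            if not numbers:
                raise ValueError("data does not start with a marked byte")
            # Continuation byte: append 7 more bits to the current number.
            numbers[-1] = (numbers[-1] << 7) | byte
    return numbers
```

With 7 payload bits per byte, values 0 to 127 fit in one byte and values up to 16383 fit in two, which matches the 128 and 16K figures in the message.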

5ee2de80

Merged TextIndexDS9-branch into trunk. · 61e89f2f
Guido van Rossum authored May 14, 2002

61e89f2f

Many small cleanups and simplifications. · a340cb9d

Jeremy Hylton authored May 14, 2002

_indexedSearch():

    Simplify logic that called _apply_index() for each index in the
    catalog.  The if statement under the comment "Optimization" had
    identical code on either branch.  Perhaps the odd indentation made
    this confusing.  Regardless, remove the conditional.

    Change computation of normalized scores to multiply first, then
    divide.  Use literal 100. to make sure mult and div are floating
    point ops.
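
A minimal sketch of the score normalization point. The original expression is not shown in this log, so both functions are hypothetical; `//` is used to emulate the integer division that `/` performed on ints in the Python of that era.

```python
def normalize_divide_first(score, max_score):
    # Dividing first with integer division truncates any score below
    # the maximum to 0 before the multiplication happens.
    return score // max_score * 100


def normalize_multiply_first(score, max_score):
    # Multiplying by the float literal 100. first keeps the whole
    # computation in floating point.
    return 100. * score / max_score


truncated = normalize_divide_first(3, 4)
precise = normalize_multiply_first(3, 4)
```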

searchResults():

    Simplify logic at beginning of searchResults().  The first two
    conditionals depended on kw, so organize the logic to make that
    clearer.

    Write helper method to find "sort-on" and "sort-index" instead of
    duplicating code in searchResults().

    For the case where results are sorted, simplify construction of the
    final LazyCat and make it more efficient to boot.  Instead of using
    a list comprehension and a reduce + lambda to construct the list and
    the length of the contained lists, do it with one explicit for loop
    that constructs both values.

        Note: I did detailed timing stats on three ways to compute the
        length of a sequence of sequences.  reduce + lambda was the
        slowest.  For short lists, an explicit for loop is fastest.
        For long lists, reduce(operator.add, map(len, list)) is
        fastest.  The explicit for loop is a big win here, because we've
        got to walk over the elements anyway to undo the Schwartzian
        transform.
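
The single-loop approach can be sketched like this. The data and variable names are hypothetical, and the code targets current Python rather than the 2002 code base; it makes no timing claims of its own.

```python
import operator
from functools import reduce

# Hypothetical data: (sort_key, results) pairs, i.e. the "decorated" form.
decorated = [(2, ["b1", "b2"]), (1, ["a1"]), (3, ["c1", "c2", "c3"])]
decorated.sort(key=operator.itemgetter(0))

# One pass strips the sort keys and totals the lengths at the same time.
sequences = []
total = 0
for _key, results in decorated:
    sequences.append(results)
    total += len(results)

# The two alternatives mentioned in the note, shown for comparison only.
total_reduce_lambda = reduce(lambda acc, seq: acc + len(seq), sequences, 0)
total_reduce_map = reduce(operator.add, map(len, sequences), 0)
```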

Sundry:

Use getattr() with default value of None in preference to hasattr()
followed by getattr().  This gets the same result with half the work.
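
As an illustration, the two lookup styles compare as follows. The `Record` class and the `sort_on` attribute are hypothetical names chosen for the sketch.

```python
class Record:
    """Hypothetical object that may or may not define `sort_on`."""


with_attr = Record()
with_attr.sort_on = "title"
without_attr = Record()


def lookup_two_step(obj):
    # hasattr() performs one attribute lookup, getattr() a second one.
    if hasattr(obj, "sort_on"):
        return getattr(obj, "sort_on")
    return None


def lookup_one_step(obj):
    # A single lookup; None is returned when the attribute is missing.
    return getattr(obj, "sort_on", None)
```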

Changes for consistent and frequent use of whitespace.

Use types.StringType and isinstance() to test for strings.

a340cb9d

13 May, 2002 1 commit
- Remove unused function to silence compiler warning. · 8b2a64dc
  Jeremy Hylton authored May 13, 2002
  
  8b2a64dc
10 May, 2002 3 commits
- "Fix" false bug: When something that looks like an end tag occurs in CDATA · 1c2de949
  Fred Drake authored May 10, 2002
```
content, but does not match the expected end tag, treat it as character data.
This is mostly useful when a script includes string literals that contain
end tags.
```
  1c2de949
- Add a comment explaining why the new test is wrong. · 070b5be2
  Fred Drake authored May 10, 2002
  
  070b5be2
- Route html end tag inside html comment inside a CDATA mode triggering tag · 50b412e9
  matt@zope.com authored May 10, 2002
```
from 2.5 branch.
```
  50b412e9
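
The CDATA behaviour described in commit 1c2de949 above can be sketched roughly as below. This is a simplified illustration with a hypothetical `split_cdata` helper and a deliberately loose end-tag pattern, not the parser code that was changed.

```python
import re

# Loose pattern for something that looks like an end tag (an assumption).
END_TAG = re.compile(r"</([a-zA-Z][a-zA-Z0-9]*)\s*>")


def split_cdata(text, expected_tag):
    """Return (content, rest) for CDATA content closed by </expected_tag>.

    End tags with a different name are kept as character data.
    """
    pos = 0
    while True:
        match = END_TAG.search(text, pos)
        if match is None:
            return text, ""              # no closing tag: everything is data
        if match.group(1).lower() == expected_tag.lower():
            return text[:match.start()], text[match.end():]
        pos = match.end()                # "false" end tag: skip past it


content, rest = split_cdata('document.write("</b>");</script><p>', "script")
```

Here the `</b>` inside the string literal stays part of the script content, and only `</script>` ends it.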
09 May, 2002 2 commits
- · 518fe525
  Andreas Jung authored May 09, 2002
```
      - Collector 386: workaround for hanging FTP connections
        with NcFTP
```
  518fe525
- avoid using PyObject_CallFunction when a single parameter may be a tuple · 9945cdf2
  Toby Dickenson authored May 09, 2002
  
  9945cdf2
07 May, 2002 5 commits
- Add "remove_stale_bytecode()". This removes .pyc and .pyo files that · efd87296
  Guido van Rossum authored May 07, 2002
```
don't have a corresponding .py file, to prevent tests that import
deleted modules from running using the stale bytecode files.  This has
bitten enough people enough times that it's time it became a standard
part of every test suite runner.  (Zope3 already has it.)

Somebody merge this into the Zope2 trunk please.
```
  efd87296
- Merged distutils-config-branch. · 8215cdb5
  Shane Hathaway authored May 07, 2002
  
  8215cdb5
- correct unicode-aware joining of sequences · fd5a6aa6
  Toby Dickenson authored May 07, 2002
  
  fd5a6aa6
- allow the initial value of a property to contain unicode characters · 3fd44509
  Toby Dickenson authored May 07, 2002
  
  3fd44509
- refactored much code that conditionally called v.read(). Also, allow the input... · 5185570e
  Toby Dickenson authored May 07, 2002
```
refactored much code that conditionally called v.read(). Also, allow the input to a converter to be a unicode string. This is not used by ZPublisher, but is helpful for other code that uses the converters such as OFS.PropertyManager
```
  5185570e
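
The idea behind `remove_stale_bytecode()` in commit efd87296 above can be sketched as follows. This is a minimal version under stated assumptions, not the function that was committed: it assumes bytecode files sit next to their source, as they did in 2002, and does not handle `__pycache__` layouts.

```python
import os


def remove_stale_bytecode(root):
    """Delete .pyc/.pyo files under `root` that have no matching .py file.

    Returns the list of removed paths.
    """
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        names = set(filenames)
        for name in filenames:
            base, ext = os.path.splitext(name)
            if ext in (".pyc", ".pyo") and base + ".py" not in names:
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return removed
```

A test runner would call this on the source tree before importing any test modules, so that a deleted module cannot be imported from leftover bytecode.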
06 May, 2002 1 commit
- removed unused imports · 1655e985
  Andreas Jung authored May 06, 2002
  
  1655e985
03 May, 2002 2 commits

Eased the query optimization introduced with Zope 2.4.X · bbddef1c

Andreas Jung authored May 03, 2002

(revision 1.72 of Catalog.py).

TTW searches are no longer optimized to restore the old
behaviour (user does not fill out any form fields ->
return all hits)

For application related searches we keep the optimization
but it is possible to disable optimization by passing
'optimize=0' as an additional parameter to searchResults().

bbddef1c

fixed typo · e9c8fea9
Andreas Jung authored May 03, 2002

e9c8fea9

30 Apr, 2002 3 commits

Reformat text. · 94236806
Jeremy Hylton authored Apr 30, 2002

94236806
Reflow long line. · d5004255
Jeremy Hylton authored Apr 30, 2002

d5004255

A few small cleanups. · 22c494a0

Jeremy Hylton authored Apr 30, 2002

Simplify initZopeSplitter() and remove unnecessary PyErr_Occurred().

Use string macros for objects that are guaranteed to be strings.

Remove unnecessary \ at end of line.

In innermost loop of splitter function, replace ASSIGN() macro with
Py_DECREF() and simple assignment.  The macro was doing more work
than necessary because it called XDECREF on an object that was
guaranteed not to be NULL.

Use less horizontal whitespace in next_word().

22c494a0