- 05 Mar, 2005 40 commits
-
Olof Johansson authored
Abstract most manual mask checks of cpu_features with cpu_has_feature() Signed-off-by: Olof Johansson <olof@austin.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
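The conversion is mechanical; a hedged sketch of the before/after shape (the feature flag and callee names here are illustrative, not lifted from the patch):

    /* Before: open-coded test against the ppc64 CPU feature mask. */
    if (cur_cpu_spec->cpu_features & CPU_FTR_SLB)
            do_slb_setup();     /* illustrative callee */

    /* After: the same check through the accessor. */
    if (cpu_has_feature(CPU_FTR_SLB))
            do_slb_setup();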
-
bill.irwin@oracle.com authored
Convert mapping->tree_lock to an rwlock.

with:

 dd if=/dev/zero of=foo bs=1 count=2M  0.80s user 4.15s system  99% cpu 4.961 total
 dd if=/dev/zero of=foo bs=1 count=2M  0.73s user 4.26s system 100% cpu 4.987 total
 dd if=/dev/zero of=foo bs=1 count=2M  0.79s user 4.25s system 100% cpu 5.034 total
 dd if=foo of=/dev/null bs=1           0.80s user 3.12s system  99% cpu 3.928 total
 dd if=foo of=/dev/null bs=1           0.77s user 3.15s system 100% cpu 3.914 total
 dd if=foo of=/dev/null bs=1           0.92s user 3.02s system 100% cpu 3.935 total

 (3.926: 1.87 usecs)

without:

 dd if=/dev/zero of=foo bs=1 count=2M  0.85s user 3.92s system  99% cpu 4.780 total
 dd if=/dev/zero of=foo bs=1 count=2M  0.78s user 4.02s system 100% cpu 4.789 total
 dd if=/dev/zero of=foo bs=1 count=2M  0.82s user 3.94s system  99% cpu 4.763 total
 dd if=/dev/zero of=foo bs=1 count=2M  0.71s user 4.10s system  99% cpu 4.810 total
 dd if=foo of=/dev/null bs=1           0.76s user 2.68s system 100% cpu 3.438 total
 dd if=foo of=/dev/null bs=1           0.74s user 2.72s system  99% cpu 3.465 total
 dd if=foo of=/dev/null bs=1           0.67s user 2.82s system 100% cpu 3.489 total
 dd if=foo of=/dev/null bs=1           0.70s user 2.62s system  99% cpu 3.326 total

 (3.430: 1.635 usecs)

So on a P4, the additional cost of the rwlock is ~240 nsecs per one-byte write(). On the other hand:

From: Peter Chubb <peterc@gelato.unsw.edu.au>

As part of the Gelato scalability focus group, we've been running OSDL's Re-AIM7 benchmark with an I/O-intensive load with varying numbers of processors. The current kernel shows severe contention on the tree_lock in the address_space structure when running on tmpfs or ext2 on a RAM disk. Lockstat output for a 12-way:

 SPINLOCKS        HOLD           WAIT
  UTIL   CON   MEAN(  MAX )   MEAN(  MAX )(% CPU)      TOTAL NOWAIT  SPIN RJECT NAME
   5.5%        0.4us(3177us)   28us(  20ms)(44.2%) 131821954  94.5%  5.5% 0.00% *TOTAL*
  72.3% 13.1%  0.5us( 9.5us)   29us(  20ms)(42.5%)  50542055  86.9% 13.1%    0% find_lock_page+0x30
  23.8%    0%  385us(3177us)    0us                    23235   100%    0%    0% exit_mmap+0x50
  11.5% 0.82%  0.1us( 101us)   17us(5670us)( 1.6%)  50665658  99.2% 0.82%    0% dnotify_parent+0x70

Replacing the spinlock with a multi-reader lock fixes this problem without unduly affecting anything else. Here are the benchmark results (jobs per minute at a 50-client level, average of 5 runs, standard deviation in parens) on an HP Olympia with 3 cells, 12 processors, and dnotify turned off (after this spinlock, the spinlock in dnotify_parent is the worst contended for this workload):

              tmpfs.....................    ext2......................
 #CPUs   spinlock      rwlock             spinlock      rwlock
   1     7556(15)      7588(17)   +0.42%   3744(20)     3791(16)  +1.25%
   2    13743(31)     13791(33)   +0.35%   6405(30)     6413(24)  +0.12%
   4    23334(111)    22881(154)  -2%      9648(51)     9595(50)  -0.55%
   8    33580(240)    36163(190)  +7.7%   13183(63)    13070(68)  -0.85%
  12    28748(170)    44064(238) +53%     12681(49)    14504(105) +14%

And on a single-processor pentium3:

   1     4177(4)       4169(2)    -0.2%    3811(4)      3820(3)   +0.23%

I'm not sure what's happening in the 4-processor case. The important thing to note is that with a spinlock, the benchmark shows worse performance for a 12-way than for an 8-way box; with the patch, the 12-way performs better, as expected. We've done some runs with a 16-way as well; without the patch below, the 16-way performs worse than the 12-way.

It's a tricky tradeoff, but large-SMP is hurt a lot more by the spinlocks than small-SMP is by the rwlocks. And I don't think we really want to implement compile-time either-or locks.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
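The pattern behind the numbers, as a hedged sketch rather than the literal patch: radix-tree lookups take mapping->tree_lock shared, while insertion and removal take it exclusive (surrounding function bodies abbreviated):

    /* Lookup path, e.g. find_lock_page(): readers share the lock. */
    read_lock_irq(&mapping->tree_lock);
    page = radix_tree_lookup(&mapping->page_tree, offset);
    read_unlock_irq(&mapping->tree_lock);

    /* Insert/remove, e.g. add_to_page_cache(): writers are exclusive. */
    write_lock_irq(&mapping->tree_lock);
    error = radix_tree_insert(&mapping->page_tree, offset, page);
    write_unlock_irq(&mapping->tree_lock);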
-
Zach Brown authored
Now that we're only invalidating the pages that intersected a direct-IO write, we might as well only unmap the intersecting bytes as well. This passed a light fsx load with page-cache, direct, and mmap IO. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Zach Brown authored
This adds filemap_write_and_wait_range(mapping, lstart, lend) which starts writeback and waits on a range of pages. We call this from __blkdev_direct_IO with just the range that is going to be read by the direct_IO read. It was lightly tested with fsx and ext3 and passed. Signed-off-by: Zach Brown <zach.brown@oracle.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
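A hedged sketch of the call as described, assuming the usual filemap convention that lend is the offset of the last byte, inclusive:

    /* Flush and wait on just the bytes the direct-IO read will touch. */
    if (mapping->nrpages) {
            retval = filemap_write_and_wait_range(mapping, offset,
                                                  offset + count - 1);
            if (retval)
                    goto out;
    }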
-
Zach Brown authored
Presently we invalidate all of a file's pages when writing to any part of that file with direct IO. After a direct-IO write, only invalidate the pages that the write intersected. invalidate_inode_pages2_range(mapping, pgoff start, pgoff end) is added and called from generic_file_direct_IO(). While we're in there: invalidate_inode_pages2() was calling unmap_mapping_range() with the wrong convention in the single-page case. It was providing the byte offset of the final page rather than the length of the hole being unmapped. This is also fixed.

This was lightly tested with a 10k-op fsx run with O_DIRECT on a 16MB file in ext3 on a junky old IDE drive. Totaling vmstat columns of blocks read and written during the runs shows that read traffic drops significantly. The run time seems to have gone down a little. Two runs before the patch gave the following user/real/sys times and total blocks in and out:

 0m28.029s 0m20.093s 0m3.166s  16673 125107
 0m27.949s 0m20.068s 0m3.227s  18426 126094

and after the patch:

 0m26.775s 0m19.996s 0m3.060s   3505 124982
 0m26.856s 0m19.935s 0m3.052s   3505 125279

akpm:
 - Don't look up more pages than we're going to use
 - Don't test page->index until we've locked the page
 - Check for the cursor wrapping at the end of the mapping

Signed-off-by: Zach Brown <zach.brown@oracle.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
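A hedged sketch of both points (the offset arithmetic is illustrative): the new range call takes first and last page indices, and unmap_mapping_range() takes a byte length for the hole, not an end offset:

    /* Invalidate only the pages the direct-IO write touched. */
    invalidate_inode_pages2_range(mapping,
                                  offset >> PAGE_CACHE_SHIFT,
                                  (offset + count - 1) >> PAGE_CACHE_SHIFT);

    /* The fixed convention: the third argument is the hole's length
     * in bytes, not the offset of its final page. */
    unmap_mapping_range(mapping,
                        (loff_t)page->index << PAGE_CACHE_SHIFT,
                        PAGE_CACHE_SIZE, 0);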
-
Christoph Lameter authored
In the 2.6.11 development cycle, function calls were added to lots of hot vm paths to do accounting. I think these should not go into the final 2.6.11 release, because these statistics can be collected in a different way that does not require updating counters from frequently used vm code paths and is consistent with the methods used elsewhere in the kernel to obtain statistics. The function calls are:

 acct_update_integrals -> accounts for processes based on stime changes
 update_mem_hiwater    -> takes rss and total_vm hiwater marks

acct_update_integrals is only useful to call if stime changes; otherwise it simply returns. It is therefore best to relocate the call to acct_update_integrals into the function that updates stime, which is account_system_time, and remove it from the vm code paths.

update_mem_hiwater finds the rss hiwater mark. We call that from timer context as well. This means that processes' high-water marks are now sampled statistically, at timer-interrupt time, rather than deterministically, which may or may not be a problem. The hiwater mark is no longer updated on every rss increase and is thus less accurate. But the benefit is that the rss checks no longer pollute the vm paths, and that this is consistent with the rss-limit check.

The following patch removes acct_update_integrals and update_mem_hiwater from the hot vm paths.

Signed-off-by: Christoph Lameter <clameter@sgi.com>

From: Jay Lan <jlan@sgi.com>

The new "move-accounting-function-calls-out-of-critical-vm-code-paths" patch in 2.6.11-rc3-mm2 was different from the code I tested. In particular, it mistakenly dropped the accounting routine calls in fs/exec.c. The calls in do_execve() are needed to properly initialize accounting fields; specifically, tsk->acct_stimexpd needs to be initialized to tsk->stime. I have discussed this with Christoph Lameter and he gave me full blessings to bring the calls back.

Signed-off-by: Jay Lan <jlan@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
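The relocation amounts to hanging both calls off the one place stime actually changes; a hedged sketch using the era's signature, with the surrounding bookkeeping abbreviated:

    void account_system_time(struct task_struct *p, int hardirq_offset,
                             cputime_t cputime)
    {
            p->stime = cputime_add(p->stime, cputime);
            /* ... cpustat bookkeeping ... */

            /* Sampled here, at timer-tick granularity, instead of in
             * the hot vm paths: */
            acct_update_integrals(p);
            update_mem_hiwater(p);
    }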
-
Arjan van de Ven authored
In addition to randomisation of the stack pointer within the stack, the stack itself should be randomised too. We need both approaches: the stack itself can only be randomised in page-size increments, while randomising over a large range with the stack pointer alone eats a huge chunk of the stack rlimit, which is undesirable. So we do both. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Arjan van de Ven authored
Introduce a personality that disables randomisation, so that users can use setarch and related commands to run specific applications without randomisation. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
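A hedged sketch, assuming the flag ends up as an ADDR_NO_RANDOMIZE bit in the personality word (the exact value is illustrative):

    /* include/linux/personality.h: a flag bit in the personality word */
    ADDR_NO_RANDOMIZE = 0x0040000,  /* disable randomisation of VA space */

    /* Any randomisation site can then honour it: */
    if (!(current->personality & ADDR_NO_RANDOMIZE))
            sp -= get_random_int() % 8192;

With this in place, something like `setarch i386 -R ./app` can flip the bit for one process tree without touching the global setting.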
-
Arjan van de Ven authored
Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Arjan van de Ven authored
The patch below randomizes the starting point of the mmap area. This has the effect that all non-prelinked shared libraries and all bigger malloc()s will be randomized between invocations of the binary. Prelinked binaries get an address hint from ld.so in their mmap and are thus exempt from this randomisation, in order not to break the prelink advantage. The randomisation range is 1 megabyte (bigger than the stack randomisation, since the stack randomisation only needs 16-byte alignment while the mmap needs page alignment; a 64kb range would not have given enough entropy to be effective). Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
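A hedged sketch of the idea, not the literal patch: pick a page-aligned offset within 1 MB and add it to the mmap base (the helper name is made up for illustration):

    static unsigned long mmap_rnd(void)
    {
            unsigned long rnd = 0;

            if (current->flags & PF_RANDOMIZE)
                    /* 1 MB range, in page-size steps */
                    rnd = (get_random_int() % (1024 * 1024 / PAGE_SIZE))
                            << PAGE_SHIFT;
            return rnd;
    }

    mm->mmap_base = TASK_UNMAPPED_BASE + mmap_rnd();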
-
Arjan van de Ven authored
The patch below replaces the existing 8Kb randomisation of the userspace stack pointer (currently only done for Hyperthreaded P-IVs) with a more general randomisation over a 64Kb range. 64Kb is not a lot, but it's a start, and once the dust settles we can increase this value to something more aggressive. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
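A hedged sketch of the shape this takes in arch_align_stack() (the constant and the gating condition are illustrative): subtract a random amount within the 64Kb range, then restore 16-byte alignment.

    unsigned long arch_align_stack(unsigned long sp)
    {
            if (current->flags & PF_RANDOMIZE)
                    sp -= get_random_int() & 0xffff;  /* 64Kb range */
            return sp & ~0xf;                         /* 16-byte aligned */
    }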
-
Arjan van de Ven authored
Even though there is a global flag to disable randomisation, it's useful to have a per-process flag too; the patch below introduces this per-process flag and automatically sets it for "new" binaries. Eventually we will want to tie this to the legacy-va-space personality. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Arjan van de Ven authored
The patch below introduces get_random_int() and randomize_range(), two helpers used in later patches in the series. get_random_int() shares the tcp/ip random-number infrastructure, so the CONFIG_INET ifdef needs to move slightly, and to reduce the damage from that, secure_ip_id() moves inside random.c.

From: Frank Sorenson <frank@tuxrocks.com>
Acked-By: Jeff Dike <jdike@addtoit.com>

The stack randomization patches that went into 2.6.11-rc3-mm1 broke compilation of ARCH=um. This patch fixes compiling by adding arch_align_stack back in.

Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Frank Sorenson <frank@tuxrocks.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
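A hedged sketch of randomize_range() built on get_random_int(): pick a page-aligned start such that a block of len bytes beginning there still fits below end, or return 0 when there is no room.

    unsigned long
    randomize_range(unsigned long start, unsigned long end, unsigned long len)
    {
            unsigned long range = end - len - start;

            if (end <= start + len)
                    return 0;       /* no room to randomize */
            return PAGE_ALIGN(get_random_int() % range + start);
    }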
-
Arjan van de Ven authored
This first patch of the series introduces a sysctl (default off) that enables/disables the randomisation feature globally. Since randomisation may make really tricky situations harder to debug (reproducibility goes down), the sysadmin needs a way to disable it globally. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
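A hedged sketch of the knob as a standard integer sysctl in the era's table format (everything beyond the randomize_va_space name itself is an assumption):

    int randomize_va_space = 0;     /* default off, per this patch */

    /* kernel/sysctl.c */
    {
            .ctl_name       = KERN_RANDOMIZE,
            .procname       = "randomize_va_space",
            .data           = &randomize_va_space,
            .maxlen         = sizeof(int),
            .mode           = 0644,
            .proc_handler   = &proc_dointvec,
    },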
-
Kumar Gala authored
Statically initialize spinlocks that are otherwise accessed prior to initialization. Signed-off-by: Jaka Mocnik <jaka@activetools.si> Signed-off-by: Kumar Gala <kumar.gala@freescale.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
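The fix is the era's standard idiom; a hedged sketch with a made-up lock name:

    /* Before: runtime init that can race with early users. */
    spinlock_t foo_lock;
    /* ... much later ... */
    spin_lock_init(&foo_lock);

    /* After: initialized at compile time, valid from the first access. */
    static spinlock_t foo_lock = SPIN_LOCK_UNLOCKED;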
-
Sean Hefty authored
Modify ib_cancel_mad() to invoke a user's send completion callback from a different thread context than that used by the caller. This allows a caller to hold a lock while calling cancel that is also acquired from their send handler. Signed-off-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Michael S. Tsirkin authored
Set device_cap_flags field in mthca's query_device method. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Michael S. Tsirkin authored
1. Split the QP spinlock into separate send and receive locks. The only place where we have to lock both is upon modify_qp, and that is not on the data path.

2. Avoid taking any QP locks when polling the CQ.

The second part is achieved by getting rid of the cur field in mthca_wq and calculating the number of outstanding WQEs by comparing the head and tail fields: head is only updated by post, tail is only updated by poll. In the rare case where an overrun is detected, the CQ is locked and the overrun condition is re-tested, to avoid any potential for stale tail values.

Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@topspin.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
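A hedged sketch of the head/tail accounting (field names per the description above): unsigned subtraction gives the outstanding count even across wraparound, and only the suspicious case takes the CQ lock.

    /* post (producer) advances head; poll (consumer) advances tail. */
    cur = wq->head - wq->tail;      /* outstanding WQEs, wrap-safe */

    if (unlikely(cur + nreq >= wq->max)) {
            /* Possible overrun: take the CQ lock and re-read tail to
             * rule out a stale value before failing the post. */
            spin_lock(&cq->lock);
            cur = wq->head - wq->tail;
            spin_unlock(&cq->lock);
            if (cur + nreq >= wq->max)
                    return -ENOMEM; /* genuinely full */
    }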
-
Roland Dreier authored
Tie up one last loose end by mapping enough context memory to cover the whole multicast table during initialization, and then enable mem-free mode. mthca now supports enough of mem-free mode so that IPoIB works with a mem-free HCA. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Implement posting send and receive work requests for mem-free mode. Also tidy up a few things in send/receive posting for Tavor mode (fix smp_wmb()s that should really be just wmb()s, annotate tests in the fast path with likely()/unlikely()). Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Update address vector handling to support mem-free mode. In mem-free mode, the address vector (in hardware format) is copied by the driver into each send work queue entry, so our address handle creation can become pretty trivial: we just kmalloc() a buffer to hold the formatted address vector. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Update QP initialization and cleanup to handle mem-free mode. In mem-free mode, work queue sizes have to be rounded up to a power of 2, we need to allocate doorbells, there must be memory mapped for the entries in the QP and extended QP context table that we use, and the entries of the receive queue must be initialized. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Add support for CQ data path operations (request notification, update consumer index) in mem-free mode. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Update CQ initialization and cleanup to handle mem-free mode: we need to make sure the HCA has memory mapped for the entry in the CQ context table we will use and also allocate doorbell records. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Factor the allocation and freeing of completion queue buffers into mthca_alloc_cq_buf() and mthca_free_cq_buf(). This makes the code more readable and will eventually make handling userspace CQs simpler (the kernel doesn't have to allocate a buffer at all). Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Add mthca_write_db_rec() to wrap writing doorbell records. On 64-bit archs this is just a 64-bit write, while on 32-bit archs it splits the write into two 32-bit writes with a memory barrier between them to make sure the two halves of the record are written in the correct order. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
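A hedged sketch matching the description (types abbreviated): one atomic 64-bit store on 64-bit machines, two ordered 32-bit stores elsewhere.

    static inline void mthca_write_db_rec(u32 val[2], u32 *db)
    {
    #if BITS_PER_LONG == 64
            *(u64 *) db = *(u64 *) val;     /* single atomic 64-bit write */
    #else
            db[0] = val[0];
            wmb();          /* first half visible before the second */
            db[1] = val[1];
    #endif
    }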
-
Roland Dreier authored
Mem-free mode requires the driver to allocate additional doorbell pages for each user access region. Add support for this in mthca_memfree.c, and have the driver allocate a table in db_tab for kernel use. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Have the MAP_ICM_page() firmware command assume pages are always the HCA-native 4K size rather than using the kernel's page size. This will make handling doorbell pages for mem-free mode simpler. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Slightly improve debugging output for UNMAP_ICM and MODIFY_QP firmware commands. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Update interrupt handling code to handle mem-free mode. While we're at it, improve the Tavor interrupt handling to avoid an extra MMIO read of the event cause register. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Add code to initialize EQ context properly in both Tavor and mem-free mode. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Add support for mem-free mode to memory region code. This mostly amounts to properly munging between keys and indices. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Add support for mapping more memory into HCA's context to cover context tables when new objects are allocated. Pass the object size into mthca_alloc_icm_table(), reference count the ICM chunks, and add new mthca_table_get() and mthca_table_put() functions to handle mapping memory when allocating or destroying objects. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Add support for allocating user access regions (UARs). Use this to allocate a region for the kernel at driver init instead of using the hard-coded MTHCA_KAR_PAGE index. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Move the request/ioremap of regions related to event handling into mthca_eq.c. Map the correct regions depending on whether we're in Tavor or native mem-free mode. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Michael S. Tsirkin authored
Remove support for unsignaled receive requests. This is a non-standard extension to the IB spec that is not used by any known applications or protocols, and is not supported by newer hardware. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Roland Dreier authored
Simplify some of the code for CQ handling slightly. Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Michael S. Tsirkin authored
Locking during the poll CQ operation can be reduced by locking the CQ while the QP is being removed from the QP array. This also avoids an extra atomic operation for reference counting. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-
Michael S. Tsirkin authored
Avoid taking the CQ table lock in the fast path by using synchronize_irq() after removing a CQ from the table to make sure that no completion events are still in progress. This gets a nice speedup (about 4%) in IP-over-IB on my hardware. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
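A hedged sketch of the teardown ordering this relies on (the lock, array, and IRQ names are illustrative):

    /* Unpublish the CQ so no new events can find it... */
    spin_lock_irq(&dev->cq_table.lock);
    mthca_array_clear(&dev->cq_table.cq, cq->cqn);
    spin_unlock_irq(&dev->cq_table.lock);

    /* ...then wait out any completion handler already running, so the
     * event path itself never needs to take the table lock. */
    synchronize_irq(dev->pdev->irq);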
-
Michael S. Tsirkin authored
Clean up CQ code so that we only calculate the address of a CQ entry once when using it. Signed-off-by: Michael S. Tsirkin <mst@mellanox.co.il> Signed-off-by: Roland Dreier <roland@topspin.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
-