Commits · 9f81036c54ed1f860d2807c5a6aa4f2b30c21204 · nexedi / linux

21 May, 2007 4 commits

IB/cm: Improve local id allocation · 9f81036c

Michael S. Tsirkin authored May 21, 2007

The IB CM uses an idr for local id allocations, with a running counter
as start_id.  This fails to generate distinct ids if

1. An id is constantly created and destroyed
2. A chunk of ids just beyond the current next_id value is occupied

This in turn leads to an increased chance of a connection request being
mis-detected as a duplicate, sometimes for several retries, until
next_id gets past the block of allocated ids. This has been observed
in practice.

As a fix, remember the last id allocated and start immediately above it.
This also fixes a problem with the old code, where next_id might
overflow and become negative.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Acked-by: Sean Hefty <sean.hefty@intel.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
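
The allocation policy described in this commit can be illustrated with a small user-space sketch. It is not the kernel's idr-based code: the fixed-size `used[]` array, `MAX_IDS`, and the `id_pool_*` names are assumptions made for the example. It only shows the idea of remembering the last id handed out and searching upward from just above it, wrapping around instead of overflowing.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IDS 8  /* hypothetical, deliberately tiny id space */

struct id_pool {
	bool used[MAX_IDS]; /* true while an id is allocated */
	int last_id;        /* most recently allocated id, -1 before first use */
};

static void id_pool_init(struct id_pool *p)
{
	for (int i = 0; i < MAX_IDS; i++)
		p->used[i] = false;
	p->last_id = -1;
}

/*
 * Search upward from last_id + 1, wrapping to 0; last_id itself is
 * tried last. Returns -1 when every id is in use.
 */
static int id_pool_alloc(struct id_pool *p)
{
	for (int n = 1; n <= MAX_IDS; n++) {
		int id = (p->last_id + n) % MAX_IDS;

		if (!p->used[id]) {
			p->used[id] = true;
			p->last_id = id;
			return id;
		}
	}
	return -1;
}

static void id_pool_free(struct id_pool *p, int id)
{
	assert(id >= 0 && id < MAX_IDS);
	p->used[id] = false;
}
```

With this policy, an id that is allocated and immediately freed is not handed out again on the next call, because the search resumes above it.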

IPoIB/cm: Fix SRQ WR leak · 518b1646

Michael S. Tsirkin authored May 21, 2007

SRQ WR leakage has been observed with IPoIB/CM: e.g. flipping ports on
and off will, with time, leak out all WRs and then all connections
will start getting RNR NAKs.  Fix this in the way suggested by the spec:
move the QP being destroyed to the error state, wait for the "Last WQE
Reached" event and then post a WR on a "drain QP" connected to the same
CQ.  Once we observe a completion on the drain QP, it's safe to call
ib_destroy_qp.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipoib: Fix typos in error messages · 24bd1e4e

Michael S. Tsirkin authored May 18, 2007

Trivial error message fixups.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Check if SRQ is full when posting receive · 56a8c8b6

Roland Dreier authored May 20, 2007

Make mlx4_post_srq_recv() fail if the SRQ is full (head == tail).
Signed-off-by: Roland Dreier <rolandd@cisco.com>

20 May, 2007 1 commit

IB/mlx4: Pass send queue sizes from userspace to kernel · 2446304d

Eli Cohen authored May 17, 2007

Pass the number of WQEs for the send queue and their size from userspace
to the kernel to avoid having to keep the QP size calculations in sync
between the kernel driver and libmlx4. This fixes a bug seen with the
current mlx4_ib driver and current libmlx4 caused by a difference in the
calculated sizes for SQ WQEs. Also, this gives more flexibility for
userspace to experiment with using multiple WQE BBs for a single SQ WQE.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

19 May, 2007 19 commits

IB/mlx4: Fix check of opcode in mlx4_ib_post_send() · 59b0ed12

Roland Dreier authored May 19, 2007

wr->opcode is invalid if it's >= ARRAY_SIZE(mlx4_ib_opcode), not just
strictly >.

This was spotted by the Coverity checker (CID 1643).
Signed-off-by: Roland Dreier <rolandd@cisco.com>
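
The off-by-one in this commit can be shown with a minimal sketch. The table `demo_opcode[]` and the helper names are hypothetical stand-ins, not the driver's actual definitions. For an array of N entries the valid indices are 0 to N-1, so an index equal to N must already be rejected.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical opcode table; the real mlx4_ib_opcode[] has other contents. */
static const int demo_opcode[] = { 10, 11, 12 };

/* Old-style check: lets op == ARRAY_SIZE through, one past the end. */
static int opcode_valid_buggy(unsigned int op)
{
	return !(op > ARRAY_SIZE(demo_opcode));
}

/* Corrected check: anything >= ARRAY_SIZE is out of range. */
static int opcode_valid_fixed(unsigned int op)
{
	return !(op >= ARRAY_SIZE(demo_opcode));
}

/* Look up an opcode, returning -1 for an out-of-range index. */
static int opcode_lookup(unsigned int op)
{
	return opcode_valid_fixed(op) ? demo_opcode[op] : -1;
}
```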

mlx4_core: Fix array overrun in dump_dev_cap_flags() · 23c15c21

Roland Dreier authored May 19, 2007

Don't overrun fname[] array when decoding device flags.

This was spotted by the Coverity checker (CID 1642).
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Fix RESET to RESET and RESET to ERROR transitions · 65adfa91

Michael S. Tsirkin authored May 14, 2007

According to the IB spec, a QP can be moved from RESET back to RESET
or to the ERROR state, but mlx4 firmware does not support this and
returns an error if we try.  Fix the RESET to RESET transition by
just returning 0 without doing anything, and fix RESET to ERROR by
moving the QP from RESET to INIT with dummy parameters and then
transitioning from INIT to ERROR.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
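
A rough sketch of the workaround described in this commit follows. `fw_modify()` is a made-up stand-in for the firmware command, modelled here as rejecting every transition out of RESET except RESET to INIT; the enum and the call counter are likewise assumptions for illustration, not the driver's code.

```c
#include <assert.h>

enum qp_state { QP_RESET, QP_INIT, QP_ERR };

static int fw_calls; /* counts how often the fake firmware is invoked */

/* Hypothetical firmware model: from RESET, only RESET -> INIT is accepted. */
static int fw_modify(enum qp_state *cur, enum qp_state next)
{
	fw_calls++;
	if (*cur == QP_RESET && next != QP_INIT)
		return -1;
	*cur = next;
	return 0;
}

/* Driver-side wrapper that hides the firmware limitation. */
static int modify_qp(enum qp_state *cur, enum qp_state next)
{
	/* RESET -> RESET: nothing to do, report success without firmware. */
	if (*cur == QP_RESET && next == QP_RESET)
		return 0;

	/* RESET -> ERROR: detour through INIT (dummy parameters in the driver). */
	if (*cur == QP_RESET && next == QP_ERR) {
		int err = fw_modify(cur, QP_INIT);

		if (err)
			return err;
	}
	return fw_modify(cur, next);
}
```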

IB/mthca: Fix RESET to ERROR transition · b18aad71

Michael S. Tsirkin authored May 14, 2007

According to the IB spec, a QP can be moved from RESET to the ERROR
state, but mthca firmware does not support this and returns an error if
we try. Work around this FW limitation by moving the QP from RESET to
INIT with dummy parameters and then transitioning from INIT to ERROR.
Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Set GRH:HopLimit when sending globally routed MADs · 15261303

Roland Dreier authored May 19, 2007

This is the same issue discovered in mthca by Rolf Manderscheid
<rvm@obsidianresearch.com>.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mthca: Set GRH:HopLimit when building MLX headers · 3f37cae6

Rolf Manderscheid authored May 17, 2007

Global CM packets used by rdma_cm were being sent with a GRH:hopLimit
of zero, causing them to be dropped by the router. The problem is a
missing initialization of the hop_limit field in mthca_read_ah(),
which was called by build_mlx_header() when sending a MAD on QP1.
Signed-off-by: Rolf Manderscheid <rvm@obsidianresearch.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/mlx4: Fix check of max_qp_dest_rdma in modify QP · 1f8f7b7a

Eli Cohen authored May 17, 2007

max_qp_dest_rdma is already in natural units - no need to shift.  This
was discovered by a test that deliberately requests more outstanding
atomic operations than the device supports.

Found by Sagi Rotem at Mellanox.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
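
To see why the extra shift matters, here is a small sketch. It assumes the old check treated the capability as a log2 value and shifted it; the constant `DEMO_MAX_QP_DEST_RDMA` and both helper names are invented for the example and do not come from the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical capability, already in natural units (a plain count). */
#define DEMO_MAX_QP_DEST_RDMA 4u

/* Assumed shape of the old check: shifts a value that is not a log2. */
static bool dest_rdma_ok_buggy(unsigned int requested)
{
	return requested <= (1u << DEMO_MAX_QP_DEST_RDMA); /* limit becomes 16 */
}

/* Corrected check: compare against the capability directly. */
static bool dest_rdma_ok_fixed(unsigned int requested)
{
	return requested <= DEMO_MAX_QP_DEST_RDMA; /* limit stays 4 */
}
```

Under these assumptions, a request for 10 outstanding operations passes the shifted check but is rejected by the direct comparison, which matches the kind of test that exposed the bug.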

IB/mthca: Fix use-after-free on device restart · de57c9f1

Ali Ayoub authored May 17, 2007

Signed-off-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ehca: Return proper error code if register_mr fails · bd5a6ccc

Hoang-Nam Nguyen authored May 16, 2007

Set the return code of ehca_register_mr() to ENOMEM if the corresponding
firmware call fails due to out of resources. Some other error codes
were explicitly mapped to EINVAL -- just remove those cases so they
get mapped to the default case, which already returns EINVAL anyway.
Signed-off-by: Hoang-Nam Nguyen <hnguyen@de.ibm.com>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
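
The mapping described in this commit might look roughly like the sketch below. The `fw_status` values are hypothetical placeholders for the firmware return codes, which are not listed in the commit message; only the shape of the switch is the point.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical firmware status codes; the real codes differ. */
enum fw_status { FW_OK, FW_NO_RESOURCES, FW_BAD_PARAM, FW_BAD_HANDLE };

static int fw_status_to_errno(enum fw_status st)
{
	switch (st) {
	case FW_OK:
		return 0;
	case FW_NO_RESOURCES:
		return -ENOMEM;  /* out of resources gets its own code */
	default:
		return -EINVAL;  /* everything else falls through to the default */
	}
}
```

Dropping the explicit EINVAL cases keeps the behaviour identical while leaving a single place where the default is defined.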

IPoIB: Handle P_Key table reordering · 26bbf13c

Yosef Etigin authored May 19, 2007

SM reconfiguration or failover possibly causes a shuffling of the values
in the P_Key table. Right now, IPoIB only queries for the P_Key index
once when it creates the device QP, and hence there are problems if the
index of a P_Key value changes.  Fix this by using the PKEY_CHANGE event
to trigger a recheck of the P_Key index.
Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>
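
The effect of a reshuffled table can be sketched in user space. `find_pkey_index()` is a hypothetical helper, not the kernel function, and comparing only the low 15 bits (ignoring the membership bit) is an assumption of this sketch. A cached index goes stale as soon as the table order changes, which is why the index has to be looked up again after a PKEY_CHANGE event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of pkey in table, or -1 if it is absent.
 * Assumption: bit 15 (membership) is ignored in the comparison.
 */
static int find_pkey_index(const uint16_t *table, size_t len, uint16_t pkey)
{
	for (size_t i = 0; i < len; i++)
		if ((table[i] & 0x7fff) == (pkey & 0x7fff))
			return (int)i;
	return -1;
}
```

For example, with the table `{ 0xffff, 0x8001 }` the value 0x8001 sits at index 1, but after a reorder to `{ 0x8001, 0xffff }` it sits at index 0.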

IB/core: Use start_port() and end_port() · 1af4c435

Roland Dreier authored May 19, 2007

Clean up ib_query_port() and ib_modify_port() slightly by using the 
just-added start_port() and end_port() helpers.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/core: Add helpers for uncached GID and P_Key searches · 5eb620c8

Yosef Etigin authored May 14, 2007

Add ib_find_gid() and ib_find_pkey() functions that use uncached device
queries. The calls might block but the returns are always up-to-date.
Cache P_Key and GID table lengths in core to avoid extra port info queries.
Signed-off-by: Yosef Etigin <yosefe@voltaire.com>
Acked-by: Michael S. Tsirkin <mst@dev.mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/ipath: Fix potential deadlock with multicast spinlocks · 8b8c8bca

Roland Dreier authored May 19, 2007

Lockdep found the following potential deadlock between mcast_lock and
n_mcast_grps_lock: mcast_lock is taken from both interrupt context and
process context, so spin_lock_irqsave() must be used to take it.
n_mcast_grps_lock is only taken from process context, so at first it
seems safe to take it with plain spin_lock(); however, it also nests
inside mcast_lock, and hence we could deadlock:

  cpu A                                   cpu B
    ipath_mcast_add():
      spin_lock_irq(&mcast_lock);

                                            ipath_mcast_detach():
                                              spin_lock(&n_mcast_grps_lock);

                                            <enter interrupt>

                                            ipath_mcast_find():
                                              spin_lock_irqsave(&mcast_lock);

      spin_lock(&n_mcast_grps_lock);

Fix this by using spin_lock_irq() to take n_mcast_grps_lock.
Signed-off-by: Roland Dreier <rolandd@cisco.com>

IB/core: Free umem when mm is already gone · 7b82cd8e

Eli Cohen authored May 14, 2007

Free umem when task's mm is already destroyed by the time
ib_umem_release gets called.

Found by Dotan Barak at Mellanox.
Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Roland Dreier <rolandd@cisco.com>

Linux v2.6.22-rc2 · 55b637c6

Linus Torvalds authored May 18, 2007

cciss: Fix pci_driver.shutdown while device is still active · e9ca75b5

Gerald Britton authored May 14, 2007

Fix an Oops in the cciss driver caused by system shutdown while a filesystem
on a cciss device is still active.  The cciss_remove_one function only
properly removes the device if the device has been cleanly released by its
users, which is not the case when the pci_driver.shutdown method is called.

This patch adds a new cciss_shutdown function to better match the pattern
used by various SCSI drivers: deactivate device interrupts and flush caches.
It also alters the cciss_remove_one function to match and re-adds the
__devexit annotation that was removed when cciss_remove_one was serving as
the pci_driver.shutdown method.
Signed-off-by: Gerald Britton <gbritton@alum.mit.edu>
Acked-by: Mike Miller <mike.miller@hp.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Further update of the i386 boot documentation · dec04cff

H. Peter Anvin authored May 17, 2007

A number of items in the i386 boot documentation have been either
vague, outdated or hard to read.  This text revision adds several more
examples, including a memory map for a modern kernel load.  It also
adds a field-by-field detailed description of the setup header, and a
bootloader ID for Qemu.
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 · 66123549

Linus Torvalds authored May 18, 2007

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  [CRYPTO] tcrypt: Add missing error check
  [CRYPTO] padlock: Make CRYPTO_DEV_PADLOCK a tristate again

Fix roundup_pow_of_two(1) · 1a06a52e

Rolf Eike Beer authored May 17, 2007

1 is a power of two, therefore roundup_pow_of_two(1) should return 1. It does
so when the argument is a variable, but when it is a constant it misbehaves
and returns 0. Probably nobody ever did this, so it was never noticed.
Signed-off-by: Rolf Eike Beer <eike-kernel@sf-tec.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
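
The expected behaviour can be pinned down with a plain reference function. This is a loop-based user-space sketch with a made-up name, `roundup_pow2()`, and is not the kernel macro, which has separate constant and variable paths.

```c
#include <assert.h>
#include <limits.h>

/*
 * Smallest power of two that is >= n.
 * Requires 1 <= n <= 2^(bits-1) so the result fits in unsigned long.
 */
static unsigned long roundup_pow2(unsigned long n)
{
	unsigned long p = 1;

	assert(n != 0 && n <= (ULONG_MAX >> 1) + 1);
	while (p < n)
		p <<= 1;
	return p;
}
```

Because the loop never runs for n equal to 1, the result is 1, which is the value both code paths of the real macro should agree on.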

18 May, 2007 16 commits

timerfd use waitqueue lock ... · 18963c01

Davide Libenzi authored May 18, 2007

The timerfd was using the unlocked waitqueue operations, but it was
using a different lock, so poll_wait() would race with it.

This makes timerfd directly use the waitqueue lock.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

eventfd use waitqueue lock ... · d48eb233

Davide Libenzi authored May 18, 2007

The eventfd was using the unlocked waitqueue operations, but it was
using a different lock, so poll_wait() would race with it.

This makes eventfd directly use the waitqueue lock.
Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc · 347b4599

Linus Torvalds authored May 18, 2007

* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: (32 commits)
  [POWERPC] Remove build warnings in windfarm_core
  [POWERPC] Pass per-file CFLAGs for platform specific op codes
  [POWERPC] Correct #endif comment
  [POWERPC] Fix ppc_rtas_progress_show()
  [POWERPC] Fix sed command lines for zlib source construction
  [POWERPC] Specify GNUTARGET on $(AR) invocations
  [POWERPC] Make sure device node type/name is not NULL on hot-added nodes
  [POWERPC] Small fixes for the Ebony device tree
  [POWERPC] Fix warning on UP
  [POWERPC] cell_defconfig: Disable cpufreq and pmi
  [POWERPC] Fix IO space on PCI buses created from of_platform
  [POWERPC] Add spinlock to request_phb_iospace()
  [POWERPC] Fix make rules for treeImage.initrd
  [POWERPC] Remove warning in mpic.c
  [POWERPC] Update pasemi_defconfig
  [POWERPC] pasemi: CONFIG_GENERIC_TBSYNC no longer needed
  [POWERPC] Update iseries_defconfig
  [POWERPC] Wire up some more syscalls
  [POWERPC] Fix bug adding properties with flatdevtree.c's ft_set_prop()
  [POWERPC] Remove fixup_bigphys_addr() for arch/powerpc to avoid link error
  ...

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6 · 939e3428

Linus Torvalds authored May 18, 2007

* master.kernel.org:/pub/scm/linux/kernel/git/davem/sparc-2.6:
  [SPARC64]: Fix sched_clock() et al.

Merge master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 · bfea13d4

Linus Torvalds authored May 18, 2007

* master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6:
  [IPV4]: Remove IPVS icmp hack from route.c for now.
  [IPV4]: Correct rp_filter help text.
  [TCP]: TCP_CONG_YEAH requires TCP_CONG_VEGAS
  [TCP] slow start: Make comments and code logic clearer.
  [BLUETOOTH]: Fix locking in hci_sock_dev_event().
  [NET]: Fix BMSR_100{HALF,FULL}2 defines in linux/mii.h
  [NET]: lockdep classes in register_netdevice

slub: another slabinfo fix · 32f9306b

Christoph Lameter authored May 18, 2007

The slab manipulation functions should not be triggered by slabs that
are unresolvable in the subset of slabs selected on the command line.
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

revert "cancel_delayed_work: use del_timer() instead of del_timer_sync()" · 223a10a9

Oleg Nesterov authored May 18, 2007

As pointed out by Jarek Poplawski, the patch

	[WORKQUEUE]: cancel_delayed_work: use del_timer() instead of del_timer_sync()
	commit: 071b6386

was wrong; it was merged by mistake after that.

From the changelog:

	after this patch:
		...
		delayed_work_timer_fn->__queue_work() in progress.

		The latter doesn't differ from the caller's POV,

it does make a difference if the caller calls flush_workqueue() after
cancel_delayed_work(): in that case flush_workqueue() can miss this
work_struct.
Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Jarek Poplawski <jarkao2@o2.pl>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

[IPV4]: Remove IPVS icmp hack from route.c for now. · f6c5d736

David S. Miller authored May 18, 2007

Revert: 2d771cd8

This is dangerous if enabled and a better solution to the
problem is being worked on.
Signed-off-by: David S. Miller <davem@davemloft.net>

[CRYPTO] tcrypt: Add missing error check · 29059d12

Herbert Xu authored May 18, 2007

The return value of crypto_hash_final isn't checked in test_hash_cycles.
This patch corrects this. Thanks to Eric Sesterhenn for reporting this.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

[SPARC64]: Fix sched_clock() et al. · 03983ab8

David S. Miller authored May 17, 2007

SPARC64_NSEC_PER_CYC_SHIFT was set too high.
Signed-off-by: David S. Miller <davem@davemloft.net>

Revert "[PATCH] x86: Drop cc-options call for all options supported in gcc 3.2+" · b4652239

Linus Torvalds authored May 17, 2007

This reverts commit c8fdd247.

It turns out the kernel was correct, and the gcc complaint was a gcc
bug.  The preferred stack boundary is expressed not in bytes, but in the
log2() of the preferred boundary, so "-mpreferred-stack-boundary=2"
is in fact exactly what we want, but a gcc that is compiled for x86-64
will consider it an error (because the 64-bit calling sequence says that
the stack should be 16-byte aligned) even if we are then using "-m32" to
generate 32-bit code.
Noted-by: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz>
Cc: Jan Hubicka <jh@suse.cz>
Acked-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
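
The arithmetic behind the option value is short enough to spell out. The helper name `stack_boundary_bytes()` is invented for this sketch; it only converts the log2 value given on the command line into bytes.

```c
#include <assert.h>

/* Convert a -mpreferred-stack-boundary=N value (a log2) into bytes. */
static unsigned int stack_boundary_bytes(unsigned int n)
{
	assert(n < 32); /* keep the shift well defined for 32-bit unsigned */
	return 1u << n;
}
```

So a value of 2 means 4-byte alignment, which suits 32-bit code, while 4 corresponds to the 16-byte alignment that the 64-bit calling sequence expects.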

[CRYPTO] padlock: Make CRYPTO_DEV_PADLOCK a tristate again · d158325e

Herbert Xu authored May 18, 2007

Turning it into a boolean was unnecessary and caused ALGAPI to be
pinned down as a boolean too. This patch makes it a tristate again.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>

Merge git://git.linux-nfs.org/pub/linux/nfs-2.6 · b42895d6

Linus Torvalds authored May 17, 2007

* git://git.linux-nfs.org/pub/linux/nfs-2.6:
  SUNRPC: Fix sparse warnings
  NLM: Fix sparse warnings
  NFS: Fix more sparse warnings
  NFS: Fix some 'sparse' warnings...
  SUNRPC: remove dead variable 'rpciod_running'
  NFS4: Fix incorrect use of sizeof() in fs/nfs/nfs4xdr.c
  NFS: use zero_user_page
  NLM: don't use CLONE_SIGHAND in nlmclnt_recovery
  NLM: Fix locking client timeouts...

Merge branch 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev · d3a36fb8

Linus Torvalds authored May 17, 2007

* 'upstream-linus' of master.kernel.org:/pub/scm/linux/kernel/git/jgarzik/libata-dev:
  sata_via: pcim_iomap_regions() conversion missed BAR5
  libata: remove libata.spindown_compat
  sata_nv: fix fallout of devres conversion
  drivers/ata: remove the wildcard from sata_nv driver

sata_via: pcim_iomap_regions() conversion missed BAR5 · 8fd7d1b1

Tejun Heo authored May 17, 2007

pcim_iomap_regions() conversion missed BAR5.  Fix it.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>

libata: remove libata.spindown_compat · d9aca22c

Tejun Heo authored May 17, 2007

With STANDBYDOWN tracking added, libata.spindown_compat isn't
necessary anymore.  If userspace shutdown(8) issues STANDBYNOW, libata
warns.  If userspace shutdown(8) doesn't issue STANDBYNOW, libata does
the right thing.  Userspace can tell whether kernel supports spindown
by testing whether sysfs node manage_start_stop exists as before.
Signed-off-by: Tejun Heo <htejun@gmail.com>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
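
A userspace existence test of the kind mentioned in this commit could be as small as the sketch below. The helper name is made up, and the full sysfs path is not given in the commit message, so the caller has to supply it; a path such as `/sys/class/scsi_disk/0:0:0:0/manage_start_stop` is used here only as an assumed example.

```c
#include <assert.h>
#include <unistd.h>

/* Return 1 if the given sysfs node (or any path) exists, 0 otherwise. */
static int sysfs_node_exists(const char *path)
{
	assert(path != NULL);
	return access(path, F_OK) == 0;
}
```

A shutdown tool could call `sysfs_node_exists()` on the node for each disk and skip its own spindown command when the node is present.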
