- 26 Sep, 2014 3 commits
-
-
Sebastian Andrzej Siewior authored
The serial8250_do_startup() function unconditionally clears the interrupts and for that it reads from the RX-FIFO without checking if there is a byte in the FIFO or not. This works fine on OMAP4+ HW like AM335x or DRA7. OMAP3630 ES1.1 (which means probably all OMAP3 and earlier) does not like this: |Unhandled fault: external abort on non-linefetch (0x1028) at 0xfb020000 |Internal error: : 1028 [#1] ARM |Modules linked in: |CPU: 0 PID: 1 Comm: swapper Not tainted 3.16.0-00022-g7edcb57-dirty #1213 |task: de0572c0 ti: de058000 task.ti: de058000 |PC is at mem32_serial_in+0xc/0x1c |LR is at serial8250_do_startup+0x220/0x85c |Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment kernel |Control: 10c5387d Table: 80004019 DAC: 00000015 |[<c03051d4>] (mem32_serial_in) from [<c0307fe8>] (serial8250_do_startup+0x220/0x85c) |[<c0307fe8>] (serial8250_do_startup) from [<c0309e00>] (omap_8250_startup+0x5c/0xe0) |[<c0309e00>] (omap_8250_startup) from [<c030863c>] (serial8250_startup+0x18/0x2c) |[<c030863c>] (serial8250_startup) from [<c030394c>] (uart_startup+0x78/0x1d8) |[<c030394c>] (uart_startup) from [<c0304678>] (uart_open+0xe8/0x114) |[<c0304678>] (uart_open) from [<c02e9e10>] (tty_open+0x1a8/0x5a4) Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sebastian Andrzej Siewior authored
While comparing the OMAP-serial and the 8250 part of this I noticed that the latter does not use run time-pm. Here are the pieces. It is basically a get before first register access and a last_busy + put after last access. This has to be enabled from userland _and_ UART_CAP_RPM is required for this. The runtime PM can usually work transparently in the background however there is one exception to this: After serial8250_tx_chars() completes there still may be unsent bytes in the FIFO (depending on CPU speed vs baud rate + flow control). Even if the TTY-buffer is empty we do not want RPM to disable the device because it won't send the remaining bytes. Instead we leave serial8250_tx_chars() with RPM enabled and wait for the FIFO empty interrupt. Once we enter serial8250_tx_chars() with an empty buffer we know that the FIFO is empty and since we are not going to send anything, we can disable the device. That xchg() is to ensure that serial8250_tx_chars() can be called multiple times and only the first invocation will actually invoke the runtime PM function. So that the last invocation of __stop_tx() will disable runtime pm. NOTE: do not enable RPM on the device unless you know what you do! If the device goes idle, it won't be woken up by incomming RX data _unless_ there is a wakeup irq configured which is usually the RX pin configure for wakeup via the reset module. The RX activity will then wake up the device from idle. However the first character is garbage and lost. The following bytes will be received once the device is up in time. On the beagle board xm (omap3) it takes approx 13ms from the first wakeup byte until the first byte that is received properly if the device was in core-off. v5…v8: - drop RPM from serial8250_set_mctrl() it will be used in restore path which already has RPM active and holds dev->power.lock v4…v5: - add a wrapper around rpm function and introduce UART_CAP_RPM to ensure RPM put is invoked after the TX FIFO is empty. v3…v4: - added runtime to the console code - removed device_may_wakeup() from serial8250_set_sleep() Cc: mika.westerberg@linux.intel.com Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sebastian Andrzej Siewior authored
The OMAP UART provides support for HW assisted flow control. What is missing is the support to throttle / unthrottle callbacks which are used by the omap-serial driver at the moment. This patch adds the callbacks. It should be safe to add them since they are only invoked from the serial_core (uart_throttle()) if the feature flags are set. Reviewed-by: Tony Lindgren <tony@atomide.com> Tested-by: Tony Lindgren <tony@atomide.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 25 Sep, 2014 1 commit
-
-
Peter Hurley authored
Commit c545b66c, 'tty: Serialize tcflow() with other tty flow control changes' and commit 99416322, 'tty: Workaround Alpha non-atomic byte storage in tty_struct' work around compiler bugs and non-atomic storage on multiple arches by padding bitfields out to the declared type which is unsigned long. However, the width varies by arch. Pad bitfields to actual width of unsigned long (which is BITS_PER_LONG). Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 24 Sep, 2014 15 commits
-
-
Peter Hurley authored
The Alpha EV4/EV5 cpus can corrupt adjacent byte and short data because those cpus use RMW to store byte and short data. Thus, concurrent adjacent byte stores could become corrupted, if serialized by a different lock. tty_struct uses different locks to protect certain fields within the structure, and thus is vulnerable to byte stores which are not atomic. Merge the ->ctrl_status byte and packet mode bit, both protected by the ->ctrl_lock, into an unsigned long. The padding bits are necessary to force the compiler to allocate the type specified; otherwise, gcc will ignore the type specifier and allocate the minimum number of bytes required to store the bitfield. In turn, this would allow Alpha EV4/EV5 cpus to corrupt adjacent byte or short storage (because those cpus use RMW to store byte and short data). gcc versions < 4.7.2 will also corrupt storage adjacent to bitfields smaller than unsigned long on ia64, ppc64, hppa64, and sparc64, thus requiring more than unsigned int storage (which would otherwise be sufficient to fix the Alpha non-atomic storage problem). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
While transmitting a START/STOP char for tcflow(TCION/TCIOFF), prevent a termios change. Otherwise, a garbage in-band flow control char may be sent, if the termios change overlaps the transmission setup. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Relocate the file-scope function, send_prio_char(), as a global helper tty_send_xchar(). Remove the global declarations for tty_write_lock()/tty_write_unlock(), as these are file-scope only now. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Use newly-introduced tty->flow_lock to serialize updates to tty->flow_stopped (via tcflow()) and with concurrent tty flow control changes from other sources. Merge the storage for ->stopped and ->flow_stopped, now that both flags are serialized by ->flow_lock. The padding bits are necessary to force the compiler to allocate the type specified; otherwise, gcc will ignore the type specifier and allocate the minimum number of bytes necessary to store the bitfield. In turn, this would allow Alpha EV4 and EV5 cpus to corrupt adjacent byte storage because those cpus use RMW to store byte and short data. gcc versions < 4.7.2 will also corrupt storage adjacent to bitfields smaller than unsigned long on ia64, ppc64, hppa64 and sparc64, thus requiring more than unsigned int storage (which would otherwise be sufficient to workaround the Alpha non-atomic byte/short storage problem). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
When a master pty is set to packet mode, flow control changes to the slave pty cause notifications to the master pty via reads and polls. However, these tests are occurring for all ttys, not just ptys. Implement flow control packet mode notifications in the pty driver. Only the slave side implements the flow control handlers since packet mode is asymmetric; the master pty receives notifications for slave-side changes, but not vice versa. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Without serialization, the flow control state can become inverted wrt. the actual hardware state. For example, CPU 0 | CPU 1 stop_tty() | lock ctrl_lock | tty->stopped = 1 | unlock ctrl_lock | | start_tty() | lock ctrl_lock | tty->stopped = 0 | unlock ctrl_lock | driver->start() driver->stop() | In this case, the flow control state now indicates the tty has been started, but the actual hardware state has actually been stopped. Introduce tty->flow_lock spinlock to serialize tty flow control changes. Split out unlocked __start_tty()/__stop_tty() flavors for use by ioctl(TCXONC) in follow-on patch. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The stopped, hw_stopped, flow_stopped and packet bits are smp-unsafe and interrupt-unsafe. For example, CPU 0 | CPU 1 | tty->flow_stopped = 1 | tty->hw_stopped = 0 One of these updates will be corrupted, as the bitwise operation on the bitfield is non-atomic. Ensure each flag has a separate memory location, so concurrent updates do not corrupt orthogonal states. Because DEC Alpha EV4 and EV5 cpus (from 1995) perform RMW on smaller-than-machine-word storage, "separate memory location" must be int instead of byte. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
uart_set_termios() is called with interrupts enabled; no need to save and restore the interrupt state when taking the uart port lock. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
Commit 64851636, serial: bfin-uart: Remove ASYNC_CTS_FLOW flag for hardware automatic CTS, open-codes uart_handle_cts_change() when CONFIG_SERIAL_BFIN_HARD_CTSRTS to skip start and stop tx. But the CTS interrupt handler _still_ calls uart_handle_cts_change(); only call uart_handle_cts_change() if !CONFIG_SERIAL_BFIN_HARD_CTSRTS. cc: Sonic Zhang <sonic.zhang@analog.com> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The tty core does not test tty->hw_stopped; remove from drivers which don't test it themselves. Acked-by: Johan Hovold <johan@kernel.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
tty->hw_stopped is not used by the tty core and is thread-unsafe; hw_stopped is a member of a bitfield whose fields are updated non-atomically and no lock is suitable for serializing updates. Replace serial core usage of tty->hw_stopped with uport->hw_stopped. Use int storage which works around Alpha EV4/5 non-atomic byte storage, since uart_port uses different locks to protect certain fields within the structure. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
ISDN4Linux always enables CTS flow control and does not use the tty_port_cts_enabled() helper function; remove ASYNC_CTS_FLOW state enable/disable. cc: Karsten Keil <isdn@linux-pingi.de> cc: <netdev@vger.kernel.org> Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The serial core uses the tty port flags, ASYNC_CTS_FLOW and ASYNC_CD_CHECK, to track whether CTS and DCD changes should be ignored or handled. However, the tty port flags are not safe for atomic bit operations and no lock provides serialized updates. Introduce the struct uart_port status field to track CTS and DCD enable states, and serialize access with uart port lock. Substitute uart_cts_enabled() helper for tty_port_cts_enabled(). Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
The serial core provides two helper functions, uart_handle_dcd_change() and uart_handle_cts_change(), for UART drivers to use at interrupt time. The serial core expects the UART driver to hold the uart port lock when calling these helpers to prevent state corruption. If lockdep enabled, trigger a warning if the uart port lock is not held when calling these helper functions. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peter Hurley authored
An interface may need to assert a lock invariant and not flood the system logs; add a lockdep helper macro equivalent to lockdep_assert_held() which only WARNs once. Signed-off-by: Peter Hurley <peter@hurleysoftware.com> Acked-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 20 Sep, 2014 1 commit
-
-
Alexander Shiyan authored
This patch fixes COMPILE_TEST build of serial_mctrl_gpio module for architectures with custom termios.h header. sparc64:allmodconfig: In file included from drivers/tty/serial/serial_mctrl_gpio.c:21:0: include/uapi/asm-generic/termios.h:22:8: error: redefinition of 'struct termio' ./arch/sparc/include/uapi/asm/termbits.h:16:8: note: originally defined here make[3]: *** [drivers/tty/serial/serial_mctrl_gpio.o] Error 1 Reported-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Alexander Shiyan <shc_work@mail.ru> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 15 Sep, 2014 4 commits
-
-
Greg Kroah-Hartman authored
We want those fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs fixes from Al Viro: "double iput() on failure exit in lustre, racy removal of spliced dentries from ->s_anon in __d_materialise_dentry() plus a bunch of assorted RCU pathwalk fixes" The RCU pathwalk fixes end up fixing a couple of cases where we incorrectly dropped out of RCU walking, due to incorrect initialization and testing of the sequence locks in some corner cases. Since dropping out of RCU walk mode forces the slow locked accesses, those corner cases slowed down quite dramatically. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: be careful with nd->inode in path_init() and follow_dotdot_rcu() don't bugger nd->seq on set_root_rcu() from follow_dotdot_rcu() fix bogus read_seqretry() checks introduced in b37199e6 move the call of __d_drop(anon) into __d_materialise_unique(dentry, anon) [fix] lustre: d_make_root() does iput() on dentry allocation failure
-
Linus Torvalds authored
The performance regression that Josef Bacik reported in the pathname lookup (see commit 99d263d4 "vfs: fix bad hashing of dentries") made me look at performance stability of the dcache code, just to verify that the problem was actually fixed. That turned up a few other problems in this area. There are a few cases where we exit RCU lookup mode and go to the slow serializing case when we shouldn't, Al has fixed those and they'll come in with the next VFS pull. But my performance verification also shows that link_path_walk() turns out to have a very unfortunate 32-bit store of the length and hash of the name we look up, followed by a 64-bit read of the combined hash_len field. That screws up the processor store to load forwarding, causing an unnecessary hickup in this critical routine. It's caused by the ugly calling convention for the "hash_name()" function, and easily fixed by just making hash_name() fill in the whole 'struct qstr' rather than passing it a pointer to just the hash value. With that, the profile for this function looks much smoother. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
- 14 Sep, 2014 12 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linuxLinus Torvalds authored
Pull parisc updates from Helge Deller: "The most important patch is a new Light Weigth Syscall (LWS) for 8, 16, 32 and 64 bit atomic CAS operations which is required in order to be able to implement the atomic gcc builtins on our platform. Other than that, we wire up the seccomp, getrandom and memfd_create syscalls, fixes a minor off-by-one bug and a wrong printk string" * 'parisc-3.17-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Implement new LWS CAS supporting 64 bit operations. parisc: Wire up seccomp, getrandom and memfd_create syscalls parisc: dino: fix %d confusingly prefixed with 0x in format string parisc: sys_hpux: NUL terminator is one past the end
-
Al Viro authored
in the former we simply check if dentry is still valid after picking its ->d_inode; in the latter we fetch ->d_inode in the same places where we fetch dentry and its ->d_seq, under the same checks. Cc: stable@vger.kernel.org # 2.6.38+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
return the value instead, and have path_init() do the assignment. Broken by "vfs: Fix absolute RCU path walk failures due to uninitialized seq number", which was Cc-stable with 2.6.38+ as destination. This one should go where it went. To avoid dummy value returned in case when root is already set (it would do no harm, actually, since the only caller that doesn't ignore the return value is guaranteed to have nd->root *not* set, but it's more obvious that way), lift the check into callers. And do the same to set_root(), to keep them in sync. Cc: stable@vger.kernel.org # 2.6.38+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
git://github.com/jonmason/ntbLinus Torvalds authored
Pull ntb driver bugfixes from Jon Mason: "NTB driver fixes for queue spread and buffer alignment. Also, update to MAINTAINERS to reflect new e-mail address" * tag 'ntb-3.17' of git://github.com/jonmason/ntb: ntb: Add alignment check to meet hardware requirement MAINTAINERS: update NTB info NTB: correct the spread of queues over mw's
-
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tipLinus Torvalds authored
Pull ARM irq chip fixes from Thomas Gleixner: "Another pile of ARM specific irq chip fixlets: - off by one bugs in the crossbar driver - missing annotations - a bunch of "make it compile" updates I pulled the lot today from Jason, but it has been in -next for at least a week" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: gic-v3: Declare rdist as __percpu pointer to __iomem pointer irqchip: gic: Make gic_default_routable_irq_domain_ops static irqchip: exynos-combiner: Fix compilation error on ARM64 irqchip: crossbar: Off by one bugs in init irqchip: gic-v3: Tag all low level accessors __maybe_unused irqchip: gic-v3: Only define gic_peek_irq() when building SMP
-
git://git.infradead.org/users/jcooper/linuxThomas Gleixner authored
irqchip fixes for v3.17 from Jason Cooper - GIC/GICV3: Various fixlets - crossbar: Fix off-by-one bug - exynos-combiner: Fix arm64 build error
-
Dave Jiang authored
The NTB translate register must have the value to be BAR size aligned. This alignment check make sure that the DMA memory allocated has the proper alignment. Another requirement for NTB to function properly with memory window BAR size greater or equal to 4M is to use the CMA feature in 3.16 kernel with the appropriate CONFIG_CMA_ALIGNMENT and CONFIG_CMA_SIZE_MBYTES set. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jon Mason <jdmason@kudzu.us>
-
Jon Mason authored
Update my contact info to my personal email address and add Dave Jiang. Signed-off-by: Jon Mason <jon.mason@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
-
Jon Mason authored
The detection of an uneven number of queues on the given memory windows was not correct. The mw_num is zero based and the mod should be division to spread them evenly over the mw's. Signed-off-by: Jon Mason <jon.mason@intel.com>
-
Al Viro authored
read_seqretry() returns true on mismatch, not on match... Cc: stable@vger.kernel.org # 3.15+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
and lock the right list there Cc: stable@vger.kernel.org Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
Al Viro authored
double-free is a bad thing Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 13 Sep, 2014 4 commits
-
-
Linus Torvalds authored
Merge branches 'locking-urgent-for-linus' and 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex and timer fixes from Thomas Gleixner: "A oneliner bugfix for the jinxed futex code: - Drop hash bucket lock in the error exit path. I really could slap myself for intruducing that bug while fixing all the other horror in that code three month ago ... and the timer department is not too proud about the following fixes: - Deal with a long standing rounding bug in the timeval to jiffies conversion. It's a real issue and this fix fell through the cracks for quite some time. - Another round of alarmtimer fixes. Finally this code gets used more widely and the subtle issues hidden for quite some time are noticed and fixed. Nothing really exciting, just the itty bitty details which bite the serious users here and there" * 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Unlock hb->lock in futex_wait_requeue_pi() error path * 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: alarmtimer: Lock k_itimer during timer callback alarmtimer: Do not signal SIGEV_NONE timers alarmtimer: Return relative times in timer_gettime jiffies: Fix timeval conversion to jiffies
-
Guy Martin authored
The current LWS cas only works correctly for 32bit. The new LWS allows for CAS operations of variable size. Signed-off-by: Guy Martin <gmsoft@tuxicoman.be> Cc: <stable@vger.kernel.org> # 3.13+ Signed-off-by: Helge Deller <deller@gmx.de>
-
Linus Torvalds authored
Josef Bacik found a performance regression between 3.2 and 3.10 and narrowed it down to commit bfcfaa77 ("vfs: use 'unsigned long' accesses for dcache name comparison and hashing"). He reports: "The test case is essentially for (i = 0; i < 1000000; i++) mkdir("a$i"); On xfs on a fio card this goes at about 20k dir/sec with 3.2, and 12k dir/sec with 3.10. This is because we spend waaaaay more time in __d_lookup on 3.10 than in 3.2. The new hashing function for strings is suboptimal for < sizeof(unsigned long) string names (and hell even > sizeof(unsigned long) string names that I've tested). I broke out the old hashing function and the new one into a userspace helper to get real numbers and this is what I'm getting: Old hash table had 1000000 entries, 0 dupes, 0 max dupes New hash table had 12628 entries, 987372 dupes, 900 max dupes We had 11400 buckets with a p50 of 30 dupes, p90 of 240 dupes, p99 of 567 dupes for the new hash My test does the hash, and then does the d_hash into a integer pointer array the same size as the dentry hash table on my system, and then just increments the value at the address we got to see how many entries we overlap with. As you can see the old hash function ended up with all 1 million entries in their own bucket, whereas the new one they are only distributed among ~12.5k buckets, which is why we're using so much more CPU in __d_lookup". The reason for this hash regression is two-fold: - On 64-bit architectures the down-mixing of the original 64-bit word-at-a-time hash into the final 32-bit hash value is very simplistic and suboptimal, and just adds the two 32-bit parts together. In particular, because there is no bit shuffling and the mixing boundary is also a byte boundary, similar character patterns in the low and high word easily end up just canceling each other out. - the old byte-at-a-time hash mixed each byte into the final hash as it hashed the path component name, resulting in the low bits of the hash generally being a good source of hash data. That is not true for the word-at-a-time case, and the hash data is distributed among all the bits. The fix is the same in both cases: do a better job of mixing the bits up and using as much of the hash data as possible. We already have the "hash_32|64()" functions to do that. Reported-by: Josef Bacik <jbacik@fb.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@infradead.org> Cc: Chris Mason <clm@fb.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-
Linus Torvalds authored
The hash_64() function historically does the multiply by the GOLDEN_RATIO_PRIME_64 number with explicit shifts and adds, because unlike the 32-bit case, gcc seems unable to turn the constant multiply into the more appropriate shift and adds when required. However, that means that we generate those shifts and adds even when the architecture has a fast multiplier, and could just do it better in hardware. Use the now-cleaned-up CONFIG_ARCH_HAS_FAST_MULTIPLIER (together with "is it a 64-bit architecture") to decide whether to use an integer multiply or the explicit sequence of shift/add instructions. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-