- 23 Jul, 2013 32 commits
-
-
Oleg Drokin authored
The main reason behind this is ldlm_poold walks all namespaces currently no matter if there are any locks or not. On large systems this could take quite a bit of time, esp. since ldlm_poold is currently woken up once per second. Now every time a client namespace loses it's last resource it is placed into an inactive list that is not touched by ldlm_poold as pointless. On creation of a first resource in a namespace it is placed back into the active list. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2924 Lustre-change: http://review.whamcloud.com/5624Signed-off-by: Oleg Drokin <oleg.drokin@intel.com> Reviewed-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com> Reviewed-by: Niu Yawei <yawei.niu@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
jcl authored
When a client issue an RPC to get a layout lock, it must not hold rpc_lock because in case of a restore the rpc can be blocking for a long time Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3200 Lustre-change: http://review.whamcloud.com/6115Signed-off-by: JC Lafoucriere <jacques-charles.lafoucriere@cea.fr> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Johann Lombardi <johann.lombardi@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Li Wei authored
Lustre puts system errors (e.g., ENOTCONN) on wire as numbers essentially specific to senders' architectures. While this is fine for x86-only sites, where receivers share the same error number definition with senders, problems will arise, however, for sites involving multiple architectures with different error number definitions. For instance, an ENOTCONN reply from a sparc server will be put on wire as -57, which, for an x86 client, means EBADSLT instead. To solve the problem, this patch defines a set of network errors for on-wire or on-disk uses. These errors correspond to a subset of the x86 system errors and share the same number definition, maintaining compatibility with existing x86 clients and servers. Then, either error numbers could be translated at run time, or all host errors going on wire could be replaced with network errors in the code. This patch does the former by introducing both generic and field-specific translation routines and calling them at proper places, so that translations for existing fields are transparent. (Personally, I tend to think the latter way might be worthwhile, as it is more straightforward conceptually. Do we really need so many different errors? Should errors returned by kernel routines really be passed up and eventually put on wire? There could even be security implications in that.) Thank Fujitsu for the original idea and their contributions that make this available upstream. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2743 Lustre-change: http://review.whamcloud.com/5577Signed-off-by: Li Wei <wei.g.li@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Hiroya Nozaki <nozaki.hiroya@jp.fujitsu.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dmitry Eremin authored
The race is result of use-after-free situation: ~ ptlrpc_stop_pinger() ~ ptlrpc_pinger_main() --------------------------------------------------------------- thread_set_flags(SVC_STOPPING) cfs_waitq_signal(pinger_thread) ... ... thread_set_flags(SVC_STOPPED) l_wait_event(thread_is_stopped) OBD_FREE_PTR(pinger_thread) ... cfs_waitq_signal(pinger_thread) --------------------------------------------------------------- The memory used by pinger_thread might have been freed and reallocated to something else, when ptlrpc_pinger_main() used it in cvs_waitq_signal(). Signed-off-by: Li Wei <wei.g.li@intel.com> Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3032 Lustre-change: http://review.whamcloud.com/6040Reviewed-by: Faccini Bruno <bruno.faccini@intel.com> Reviewed-by: Mike Pershin <mike.pershin@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Andreas Dilger authored
Print the namespace and OBD device name, as well as the first two lock resource fields (typically the FID) if there is an error with loading the object from disk. This will be more important with FID-on-OST and also the MDS. Using fid_extract_from_res_name() isn't possible in the LDLM code, since the lock resource may not be a FID. Make fid_extract_quota_resid() argument order and name consistent with other fid_*_res() functions, with FID first and resource second. Fix a bug in ofd_lvbo_init() where NULL lvb is accessed on error. Print FID in ofd_lvbo_update() CDEBUG() and CERROR() messages. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2193 Lustre-change: http://review.whamcloud.com/4501Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Bobi Jam <bobijam@gmail.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John L. Hammond authored
In ll_file_ioctl() and ll_swap_layouts() check the result of ll_prep_md_op_data() using IS_ERR(). Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3283 Lustre-change: http://review.whamcloud.com/6275Signed-off-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Reviewed-by: Fan Yong <fan.yong@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dmitry Eremin authored
In case of memory pressure a not locked mutex can be unlocked in function ll_file_open(). This is not allowed and subsequent behavior is not defined. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3157 Lustre-change: http://review.whamcloud.com/6028Signed-off-by: Dmitry Eremin <dmitry.eremin@intel.com> Reviewed-by: John Hammond <johnlockwoodhammond@gmail.com> Reviewed-by: Nikitas Angelinas <nikitas_angelinas@xyratex.com> Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John L. Hammond authored
In ll_file_data_get() and ll_dir_ioctl() return error on failed allocations. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-2753 Lustre-change: http://review.whamcloud.com/5845Signed-off-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com> Reviewed-by: Sebastien Buisson <sebastien.buisson@bull.net> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Sebastien Buisson authored
Fix 'program hangs' defects found by Coverity version 6.5.1: Missing unlock (LOCK) Returning without unlocking. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3054 Lustre-change: http://review.whamcloud.com/5870Signed-off-by: Sebastien Buisson <sebastien.buisson@bull.net> Reviewed-by: Dmitry Eremin <dmitry.eremin@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
wang di authored
Missing the last bit during INODELOCK check in ll_have_md_lock. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3385 Lustre-change: http://review.whamcloud.com/6438Signed-off-by: wang di <di.wang@intel.com> Reviewed-by: Oleg Drokin <oleg.drokin@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
John L. Hammond authored
In vvp_io_write_start() the stats function ll_rw_stats_tally() was incorrectly called with a rw argument of 0. Correct this and use the macros READ and WRITE in and around ll_rw_stats_tally() for clarity. Intel-bug-id: https://jira.hpdd.intel.com/browse/LU-3384 Lustre-change: http://review.whamcloud.com/6447Signed-off-by: John L. Hammond <john.hammond@intel.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Reviewed-by: Jinshan Xiong <jinshan.xiong@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Paul Bolle authored
Commit 4b5b4c72 ("staging/lustre/libcfs: restore LINVRNT") added "default false" to this Kconfig file. It was obviously meant to use "default n" here. But we might as well drop this line, as a Kconfig bool defaults to 'n' anyway. Signed-off-by: Paul Bolle <pebolle@tiscali.nl> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Dragos Foianu authored
Confirmed by cscope that the functions are not used anymore. A fresh compilation does not yield any errors. Signed-off-by: Dragos Foianu <dragos.foianu@gmail.com> Reviewed-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Zhao Hongjiang authored
kmap_atomic allows only one argument now, just remove the second. Signed-off-by: Zhao Hongjiang <zhaohongjiang@huawei.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
This reverts commit 0ad1ea69 I didn't use git revert because it can not be done cleanly. Hopefully it will be the last time we do it... Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Stephen Rothwell authored
somehow this got dropped during merge window... Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Building on 32bit system, I got warnings like below: drivers/staging/lustre/lustre/llite/../include/lprocfs_status.h:666:7: note: expected ‘long unsigned int *’ but argument is of type ‘size_t *’ char *lprocfs_find_named_value(const char *buffer, const char *name, drivers/staging/lustre/lustre/lov/lov_io.c: In function ‘lov_io_rw_iter_init’: include/asm-generic/div64.h:43:28: warning: comparison of distinct pointer types lacks a cast [enabled by default] (void)(((typeof((n)) *)0) == ((uint64_t *)0)); \ Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
dump_trace() is only available on X86. Without it, Lustre's own watchdog is broken. We can only dump current task's stack. The client-side this code is much less likely to hit deadlocks and it's probably OK to drop this altogether, since we hardly have any ptlrpc threads on clients, most notable ones are ldlm cb threads that should not really be blocking on the client anyway. Remove libcfs watchdog for now, until the upstream kernel watchdog can detect distributed deadlocks and dump other kernel threads. Cc: Oleg Drokin <oleg.drokin@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
kuid_t/kgid_t are wrappered when CONFIG_UIDGID_STRICT_TYPE_CHECKS is on. Lustre build is broken because we always treat them as plain __u32. The patch fixes it. Internally, Lustre always use __u32 uid/gid, and convert to kuid_t/kgid_t when necessary. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Got below errors on s390 build: CC [M] drivers/staging/lustre/lustre/llite/dir.o drivers/staging/lustre/lustre/llite/dir.c: In function 'll_dir_filler': drivers/staging/lustre/lustre/llite/dir.c:225:3: error: implicit declaration of function 'prefetchw' [-Werror=implicit-function-declaration] Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
As reported by Fengguang: In file included from drivers/staging/lustre/lustre/obdclass/../include/lustre/lustre_idl.h:99:0, from drivers/staging/lustre/lustre/obdclass/../include/lprocfs_status.h:46, from drivers/staging/lustre/lustre/obdclass/../include/obd_support.h:42, from drivers/staging/lustre/lustre/obdclass/../include/obd_class.h:40, from drivers/staging/lustre/lustre/obdclass/lu_object.c:53: drivers/staging/lustre/lustre/obdclass/../include/lustre/lustre_user.h:356:10: error: field 'lmd_st' has incomplete type drivers/staging/lustre/lustre/obdclass/../include/lustre/lustre_user.h:361:10: error: field 'lmd_st' has incomplete type Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Three functions cfs_cpu_ht_nsiblings, cfs_cpt_cpumask and cfs_cpt_table_print are missing if !CONFIG_SMP. cpumask_t/nodemask_t/__read_mostly/____cacheline_aligned are redefined. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Stephen Rothwell reported below error on powerpc: In file included from drivers/staging/lustre/include/linux/libcfs/libcfs.h:203:0, from drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h:67, from drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c:41: drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c: In function 'kiblnd_dev_need_failover': drivers/staging/lustre/include/linux/libcfs/libcfs_debug.h:215:16: error: implicit declaration of function 'NIPQUAD' [-Werror=implicit-function-declaration] static struct libcfs_debug_msg_data msgdata; \ ^ We should just remove HIPQUAD and replace it with %pI4h. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Joe Perches <joe@perches.com> Signed-off-by: Peng Tao <bergwolf@gmail.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Lustre internal dependency needs to be cleaned up. Currently, libcfs is acting as a basis of all other modules, while other modules in lustre/ directory in turn depend on lnet modules. It creates a dependency loop that need to be fixed. Hopefully we will remove libcfs in the end. So just disable buildin for now. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
If LNetNIInit() fails, we'll get zero ln_refcount. So fail LNetGetId() properly instead of asserting. We can get to it when socklnd fails to scan network interfaces, which is possible if Lustre is builtin. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
It can well be NULL if Lustre is builtin. Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
Change Makefiles to keep link order in match with Lustre module dependency, so that when Lustre is built in kernel, we'll have the same dependency. Otherwise we'll crash kernel if Lustre is builtin due to missing internal dependency. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
Peng Tao authored
The global variable num_physpages is going away. Replace it with totalram_pages. Cc: Jiang Liu <jiang.liu@huawei.com> Signed-off-by: Peng Tao <tao.peng@emc.com> Signed-off-by: Andreas Dilger <andreas.dilger@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-
- 21 Jul, 2013 5 commits
-
-
Linus Torvalds authored
-
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pmLinus Torvalds authored
Pull ACPI video support fixes from Rafael Wysocki: "I'm sending a separate pull request for this as it may be somewhat controversial. The breakage addressed here is not really new and the fixes may not satisfy all users of the affected systems, but we've had so much back and forth dance in this area over the last several weeks that I think it's time to actually make some progress. The source of the problem is that about a year ago we started to tell BIOSes that we're compatible with Windows 8, which we really need to do, because some systems shipping with Windows 8 are tested with it and nothing else, so if we tell their BIOSes that we aren't compatible with Windows 8, we expose our users to untested BIOS/AML code paths. However, as it turns out, some Windows 8-specific AML code paths are not tested either, because Windows 8 actually doesn't use the ACPI methods containing them, so if we declare Windows 8 compatibility and attempt to use those ACPI methods, things break. That occurs mostly in the backlight support area where in particular the _BCM and _BQC methods are plain unusable on some systems if the OS declares Windows 8 compatibility. [ The additional twist is that they actually become usable if the OS says it is not compatible with Windows 8, but that may cause problems to show up elsewhere ] Investigation carried out by Matthew Garrett indicates that what Windows 8 does about backlight is to leave backlight control up to individual graphics drivers. At least there's evidence that it does that if the Intel graphics driver is used, so we've decided to follow Windows 8 in that respect and allow i915 to control backlight (Daniel likes that part). The first commit from Aaron Lu makes ACPICA export the variable from which we can infer whether or not the BIOS believes that we are compatible with Windows 8. The second commit from Matthew Garrett prepares the ACPI video driver by making it initialize the ACPI backlight even if it is not going to be used afterward (that is needed for backlight control to work on Thinkpads). The third commit implements the actual workaround making i915 take over backlight control if the firmware thinks it's dealing with Windows 8 and is based on the work of multiple developers, including Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu. The final commit from Aaron Lu makes us follow Windows 8 by informing the firmware through the _DOS method that it should not carry out automatic brightness changes, so that brightness can be controlled by GUI. Hopefully, this approach will allow us to avoid using blacklists of systems that should not declare Windows 8 compatibility just to avoid backlight control problems in the future. - Change from Aaron Lu makes ACPICA export a variable which can be used by driver code to determine whether or not the BIOS believes that we are compatible with Windows 8. - Change from Matthew Garrett makes the ACPI video driver initialize the ACPI backlight even if it is not going to be used afterward (that is needed for backlight control to work on Thinkpads). - Fix from Rafael J Wysocki implements Windows 8 backlight support workaround making i915 take over bakclight control if the firmware thinks it's dealing with Windows 8. Based on the work of multiple developers including Matthew Garrett, Chun-Yi Lee, Seth Forshee, and Aaron Lu. - Fix from Aaron Lu makes the kernel follow Windows 8 by informing the firmware through the _DOS method that it should not carry out automatic brightness changes, so that brightness can be controlled by GUI" * tag 'acpi-video-3.11' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / video: no automatic brightness changes by win8-compatible firmware ACPI / video / i915: No ACPI backlight if firmware expects Windows 8 ACPI / video: Always call acpi_video_init_brightness() on init ACPICA: expose OSI version
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext[34] tmpfile bugfix from Ted Ts'o: "Fix regression caused by commit af51a2ac which added ->tmpfile() support (along with a similar fix for ext3)" * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext3: fix a BUG when opening a file with O_TMPFILE flag ext4: fix a BUG when opening a file with O_TMPFILE flag
-
Zheng Liu authored
When we try to open a file with O_TMPFILE flag, we will trigger a bug. The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and this check always fails because we set ->i_nlink = 1 in inode_init_always(). We can use the following program to trigger it: int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_TMPFILE, 0666); if (fd < 0) { perror("open "); return -1; } close(fd); return 0; } The oops message looks like this: kernel: kernel BUG at fs/ext3/namei.c:1992! kernel: invalid opcode: 0000 [#1] SMP kernel: Modules linked in: ext4 jbd2 crc16 cpufreq_ondemand ipv6 dm_mirror dm_region_hash dm_log dm_mod parport_pc parport serio_raw sg dcdbas pcspkr i2c_i801 ehci_pci ehci_hcd button acpi_cpufreq mperf e1000e ptp pps_core ttm drm_kms_helper drm hwmon i2c_algo_bit i2c_core ext3 jbd sd_mod ahci libahci libata scsi_mod uhci_hcd kernel: CPU: 0 PID: 2882 Comm: tst_tmpfile Not tainted 3.11.0-rc1+ #4 kernel: Hardware name: Dell Inc. OptiPlex 780 /0V4W66, BIOS A05 08/11/2010 kernel: task: ffff880112d30050 ti: ffff8801124d4000 task.ti: ffff8801124d4000 kernel: RIP: 0010:[<ffffffffa00db5ae>] [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3] kernel: RSP: 0018:ffff8801124d5cc8 EFLAGS: 00010202 kernel: RAX: 0000000000000000 RBX: ffff880111510128 RCX: ffff8801114683a0 kernel: RDX: 0000000000000000 RSI: ffff880111510128 RDI: ffff88010fcf65a8 kernel: RBP: ffff8801124d5d18 R08: 0080000000000000 R09: ffffffffa00d3b7f kernel: R10: ffff8801114683a0 R11: ffff8801032a2558 R12: 0000000000000000 kernel: R13: ffff88010fcf6800 R14: ffff8801032a2558 R15: ffff8801115100d8 kernel: FS: 00007f5d172b5700(0000) GS:ffff880117c00000(0000) knlGS:0000000000000000 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b kernel: CR2: 00007f5d16df15d0 CR3: 0000000110b1d000 CR4: 00000000000407f0 kernel: Stack: kernel: 000000000000000c ffff8801048a7dc8 ffff8801114685a8 ffffffffa00b80d7 kernel: ffff8801124d5e38 ffff8801032a2558 ffff88010ce24d68 0000000000000000 kernel: ffff88011146b300 ffff8801124d5d44 ffff8801124d5d78 ffffffffa00db7e1 kernel: Call Trace: kernel: [<ffffffffa00b80d7>] ? journal_start+0x8c/0xbd [jbd] kernel: [<ffffffffa00db7e1>] ext3_tmpfile+0xb2/0x13b [ext3] kernel: [<ffffffff821076f8>] path_openat+0x11f/0x5e7 kernel: [<ffffffff821c86b4>] ? list_del+0x11/0x30 kernel: [<ffffffff82065fa2>] ? __dequeue_entity+0x33/0x38 kernel: [<ffffffff82107cd5>] do_filp_open+0x3f/0x8d kernel: [<ffffffff82112532>] ? __alloc_fd+0x50/0x102 kernel: [<ffffffff820f9296>] do_sys_open+0x13b/0x1cd kernel: [<ffffffff820f935c>] SyS_open+0x1e/0x20 kernel: [<ffffffff82398c02>] system_call_fastpath+0x16/0x1b kernel: Code: 39 c7 0f 85 67 01 00 00 0f b7 03 25 00 f0 00 00 3d 00 40 00 00 74 18 3d 00 80 00 00 74 11 3d 00 a0 00 00 74 0a 83 7b 48 00 74 04 <0f> 0b eb fe 49 8b 85 50 03 00 00 4c 89 f6 48 c7 c7 c0 99 0e a0 kernel: RIP [<ffffffffa00db5ae>] ext3_orphan_add+0x6a/0x1eb [ext3] kernel: RSP <ffff8801124d5cc8> Here we couldn't call clear_nlink() directly because in d_tmpfile() we will call inode_dec_link_count() to decrease ->i_nlink. So this commit tries to call d_tmpfile() before ext4_orphan_add() to fix this problem. Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk>
-
Zheng Liu authored
When we try to open a file with O_TMPFILE flag, we will trigger a bug. The root cause is that in ext4_orphan_add() we check ->i_nlink == 0 and this check always fails because we set ->i_nlink = 1 in inode_init_always(). We can use the following program to trigger it: int main(int argc, char *argv[]) { int fd; fd = open(argv[1], O_TMPFILE, 0666); if (fd < 0) { perror("open "); return -1; } close(fd); return 0; } The oops message looks like this: kernel BUG at fs/ext4/namei.c:2572! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dlci bridge stp hidp cmtp kernelcapi l2tp_ppp l2tp_netlink l2tp_core sctp libcrc32c rfcomm tun fuse nfnetli nk can_raw ipt_ULOG can_bcm x25 scsi_transport_iscsi ipx p8023 p8022 appletalk phonet psnap vmw_vsock_vmci_transport af_key vmw_vmci rose vsock atm can netrom ax25 af_rxrpc ir da pppoe pppox ppp_generic slhc bluetooth nfc rfkill rds caif_socket caif crc_ccitt af_802154 llc2 llc snd_hda_codec_realtek snd_hda_intel snd_hda_codec serio_raw snd_pcm pcsp kr edac_core snd_page_alloc snd_timer snd soundcore r8169 mii sr_mod cdrom pata_atiixp radeon backlight drm_kms_helper ttm CPU: 1 PID: 1812571 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #12 Hardware name: Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H, BIOS F12a 04/23/2010 task: ffff88007dfe69a0 ti: ffff88010f7b6000 task.ti: ffff88010f7b6000 RIP: 0010:[<ffffffff8125ce69>] [<ffffffff8125ce69>] ext4_orphan_add+0x299/0x2b0 RSP: 0018:ffff88010f7b7cf8 EFLAGS: 00010202 RAX: 0000000000000000 RBX: ffff8800966d3020 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff88007dfe70b8 RDI: 0000000000000001 RBP: ffff88010f7b7d40 R08: ffff880126a3c4e0 R09: ffff88010f7b7ca0 R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801271fd668 R13: ffff8800966d2f78 R14: ffff88011d7089f0 R15: ffff88007dfe69a0 FS: 00007f70441a3740(0000) GS:ffff88012a800000(0000) knlGS:00000000f77c96c0 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000002834000 CR3: 0000000107964000 CR4: 00000000000007e0 DR0: 0000000000780000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600 Stack: 0000000000002000 00000020810b6dde 0000000000000000 ffff88011d46db00 ffff8800966d3020 ffff88011d7089f0 ffff88009c7f4c10 ffff88010f7b7f2c ffff88007dfe69a0 ffff88010f7b7da8 ffffffff8125cfac ffff880100000004 Call Trace: [<ffffffff8125cfac>] ext4_tmpfile+0x12c/0x180 [<ffffffff811cba78>] path_openat+0x238/0x700 [<ffffffff8100afc4>] ? native_sched_clock+0x24/0x80 [<ffffffff811cc647>] do_filp_open+0x47/0xa0 [<ffffffff811db73f>] ? __alloc_fd+0xaf/0x200 [<ffffffff811ba2e4>] do_sys_open+0x124/0x210 [<ffffffff81010725>] ? syscall_trace_enter+0x25/0x290 [<ffffffff811ba3ee>] SyS_open+0x1e/0x20 [<ffffffff816ca8d4>] tracesys+0xdd/0xe2 [<ffffffff81001001>] ? start_thread_common.constprop.6+0x1/0xa0 Code: 04 00 00 00 89 04 24 31 c0 e8 c4 77 04 00 e9 43 fe ff ff 66 25 00 d0 66 3d 00 80 0f 84 0e fe ff ff 83 7b 48 00 0f 84 04 fe ff ff <0f> 0b 49 8b 8c 24 50 07 00 00 e9 88 fe ff ff 0f 1f 84 00 00 00 Here we couldn't call clear_nlink() directly because in d_tmpfile() we will call inode_dec_link_count() to decrease ->i_nlink. So this commit tries to call d_tmpfile() before ext4_orphan_add() to fix this problem. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Zheng Liu <wenqing.lz@taobao.com> Tested-by: Darrick J. Wong <darrick.wong@oracle.com> Tested-by: Dave Jones <davej@redhat.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Al Viro <viro@zeniv.linux.org.uk>
-
- 20 Jul, 2013 3 commits
-
-
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/stagingLinus Torvalds authored
Pull staging tree fixes from Greg KH: "Here are a few iio driver fixes for 3.11-rc2. They are still spread across drivers/iio and drivers/staging/iio so they are coming in through this tree. I've also removed the drivers/staging/csr/ driver as the developers who originally sent it to me have moved on to other companies, and CSR still will not send us the specs for the device, making the driver pretty much obsolete and impossible to fix up. Deleting it now prevents people from sending in lots of tiny codingsyle fixes that will never go anywhere. It also helps to offset the large lustre filesystem merge that happened in 3.11-rc1 in the overall 3.11.0 diffstat. :)" * tag 'staging-3.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: csr: remove driver iio: lps331ap: Fix wrong in_pressure_scale output value iio staging: fix lis3l02dq, read error handling staging:iio:ad7291: add missing .driver_module to struct iio_info iio: ti_am335x_adc: add missing .driver_module to struct iio_info iio: mxs-lradc: Remove useless check in read_raw iio: mxs-lradc: Fix misuse of iio->trig iio: inkern: fix iio_convert_raw_to_processed_unlocked iio: Fix iio_channel_has_info iio:trigger: device_unregister->device_del to avoid double free iio: dac: ad7303: fix error return code in ad7303_probe()
-
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfsLinus Torvalds authored
Pull vfs fixes from Al Viro: "The sget() one is a long-standing bug and will need to go into -stable (in fact, it had been originally caught in RHEL6), the other two are 3.11-only" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: vfs: constify dentry parameter in d_count() livelock avoidance in sget() allow O_TMPFILE to work with O_WRONLY
-
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4Linus Torvalds authored
Pull ext4 bugfixes from Ted Ts'o: "Fixes for 3.11-rc2, sent at 5pm, in the professoinal style. :-)" I'm not sure I like this new level of "professionalism". 9-5, people, 9-5. * tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: call ext4_es_lru_add() after handling cache miss ext4: yield during large unlinks ext4: make the extent_status code more robust against ENOMEM failures ext4: simplify calculation of blocks to free on error ext4: fix error handling in ext4_ext_truncate()
-