1. 02 Mar, 2015 40 commits
    • Johan Hovold's avatar
      gpio: sysfs: fix gpio attribute-creation race · 58409c4e
      Johan Hovold authored
      commit ebbeba12 upstream.
      
      Fix attribute-creation race with userspace by using the default group
      to create also the contingent gpio device attributes.
      
      Fixes: d8f388d8 ("gpio: sysfs interface")
      Signed-off-by: default avatarJohan Hovold <johan@kernel.org>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      [ luis: backported to 3.16:
        - all changes in drivers/gpio/gpiolib.c
        - adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      58409c4e
    • Daniel Borkmann's avatar
      net: sctp: fix race for one-to-many sockets in sendmsg's auto associate · 11fd056b
      Daniel Borkmann authored
      commit 2061dcd6 upstream.
      
      I.e. one-to-many sockets in SCTP are not required to explicitly
      call into connect(2) or sctp_connectx(2) prior to data exchange.
      Instead, they can directly invoke sendmsg(2) and the SCTP stack
      will automatically trigger connection establishment through 4WHS
      via sctp_primitive_ASSOCIATE(). However, this in its current
      implementation is racy: INIT is being sent out immediately (as
      it cannot be bundled anyway) and the rest of the DATA chunks are
      queued up for later xmit when connection is established, meaning
      sendmsg(2) will return successfully. This behaviour can result
      in an undesired side-effect that the kernel made the application
      think the data has already been transmitted, although none of it
      has actually left the machine, worst case even after close(2)'ing
      the socket.
      
      Instead, when the association from client side has been shut down
      e.g. first gracefully through SCTP_EOF and then close(2), the
      client could afterwards still receive the server's INIT_ACK due
      to a connection with higher latency. This INIT_ACK is then considered
      out of the blue and hence responded with ABORT as there was no
      alive assoc found anymore. This can be easily reproduced f.e.
      with sctp_test application from lksctp. One way to fix this race
      is to wait for the handshake to actually complete.
      
      The fix defers waiting after sctp_primitive_ASSOCIATE() and
      sctp_primitive_SEND() succeeded, so that DATA chunks cooked up
      from sctp_sendmsg() have already been placed into the output
      queue through the side-effect interpreter, and therefore can then
      be bundeled together with COOKIE_ECHO control chunks.
      
      strace from example application (shortened):
      
      socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP) = 3
      sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
                 msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
      sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
                 msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
      sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
                 msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
      sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
                 msg_iov(1)=[{"hello", 5}], msg_controllen=0, msg_flags=0}, 0) = 5
      sendmsg(3, {msg_name(28)={sa_family=AF_INET, sin_port=htons(8888), sin_addr=inet_addr("192.168.1.115")},
                 msg_iov(0)=[], msg_controllen=48, {cmsg_len=48, cmsg_level=0x84 /* SOL_??? */, cmsg_type=, ...},
                 msg_flags=0}, 0) = 0 // graceful shutdown for SOCK_SEQPACKET via SCTP_EOF
      close(3) = 0
      
      tcpdump before patch (fooling the application):
      
      22:33:36.306142 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 3879023686] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3139201684]
      22:33:36.316619 IP 192.168.1.115.8888 > 192.168.1.114.41462: sctp (1) [INIT ACK] [init tag: 3345394793] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 3380109591]
      22:33:36.317600 IP 192.168.1.114.41462 > 192.168.1.115.8888: sctp (1) [ABORT]
      
      tcpdump after patch:
      
      14:28:58.884116 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [INIT] [init tag: 438593213] [rwnd: 106496] [OS: 10] [MIS: 65535] [init TSN: 3092969729]
      14:28:58.888414 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [INIT ACK] [init tag: 381429855] [rwnd: 106496] [OS: 10] [MIS: 10] [init TSN: 2141904492]
      14:28:58.888638 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [COOKIE ECHO] , (2) [DATA] (B)(E) [TSN: 3092969729] [...]
      14:28:58.893278 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [COOKIE ACK] , (2) [SACK] [cum ack 3092969729] [a_rwnd 106491] [#gap acks 0] [#dup tsns 0]
      14:28:58.893591 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969730] [...]
      14:28:59.096963 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969730] [a_rwnd 106496] [#gap acks 0] [#dup tsns 0]
      14:28:59.097086 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [DATA] (B)(E) [TSN: 3092969731] [...] , (2) [DATA] (B)(E) [TSN: 3092969732] [...]
      14:28:59.103218 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SACK] [cum ack 3092969732] [a_rwnd 106486] [#gap acks 0] [#dup tsns 0]
      14:28:59.103330 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN]
      14:28:59.107793 IP 192.168.1.115.8888 > 192.168.1.114.35846: sctp (1) [SHUTDOWN ACK]
      14:28:59.107890 IP 192.168.1.114.35846 > 192.168.1.115.8888: sctp (1) [SHUTDOWN COMPLETE]
      
      Looks like this bug is from the pre-git history museum. ;)
      
      Fixes: 08707d54 ("lksctp-2_5_31-0_5_1.patch")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      11fd056b
    • Alexander Duyck's avatar
      fib_trie: Fix /proc/net/fib_trie when CONFIG_IP_MULTIPLE_TABLES is not defined · 952a63d1
      Alexander Duyck authored
      commit a5a519b2 upstream.
      
      In recent testing I had disabled CONFIG_IP_MULTIPLE_TABLES and as a result
      when I ran "cat /proc/net/fib_trie" the main trie was displayed multiple
      times.  I found that the problem line of code was in the function
      fib_trie_seq_next.  Specifically the line below caused the indexes to go in
      the opposite direction of our traversal:
      
      	h = tb->tb_id & (FIB_TABLE_HASHSZ - 1);
      
      This issue was that the RT tables are defined such that RT_TABLE_LOCAL is ID
      255, while it is located at TABLE_LOCAL_INDEX of 0, and RT_TABLE_MAIN is 254
      with a TABLE_MAIN_INDEX of 1.  This means that the above line will return 1
      for the local table and 0 for main.  The result is that fib_trie_seq_next
      will return NULL at the end of the local table, fib_trie_seq_start will
      return the start of the main table, and then fib_trie_seq_next will loop on
      main forever as h will always return 0.
      
      The fix for this is to reverse the ordering of the two tables.  It has the
      advantage of making it so that the tables now print in the same order
      regardless of if multiple tables are enabled or not.  In order to make the
      definition consistent with the multiple tables case I simply masked the to
      RT_TABLE_XXX values by (FIB_TABLE_HASHSZ - 1).  This way the two table
      layouts should always stay consistent.
      
      Fixes: 93456b6d ("[IPV4]: Unify access to the routing tables")
      Signed-off-by: default avatarAlexander Duyck <alexander.h.duyck@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      952a63d1
    • Seth Forshee's avatar
      HID: i2c-hid: Limit reads to wMaxInputLength bytes for input events · 06edf928
      Seth Forshee authored
      commit 6d00f37e upstream.
      
      d1c7e29e (HID: i2c-hid: prevent buffer overflow in early IRQ)
      changed hid_get_input() to read ihid->bufsize bytes, which can be
      more than wMaxInputLength. This is the case with the Dell XPS 13
      9343, and it is causing events to be missed. In some cases the
      missed events are releases, which can cause the cursor to jump or
      freeze, among other problems. Limit the number of bytes read to
      min(wMaxInputLength, ihid->bufsize) to prevent such problems.
      
      Fixes: d1c7e29e "HID: i2c-hid: prevent buffer overflow in early IRQ"
      Signed-off-by: default avatarSeth Forshee <seth.forshee@canonical.com>
      Reviewed-by: default avatarBenjamin Tissoires <benjamin.tissoires@redhat.com>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      06edf928
    • Sasha Levin's avatar
      net: rds: use correct size for max unacked packets and bytes · 065f3735
      Sasha Levin authored
      commit db27ebb1 upstream.
      
      Max unacked packets/bytes is an int while sizeof(long) was used in the
      sysctl table.
      
      This means that when they were getting read we'd also leak kernel memory
      to userspace along with the timeout values.
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Moritz Muehlenhoff <jmm@inutil.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      065f3735
    • Sasha Levin's avatar
      net: llc: use correct size for sysctl timeout entries · 42182789
      Sasha Levin authored
      commit 6b8d9117 upstream.
      
      The timeout entries are sizeof(int) rather than sizeof(long), which
      means that when they were getting read we'd also leak kernel memory
      to userspace along with the timeout values.
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Cc: Moritz Muehlenhoff <jmm@inutil.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      42182789
    • Andrew Elble's avatar
      GFS2: Fix crash during ACL deletion in acl max entry check in gfs2_set_acl() · 9e2feaeb
      Andrew Elble authored
      commit 27870207 upstream.
      
      Fixes: e01580bf ("gfs2: use generic posix ACL infrastructure")
      Reported-by: default avatarEric Meddaugh <etmsys@rit.edu>
      Tested-by: default avatarEric Meddaugh <etmsys@rit.edu>
      Signed-off-by: default avatarAndrew Elble <aweits@rit.edu>
      Signed-off-by: default avatarSteven Whitehouse <swhiteho@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      9e2feaeb
    • Dan Carpenter's avatar
      ALSA: off by one bug in snd_riptide_joystick_probe() · 3a87ac14
      Dan Carpenter authored
      commit e4940626 upstream.
      
      The problem here is that we check:
      
      	if (dev >= SNDRV_CARDS)
      
      Then we increment "dev".
      
             if (!joystick_port[dev++])
      
      Then we use it as an offset into a array with SNDRV_CARDS elements.
      
      	if (!request_region(joystick_port[dev], 8, "Riptide gameport")) {
      
      This has 3 effects:
      1) If you use the module option to specify the joystick port then it has
         to be shifted one space over.
      2) The wrong error message will be printed on failure if you have over
         32 cards.
      3) Static checkers will correctly complain that are off by one.
      
      Fixes: db1005ec ('ALSA: riptide - Fix joystick resource handling')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: default avatarTakashi Iwai <tiwai@suse.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      3a87ac14
    • Uwe Kleine-König's avatar
      pinctrl: pinctrl-imx: don't use invalid value of conf_reg · dd89ea7a
      Uwe Kleine-König authored
      commit 4ff0f034 upstream.
      
      The right check for conf_reg to be invalid it testing against -1 not 0
      as is done in the rest of the driver.
      
      This fixes an oops that can be triggered by:
      
      	cat /sys/kernel/debug/pinctrl/43fac000.iomuxc/*
      
      Fixes: ae75ff81 ("pinctrl: pinctrl-imx: add imx pinctrl core driver")
      Signed-off-by: default avatarUwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      [ luis: backported to 3.16:
        - file rename: drivers/pinctrl/freescale/pinctrl-imx.c ->
          drivers/pinctrl/pinctrl-imx.c ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      dd89ea7a
    • Gavin Shan's avatar
      powerpc/kernel: Avoid memory corruption at early stage · 79e43709
      Gavin Shan authored
      commit 6f20e7f2 upstream.
      
      When calling to early_setup(), we pick "boot_paca" up for the master CPU
      and initialize that with initialise_paca(). At that point, the SLB
      shadow buffer isn't populated yet. Updating the SLB shadow buffer should
      corrupt what we had in physical address 0 where the trap instruction is
      usually stored.
      
      This hasn't been observed to cause any trouble in practice, but is
      obviously fishy.
      
      Fixes: 6f4441ef ("powerpc: Dynamically allocate slb_shadow from memblock")
      Signed-off-by: default avatarGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: default avatarMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      79e43709
    • Sergei Shtylyov's avatar
      clk-gate: fix bit # check in clk_register_gate() · e99423f7
      Sergei Shtylyov authored
      commit 2e9dcdae upstream.
      
      In case CLK_GATE_HIWORD_MASK flag is passed to clk_register_gate(), the bit #
      should be no higher than 15, however the corresponding check is obviously off-
      by-one.
      
      Fixes: 04577994 ("clk: gate: add CLK_GATE_HIWORD_MASK")
      Signed-off-by: default avatarSergei Shtylyov <sergei.shtylyov@cogentembedded.com>
      Signed-off-by: default avatarMichael Turquette <mturquette@linaro.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e99423f7
    • Dan Carpenter's avatar
      efi: Small leak on error in runtime map code · 89639edd
      Dan Carpenter authored
      commit 86d68a58 upstream.
      
      The "> 0" here should ">= 0" so we free map_entries[0].
      
      Fixes: 926172d4 ('efi: Export EFI runtime memory mapping to sysfs')
      Signed-off-by: default avatarDan Carpenter <dan.carpenter@oracle.com>
      Acked-by: default avatarDave Young <dyoung@redhat.com>
      Signed-off-by: default avatarMatt Fleming <matt.fleming@intel.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      89639edd
    • Geert Uytterhoeven's avatar
      gpio: rcar: Fix error path for devm_kzalloc() failure · 62503026
      Geert Uytterhoeven authored
      commit 7d82bf34 upstream.
      
      If the call to devm_kzalloc() fails, nothing must be cleant up.
      This was missed before because gpio_rcar_probe() had a "return"
      statement after the first "goto err0".
      Signed-off-by: default avatarGeert Uytterhoeven <geert+renesas@glider.be>
      Fixes: df0c6c80 ("gpio: rcar: Add minimal runtime PM support")
      Signed-off-by: default avatarLinus Walleij <linus.walleij@linaro.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      62503026
    • Lars-Peter Clausen's avatar
      ASoC: mioa701_wm9713: Fix speaker event · 50992849
      Lars-Peter Clausen authored
      commit 7331ea47 upstream.
      
      Commit f6b2a045 ("ASoC: pxa: mioa701_wm9713: Convert to table based DAPM
      setup") converted the driver to register the board level DAPM elements with
      the card's DAPM context rather than the CODEC's DAPM context. The change
      overlooked that the speaker widget event callback accesses the widget's
      codec field which is only valid if the widget has been registered in a CODEC
      DAPM context. This patch modifies the callback to take an alternative route
      to get the CODEC.
      
      Fixes: f6b2a045 ("ASoC: pxa: mioa701_wm9713: Convert to table based DAPM
      setup")
      Signed-off-by: default avatarLars-Peter Clausen <lars@metafoo.de>
      Signed-off-by: default avatarMark Brown <broonie@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      50992849
    • Al Viro's avatar
      autofs4 copy_dev_ioctl(): keep the value of ->size we'd used for allocation · 2321de6e
      Al Viro authored
      commit 0a280962 upstream.
      
      X-Coverup: just ask spender
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      2321de6e
    • Al Viro's avatar
      procfs: fix race between symlink removals and traversals · 48a6aba3
      Al Viro authored
      commit 7e0e953b upstream.
      
      use_pde()/unuse_pde() in ->follow_link()/->put_link() resp.
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      48a6aba3
    • Al Viro's avatar
      debugfs: leave freeing a symlink body until inode eviction · 67083b7b
      Al Viro authored
      commit 0db59e59 upstream.
      
      As it is, we have debugfs_remove() racing with symlink traversals.
      Supply ->evict_inode() and do freeing there - inode will remain
      pinned until we are done with the symlink body.
      
      And rip the idiocy with checking if dentry is positive right after
      we'd verified debugfs_positive(), which is a stronger check...
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      67083b7b
    • Thadeu Lima de Souza Cascardo's avatar
      blk-throttle: check stats_cpu before reading it from sysfs · ad9ea48d
      Thadeu Lima de Souza Cascardo authored
      commit 045c47ca upstream.
      
      When reading blkio.throttle.io_serviced in a recently created blkio
      cgroup, it's possible to race against the creation of a throttle policy,
      which delays the allocation of stats_cpu.
      
      Like other functions in the throttle code, just checking for a NULL
      stats_cpu prevents the following oops caused by that race.
      
      [ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
      [ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
      [ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
      [ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
      [ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
      [ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
      [ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
      [ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
      [ 1137.734202] REGS: c000000f1d213500 TRAP: 0300   Not tainted  (3.19.0)
      [ 1137.734230] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI>  CR: 42008884  XER: 20000000
      [ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
      GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
      GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
      GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
      GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
      GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
      GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
      GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
      GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
      [ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
      [ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
      [ 1137.734943] Call Trace:
      [ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
      [ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
      [ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
      [ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
      [ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
      [ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
      [ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
      [ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
      [ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
      [ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
      [ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
      [ 1137.735383] Instruction dump:
      [ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
      [ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 <7cead02a> e9090008 e9490010 e9290018
      
      And here is one code that allows to easily reproduce this, although this
      has first been found by running docker.
      
      void run(pid_t pid)
      {
      	int n;
      	int status;
      	int fd;
      	char *buffer;
      	buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
      	n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);
      	fd = open(CGPATH "/test/tasks", O_WRONLY);
      	write(fd, buffer, n);
      	close(fd);
      	if (fork() > 0) {
      		fd = open("/dev/sda", O_RDONLY | O_DIRECT);
      		read(fd, buffer, 512);
      		close(fd);
      		wait(&status);
      	} else {
      		fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
      		n = read(fd, buffer, BUFFER_SIZE);
      		close(fd);
      	}
      	free(buffer);
      	exit(0);
      }
      
      void test(void)
      {
      	int status;
      	mkdir(CGPATH "/test", 0666);
      	if (fork() > 0)
      		wait(&status);
      	else
      		run(getpid());
      	rmdir(CGPATH "/test");
      }
      
      int main(int argc, char **argv)
      {
      	int i;
      	for (i = 0; i < NR_TESTS; i++)
      		test();
      	return 0;
      }
      Reported-by: default avatarRicardo Marin Matinata <rmm@br.ibm.com>
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      ad9ea48d
    • Jay Lan's avatar
      kdb: fix incorrect counts in KDB summary command output · 0b50c40c
      Jay Lan authored
      commit 14675592 upstream.
      
      The output of KDB 'summary' command should report MemTotal, MemFree
      and Buffers output in kB. Current codes report in unit of pages.
      
      A define of K(x) as
      is defined in the code, but not used.
      
      This patch would apply the define to convert the values to kB.
      Please include me on Cc on replies. I do not subscribe to linux-kernel.
      Signed-off-by: default avatarJay Lan <jlan@sgi.com>
      Signed-off-by: default avatarJason Wessel <jason.wessel@windriver.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      0b50c40c
    • James Hogan's avatar
      MIPS: Export MSA functions used by lose_fpu(1) for KVM · db99df36
      James Hogan authored
      commit ca5d2564 upstream.
      
      Export the _save_msa asm function used by the lose_fpu(1) macro to GPL
      modules so that KVM can make use of it when it is built as a module.
      
      This fixes the following build error when CONFIG_KVM=m and
      CONFIG_CPU_HAS_MSA=y due to commit f798217d ("KVM: MIPS: Don't leak
      FPU/DSP to guest"):
      
      ERROR: "_save_msa" [arch/mips/kvm/kvm.ko] undefined!
      
      Fixes: f798217d (KVM: MIPS: Don't leak FPU/DSP to guest)
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9261/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      db99df36
    • James Hogan's avatar
      MIPS: Export FP functions used by lose_fpu(1) for KVM · 56fcf534
      James Hogan authored
      commit 3ce465e0 upstream.
      
      Export the _save_fp asm function used by the lose_fpu(1) macro to GPL
      modules so that KVM can make use of it when it is built as a module.
      
      This fixes the following build error when CONFIG_KVM=m due to commit
      f798217d ("KVM: MIPS: Don't leak FPU/DSP to guest"):
      
      ERROR: "_save_fp" [arch/mips/kvm/kvm.ko] undefined!
      Signed-off-by: default avatarJames Hogan <james.hogan@imgtec.com>
      Fixes: f798217d (KVM: MIPS: Don't leak FPU/DSP to guest)
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Gleb Natapov <gleb@kernel.org>
      Cc: kvm@vger.kernel.org
      Cc: linux-mips@linux-mips.org
      Patchwork: https://patchwork.linux-mips.org/patch/9260/Signed-off-by: default avatarRalf Baechle <ralf@linux-mips.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      56fcf534
    • Ilya Dryomov's avatar
      libceph: fix double __remove_osd() problem · d27ae2c2
      Ilya Dryomov authored
      commit 7eb71e03 upstream.
      
      It turns out it's possible to get __remove_osd() called twice on the
      same OSD.  That doesn't sit well with rb_erase() - depending on the
      shape of the tree we can get a NULL dereference, a soft lockup or
      a random crash at some point in the future as we end up touching freed
      memory.  One scenario that I was able to reproduce is as follows:
      
                  <osd3 is idle, on the osd lru list>
      <con reset - osd3>
      con_fault_finish()
        osd_reset()
                                    <osdmap - osd3 down>
                                    ceph_osdc_handle_map()
                                      <takes map_sem>
                                      kick_requests()
                                        <takes request_mutex>
                                        reset_changed_osds()
                                          __reset_osd()
                                            __remove_osd()
                                        <releases request_mutex>
                                      <releases map_sem>
          <takes map_sem>
          <takes request_mutex>
          __kick_osd_requests()
            __reset_osd()
              __remove_osd() <-- !!!
      
      A case can be made that osd refcounting is imperfect and reworking it
      would be a proper resolution, but for now Sage and I decided to fix
      this by adding a safe guard around __remove_osd().
      
      Fixes: http://tracker.ceph.com/issues/8087
      
      Cc: Sage Weil <sage@redhat.com>
      Signed-off-by: default avatarIlya Dryomov <idryomov@gmail.com>
      Reviewed-by: default avatarSage Weil <sage@redhat.com>
      Reviewed-by: default avatarAlex Elder <elder@linaro.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      d27ae2c2
    • Ilya Dryomov's avatar
      libceph: change from BUG to WARN for __remove_osd() asserts · fe654f5e
      Ilya Dryomov authored
      commit cc9f1f51 upstream.
      
      No reason to use BUG_ON for osd request list assertions.
      Signed-off-by: default avatarIlya Dryomov <idryomov@redhat.com>
      Reviewed-by: default avatarAlex Elder <elder@linaro.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      fe654f5e
    • Ilya Dryomov's avatar
      libceph: assert both regular and lingering lists in __remove_osd() · 1505d944
      Ilya Dryomov authored
      commit 7c6e6fc5 upstream.
      
      It is important that both regular and lingering requests lists are
      empty when the OSD is removed.
      Signed-off-by: default avatarIlya Dryomov <ilya.dryomov@inktank.com>
      Reviewed-by: default avatarAlex Elder <elder@linaro.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      1505d944
    • Hector Marco-Gisbert's avatar
      x86, mm/ASLR: Fix stack randomization on 64-bit systems · b515b1b0
      Hector Marco-Gisbert authored
      commit 4e7c22d4 upstream.
      
      The issue is that the stack for processes is not properly randomized on
      64 bit architectures due to an integer overflow.
      
      The affected function is randomize_stack_top() in file
      "fs/binfmt_elf.c":
      
        static unsigned long randomize_stack_top(unsigned long stack_top)
        {
                 unsigned int random_variable = 0;
      
                 if ((current->flags & PF_RANDOMIZE) &&
                         !(current->personality & ADDR_NO_RANDOMIZE)) {
                         random_variable = get_random_int() & STACK_RND_MASK;
                         random_variable <<= PAGE_SHIFT;
                 }
                 return PAGE_ALIGN(stack_top) + random_variable;
                 return PAGE_ALIGN(stack_top) - random_variable;
        }
      
      Note that, it declares the "random_variable" variable as "unsigned int".
      Since the result of the shifting operation between STACK_RND_MASK (which
      is 0x3fffff on x86_64, 22 bits) and PAGE_SHIFT (which is 12 on x86_64):
      
      	  random_variable <<= PAGE_SHIFT;
      
      then the two leftmost bits are dropped when storing the result in the
      "random_variable". This variable shall be at least 34 bits long to hold
      the (22+12) result.
      
      These two dropped bits have an impact on the entropy of process stack.
      Concretely, the total stack entropy is reduced by four: from 2^28 to
      2^30 (One fourth of expected entropy).
      
      This patch restores back the entropy by correcting the types involved
      in the operations in the functions randomize_stack_top() and
      stack_maxrandom_size().
      
      The successful fix can be tested with:
      
        $ for i in `seq 1 10`; do cat /proc/self/maps | grep stack; done
        7ffeda566000-7ffeda587000 rw-p 00000000 00:00 0                          [stack]
        7fff5a332000-7fff5a353000 rw-p 00000000 00:00 0                          [stack]
        7ffcdb7a1000-7ffcdb7c2000 rw-p 00000000 00:00 0                          [stack]
        7ffd5e2c4000-7ffd5e2e5000 rw-p 00000000 00:00 0                          [stack]
        ...
      
      Once corrected, the leading bytes should be between 7ffc and 7fff,
      rather than always being 7fff.
      Signed-off-by: default avatarHector Marco-Gisbert <hecmargi@upv.es>
      Signed-off-by: default avatarIsmael Ripoll <iripoll@upv.es>
      [ Rebased, fixed 80 char bugs, cleaned up commit message, added test example and CVE ]
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Fixes: CVE-2015-1593
      Link: http://lkml.kernel.org/r/20150214173350.GA18393@www.outflux.netSigned-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      b515b1b0
    • Arnd Bergmann's avatar
      cpufreq: s3c: remove incorrect __init annotations · 22d44228
      Arnd Bergmann authored
      commit 61882b63 upstream.
      
      The two functions s3c2416_cpufreq_driver_init and s3c_cpufreq_register
      are marked init but are called from a context that might be run after
      the __init sections are discarded, as the compiler points out:
      
      WARNING: vmlinux.o(.data+0x1ad9dc): Section mismatch in reference from the variable s3c2416_cpufreq_driver to the function .init.text:s3c2416_cpufreq_driver_init()
      WARNING: drivers/built-in.o(.text+0x35b5dc): Section mismatch in reference from the function s3c2410a_cpufreq_add() to the function .init.text:s3c_cpufreq_register()
      
      This removes the __init markings.
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Acked-by: default avatarViresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      22d44228
    • Mikulas Patocka's avatar
      dm snapshot: fix a possible invalid memory access on unload · a0b50c39
      Mikulas Patocka authored
      commit 22aa66a3 upstream.
      
      When the snapshot target is unloaded, snapshot_dtr() waits until
      pending_exceptions_count drops to zero.  Then, it destroys the snapshot.
      Therefore, the function that decrements pending_exceptions_count
      should not touch the snapshot structure after the decrement.
      
      pending_complete() calls free_pending_exception(), which decrements
      pending_exceptions_count, and then it performs up_write(&s->lock) and it
      calls retry_origin_bios() which dereferences  s->origin.  These two
      memory accesses to the fields of the snapshot may touch the dm_snapshot
      struture after it is freed.
      
      This patch moves the call to free_pending_exception() to the end of
      pending_complete(), so that the snapshot will not be destroyed while
      pending_complete() is in progress.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      a0b50c39
    • Mikulas Patocka's avatar
      dm: fix a race condition in dm_get_md · f51576b6
      Mikulas Patocka authored
      commit 2bec1f4a upstream.
      
      The function dm_get_md finds a device mapper device with a given dev_t,
      increases the reference count and returns the pointer.
      
      dm_get_md calls dm_find_md, dm_find_md takes _minor_lock, finds the
      device, tests that the device doesn't have DMF_DELETING or DMF_FREEING
      flag, drops _minor_lock and returns pointer to the device. dm_get_md then
      calls dm_get. dm_get calls BUG if the device has the DMF_FREEING flag,
      otherwise it increments the reference count.
      
      There is a possible race condition - after dm_find_md exits and before
      dm_get is called, there are no locks held, so the device may disappear or
      DMF_FREEING flag may be set, which results in BUG.
      
      To fix this bug, we need to call dm_get while we hold _minor_lock. This
      patch renames dm_find_md to dm_get_md and changes it so that it calls
      dm_get while holding the lock.
      Signed-off-by: default avatarMikulas Patocka <mpatocka@redhat.com>
      Signed-off-by: default avatarMike Snitzer <snitzer@redhat.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f51576b6
    • John Stultz's avatar
      ntp: Fixup adjtimex freq validation on 32-bit systems · 06246b44
      John Stultz authored
      commit 29183a70 upstream.
      
      Additional validation of adjtimex freq values to avoid
      potential multiplication overflows were added in commit
      5e5aeb43 (time: adjtimex: Validate the ADJ_FREQUENCY values)
      
      Unfortunately the patch used LONG_MAX/MIN instead of
      LLONG_MAX/MIN, which was fine on 64-bit systems, but being
      much smaller on 32-bit systems caused false positives
      resulting in most direct frequency adjustments to fail w/
      EINVAL.
      
      ntpd only does direct frequency adjustments at startup, so
      the issue was not as easily observed there, but other time
      sync applications like ptpd and chrony were more effected by
      the bug.
      
      See bugs:
      
        https://bugzilla.kernel.org/show_bug.cgi?id=92481
        https://bugzilla.redhat.com/show_bug.cgi?id=1188074
      
      This patch changes the checks to use LLONG_MAX for
      clarity, and additionally the checks are disabled
      on 32-bit systems since LLONG_MAX/PPM_SCALE is always
      larger then the 32-bit long freq value, so multiplication
      overflows aren't possible there.
      Reported-by: default avatarJosh Boyer <jwboyer@fedoraproject.org>
      Reported-by: default avatarGeorge Joseph <george.joseph@fairview5.com>
      Tested-by: default avatarGeorge Joseph <george.joseph@fairview5.com>
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: default avatarPeter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Sasha Levin <sasha.levin@oracle.com>
      Link: http://lkml.kernel.org/r/1423553436-29747-1-git-send-email-john.stultz@linaro.org
      [ Prettified the changelog and the comments a bit. ]
      Signed-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      06246b44
    • Sasha Levin's avatar
      time: adjtimex: Validate the ADJ_FREQUENCY values · e74cd063
      Sasha Levin authored
      commit 5e5aeb43 upstream.
      
      Verify that the frequency value from userspace is valid and makes sense.
      
      Unverified values can cause overflows later on.
      
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarSasha Levin <sasha.levin@oracle.com>
      [jstultz: Fix up bug for negative values and drop redunent cap check]
      Signed-off-by: default avatarJohn Stultz <john.stultz@linaro.org>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e74cd063
    • Sebastian Andrzej Siewior's avatar
      locking/rtmutex: Avoid a NULL pointer dereference on deadlock · eb9abcff
      Sebastian Andrzej Siewior authored
      commit 8d1e5a1a upstream.
      
      With task_blocks_on_rt_mutex() returning early -EDEADLK we never
      add the waiter to the waitqueue. Later, we try to remove it via
      remove_waiter() and go boom in rt_mutex_top_waiter() because
      rb_entry() gives a NULL pointer.
      
      ( Tested on v3.18-RT where rtmutex is used for regular mutex and I
        tried to get one twice in a row. )
      
      Not sure when this started but I guess 397335f0 ("rtmutex: Fix
      deadlock detector for real") or commit 3d5c9340 ("rtmutex:
      Handle deadlock detection smarter").
      Signed-off-by: default avatarSebastian Andrzej Siewior <bigeasy@linutronix.de>
      Acked-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1424187823-19600-1-git-send-email-bigeasy@linutronix.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      eb9abcff
    • NeilBrown's avatar
      md/raid5: Fix livelock when array is both resyncing and degraded. · dae27fb6
      NeilBrown authored
      commit 26ac1073 upstream.
      
      Commit a7854487:
        md: When RAID5 is dirty, force reconstruct-write instead of read-modify-write.
      
      Causes an RCW cycle to be forced even when the array is degraded.
      A degraded array cannot support RCW as that requires reading all data
      blocks, and one may be missing.
      
      Forcing an RCW when it is not possible causes a live-lock and the code
      spins, repeatedly deciding to do something that cannot succeed.
      
      So change the condition to only force RCW on non-degraded arrays.
      Reported-by: default avatarManibalan P <pmanibalan@amiindia.co.in>
      Bisected-by: default avatarJes Sorensen <Jes.Sorensen@redhat.com>
      Tested-by: default avatarJes Sorensen <Jes.Sorensen@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Fixes: a7854487Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      dae27fb6
    • Markos Chandras's avatar
      MIPS: kernel: cps-vec: Replace "addi" with "addiu" · b4fe5e2d
      Markos Chandras authored
      commit acac4108 upstream.
      
      The "addi" instruction will trap on overflows which is not something
      we need in this code, so we replace that with "addiu".
      
      Link: http://www.linux-mips.org/archives/linux-mips/2015-01/msg00430.html
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Cc: Paul Burton <paul.burton@imgtec.com>
      Signed-off-by: default avatarMarkos Chandras <markos.chandras@imgtec.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      b4fe5e2d
    • Markos Chandras's avatar
      MIPS: asm: asmmacro: Replace "add" instructions with "addu" · f27d0731
      Markos Chandras authored
      commit 98a833c1 upstream.
      
      The "add" instruction is actually a macro in binutils and depending on
      the size of the immediate it can expand to an "addi" instruction.
      However, the "addi" instruction traps on overflows which is not
      something we want on address calculation.
      
      Link: http://www.linux-mips.org/archives/linux-mips/2015-01/msg00121.html
      Cc: Paul Burton <paul.burton@imgtec.com>
      Cc: Maciej W. Rozycki <macro@linux-mips.org>
      Signed-off-by: default avatarMarkos Chandras <markos.chandras@imgtec.com>
      [ luis: backported to 3.16: adjusted context ]
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      f27d0731
    • Daniel J Blueman's avatar
      EDAC, amd64_edac: Prevent OOPS with >16 memory controllers · 0c24a6c4
      Daniel J Blueman authored
      commit 0c510cc8 upstream.
      
      When DRAM errors occur on memory controllers after EDAC_MAX_MCS (16),
      the kernel fatally dereferences unallocated structures, see splat below;
      this occurs on at least NumaConnect systems.
      
      Fix by checking if a memory controller info structure was found.
      
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000320
      IP: [<ffffffff819f714f>] decode_bus_error+0x2f/0x2b0
      PGD 2f8b5a3067 PUD 2f8b5a2067 PMD 0
      Oops: 0000 [#2] SMP
      Modules linked in:
      CPU: 224 PID: 11930 Comm: stream_c.exe.gn Tainted: G   D    3.19.0 #1
      Hardware name: Supermicro H8QGL/H8QGL, BIOS 3.5b    01/28/2015
      task: ffff8807dbfb8c00 ti: ffff8807dd16c000 task.ti: ffff8807dd16c000
      RIP: 0010:[<ffffffff819f714f>] [<ffffffff819f714f>] decode_bus_error+0x2f/0x2b0
      RSP: 0000:ffff8907dfc03c48 EFLAGS: 00010297
      RAX: 0000000000000001 RBX: 9c67400010080a13 RCX: 0000000000001dc6
      RDX: 000000001dc61dc6 RSI: ffff8907dfc03df0 RDI: 000000000000001c
      RBP: ffff8907dfc03ce8 R08: 0000000000000000 R09: 0000000000000022
      R10: ffff891fffa30380 R11: 00000000001cfc90 R12: 0000000000000008
      R13: 0000000000000000 R14: 000000000000001c R15: 00009c6740001000
      FS: 00007fa97ee18700(0000) GS:ffff8907dfc00000(0000) knlGS:0000000000000000
      CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 0000000000000320 CR3: 0000003f889b8000 CR4: 00000000000407e0
      Stack:
       0000000000000000 ffff8907dfc03df0 0000000000000008 9c67400010080a13
       000000000000001c 00009c6740001000 ffff8907dfc03c88 ffffffff810e4f9a
       ffff8907dfc03ce8 ffffffff81b375b9 0000000000000000 0000000000000010
      Call Trace:
       <IRQ>
       ? vprintk_default
       ? printk
       amd_decode_mce
       notifier_call_chain
       atomic_notifier_call_chain
       mce_log
       machine_check_poll
       mce_timer_fn
       ? mce_cpu_restart
       call_timer_fn.isra.29
       run_timer_softirq
       __do_softirq
       irq_exit
       smp_apic_timer_interrupt
       apic_timer_interrupt
       <EOI>
       ? down_read_trylock
       __do_page_fault
       ? __schedule
       do_page_fault
       page_fault
      Signed-off-by: default avatarDaniel J Blueman <daniel@numascale.com>
      Link: http://lkml.kernel.org/r/1424144078-24589-1-git-send-email-daniel@numascale.com
      [ Boris: massage commit message ]
      Signed-off-by: default avatarBorislav Petkov <bp@suse.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      0c24a6c4
    • Mitko Haralanov's avatar
      IB/qib: Do not write EEPROM · 9cee3d5f
      Mitko Haralanov authored
      commit 18c0b82a upstream.
      
      This changeset removes all the code that allows the driver to write to
      the EEPROM and update the recorded error counters and power on hours.
      
      These two stats are unused and writing them exposes a timing risk
      which could leave the EEPROM in a bad state preventing further normal
      operation of the HCA.
      Reviewed-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarMitko Haralanov <mitko.haralanov@intel.com>
      Signed-off-by: default avatarMike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      9cee3d5f
    • Tony Battersby's avatar
      sg: fix read() error reporting · 2b8a5d91
      Tony Battersby authored
      commit 3b524a68 upstream.
      
      Fix SCSI generic read() incorrectly returning success after detecting an
      error.
      Signed-off-by: default avatarTony Battersby <tonyb@cybernetics.com>
      Acked-by: default avatarDouglas Gilbert <dgilbert@interlog.com>
      Signed-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      2b8a5d91
    • Minh Duc Tran's avatar
      fixed invalid assignment of 64bit mask to host dma_boundary for scatter gather... · 0a79bd3a
      Minh Duc Tran authored
      fixed invalid assignment of 64bit mask to host dma_boundary for scatter gather segment boundary limit.
      
      commit f76a610a upstream.
      
      In reference to bug https://bugzilla.redhat.com/show_bug.cgi?id=1097141
      Assert is seen with AMD cpu whenever calling pci_alloc_consistent.
      
      [   29.406183] ------------[ cut here ]------------
      [   29.410505] kernel BUG at lib/iommu-helper.c:13!
      Signed-off-by: default avatarMinh Tran <minh.tran@emulex.com>
      Fixes: 6733b39aSigned-off-by: default avatarJames Bottomley <JBottomley@Parallels.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      0a79bd3a
    • honclo's avatar
      Added Little Endian support to vtpm module · e03d7be2
      honclo authored
      commit eb71f8a5 upstream.
      
      The tpm_ibmvtpm module is affected by an unaligned access problem.
      ibmvtpm_crq_get_version failed with rc=-4 during boot when vTPM is
      enabled in Power partition, which supports both little endian and
      big endian modes.
      
      We added little endian support to fix this problem:
      1) added cpu_to_be64 calls to ensure BE data is sent from an LE OS.
      2) added be16_to_cpu and be32_to_cpu calls to make sure data received
         is in LE format on a LE OS.
      Signed-off-by: default avatarHon Ching(Vicky) Lo <honclo@linux.vnet.ibm.com>
      Signed-off-by: default avatarJoy Latten <jmlatten@linux.vnet.ibm.com>
      [phuewe: manually applied the patch :( ]
      Reviewed-by: default avatarAshley Lai <ashley@ahsleylai.com>
      Signed-off-by: default avatarPeter Huewe <peterhuewe@gmx.de>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      e03d7be2
    • Filipe Manana's avatar
      Btrfs: fix fsync data loss after adding hard link to inode · bb65fd7d
      Filipe Manana authored
      commit 1a4bcf47 upstream.
      
      We have a scenario where after the fsync log replay we can lose file data
      that had been previously fsync'ed if we added an hard link for our inode
      and after that we sync'ed the fsync log (for example by fsync'ing some
      other file or directory).
      
      This is because when adding an hard link we updated the inode item in the
      log tree with an i_size value of 0. At that point the new inode item was
      in memory only and a subsequent fsync log replay would not make us lose
      the file data. However if after adding the hard link we sync the log tree
      to disk, by fsync'ing some other file or directory for example, we ended
      up losing the file data after log replay, because the inode item in the
      persisted log tree had an an i_size of zero.
      
      This is easy to reproduce, and the following excerpt from my test for
      xfstests shows this:
      
        _scratch_mkfs >> $seqres.full 2>&1
        _init_flakey
        _mount_flakey
      
        # Create one file with data and fsync it.
        # This made the btrfs fsync log persist the data and the inode metadata with
        # a correct inode->i_size (4096 bytes).
        $XFS_IO_PROG -f -c "pwrite -S 0xaa -b 4K 0 4K" -c "fsync" \
             $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Now add one hard link to our file. This made the btrfs code update the fsync
        # log, in memory only, with an inode metadata having a size of 0.
        ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link
      
        # Now force persistence of the fsync log to disk, for example, by fsyncing some
        # other file.
        touch $SCRATCH_MNT/bar
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
      
        # Before a power loss or crash, we could read the 4Kb of data from our file as
        # expected.
        echo "File content before:"
        od -t x1 $SCRATCH_MNT/foo
      
        # Simulate a crash/power loss.
        _load_flakey_table $FLAKEY_DROP_WRITES
        _unmount_flakey
      
        _load_flakey_table $FLAKEY_ALLOW_WRITES
        _mount_flakey
      
        # After the fsync log replay, because the fsync log had a value of 0 for our
        # inode's i_size, we couldn't read anymore the 4Kb of data that we previously
        # wrote and fsync'ed. The size of the file became 0 after the fsync log replay.
        echo "File content after:"
        od -t x1 $SCRATCH_MNT/foo
      
      Another alternative test, that doesn't need to fsync an inode in the same
      transaction it was created, is:
      
        _scratch_mkfs >> $seqres.full 2>&1
        _init_flakey
        _mount_flakey
      
        # Create our test file with some data.
        $XFS_IO_PROG -f -c "pwrite -S 0xaa -b 8K 0 8K" \
             $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Make sure the file is durably persisted.
        sync
      
        # Append some data to our file, to increase its size.
        $XFS_IO_PROG -f -c "pwrite -S 0xcc -b 4K 8K 4K" \
             $SCRATCH_MNT/foo | _filter_xfs_io
      
        # Fsync the file, so from this point on if a crash/power failure happens, our
        # new data is guaranteed to be there next time the fs is mounted.
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foo
      
        # Add one hard link to our file. This made btrfs write into the in memory fsync
        # log a special inode with generation 0 and an i_size of 0 too. Note that this
        # didn't update the inode in the fsync log on disk.
        ln $SCRATCH_MNT/foo $SCRATCH_MNT/foo_link
      
        # Now make sure the in memory fsync log is durably persisted.
        # Creating and fsync'ing another file will do it.
        touch $SCRATCH_MNT/bar
        $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/bar
      
        # As expected, before the crash/power failure, we should be able to read the
        # 12Kb of file data.
        echo "File content before:"
        od -t x1 $SCRATCH_MNT/foo
      
        # Simulate a crash/power loss.
        _load_flakey_table $FLAKEY_DROP_WRITES
        _unmount_flakey
      
        _load_flakey_table $FLAKEY_ALLOW_WRITES
        _mount_flakey
      
        # After mounting the fs again, the fsync log was replayed.
        # The btrfs fsync log replay code didn't update the i_size of the persisted
        # inode because the inode item in the log had a special generation with a
        # value of 0 (and it couldn't know the correct i_size, since that inode item
        # had a 0 i_size too). This made the last 4Kb of file data inaccessible and
        # effectively lost.
        echo "File content after:"
        od -t x1 $SCRATCH_MNT/foo
      
      This isn't a new issue/regression. This problem has been around since the
      log tree code was added in 2008:
      
        Btrfs: Add a write ahead tree log to optimize synchronous operations
        (commit e02119d5)
      
      Test cases for xfstests follow soon.
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarLuis Henriques <luis.henriques@canonical.com>
      bb65fd7d