1. 05 Mar, 2014 9 commits
  2. 03 Mar, 2014 10 commits
  3. 01 Mar, 2014 3 commits
    • Filipe David Borba Manana's avatar
      Btrfs: fix data corruption when reading/updating compressed extents · 383728be
      Filipe David Borba Manana authored
      commit a2aa75e1 upstream.
      
      When using a mix of compressed file extents and prealloc extents, it
      is possible to fill a page of a file with random, garbage data from
      some unrelated previous use of the page, instead of a sequence of zeroes.
      
      A simple sequence of steps to get into such case, taken from the test
      case I made for xfstests, is:
      
         _scratch_mkfs
         _scratch_mount "-o compress-force=lzo"
         $XFS_IO_PROG -f -c "pwrite -S 0x06 -b 18670 266978 18670" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "falloc 26450 665194" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "truncate 542872" $SCRATCH_MNT/foobar
         $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/foobar
      
      This results in the following file items in the fs tree:
      
         item 4 key (257 INODE_ITEM 0) itemoff 15879 itemsize 160
             inode generation 6 transid 6 size 542872 block group 0 mode 100600
         item 5 key (257 INODE_REF 256) itemoff 15863 itemsize 16
             inode ref index 2 namelen 6 name: foobar
         item 6 key (257 EXTENT_DATA 0) itemoff 15810 itemsize 53
             extent data disk byte 0 nr 0 gen 6
             extent data offset 0 nr 24576 ram 266240
             extent compression 0
         item 7 key (257 EXTENT_DATA 24576) itemoff 15757 itemsize 53
             prealloc data disk byte 12849152 nr 241664 gen 6
             prealloc data offset 0 nr 241664
         item 8 key (257 EXTENT_DATA 266240) itemoff 15704 itemsize 53
             extent data disk byte 12845056 nr 4096 gen 6
             extent data offset 0 nr 20480 ram 20480
             extent compression 2
         item 9 key (257 EXTENT_DATA 286720) itemoff 15651 itemsize 53
             prealloc data disk byte 13090816 nr 405504 gen 6
             prealloc data offset 0 nr 258048
      
      The on disk extent at offset 266240 (which corresponds to 1 single disk block),
      contains 5 compressed chunks of file data. Each of the first 4 compress 4096
      bytes of file data, while the last one only compresses 3024 bytes of file data.
      Therefore a read into the file region [285648 ; 286720[ (length = 4096 - 3024 =
      1072 bytes) should always return zeroes (our next extent is a prealloc one).
      
      The solution here is the compression code path to zero the remaining (untouched)
      bytes of the last page it uncompressed data into, as the information about how
      much space the file data consumes in the last page is not known in the upper layer
      fs/btrfs/extent_io.c:__do_readpage(). In __do_readpage we were correctly zeroing
      the remainder of the page but only if it corresponds to the last page of the inode
      and if the inode's size is not a multiple of the page size.
      
      This would cause not only returning random data on reads, but also permanently
      storing random data when updating parts of the region that should be zeroed.
      For the example above, it means updating a single byte in the region [285648 ; 286720[
      would store that byte correctly but also store random data on disk.
      
      A test case for xfstests follows soon.
      Signed-off-by: default avatarFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      383728be
    • Filipe David Borba Manana's avatar
      Btrfs: fix tree mod logging · 3939448a
      Filipe David Borba Manana authored
      commit 5de865ee upstream.
      
      While running the test btrfs/004 from xfstests in a loop, it failed
      about 1 time out of 20 runs in my desktop. The failure happened in
      the backref walking part of the test, and the test's error message was
      like this:
      
        btrfs/004 93s ... [failed, exit status 1] - output mismatch (see /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad)
            --- tests/btrfs/004.out	2013-11-26 18:25:29.263333714 +0000
            +++ /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad	2013-12-10 15:25:10.327518516 +0000
            @@ -1,3 +1,8 @@
             QA output created by 004
             *** test backref walking
            -*** done
            +unexpected output from
            +	/home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal logical-resolve -P 141512704 /home/fdmanana/btrfs-tests/scratch_1
            +expected inum: 405, expected address: 454656, file: /home/fdmanana/btrfs-tests/scratch_1/snap1/p0/d6/d3d/d156/fce, got:
            +
             ...
             (Run 'diff -u tests/btrfs/004.out /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad' to see the entire diff)
        Ran: btrfs/004
        Failures: btrfs/004
        Failed 1 of 1 tests
      
      But immediately after the test finished, the btrfs inspect-internal command
      returned the expected output:
      
        $ btrfs inspect-internal logical-resolve -P 141512704 /home/fdmanana/btrfs-tests/scratch_1
        inode 405 offset 454656 root 258
        inode 405 offset 454656 root 5
      
      It turned out this was because the btrfs_search_old_slot() calls performed
      during backref walking (backref.c:__resolve_indirect_ref) were not finding
      anything. The reason for this turned out to be that the tree mod logging
      code was not logging some node multi-step operations atomically, therefore
      btrfs_search_old_slot() callers iterated often over an incomplete tree that
      wasn't fully consistent with any tree state from the past. Besides missing
      items, this often (but not always) resulted in -EIO errors during old slot
      searches, reported in dmesg like this:
      
      [ 4299.933936] ------------[ cut here ]------------
      [ 4299.933949] WARNING: CPU: 0 PID: 23190 at fs/btrfs/ctree.c:1343 btrfs_search_old_slot+0x57b/0xab0 [btrfs]()
      [ 4299.933950] Modules linked in: btrfs raid6_pq xor pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) bnep rfcomm bluetooth parport_pc ppdev binfmt_misc joydev snd_hda_codec_h
      [ 4299.933977] CPU: 0 PID: 23190 Comm: btrfs Tainted: G        W  O 3.12.0-fdm-btrfs-next-16+ #70
      [ 4299.933978] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./Z77 Pro4, BIOS P1.50 09/04/2012
      [ 4299.933979]  000000000000053f ffff8806f3fd98f8 ffffffff8176d284 0000000000000007
      [ 4299.933982]  0000000000000000 ffff8806f3fd9938 ffffffff8104a81c ffff880659c64b70
      [ 4299.933984]  ffff880659c643d0 ffff8806599233d8 ffff880701e2e938 0000160000000000
      [ 4299.933987] Call Trace:
      [ 4299.933991]  [<ffffffff8176d284>] dump_stack+0x55/0x76
      [ 4299.933994]  [<ffffffff8104a81c>] warn_slowpath_common+0x8c/0xc0
      [ 4299.933997]  [<ffffffff8104a86a>] warn_slowpath_null+0x1a/0x20
      [ 4299.934003]  [<ffffffffa065d3bb>] btrfs_search_old_slot+0x57b/0xab0 [btrfs]
      [ 4299.934005]  [<ffffffff81775f3b>] ? _raw_read_unlock+0x2b/0x50
      [ 4299.934010]  [<ffffffffa0655001>] ? __tree_mod_log_search+0x81/0xc0 [btrfs]
      [ 4299.934019]  [<ffffffffa06dd9b0>] __resolve_indirect_refs+0x130/0x5f0 [btrfs]
      [ 4299.934027]  [<ffffffffa06a21f1>] ? free_extent_buffer+0x61/0xc0 [btrfs]
      [ 4299.934034]  [<ffffffffa06de39c>] find_parent_nodes+0x1fc/0xe40 [btrfs]
      [ 4299.934042]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
      [ 4299.934048]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
      [ 4299.934056]  [<ffffffffa06df980>] iterate_extent_inodes+0xe0/0x250 [btrfs]
      [ 4299.934058]  [<ffffffff817762db>] ? _raw_spin_unlock+0x2b/0x50
      [ 4299.934065]  [<ffffffffa06dfb82>] iterate_inodes_from_logical+0x92/0xb0 [btrfs]
      [ 4299.934071]  [<ffffffffa06b13e0>] ? defrag_lookup_extent+0xe0/0xe0 [btrfs]
      [ 4299.934078]  [<ffffffffa06b7015>] btrfs_ioctl+0xf65/0x1f60 [btrfs]
      [ 4299.934080]  [<ffffffff811658b8>] ? handle_mm_fault+0x278/0xb00
      [ 4299.934083]  [<ffffffff81075563>] ? up_read+0x23/0x40
      [ 4299.934085]  [<ffffffff8177a41c>] ? __do_page_fault+0x20c/0x5a0
      [ 4299.934088]  [<ffffffff811b2946>] do_vfs_ioctl+0x96/0x570
      [ 4299.934090]  [<ffffffff81776e23>] ? error_sti+0x5/0x6
      [ 4299.934093]  [<ffffffff810b71e8>] ? trace_hardirqs_off_caller+0x28/0xd0
      [ 4299.934096]  [<ffffffff81776a09>] ? retint_swapgs+0xe/0x13
      [ 4299.934098]  [<ffffffff811b2eb1>] SyS_ioctl+0x91/0xb0
      [ 4299.934100]  [<ffffffff813eecde>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [ 4299.934102]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [ 4299.934102]  [<ffffffff8177ef12>] system_call_fastpath+0x16/0x1b
      [ 4299.934104] ---[ end trace 48f0cfc902491414 ]---
      [ 4299.934378] btrfs bad fsid on block 0
      
      These tree mod log operations that must be performed atomically, tree_mod_log_free_eb,
      tree_mod_log_eb_copy, tree_mod_log_insert_root and tree_mod_log_insert_move, used to
      be performed atomically before the following commit:
      
        c8cc6341
        (Btrfs: stop using GFP_ATOMIC for the tree mod log allocations)
      
      That change removed the atomicity of such operations. This patch restores the
      atomicity while still not doing the GFP_ATOMIC allocations of tree_mod_elem
      structures, so it has to do the allocations using GFP_NOFS before acquiring
      the mod log lock.
      
      This issue has been experienced by several users recently, such as for example:
      
        http://www.spinics.net/lists/linux-btrfs/msg28574.html
      
      After running the btrfs/004 test for 679 consecutive iterations with this
      patch applied, I didn't ran into the issue anymore.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      3939448a
    • Filipe David Borba Manana's avatar
      Btrfs: return immediately if tree log mod is not necessary · 4e6e3bd4
      Filipe David Borba Manana authored
      commit 78357766 upstream.
      
      In ctree.c:tree_mod_log_set_node_key() we were calling
      __tree_mod_log_insert_key() even when the modification doesn't need
      to be logged. This would allocate a tree_mod_elem structure, fill it
      and pass it to  __tree_mod_log_insert(), which would just acquire
      the tree mod log write lock and then free the tree_mod_elem structure
      and return (that is, a no-op).
      
      Therefore call tree_mod_log_insert() instead of __tree_mod_log_insert()
      which just returns immediately if the modification doesn't need to be
      logged (without allocating the structure, fill it, acquire write lock,
      free structure).
      Signed-off-by: default avatarFilipe David Borba Manana <fdmanana@gmail.com>
      Signed-off-by: default avatarJosef Bacik <jbacik@fb.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      4e6e3bd4
  4. 26 Feb, 2014 18 commits
    • Knut Petersen's avatar
      perf: Enforce 1 as lower limit for perf_event_max_sample_rate · 2b183ff2
      Knut Petersen authored
      commit 723478c8 upstream.
      
      /proc/sys/kernel/perf_event_max_sample_rate will accept
      negative values as well as 0.
      
      Negative values are unreasonable, and 0 causes a
      divide by zero exception in perf_proc_update_handler.
      
      This patch enforces a lower limit of 1.
      Signed-off-by: default avatarKnut Petersen <Knut_Petersen@t-online.de>
      Signed-off-by: default avatarPeter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/5242DB0C.4070005@t-online.deSigned-off-by: default avatarIngo Molnar <mingo@kernel.org>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      2b183ff2
    • Eric Dumazet's avatar
      net: use __GFP_NORETRY for high order allocations · 454d0fc5
      Eric Dumazet authored
      [ Upstream commit ed98df33 ]
      
      sock_alloc_send_pskb() & sk_page_frag_refill()
      have a loop trying high order allocations to prepare
      skb with low number of fragments as this increases performance.
      
      Problem is that under memory pressure/fragmentation, this can
      trigger OOM while the intent was only to try the high order
      allocations, then fallback to order-0 allocations.
      
      We had various reports from unexpected regressions.
      
      According to David, setting __GFP_NORETRY should be fine,
      as the asynchronous compaction is still enabled, and this
      will prevent OOM from kicking as in :
      
      CFSClientEventm invoked oom-killer: gfp_mask=0x42d0, order=3, oom_adj=0,
      oom_score_adj=0, oom_score_badness=2 (enabled),memcg_scoring=disabled
      CFSClientEventm
      
      Call Trace:
       [<ffffffff8043766c>] dump_header+0xe1/0x23e
       [<ffffffff80437a02>] oom_kill_process+0x6a/0x323
       [<ffffffff80438443>] out_of_memory+0x4b3/0x50d
       [<ffffffff8043a4a6>] __alloc_pages_may_oom+0xa2/0xc7
       [<ffffffff80236f42>] __alloc_pages_nodemask+0x1002/0x17f0
       [<ffffffff8024bd23>] alloc_pages_current+0x103/0x2b0
       [<ffffffff8028567f>] sk_page_frag_refill+0x8f/0x160
       [<ffffffff80295fa0>] tcp_sendmsg+0x560/0xee0
       [<ffffffff802a5037>] inet_sendmsg+0x67/0x100
       [<ffffffff80283c9c>] __sock_sendmsg_nosec+0x6c/0x90
       [<ffffffff80283e85>] sock_sendmsg+0xc5/0xf0
       [<ffffffff802847b6>] __sys_sendmsg+0x136/0x430
       [<ffffffff80284ec8>] sys_sendmsg+0x88/0x110
       [<ffffffff80711472>] system_call_fastpath+0x16/0x1b
      Out of Memory: Kill process 2856 (bash) score 9999 or sacrifice child
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Acked-by: default avatarDavid Rientjes <rientjes@google.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      454d0fc5
    • Florian Westphal's avatar
      net: ip, ipv6: handle gso skbs in forwarding path · 91e97e57
      Florian Westphal authored
      commit fe6cc55f upstream.
      
      Marcelo Ricardo Leitner reported problems when the forwarding link path
      has a lower mtu than the incoming one if the inbound interface supports GRO.
      
      Given:
      Host <mtu1500> R1 <mtu1200> R2
      
      Host sends tcp stream which is routed via R1 and R2.  R1 performs GRO.
      
      In this case, the kernel will fail to send ICMP fragmentation needed
      messages (or pkt too big for ipv6), as GSO packets currently bypass dstmtu
      checks in forward path. Instead, Linux tries to send out packets exceeding
      the mtu.
      
      When locking route MTU on Host (i.e., no ipv4 DF bit set), R1 does
      not fragment the packets when forwarding, and again tries to send out
      packets exceeding R1-R2 link mtu.
      
      This alters the forwarding dstmtu checks to take the individual gso
      segment lengths into account.
      
      For ipv6, we send out pkt too big error for gso if the individual
      segments are too big.
      
      For ipv4, we either send icmp fragmentation needed, or, if the DF bit
      is not set, perform software segmentation and let the output path
      create fragments when the packet is leaving the machine.
      It is not 100% correct as the error message will contain the headers of
      the GRO skb instead of the original/segmented one, but it seems to
      work fine in my (limited) tests.
      
      Eric Dumazet suggested to simply shrink mss via ->gso_size to avoid
      sofware segmentation.
      
      However it turns out that skb_segment() assumes skb nr_frags is related
      to mss size so we would BUG there.  I don't want to mess with it considering
      Herbert and Eric disagree on what the correct behavior should be.
      
      Hannes Frederic Sowa notes that when we would shrink gso_size
      skb_segment would then also need to deal with the case where
      SKB_MAX_FRAGS would be exceeded.
      
      This uses sofware segmentation in the forward path when we hit ipv4
      non-DF packets and the outgoing link mtu is too small.  Its not perfect,
      but given the lack of bug reports wrt. GRO fwd being broken this is a
      rare case anyway.  Also its not like this could not be improved later
      once the dust settles.
      Acked-by: default avatarHerbert Xu <herbert@gondor.apana.org.au>
      Reported-by: default avatarMarcelo Ricardo Leitner <mleitner@redhat.com>
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      91e97e57
    • Florian Westphal's avatar
      net: core: introduce netif_skb_dev_features · 0777d823
      Florian Westphal authored
      commit d2069403 upstream.
      
      Will be used by upcoming ipv4 forward path change that needs to
      determine feature mask using skb->dst->dev instead of skb->dev.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0777d823
    • Florian Westphal's avatar
      net: add and use skb_gso_transport_seglen() · 449c5733
      Florian Westphal authored
      commit de960aa9 upstream.
      
      This moves part of Eric Dumazets skb_gso_seglen helper from tbf sched to
      skbuff core so it may be reused by upcoming ip forwarding path patch.
      Signed-off-by: default avatarFlorian Westphal <fw@strlen.de>
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      449c5733
    • Daniel Borkmann's avatar
      net: sctp: fix sctp_connectx abi for ia32 emulation/compat mode · 515266de
      Daniel Borkmann authored
      [ Upstream commit ffd59393 ]
      
      SCTP's sctp_connectx() abi breaks for 64bit kernels compiled with 32bit
      emulation (e.g. ia32 emulation or x86_x32). Due to internal usage of
      'struct sctp_getaddrs_old' which includes a struct sockaddr pointer,
      sizeof(param) check will always fail in kernel as the structure in
      64bit kernel space is 4bytes larger than for user binaries compiled
      in 32bit mode. Thus, applications making use of sctp_connectx() won't
      be able to run under such circumstances.
      
      Introduce a compat interface in the kernel to deal with such
      situations by using a 'struct compat_sctp_getaddrs_old' structure
      where user data is copied into it, and then sucessively transformed
      into a 'struct sctp_getaddrs_old' structure with the help of
      compat_ptr(). That fixes sctp_connectx() abi without any changes
      needed in user space, and lets the SCTP test suite pass when compiled
      in 32bit and run on 64bit kernels.
      
      Fixes: f9c67811 ("sctp: Fix regression introduced by new sctp_connectx api")
      Signed-off-by: default avatarDaniel Borkmann <dborkman@redhat.com>
      Acked-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Acked-by: default avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      515266de
    • Duan Jiong's avatar
      ipv4: fix counter in_slow_tot · a97a04ff
      Duan Jiong authored
      [ Upstream commit a6254864 ]
      
      since commit 89aef892("ipv4: Delete routing cache."), the counter
      in_slow_tot can't work correctly.
      
      The counter in_slow_tot increase by one when fib_lookup() return successfully
      in ip_route_input_slow(), but actually the dst struct maybe not be created and
      cached, so we can increase in_slow_tot after the dst struct is created.
      Signed-off-by: default avatarDuan Jiong <duanj.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      a97a04ff
    • Jiri Bohac's avatar
      bonding: 802.3ad: make aggregator_identifier bond-private · c92210fd
      Jiri Bohac authored
      [ Upstream commit 163c8ff3 ]
      
      aggregator_identifier is used to assign unique aggregator identifiers
      to aggregators of a bond during device enslaving.
      
      aggregator_identifier is currently a global variable that is zeroed in
      bond_3ad_initialize().
      
      This sequence will lead to duplicate aggregator identifiers for eth1 and eth3:
      
      create bond0
      change bond0 mode to 802.3ad
      enslave eth0 to bond0 		//eth0 gets agg id 1
      enslave eth1 to bond0 		//eth1 gets agg id 2
      create bond1
      change bond1 mode to 802.3ad
      enslave eth2 to bond1		//aggregator_identifier is reset to 0
      				//eth2 gets agg id 1
      enslave eth3 to bond0 		//eth3 gets agg id 2
      
      Fix this by making aggregator_identifier private to the bond.
      Signed-off-by: default avatarJiri Bohac <jbohac@suse.cz>
      Acked-by: default avatarVeaceslav Falico <vfalico@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      c92210fd
    • Emil Goode's avatar
      usbnet: remove generic hard_header_len check · e8047a7d
      Emil Goode authored
      [ Upstream commit eb85569f ]
      
      This patch removes a generic hard_header_len check from the usbnet
      module that is causing dropped packages under certain circumstances
      for devices that send rx packets that cross urb boundaries.
      
      One example is the AX88772B which occasionally send rx packets that
      cross urb boundaries where the remaining partial packet is sent with
      no hardware header. When the buffer with a partial packet is of less
      number of octets than the value of hard_header_len the buffer is
      discarded by the usbnet module.
      
      With AX88772B this can be reproduced by using ping with a packet
      size between 1965-1976.
      
      The bug has been reported here:
      
      https://bugzilla.kernel.org/show_bug.cgi?id=29082
      
      This patch introduces the following changes:
      - Removes the generic hard_header_len check in the rx_complete
        function in the usbnet module.
      - Introduces a ETH_HLEN check for skbs that are not cloned from
        within a rx_fixup callback.
      - For safety a hard_header_len check is added to each rx_fixup
        callback function that could be affected by this change.
        These extra checks could possibly be removed by someone
        who has the hardware to test.
      - Removes a call to dev_kfree_skb_any() and instead utilizes the
        dev->done list to queue skbs for cleanup.
      
      The changes place full responsibility on the rx_fixup callback
      functions that clone skbs to only pass valid skbs to the
      usbnet_skb_return function.
      Signed-off-by: default avatarEmil Goode <emilgoode@gmail.com>
      Reported-by: default avatarIgor Gnatenko <i.gnatenko.brain@gmail.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      e8047a7d
    • Emil Goode's avatar
      net: asix: add missing flag to struct driver_info · 30899235
      Emil Goode authored
      [ Upstream commit d43ff4cd ]
      
      The struct driver_info ax88178_info is assigned the function
      asix_rx_fixup_common as it's rx_fixup callback. This means that
      FLAG_MULTI_PACKET must be set as this function is cloning the
      data and calling usbnet_skb_return. Not setting this flag leads
      to usbnet_skb_return beeing called a second time from within
      the rx_process function in the usbnet module.
      Signed-off-by: default avatarEmil Goode <emilgoode@gmail.com>
      Reported-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      30899235
    • Michael S. Tsirkin's avatar
      vhost: fix ref cnt checking deadlock · 1d2e001b
      Michael S. Tsirkin authored
      [ Upstream commit 0ad8b480 ]
      
      vhost checked the counter within the refcnt before decrementing.  It
      really wanted to know that it is the one that has the last reference, as
      a way to batch freeing resources a bit more efficiently.
      
      Note: we only let refcount go to 0 on device release.
      
      This works well but we now access the ref counter twice so there's a
      race: all users might see a high count and decide to defer freeing
      resources.
      In the end no one initiates freeing resources until the last reference
      is gone (which is on VM shotdown so might happen after a looooong time).
      
      Let's do what we probably should have done straight away:
      switch from kref to plain atomic, documenting the
      semantics, return the refcount value atomically after decrement,
      then use that to avoid the deadlock.
      Reported-by: default avatarQin Chuanyu <qinchuanyu@huawei.com>
      Signed-off-by: default avatarMichael S. Tsirkin <mst@redhat.com>
      Acked-by: default avatarJason Wang <jasowang@redhat.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      1d2e001b
    • Nithin Sujir's avatar
      tg3: Fix deadlock in tg3_change_mtu() · 917cc7f0
      Nithin Sujir authored
      [ Upstream commit c6993dfd ]
      
      Quoting David Vrabel -
      "5780 cards cannot have jumbo frames and TSO enabled together.  When
      jumbo frames are enabled by setting the MTU, the TSO feature must be
      cleared.  This is done indirectly by calling netdev_update_features()
      which will call tg3_fix_features() to actually clear the flags.
      
      netdev_update_features() will also trigger a new netlink message for the
      feature change event which will result in a call to tg3_get_stats64()
      which deadlocks on the tg3 lock."
      
      tg3_set_mtu() does not need to be under the tg3 lock since converting
      the flags to use set_bit(). Move it out to after tg3_netif_stop().
      Reported-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Tested-by: default avatarDavid Vrabel <david.vrabel@citrix.com>
      Signed-off-by: default avatarMichael Chan <mchan@broadcom.com>
      Signed-off-by: default avatarNithin Nayak Sujir <nsujir@broadcom.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      917cc7f0
    • John Ogness's avatar
      tcp: tsq: fix nonagle handling · 6f7656e8
      John Ogness authored
      [ Upstream commit bf06200e ]
      
      Commit 46d3ceab ("tcp: TCP Small Queues") introduced a possible
      regression for applications using TCP_NODELAY.
      
      If TCP session is throttled because of tsq, we should consult
      tp->nonagle when TX completion is done and allow us to send additional
      segment, especially if this segment is not a full MSS.
      Otherwise this segment is sent after an RTO.
      
      [edumazet] : Cooked the changelog, added another fix about testing
      sk_wmem_alloc twice because TX completion can happen right before
      setting TSQ_THROTTLED bit.
      
      This problem is particularly visible with recent auto corking,
      but might also be triggered with low tcp_limit_output_bytes
      values or NIC drivers delaying TX completion by hundred of usec,
      and very low rtt.
      
      Thomas Glanzmann for example reported an iscsi regression, caused
      by tcp auto corking making this bug quite visible.
      
      Fixes: 46d3ceab ("tcp: TCP Small Queues")
      Signed-off-by: default avatarJohn Ogness <john.ogness@linutronix.de>
      Signed-off-by: default avatarEric Dumazet <edumazet@google.com>
      Reported-by: default avatarThomas Glanzmann <thomas@glanzmann.de>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      6f7656e8
    • Bjørn Mork's avatar
      net: qmi_wwan: add Netgear Aircard 340U · dce84839
      Bjørn Mork authored
      [ Upstream commit fbd3a77d ]
      
      This device was mentioned in an OpenWRT forum.  Seems to have a "standard"
      Sierra Wireless ifnumber to function layout:
       0: qcdm
       2: nmea
       3: modem
       8: qmi
       9: storage
      Signed-off-by: default avatarBjørn Mork <bjorn@mork.no>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      dce84839
    • Sabrina Dubroca's avatar
      netpoll: fix netconsole IPv6 setup · 7b694760
      Sabrina Dubroca authored
      [ Upstream commit 00fe11b3 ]
      
      Currently, to make netconsole start over IPv6, the source address
      needs to be specified. Without a source address, netpoll_parse_options
      assumes we're setting up over IPv4 and the destination IPv6 address is
      rejected.
      
      Check if the IP version has been forced by a source address before
      checking for a version mismatch when parsing the destination address.
      Signed-off-by: default avatarSabrina Dubroca <sd@queasysnail.net>
      Acked-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7b694760
    • Maciej Żenczykowski's avatar
      net: fix 'ip rule' iif/oif device rename · 7d9c1b6b
      Maciej Żenczykowski authored
      [ Upstream commit 946c032e ]
      
      ip rules with iif/oif references do not update:
      (detach/attach) across interface renames.
      Signed-off-by: default avatarMaciej Żenczykowski <maze@google.com>
      CC: Willem de Bruijn <willemb@google.com>
      CC: Eric Dumazet <edumazet@google.com>
      CC: Chris Davis <chrismd@google.com>
      CC: Carlo Contavalli <ccontavalli@google.com>
      
      Google-Bug-Id: 12936021
      Acked-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      7d9c1b6b
    • Geert Uytterhoeven's avatar
      ipv4: Fix runtime WARNING in rtmsg_ifa() · 0c360776
      Geert Uytterhoeven authored
      [ Upstream commit 63b5f152 ]
      
      On m68k/ARAnyM:
      
      WARNING: CPU: 0 PID: 407 at net/ipv4/devinet.c:1599 0x316a99()
      Modules linked in:
      CPU: 0 PID: 407 Comm: ifconfig Not tainted
      3.13.0-atari-09263-g0c71d68014d1 #1378
      Stack from 10c4fdf0:
              10c4fdf0 002ffabb 000243e8 00000000 008ced6c 00024416 00316a99 0000063f
              00316a99 00000009 00000000 002501b4 00316a99 0000063f c0a86117 00000080
              c0a86117 00ad0c90 00250a5a 00000014 00ad0c90 00000000 00000000 00000001
              00b02dd0 00356594 00000000 00356594 c0a86117 eff6c9e4 008ced6c 00000002
              008ced60 0024f9b4 00250b52 00ad0c90 00000000 00000000 00252390 00ad0c90
              eff6c9e4 0000004f 00000000 00000000 eff6c9e4 8000e25c eff6c9e4 80001020
      Call Trace: [<000243e8>] warn_slowpath_common+0x52/0x6c
       [<00024416>] warn_slowpath_null+0x14/0x1a
       [<002501b4>] rtmsg_ifa+0xdc/0xf0
       [<00250a5a>] __inet_insert_ifa+0xd6/0x1c2
       [<0024f9b4>] inet_abc_len+0x0/0x42
       [<00250b52>] inet_insert_ifa+0xc/0x12
       [<00252390>] devinet_ioctl+0x2ae/0x5d6
      
      Adding some debugging code reveals that net_fill_ifaddr() fails in
      
          put_cacheinfo(skb, ifa->ifa_cstamp, ifa->ifa_tstamp,
                                    preferred, valid))
      
      nla_put complains:
      
          lib/nlattr.c:454: skb_tailroom(skb) = 12, nla_total_size(attrlen) = 20
      
      Apparently commit 5c766d64 ("ipv4:
      introduce address lifetime") forgot to take into account the addition of
      struct ifa_cacheinfo in inet_nlmsg_size(). Hence add it, like is already
      done for ipv6.
      Suggested-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: default avatarCong Wang <cwang@twopensource.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      0c360776
    • Oliver Hartkopp's avatar
      can: add destructor for self generated skbs · b3097b17
      Oliver Hartkopp authored
      [ Upstream commit 0ae89beb ]
      
      Self generated skbuffs in net/can/bcm.c are setting a skb->sk reference but
      no explicit destructor which is enforced since Linux 3.11 with commit
      376c7311 (net: add a temporary sanity check in skb_orphan()).
      
      This patch adds some helper functions to make sure that a destructor is
      properly defined when a sock reference is assigned to a CAN related skb.
      To create an unshared skb owned by the original sock a common helper function
      has been introduced to replace open coded functions to create CAN echo skbs.
      Signed-off-by: default avatarOliver Hartkopp <socketcan@hartkopp.net>
      Tested-by: default avatarAndre Naujoks <nautsch2@gmail.com>
      Reviewed-by: default avatarEric Dumazet <edumazet@google.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      Signed-off-by: default avatarJiri Slaby <jslaby@suse.cz>
      b3097b17