1. 09 Aug, 2013 10 commits
    • btrfs: don't loop on large offsets in readdir · db62efbb
      Zach Brown authored
      When btrfs readdir() hits the last entry it sets the readdir offset to a
      huge value to stop buggy apps from breaking when the same name is
      returned by readdir() with concurrent rename()s.
      
      But unconditionally setting the offset to INT_MAX causes readdir() to
      loop returning any entries with offsets past INT_MAX.  It only takes a
      few hours of constant file creation and removal to create entries past
      INT_MAX.
      
      So let's set the huge offset to LLONG_MAX if the last entry has already
      overflowed 32bit loff_t.   Without large offsets behaviour is identical.
      With large offsets 64bit apps will work and 32bit apps will be no more
      broken than they currently are if they see large offsets.
      Signed-off-by: Zach Brown <zab@redhat.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
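The choice of end-of-directory offset described in the readdir commit above can be sketched as a small standalone function. This is only an illustration, not the kernel code: the helper name `end_of_dir_pos` and the exact threshold comparison are assumptions.

```c
#include <limits.h>

/*
 * Hypothetical helper (not the kernel's function): choose the sentinel
 * offset to store once the last directory entry has been returned.
 * last_pos is the offset of the last entry that was emitted.
 */
long long end_of_dir_pos(long long last_pos)
{
    /*
     * Entries already live at or beyond INT_MAX (threshold is an
     * assumption): a 32-bit sentinel would sit below them and readdir()
     * would keep returning them, so jump to the 64-bit maximum.
     */
    if (last_pos >= INT_MAX)
        return LLONG_MAX;

    /* No large offsets seen: behaviour is identical to before. */
    return INT_MAX;
}
```

With small offsets the result is still INT_MAX, so existing callers see no difference; only directories whose entries have grown past the 32-bit range get the larger sentinel.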
    • Btrfs: check to see if root_list is empty before adding it to dead roots · cfad392b
      Josef Bacik authored
      A user reported a panic when running with autodefrag and deleting snapshots.
      This is because we could end up trying to add the root to the dead roots list
      twice.  To fix this check to see if we are empty before adding ourselves to the
      dead roots list.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
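The "check that we are empty before adding ourselves" idea from the dead roots commit above can be illustrated with a minimal circular list. Everything here is a userspace stand-in: `struct list_node`, its helpers, and `add_dead_root_once` are hypothetical names, not the kernel's list API or the Btrfs function.

```c
#include <stdbool.h>

/* Minimal circular doubly linked list node (userspace stand-in). */
struct list_node {
    struct list_node *next;
    struct list_node *prev;
};

/* An initialized node points at itself, which also means "not on a list". */
static void list_init(struct list_node *n)
{
    n->next = n;
    n->prev = n;
}

static bool list_is_empty(const struct list_node *n)
{
    return n->next == n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/*
 * Hypothetical sketch: queue a root's list entry on the dead roots list
 * only if it is not already linked somewhere. Returns true if it was
 * added, false if it was already on a list.
 */
bool add_dead_root_once(struct list_node *root_entry,
                        struct list_node *dead_roots)
{
    if (!list_is_empty(root_entry))
        return false;   /* already queued: adding again would corrupt the list */

    list_add_tail_node(root_entry, dead_roots);
    return true;
}
```

Calling `add_dead_root_once()` twice on the same entry links it only once; the second call is a no-op.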
    • Btrfs: release both paths before logging dir/changed extents · f3b15ccd
      Josef Bacik authored
      The ceph guys tripped over this bug where we were still holding onto the
      original path that we used to copy the inode with when logging.  This is based
      on Chris's fix which was reported to fix the problem.  We need to drop the paths
      in two cases anyway so just move the drop up so that we don't have duplicate
      code.  Thanks,
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: allow splitting of hole em's when dropping extent cache · ee20a983
      Josef Bacik authored
      I noticed while running multi-threaded fsync tests that sometimes fsck would
      complain about an improper gap.  This happens because we fail to add a hole
      extent to the file, which was happening when we'd split a hole EM because
      btrfs_drop_extent_cache was just discarding the whole em instead of splitting
      it.  So this patch fixes this by allowing us to split a hole em properly, which
      means that added holes actually get logged properly and we no longer see this
      fsck error.  Thankfully we're tolerant of these sort of problems so a user would
      not see any adverse effects of this bug, other than fsck complaining.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: make sure the backref walker catches all refs to our extent · ed8c4913
      Josef Bacik authored
      Because we don't mess with the offset into the extent for compressed extents,
      we will properly find both extents in this case:
      
      [extent a][extent b][rest of extent a]
      
      but because we already added a ref for the front half we won't add the inode
      information for the second half.  This causes us to leak that memory and not
      print out the other offset when we do logical-resolve.  So fix this by calling
      ulist_add_merge and then add our eie to the existing entry if there is one.
      With this patch we get both offsets out of logical-resolve.  With this and the
      other 2 patches I've sent we now pass btrfs/276 on my vm with compress-force=lzo
      set.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: fix backref walking when we hit a compressed extent · 8ca15e05
      Josef Bacik authored
      If you do btrfs inspect-internal logical-resolve on a compressed extent that has
      been partly overwritten it won't find anything.  This is because we try and
      match the extent offset we've searched for based on the extent offset in the
      data extent entry.  However this doesn't work for compressed extents because the
      offsets are for the uncompressed size, not the compressed size.  So instead only
      do this check if we are not compressed, that way we can get an actual entry for
      the physical offset rather than nothing for compressed.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
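The rule in the compressed-extent backref commit above, "only do the offset check if we are not compressed", might look roughly like the following. This is a simplified sketch: `struct extent_ref_sketch`, `ref_matches`, and the range-check form of the comparison are assumptions for illustration, not the actual backref walking code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified view of one file extent item. */
struct extent_ref_sketch {
    uint64_t data_offset;   /* offset into the (uncompressed) extent */
    uint64_t data_len;      /* bytes of the extent this item references */
    bool compressed;
};

/*
 * Returns true if the item should be reported for a lookup at
 * wanted_pos, an offset into the on-disk extent.
 */
bool ref_matches(const struct extent_ref_sketch *ref, uint64_t wanted_pos)
{
    /*
     * For compressed extents the offsets describe uncompressed data,
     * so they cannot be compared against a position in the compressed
     * on-disk extent. Skip the check and report the item.
     */
    if (ref->compressed)
        return true;

    /* Uncompressed: the position must fall inside the referenced range. */
    return wanted_pos >= ref->data_offset &&
           wanted_pos < ref->data_offset + ref->data_len;
}
```

Skipping the check means a lookup on a compressed extent yields an entry for the physical offset rather than nothing, which is the trade-off the commit message describes.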
    • Btrfs: do not offset physical if we're compressed · b76bb701
      Josef Bacik authored
      xfstest btrfs/276 was freaking out on slower boxes partly because fiemap was
      offsetting the physical based on the extent offset.  This is perfectly fine with
      uncompressed extents, however the extent offset is into the uncompressed area,
      not the compressed.  So we can return a physical value that isn't at all within
      the area we have allocated on disk.  Fix this by returning the start of the
      extent if it is compressed no matter what the offset.  Thanks,
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
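The fiemap rule from the commit above reduces to a single branch. A minimal sketch, with a hypothetical function name and parameters rather than the real fiemap code:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: physical address to report for a mapping.
 * block_start is where the extent begins on disk; extent_offset is the
 * mapping's offset into the extent's uncompressed data.
 */
uint64_t fiemap_physical(uint64_t block_start, uint64_t extent_offset,
                         bool compressed)
{
    /*
     * The offset is into the uncompressed area, so adding it to the
     * start of a compressed extent could point outside the space that
     * is actually allocated on disk. Report the extent start instead.
     */
    if (compressed)
        return block_start;

    return block_start + extent_offset;
}
```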
    • Btrfs: fix extent buffer leak after backref walking · b5b9b5b3
      Liu Bo authored
      Commit 47fb091f (Btrfs: fix unlock after free on rewinded tree blocks)
      takes an extra reference on the allocated dummy extent buffer, so now we
      cannot free this dummy buffer and end up with an extent buffer leak.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Reviewed-by: Jan Schmidt <list.btrfs@jan-o-sch.net>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
    • Btrfs: fix a bug of snapshot-aware defrag to make it work on partial extents · e68afa49
      Liu Bo authored
      For partial extents, snapshot-aware defrag does not work as expected,
      since
      a) we use the wrong logical offset to search for parents, which should be
         disk_bytenr + extent_offset, not just disk_bytenr,
      b) 'offset' returned by the backref walking just refers to key.offset, not
         the 'offset' stored in btrfs_extent_data_ref which is
         (key.offset - extent_offset).
      
      The reproducer:
      $ mkfs.btrfs sda
      $ mount sda /mnt
      $ btrfs sub create /mnt/sub
      $ for i in `seq 5 -1 1`; do dd if=/dev/zero of=/mnt/sub/foo bs=5k count=1 seek=$i conv=notrunc oflag=sync; done
      $ btrfs sub snap /mnt/sub /mnt/snap1
      $ btrfs sub snap /mnt/sub /mnt/snap2
      $ sync; btrfs filesystem defrag /mnt/sub/foo;
      $ umount /mnt
      $ btrfs-debug-tree sda (Here we can check whether the defrag operation is snapshot-aware.)
      
      This addresses the above two problems.
      Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
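The two offset corrections listed as (a) and (b) in the snapshot-aware defrag commit above are plain arithmetic on three fields. The sketch below spells them out; the struct and function names are hypothetical, and only the formulas come from the commit message.

```c
#include <stdint.h>

/* Hypothetical, simplified file extent: only the fields used below. */
struct file_extent_sketch {
    uint64_t key_offset;      /* file offset of this extent item (key.offset) */
    uint64_t disk_bytenr;     /* start of the extent on disk */
    uint64_t extent_offset;   /* offset of this item into the extent */
};

/* (a) Logical address to use when searching for parents of a partial extent. */
uint64_t parent_search_logical(const struct file_extent_sketch *fe)
{
    return fe->disk_bytenr + fe->extent_offset;
}

/* (b) The 'offset' as stored in the extent data ref. */
uint64_t data_ref_offset(const struct file_extent_sketch *fe)
{
    return fe->key_offset - fe->extent_offset;
}
```

For an extent that is referenced from its very beginning (`extent_offset == 0`) both helpers return the unadjusted values, which is consistent with the bug only showing up on partial extents.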
    • btrfs: fix file truncation if FALLOC_FL_KEEP_SIZE is specified · 7cddc193
      Jie Liu authored
      Create a small file, fallocate it to a big size with the
      FALLOC_FL_KEEP_SIZE option, then truncate it back to the
      small size again: the disk free space is not released
      in this case, e.g.,
      
      total 4
      -rw-r--r-- 1 root root 512 Jun 28 11:35 test
      
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G   56K  7.2G   1% /mnt
      
      -rw-r--r-- 1 root root 512 Jun 28 11:35 /mnt/test
      
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G  5.1G  2.2G  70% /mnt
      
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G  5.1G  2.2G  70% /mnt
      
      With this fix, the preallocated space is released again after the truncate:
      Filesystem      Size  Used Avail Use% Mounted on
      ....
      /dev/sdb1       8.0G   56K  7.2G   1% /mnt
      Signed-off-by: Jie Liu <jeff.liu@oracle.com>
      Signed-off-by: Josef Bacik <jbacik@fusionio.com>
      Signed-off-by: Chris Mason <chris.mason@fusionio.com>
  2. 04 Aug, 2013 6 commits
  3. 03 Aug, 2013 17 commits
  4. 02 Aug, 2013 7 commits
    • netlabel: use domain based selectors when address based selectors are not available · 6a8b7f0c
      Paul Moore authored
      NetLabel has the ability to selectively assign network security labels
      to outbound traffic based on either the LSM's "domain" (different for
      each LSM), the network destination, or a combination of both.  Depending
      on the type of traffic, local or forwarded, and the type of traffic
      selector, domain or address based, different hooks are used to label the
      traffic; the goal being minimal overhead.
      
      Unfortunately, there is a bug such that a system using NetLabel domain
      based traffic selectors does not correctly label outbound local traffic
      that is not assigned to a socket.  The issue is that in these cases
      the associated NetLabel hook only looks at the address based selectors
      and not the domain based selectors.  This patch corrects this by
      checking both the domain and address based selectors so that the correct
      labeling is applied, regardless of the configuration type.
      
      In order to accomplish this fix, this patch also simplifies some of the
      NetLabel domainhash structures to use a more common outbound traffic
      mapping type: struct netlbl_dommap_def.  This simplifies some of the code
      in this patch and paves the way for further simplifications in the
      future.
      Signed-off-by: Paul Moore <pmoore@redhat.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • net: check net.core.somaxconn sysctl values · 5f671d6b
      Roman Gushchin authored
      It's possible to assign an invalid value to the net.core.somaxconn
      sysctl variable, because there are no checks at all.
      
      The sk_max_ack_backlog field of the sock structure is defined as
      unsigned short. Therefore, the backlog argument in inet_listen()
      shouldn't exceed USHRT_MAX. The backlog argument in the listen() syscall
      is truncated to the somaxconn value. So, the somaxconn value shouldn't
      exceed 65535 (USHRT_MAX).
      Also, negative values of somaxconn are meaningless.
      
      before:
      $ sysctl -w net.core.somaxconn=256
      net.core.somaxconn = 256
      $ sysctl -w net.core.somaxconn=65536
      net.core.somaxconn = 65536
      $ sysctl -w net.core.somaxconn=-100
      net.core.somaxconn = -100
      
      after:
      $ sysctl -w net.core.somaxconn=256
      net.core.somaxconn = 256
      $ sysctl -w net.core.somaxconn=65536
      error: "Invalid argument" setting key "net.core.somaxconn"
      $ sysctl -w net.core.somaxconn=-100
      error: "Invalid argument" setting key "net.core.somaxconn"
      
      Based on a prior patch from Changli Gao.
      Signed-off-by: Roman Gushchin <klamm@yandex-team.ru>
      Reported-by: Changli Gao <xiaosuo@gmail.com>
      Suggested-by: Eric Dumazet <edumazet@google.com>
      Acked-by: Eric Dumazet <edumazet@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
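The range that the somaxconn commit above enforces can be written as a one-line predicate. This is a userspace sketch of the check only; the function name is hypothetical, and the kernel applies the bounds through its sysctl machinery rather than through a helper like this.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical validator: a somaxconn value is acceptable if it is not
 * negative and fits in the unsigned short backlog field, that is, if it
 * does not exceed USHRT_MAX (65535 on common platforms).
 */
bool somaxconn_valid(long value)
{
    return value >= 0 && value <= USHRT_MAX;
}
```

Applied to the values in the before/after transcript, 256 passes while 65536 and -100 are rejected.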
    • sis900: Fix the tx queue timeout issue · 3508ea33
      Denis Kirjanov authored
      [  198.720048] ------------[ cut here ]------------
      [  198.720108] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:255 dev_watchdog+0x229/0x240()
      [  198.720118] NETDEV WATCHDOG: eth0 (sis900): transmit queue 0 timed out
      [  198.720125] Modules linked in: bridge stp llc dmfe sundance 3c59x sis900 mii
      [  198.720159] CPU: 0 PID: 0 Comm: swapper Not tainted 3.11.0-rc3+ #12
      [  198.720167] Hardware name: System Manufacturer System Name/TUSI-M, BIOS ASUS TUSI-M ACPI BIOS Revision 1013 Beta 001 12/14/2001
      [  198.720175]  000000ff c13fa6b9 c169ddcc c12208d6 c169ddf8 c1031e4d c1664a84 c169de24
      [  198.720197]  00000000 c165f5ea 000000ff c13fa6b9 00000001 000000ff c1664a84 c169de10
      [  198.720217]  c1031f13 00000009 c169de08 c1664a84 c169de24 c169de50 c13fa6b9 c165f5ea
      [  198.720240] Call Trace:
      [  198.720257]  [<c13fa6b9>] ? dev_watchdog+0x229/0x240
      [  198.720274]  [<c12208d6>] dump_stack+0x16/0x20
      [  198.720306]  [<c1031e4d>] warn_slowpath_common+0x7d/0xa0
      [  198.720318]  [<c13fa6b9>] ? dev_watchdog+0x229/0x240
      [  198.720330]  [<c1031f13>] warn_slowpath_fmt+0x33/0x40
      [  198.720342]  [<c13fa6b9>] dev_watchdog+0x229/0x240
      [  198.720357]  [<c103f158>] call_timer_fn+0x78/0x150
      [  198.720369]  [<c103f0e0>] ? internal_add_timer+0x40/0x40
      [  198.720381]  [<c13fa490>] ? dev_init_scheduler+0xa0/0xa0
      [  198.720392]  [<c103f33f>] run_timer_softirq+0x10f/0x200
      [  198.720412]  [<c103954f>] ? __do_softirq+0x6f/0x210
      [  198.720424]  [<c13fa490>] ? dev_init_scheduler+0xa0/0xa0
      [  198.720435]  [<c1039598>] __do_softirq+0xb8/0x210
      [  198.720467]  [<c14b54d2>] ? _raw_spin_unlock+0x22/0x30
      [  198.720484]  [<c1003245>] ? handle_irq+0x25/0xd0
      [  198.720496]  [<c1039c0c>] irq_exit+0x9c/0xb0
      [  198.720508]  [<c14bc9d7>] do_IRQ+0x47/0x94
      [  198.720534]  [<c1056078>] ? hrtimer_start+0x28/0x30
      [  198.720564]  [<c14bc8b1>] common_interrupt+0x31/0x38
      [  198.720589]  [<c1008692>] ? default_idle+0x22/0xa0
      [  198.720600]  [<c10083c7>] arch_cpu_idle+0x17/0x30
      [  198.720631]  [<c106d23d>] cpu_startup_entry+0xcd/0x180
      [  198.720643]  [<c14ae30a>] rest_init+0xaa/0xb0
      [  198.720654]  [<c14ae260>] ? reciprocal_value+0x50/0x50
      [  198.720668]  [<c17044e0>] ? repair_env_string+0x60/0x60
      [  198.720679]  [<c1704bda>] start_kernel+0x29a/0x350
      [  198.720690]  [<c17044e0>] ? repair_env_string+0x60/0x60
      [  198.720721]  [<c1704269>] i386_start_kernel+0x39/0xa0
      [  198.720729] ---[ end trace 81e0a6266f5c73a8 ]---
      [  198.720740] eth0: Transmit timeout, status 00000204 00000000
      
      The timer routine checks the link status and, if it is up, calls
      netif_carrier_on(), allowing the upper layer to start the tx queue
      even if the auto-negotiation process is not finished.
      
      Also remove the ugly auto-negotiation check from sis900_start_xmit().
      
      CC: Duan Fugang <B38611@freescale.com>
      CC: Ben Hutchings <bhutchings@solarflare.com>
      Signed-off-by: Denis Kirjanov <kda@linux-powerpc.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
    • Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband · abe03080
      Linus Torvalds authored
      Pull infiniband/rdma fixes from Roland Dreier:
       - Fixes for the newly merged mlx5 hardware driver
       - Stack info leak fixes from Dan Carpenter
       - Fixes for pkey table handling with SR-IOV
       - A few other small things
      
      * tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
        IPoIB: Fix pkey change flow for virtualization environments
        IPoIB: Make sure child devices use valid/proper pkeys
        IB/core: Create QP1 using the pkey index which contains the default pkey
        mlx5_core: Variable may be used uninitialized
        mlx5_core: Implement new initialization sequence
        mlx5_core: Fix use after free in mlx5_cmd_comp_handler()
        IB/mlx5: Fix stack info leak in mlx5_ib_alloc_ucontext()
        IB/mlx5: Fix error return code in init_one()
        IB/mlx4: Use default pkey when creating tunnel QPs
        RDMA/cma: Only call cma_save_ib_info() for CM REQs
        RDMA/cma: Fix accessing invalid private data for UD
        RDMA/cma: Fix gcc warning
        Revert "RDMA/nes: Fix compilation error when nes_debug is enabled"
        IB/qib: Add err_decode() call for ring dump
        RDMA/cxgb3: Fix stack info leak in iwch_create_cq()
        RDMA/nes: Fix info leaks in nes_create_qp() and nes_create_cq()
        RDMA/ocrdma: Fix several stack info leaks
        RDMA/cxgb4: Fix stack info leak in c4iw_create_qp()
        RDMA/ocrdma: Remove unused include
    • Merge tag 'gpio-for-v3.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio · 1cb39a6c
      Linus Torvalds authored
      Pull GPIO fixes from Linus Walleij:
       "Yet another GPIO pull request, fixing the fix from the last one.  It
        turns out that fixing the boot path for device tree boots on OMAP
        breaks our antique systems (such as OMAP1) and we need to find a
        better way.  So we're reverting that "fix" for the moment and thinking
        about something better.
      
        Also fixing a build issue on the MSM driver"
      
      * tag 'gpio-for-v3.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
        gpio_msm: Fix build error due to missing err.h
        Revert "gpio/omap: don't create an IRQ mapping for every GPIO on DT"
        Revert "gpio/omap: auto request GPIO as input if used as IRQ via DT"
        Revert "gpio/omap: fix build error when OF_GPIO is not defined."
    • net: rtm_to_ifaddr: free ifa if ifa_cacheinfo processing fails · 446266b0
      Daniel Borkmann authored
      Commit 5c766d64 ("ipv4: introduce address lifetime") leaves the ifa
      resource that was allocated via inet_alloc_ifa() unfreed when returning
      the function with -EINVAL. Thus, free it first via inet_free_ifa().
      Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
      Reviewed-by: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
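The leak pattern fixed by the rtm_to_ifaddr commit above (allocate, fail validation, return -EINVAL without freeing) is easy to reproduce in userspace. The sketch below uses `malloc()`/`free()` as stand-ins for `inet_alloc_ifa()`/`inet_free_ifa()`; the struct, the function name, and the specific lifetime check are assumptions for illustration, not the kernel's code.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the address object being built. */
struct ifaddr_sketch {
    unsigned int valid_lft;
    unsigned int preferred_lft;
};

/*
 * Builds an address object. On success returns 0 and stores the object
 * in *out (caller frees it). On failure returns a negative errno value,
 * sets *out to NULL, and leaves nothing allocated.
 */
int build_ifaddr(unsigned int valid_lft, unsigned int preferred_lft,
                 struct ifaddr_sketch **out)
{
    struct ifaddr_sketch *ifa = malloc(sizeof(*ifa));

    *out = NULL;
    if (!ifa)
        return -ENOMEM;

    /* Illustrative validation that runs after the allocation. */
    if (valid_lft == 0 || preferred_lft > valid_lft) {
        free(ifa);          /* the fix: release before the early return */
        return -EINVAL;
    }

    ifa->valid_lft = valid_lft;
    ifa->preferred_lft = preferred_lft;
    *out = ifa;
    return 0;
}
```

Without the `free()` on the error path, every rejected request would leak one allocation.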
    • r8169: remove "PHY reset until link up" log spam · 9bb8eeb5
      Lekensteyn authored
      This message was added in commit a7154cb8 (June 2004, [PATCH] r8169:
      link handling and phy reset rework) and is printed every ten seconds
      when no cable is connected and runtime power management is disabled.
      (Before that commit, "Reset RTL8169s PHY" would be printed instead.)
      Signed-off-by: Peter Wu <lekensteyn@gmail.com>
      Acked-by: Francois Romieu <romieu@fr.zoreil.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>