1. 23 Sep, 2014 11 commits
    • Chao Yu's avatar
      f2fs: skip punching hole in special condition · 14cecc5c
      Chao Yu authored
      Now punching hole in directory is not supported in f2fs, so let's limit file
      type in punch_hole().
      
      In addition, in punch_hole if offset is exceed file size, we should skip
      punching hole.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      14cecc5c
    • Chao Yu's avatar
      f2fs: support large sector size · 55cf9cb6
      Chao Yu authored
      Block size in f2fs is 4096 bytes, so theoretically, f2fs can support 4096 bytes
      sector device at maximum. But now f2fs only support 512 bytes size sector, so
      block device such as zRAM which uses page cache as its block storage space will
      not be mounted successfully as mismatch between sector size of zRAM and sector
      size of f2fs supported.
      
      In this patch we support large sector size in f2fs, so block device with sector
      size of 512/1024/2048/4096 bytes can be supported in f2fs.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      55cf9cb6
    • Chao Yu's avatar
      f2fs: fix to truncate blocks past EOF in ->setattr · 09db6a2e
      Chao Yu authored
      By using FALLOC_FL_KEEP_SIZE in ->fallocate of f2fs, we can fallocate block past
      EOF without changing i_size of inode. These blocks past EOF will not be
      truncated in ->setattr as we truncate them only when change the file size.
      
      We should give a chance to truncate blocks out of filesize in setattr().
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      09db6a2e
    • Jaegeuk Kim's avatar
      f2fs: update i_size when __allocate_data_block · 976e4c50
      Jaegeuk Kim authored
      The f2fs_direct_IO uses __allocate_data_block, but inside the allocation path,
      we should update i_size at the changed time to update its inode page.
      Otherwise, we can get wrong i_size after roll-forward recovery.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      976e4c50
    • Jaegeuk Kim's avatar
      f2fs: use MAX_BIO_BLOCKS(sbi) · 90a893c7
      Jaegeuk Kim authored
      This patch cleans up a simple macro.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      90a893c7
    • Jaegeuk Kim's avatar
      f2fs: remove redundant operation during roll-forward recovery · c52e1b10
      Jaegeuk Kim authored
      If same data is updated multiple times, we don't need to redo whole the
      operations.
      Let's just update the lastest one.
      Reviewed-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      c52e1b10
    • Jaegeuk Kim's avatar
      f2fs: do not skip latest inode information · 19c9c466
      Jaegeuk Kim authored
      In f2fs_sync_file, if there is no written appended writes, it skips
      to write its node blocks.
      But, if there is up-to-date inode page, we should write it to update
      its metadata during the roll-forward recovery.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      19c9c466
    • Jaegeuk Kim's avatar
      f2fs: fix roll-forward missing scenarios · 441ac5cb
      Jaegeuk Kim authored
      We can summarize the roll forward recovery scenarios as follows.
      
      [Term] F: fsync_mark, D: dentry_mark
      
      1. inode(x) | CP | inode(x) | dnode(F)
      -> Update the latest inode(x).
      
      2. inode(x) | CP | inode(F) | dnode(F)
      -> No problem.
      
      3. inode(x) | CP | dnode(F) | inode(x)
      -> Recover to the latest dnode(F), and drop the last inode(x)
      
      4. inode(x) | CP | dnode(F) | inode(F)
      -> No problem.
      
      5. CP | inode(x) | dnode(F)
      -> The inode(DF) was missing. Should drop this dnode(F).
      
      6. CP | inode(DF) | dnode(F)
      -> No problem.
      
      7. CP | dnode(F) | inode(DF)
      -> If f2fs_iget fails, then goto next to find inode(DF).
      
      8. CP | dnode(F) | inode(x)
      -> If f2fs_iget fails, then goto next to find inode(DF).
         But it will fail due to no inode(DF).
      
      So, this patch adds some missing points such as #1, #5, #7, and #8.
      Signed-off-by: default avatarHuang Ying <ying.huang@intel.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      441ac5cb
    • Jaegeuk Kim's avatar
      f2fs: fix conditions to remain recovery information in f2fs_sync_file · 88bd02c9
      Jaegeuk Kim authored
      This patch revisited whole the recovery information during the f2fs_sync_file.
      
      In this patch, there are three information to make a decision.
      
      a) IS_CHECKPOINTED,	/* is it checkpointed before? */
      b) HAS_FSYNCED_INODE,	/* is the inode fsynced before? */
      c) HAS_LAST_FSYNC,	/* has the latest node fsync mark? */
      
      And, the scenarios for our rule are based on:
      
      [Term] F: fsync_mark, D: dentry_mark
      
      1. inode(x) | CP | inode(x) | dnode(F)
      2. inode(x) | CP | inode(F) | dnode(F)
      3. inode(x) | CP | dnode(F) | inode(x) | inode(F)
      4. inode(x) | CP | dnode(F) | inode(F)
      5. CP | inode(x) | dnode(F) | inode(DF)
      6. CP | inode(DF) | dnode(F)
      7. CP | dnode(F) | inode(DF)
      8. CP | dnode(F) | inode(x) | inode(DF)
      
      For example, #3, the three conditions should be changed as follows.
      
         inode(x) | CP | dnode(F) | inode(x) | inode(F)
      a)    x       o      o          o          o
      b)    x       x      x          x          o
      c)    x       o      o          x          o
      
      If f2fs_sync_file stops   ------^,
       it should write inode(F)    --------------^
      
      So, the need_inode_block_update should return true, since
       c) get_nat_flag(e, HAS_LAST_FSYNC), is false.
      
      For example, #8,
            CP | alloc | dnode(F) | inode(x) | inode(DF)
      a)    o      x        x          x          x
      b)    x               x          x          o
      c)    o               o          x          o
      
      If f2fs_sync_file stops   -------^,
       it should write inode(DF)    --------------^
      
      Note that, the roll-forward policy should follow this rule, which means,
      if there are any missing blocks, we doesn't need to recover that inode.
      Signed-off-by: default avatarHuang Ying <ying.huang@intel.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      88bd02c9
    • Jaegeuk Kim's avatar
      f2fs: introduce a flag to represent each nat entry information · 7ef35e3b
      Jaegeuk Kim authored
      This patch introduces a flag in the nat entry structure to merge various
      information such as checkpointed and fsync_done marks.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      7ef35e3b
    • Jaegeuk Kim's avatar
      f2fs: use meta_inode cache to improve roll-forward speed · 4c521f49
      Jaegeuk Kim authored
      Previously, all the dnode pages should be read during the roll-forward recovery.
      Even worsely, whole the chain was traversed twice.
      This patch removes that redundant and costly read operations by using page cache
      of meta_inode and readahead function as well.
      Reviewed-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      4c521f49
  2. 16 Sep, 2014 5 commits
  3. 11 Sep, 2014 1 commit
  4. 09 Sep, 2014 10 commits
    • Jaegeuk Kim's avatar
      f2fs: fix negative value for lseek offset · 0b4c5afd
      Jaegeuk Kim authored
      If application throws negative value of lseek with SEEK_DATA|SEEK_HOLE,
      previous f2fs went into BUG_ON in get_dnode_of_data, which was reported
      by Tommi Rantala.
      
      He could make a simple code to detect this having:
      	lseek(fd, -17595150933902LL, SEEK_DATA);
      
      This patch should resolve that bug.
      Reported-by: default avatarTommi Rentala <tt.rantala@gmail.com>
      [Jaegeuk Kim: relocate the condition as suggested by Chao]
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      0b4c5afd
    • Huang Ying's avatar
      f2fs: avoid node page to be written twice in gc_node_segment · 9a01b56b
      Huang Ying authored
      In gc_node_segment, if node page gc is run concurrently with node page
      writeback, and check_valid_map and get_node_page run after page locked
      and before cur_valid_map is updated as below, it is possible for the
      page to be written twice unnecessarily.
      
      			sync_node_pages
      			  try_lock_page
      			  ...
      check_valid_map		  f2fs_write_node_page
      			    ...
      			    write_node_page
      			      do_write_page
      			        allocate_data_block
      				  ...
      				  refresh_sit_entry /* update cur_valid_map */
      				  ...
      			    ...
      			    unlock_page
      get_node_page
      ...
      set_page_dirty
      ...
      f2fs_put_page
        unlock_page
      
      This can be solved via calling check_valid_map after get_node_page again.
      Signed-off-by: default avatarHuang, Ying <ying.huang@intel.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      9a01b56b
    • Gu Zheng's avatar
      f2fs: use lock-less list(llist) to simplify the flush cmd management · 721bd4d5
      Gu Zheng authored
      We use flush cmd control to collect many flush cmds, and flush them
      together. In this case, we use two list to manage the flush cmds
      (collect and dispatch), and one spin lock is used to protect this.
      In fact, the lock-less list(llist) is very suitable to this case,
      and we use simplify this routine.
      
      -
      v2:
      -use llist_for_each_entry_safe to fix possible use-after-free issue.
      -remove the unused field from struct flush_cmd.
      Thanks for Yu's suggestion.
      -
      Signed-off-by: default avatarGu Zheng <guz.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      721bd4d5
    • Chao Yu's avatar
      f2fs: refactor flush_sit_entries codes for reducing SIT writes · 184a5cd2
      Chao Yu authored
      In commit aec71382 ("f2fs: refactor flush_nat_entries codes for reducing NAT
      writes"), we descripte the issue as below:
      
      "Although building NAT journal in cursum reduce the read/write work for NAT
      block, but previous design leave us lower performance when write checkpoint
      frequently for these cases:
      1. if journal in cursum has already full, it's a bit of waste that we flush all
         nat entries to page for persistence, but not to cache any entries.
      2. if journal in cursum is not full, we fill nat entries to journal util
         journal is full, then flush the left dirty entries to disk without merge
         journaled entries, so these journaled entries may be flushed to disk at next
         checkpoint but lost chance to flushed last time."
      
      Actually, we have the same problem in using SIT journal area.
      
      In this patch, firstly we will update sit journal with dirty entries as many as
      possible. Secondly if there is no space in sit journal, we will remove all
      entries in journal and walk through the whole dirty entry bitmap of sit,
      accounting dirty sit entries located in same SIT block to sit entry set. All
      entry sets are linked to list sit_entry_set in sm_info, sorted ascending order
      by count of entries in set. Later we flush entries in set which have fewest
      entries into journal as many as we can, and then flush dense set with merged
      entries to disk.
      
      In this way we can use sit journal area more effectively, also we will reduce
      SIT update, result in gaining in performance and saving lifetime of flash
      device.
      
      In my testing environment, it shows this patch can help to reduce SIT block
      update obviously.
      
      virtual machine + hard disk:
      fsstress -p 20 -n 400 -l 5
      		sit page num	cp count	sit pages/cp
      based		2006.50		1349.75		1.486
      patched		1566.25		1463.25		1.070
      
      Our latency of merging op is small when handling a great number of dirty SIT
      entries in flush_sit_entries:
      latency(ns)	dirty sit count
      36038		2151
      49168		2123
      37174		2232
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      184a5cd2
    • Chao Yu's avatar
      f2fs: remove unneeded sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO · d3a14afd
      Chao Yu authored
      sit_i in macro SIT_BLOCK_OFFSET/START_SEGNO is not used, remove it.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      d3a14afd
    • Jaegeuk Kim's avatar
      f2fs: need fsck.f2fs if the recovery was failed · b0c44f05
      Jaegeuk Kim authored
      If the roll-forward recovery was failed, we'd better conduct fsck.f2fs.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      b0c44f05
    • Jaegeuk Kim's avatar
      f2fs: handle bug cases by letting fsck.f2fs initiate · ec325b52
      Jaegeuk Kim authored
      This patch adds to handle corner buggy cases for fsck.f2fs.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      ec325b52
    • Jaegeuk Kim's avatar
      f2fs: add BUG cases to initiate fsck.f2fs · 05796763
      Jaegeuk Kim authored
      This patch replaces BUG cases with f2fs_bug_on to remain fsck.f2fs information.
      And it implements some void functions to initiate fsck.f2fs too.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      05796763
    • Jaegeuk Kim's avatar
      f2fs: need fsck.f2fs when f2fs_bug_on is triggered · 9850cf4a
      Jaegeuk Kim authored
      If any f2fs_bug_on is triggered, fsck.f2fs is needed.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      9850cf4a
    • Jaegeuk Kim's avatar
      f2fs: retain inconsistency information to initiate fsck.f2fs · 2ae4c673
      Jaegeuk Kim authored
      This patch adds sbi->need_fsck to conduct fsck.f2fs later.
      This flag can only be removed by fsck.f2fs.
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      2ae4c673
  5. 04 Sep, 2014 1 commit
  6. 03 Sep, 2014 10 commits
  7. 02 Sep, 2014 2 commits
    • Jiri Kosina's avatar
      Revert "leds: convert blink timer to workqueue" · 9067359f
      Jiri Kosina authored
      This reverts commit 8b37e1be.
      
      It's broken as it changes led_blink_set() in a way that it can now sleep
      (while synchronously waiting for workqueue to be cancelled). That's a
      problem, because it's possible that this function gets called from atomic
      context (tpt_trig_timer() takes a readlock and thus disables preemption).
      
      This has been brought up 3 weeks ago already [1] but no proper fix has
      materialized, and I keep seeing the problem since 3.17-rc1.
      
      [1] https://lkml.org/lkml/2014/8/16/128
      
       BUG: sleeping function called from invalid context at kernel/workqueue.c:2650
       in_atomic(): 1, irqs_disabled(): 0, pid: 2335, name: wpa_supplicant
       5 locks held by wpa_supplicant/2335:
        #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff814c7c92>] rtnl_lock+0x12/0x20
        #1:  (&wdev->mtx){+.+.+.}, at: [<ffffffffc06e649c>] cfg80211_mgd_wext_siwessid+0x5c/0x180 [cfg80211]
        #2:  (&local->mtx){+.+.+.}, at: [<ffffffffc0817dea>] ieee80211_prep_connection+0x17a/0x9a0 [mac80211]
        #3:  (&local->chanctx_mtx){+.+.+.}, at: [<ffffffffc08081ed>] ieee80211_vif_use_channel+0x5d/0x2a0 [mac80211]
        #4:  (&trig->leddev_list_lock){.+.+..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211]
       CPU: 0 PID: 2335 Comm: wpa_supplicant Not tainted 3.17.0-rc3 #1
       Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008
        ffff8800360b5a50 ffff8800751f76d8 ffffffff8159e97f ffff8800360b5a30
        ffff8800751f76e8 ffffffff810739a5 ffff8800751f77b0 ffffffff8106862f
        ffffffff810685d0 0aa2209200000000 ffff880000000004 ffff8800361c59d0
       Call Trace:
        [<ffffffff8159e97f>] dump_stack+0x4d/0x66
        [<ffffffff810739a5>] __might_sleep+0xe5/0x120
        [<ffffffff8106862f>] flush_work+0x5f/0x270
        [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80
        [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90
        [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100
        [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
        [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100
        [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10
        [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40
        [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211]
        [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211]
        [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211]
        [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211]
        [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211]
        [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211]
        [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211]
        [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211]
        [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211]
        [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211]
        [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211]
        [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211]
        [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211]
        [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211]
        [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211]
        [<ffffffffc06e36c0>] ? cfg80211_wext_giwessid+0x50/0x50 [cfg80211]
        [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211]
        [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0
        [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
        [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0
        [<ffffffff81584fa0>] ? ioctl_standard_iw_point+0x3e0/0x3e0
        [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100
        [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0
        [<ffffffff814cfb29>] dev_ioctl+0x329/0x620
        [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
        [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0
        [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520
        [<ffffffff815a67fb>] ? sysret_check+0x1b/0x56
        [<ffffffff810946ed>] ? trace_hardirqs_on_caller+0xfd/0x1c0
        [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0
        [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f
       wlan0: send auth to 00:0b:6b:3c:8c:e4 (try 1/3)
       wlan0: authenticated
       wlan0: associate with 00:0b:6b:3c:8c:e4 (try 1/3)
       wlan0: RX AssocResp from 00:0b:6b:3c:8c:e4 (capab=0x431 status=0 aid=2)
       wlan0: associated
       IPv6: ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
       cfg80211: Calling CRDA for country: NA
       wlan0: Limiting TX power to 27 (27 - 0) dBm as advertised by 00:0b:6b:3c:8c:e4
      
       =================================
       [ INFO: inconsistent lock state ]
       3.17.0-rc3 #1 Not tainted
       ---------------------------------
       inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
       swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
        ((&(&led_cdev->blink_work)->work)){+.?...}, at: [<ffffffff810685d0>] flush_work+0x0/0x270
       {SOFTIRQ-ON-W} state was registered at:
         [<ffffffff81094dbe>] __lock_acquire+0x30e/0x1a30
         [<ffffffff81096c81>] lock_acquire+0x91/0x110
         [<ffffffff81068608>] flush_work+0x38/0x270
         [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100
         [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10
         [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40
         [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211]
         [<ffffffffc081ecdd>] ieee80211_mod_tpt_led_trig+0x9d/0x160 [mac80211]
         [<ffffffffc07e4278>] __ieee80211_recalc_idle+0x98/0x140 [mac80211]
         [<ffffffffc07e59ce>] ieee80211_idle_off+0xe/0x10 [mac80211]
         [<ffffffffc0804e5b>] ieee80211_add_chanctx+0x3b/0x220 [mac80211]
         [<ffffffffc08062e4>] ieee80211_new_chanctx+0x44/0xf0 [mac80211]
         [<ffffffffc080838a>] ieee80211_vif_use_channel+0x1fa/0x2a0 [mac80211]
         [<ffffffffc0817df8>] ieee80211_prep_connection+0x188/0x9a0 [mac80211]
         [<ffffffffc081c246>] ieee80211_mgd_auth+0x256/0x2e0 [mac80211]
         [<ffffffffc07eab33>] ieee80211_auth+0x13/0x20 [mac80211]
         [<ffffffffc06cb006>] cfg80211_mlme_auth+0x106/0x270 [cfg80211]
         [<ffffffffc06ce085>] cfg80211_conn_do_work+0x155/0x3b0 [cfg80211]
         [<ffffffffc06cf670>] cfg80211_connect+0x3f0/0x540 [cfg80211]
         [<ffffffffc06e6148>] cfg80211_mgd_wext_connect+0x158/0x1f0 [cfg80211]
         [<ffffffffc06e651e>] cfg80211_mgd_wext_siwessid+0xde/0x180 [cfg80211]
         [<ffffffffc06e36dd>] cfg80211_wext_siwessid+0x1d/0x40 [cfg80211]
         [<ffffffff81584d0c>] ioctl_standard_iw_point+0x14c/0x3e0
         [<ffffffff8158502a>] ioctl_standard_call+0x8a/0xd0
         [<ffffffff81584b76>] wireless_process_ioctl.constprop.10+0xb6/0x100
         [<ffffffff8158521d>] wext_handle_ioctl+0x5d/0xb0
         [<ffffffff814cfb29>] dev_ioctl+0x329/0x620
         [<ffffffff8149c7f2>] sock_ioctl+0x142/0x2e0
         [<ffffffff811b0140>] do_vfs_ioctl+0x300/0x520
         [<ffffffff811b03e1>] SyS_ioctl+0x81/0xa0
         [<ffffffff815a67d6>] system_call_fastpath+0x1a/0x1f
       irq event stamp: 493416
       hardirqs last  enabled at (493416): [<ffffffff81068a5f>] __cancel_work_timer+0x6f/0x100
       hardirqs last disabled at (493415): [<ffffffff81067e9f>] try_to_grab_pending+0x1f/0x160
       softirqs last  enabled at (493408): [<ffffffff81053ced>] _local_bh_enable+0x1d/0x50
       softirqs last disabled at (493409): [<ffffffff81054c75>] irq_exit+0xa5/0xb0
      
       other info that might help us debug this:
        Possible unsafe locking scenario:
      
              CPU0
              ----
         lock((&(&led_cdev->blink_work)->work));
         <Interrupt>
           lock((&(&led_cdev->blink_work)->work));
      
        *** DEADLOCK ***
      
       2 locks held by swapper/0/0:
        #0:  (((&tpt_trig->timer))){+.-...}, at: [<ffffffff810b4c50>] call_timer_fn+0x0/0x180
        #1:  (&trig->leddev_list_lock){.+.?..}, at: [<ffffffffc081e68c>] tpt_trig_timer+0xec/0x170 [mac80211]
      
       stack backtrace:
       CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.17.0-rc3 #1
       Hardware name: LENOVO 7470BN2/7470BN2, BIOS 6DET38WW (2.02 ) 12/19/2008
        ffffffff8246eb30 ffff88007c203b00 ffffffff8159e97f ffffffff81a194c0
        ffff88007c203b50 ffffffff81599c29 0000000000000001 ffffffff00000001
        ffff880000000000 0000000000000006 ffffffff81a194c0 ffffffff81093ad0
       Call Trace:
        <IRQ>  [<ffffffff8159e97f>] dump_stack+0x4d/0x66
        [<ffffffff81599c29>] print_usage_bug+0x1f4/0x205
        [<ffffffff81093ad0>] ? check_usage_backwards+0x140/0x140
        [<ffffffff810944d3>] mark_lock+0x223/0x2b0
        [<ffffffff81094d60>] __lock_acquire+0x2b0/0x1a30
        [<ffffffff81096c81>] lock_acquire+0x91/0x110
        [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80
        [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
        [<ffffffff81068608>] flush_work+0x38/0x270
        [<ffffffff810685d0>] ? mod_delayed_work_on+0x80/0x80
        [<ffffffff810945ca>] ? mark_held_locks+0x6a/0x90
        [<ffffffff81068a5f>] ? __cancel_work_timer+0x6f/0x100
        [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
        [<ffffffff8109469d>] ? trace_hardirqs_on_caller+0xad/0x1c0
        [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
        [<ffffffff81068a6b>] __cancel_work_timer+0x7b/0x100
        [<ffffffff81068b0e>] cancel_delayed_work_sync+0xe/0x10
        [<ffffffff8147cf3b>] led_blink_set+0x1b/0x40
        [<ffffffffc081e6b0>] tpt_trig_timer+0x110/0x170 [mac80211]
        [<ffffffff810b4cc5>] call_timer_fn+0x75/0x180
        [<ffffffff810b4c50>] ? process_timeout+0x10/0x10
        [<ffffffffc081e5a0>] ? __ieee80211_get_rx_led_name+0x10/0x10 [mac80211]
        [<ffffffff810b50ac>] run_timer_softirq+0x1fc/0x2f0
        [<ffffffff81054805>] __do_softirq+0x115/0x2e0
        [<ffffffff81054c75>] irq_exit+0xa5/0xb0
        [<ffffffff810049b3>] do_IRQ+0x53/0xf0
        [<ffffffff815a74af>] common_interrupt+0x6f/0x6f
        <EOI>  [<ffffffff8147b56e>] ? cpuidle_enter_state+0x6e/0x180
        [<ffffffff8147b732>] cpuidle_enter+0x12/0x20
        [<ffffffff8108bba0>] cpu_startup_entry+0x330/0x360
        [<ffffffff8158fb51>] rest_init+0xc1/0xd0
        [<ffffffff8158fa90>] ? csum_partial_copy_generic+0x170/0x170
        [<ffffffff81af3ff2>] start_kernel+0x44f/0x45a
        [<ffffffff81af399c>] ? set_init_arg+0x53/0x53
        [<ffffffff81af35ad>] x86_64_start_reservations+0x2a/0x2c
        [<ffffffff81af36a0>] x86_64_start_kernel+0xf1/0xf4
      
      Cc: Vincent Donnefort <vdonnefort@gmail.com>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarJiri Kosina <jkosina@suse.cz>
      Signed-off-by: default avatarBryan Wu <cooloney@gmail.com>
      9067359f
    • Chao Yu's avatar
      f2fs: reposition unlock_new_inode to prevent accessing invalid inode · b73e5282
      Chao Yu authored
      As the race condition on the inode cache, following scenario can appear:
      [Thread a]				[Thread b]
      					->f2fs_mkdir
      					  ->f2fs_add_link
      					    ->__f2fs_add_link
      					      ->init_inode_metadata failed here
      ->gc_thread_func
        ->f2fs_gc
          ->do_garbage_collect
            ->gc_data_segment
              ->f2fs_iget
                ->iget_locked
                  ->wait_on_inode
      					  ->unlock_new_inode
              ->move_data_page
      					  ->make_bad_inode
      					  ->iput
      
      When we fail in create/symlink/mkdir/mknod/tmpfile, the new allocated inode
      should be set as bad to avoid being accessed by other thread. But in above
      scenario, it allows f2fs to access the invalid inode before this inode was set
      as bad.
      This patch fix the potential problem, and this issue was found by code review.
      
      change log from v1:
       o Add condition judgment in gc_data_segment() suggested by Changman Lee.
       o use iget_failed to simplify code.
      Signed-off-by: default avatarChao Yu <chao2.yu@samsung.com>
      Signed-off-by: default avatarJaegeuk Kim <jaegeuk@kernel.org>
      b73e5282