1. 25 Jul, 2013 19 commits
    • Eric Paris's avatar
      SELinux: change sbsec->behavior to short · f936c6e5
      Eric Paris authored
      We only have 6 options, so char is good enough, but use a short as that
      packs nicely.  This shrinks the superblock_security_struct just a little
      bit.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      f936c6e5
    • Eric Paris's avatar
      SELinux: renumber the superblock options · cfca0303
      Eric Paris authored
      Just to make it clear that we have mount time options and flags,
      separate them.  Since I decided to move the non-mount options above
      above 0x10, we need a short instead of a char.  (x86 padding says
      this takes up no additional space as we have a 3byte whole in the
      structure)
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      cfca0303
    • Eric Paris's avatar
      SELinux: do all flags twiddling in one place · eadcabc6
      Eric Paris authored
      Currently we set the initialize and seclabel flag in one place.  Do some
      unrelated printk then we unset the seclabel flag.  Eww.  Instead do the flag
      twiddling in one place in the code not seperated by unrelated printk.  Also
      don't set and unset the seclabel flag.  Only set it if we need to.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      eadcabc6
    • Eric Paris's avatar
      SELinux: rename SE_SBLABELSUPP to SBLABEL_MNT · 12f348b9
      Eric Paris authored
      Just a flag rename as we prepare to make it not so special.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      12f348b9
    • Eric Paris's avatar
      SELinux: use define for number of bits in the mnt flags mask · af8e50cc
      Eric Paris authored
      We had this random hard coded value of '8' in the code (I put it there)
      for the number of bits to check for mount options.  This is stupid.  Instead
      use the #define we already have which tells us the number of mount
      options.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      af8e50cc
    • Eric Paris's avatar
      SELinux: make it harder to get the number of mnt opts wrong · d355987f
      Eric Paris authored
      Instead of just hard coding a value, use the enum to out benefit.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      d355987f
    • Eric Paris's avatar
      SELinux: remove crazy contortions around proc · 40d3d0b8
      Eric Paris authored
      We check if the fsname is proc and if so set the proc superblock security
      struct flag.  We then check if the flag is set and use the string 'proc'
      for the fsname instead of just using the fsname.  What's the point?  It's
      always proc...  Get rid of the useless conditional.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      40d3d0b8
    • Eric Paris's avatar
      SELinux: fix selinuxfs policy file on big endian systems · b138004e
      Eric Paris authored
      The /sys/fs/selinux/policy file is not valid on big endian systems like
      ppc64 or s390.  Let's see why:
      
      static int hashtab_cnt(void *key, void *data, void *ptr)
      {
      	int *cnt = ptr;
      	*cnt = *cnt + 1;
      
      	return 0;
      }
      
      static int range_write(struct policydb *p, void *fp)
      {
      	size_t nel;
      [...]
      	/* count the number of entries in the hashtab */
      	nel = 0;
      	rc = hashtab_map(p->range_tr, hashtab_cnt, &nel);
      	if (rc)
      		return rc;
      	buf[0] = cpu_to_le32(nel);
      	rc = put_entry(buf, sizeof(u32), 1, fp);
      
      So size_t is 64 bits.  But then we pass a pointer to it as we do to
      hashtab_cnt.  hashtab_cnt thinks it is a 32 bit int and only deals with
      the first 4 bytes.  On x86_64 which is little endian, those first 4
      bytes and the least significant, so this works out fine.  On ppc64/s390
      those first 4 bytes of memory are the high order bits.  So at the end of
      the call to hashtab_map nel has a HUGE number.  But the least
      significant 32 bits are all 0's.
      
      We then pass that 64 bit number to cpu_to_le32() which happily truncates
      it to a 32 bit number and does endian swapping.  But the low 32 bits are
      all 0's.  So no matter how many entries are in the hashtab, big endian
      systems always say there are 0 entries because I screwed up the
      counting.
      
      The fix is easy.  Use a 32 bit int, as the hashtab_cnt expects, for nel.
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      b138004e
    • Stephen Smalley's avatar
      SELinux: Enable setting security contexts on rootfs inodes. · 5c73fceb
      Stephen Smalley authored
      rootfs (ramfs) can support setting of security contexts
      by userspace due to the vfs fallback behavior of calling
      the security module to set the in-core inode state
      for security.* attributes when the filesystem does not
      provide an xattr handler.  No xattr handler required
      as the inodes are pinned in memory and have no backing
      store.
      
      This is useful in allowing early userspace to label individual
      files within a rootfs while still providing a policy-defined
      default via genfs.
      Signed-off-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      5c73fceb
    • Waiman Long's avatar
      SELinux: Increase ebitmap_node size for 64-bit configuration · a767f680
      Waiman Long authored
      Currently, the ebitmap_node structure has a fixed size of 32 bytes. On
      a 32-bit system, the overhead is 8 bytes, leaving 24 bytes for being
      used as bitmaps. The overhead ratio is 1/4.
      
      On a 64-bit system, the overhead is 16 bytes. Therefore, only 16 bytes
      are left for bitmap purpose and the overhead ratio is 1/2. With a
      3.8.2 kernel, a boot-up operation will cause the ebitmap_get_bit()
      function to be called about 9 million times. The average number of
      ebitmap_node traversal is about 3.7.
      
      This patch increases the size of the ebitmap_node structure to 64
      bytes for 64-bit system to keep the overhead ratio at 1/4. This may
      also improve performance a little bit by making node to node traversal
      less frequent (< 2) as more bits are available in each node.
      Signed-off-by: default avatarWaiman Long <Waiman.Long@hp.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      a767f680
    • Waiman Long's avatar
      SELinux: Reduce overhead of mls_level_isvalid() function call · fee71142
      Waiman Long authored
      While running the high_systime workload of the AIM7 benchmark on
      a 2-socket 12-core Westmere x86-64 machine running 3.10-rc4 kernel
      (with HT on), it was found that a pretty sizable amount of time was
      spent in the SELinux code. Below was the perf trace of the "perf
      record -a -s" of a test run at 1500 users:
      
        5.04%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
        1.96%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
        1.95%            ls  [kernel.kallsyms]     [k] find_next_bit
      
      The ebitmap_get_bit() was the hottest function in the perf-report
      output.  Both the ebitmap_get_bit() and find_next_bit() functions
      were, in fact, called by mls_level_isvalid(). As a result, the
      mls_level_isvalid() call consumed 8.95% of the total CPU time of
      all the 24 virtual CPUs which is quite a lot. The majority of the
      mls_level_isvalid() function invocations come from the socket creation
      system call.
      
      Looking at the mls_level_isvalid() function, it is checking to see
      if all the bits set in one of the ebitmap structure are also set in
      another one as well as the highest set bit is no bigger than the one
      specified by the given policydb data structure. It is doing it in
      a bit-by-bit manner. So if the ebitmap structure has many bits set,
      the iteration loop will be done many times.
      
      The current code can be rewritten to use a similar algorithm as the
      ebitmap_contains() function with an additional check for the
      highest set bit. The ebitmap_contains() function was extended to
      cover an optional additional check for the highest set bit, and the
      mls_level_isvalid() function was modified to call ebitmap_contains().
      
      With that change, the perf trace showed that the used CPU time drop
      down to just 0.08% (ebitmap_contains + mls_level_isvalid) of the
      total which is about 100X less than before.
      
        0.07%            ls  [kernel.kallsyms]     [k] ebitmap_contains
        0.05%            ls  [kernel.kallsyms]     [k] ebitmap_get_bit
        0.01%            ls  [kernel.kallsyms]     [k] mls_level_isvalid
        0.01%            ls  [kernel.kallsyms]     [k] find_next_bit
      
      The remaining ebitmap_get_bit() and find_next_bit() functions calls
      are made by other kernel routines as the new mls_level_isvalid()
      function will not call them anymore.
      
      This patch also improves the high_systime AIM7 benchmark result,
      though the improvement is not as impressive as is suggested by the
      reduction in CPU time spent in the ebitmap functions. The table below
      shows the performance change on the 2-socket x86-64 system (with HT
      on) mentioned above.
      
      +--------------+---------------+----------------+-----------------+
      |   Workload   | mean % change | mean % change  | mean % change   |
      |              | 10-100 users  | 200-1000 users | 1100-2000 users |
      +--------------+---------------+----------------+-----------------+
      | high_systime |     +0.1%     |     +0.9%      |     +2.6%       |
      +--------------+---------------+----------------+-----------------+
      Signed-off-by: default avatarWaiman Long <Waiman.Long@hp.com>
      Acked-by: default avatarStephen Smalley <sds@tycho.nsa.gov>
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      fee71142
    • Paul Moore's avatar
      selinux: remove the BUG_ON() from selinux_skb_xfrm_sid() · bed4d7ef
      Paul Moore authored
      Remove the BUG_ON() from selinux_skb_xfrm_sid() and propogate the
      error code up to the caller.  Also check the return values in the
      only caller function, selinux_skb_peerlbl_sid().
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      bed4d7ef
    • Paul Moore's avatar
      selinux: cleanup the XFRM header · d1b17b09
      Paul Moore authored
      Remove the unused get_sock_isec() function and do some formatting
      fixes.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      d1b17b09
    • Paul Moore's avatar
      selinux: cleanup selinux_xfrm_decode_session() · e2193695
      Paul Moore authored
      Some basic simplification.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      e2193695
    • Paul Moore's avatar
    • Paul Moore's avatar
      selinux: cleanup selinux_xfrm_sock_rcv_skb() and selinux_xfrm_postroute_last() · eef9b416
      Paul Moore authored
      Some basic simplification and comment reformatting.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      eef9b416
    • Paul Moore's avatar
      selinux: cleanup selinux_xfrm_policy_lookup() and selinux_xfrm_state_pol_flow_match() · 96484348
      Paul Moore authored
      Do some basic simplification and comment reformatting.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      96484348
    • Paul Moore's avatar
      selinux: cleanup and consolidate the XFRM alloc/clone/delete/free code · ccf17cc4
      Paul Moore authored
      The SELinux labeled IPsec code state management functions have been
      long neglected and could use some cleanup and consolidation.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      ccf17cc4
    • Paul Moore's avatar
      lsm: split the xfrm_state_alloc_security() hook implementation · 2e5aa866
      Paul Moore authored
      The xfrm_state_alloc_security() LSM hook implementation is really a
      multiplexed hook with two different behaviors depending on the
      arguments passed to it by the caller.  This patch splits the LSM hook
      implementation into two new hook implementations, which match the
      LSM hooks in the rest of the kernel:
      
       * xfrm_state_alloc
       * xfrm_state_alloc_acquire
      
      Also included in this patch are the necessary changes to the SELinux
      code; no other LSMs are affected.
      Signed-off-by: default avatarPaul Moore <pmoore@redhat.com>
      Signed-off-by: default avatarEric Paris <eparis@redhat.com>
      2e5aa866
  2. 30 Jun, 2013 6 commits
    • Linus Torvalds's avatar
      Linux 3.10 · 8bb495e3
      Linus Torvalds authored
      8bb495e3
    • Linus Torvalds's avatar
      Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · f0277dce
      Linus Torvalds authored
      Pull another powerpc fix from Benjamin Herrenschmidt:
       "I mentioned that while we had fixed the kernel crashes, EEH error
        recovery didn't always recover...  It appears that I had a fix for
        that already in powerpc-next (with a stable CC).
      
        I cherry-picked it today and did a few tests and it seems that things
        now work quite well.  The patch is also pretty simple, so I see no
        reason to wait before merging it."
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc/eeh: Fix fetching bus for single-dev-PE
      f0277dce
    • Linus Torvalds's avatar
      Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi · 4b483802
      Linus Torvalds authored
      Pull SCSI fixes from James Bottomley:
       "This is a set of seven bug fixes.  Several fcoe fixes for locking
        problems, initiator issues and a VLAN API change, all of which could
        eventually lead to data corruption, one fix for a qla2xxx locking
        problem which could lead to multiple completions of the same request
        (and subsequent data corruption) and a use after free in the ipr
        driver.  Plus one minor MAINTAINERS file update"
      
      (only six bugfixes in this pull, since I had already pulled the fcoe API
      fix directly from Robert Love)
      
      * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
        [SCSI] ipr: Avoid target_destroy accessing memory after it was freed
        [SCSI] qla2xxx: Fix for locking issue between driver ISR and mailbox routines
        MAINTAINERS: Fix fcoe mailing list
        libfc: extend ex_lock to protect all of fc_seq_send
        libfc: Correct check for initiator role
        libfcoe: Fix Conflicting FCFs issue in the fabric
      4b483802
    • Gavin Shan's avatar
      powerpc/eeh: Fix fetching bus for single-dev-PE · ea461abf
      Gavin Shan authored
      While running Linux as guest on top of phyp, we possiblly have
      PE that includes single PCI device. However, we didn't return
      its PCI bus correctly and it leads to failure on recovery from
      EEH errors for single-dev-PE. The patch fixes the issue.
      
      Cc: <stable@vger.kernel.org> # v3.7+
      Cc: Steve Best <sbest@us.ibm.com>
      Signed-off-by: default avatarGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ea461abf
    • Linus Torvalds's avatar
      Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc · 6c355bea
      Linus Torvalds authored
      Pull powerpc fixes from Ben Herrenschmidt:
       "We discovered some breakage in our "EEH" (PCI Error Handling) code
        while doing error injection, due to a couple of regressions.  One of
        them is due to a patch (37f02195 "powerpc/pci: fix PCI-e devices
        rescan issue on powerpc platform") that, in hindsight, I shouldn't
        have merged considering that it caused more problems than it solved.
      
        Please pull those two fixes.  One for a simple EEH address cache
        initialization issue.  The other one is a patch from Guenter that I
        had originally planned to put in 3.11 but which happens to also fix
        that other regression (a kernel oops during EEH error handling and
        possibly hotplug).
      
        With those two, the couple of test machines I've hammered with error
        injection are remaining up now.  EEH appears to still fail to recover
        on some devices, so there is another problem that Gavin is looking
        into but at least it's no longer crashing the kernel."
      
      * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
        powerpc/pci: Improve device hotplug initialization
        powerpc/eeh: Add eeh_dev to the cache during boot
      6c355bea
    • Olof Johansson's avatar
      ARM: dt: Only print warning, not WARN() on bad cpu map in device tree · 8d5bc1a6
      Olof Johansson authored
      Due to recent changes and expecations of proper cpu bindings, there are
      now cases for many of the in-tree devicetrees where a WARN() will hit
      on boot due to badly formatted /cpus nodes.
      
      Downgrade this to a pr_warn() to be less alarmist, since it's not a
      new problem.
      
      Tested on Arndale, Cubox, Seaboard and Panda ES. Panda hits the WARN
      without this, the others do not.
      Acked-by: default avatarRussell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: default avatarOlof Johansson <olof@lixom.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8d5bc1a6
  3. 29 Jun, 2013 11 commits
  4. 28 Jun, 2013 4 commits
    • Akira Takeuchi's avatar
      mn10300: Use early_param() to parse "mem=" parameter · e3f12a53
      Akira Takeuchi authored
      This fixes the problem that "init=" options may not be passed to kernel
      correctly.
      
      parse_mem_cmdline() of mn10300 arch gets rid of "mem=" string from
      redboot_command_line. Then init_setup() parses the "init=" options from
      static_command_line, which is a copy of redboot_command_line, and keeps
      the pointer to the init options in execute_command variable.
      
      Since the commit 026cee00 upstream (params: <level>_initcall-like kernel
      parameters), static_command_line becomes overwritten by saved_command_line at
      do_initcall_level(). Notice that saved_command_line is a command line
      which includes "mem=" string.
      
      As a result, execute_command may point to weird string by the length of
      "mem=" parameter.
      I noticed this problem when using the command line like this:
      
          mem=128M console=ttyS0,115200 init=/bin/sh
      
      Here is the processing flow of command line parameters.
          start_kernel()
            setup_arch(&command_line)
               parse_mem_cmdline(cmdline_p)
                 * strcpy(boot_command_line, redboot_command_line);
                 * Remove "mem=xxx" from redboot_command_line.
                 * *cmdline_p = redboot_command_line;
            setup_command_line(command_line) <-- command_line is redboot_command_line
              * strcpy(saved_command_line, boot_command_line)
              * strcpy(static_command_line, command_line)
            parse_early_param()
              strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
              parse_early_options(tmp_cmdline);
                parse_args("early options", cmdline, NULL, 0, 0, 0, do_early_param);
            parse_args("Booting ..", static_command_line, ...);
              init_setup() <-- save the pointer in execute_command
            rest_init()
              kernel_thread(kernel_init, NULL, CLONE_FS | CLONE_SIGHAND);
      
      At this point, execute_command points to "/bin/sh" string.
      
          kernel_init()
            kernel_init_freeable()
              do_basic_setup()
                do_initcalls()
                  do_initcall_level()
                    (*) strcpy(static_command_line, saved_command_line);
      
      Here, execute_command gets to point to "200" string !!
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      e3f12a53
    • Akira Takeuchi's avatar
      mn10300: Allow to pass array name to get_user() · c6dc9f0a
      Akira Takeuchi authored
      This fixes the following compile error:
      
      CC block/scsi_ioctl.o
      block/scsi_ioctl.c: In function 'sg_scsi_ioctl':
      block/scsi_ioctl.c:449: error: invalid initializer
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      c6dc9f0a
    • Dave Airlie's avatar
    • Thadeu Lima de Souza Cascardo's avatar
      powerpc/eeh: Add eeh_dev to the cache during boot · 1abd6018
      Thadeu Lima de Souza Cascardo authored
      commit f8f7d63f ("powerpc/eeh: Trace eeh
      device from I/O cache") broke EEH on pseries for devices that were
      present during boot and have not been hotplugged/DLPARed.
      
      eeh_check_failure will get the eeh_dev from the cache, and will get
      NULL. eeh_addr_cache_build adds the addresses to the cache, but eeh_dev
      for the giving pci_device is not set yet. Just reordering the call to
      eeh_addr_cache_insert_dev works fine. The ordering is similar to the one
      in eeh_add_device_late.
      Signed-off-by: default avatarThadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com>
      Acked-by: default avatarGavin Shan <shangw@linux.vnet.ibm.com>
      Signed-off-by: default avatarBenjamin Herrenschmidt <benh@kernel.crashing.org>
      1abd6018