1. 12 Apr, 2004 40 commits
    • [PATCH] Fix genksyms parsing · f17ea056
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From: Andreas Schwab <schwab@suse.de>

      I'm getting a warning when building for ia64 with MODVERSIONS enabled.
      This is a bug in genksyms: it can't cope with some arguments of
      __typeof__.
      
      The following patch will fix that.  Actually the argument of __typeof__ is
      an abstract declarator, but the genksyms parser has no production for that;
      decl_specifier_seq also matches some invalid constructs, but I don't think
      this is a problem in practice, since the compiler will reject them.
    • [PATCH] Trivial Patch Monkey should be in MAINTAINERS · fa79e47b
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Petri Koistinen <petri.koistinen@iki.fi>
    • [PATCH] Fix firmware loader docs · f333f50d
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Pavel Machek <pavel@ucw.cz>
      
      sysfs should be mounted on /sys these days.
    • [PATCH] i386 irq.c ifdef cleanup · bc344a64
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Josef 'Jeff' Sipek <jeffpc@optonline.net>
      
      I just noticed the nested ifdefs, and made it a little more readable.
    • [PATCH] fix sch_ingress help · 387ec9eb
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  John Levon <levon@movementarian.org>
    • [PATCH] SGML: close tag with ">" · bd9646e6
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Hans Ulrich Niedermann <linux-kernel@n-dimensional.de>
      
      doc patch: close tag with ">"
    • [PATCH] Consistently use quotes for SGML attributes · c02dc9a8
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Hans Ulrich Niedermann <linux-kernel@n-dimensional.de>
      
      doc patch: Consistently use quotes for SGML attributes.  This makes it
      possible to process the SGML files without SHORTTAG YES.
    • [PATCH] document unused pte bits on i386 · 2b5f9408
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Ed L Cashin <ecashin@uga.edu>
      
      This small patch documents that bits 9, 10, and 11 are unused by the Linux
      kernel.  The IA-32 Intel Architecture Software Developer's Manual says that
      these bits are available for programmer use.
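
      As a rough illustration of what "bits 9, 10, and 11" means for a 32-bit
      page table entry, the sketch below masks that 3-bit field in and out.
      The macro and function names are made up for this example and are not
      taken from the patch or from the kernel headers.

```c
#include <stdint.h>

/* Bits 9, 10 and 11 of a 32-bit i386 page table entry are left to software.
 * All names below are illustrative, not kernel identifiers. */
#define PTE_AVAIL_SHIFT 9
#define PTE_AVAIL_MASK  (UINT32_C(7) << PTE_AVAIL_SHIFT)   /* 0x00000e00 */

/* Return the 3-bit software-available field of a PTE value. */
uint32_t pte_avail_bits(uint32_t pte)
{
    return (pte & PTE_AVAIL_MASK) >> PTE_AVAIL_SHIFT;
}

/* Return the PTE with the software-available field replaced by val (0..7). */
uint32_t pte_set_avail_bits(uint32_t pte, uint32_t val)
{
    return (pte & ~PTE_AVAIL_MASK) |
           ((val << PTE_AVAIL_SHIFT) & PTE_AVAIL_MASK);
}
```

      Setting the field leaves every other bit of the entry untouched, which
      is the property that makes these bits safe for software bookkeeping.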
    • [PATCH] Update CodingStyle hints for Emacs users. · b4ecf1b0
      Andrew Morton authored
      From: Trivial Patch Monkey <trivial@rustcorp.com.au>
      
      From:  Ben Greear <greearb@candelatech.com>
      
      Depending on one's default emacs settings, the suggestion in the
      CodingStyle may or may not work.  This patch adds a few more commands to
      ensure it works in more cases.
    • [PATCH] ver_linux fix · 3bca5aa3
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      From:  Adrian Bunk <bunk@fs.tum.de>
      
      Some versions of ps print non-version lines when ps --version is invoked.
      grep them out.
    • [PATCH] Broken bitmap_parse for ncpus > 32 · f9511792
      Andrew Morton authored
      From: Joe Korty <joe.korty@ccur.com>
      
      This patch replaces the call to bitmap_shift_right() in bitmap_parse() with
      bitmap_shift_left().
      
      I also prepended comments to the bitmap_shift_* functions defining what
      'left' and 'right' means.  This is under the theory that if I and all the
      reviewers were bamboozled, others in the future occasionally might be too.
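
      To make the direction concrete, here is a minimal multi-word shift
      sketch.  It is not the kernel's bitmap_shift_left(); it assumes 32-bit
      words with bit n stored in word n / 32, and the function name is
      invented for the example.  Under that layout "left" means toward
      higher bit numbers, just like << on a single word.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative multi-word left shift (toward higher bit numbers).
 * Bit n lives in word n / 32 at position n % 32. */
void demo_bitmap_shift_left(uint32_t *dst, const uint32_t *src,
                            unsigned int shift, size_t nwords)
{
    size_t word_off = shift / 32;
    unsigned int bit_off = shift % 32;

    /* Walk from the highest word down so each result only depends on
     * source words at the same or a lower index. */
    for (size_t i = nwords; i-- > 0; ) {
        uint32_t v = 0;

        if (i >= word_off) {
            v = src[i - word_off] << bit_off;
            /* Pull in the bits that cross over from the next lower word. */
            if (bit_off != 0 && i > word_off)
                v |= src[i - word_off - 1] >> (32 - bit_off);
        }
        dst[i] = v;
    }
}
```

      A parser that reads the most significant chunk of a mask first needs
      this direction: it shifts what it has accumulated toward the high end
      before OR-ing the next chunk into the low bits.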
    • [PATCH] Fix sys_time() to get subtick correction from the new xtime · 5362a354
      Andrew Morton authored
      From: "La Monte H.P. Yarroll" <piggy@timesys.com>
      
      This is a Scott Wood patch against 2.6.3.
      
      
      Use gettimeofday() rather than xtime.tv_sec in sys_time(), since
      sys_stime() uses settimeofday() and thus subtracts the subtick correction
      from the new xtime.
      
      stime() used settimeofday(), but time() did not use gettimeofday().  Since
      settimeofday() subtracts out the current intra-tick correction, and nsec
      was 0 (since stime() only allows seconds), this resulted in xtime being
      slightly earlier than the time that was set.
      
      If time() had used gettimeofday(), the correction would have been applied,
      and everything would be fine.  However, instead time just reads the current
      xtime.tv_sec, so if time() is called immediately after stime(), you'll
      usually get a value one second earlier.
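
      The off-by-one can be reproduced with a toy model of the clock.  The
      struct and function names below are hypothetical and only mimic the
      arithmetic described above; they are not the kernel's timekeeping code.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC INT64_C(1000000000)

/* Toy clock: xtime_ns is the time stored at the last tick, subtick_ns is
 * the interpolated time that has passed since that tick. */
struct demo_clock {
    int64_t xtime_ns;
    int64_t subtick_ns;
};

/* Like the settimeofday() path: subtract the subtick correction so that
 * xtime + correction equals the requested time at this instant. */
void demo_settime(struct demo_clock *c, int64_t sec)
{
    c->xtime_ns = sec * DEMO_NSEC_PER_SEC - c->subtick_ns;
}

/* Old behaviour: report the stored seconds only. */
int64_t demo_time_old(const struct demo_clock *c)
{
    return c->xtime_ns / DEMO_NSEC_PER_SEC;
}

/* Fixed behaviour: apply the correction before truncating to seconds. */
int64_t demo_time_new(const struct demo_clock *c)
{
    return (c->xtime_ns + c->subtick_ns) / DEMO_NSEC_PER_SEC;
}
```

      With any non-zero subtick offset, setting the clock to N and reading
      it back through the old path gives N - 1, while the corrected path
      gives N.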
    • [PATCH] add file_operations.fcntl · cea39746
      Andrew Morton authored
      From: Chuck Lever <cel@citi.umich.edu>
      
      O_DIRECT|O_APPEND cannot possibly work on NFS, so NFS needs some way of
      preventing the user from setting this combination.  We felt that the best
      way of implementing this restriction is to allow the filesystem to implement
      its own fcntl() handler.
      
      This patch does that, and provides the appropriate handler for NFS.
      
      Additional details from Chuck:
      
      Forgetting O_DIRECT for a moment, O_APPEND writes on NFS don't work in any
      case when multiple clients are writing to a file, since an NFS client can
      never guarantee it knows where the true end of file is 100% of the time.
      It works as expected iff only one client writes to an O_APPEND file at a
      time.
      
      Multi-client O_APPEND writing doesn't seem to be a problem for any
      application I'm aware of.  Since it can be made to behave in the
      multi-client case with careful application logic or by using file locking,
      I don't think we should disallow it.
      
      I want to drop the inode semaphore when doing NFS direct I/O because it is
      synchronous; holding the i_sem means we reduce direct I/O concurrency to
      one I/O per file at a time.  The important thing sct was worried about was
      the case where a single client is writing with O_APPEND and O_DIRECT, and
      we don't hold the i_sem during the write.
      
      We must at least hold the i_sem when determining where the end of file is
      to do the O_APPEND write.  In 2.6, I believe that is handled correctly in
      the VFS layer, so this is not an issue for 2.6, right?
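
      The kind of check a per-filesystem fcntl() handler could perform is
      sketched below.  This is an assumption about the shape of such a check,
      not the handler from the patch: the function name is invented and the
      flag values are placeholders standing in for O_APPEND and O_DIRECT.

```c
#include <errno.h>

/* Placeholder flag values for the example; real code would use O_APPEND
 * and O_DIRECT from <fcntl.h>. */
#define DEMO_O_APPEND 0x0400u
#define DEMO_O_DIRECT 0x4000u

/* Hypothetical F_SETFL check: refuse the one combination the filesystem
 * cannot support and accept everything else. */
int demo_nfs_check_setfl(unsigned int new_flags)
{
    unsigned int both = DEMO_O_APPEND | DEMO_O_DIRECT;

    if ((new_flags & both) == both)
        return -EINVAL;
    return 0;
}
```

      Each flag on its own is still accepted; only the pair is rejected.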
    • [PATCH] pmdisk: fix strcmp in sysfs store · 3f66b056
      Andrew Morton authored
      From: Herbert Xu <herbert@gondor.apana.org.au>
      
      This patch fixes the sysfs store functions for pmdisk when the input
      contains a trailing newline.
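
      A newline-tolerant comparison might look like the sketch below.  It is
      an illustration of the general technique rather than the code from the
      patch; the function name and the "disk" keyword used in the usage note
      are assumptions.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the first `count` bytes of buf equal keyword, ignoring a
 * single trailing newline (as appended by `echo`), and 0 otherwise. */
int demo_store_matches(const char *buf, size_t count, const char *keyword)
{
    if (count > 0 && buf[count - 1] == '\n')
        count--;
    return count == strlen(keyword) && memcmp(buf, keyword, count) == 0;
}
```

      With this helper both "disk" and "disk\n" match the keyword "disk",
      while "disks\n" does not.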
    • [PATCH] sb_mixer bounds checking · 77abb2f0
      Andrew Morton authored
      From: Muli Ben-Yehuda <mulix@mulix.org>
      
      This patch adds proper bounds checking to the sb_mixer.c code, found by the
      stanford checker[0].  It fixes bugzilla bugs 252[1], 253[2] and 254[3]. 
      Patch is against 2.6.5-rc2.  It was tested by Rene Herman on SN AWE64 gold
      and sound still works.  The issue was previously discussed on lkml[4], but
      apparently no fix was applied.
      
      The patch is a bit more intrusive than I would've liked, but I don't think
      it can be helped without really intrusive changes.  sb_devc has a pointer
      to an array (iomap) that is set at run time to point to arrays of variable
      sizes.  The patch adds an 'iomap_sz' member to sb_devc that is set to the
      length of the array, and does bounds checking in sb_common_mixer_set() and
      smw_mixer_set() against that.
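
      The pattern of carrying the array length next to the run-time pointer
      can be sketched as follows.  The struct layout, field types and the
      lookup function are simplified stand-ins invented for this example;
      only the idea of an 'iomap_sz' member comes from the description above.

```c
#include <errno.h>
#include <stddef.h>

/* Simplified stand-in for one mixer table entry. */
struct demo_mixer_ent {
    int regno;
};

/* Simplified stand-in for the device state: iomap is chosen at run time,
 * so its length has to travel with it. */
struct demo_devc {
    const struct demo_mixer_ent *iomap;
    size_t iomap_sz;                /* number of entries in iomap */
};

/* Return the register number for mixer device `dev`, or -EINVAL if `dev`
 * is outside the table currently installed in devc. */
int demo_mixer_lookup(const struct demo_devc *devc, unsigned int dev)
{
    if (dev >= devc->iomap_sz)
        return -EINVAL;
    return devc->iomap[dev].regno;
}
```

      Because the check uses the length stored alongside the pointer, it
      stays correct no matter which table was selected at probe time.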
    • [PATCH] fs/proc/proc_tty.c comment fixes · 59b46ce5
      Andrew Morton authored
      From: Marc-Christian Petersen <m.c.p@wolk-project.de>
    • [PATCH] set mod->waiter before calling stop_machine · 07ebe427
      Andrew Morton authored
      From: Rusty Russell <rusty@rustcorp.com.au>
      
      mod->waiter needs to be set before we try to stop the module: setting it in
      __try_stop_module means it gets set to the kthread, not rmmod.
    • [PATCH] slab: updates for per-arch alignments · b9e55f3d
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Description:
      
      Right now kmem_cache_create automatically decides about the alignment of
      allocated objects. The automatic decisions are sometimes wrong:
      
      - for some objects, it's better to keep them as small as possible to
        reduce the memory usage.  Ingo already added a parameter to
        kmem_cache_create for the sigqueue cache, but it wasn't implemented.
      
      - for s390, normal kmalloc must be 8-byte aligned.  With debugging
        enabled, the default allocation was 4-bytes.  This means that s390 cannot
        enable slab debugging.
      
      - arm26 needs 1 kB aligned objects.  Previously this was impossible to
        generate, therefore arm has its own allocator in
        arm26/machine/small_page.c
      
      - most objects should be cache line aligned, to avoid false sharing.  But
        the cache line size was set at compile time, often to 128 bytes for
        generic kernels.  This wastes memory.  The new code uses the runtime
        determined cache line size instead.
      
      - some caches want an explicit alignment.  One example is the pte_chain
        objects: they must find the start of the object with addr&mask.  Right
        now pte_chain objects are scaled to the cache line size, because that was
        the only alignment that could be generated reliably.
      
      The implementation reuses the "offset" parameter of kmem_cache_create and
      now uses it to pass in the requested alignment.  offset was ignored by the
      current implementation, and the only user I found is sigqueue, which
      intended to set the alignment.
      
      In the long run, it might be interesting for the main tree: due to the 128
      byte alignment, only 7 inodes fit into one page, with 64-byte alignment, 9
      inodes - 20% memory recovered for Athlon systems.
      
      
      
      For generic kernels running on P6 cpus (i.e. 32 byte cachelines), it means
      
      Number of objects per page:
      
       ext2_inode_cache: 8 instead of 7
       ext3_inode_cache: 8 instead of 7
       fat_inode_cache: 9 instead of 7
       rpc_tasks: 24 instead of 15
       tcp_tw_bucket: 40 instead of 30
       arp_cache: 40 instead of 30
       nfs_write_data: 9 instead of 7
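
      The effect of alignment on packing density can be approximated with
      the arithmetic below.  This is a simplification: it ignores the slab's
      own management data, so it gives an upper bound rather than the exact
      counts listed above, and the function names are invented for the
      example.

```c
#include <stddef.h>

/* Round size up to a multiple of align; align must be a power of two. */
size_t demo_align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/* Upper bound on how many objects of obj_size fit in one page when each
 * object starts on an `align` boundary.  Slab bookkeeping is ignored. */
size_t demo_objs_per_page(size_t page_size, size_t obj_size, size_t align)
{
    return page_size / demo_align_up(obj_size, align);
}
```

      For a hypothetical 440-byte object in a 4096-byte page, 128-byte
      alignment pads each object to 512 bytes (at most 8 per page), while
      64-byte alignment pads it to 448 bytes (at most 9 per page).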
    • [PATCH] Fix scripts/kernel-doc to handle __attribute__ · 1aa6c0d1
      Andrew Morton authored
      From: Tom Rini <trini@kernel.crashing.org>
      
      The following patch is needed so that kernel-doc can handle functions which
      have __attribute__'s on them (such as __attribute__ ((weak))).
    • [PATCH] readv/writev range checking fix · fb14ef35
      Andrew Morton authored
      do_readv_writev() is trying to fail if
      
      a) any of the segments have a length < 0 or
      
      b) the sum of the segments wraps negative.
      
      But it gets b) wrong because local variable tot_len is unsigned.
      
      Fix that up.
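
      With an unsigned accumulator a test such as `tot_len < 0' can never be
      true, which is why case b) slipped through.  The sketch below shows one
      way to check both conditions with a signed total.  It is not the code
      from the patch: it uses plain long values, an invented function name,
      and tests for overflow before adding rather than after.

```c
#include <limits.h>
#include <stddef.h>

/* Return the total length of all segments, or -1 if a segment length is
 * negative or the running sum would exceed LONG_MAX. */
long demo_total_len(const long *seg_len, size_t nr_segs)
{
    long tot_len = 0;

    for (size_t i = 0; i < nr_segs; i++) {
        long len = seg_len[i];

        if (len < 0)
            return -1;                  /* case a) */
        if (len > LONG_MAX - tot_len)
            return -1;                  /* case b), checked before adding */
        tot_len += len;
    }
    return tot_len;
}
```

      Checking before the addition avoids relying on signed wraparound,
      which standard C leaves undefined.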
    • [PATCH] jbd: fix I/O error handling · b1ee3fea
      Andrew Morton authored
      Fix a few buglets spotted by Jeff Mahoney <jeffm@suse.com>.  We're currently
      only checking for I/O errors against journal buffers if they were locked when
      they were first inspected.
      
      We need to check buffer_uptodate() even if the buffers were already unlocked.
    • [PATCH] JBD: ordered-data commit cleanup · 2b38960c
      Andrew Morton authored
      For data=ordered, kjournald at commit time has to write out and wait upon a
      long list of buffers.  It does this in a rather awkward way with a single
      list.  It causes complexity and long lock hold times, and makes the addition
      of rescheduling points quite hard.
      
      So what we do instead (based on Chris Mason's suggestion) is to add a new
      buffer list (t_locked_list) to the journal.  It contains buffers which have
      been placed under I/O.
      
      So as we walk the t_sync_datalist list we move buffers over to t_locked_list
      as they are written out.
      
      When t_sync_datalist is empty we may then walk t_locked_list waiting for the
      I/O to complete.
      
      As a side-effect this means that we can remove the nasty synchronous wait in
      journal_dirty_data which is there to avoid the kjournald livelock which would
      otherwise occur when someone is continuously dirtying a buffer.
    • [PATCH] jbd: fix ordered-data writeout logic · 376fd482
      Andrew Morton authored
      There's some nasty code in commit which deals with a lock ranking problem. 
      Currently if it fails to get the lock and local variable `bufs' is zero
      we forget to write out some ordered-data buffers.  So a subsequent
      crash+recovery could yield stale data in existing files.
      
      Fix it by correctly restarting the t_sync_datalist search.
    • [PATCH] speed up ext2 fsync() and fdatasync() · 7176142a
      Andrew Morton authored
      ext2_sync_file() forgets to clear the inode's dirty bits, so we write the
      inode on every fsync(), even if it hasn't changed.
      
      Fix that up via the new sync_file() API which correctly manages the inode
      state bits and the superblock inode lists.
      
      When performing file overwrite on IDE with and without writeback caching
      enabled this patch approximately doubles fsync() speed, bringing it into line
      with O_SYNC writes.
      
      Also, fix up the return value handling in ext2_sync_file().
      
      Credit due to Jeffrey Siegal <jbs@quiotix.com> who noticed the performance
      discrepancy and wrote a test app.
    • [PATCH] ext3 fsync() and fdatasync() speedup · a1ff5989
      Andrew Morton authored
      ext3's fsync/fdatasync implementation is currently syncing the inode via a
      full journal commit even if it was unaltered.
      
      Fix that up by exporting the core VFS's inode sync function to modules and
      calling it if the inode is dirty.  We need to do it this way so that the
      inode is moved to the appropriate superblock list and so that the i_state
      dirty flags are appropriately updated.
      
      This speeds up ext3 fsync() for file overwrites by a factor of four (disk
      non-writeback) to forty (disk in writeback mode).
    • [PATCH] Fix page allocator lower zone protection for NUMA · af70f767
      Andrew Morton authored
      From: Martin Hicks <mort@wildopensource.com>
      
      This changes __alloc_pages() so it uses precalculated values for the "min".
      This should prevent the problem of min incrementing from zone to zone across
      many nodes on a NUMA machine.  The result of falling back to other nodes with
      the old incremental min calculations was that the min value became very
      large.
    • [PATCH] move job control fields from task_struct to signal_struct · 7860b371
      Andrew Morton authored
      From: Roland McGrath <roland@redhat.com>
      
      This patch moves all the fields relating to job control from task_struct to
      signal_struct, so that all this info is properly per-process rather than
      being per-thread.
    • [PATCH] IPMI driver updates · 0ab2d668
      Andrew Morton authored
      From: Corey Minyard <minyard@acm.org>
      
      - Add support for messaging through an IPMI LAN interface, which is
        required for some system software that already exists on other IPMI
        drivers.  It also does some renaming and a lot of little cleanups.
      
      - Add the "System Interface" driver.  The previous driver for system
        interfaces only supported the KCS interface; this driver supports all
        system interfaces defined in the IPMI standard.  It also does a much better
        job of handling ACPI and SMBIOS tables for detecting IPMI system
        interfaces.
    • [PATCH] compat emulation for posix message queues · 87c22e84
      Andrew Morton authored
      From: Arnd Bergmann <arnd@arndb.de>
      
      I have tested the code with the open posix test suite and found the same
      four failures for both 64-bit and compat mode; most tests pass.  The patch
      is against -mc1, but I guess it also applies to the other trees around.
      
      What worries me more than mq_attr compatibility is the conversion of struct
      sigevent, which might turn out really hard when more fields in there are
      used.  AFAICS, the only other part in the kernel ABI is sys_timer_create(),
      so maybe it's not too late to deprecate the current structure and create a
      structure that can be used properly for compat syscalls.
    • [PATCH] posix message queues: send notifications via netlink · 34b98f22
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      SIGEV_THREAD means that a given callback should be called in the context of a
      new thread.  This must be done by the C library.  The kernel must deliver a
      notice of the event to the C library when the callback should be called.
      
      This patch switches to a new, simpler interface: User space creates a socket
      with socket(PF_NETLINK, SOCK_RAW,0) and passes the fd to the mq_notify call
      together with a cookie.  When the mq_notify() condition is satisfied, the
      kernel "writes" the cookie to the socket.  User space then reads the cookie
      and calls the appropriate callback.
    • [PATCH] split netlink_unicast · ed6dcf4a
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      The attached patch splits netlink_unicast into three steps:
      
      - netlink_getsock{bypid,byfilp}: lookup the destination socket.
      
      - netlink_attachskb: perform the nonblock checks, sleep if the socket
        queue is longer than the limit, etc.
      
      - netlink_sendskb: actually send the skb.
      
      jamal looked over it and didn't see a problem with the netlink change.  The
      actual use from ipc/mqueue.c is still open (just send back whatever the C
      library passed to mq_notify, add an nlmsghdr or perhaps even make it a
      specialized netlink protocol), but the attached patch is independent of
      the message queue change.
      
      (acked by davem)
    • [PATCH] security bugfix for mqueue · b06d7b4c
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      I found a security bug in the new mqueue code: a process that has only
      write permissions to a message queue could call mq_notify(SIGEV_THREAD) and
      use the returned notification file descriptor to read from the message
      queue.
    • [PATCH] posix message queue update · f3ca8d5d
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      My discussion with Ulrich had one result:
      
      - mq_setattr can accept implementation defined flags.  Right now we have
        none, but we might add some later (e.g.  switch to CLOCK_MONOTONIC for
        mq_timed{send,receive} or something similar).  When we add flags, we
        might need the fields for additional information.  And they don't hurt.
        Therefore add four __reserved fields to mq_attr.
      
      - fail mq_setattr if we get unknown flags - otherwise glibc can't detect
        if it's running on a future kernel that supports new features.
      
      - use memset to initialize the mq_attr structure - theoretically we could
        leak kernel memory.
      
      - Only set O_NONBLOCK in mq_attr, explicitly clear O_RDWR & friends.
        openposix uses getattr, attr |=O_NONBLOCK, setattr - a sane approach. 
        Without clearing O_RDWR, this fails.
      
      I've retested all openposix conformance tests with the new patch - the two
      new FAILED tests check undefined behavior.  Note that I won't have net
      access until Sunday - if the message queue patch breaks something important
      either ask Krzysztof or drop it.
      
      Ulrich had another good idea for SIGEV_THREAD, but I must think about it.
      It would mean less complexity in glibc, but more code in the kernel.  I'm
      not yet convinced that it's overall better.
    • [PATCH] posix message queues: made user mountable · b95db642
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Make the posix message queue mountable by the user.  This replaces ipcs and
      ipcrm for posix message queues: the admin can check which queues exist with ls
      and remove stale queues with rm.
      
      I'd like a final confirmation from Ulrich that our SIGEV_THREAD approach is
      the right thing(tm): He's aware of the design and didn't object, but I think
      he hasn't seen the final API yet.
    • [PATCH] posix message queues: linux-specific poll extension · 0301b50b
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Linux specific extension: make the message queue identifiers pollable.  It's
      simple and could be useful.
    • [PATCH] posix message queues: implementation · be94d44e
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Actual implementation of the posix message queues, written by Krzysztof
      Benedyczak and Michal Wronski.  The complete implementation is dependent on
      CONFIG_POSIX_MQUEUE.
      
      It passed the openposix test suite with two exceptions: one mq_unlink test
      was bad and tested undefined behavior.  And Linux succeeds
      mq_close(open(,,,)).  The spec mandates EBADF, but we have decided to ignore
      that: we would have to add a new syscall just for the right error code.
      
      The patch intentionally doesn't use all helpers from fs/libfs for kernel-only
      filesystems: step 5 allows user space mounts of the file system.
      
      
      
      Signal changes:
      
      The patch redefines SI_MESGQ using __SI_CODE: The generic Linux ABI uses
      a negative value (i.e.  from user) for SI_MESGQ, but the kernel internal
      value must be positive to pass check_kill_value.  Additionally, the patch
      adds support into copy_siginfo_to_user to copy the "new" signal type to
      user space.
      
      
      
      Changes in signal code caused by POSIX message queues patch:
      
      General & rationale:
      
        Signals generated by mqueues (only upon notification) must have
        si_code == SI_MESGQ.  In fact such a signal is sent from the process
        which caused the notification (== sent a message to an empty message
        queue) to another which requested it.  Both processes can of course be
        unrelated in terms of uids/euids.  So SI_MESGQ signals must be
        classified as SI_FROMKERNEL to pass check_kill_permissions (needless
        to say, these signals ARE from the kernel).
      
        Signals generated by message queues notification need the same
        fields in siginfo struct's union _sifields as POSIX.1b signals and we
        can reuse its union entry.
      
        SI_MESGQ was previously defined to -3 in kernel and also in glibc. 
        So in userspace SI_MESGQ must be still visible as -3.
      
      Solution:
      
        SI_MESGQ is defined in the same style as SI_TIMER using __SI_CODE macro.
      
        Details:
      
          Fortunately copy_siginfo_to_user copies si_code as short.  So we
          can use remaining part of int value freely.  __SI_CODE does the
          work.  SI_MESGQ is in kernel:
      
       		6<<16 | (-3 & 0xffff), which is > 0
      
          but to userspace is copied
      
       		(short) SI_MESGQ == -3
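
          The arithmetic can be checked with a few lines of C.  The DEMO_
          names below are invented for this example and only restate the
          expression above; they are not the kernel's macros.

```c
/* Illustrative restatement of the encoding described above. */
#define DEMO_SI_PREFIX_MESGQ      (6 << 16)
#define DEMO_SI_CODE(prefix, code) ((prefix) | ((code) & 0xffff))
#define DEMO_SI_MESGQ             DEMO_SI_CODE(DEMO_SI_PREFIX_MESGQ, -3)

/* Value kept inside the kernel: 0x6fffd, which is positive. */
int demo_kernel_si_code(void)
{
    return DEMO_SI_MESGQ;
}

/* Value user space sees once only the low 16 bits are copied.  Converting
 * an out-of-range int to short is implementation-defined in C, but the
 * usual two's complement targets simply keep the low 16 bits. */
short demo_user_si_code(void)
{
    return (short)DEMO_SI_MESGQ;
}
```

          The upper half of the int carries the in-kernel prefix, and the
          lower half still reads as -3 once it is truncated to a short.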
      
      Actual changes:
      
        Changes in include/asm-generic/siginfo.h
      
        __SI_MESGQ added in signal.h to represent inside-kernel prefix of
        SI_MESGQ.  SI_MESGQ is redefined from -3 to __SI_CODE(__SI_MESGQ, -3)
      
        Except on the mips architecture, these changes should be arch
        independent (asm-generic/siginfo.h is included in arch versions).  On
        mips SI_MESGQ is redefined to -4 in order to be compatible with IRIX.
        But the same scheme can be used.
      
        Change in copy_siginfo_to_user: We only add one line to get the
        same copy semantics as for _SI_RT.

        This change isn't very portable - some archs have their own
        copy_siginfo_to_user.  All those should have a similar change (but
        possibly not a one-line one, as the _SI_RT case was sometimes ignored
        because it wasn't used yet, e.g.  see ia64 signal.c).
      
      Update:
      mq: only fail with invalid timespec if mq_timed{send,receive} needs to block
      From: Jakub Jelinek <jakub@redhat.com>
      
      POSIX requires EINVAL to be set if:
      "The process or thread would have blocked, and the abs_timeout parameter
      specified a nanoseconds field value less than zero or greater than or equal
      to 1000 million."
      but 2.6.5-mm3 returns -EINVAL even if the process or thread would not block
      (if the queue is not empty for timedreceive or not full for timedsend).
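
      The ordering that POSIX asks for can be sketched as a small check that
      only looks at the timeout once blocking is unavoidable.  The struct and
      function names are hypothetical and this is not the code from the
      update.

```c
#include <errno.h>

/* Minimal stand-in for struct timespec, to keep the example standalone. */
struct demo_timespec {
    long tv_sec;
    long tv_nsec;
};

/* Return 0 if the operation may go ahead, or -EINVAL if it would have to
 * block and the nanoseconds field of the timeout is out of range. */
int demo_check_timeout(int would_block, const struct demo_timespec *abs_timeout)
{
    if (!would_block)
        return 0;       /* queue can be served now: timeout is not examined */
    if (abs_timeout->tv_nsec < 0 || abs_timeout->tv_nsec >= 1000000000L)
        return -EINVAL;
    return 0;
}
```

      An invalid timeout is therefore harmless as long as the queue is not
      empty (for receive) or not full (for send).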
    • [PATCH] posix message queues: syscall stubs · c50142a5
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      Add -ENOSYS stubs for the posix message queue syscalls.  The API is a direct
      mapping of the api from the unix spec, with two exceptions:
      
      - mq_close() doesn't exist.  Message queue file descriptors can be closed
        with close().
      
      - mq_notify(SIGEV_THREAD) cannot be implemented in the kernel.  The kernel
        returns a pollable file descriptor.  User space must poll (or read) this
        descriptor and call the notifier function if the file descriptor is
        signaled.
    • [PATCH] posix message queues: code move · c334f752
      Andrew Morton authored
      From: Manfred Spraul <manfred@colorfullife.com>
      
      cleanup of sysv ipc as a preparation for posix message queues:
      
      - replace !CONFIG_SYSVIPC wrappers for copy_semundo and exit_sem with
        static inline wrappers.  Now the whole ipc/util.c file is only used if
        CONFIG_SYSVIPC is set, use makefile magic instead of #ifdef.
      
      - remove the prototypes for copy_semundo and exit_sem from kernel/fork.c
        - they belong in a header file.
      
      - create a new msgutil.c with the helper functions for message queues.
      
      - cleanup the helper functions: run Lindent, add __user tags.
    • [PATCH] md: merge_bvec_fn needs to know about partitions. · 00d1b0e9
      Andrew Morton authored
      From: Neil Brown <neilb@cse.unsw.edu.au>
      
      Addresses http://bugme.osdl.org/show_bug.cgi?id=2355
      
      It seems that a merge_bvec_fn needs to be aware of partitioning...  who
      would have thought it :-(
      
      The following patch should fix the merge_bvec_fn for both linear and raid0.
      We teach linear and raid0 about partitions in the merge_bvec_fn.
      
      ->merge_bvec_fn needs to make decisions based on the physical geometry of the
      device.  For raid0, it needs to decide if adding the bvec to the bio will
      make the bio span two drives.
      
      To do this, it needs to know where the request is (what the sector number is)
      in the whole device.
      
      However when called from bio_add_page, bi_sector is the sector number
      relative to the current partition, as generic_make_request hasn't been called
      yet.
      
      So raid_mergeable_bvec needs to map bio->bi_sector (which is partition
      relative) to a bi_sector which is device relative, so it can perform proper
      calculations about when chunk boundaries are.
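
      A small numeric sketch shows why the mapping matters.  The helper names
      and the sample numbers are assumptions made for this example; the real
      merge_bvec_fn works on bio and request queue structures instead.

```c
#include <stdint.h>

/* Map a partition-relative sector to a device-relative one by adding the
 * partition's start offset. */
uint64_t demo_abs_sector(uint64_t part_start, uint64_t rel_sector)
{
    return part_start + rel_sector;
}

/* Number of sectors from device-relative `sector` up to the next chunk
 * boundary.  chunk_sects must be non-zero. */
uint64_t demo_sectors_to_boundary(uint64_t sector, uint64_t chunk_sects)
{
    return chunk_sects - (sector % chunk_sects);
}
```

      For a partition starting at sector 63 and 128-sector chunks, a request
      at partition-relative sector 100 really sits at device sector 163, so
      93 sectors remain before the boundary.  Using the partition-relative
      number directly would wrongly report 28.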
    • [PATCH] knfsd: Add data integrity to server side gss · 9abdc660
      Andrew Morton authored
      From: NeilBrown <neilb@cse.unsw.edu.au>
      
      From: "J. Bruce Fields" <bfields@fieldses.org>
      
      rpcsec_gss supports three security levels:
      
      1.  authentication only: sign the header of each rpc request and response.
      
      2. integrity: sign the header and body of each rpc request and response.
      
      3.  privacy: sign the header and encrypt the body of each rpc request and
         response.
      
      The first 2 are already supported on the client; this adds integrity support
      on the server.