• Paul Moore's avatar
    audit: fix auditd/kernel connection state tracking · 5b52330b
    Paul Moore authored
    What started as a rather straightforward race condition reported by
    Dmitry using the syzkaller fuzzer ended up revealing some major
    problems with how the audit subsystem managed its netlink sockets and
    its connection with the userspace audit daemon.  Fixing this properly
    had quite the cascading effect and what we are left with is this rather
    large and complicated patch.  My initial goal was to try and decompose
    this patch into multiple smaller patches, but the way these changes
    are intertwined makes it difficult to split these changes into
    meaningful pieces that don't break or somehow make things worse for
    the intermediate states.
    
    The patch makes a number of changes, but the most significant are
    highlighted below:
    
    * The auditd tracking variables, e.g. audit_sock, are now gone and
    replaced by a RCU/spin_lock protected variable auditd_conn which is
    a structure containing all of the auditd tracking information.
    
    * We no longer track the auditd sock directly, instead we track it
    via the network namespace in which it resides and we use the audit
    socket associated with that namespace.  In spirit, this is what the
    code was trying to do prior to this patch (at least I think that is
    what the original authors intended), but it was done rather poorly
    and added a layer of obfuscation that only masked the underlying
    problems.
    
    * Big backlog queue cleanup, again.  In v4.10 we made some pretty big
    changes to how the audit backlog queues work, here we haven't changed
    the queue design so much as cleaned up the implementation.  Brought
    about by the locking changes, we've simplified kauditd_thread() quite
    a bit by consolidating the queue handling into a new helper function,
    kauditd_send_queue(), which allows us to eliminate a lot of very
    similar code and makes the looping logic in kauditd_thread() clearer.
    
    * All netlink messages sent to auditd are now sent via
    auditd_send_unicast_skb().  Other than just making sense, this makes
    the lock handling easier.
    
    * Change the audit_log_start() sleep behavior so that we never sleep
    on auditd events (unchanged) or if the caller is holding the
    audit_cmd_mutex (changed).  Previously we didn't sleep if the caller
    was auditd or if the message type fell between a certain range; the
    type check was a poor effort of doing what the cmd_mutex check now
    does.  Richard Guy Briggs originally proposed not sleeping the
    cmd_mutex owner several years ago but his patch wasn't acceptable
    at the time.  At least the idea lives on here.
    
    * A problem with the lost record counter has been resolved.  Steve
    Grubb and I both happened to notice this problem and according to
    some quick testing by Steve, this problem goes back quite some time.
    It's largely a harmless problem, although it may have left some
    careful sysadmins quite puzzled.
    
    Cc: <stable@vger.kernel.org> # 4.10.x-
    Reported-by: default avatarDmitry Vyukov <dvyukov@google.com>
    Signed-off-by: default avatarPaul Moore <paul@paul-moore.com>
    5b52330b
audit.c 61 KB