- 19 Jan, 2004 40 commits
-
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com> While working on my mm patch for s390 I played with rmap a bit, adding BUG statements and the like. While doing so I noticed some room for improvement in rmap. It's minor stuff but anyway... The first observation is that the pte chain array doesn't have holes, meaning that from the pte_chain_idx() of the first array onwards every slot of all following pte chain arrays is full. That is, there can't be NULL pointers. The "if (!pte_paddr)" check in try_to_unmap() can be removed, and if the loop in page_referenced() is started from pte_chain_idx(pc) then the "if (!pte_paddr)" check in page_referenced() can be removed as well. The second observation is that the first pte array of a pte chain has at least one entry. Empty pte chain arrays are always freed immediately after the last entry was removed. Because of that victim_i can be calculated in a simpler way: instead of setting victim_i to -1 and then checking against -1 in each loop iteration, victim_i can just be set to the pte_chain_idx of the first pte chain array.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com> I think I found a potential race in install_page/install_file_pte. The inline function zap_pte releases pages by calling page_remove_rmap and page_cache_release. If this was the last user of a page, it can get purged from the page cache and then get immediately reused. But there might still be a tlb for this page on another cpu. The tlb is removed in the callers of zap_pte, install_page and install_file_pte, but this is too late. I admit that it's a very unlikely race, but nevertheless... I fixed this by using the new ptep_clear_flush function that is introduced with the tlb flush optimization patch for s/390.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com> This is another s/390 related mm patch. It introduces the concept of physical dirty and referenced bits into the common mm code. I always had the nagging feeling that the pte functions for setting/clearing the dirty and referenced bits are not appropriate for s/390. It works but it is a bit of a hack. In the wake of rmap it is now possible to put a much better solution into place. The idea is simple: since there are no dirty/referenced bits in the pte, make these functions nops on s/390 and add operations on the physical page in the appropriate places. For the referenced bit this is the page_referenced() function. For the dirty bit there are two relevant spots: in page_remove_rmap after the last user of the page removed its reverse mapping, and in try_to_unmap after the last user was unmapped. There are two new functions to accomplish this:
* page_test_and_clear_dirty: Test and clear the dirty bit of a physical page. This function is analogous to ptep_test_and_clear_dirty but gets a struct page as argument instead of a pte_t pointer.
* page_test_and_clear_young: Test and clear the referenced bit of a physical page. This function is analogous to ptep_test_and_clear_young but gets a struct page as argument instead of a pte_t pointer.
It's pretty straightforward, and with it the s/390 mm makes much more sense. You'll need the tlb flush optimization patch for this patch. Comments?
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com> On the s/390 architecture we still have the issue with tlb flushing and the ipte instruction. We can optimize the tlb flushing a lot with some minor interface changes between the arch backend and the memory management core. In the end the whole thing is about the Invalidate Page Table Entry (ipte) instruction. The instruction sets the invalid bit in the pte and removes the tlb for the page on all cpus for the virtual to physical mapping of the page in a particular address space. The nice thing is that only the tlb for this page gets removed; all the other tlbs stay valid. The reason we can't use ipte to implement flush_tlb_page() is one of the requirements of the instruction: the pte that should get flushed needs to be *valid*. I'd like to add the following four functions to the mm interface:
* ptep_establish: Establish a new mapping. This sets a pte entry in a page table and flushes the tlb of the old entry on all cpus if it exists. This is more or less what establish_pte in mm/memory.c does right now but without the update_mmu_cache call.
* ptep_test_and_clear_and_flush_young: Do what ptep_test_and_clear_young does and flush the tlb.
* ptep_test_and_clear_and_flush_dirty: Do what ptep_test_and_clear_dirty does and flush the tlb.
* ptep_get_and_clear_and_flush: Do what ptep_get_and_clear does and flush the tlb.
The s/390 specific functions in include/pgtable.h define their own optimized versions of these four functions by use of ipte. To avoid defining these functions for every architecture, I added them to include/asm-generic/pgtable.h. Since i386/x86 and others don't include this header yet and define their own versions of the functions found there, I #ifdef'd all functions in include/asm-generic/pgtable.h to be able to pick the ones that are needed for each architecture (see patch for details). With the new functions in place it is easy to do the optimization, e.g. the sequence
    ptep_get_and_clear(ptep);
    flush_tlb_page(vma, address);
gets replaced by
    ptep_get_and_clear_and_flush(vma, address, ptep);
The old sequence still works but it is suboptimal on s/390.
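The fallback idea described in this message can be sketched in user space roughly as follows. This is a toy model, not the patch itself: every `toy_` name, the `TOY_HAVE_ARCH_...` guard macro, and the flush counter are assumptions made for illustration. It shows a combined helper that simply chains the two existing steps unless an architecture has claimed the name for its own optimized version.

```c
#include <assert.h>

/* Toy stand-ins; none of these are the kernel's real types. */
typedef unsigned long toy_pte_t;

struct toy_vma {
    int tlb_flushes;            /* counts per-page flush requests */
};

/* Step 1 of the old sequence: read the entry and clear it. */
static toy_pte_t toy_ptep_get_and_clear(toy_pte_t *ptep)
{
    toy_pte_t old = *ptep;
    *ptep = 0;
    return old;
}

/* Step 2 of the old sequence: flush the translation for one page. */
static void toy_flush_tlb_page(struct toy_vma *vma, unsigned long address)
{
    (void)address;              /* the toy model only counts flushes */
    vma->tlb_flushes++;
}

/*
 * Generic fallback: only compiled if the architecture did not announce
 * its own version (hypothetical guard macro). An architecture with an
 * instruction like ipte could do both steps at once instead.
 */
#ifndef TOY_HAVE_ARCH_PTEP_GET_AND_CLEAR_AND_FLUSH
static toy_pte_t toy_ptep_get_and_clear_and_flush(struct toy_vma *vma,
                                                  unsigned long address,
                                                  toy_pte_t *ptep)
{
    toy_pte_t old = toy_ptep_get_and_clear(ptep);
    toy_flush_tlb_page(vma, address);
    return old;
}
#endif
```

Callers only ever use the combined name, so an architecture-specific version can be swapped in without touching them.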
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- audit all 32 bit pointer accesses and make them use compat_ioctl(), because of the necessary conversion on s390
- introduce ULONG_IOCTL() which is used instead of COMPATIBLE_IOCTL() for all ioctls that have their argument encoded in 'arg' instead of the memory pointed to by arg. Same reason as above.
- remove most #ifdefs in <linux/compat_ioctl.h>: They don't make any sense if the respective handlers in fs/compat_ioctl.c are not disabled as well and they are potentially harmful (the CONFIG_BLK_DEV_DM e.g. was insufficient).
- comment out COMPATIBLE_IOCTL(SIOCSIFBR) and COMPATIBLE_IOCTL(SIOCGIFBR), they appear to require a handler
- implement copy_in_user for s390, as needed for many handlers in fs/compat_ioctl.c
- get rid of all duplicate stuff in arch/s390/kernel/compat_ioctl.c that is also in fs/compat_ioctl.c
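To illustrate why pointer arguments and plain integer arguments may need different handling, here is a small user-space sketch. It assumes, as an illustration only, that a 31-bit s390 user pointer keeps its address in the low 31 bits and that the top bit must be dropped when widening it, while an integer argument is passed through unchanged. Both function names are hypothetical and are not the macros named in the message.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pointer-style argument: keep only the 31 address bits before
 * widening to 64 bits (assumed conversion rule for this sketch).
 */
static uint64_t toy_compat_ptr(uint32_t uptr)
{
    return (uint64_t)(uptr & 0x7fffffffu);
}

/*
 * Value-style argument: the 32-bit value itself is the payload, so it
 * is zero-extended with all bits intact.
 */
static uint64_t toy_compat_ulong(uint32_t arg)
{
    return (uint64_t)arg;
}
```

Treating a value as a pointer would silently lose its top bit, and treating a pointer as a value would leave a stray bit in the address, which is one plausible reason for keeping the two kinds of ioctl apart.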
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Add emulation for sys_fadvise64 and sys_fadvise64_64.
- Use common code wrapper for sys_sched_setaffinity and sys_sched_getaffinity.
- Remove unused put_rusage.
- Add ssize_t checks for iovec lengths in do_readv_writev32.
- Add emulation for posix timer system calls.
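A check of the kind mentioned for iovec lengths might look like the sketch below. This is an assumption about the general technique, not the code from do_readv_writev32: the struct layout, the function name, and the choice of -1 as the error value are all hypothetical. The point is that each 32-bit length, and the running total, must still fit in a signed 32-bit ssize_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit iovec layout as seen from a compat task. */
struct toy_iovec32 {
    uint32_t iov_base;
    uint32_t iov_len;
};

/*
 * Returns the total length, or -1 if any single length or the running
 * total would not fit in a signed 32-bit ssize_t.
 */
static int64_t toy_check_iovec32(const struct toy_iovec32 *vec, size_t count)
{
    int64_t total = 0;

    for (size_t i = 0; i < count; i++) {
        if (vec[i].iov_len > (uint32_t)INT32_MAX)
            return -1;          /* length would be negative as ssize_t */
        total += vec[i].iov_len;
        if (total > INT32_MAX)
            return -1;          /* sum would overflow the return value */
    }
    return total;
}
```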
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com> New 3270 device driver.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com> Remove all occurrences of __setup and #ifdef MODULE from the zfcp driver.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Adapt to notify api change in cio.
- Add missing unregister_reboot_notifier() for error path.
- Fix infinite error recovery escalation for certain port failures.
- Fix reference counting.
- Use GFP_ATOMIC for kmalloc while holding a spinlock.
- Don't open adapter/port if unit/port is removed from configuration after it has already been closed.
- Don't establish qdio queues if a port/unit is going to be removed.
- Shutdown ports and units before removing them.
- Use schedule_work for scsi_add_device.
- Don't reopen nameserver port when an rscn was received.
- Don't call scsi_done twice in zfcp_fsf_send_fcp_command_task_handler.
- Get rid of scsi fake queue, scsi_reqs_active and scsi_reqs_active_wq.
- Get rid of unused adapter status.
- Allow enabling of scsi devices at boot time with zfcp_dev parameter.
- Change name prefix from sg to sg_list for functions which work with the struct sg_list.
- Don't call scsi_add_device from zfcp error recovery thread to avoid a deadlock if a scsi command sent during scsi_add_device fails.
- Fix scsi i/o stall due to missing local-link-up event.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- ctc/iucv: Add module author/description/license to fsm.c
- ctc/lcs/iucv/qeth: Remove dst_link_failure calls because they can trigger a BUG in icmp.c.
- ctc/iucv/qeth: Use s390_root_dev_{register,unregister} to fix reference counting for the group device sysfs root object.
- ctc/lcs/qeth: Fix ccwgroup behaviour, remove should not imply offline.
- ctc: Adapt to notify api change in cio.
- ctc: Remove duplicate put_user.
- iucv: Fix oops with empty netiucv peer name.
- iucv: Use GFP_ATOMIC for kmalloc from tasklet.
- iucv: Fix removal of attributes.
- qeth: Use correct length in clearing of MAC address.
- qeth: Queue multicast and broadcast packets into the last queue on HiperSocket.
- qeth: Reenable send control data after i/o error.
- qeth: Find correct recbuf in qeth_send_control_data.
- qeth: Handle VM startlan disabled.
- qeth: Set flags for vipa entries.
- qeth: Correct netmask on vipa setting.
- qeth: Fix spinlock problems ("scheduling while atomic").
- qeth: Avoid setting multicast IP addresses several times.
- qeth: Fix /proc/qeth format.
- qeth: Fix race on device removal.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Add module license gpl.
- Add debug messages.
- Make blocksize persistent after close. Limit blocksize to 64k.
- Check tape state against TS_INIT/TS_UNUSED for special case of medium sense and assign.
- Assign tapes as soon as they are set online and unassign when set offline.
- Correct implementation of MT_EOD.
- Add backward compatible tape.agent hotplug support (to be removed as soon as a full blown tape class is implemented).
- Add state to differentiate between character device and block device access.
- Make tape block device usable.
- Add 34xx seek speedup code.
- Fix device reference counting.
- Fix online-offline-online cycle.
- Add timeout to standard assign function.
- Correct calculation of device index in tape_get_device().
- Check idal buffer for fixed block size reads and writes.
- Adapt to notify api change in cio.
- Add sysfs attributes for tape state, first minor, current operation and current blocksize.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Fix interrupt status examination.
- Make dasd device attributes dependent on the devmap structure instead of the device structure to make them persistent and to be able to modify them in the offline state.
- Allow changing the readonly attribute while dasd is online.
- Add (diag) option to dasd= parameter.
- Add missing spin_lock_init call.
- Increase ref_count in dasd_device_from_cdev and add matching dasd_put_device pairs.
- Adapt to notify api change in cio.
- Fix bug in 3990 error recovery for cable pulls on ESS.
- Replace kmap by page_address (no highmem on s/390).
- Set correct default cache mode on ESS for eckd devices.
- Change dasd names from "dasdx" to "dasd_<busid>_".
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- 3215: Adapt to notify api change in cio.
- 3215/sclp: move copy_from_user out of locked code.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Make blacklist busid-aware. Add "all" keyword and ! operator to cio_ignore kernel parameter.
- Add state change notify function for ccw devices (not mandatory) and introduce the "device disconnected" state.
- Remove auto offline from remove function for ccw devices to be able to distinguish between user initiated offline and implicit offline due to device removal.
- Store pointer to subchannel structure in the (hardware) subchannel intparm and remove the ioinfo array (hurray...). Remove intparm parameter of cio_start.
- Use busid instead of subchannel number for debug output.
- Use an opm mask to track which paths are logically online for a subchannel.
- Pathgroup every device it was requested for, even single path devices.
- Give i/o on a logically switched off path a grace period to complete, then kill the i/o to get the path offline.
- Correctly initialize all spin_locks with spin_lock_init.
- Handle status pending/busy while disabling subchannel.
- Set busid already in cio_validate_subchannel.
- Add s390_root_dev_{register,unregister} functions.
- Do stcrw() inside a kernel thread. Add crw overflow handling.
- Use subchannel lock directly instead of ccw device lock pointer in ccw_device_recognition to avoid accessing an already freed structure.
- Take/release ccw device lock in ccw_device_console_enable.
- Don't wipe out the busid field in ccw_device_console_enable.
- Call ccw_device_unregister() directly on a notoper event; delaying it via queue_work is harmful (subchannel may be removed before ccw_device).
- Handle not operational condition in ccw_device_cancel_halt_clear.
- Correct status pending handling: don't collect pending status directly but wait for the interrupt to show up.
- Enable subchannel when trying a steal lock operation.
- Introduce doverify bit for delayed path verification.
- Fix locking in __ccw_device_retry_loop/read_conf_data/read_dev/chars.
- Make SPID retry mechanism more obvious.
- qdio: check return code of ccw_device_{halt,clear} in qdio_cleanup. Don't try to wait for an interrupt we won't get.
- qdio: fix shared indicators.
- qdio: add code to handle i/o killed by cio with active queues.
- qdio: don't do a shutdown on timeout in interrupt context.
- Update cio documentation.
-
Andrew Morton authored
From: Martin Schwidefsky <schwidefsky@de.ibm.com>
- Add console_unblank in machine_{restart,halt,power_off} to get all messages on the screen.
- Set console_irq to -1 if condev= parameter is present.
- Fix write_trylock for 64 bit.
- Fix svc restarting.
- System call number on 64 bit is an int. Fix compare in entry64.S.
- Fix tlb flush problem.
- Use the idte instruction to flush tlbs of a particular mm.
- Fix ptrace.
- Add fadvise64_64 system call wrapper.
- Fix pfault handling.
- Do not clobber _PAGE_INVALID_NONE pages in pte_wrprotect.
- Fix siginfo_t size problem (needs to be 128 for s390x, not 136).
- Avoid direct assignment to tsk->state, use __set_task_state.
- Always make any pending restarted system call return -EINTR.
- Add panic_on_oops.
- Display symbol for psw address in show_trace.
- Don't discard sections .exit.text, .exit.data and .eh_frame, otherwise stabs information for kerntypes will get lost.
- Add memory clobber to assembler inline in ip_fast_checksum for gcc 3.3.
- Fix softirq_pending calls for the current cpu (cpu == smp_processor_id()).
- Remove BUG_ON in irq_enter. Two irq_enters are possible.
-
Andrew Morton authored
From: gerg@snapgear.com Implement a null find_extend_vma() function for non-MMU architectures. It is called from a couple of places, so needs to be present.
-
Andrew Morton authored
From: gerg@snapgear.com Remove m68knommu types.h, use m68k types.h instead. At this level there is no difference between the basic m68k types and the m68knommu types, no point having 2 versions of the same file.
-
Andrew Morton authored
From: gerg@snapgear.com Fix cpu stats code to match changes to higher level kstat data structure for m68knommu ColdFire CPU architectures. This fixes all ColdFire sub-architecture CPU types.
-
Andrew Morton authored
From: gerg@snapgear.com Remove include of non-existent net/module.h in m68knommu architecture specific checksum code.
-
Andrew Morton authored
From: gerg@snapgear.com Implement the sched_clock() function for m68knommu architectures.
-
Andrew Morton authored
From: gerg@snapgear.com Add module support for m68knommu architecture.
-
Andrew Morton authored
From: gerg@snapgear.com Allow for building of module support for m68knommu architecture.
-
Andrew Morton authored
From: Trond Myklebust <trond.myklebust@fys.uio.no> Enabling rpc_debug can currently result in an Oops due to an incorrect pointer check.
-
Andrew Morton authored
From: Trond Myklebust <trond.myklebust@fys.uio.no> Fix a bug in the NFS write code whereby writepage() may end up deadlocking on clear_inode().
-
Andrew Morton authored
From: Trond Myklebust <trond.myklebust@fys.uio.no> The nfs_permission() code needs to check for "local" mount flags such as "ro" *before* it decides to optimize away any permissions tests.
-
Andrew Morton authored
From: Trond Myklebust <trond.myklebust@fys.uio.no> The following patch fixes a bug when initializing the intent structure in sys_uselib(): intents use the FMODE_READ convention rather than O_RDONLY. It also adds a missing open intent to open_exec(). This ensures that NFS clients will do the necessary close-to-open data cache consistency checking.
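The difference between the two conventions can be shown with a small sketch. The constants below are local copies chosen to mirror the usual Linux values (O_RDONLY is 0, FMODE_READ is 1). The `toy_` names and the conversion helper are assumptions for illustration, not the code from the patch.

```c
#include <assert.h>

/* open(2)-style access modes: an enumeration, read-only is zero. */
enum { TOY_O_RDONLY = 0, TOY_O_WRONLY = 1, TOY_O_RDWR = 2, TOY_O_ACCMODE = 3 };

/* FMODE-style access modes: a bit mask, one bit per direction. */
enum { TOY_FMODE_READ = 1, TOY_FMODE_WRITE = 2 };

/*
 * Convert an open(2)-style access mode to the bit mask form.
 * Adding one maps 0/1/2 onto READ, WRITE and READ|WRITE.
 */
static int toy_open_flags_to_fmode(int flags)
{
    return (flags & TOY_O_ACCMODE) + 1;
}
```

Because the read-only mode is zero in the first convention, passing it where a bit mask is expected looks like "no access requested at all", which is the kind of mistake the message describes.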
-
Andrew Morton authored
Currently, when calling nfs_commit_file(), we check the range argument, and only commit NFS write requests that fall within the given range. This is silly, since all servers use fsync() to honour a COMMIT call, and so will sync all pending writes to stable storage. The following patch ensures that if at least one NFS write falls within the range specified by the call to nfs_commit_file(), then we commit all outstanding writes on that file. This fixes a sometimes severe inefficiency when combining reads and writes: nfs_wb_page() is used to clear out writes prior to scheduling a read(), and can end up calling COMMIT for each page to be read.
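The new policy can be modelled in a few lines. The sketch below is a toy under stated assumptions: the request structure, the overlap test, and the function names are hypothetical and do not come from the NFS client. It only demonstrates the rule "one hit in the range commits everything outstanding".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical pending write request covering [start, start + len). */
struct toy_req {
    unsigned long start;
    unsigned long len;
    bool committed;
};

static bool toy_overlaps(const struct toy_req *r,
                         unsigned long start, unsigned long len)
{
    return r->start < start + len && start < r->start + r->len;
}

/*
 * If at least one uncommitted request overlaps the given range, commit
 * every uncommitted request on the file. Returns how many were committed.
 */
static size_t toy_commit_file(struct toy_req *reqs, size_t n,
                              unsigned long start, unsigned long len)
{
    bool hit = false;
    size_t done = 0;

    for (size_t i = 0; i < n && !hit; i++)
        hit = !reqs[i].committed && toy_overlaps(&reqs[i], start, len);
    if (!hit)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (!reqs[i].committed) {
            reqs[i].committed = true;
            done++;
        }
    }
    return done;
}
```

With three pending pages, a commit aimed at the middle page clears all three, so a following per-page call for the neighbours finds nothing left to do.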
-
Andrew Morton authored
From: Trond Myklebust <trond.myklebust@fys.uio.no> If users set the execute bit on a file, and then write to it, remove_suid() causes a flood of SETATTR calls (one per write() syscall) with no arguments to be sent down the wire. The server will in any case clear the suid bit itself without any prompting from us, so the following patch simply filters away all SETATTR requests with empty or unsupported ia_valid fields.
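A filter of this kind might look like the following sketch. The flag values, the choice of which attributes count as supported, and the function name are assumptions made for illustration. The real ia_valid bits and the set the NFS client supports are defined in the kernel and may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical attribute bits in the style of ia_valid. */
enum {
    TOY_ATTR_MODE      = 1 << 0,
    TOY_ATTR_UID       = 1 << 1,
    TOY_ATTR_GID       = 1 << 2,
    TOY_ATTR_SIZE      = 1 << 3,
    TOY_ATTR_KILL_SUID = 1 << 11   /* hint only, nothing to send */
};

/* Bits this toy client knows how to put on the wire. */
#define TOY_ATTR_SUPPORTED \
    (TOY_ATTR_MODE | TOY_ATTR_UID | TOY_ATTR_GID | TOY_ATTR_SIZE)

/* A request with no supported bits set needs no SETATTR call at all. */
static bool toy_setattr_needs_rpc(unsigned int ia_valid)
{
    return (ia_valid & TOY_ATTR_SUPPORTED) != 0;
}
```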
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch implements two new access controls for SELinux: SEND_MSG and RECV_MSG, providing mediation of network packets based on destination port (IPv4 only at this stage).
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch is a rework of the skb audit logging code in SELinux. Rather than relying on skb header pointers, it parses the skb for specific protocols (TCP and UDP for IPv4 at this stage). This is safer for the case of locally generated raw packets, which can be malformed. It also now takes fragmented skbs into account. The new code allows the caller to parse the skb so that parsed information can be more readily re-used.
-
Andrew Morton authored
From: Stephen Smalley <sds@epoch.ncsc.mil> Use obj-$(CONFIG_FOO) instead of `ifeq'.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch adds dname to audit output when a path cannot be generated. This makes analysis of SELinux audit logs easier. Patch by Stephen Smalley <sds@epoch.ncsc.mil>.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch adds a new option for Unix sockets, SO_PEERSEC, and an associated LSM hook, getpeersec. The SELinux handler is also included. The purpose of this is to allow applications to obtain each other's security credentials, analogously to the existing SO_PEERCRED option. Examples of use are Security Enhanced D-BUS and Security Enhanced X. This patch was previously approved in principle by David, and has been updated with feedback from Chris Wright and extended to cover all architectures.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This is a cleanup for the SELinux code, which converts all remaining appropriate socket hooks over to using socket_has_perm().
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch adds a new SELinux access control, node_bind, which can be used to restrict the local IP address to which an application may bind.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch adds 'node' access controls for SELinux, which allows network traffic to be controlled on the basis of remote address. Like the previous patch, similar functionality was present in earlier SELinux implementations; this is a rework within the constraints of the LSM hooks present in the mainline kernel.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch adds netif access controls for SELinux, which allows network traffic to be controlled on the basis of associated network interface. Similar functionality was present in earlier SELinux implementations; this is a rework within the constraints of the LSM hooks present in the mainline kernel.
-
Andrew Morton authored
From: James Morris <jmorris@redhat.com> This patch adds controls to the SELinux module over the setting and inheritance of resource limits. With these controls, the ability to set hard limits can be limited to specific processes such as login, and when an untrusted process invokes a more trusted program, soft limits can be reset, thereby avoiding failures in the trusted program due to malicious setting of the soft limit by the untrusted process. Roland McGrath provided input and feedback on the patch, which was implemented by Stephen Smalley <sds@epoch.ncsc.mil>.
-
Andrew Morton authored
From: Muli Ben-Yehuda <mulix@mulix.org> Yet another sound/oss/trident cleanup patch. This one replaces the TRDBG debugging macro with the standard pr_debug. Patch is from Eugene Teo <eugene.teo@eugeneteo.net>, slightly modified by me to apply against 2.6.0-rc1-mm1 with the other cleanup patches applied.
-
Andrew Morton authored
From: Muli Ben-Yehuda <mulix@mulix.org>
- switch lock_set_fmt() and unlock_set_fmt() from macros to inline functions. Macros that call return() are EVIL.
- simplify lock_set_fmt() and implement it via test_and_set_bit() rather than a spinlock protecting an int.
- fix a bug wherein we would do an up() on a semaphore that hasn't been down()ed if a signal happened after timeout in trident_write().
- fix a bug where we would not release the open_sem on OOM.
- make the arguments for prog_dmabuf clearer (int -> enum), and add two wrapper functions around it, one for record and one for playback.
- fix a bug where we would call VALIDATE_STATE after lock_kernel(). Since VALIDATE_STATE does 'return' if validation fails, bad things can happen. Thanks to Dawson Engler <engler@stanford.edu> and the Stanford checker for spotting.
- remove the calls to lock_kernel() from trident_release() and trident_mmap(). trident_release() appears to be covered by the open_sem, and trident_mmap() is covered by state->sem.
- s/TRUE/1/, s/FALSE/0/
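The first two items can be illustrated with a user-space sketch. It uses C11 `atomic_flag` as a stand-in for the kernel's test_and_set_bit(). The struct and function names are hypothetical and the driver's real code may differ. The helper is an inline function that reports failure through its return value, so the caller decides how to bail out instead of a macro returning on its behalf.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-device state holding one "format in use" flag. */
struct toy_state {
    atomic_flag fmt_in_use;
};

/*
 * Try to take the format lock. Returns true on success, false if
 * someone else already holds it. No hidden return in the caller.
 */
static inline bool toy_lock_set_fmt(struct toy_state *s)
{
    return !atomic_flag_test_and_set(&s->fmt_in_use);
}

static inline void toy_unlock_set_fmt(struct toy_state *s)
{
    atomic_flag_clear(&s->fmt_in_use);
}
```

A caller would write something like `if (!toy_lock_set_fmt(s)) return -EBUSY;` (with an error code of its choosing), which keeps the control flow visible at the call site.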
-