Commit 1f60ade2 authored by Linus Torvalds

Merge master.kernel.org:/home/mingo/bk-sched

into home.transmeta.com:/home/torvalds/v2.5/linux
parents 8509486a 3986594c
@@ -50,27 +50,27 @@ prototypes:
 	int (*removexattr) (struct dentry *, const char *);
 locking rules:
-	all may block
-		BKL	i_sem(inode)
-lookup:		no	yes
-create:		no	yes
-link:		no	yes (both)
-mknod:		no	yes
-symlink:	no	yes
-mkdir:		no	yes
-unlink:		no	yes (both)
-rmdir:		no	yes (both)	(see below)
-rename:		no	yes (all)	(see below)
-readlink:	no	no
-follow_link:	no	no
-truncate:	no	yes		(see below)
-setattr:	no	yes
-permission:	yes	no
-getattr:	no	no
-setxattr:	no	yes
-getxattr:	no	yes
-listxattr:	no	yes
-removexattr:	no	yes
+	all may block, none have BKL
+		i_sem(inode)
+lookup:		yes
+create:		yes
+link:		yes (both)
+mknod:		yes
+symlink:	yes
+mkdir:		yes
+unlink:		yes (both)
+rmdir:		yes (both)	(see below)
+rename:		yes (all)	(see below)
+readlink:	no
+follow_link:	no
+truncate:	yes		(see below)
+setattr:	yes
+permission:	no
+getattr:	no
+setxattr:	yes
+getxattr:	yes
+listxattr:	yes
+removexattr:	yes
 	Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_sem on
 victim.
 	cross-directory ->rename() has (per-superblock) ->s_vfs_rename_sem.
......
@@ -81,9 +81,9 @@ can relax your locking.
 [mandatory]
 	->lookup(), ->truncate(), ->create(), ->unlink(), ->mknod(), ->mkdir(),
-->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename() and ->readdir()
-are called without BKL now. Grab it on the entry, drop upon return - that
-will guarantee the same locking you used to have. If your method or its
+->rmdir(), ->link(), ->lseek(), ->symlink(), ->rename(), ->permission()
+and ->readdir() are called without BKL now. Grab it on entry, drop upon return
+- that will guarantee the same locking you used to have. If your method or its
 parts do not need BKL - better yet, now you can shift lock_kernel() and
 unlock_kernel() so that they would protect exactly what needs to be
 protected.
......
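The "grab it on entry, drop upon return" advice above, as a minimal C sketch.
The method name foo_lookup() and its helper do_foo_lookup() are hypothetical,
not from this tree:

--------------------------------------------------------------
#include <linux/fs.h>
#include <linux/smp_lock.h>

/* Hypothetical ->lookup() that preserves the old BKL semantics by
 * taking the lock on entry and dropping it on return. */
static struct dentry *foo_lookup(struct inode *dir, struct dentry *dentry)
{
	struct dentry *res;

	lock_kernel();				/* what the VFS used to do for us */
	res = do_foo_lookup(dir, dentry);	/* the real work, assumed elsewhere */
	unlock_kernel();
	return res;
}
--------------------------------------------------------------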
@@ -948,120 +948,43 @@ program to load modules on demand.
 -----------------------------------------------
 The files in this directory can be used to tune the operation of the virtual
-memory (VM) subsystem of the Linux kernel. In addition, one of the files
-(bdflush) has some influence on disk usage.
-bdflush
--------
-This file controls the operation of the bdflush kernel daemon. It currently
-contains nine integer values, six of which are actually used by the kernel.
-They are listed in table 2-2.
-Table 2-2: Parameters in /proc/sys/vm/bdflush
-..............................................................................
- Value      Meaning
- nfract     Percentage of buffer cache dirty to activate bdflush
- ndirty     Maximum number of dirty blocks to write out per wake-cycle
- nrefill    Number of clean buffers to try to obtain each time we call refill
- nref_dirt  buffer threshold for activating bdflush when trying to refill
-            buffers.
- dummy      Unused
- age_buffer Time for normal buffer to age before we flush it
- age_super  Time for superblock to age before we flush it
- dummy      Unused
- dummy      Unused
-..............................................................................
-nfract
-------
-This parameter governs the maximum number of dirty buffers in the buffer
-cache. Dirty means that the contents of the buffer still have to be written to
-disk (as opposed to a clean buffer, which can just be forgotten about).
-Setting this to a higher value means that Linux can delay disk writes for a
-long time, but it also means that it will have to do a lot of I/O at once when
-memory becomes short. A lower value will spread out disk I/O more evenly.
-ndirty
-------
-Ndirty gives the maximum number of dirty buffers that bdflush can write to the
-disk at one time. A high value will mean delayed, bursty I/O, while a small
-value can lead to memory shortage when bdflush isn't woken up often enough.
-nrefill
--------
-This is the number of buffers that bdflush will add to the list of free
-buffers when refill_freelist() is called. It is necessary to allocate free
-buffers beforehand, since the buffers are often different sizes than the
-memory pages and some bookkeeping needs to be done beforehand. The higher the
-number, the more memory will be wasted and the less often refill_freelist()
-will need to run.
-nref_dirt
----------
-When refill_freelist() comes across more than nref_dirt dirty buffers, it will
-wake up bdflush.
-age_buffer and age_super
-------------------------
-Finally, the age_buffer and age_super parameters govern the maximum time Linux
-waits before writing out a dirty buffer to disk. The value is expressed in
-jiffies (clockticks), the number of jiffies per second is 100. Age_buffer is
-the maximum age for data blocks, while age_super is for filesystems meta data.
-buffermem
----------
-The three values in this file control how much memory should be used for
-buffer memory. The percentage is calculated as a percentage of total system
-memory.
-The values are:
-min_percent
------------
-This is the minimum percentage of memory that should be spent on buffer
-memory.
-borrow_percent
---------------
-When Linux is short on memory, and the buffer cache uses more than it has been
-allotted, the memory management (MM) subsystem will prune the buffer cache
-more heavily than other memory to compensate.
-max_percent
------------
-This is the maximum amount of memory that can be used for buffer memory.
-freepages
----------
-This file contains three values: min, low and high:
-min
----
-When the number of free pages in the system reaches this number, only the
-kernel can allocate more memory.
-low
----
-If the number of free pages falls below this point, the kernel starts swapping
-aggressively.
-high
-----
-The kernel tries to keep up to this amount of memory free; if memory falls
-below this point, the kernel starts gently swapping in the hopes that it never
-has to do really aggressive swapping.
+memory (VM) subsystem of the Linux kernel.
+dirty_background_ratio
+----------------------
+Contains, as a percentage of total system memory, the number of pages at which
+the pdflush background writeback daemon will start writing out dirty data.
+dirty_async_ratio
+-----------------
+Contains, as a percentage of total system memory, the number of pages at which
+a process which is generating disk writes will itself start writing out dirty
+data.
+dirty_sync_ratio
+----------------
+Contains, as a percentage of total system memory, the number of pages at which
+a process which is generating disk writes will itself start writing out dirty
+data and waiting upon completion of that writeout.
+dirty_writeback_centisecs
+-------------------------
+The pdflush writeback daemons will periodically wake up and write `old' data
+out to disk. This tunable expresses the interval between those wakeups, in
+100'ths of a second.
+dirty_expire_centisecs
+----------------------
+This tunable is used to define when dirty data is old enough to be eligible
+for writeout by the pdflush daemons. It is expressed in 100'ths of a second.
+Data which has been dirty in-memory for longer than this interval will be
+written out next time a pdflush daemon wakes up.
 kswapd
 ------
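A small userspace sketch showing how the dirty_* tunables documented above can
be read; it assumes only that each file holds one decimal integer, as
described:

--------------------------------------------------------------
#include <stdio.h>

/* Read one integer tunable from /proc/sys/vm into *val; 0 on success. */
static int read_vm_tunable(const char *name, int *val)
{
	char path[128];
	FILE *f;
	int ok;

	snprintf(path, sizeof(path), "/proc/sys/vm/%s", name);
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", val) == 1) ? 0 : -1;
	fclose(f);
	return ok;
}

int main(void)
{
	int v;

	if (read_vm_tunable("dirty_background_ratio", &v) == 0)
		printf("background writeback starts at %d%% of memory dirty\n", v);
	if (read_vm_tunable("dirty_writeback_centisecs", &v) == 0)
		printf("pdflush wakes every %d centisecs\n", v);
	return 0;
}
--------------------------------------------------------------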
@@ -1113,79 +1036,6 @@ On the other hand, enabling this feature can cause you to run out of memory
 and thrash the system to death, so large and/or important servers will want to
 set this value to 0.
-pagecache
----------
-This file does exactly the same job as buffermem, only this file controls the
-amount of memory allowed for memory mapping and generic caching of files.
-You don't want the minimum level to be too low, otherwise your system might
-thrash when memory is tight or fragmentation is high.
-pagetable_cache
----------------
-The kernel keeps a number of page tables in a per-processor cache (this helps
-a lot on SMP systems). The cache size for each processor will be between the
-low and the high value.
-On a low-memory, single CPU system, you can safely set these values to 0 so
-you don't waste memory. It is used on SMP systems so that the system can
-perform fast pagetable allocations without having to acquire the kernel memory
-lock.
-For large systems, the settings are probably fine. For normal systems they
-won't hurt a bit. For small systems ( less than 16MB ram) it might be
-advantageous to set both values to 0.
-swapctl
--------
-This file contains no less than 8 variables. All of these values are used by
-kswapd.
-The first four variables
-* sc_max_page_age,
-* sc_page_advance,
-* sc_page_decline and
-* sc_page_initial_age
-are used to keep track of Linux's page aging. Page aging is a bookkeeping
-method to track which pages of memory are often used, and which pages can be
-swapped out without consequences.
-When a page is swapped in, it starts at sc_page_initial_age (default 3) and
-when the page is scanned by kswapd, its age is adjusted according to the
-following scheme:
-* If the page was used since the last time we scanned, its age is increased
-  by sc_page_advance (default 3). Where the maximum value is given by
-  sc_max_page_age (default 20).
-* Otherwise (meaning it wasn't used) its age is decreased by sc_page_decline
-  (default 1).
-When a page reaches age 0, it's ready to be swapped out.
-The variables sc_age_cluster_fract, sc_age_cluster_min, sc_pageout_weight and
-sc_bufferout_weight, can be used to control kswapd's aggressiveness in
-swapping out pages.
-Sc_age_cluster_fract is used to calculate how many pages from a process are to
-be scanned by kswapd. The formula used is
-	(sc_age_cluster_fract divided by 1024) times resident set size
-So if you want kswapd to scan the whole process, sc_age_cluster_fract needs to
-have a value of 1024. The minimum number of pages kswapd will scan is
-represented by sc_age_cluster_min, which is done so that kswapd will also scan
-small processes.
-The values of sc_pageout_weight and sc_bufferout_weight are used to control
-how many tries kswapd will make in order to swap out one page/buffer. These
-values can be used to fine-tune the ratio between user pages and buffer/cache
-memory. When you find that your Linux system is swapping out too many process
-pages in order to satisfy buffer memory demands, you may want to either
-increase sc_bufferout_weight, or decrease the value of sc_pageout_weight.
 2.5 /proc/sys/dev - Device specific parameters
 ----------------------------------------------
......
@@ -9,116 +9,28 @@ This file contains the documentation for the sysctl files in
 /proc/sys/vm and is valid for Linux kernel version 2.2.
 The files in this directory can be used to tune the operation
-of the virtual memory (VM) subsystem of the Linux kernel, and
-one of the files (bdflush) also has a little influence on disk
-usage.
+of the virtual memory (VM) subsystem of the Linux kernel and
+the writeout of dirty data to disk.
 Default values and initialization routines for most of these
 files can be found in mm/swap.c.
 Currently, these files are in /proc/sys/vm:
-- bdflush
-- buffermem
-- freepages
 - kswapd
 - overcommit_memory
 - page-cluster
-- pagecache
-- pagetable_cache
+- dirty_async_ratio
+- dirty_background_ratio
+- dirty_expire_centisecs
+- dirty_sync_ratio
+- dirty_writeback_centisecs
 ==============================================================
-bdflush:
-This file controls the operation of the bdflush kernel
-daemon. The source code to this struct can be found in
-linux/fs/buffer.c. It currently contains 9 integer values,
-of which 4 are actually used by the kernel.
-From linux/fs/buffer.c:
---------------------------------------------------------------
-union bdflush_param {
-	struct {
-		int nfract;	/* Percentage of buffer cache dirty to
-				   activate bdflush */
-		int dummy1;	/* old "ndirty" */
-		int dummy2;	/* old "nrefill" */
-		int dummy3;	/* unused */
-		int interval;	/* jiffies delay between kupdate flushes */
-		int age_buffer;	/* Time for normal buffer to age */
-		int nfract_sync;/* Percentage of buffer cache dirty to
-				   activate bdflush synchronously */
-		int dummy4;	/* unused */
-		int dummy5;	/* unused */
-	} b_un;
-	unsigned int data[N_PARAM];
-} bdf_prm = {{30, 64, 64, 256, 5*HZ, 30*HZ, 60, 0, 0}};
---------------------------------------------------------------
-int nfract:
-The first parameter governs the maximum number of dirty
-buffers in the buffer cache. Dirty means that the contents
-of the buffer still have to be written to disk (as opposed
-to a clean buffer, which can just be forgotten about).
-Setting this to a high value means that Linux can delay disk
-writes for a long time, but it also means that it will have
-to do a lot of I/O at once when memory becomes short. A low
-value will spread out disk I/O more evenly, at the cost of
-more frequent I/O operations. The default value is 30%,
-the minimum is 0%, and the maximum is 100%.
-int interval:
-The fifth parameter, interval, is the minimum rate at
-which kupdate will wake and flush. The value is expressed in
-jiffies (clockticks), the number of jiffies per second is
-normally 100 (Alpha is 1024). Thus, x*HZ is x seconds. The
-default value is 5 seconds, the minimum is 0 seconds, and the
-maximum is 600 seconds.
-int age_buffer:
-The sixth parameter, age_buffer, governs the maximum time
-Linux waits before writing out a dirty buffer to disk. The
-value is in jiffies. The default value is 30 seconds,
-the minimum is 1 second, and the maximum 6,000 seconds.
-int nfract_sync:
-The seventh parameter, nfract_sync, governs the percentage
-of buffer cache that is dirty before bdflush activates
-synchronously. This can be viewed as the hard limit before
-bdflush forces buffers to disk. The default is 60%, the
-minimum is 0%, and the maximum is 100%.
-==============================================================
-buffermem:
-The three values in this file correspond to the values in
-the struct buffer_mem. It controls how much memory should
-be used for buffer memory. The percentage is calculated
-as a percentage of total system memory.
-The values are:
-min_percent    -- this is the minimum percentage of memory
-                  that should be spent on buffer memory
-borrow_percent -- UNUSED
-max_percent    -- UNUSED
-==============================================================
-freepages:
-This file contains the values in the struct freepages. That
-struct contains three members: min, low and high.
-The meaning of the numbers is:
-freepages.min	When the number of free pages in the system
-		reaches this number, only the kernel can
-		allocate more memory.
-freepages.low	If the number of free pages gets below this
-		point, the kernel starts swapping aggressively.
-freepages.high	The kernel tries to keep up to this amount of
-		memory free; if memory comes below this point,
-		the kernel gently starts swapping in the hopes
-		that it never has to do real aggressive swapping.
+dirty_async_ratio, dirty_background_ratio, dirty_expire_centisecs,
+dirty_sync_ratio dirty_writeback_centisecs:
+See Documentation/filesystems/proc.txt
 ==============================================================
@@ -180,38 +92,3 @@ The number of pages the kernel reads in at once is equal to
 2 ^ page-cluster. Values above 2 ^ 5 don't make much sense
 for swap because we only cluster swap data in 32-page groups.
-==============================================================
-pagecache:
-This file does exactly the same as buffermem, only this
-file controls the struct page_cache, and thus controls
-the amount of memory used for the page cache.
-In 2.2, the page cache is used for 3 main purposes:
-- caching read() data from files
-- caching mmap()ed data and executable files
-- swap cache
-When your system is both deep in swap and high on cache,
-it probably means that a lot of the swapped data is being
-cached, making for more efficient swapping than possible
-with the 2.0 kernel.
-==============================================================
-pagetable_cache:
-The kernel keeps a number of page tables in a per-processor
-cache (this helps a lot on SMP systems). The cache size for
-each processor will be between the low and the high value.
-On a low-memory, single CPU system you can safely set these
-values to 0 so you don't waste the memory. On SMP systems it
-is used so that the system can do fast pagetable allocations
-without having to acquire the kernel memory lock.
-For large systems, the settings are probably OK. For normal
-systems they won't hurt a bit. For small systems (<16MB ram)
-it might be advantageous to set both values to 0.
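A quick worked example of the 2 ^ page-cluster formula above (page_cluster = 3
is only an assumed setting, not a value from this commit):

--------------------------------------------------------------
#include <stdio.h>

int main(void)
{
	int page_cluster = 3;		/* assumed setting */
	int pages = 1 << page_cluster;	/* 2 ^ page-cluster */

	/* With 4 KB pages: 8 pages = 32 KB per swap read */
	printf("%d pages (%d KB with 4K pages)\n", pages, pages * 4);
	return 0;
}
--------------------------------------------------------------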
@@ -48,6 +48,8 @@
 #include "proto.h"
 #include "irq_impl.h"
+u64 jiffies_64;
 extern rwlock_t xtime_lock;
 extern unsigned long wall_jiffies;	/* kernel/timer.c */
......
@@ -32,6 +32,8 @@
 #include <asm/irq.h>
 #include <asm/leds.h>
+u64 jiffies_64;
 extern rwlock_t xtime_lock;
 extern unsigned long wall_jiffies;
......
@@ -44,6 +44,8 @@
 #include <asm/svinto.h>
+u64 jiffies_64;
 static int have_rtc;	/* used to remember if we have an RTC or not */
 /* define this if you need to use print_timestamp */
......
@@ -360,8 +360,9 @@ void __global_cli(void)
 	__save_flags(flags);
 	if (flags & (1 << EFLAGS_IF_SHIFT)) {
-		int cpu = smp_processor_id();
+		int cpu;
 		__cli();
+		cpu = smp_processor_id();
 		if (!local_irq_count(cpu))
 			get_irqlock(cpu);
 	}
@@ -369,11 +370,12 @@ void __global_cli(void)
 void __global_sti(void)
 {
-	int cpu = smp_processor_id();
+	int cpu = get_cpu();
 	if (!local_irq_count(cpu))
 		release_irqlock(cpu);
 	__sti();
+	put_cpu();
 }
 /*
......
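The __global_sti() change above is the get_cpu()/put_cpu() pattern: with a
preemptible kernel a task may migrate between CPUs, so a CPU number is only
stable while preemption is disabled. A hedged sketch (the per-CPU helper is
hypothetical):

--------------------------------------------------------------
/* get_cpu() disables preemption and returns the CPU id;
 * put_cpu() re-enables preemption. Between the two, the task
 * cannot migrate, so 'cpu' stays valid. */
static void touch_this_cpu(void)
{
	int cpu = get_cpu();	/* preemption disabled from here on */

	do_per_cpu_work(cpu);	/* hypothetical per-CPU helper */
	put_cpu();		/* preemption enabled again */
}
--------------------------------------------------------------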
@@ -65,6 +65,7 @@
 */
 #include <linux/irq.h>
+u64 jiffies_64;
 unsigned long cpu_khz;	/* Detected as we calibrate the TSC */
......
@@ -9,6 +9,7 @@
 O_TARGET := mm.o
-obj-y	 := init.o fault.o ioremap.o extable.o
+obj-y	 := init.o fault.o ioremap.o extable.o pageattr.o
+export-objs := pageattr.o
 include $(TOPDIR)/Rules.make
@@ -10,12 +10,13 @@
 #include <linux/vmalloc.h>
 #include <linux/init.h>
+#include <linux/slab.h>
 #include <asm/io.h>
 #include <asm/pgalloc.h>
 #include <asm/fixmap.h>
 #include <asm/cacheflush.h>
 #include <asm/tlbflush.h>
+#include <asm/pgtable.h>
 static inline void remap_area_pte(pte_t * pte, unsigned long address, unsigned long size,
 	unsigned long phys_addr, unsigned long flags)
@@ -155,6 +156,7 @@ void * __ioremap(unsigned long phys_addr, unsigned long size, unsigned long flag
 	area = get_vm_area(size, VM_IOREMAP);
 	if (!area)
 		return NULL;
+	area->phys_addr = phys_addr;
 	addr = area->addr;
 	if (remap_area_pages(VMALLOC_VMADDR(addr), phys_addr, size, flags)) {
 		vfree(addr);
@@ -163,10 +165,71 @@ void * __ioremap(unsigned long phys_addr, unsigned long size, unsigned long flag
 	return (void *) (offset + (char *)addr);
 }
+/**
+ * ioremap_nocache     -   map bus memory into CPU space
+ * @offset:    bus address of the memory
+ * @size:      size of the resource to map
+ *
+ * ioremap_nocache performs a platform specific sequence of operations to
+ * make bus memory CPU accessible via the readb/readw/readl/writeb/
+ * writew/writel functions and the other mmio helpers. The returned
+ * address is not guaranteed to be usable directly as a virtual
+ * address.
+ *
+ * This version of ioremap ensures that the memory is marked uncachable
+ * on the CPU as well as honouring existing caching rules from things like
+ * the PCI bus. Note that there are other caches and buffers on many
+ * busses. In particular driver authors should read up on PCI writes
+ *
+ * It's useful if some control registers are in such an area and
+ * write combining or read caching is not desirable:
+ *
+ * Must be freed with iounmap.
+ */
+void *ioremap_nocache (unsigned long phys_addr, unsigned long size)
+{
+	void *p = __ioremap(phys_addr, size, _PAGE_PCD);
+	if (!p)
+		return p;
+	if (phys_addr + size < virt_to_phys(high_memory)) {
+		struct page *ppage = virt_to_page(__va(phys_addr));
+		unsigned long npages = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		BUG_ON(phys_addr+size > (unsigned long)high_memory);
+		BUG_ON(phys_addr + size < phys_addr);
+		if (change_page_attr(ppage, npages, PAGE_KERNEL_NOCACHE) < 0) {
+			iounmap(p);
+			p = NULL;
+		}
+	}
+	return p;
+}
 void iounmap(void *addr)
 {
-	if (addr > high_memory)
-		return vfree((void *) (PAGE_MASK & (unsigned long) addr));
+	struct vm_struct *p;
+	if (addr < high_memory)
+		return;
+	p = remove_kernel_area(addr);
+	if (!p) {
+		printk("__iounmap: bad address %p\n", addr);
+		return;
+	}
+	BUG_ON(p->phys_addr == 0); /* not allocated with ioremap */
+	vmfree_area_pages(VMALLOC_VMADDR(p->addr), p->size);
+	if (p->flags && p->phys_addr < virt_to_phys(high_memory)) {
+		change_page_attr(virt_to_page(__va(p->phys_addr)),
+				 p->size >> PAGE_SHIFT,
+				 PAGE_KERNEL);
+	}
+	kfree(p);
 }
 void __init *bt_ioremap(unsigned long phys_addr, unsigned long size)
......
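A sketch of how a driver might use the new ioremap_nocache()/iounmap() pair
documented above; the physical address and register offset are invented for
illustration:

--------------------------------------------------------------
#include <asm/io.h>

#define FOO_REGS_PHYS	0xfebf0000UL	/* hypothetical MMIO region */
#define FOO_REGS_SIZE	0x1000
#define FOO_REG_ENABLE	0x04		/* hypothetical register */

static void *foo_regs;

static int foo_map(void)
{
	foo_regs = ioremap_nocache(FOO_REGS_PHYS, FOO_REGS_SIZE);
	if (!foo_regs)
		return -ENOMEM;
	writel(1, foo_regs + FOO_REG_ENABLE);	/* uncached MMIO write */
	return 0;
}

static void foo_unmap(void)
{
	iounmap(foo_regs);	/* required pairing, per the comment above */
}
--------------------------------------------------------------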
/*
* Copyright 2002 Andi Kleen, SuSE Labs.
* Thanks to Ben LaHaise for precious feedback.
*/
#include <linux/config.h>
#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/highmem.h>
#include <linux/module.h>
#include <linux/slab.h>
#include <asm/uaccess.h>
#include <asm/processor.h>
static inline pte_t *lookup_address(unsigned long address)
{
pgd_t *pgd = pgd_offset_k(address);
pmd_t *pmd = pmd_offset(pgd, address);
if (pmd_large(*pmd))
return (pte_t *)pmd;
return pte_offset_kernel(pmd, address);
}
static struct page *split_large_page(unsigned long address, pgprot_t prot)
{
int i;
unsigned long addr;
struct page *base = alloc_pages(GFP_KERNEL, 0);
pte_t *pbase;
if (!base)
return NULL;
address = __pa(address);
addr = address & LARGE_PAGE_MASK;
pbase = (pte_t *)page_address(base);
for (i = 0; i < PTRS_PER_PTE; i++, addr += PAGE_SIZE) {
pbase[i] = pfn_pte(addr >> PAGE_SHIFT,
addr == address ? prot : PAGE_KERNEL);
}
return base;
}
static void flush_kernel_map(void *dummy)
{
/* Could use CLFLUSH here if the CPU supports it (Hammer,P4) */
if (boot_cpu_data.x86_model >= 4)
asm volatile("wbinvd":::"memory");
/* Flush all to work around Errata in early athlons regarding
* large page flushing.
*/
__flush_tlb_all();
}
static void set_pmd_pte(pte_t *kpte, unsigned long address, pte_t pte)
{
set_pte_atomic(kpte, pte); /* change init_mm */
#ifndef CONFIG_X86_PAE
{
struct list_head *l;
spin_lock(&mmlist_lock);
list_for_each(l, &init_mm.mmlist) {
struct mm_struct *mm = list_entry(l, struct mm_struct, mmlist);
pmd_t *pmd = pmd_offset(pgd_offset(mm, address), address);
set_pte_atomic((pte_t *)pmd, pte);
}
spin_unlock(&mmlist_lock);
}
#endif
}
/*
* No more special protections in this 2/4MB area - revert to a
* large page again.
*/
static inline void revert_page(struct page *kpte_page, unsigned long address)
{
pte_t *linear = (pte_t *)
pmd_offset(pgd_offset(&init_mm, address), address);
set_pmd_pte(linear, address,
pfn_pte((__pa(address) & LARGE_PAGE_MASK) >> PAGE_SHIFT,
PAGE_KERNEL_LARGE));
}
static int
__change_page_attr(struct page *page, pgprot_t prot, struct page **oldpage)
{
pte_t *kpte;
unsigned long address;
struct page *kpte_page;
#ifdef CONFIG_HIGHMEM
if (page >= highmem_start_page)
BUG();
#endif
address = (unsigned long)page_address(page);
kpte = lookup_address(address);
kpte_page = virt_to_page(((unsigned long)kpte) & PAGE_MASK);
if (pgprot_val(prot) != pgprot_val(PAGE_KERNEL)) {
if ((pte_val(*kpte) & _PAGE_PSE) == 0) {
pte_t old = *kpte;
pte_t standard = mk_pte(page, PAGE_KERNEL);
set_pte_atomic(kpte, mk_pte(page, prot));
if (pte_same(old,standard))
atomic_inc(&kpte_page->count);
} else {
struct page *split = split_large_page(address, prot);
if (!split)
return -ENOMEM;
set_pmd_pte(kpte,address,mk_pte(split, PAGE_KERNEL));
}
} else if ((pte_val(*kpte) & _PAGE_PSE) == 0) {
set_pte_atomic(kpte, mk_pte(page, PAGE_KERNEL));
atomic_dec(&kpte_page->count);
}
if (cpu_has_pse && (atomic_read(&kpte_page->count) == 1)) {
*oldpage = kpte_page;
revert_page(kpte_page, address);
}
return 0;
}
static inline void flush_map(void)
{
#ifdef CONFIG_SMP
smp_call_function(flush_kernel_map, NULL, 1, 1);
#endif
flush_kernel_map(NULL);
}
struct deferred_page {
struct deferred_page *next;
struct page *fpage;
};
static struct deferred_page *df_list; /* protected by init_mm.mmap_sem */
/*
* Change the page attributes of an page in the linear mapping.
*
* This should be used when a page is mapped with a different caching policy
* than write-back somewhere - some CPUs do not like it when mappings with
* different caching policies exist. This changes the page attributes of the
* in kernel linear mapping too.
*
* The caller needs to ensure that there are no conflicting mappings elsewhere.
* This function only deals with the kernel linear map.
*
* Caller must call global_flush_tlb() after this.
*/
int change_page_attr(struct page *page, int numpages, pgprot_t prot)
{
int err = 0;
struct page *fpage;
int i;
down_write(&init_mm.mmap_sem);
for (i = 0; i < numpages; i++, page++) {
fpage = NULL;
err = __change_page_attr(page, prot, &fpage);
if (err)
break;
if (fpage) {
struct deferred_page *df;
df = kmalloc(sizeof(struct deferred_page), GFP_KERNEL);
if (!df) {
flush_map();
__free_page(fpage);
} else {
df->next = df_list;
df->fpage = fpage;
df_list = df;
}
}
}
up_write(&init_mm.mmap_sem);
return err;
}
void global_flush_tlb(void)
{
struct deferred_page *df, *next_df;
down_read(&init_mm.mmap_sem);
df = xchg(&df_list, NULL);
up_read(&init_mm.mmap_sem);
flush_map();
for (; df; df = next_df) {
next_df = df->next;
if (df->fpage)
__free_page(df->fpage);
kfree(df);
}
}
EXPORT_SYMBOL(change_page_attr);
EXPORT_SYMBOL(global_flush_tlb);
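A sketch of the calling convention change_page_attr()'s comment requires,
remapping a single page uncached and then reverting it; 'buf' stands for some
valid lowmem kernel buffer and is only an assumption:

--------------------------------------------------------------
/* Sketch only: error handling trimmed. */
static void make_page_uncached_and_back(void *buf)
{
	struct page *pg = virt_to_page(buf);

	if (change_page_attr(pg, 1, PAGE_KERNEL_NOCACHE) == 0) {
		global_flush_tlb();	/* mandatory after change_page_attr() */
		/* ... use the page with its new caching policy ... */
		change_page_attr(pg, 1, PAGE_KERNEL);	/* revert */
		global_flush_tlb();
	}
}
--------------------------------------------------------------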
@@ -27,6 +27,8 @@ extern rwlock_t xtime_lock;
 extern unsigned long wall_jiffies;
 extern unsigned long last_time_offset;
+u64 jiffies_64;
 #ifdef CONFIG_IA64_DEBUG_IRQ
 unsigned long last_cli_ip;
......
@@ -24,6 +24,7 @@
 #include <linux/timex.h>
+u64 jiffies_64;
 static inline int set_rtc_mmss(unsigned long nowtime)
 {
......
@@ -32,6 +32,8 @@
 #define USECS_PER_JIFFY (1000000/HZ)
 #define USECS_PER_JIFFY_FRAC ((1000000ULL << 32) / HZ & 0xffffffff)
+u64 jiffies_64;
 /*
 * forward reference
 */
......
@@ -32,6 +32,8 @@
 #include <asm/sysmips.h>
 #include <asm/uaccess.h>
+u64 jiffies_64;
 extern asmlinkage void syscall_trace(void);
 asmlinkage int sys_pipe(abi64_no_regargs, struct pt_regs regs)
......
@@ -30,6 +30,8 @@
 #include <linux/timex.h>
+u64 jiffies_64;
 extern rwlock_t xtime_lock;
 static int timer_value;
......
@@ -70,6 +70,9 @@
 #include <asm/time.h>
+/* XXX false sharing with below? */
+u64 jiffies_64;
 unsigned long disarm_decr[NR_CPUS];
 extern int do_sys_settimeofday(struct timeval *tv, struct timezone *tz);
......
@@ -64,6 +64,8 @@
 void smp_local_timer_interrupt(struct pt_regs *);
+u64 jiffies_64;
 /* keep track of when we need to update the rtc */
 time_t last_rtc_update;
 extern rwlock_t xtime_lock;
......
@@ -39,6 +39,8 @@
 #define TICK_SIZE tick
+u64 jiffies_64;
 static ext_int_info_t ext_int_info_timer;
 static uint64_t init_timer_cc;
......
@@ -39,6 +39,8 @@
 #define TICK_SIZE tick
+u64 jiffies_64;
 static ext_int_info_t ext_int_info_timer;
 static uint64_t init_timer_cc;
......
@@ -70,6 +70,8 @@
 #endif /* CONFIG_CPU_SUBTYPE_ST40STB1 */
 #endif /* __sh3__ or __SH4__ */
+u64 jiffies_64;
 extern rwlock_t xtime_lock;
 extern unsigned long wall_jiffies;
 #define TICK_SIZE tick
......
@@ -43,6 +43,8 @@
 extern rwlock_t xtime_lock;
+u64 jiffies_64;
 enum sparc_clock_type sp_clock_typ;
 spinlock_t mostek_lock = SPIN_LOCK_UNLOCKED;
 unsigned long mstk48t02_regs = 0UL;
......
@@ -44,6 +44,8 @@ unsigned long mstk48t02_regs = 0UL;
 unsigned long ds1287_regs = 0UL;
 #endif
+u64 jiffies_64;
 static unsigned long mstk48t08_regs = 0UL;
 static unsigned long mstk48t59_regs = 0UL;
......
@@ -43,15 +43,9 @@ CFLAGS += -mcmodel=kernel
 CFLAGS += -pipe
 # this makes reading assembly source easier
 CFLAGS += -fno-reorder-blocks
-# needed for later gcc 3.1
 CFLAGS += -finline-limit=2000
-# needed for earlier gcc 3.1
-#CFLAGS += -fno-strength-reduce
 #CFLAGS += -g
-# prevent gcc from keeping the stack 16 byte aligned (FIXME)
-#CFLAGS += -mpreferred-stack-boundary=2
 HEAD := arch/x86_64/kernel/head.o arch/x86_64/kernel/head64.o arch/x86_64/kernel/init_task.o
 SUBDIRS := arch/x86_64/tools $(SUBDIRS) arch/x86_64/kernel arch/x86_64/mm arch/x86_64/lib
......
@@ -21,10 +21,6 @@ ROOT_DEV := CURRENT
 SVGA_MODE := -DSVGA_MODE=NORMAL_VGA
-# If you want the RAM disk device, define this to be the size in blocks.
-RAMDISK := -DRAMDISK=512
 # ---------------------------------------------------------------------------
 BOOT_INCL = $(TOPDIR)/include/linux/config.h \
......
@@ -47,8 +47,7 @@ define_bool CONFIG_EISA n
 define_bool CONFIG_X86_IO_APIC y
 define_bool CONFIG_X86_LOCAL_APIC y
-#currently broken:
-#bool 'MTRR (Memory Type Range Register) support' CONFIG_MTRR
+bool 'MTRR (Memory Type Range Register) support' CONFIG_MTRR
 bool 'Symmetric multi-processing support' CONFIG_SMP
 if [ "$CONFIG_SMP" = "n" ]; then
 	bool 'Preemptible Kernel' CONFIG_PREEMPT
@@ -226,6 +225,7 @@ if [ "$CONFIG_DEBUG_KERNEL" != "n" ]; then
 	bool ' Spinlock debugging' CONFIG_DEBUG_SPINLOCK
 	bool ' Additional run-time checks' CONFIG_CHECKING
 	bool ' Debug __init statements' CONFIG_INIT_DEBUG
+	bool ' Spinlock debugging' CONFIG_DEBUG_SPINLOCK
 fi
 endmenu
......
@@ -9,8 +9,9 @@ export-objs := ia32_ioctl.o sys_ia32.o
 all: ia32.o
 O_TARGET := ia32.o
-obj-$(CONFIG_IA32_EMULATION) := ia32entry.o sys_ia32.o ia32_ioctl.o ia32_signal.o \
-	ia32_binfmt.o fpu32.o socket32.o ptrace32.o
+obj-$(CONFIG_IA32_EMULATION) := ia32entry.o sys_ia32.o ia32_ioctl.o \
+	ia32_signal.o \
+	ia32_binfmt.o fpu32.o socket32.o ptrace32.o ipc32.o
 clean::
......
@@ -14,6 +14,7 @@
 #include <linux/smp.h>
 #include <linux/smp_lock.h>
 #include <linux/stddef.h>
+#include <linux/slab.h>
 /* Set EXTENT bits starting at BASE in BITMAP to value TURN_ON. */
 static void set_bitmap(unsigned long *bitmap, short base, short extent, int new_value)
@@ -61,27 +62,19 @@ asmlinkage int sys_ioperm(unsigned long from, unsigned long num, int turn_on)
 		return -EINVAL;
 	if (turn_on && !capable(CAP_SYS_RAWIO))
 		return -EPERM;
-	/*
-	 * If it's the first ioperm() call in this thread's lifetime, set the
-	 * IO bitmap up. ioperm() is much less timing critical than clone(),
-	 * this is why we delay this operation until now:
-	 */
-	if (!t->ioperm) {
-		/*
-		 * just in case ...
-		 */
-		memset(t->io_bitmap,0xff,(IO_BITMAP_SIZE+1)*4);
-		t->ioperm = 1;
-		/*
-		 * this activates it in the TSS
-		 */
+	if (!t->io_bitmap_ptr) {
+		t->io_bitmap_ptr = kmalloc((IO_BITMAP_SIZE+1)*4, GFP_KERNEL);
+		if (!t->io_bitmap_ptr)
+			return -ENOMEM;
+		memset(t->io_bitmap_ptr,0xff,(IO_BITMAP_SIZE+1)*4);
 		tss->io_map_base = IO_BITMAP_OFFSET;
 	}
 	/*
 	 * do it in the per-thread copy and in the TSS ...
 	 */
-	set_bitmap((unsigned long *) t->io_bitmap, from, num, !turn_on);
+	set_bitmap((unsigned long *) t->io_bitmap_ptr, from, num, !turn_on);
 	set_bitmap((unsigned long *) tss->io_bitmap, from, num, !turn_on);
 	return 0;
......
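For context, the syscall patched above is what backs the glibc ioperm(3)
wrapper. A small, hedged userspace example (needs CAP_SYS_RAWIO; port 0x80 is
the traditional POST port, and the value written is arbitrary):

--------------------------------------------------------------
#include <stdio.h>
#include <sys/io.h>

int main(void)
{
	if (ioperm(0x80, 1, 1)) {	/* enable access to one port */
		perror("ioperm");
		return 1;
	}
	outb(0x42, 0x80);		/* harmless write to port 0x80 */
	ioperm(0x80, 1, 0);		/* drop access again */
	return 0;
}
--------------------------------------------------------------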
@@ -39,6 +39,7 @@
 #include <linux/reboot.h>
 #include <linux/init.h>
 #include <linux/ctype.h>
+#include <linux/slab.h>
 #include <asm/uaccess.h>
 #include <asm/pgtable.h>
@@ -320,9 +321,6 @@ void show_regs(struct pt_regs * regs)
 	printk("CR2: %016lx CR3: %016lx CR4: %016lx\n", cr2, cr3, cr4);
 }
-#define __STR(x) #x
-#define __STR2(x) __STR(x)
 extern void load_gs_index(unsigned);
 /*
@@ -330,7 +328,13 @@ extern void load_gs_index(unsigned);
 */
 void exit_thread(void)
 {
-	/* nothing to do ... */
+	struct task_struct *me = current;
+	if (me->thread.io_bitmap_ptr) {
+		kfree(me->thread.io_bitmap_ptr);
+		me->thread.io_bitmap_ptr = NULL;
+		(init_tss + smp_processor_id())->io_map_base =
+			INVALID_IO_BITMAP_OFFSET;
+	}
 }
 void flush_thread(void)
@@ -392,6 +396,14 @@ int copy_thread(int nr, unsigned long clone_flags, unsigned long rsp,
 	unlazy_fpu(current);
 	p->thread.i387 = current->thread.i387;
+	if (unlikely(me->thread.io_bitmap_ptr != NULL)) {
+		p->thread.io_bitmap_ptr = kmalloc((IO_BITMAP_SIZE+1)*4, GFP_KERNEL);
+		if (!p->thread.io_bitmap_ptr)
+			return -ENOMEM;
+		memcpy(p->thread.io_bitmap_ptr, me->thread.io_bitmap_ptr,
+		       (IO_BITMAP_SIZE+1)*4);
+	}
 	return 0;
 }
@@ -491,21 +503,14 @@ void __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/*
 	 * Handle the IO bitmap
 	 */
-	if (unlikely(prev->ioperm || next->ioperm)) {
-		if (next->ioperm) {
+	if (unlikely(prev->io_bitmap_ptr || next->io_bitmap_ptr)) {
+		if (next->io_bitmap_ptr) {
 			/*
 			 * 4 cachelines copy ... not good, but not that
 			 * bad either. Anyone got something better?
 			 * This only affects processes which use ioperm().
-			 * [Putting the TSSs into 4k-tlb mapped regions
-			 * and playing VM tricks to switch the IO bitmap
-			 * is not really acceptable.]
-			 * On x86-64 we could put multiple bitmaps into
-			 * the GDT and just switch offsets
-			 * This would require ugly special cases on overflow
-			 * though -AK
 			 */
-			memcpy(tss->io_bitmap, next->io_bitmap,
+			memcpy(tss->io_bitmap, next->io_bitmap_ptr,
 			       IO_BITMAP_SIZE*sizeof(u32));
 			tss->io_map_base = IO_BITMAP_OFFSET;
 		} else {
......
@@ -91,6 +91,9 @@ void pda_init(int cpu)
 	pda->me = pda;
 	pda->cpudata_offset = 0;
+	pda->active_mm = &init_mm;
+	pda->mmu_state = 0;
 	asm volatile("movl %0,%%fs ; movl %0,%%gs" :: "r" (0));
 	wrmsrl(MSR_GS_BASE, cpu_pda + cpu);
 }
......
@@ -84,7 +84,6 @@ struct rt_sigframe
 	char *pretcode;
 	struct ucontext uc;
 	struct siginfo info;
-	struct _fpstate fpstate;
 };
 static int
@@ -186,8 +185,7 @@ asmlinkage long sys_rt_sigreturn(struct pt_regs regs)
 */
 static int
-setup_sigcontext(struct sigcontext *sc, struct _fpstate *fpstate,
-		 struct pt_regs *regs, unsigned long mask)
+setup_sigcontext(struct sigcontext *sc, struct pt_regs *regs, unsigned long mask)
 {
 	int tmp, err = 0;
 	struct task_struct *me = current;
@@ -221,20 +219,17 @@ setup_sigcontext(struct sigcontext *sc, struct _fpstate *fpstate,
 	err |= __put_user(mask, &sc->oldmask);
 	err |= __put_user(me->thread.cr2, &sc->cr2);
-	tmp = save_i387(fpstate);
-	if (tmp < 0)
-		err = 1;
-	else
-		err |= __put_user(tmp ? fpstate : NULL, &sc->fpstate);
 	return err;
 }
 /*
 * Determine which stack to use..
 */
-static inline struct rt_sigframe *
-get_sigframe(struct k_sigaction *ka, struct pt_regs * regs)
+#define round_down(p, r) ((void *) ((unsigned long)((p) - (r) + 1) & ~((r)-1)))
+static void *
+get_stack(struct k_sigaction *ka, struct pt_regs *regs, unsigned long size)
 {
 	unsigned long rsp;
@@ -247,22 +242,34 @@ get_sigframe(struct k_sigaction *ka, struct pt_regs * regs)
 		rsp = current->sas_ss_sp + current->sas_ss_size;
 	}
-	rsp = (rsp - sizeof(struct _fpstate)) & ~(15UL);
-	rsp -= offsetof(struct rt_sigframe, fpstate);
-	return (struct rt_sigframe *) rsp;
+	return round_down(rsp - size, 16);
 }
 static void setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 			   sigset_t *set, struct pt_regs * regs)
 {
-	struct rt_sigframe *frame;
+	struct rt_sigframe *frame = NULL;
+	struct _fpstate *fp = NULL;
 	int err = 0;
-	frame = get_sigframe(ka, regs);
+	if (current->used_math) {
+		fp = get_stack(ka, regs, sizeof(struct _fpstate));
+		frame = round_down((char *)fp - sizeof(struct rt_sigframe), 16) - 8;
+		if (!access_ok(VERIFY_WRITE, fp, sizeof(struct _fpstate))) {
+			goto give_sigsegv;
+		}
+		if (save_i387(fp) < 0)
+			err |= -1;
+	}
+	if (!frame)
+		frame = get_stack(ka, regs, sizeof(struct rt_sigframe)) - 8;
-	if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame)))
+	if (!access_ok(VERIFY_WRITE, frame, sizeof(*frame))) {
 		goto give_sigsegv;
+	}
 	if (ka->sa.sa_flags & SA_SIGINFO) {
 		err |= copy_siginfo_to_user(&frame->info, info);
@@ -278,14 +285,10 @@ static void setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 	err |= __put_user(sas_ss_flags(regs->rsp),
 			  &frame->uc.uc_stack.ss_flags);
 	err |= __put_user(current->sas_ss_size, &frame->uc.uc_stack.ss_size);
-	err |= setup_sigcontext(&frame->uc.uc_mcontext, &frame->fpstate,
-				regs, set->sig[0]);
+	err |= setup_sigcontext(&frame->uc.uc_mcontext, regs, set->sig[0]);
+	err |= __put_user(fp, &frame->uc.uc_mcontext.fpstate);
 	err |= __copy_to_user(&frame->uc.uc_sigmask, set, sizeof(*set));
-	if (err) {
-		goto give_sigsegv;
-	}
 	/* Set up to return from userspace. If provided, use a stub
 	   already in userspace. */
 	/* x86-64 should always use SA_RESTORER. */
@@ -297,7 +300,6 @@ static void setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 	}
 	if (err) {
-		printk("fault 3\n");
 		goto give_sigsegv;
 	}
@@ -305,7 +307,6 @@ static void setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 	printk("%d old rip %lx old rsp %lx old rax %lx\n", current->pid,regs->rip,regs->rsp,regs->rax);
 #endif
 	/* Set up registers for signal handler */
 	{
 		struct exec_domain *ed = current_thread_info()->exec_domain;
@@ -320,9 +321,10 @@ static void setup_rt_frame(int sig, struct k_sigaction *ka, siginfo_t *info,
 	   next argument after the signal number on the stack. */
 	regs->rsi = (unsigned long)&frame->info;
 	regs->rdx = (unsigned long)&frame->uc;
-	regs->rsp = (unsigned long) frame;
 	regs->rip = (unsigned long) ka->sa.sa_handler;
+	regs->rsp = (unsigned long)frame;
 	set_fs(USER_DS);
 	regs->eflags &= ~TF_MASK;
......
@@ -25,8 +25,6 @@
 /* The 'big kernel lock' */
 spinlock_t kernel_flag __cacheline_aligned_in_smp = SPIN_LOCK_UNLOCKED;
-struct tlb_state cpu_tlbstate[NR_CPUS] = {[0 ... NR_CPUS-1] = { &init_mm, 0 }};
 /*
 * the following functions deal with sending IPIs between CPUs.
 *
@@ -147,9 +145,9 @@ static spinlock_t tlbstate_lock = SPIN_LOCK_UNLOCKED;
 */
 static void inline leave_mm (unsigned long cpu)
 {
-	if (cpu_tlbstate[cpu].state == TLBSTATE_OK)
+	if (read_pda(mmu_state) == TLBSTATE_OK)
 		BUG();
-	clear_bit(cpu, &cpu_tlbstate[cpu].active_mm->cpu_vm_mask);
+	clear_bit(cpu, &read_pda(active_mm)->cpu_vm_mask);
 	__flush_tlb();
 }
@@ -164,18 +162,18 @@ static void inline leave_mm (unsigned long cpu)
 * the other cpus, but smp_invalidate_interrupt ignore flush ipis
 * for the wrong mm, and in the worst case we perform a superflous
 * tlb flush.
- * 1a2) set cpu_tlbstate to TLBSTATE_OK
+ * 1a2) set cpu mmu_state to TLBSTATE_OK
 *	Now the smp_invalidate_interrupt won't call leave_mm if cpu0
 *	was in lazy tlb mode.
- * 1a3) update cpu_tlbstate[].active_mm
+ * 1a3) update cpu active_mm
 *	Now cpu0 accepts tlb flushes for the new mm.
 * 1a4) set_bit(cpu, &new_mm->cpu_vm_mask);
 *	Now the other cpus will send tlb flush ipis.
 * 1a4) change cr3.
 * 1b) thread switch without mm change
- *	cpu_tlbstate[].active_mm is correct, cpu0 already handles
+ *	cpu active_mm is correct, cpu0 already handles
 *	flush ipis.
- * 1b1) set cpu_tlbstate to TLBSTATE_OK
+ * 1b1) set cpu mmu_state to TLBSTATE_OK
 * 1b2) test_and_set the cpu bit in cpu_vm_mask.
 *	Atomically set the bit [other cpus will start sending flush ipis],
 *	and test the bit.
@@ -188,7 +186,7 @@ static void inline leave_mm (unsigned long cpu)
 * runs in kernel space, the cpu could load tlb entries for user space
 * pages.
 *
- * The good news is that cpu_tlbstate is local to each cpu, no
+ * The good news is that cpu mmu_state is local to each cpu, no
 * write/read ordering problems.
 */
@@ -216,8 +214,8 @@ asmlinkage void smp_invalidate_interrupt (void)
 *	BUG();
 */
-	if (flush_mm == cpu_tlbstate[cpu].active_mm) {
-		if (cpu_tlbstate[cpu].state == TLBSTATE_OK) {
+	if (flush_mm == read_pda(active_mm)) {
+		if (read_pda(mmu_state) == TLBSTATE_OK) {
 			if (flush_va == FLUSH_ALL)
 				local_flush_tlb();
 			else
@@ -335,7 +333,7 @@ static inline void do_flush_tlb_all_local(void)
 	unsigned long cpu = smp_processor_id();
 	__flush_tlb_all();
-	if (cpu_tlbstate[cpu].state == TLBSTATE_LAZY)
+	if (read_pda(mmu_state) == TLBSTATE_LAZY)
 		leave_mm(cpu);
 }
......
@@ -47,7 +47,7 @@
 #define __vsyscall(nr) __attribute__ ((unused,__section__(".vsyscall_" #nr)))
-#define NO_VSYSCALL 1
+//#define NO_VSYSCALL 1
 #ifdef NO_VSYSCALL
 #include <asm/unistd.h>
......
@@ -189,3 +189,5 @@ EXPORT_SYMBOL_NOVERS(do_softirq_thunk);
 void out_of_line_bug(void);
 EXPORT_SYMBOL(out_of_line_bug);
+EXPORT_SYMBOL(init_level4_pgt);
@@ -12,7 +12,7 @@ obj-y = csum-partial.o csum-copy.o csum-wrappers.o delay.o \
 	thunk.o io.o clear_page.o copy_page.o
 obj-y += memcpy.o
 obj-y += memmove.o
-#obj-y += memset.o
+obj-y += memset.o
 obj-y += copy_user.o
 export-objs := io.o csum-wrappers.o csum-partial.o
......
-/* Copyright 2002 Andi Kleen, SuSE Labs */
-// #define FIX_ALIGNMENT 1
+/* Copyright 2002 Andi Kleen */
 /*
 * ISO C memset - set a memory block to a byte value.
@@ -11,51 +9,51 @@
 *
 * rax	original destination
 */
-	.globl ____memset
+	.globl __memset
+	.globl memset
 	.p2align
-____memset:
-	movq %rdi,%r10	/* save destination for return address */
-	movq %rdx,%r11	/* save count */
+memset:
+__memset:
+	movq %rdi,%r10
+	movq %rdx,%r11
 	/* expand byte value */
-	movzbl %sil,%ecx	/* zero extend char value */
-	movabs $0x0101010101010101,%rax /* expansion pattern */
-	mul %rcx	/* expand with rax, clobbers rdx */
-#ifdef FIX_ALIGNMENT
+	movzbl %sil,%ecx
+	movabs $0x0101010101010101,%rax
+	mul %rcx	/* with rax, clobbers rdx */
 	/* align dst */
 	movl %edi,%r9d
-	andl $7,%r9d	/* test unaligned bits */
+	andl $7,%r9d
 	jnz bad_alignment
 after_bad_alignment:
-#endif
-	movq %r11,%rcx	/* restore count */
-	shrq $6,%rcx	/* divide by 64 */
-	jz handle_tail	/* block smaller than 64 bytes? */
-	movl $64,%r8d	/* CSE loop block size */
+	movq %r11,%rcx
+	movl $64,%r8d
+	shrq $6,%rcx
+	jz handle_tail
 loop_64:
-	movnti %rax,0*8(%rdi)
-	movnti %rax,1*8(%rdi)
-	movnti %rax,2*8(%rdi)
-	movnti %rax,3*8(%rdi)
-	movnti %rax,4*8(%rdi)
-	movnti %rax,5*8(%rdi)
-	movnti %rax,6*8(%rdi)
-	movnti %rax,7*8(%rdi)	/* clear 64 byte blocks */
-	addq %r8,%rdi	/* increase pointer by 64 bytes */
-	loop loop_64	/* decrement rcx and if not zero loop */
+	movnti %rax,(%rdi)
+	movnti %rax,8(%rdi)
+	movnti %rax,16(%rdi)
+	movnti %rax,24(%rdi)
+	movnti %rax,32(%rdi)
+	movnti %rax,40(%rdi)
+	movnti %rax,48(%rdi)
+	movnti %rax,56(%rdi)
+	addq %r8,%rdi
+	loop loop_64
 	/* Handle tail in loops. The loops should be faster than hard
 	   to predict jump tables. */
 handle_tail:
 	movl %r11d,%ecx
-	andl $63,%ecx
-	jz handle_7
-	shrl $3,%ecx
+	andl $63&(~7),%ecx
+	shrl $3,%ecx
+	jz handle_7
 loop_8:
-	movnti %rax,(%rdi)	/* long words */
+	movnti %rax,(%rdi)
 	addq $8,%rdi
 	loop loop_8
@@ -64,22 +62,20 @@ handle_7:
 	andl $7,%ecx
 	jz ende
 loop_1:
-	movb %al,(%rdi)	/* bytes */
-	incq %rdi
+	movb %al,(%rdi)
+	addq $1,%rdi
 	loop loop_1
 ende:
 	movq %r10,%rax
 	ret
-#ifdef FIX_ALIGNMENT
 bad_alignment:
-	andq $-8,%r11	/* shorter than 8 bytes */
-	jz handle_7	/* if yes handle it in the tail code */
-	movnti %rax,(%rdi)	/* unaligned store of 8 bytes */
+	cmpq $7,%r11
+	jbe handle_7
+	movnti %rax,(%rdi)	/* unaligned store */
 	movq $8,%r8
-	subq %r9,%r8	/* compute alignment (8-misalignment) */
-	addq %r8,%rdi	/* fix destination */
-	subq %r8,%r11	/* fix count */
+	subq %r9,%r8
+	addq %r8,%rdi
+	subq %r8,%r11
 	jmp after_bad_alignment
-#endif
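The movabs/mul pair near the top of the routine is the usual byte-replication
trick: multiplying a byte by 0x0101010101010101 copies it into all eight byte
lanes of a 64-bit word, so the main loop can store eight bytes at a time. A
standalone C illustration:

--------------------------------------------------------------
#include <stdint.h>
#include <stdio.h>

/* Replicate one byte into every byte of a 64-bit word with a single
 * multiply; no carries occur because c < 0x100. */
static uint64_t expand_byte(unsigned char c)
{
	return (uint64_t)c * 0x0101010101010101ULL;
}

int main(void)
{
	printf("%016llx\n", (unsigned long long)expand_byte(0xab));
	/* prints: abababababababab */
	return 0;
}
--------------------------------------------------------------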
@@ -28,6 +28,7 @@
 #include <linux/types.h>
 #include <linux/blk.h>
 #include <linux/blkdev.h>
+#include <linux/bio.h>
 #include <linux/completion.h>
 #include <linux/delay.h>
 #include <linux/genhd.h>
......
@@ -30,6 +30,7 @@
 #include <linux/delay.h>
 #include <linux/major.h>
 #include <linux/fs.h>
+#include <linux/bio.h>
 #include <linux/blkpg.h>
 #include <linux/timer.h>
 #include <linux/proc_fs.h>
......
@@ -24,6 +24,7 @@
 #include <linux/version.h>
 #include <linux/types.h>
 #include <linux/pci.h>
+#include <linux/bio.h>
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/delay.h>
......
@@ -28,6 +28,7 @@
 #include <linux/fs.h>
 #include <linux/blkdev.h>
 #include <linux/elevator.h>
+#include <linux/bio.h>
 #include <linux/blk.h>
 #include <linux/config.h>
 #include <linux/module.h>
......
@@ -165,6 +165,7 @@ static int print_unex=1;
 #include <linux/errno.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
+#include <linux/bio.h>
 #include <linux/string.h>
 #include <linux/fcntl.h>
 #include <linux/delay.h>
......
@@ -18,6 +18,7 @@
 #include <linux/errno.h>
 #include <linux/string.h>
 #include <linux/config.h>
+#include <linux/bio.h>
 #include <linux/mm.h>
 #include <linux/swap.h>
 #include <linux/init.h>
@@ -2002,8 +2003,8 @@ int __init blk_dev_init(void)
 	queue_nr_requests = (total_ram >> 8) & ~15;	/* One per quarter-megabyte */
 	if (queue_nr_requests < 32)
 		queue_nr_requests = 32;
-	if (queue_nr_requests > 512)
-		queue_nr_requests = 512;
+	if (queue_nr_requests > 256)
+		queue_nr_requests = 256;
 	/*
 	 * Batch frees according to queue length
......
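
The queue_nr_requests hunk above only lowers the clamp ceiling from 512 to 256; the heuristic itself is unchanged. Here is a standalone sketch of that sizing rule, assuming (for the illustration only) that total_ram is in kilobytes:

#include <stdio.h>

/* One request per quarter megabyte, rounded down to a multiple of 16
 * and clamped to [32, 256]; 256 is the new ceiling in the diff. */
static unsigned int size_queue(unsigned long total_ram_kb)
{
    unsigned long n = (total_ram_kb >> 8) & ~15UL;

    if (n < 32)
        n = 32;
    if (n > 256)
        n = 256;
    return (unsigned int)n;
}

int main(void)
{
    printf("8 MB   -> %u requests\n", size_queue(8UL * 1024));     /* 32 */
    printf("128 MB -> %u requests\n", size_queue(128UL * 1024));   /* clamped to 256 */
    return 0;
}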
...@@ -60,6 +60,7 @@ ...@@ -60,6 +60,7 @@
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/file.h> #include <linux/file.h>
#include <linux/bio.h>
#include <linux/stat.h> #include <linux/stat.h>
#include <linux/errno.h> #include <linux/errno.h>
#include <linux/major.h> #include <linux/major.h>
...@@ -168,6 +169,15 @@ static void figure_loop_size(struct loop_device *lo) ...@@ -168,6 +169,15 @@ static void figure_loop_size(struct loop_device *lo)
} }
static inline int lo_do_transfer(struct loop_device *lo, int cmd, char *rbuf,
char *lbuf, int size, int rblock)
{
if (!lo->transfer)
return 0;
return lo->transfer(lo, cmd, rbuf, lbuf, size, rblock);
}
static int static int
do_lo_send(struct loop_device *lo, struct bio_vec *bvec, int bsize, loff_t pos) do_lo_send(struct loop_device *lo, struct bio_vec *bvec, int bsize, loff_t pos)
{ {
...@@ -454,20 +464,43 @@ static struct bio *loop_get_buffer(struct loop_device *lo, struct bio *rbh) ...@@ -454,20 +464,43 @@ static struct bio *loop_get_buffer(struct loop_device *lo, struct bio *rbh)
out_bh: out_bh:
bio->bi_sector = rbh->bi_sector + (lo->lo_offset >> 9); bio->bi_sector = rbh->bi_sector + (lo->lo_offset >> 9);
bio->bi_rw = rbh->bi_rw; bio->bi_rw = rbh->bi_rw;
spin_lock_irq(&lo->lo_lock);
bio->bi_bdev = lo->lo_device; bio->bi_bdev = lo->lo_device;
spin_unlock_irq(&lo->lo_lock);
return bio; return bio;
} }
static int loop_make_request(request_queue_t *q, struct bio *rbh) static int
bio_transfer(struct loop_device *lo, struct bio *to_bio,
struct bio *from_bio)
{
unsigned long IV = loop_get_iv(lo, from_bio->bi_sector);
struct bio_vec *from_bvec, *to_bvec;
char *vto, *vfrom;
int ret = 0, i;
__bio_for_each_segment(from_bvec, from_bio, i, 0) {
to_bvec = &to_bio->bi_io_vec[i];
kmap(from_bvec->bv_page);
kmap(to_bvec->bv_page);
vfrom = page_address(from_bvec->bv_page) + from_bvec->bv_offset;
vto = page_address(to_bvec->bv_page) + to_bvec->bv_offset;
ret |= lo_do_transfer(lo, bio_data_dir(to_bio), vto, vfrom,
from_bvec->bv_len, IV);
kunmap(from_bvec->bv_page);
kunmap(to_bvec->bv_page);
}
return ret;
}
static int loop_make_request(request_queue_t *q, struct bio *old_bio)
{ {
struct bio *bh = NULL; struct bio *new_bio = NULL;
struct loop_device *lo; struct loop_device *lo;
unsigned long IV; unsigned long IV;
int rw = bio_rw(rbh); int rw = bio_rw(old_bio);
int unit = minor(to_kdev_t(rbh->bi_bdev->bd_dev)); int unit = minor(to_kdev_t(old_bio->bi_bdev->bd_dev));
if (unit >= max_loop) if (unit >= max_loop)
goto out; goto out;
...@@ -489,60 +522,41 @@ static int loop_make_request(request_queue_t *q, struct bio *rbh) ...@@ -489,60 +522,41 @@ static int loop_make_request(request_queue_t *q, struct bio *rbh)
goto err; goto err;
} }
blk_queue_bounce(q, &rbh); blk_queue_bounce(q, &old_bio);
/* /*
* file backed, queue for loop_thread to handle * file backed, queue for loop_thread to handle
*/ */
if (lo->lo_flags & LO_FLAGS_DO_BMAP) { if (lo->lo_flags & LO_FLAGS_DO_BMAP) {
loop_add_bio(lo, rbh); loop_add_bio(lo, old_bio);
return 0; return 0;
} }
/* /*
* piggy old buffer on original, and submit for I/O * piggy old buffer on original, and submit for I/O
*/ */
bh = loop_get_buffer(lo, rbh); new_bio = loop_get_buffer(lo, old_bio);
IV = loop_get_iv(lo, rbh->bi_sector); IV = loop_get_iv(lo, old_bio->bi_sector);
if (rw == WRITE) { if (rw == WRITE) {
if (lo_do_transfer(lo, WRITE, bio_data(bh), bio_data(rbh), if (bio_transfer(lo, new_bio, old_bio))
bh->bi_size, IV))
goto err; goto err;
} }
generic_make_request(bh); generic_make_request(new_bio);
return 0; return 0;
err: err:
if (atomic_dec_and_test(&lo->lo_pending)) if (atomic_dec_and_test(&lo->lo_pending))
up(&lo->lo_bh_mutex); up(&lo->lo_bh_mutex);
loop_put_buffer(bh); loop_put_buffer(new_bio);
out: out:
bio_io_error(rbh); bio_io_error(old_bio);
return 0; return 0;
inactive: inactive:
spin_unlock_irq(&lo->lo_lock); spin_unlock_irq(&lo->lo_lock);
goto out; goto out;
} }
static int do_bio_blockbacked(struct loop_device *lo, struct bio *bio,
struct bio *rbh)
{
unsigned long IV = loop_get_iv(lo, rbh->bi_sector);
struct bio_vec *from;
char *vto, *vfrom;
int ret = 0, i;
bio_for_each_segment(from, rbh, i) {
vfrom = page_address(from->bv_page) + from->bv_offset;
vto = page_address(bio->bi_io_vec[i].bv_page) + bio->bi_io_vec[i].bv_offset;
ret |= lo_do_transfer(lo, bio_data_dir(bio), vto, vfrom,
from->bv_len, IV);
}
return ret;
}
static inline void loop_handle_bio(struct loop_device *lo, struct bio *bio) static inline void loop_handle_bio(struct loop_device *lo, struct bio *bio)
{ {
int ret; int ret;
...@@ -556,7 +570,7 @@ static inline void loop_handle_bio(struct loop_device *lo, struct bio *bio) ...@@ -556,7 +570,7 @@ static inline void loop_handle_bio(struct loop_device *lo, struct bio *bio)
} else { } else {
struct bio *rbh = bio->bi_private; struct bio *rbh = bio->bi_private;
ret = do_bio_blockbacked(lo, bio, rbh); ret = bio_transfer(lo, bio, rbh);
bio_endio(rbh, !ret); bio_endio(rbh, !ret);
loop_put_buffer(bio); loop_put_buffer(bio);
...@@ -588,10 +602,8 @@ static int loop_thread(void *data) ...@@ -588,10 +602,8 @@ static int loop_thread(void *data)
set_user_nice(current, -20); set_user_nice(current, -20);
spin_lock_irq(&lo->lo_lock);
lo->lo_state = Lo_bound; lo->lo_state = Lo_bound;
atomic_inc(&lo->lo_pending); atomic_inc(&lo->lo_pending);
spin_unlock_irq(&lo->lo_lock);
/* /*
* up sem, we are running * up sem, we are running
......
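
The new bio_transfer() above folds the old inline WRITE-path transfer and do_bio_blockbacked() into one helper that walks the source and destination bios segment by segment and ORs the per-segment results, with lo_do_transfer() falling through cheaply when no transfer function is set. A stripped-down userspace model of that walk follows; seg and xfer_fn are invented stand-ins for bio_vec and lo->transfer, and the per-page kmap/kunmap is omitted.

#include <stdio.h>
#include <string.h>

struct seg { char *buf; size_t len; };      /* stand-in for bio_vec */

typedef int (*xfer_fn)(char *to, const char *from, size_t len);

/* stand-in for the lo->transfer == NULL case: plain copy, no transform */
static int plain_xfer(char *to, const char *from, size_t len)
{
    memcpy(to, from, len);
    return 0;
}

/* walk paired segments and OR the results, as bio_transfer() does */
static int transfer_all(struct seg *to, const struct seg *from, int nsegs,
                        xfer_fn fn)
{
    int ret = 0, i;

    for (i = 0; i < nsegs; i++)
        ret |= fn(to[i].buf, from[i].buf, from[i].len);
    return ret;
}

int main(void)
{
    char a[] = "abcd", b[sizeof(a)];
    char c[] = "efgh", d[sizeof(c)];
    struct seg from[2] = { { a, sizeof(a) }, { c, sizeof(c) } };
    struct seg to[2]   = { { b, sizeof(b) }, { d, sizeof(d) } };

    if (transfer_all(to, from, 2, plain_xfer) == 0)
        printf("%s %s\n", b, d);            /* abcd efgh */
    return 0;
}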
...@@ -39,6 +39,7 @@ ...@@ -39,6 +39,7 @@
#include <linux/init.h> #include <linux/init.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/bio.h>
#include <linux/stat.h> #include <linux/stat.h>
#include <linux/errno.h> #include <linux/errno.h>
#include <linux/file.h> #include <linux/file.h>
......
...@@ -45,6 +45,8 @@ ...@@ -45,6 +45,8 @@
#include <linux/config.h> #include <linux/config.h>
#include <linux/string.h> #include <linux/string.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <asm/atomic.h>
#include <linux/bio.h>
#include <linux/module.h> #include <linux/module.h>
#include <linux/init.h> #include <linux/init.h>
#include <linux/devfs_fs_kernel.h> #include <linux/devfs_fs_kernel.h>
......
...@@ -37,6 +37,7 @@ ...@@ -37,6 +37,7 @@
#include <linux/config.h> #include <linux/config.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/bio.h>
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/mman.h> #include <linux/mman.h>
......
...@@ -118,8 +118,8 @@ struct agp_bridge_data { ...@@ -118,8 +118,8 @@ struct agp_bridge_data {
int (*remove_memory) (agp_memory *, off_t, int); int (*remove_memory) (agp_memory *, off_t, int);
agp_memory *(*alloc_by_type) (size_t, int); agp_memory *(*alloc_by_type) (size_t, int);
void (*free_by_type) (agp_memory *); void (*free_by_type) (agp_memory *);
unsigned long (*agp_alloc_page) (void); void *(*agp_alloc_page) (void);
void (*agp_destroy_page) (unsigned long); void (*agp_destroy_page) (void *);
int (*suspend)(void); int (*suspend)(void);
void (*resume)(void); void (*resume)(void);
......
...@@ -252,6 +252,7 @@ ...@@ -252,6 +252,7 @@
#include <linux/poll.h> #include <linux/poll.h>
#include <linux/init.h> #include <linux/init.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/tqueue.h>
#include <asm/processor.h> #include <asm/processor.h>
#include <asm/uaccess.h> #include <asm/uaccess.h>
......
...@@ -345,7 +345,8 @@ int ata_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned ...@@ -345,7 +345,8 @@ int ata_ioctl(struct inode *inode, struct file *file, unsigned int cmd, unsigned
if (!arg) { if (!arg) {
if (ide_spin_wait_hwgroup(drive)) if (ide_spin_wait_hwgroup(drive))
return -EBUSY; return -EBUSY;
else /* Do nothing, just unlock */
spin_unlock_irq(drive->channel->lock);
return 0; return 0;
} }
......
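
The two lines added in ata_ioctl() above repair a lock imbalance: judging by the added spin_unlock_irq(), ide_spin_wait_hwgroup() returns with the channel lock held on success and without it on failure, so even the do-nothing success path must unlock. A small pthread model of that pattern, with invented names; the mutex stands in for the spinlock:

#include <errno.h>
#include <pthread.h>

static pthread_mutex_t channel_lock = PTHREAD_MUTEX_INITIALIZER;

/* invented stand-in: succeed with the lock held, fail without it */
static int spin_wait_hwgroup(void)
{
    pthread_mutex_lock(&channel_lock);
    return 0;
}

static int ioctl_no_arg(void)
{
    if (spin_wait_hwgroup())
        return -EBUSY;                  /* failure path: lock not held */
    /* do nothing, just unlock: the path the diff fixes */
    pthread_mutex_unlock(&channel_lock);
    return 0;
}

int main(void)
{
    return ioctl_no_arg() || ioctl_no_arg();  /* second call would hang if leaked */
}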
...@@ -20,7 +20,7 @@ ...@@ -20,7 +20,7 @@
#include <linux/raid/md.h> #include <linux/raid/md.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/bio.h>
#include <linux/raid/linear.h> #include <linux/raid/linear.h>
#define MAJOR_NR MD_MAJOR #define MAJOR_NR MD_MAJOR
......
...@@ -224,7 +224,7 @@ static inline void invalidate_snap_cache(unsigned long start, unsigned long nr, ...@@ -224,7 +224,7 @@ static inline void invalidate_snap_cache(unsigned long start, unsigned long nr,
for (i = 0; i < nr; i++) for (i = 0; i < nr; i++)
{ {
bh = get_hash_table(dev, start++, blksize); bh = find_get_block(dev, start++, blksize);
if (bh) if (bh)
bforget(bh); bforget(bh);
} }
......
...@@ -209,6 +209,7 @@ ...@@ -209,6 +209,7 @@
#include <linux/hdreg.h> #include <linux/hdreg.h>
#include <linux/stat.h> #include <linux/stat.h>
#include <linux/fs.h> #include <linux/fs.h>
#include <linux/bio.h>
#include <linux/proc_fs.h> #include <linux/proc_fs.h>
#include <linux/blkdev.h> #include <linux/blkdev.h>
#include <linux/genhd.h> #include <linux/genhd.h>
......
...@@ -33,6 +33,7 @@ ...@@ -33,6 +33,7 @@
#include <linux/linkage.h> #include <linux/linkage.h>
#include <linux/raid/md.h> #include <linux/raid/md.h>
#include <linux/sysctl.h> #include <linux/sysctl.h>
#include <linux/bio.h>
#include <linux/raid/xor.h> #include <linux/raid/xor.h>
#include <linux/devfs_fs_kernel.h> #include <linux/devfs_fs_kernel.h>
......
...@@ -23,6 +23,7 @@ ...@@ -23,6 +23,7 @@
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/spinlock.h> #include <linux/spinlock.h>
#include <linux/raid/multipath.h> #include <linux/raid/multipath.h>
#include <linux/bio.h>
#include <linux/buffer_head.h> #include <linux/buffer_head.h>
#include <asm/atomic.h> #include <asm/atomic.h>
......
...@@ -20,6 +20,7 @@ ...@@ -20,6 +20,7 @@
#include <linux/module.h> #include <linux/module.h>
#include <linux/raid/raid0.h> #include <linux/raid/raid0.h>
#include <linux/bio.h>
#define MAJOR_NR MD_MAJOR #define MAJOR_NR MD_MAJOR
#define MD_DRIVER #define MD_DRIVER
......
...@@ -23,6 +23,7 @@ ...@@ -23,6 +23,7 @@
*/ */
#include <linux/raid/raid1.h> #include <linux/raid/raid1.h>
#include <linux/bio.h>
#define MAJOR_NR MD_MAJOR #define MAJOR_NR MD_MAJOR
#define MD_DRIVER #define MD_DRIVER
......
...@@ -20,6 +20,7 @@ ...@@ -20,6 +20,7 @@
#include <linux/module.h> #include <linux/module.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/raid/raid5.h> #include <linux/raid/raid5.h>
#include <linux/bio.h>
#include <asm/bitops.h> #include <asm/bitops.h>
#include <asm/atomic.h> #include <asm/atomic.h>
......
...@@ -210,3 +210,4 @@ EXPORT_SYMBOL(pci_match_device); ...@@ -210,3 +210,4 @@ EXPORT_SYMBOL(pci_match_device);
EXPORT_SYMBOL(pci_register_driver); EXPORT_SYMBOL(pci_register_driver);
EXPORT_SYMBOL(pci_unregister_driver); EXPORT_SYMBOL(pci_unregister_driver);
EXPORT_SYMBOL(pci_dev_driver); EXPORT_SYMBOL(pci_dev_driver);
EXPORT_SYMBOL(pci_bus_type);
...@@ -20,6 +20,7 @@ ...@@ -20,6 +20,7 @@
#include <linux/init.h> #include <linux/init.h>
#include <linux/pci.h> #include <linux/pci.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/tqueue.h>
#include <linux/interrupt.h> #include <linux/interrupt.h>
#include <pcmcia/ss.h> #include <pcmcia/ss.h>
......
...@@ -6,6 +6,7 @@ ...@@ -6,6 +6,7 @@
#include <linux/init.h> #include <linux/init.h>
#include <linux/pci.h> #include <linux/pci.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/tqueue.h>
#include <linux/interrupt.h> #include <linux/interrupt.h>
#include <linux/delay.h> #include <linux/delay.h>
#include <linux/module.h> #include <linux/module.h>
......
...@@ -2,7 +2,7 @@ This file contains brief information about the SCSI tape driver. ...@@ -2,7 +2,7 @@ This file contains brief information about the SCSI tape driver.
The driver is currently maintained by Kai Mäkisara (email The driver is currently maintained by Kai Mäkisara (email
Kai.Makisara@metla.fi) Kai.Makisara@metla.fi)
Last modified: Tue Jan 22 21:08:57 2002 by makisara Last modified: Tue Jun 18 18:13:50 2002 by makisara
BASICS BASICS
...@@ -105,15 +105,19 @@ The default is BSD semantics. ...@@ -105,15 +105,19 @@ The default is BSD semantics.
BUFFERING BUFFERING
Old text:

The driver uses tape buffers allocated either at system initialization
or at run-time when needed. One buffer is used for each open tape
device. The size of the buffers is selectable at compile and/or boot
time. The buffers are used to store the data being transferred to/from
the SCSI adapter. The following buffering options are selectable at
compile time and/or at run time (via ioctl):

Buffering of data across write calls in fixed block mode (define
ST_BUFFER_WRITES).

New text:

The driver uses tape buffers allocated at run-time when needed and
freed when the device file is closed. One buffer is used for each
open tape device.

The size of the buffers is always at least one tape block. In fixed
block mode, the minimum buffer size is defined (in 1024 byte units) by
ST_FIXED_BUFFER_BLOCKS. With small block sizes this allows buffering of
several blocks and using one SCSI read or write to transfer all of the
blocks. Buffering of data across write calls in fixed block mode is
allowed if ST_BUFFER_WRITES is non-zero. Buffer allocation uses chunks of
memory having sizes 2^n * (page size). Because of this the actual
buffer size may be larger than the minimum allowable buffer size.
Asynchronous writing. Writing the buffer contents to the tape is Asynchronous writing. Writing the buffer contents to the tape is
started and the write call returns immediately. The status is checked started and the write call returns immediately. The status is checked
...@@ -128,30 +132,6 @@ attempted even if the user does not want to get all of the data at ...@@ -128,30 +132,6 @@ attempted even if the user does not want to get all of the data at
this read command. Should be disabled for those drives that don't like this read command. Should be disabled for those drives that don't like
a filemark to truncate a read request or that don't like backspacing. a filemark to truncate a read request or that don't like backspacing.
The buffer size is defined (in 1024 byte units) by ST_BUFFER_BLOCKS or
at boot time. If this size is not large enough, the driver tries to
temporarily enlarge the buffer. Buffer allocation uses chunks of
memory having sizes 2^n * (page size). Because of this the actual
buffer size may be larger than the buffer size specified with
ST_BUFFER_BLOCKS.
A small number of buffers are allocated at driver initialisation. The
maximum number of these buffers is defined by ST_MAX_BUFFERS. The
maximum can be changed with kernel or module startup options. One
buffer is allocated for each drive detected when the driver is
initialized up to the maximum.
The driver tries to allocate new buffers at run-time if
necessary. These buffers are freed after use. If the maximum number of
initial buffers is set to zero, all buffer allocation is done at
run-time. The advantage of run-time allocation is that memory is not
wasted for buffers not being used. The disadvantage is that there may
not be memory available at the time when a buffer is needed for the
first time (once a buffer is allocated, it is not released). This risk
should not be big if the tape drive is connected to a PCI adapter that
supports scatter/gather (the allocation is not limited to "DMA memory"
and the buffer can be composed of several fragments).
The threshold for triggering asynchronous write in fixed block mode The threshold for triggering asynchronous write in fixed block mode
is defined by ST_WRITE_THRESHOLD. This may be optimized for each is defined by ST_WRITE_THRESHOLD. This may be optimized for each
use pattern. The default triggers asynchronous write after three use pattern. The default triggers asynchronous write after three
......
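
The buffering text above says allocation uses chunks of 2^n pages, so the buffer actually obtained can exceed the configured minimum. A sketch of that rounding, assuming a 4 KB page size and modelling the buffer as a single chunk (the driver may assemble several such chunks via scatter/gather):

#include <stdio.h>

#define PAGE_SIZE_BYTES 4096UL          /* assumed for the illustration */

/* round a minimum byte count up to the next 2^n * (page size) chunk */
static unsigned long chunk_size(unsigned long min_bytes)
{
    unsigned long chunk = PAGE_SIZE_BYTES;

    while (chunk < min_bytes)
        chunk <<= 1;
    return chunk;
}

int main(void)
{
    /* ST_FIXED_BUFFER_BLOCKS is in 1024 byte units; 30 rounds up */
    printf("32 KB minimum -> %lu byte chunk\n", chunk_size(32UL * 1024));
    printf("30 KB minimum -> %lu byte chunk\n", chunk_size(30UL * 1024));
    return 0;
}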
...@@ -39,6 +39,7 @@ ...@@ -39,6 +39,7 @@
#include <linux/pci.h> #include <linux/pci.h>
#include <linux/delay.h> #include <linux/delay.h>
#include <linux/timer.h> #include <linux/timer.h>
#include <linux/init.h>
#include <linux/ioport.h> // request_region() prototype #include <linux/ioport.h> // request_region() prototype
#include <linux/vmalloc.h> // ioremap() #include <linux/vmalloc.h> // ioremap()
//#if LINUX_VERSION_CODE >= LinuxVersionCode(2,4,7) //#if LINUX_VERSION_CODE >= LinuxVersionCode(2,4,7)
......
...@@ -23,6 +23,7 @@ ...@@ -23,6 +23,7 @@
#include <linux/timer.h> #include <linux/timer.h>
#include <linux/string.h> #include <linux/string.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/bio.h>
#include <linux/ioport.h> #include <linux/ioport.h>
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/stat.h> #include <linux/stat.h>
......
...@@ -36,6 +36,7 @@ ...@@ -36,6 +36,7 @@
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/bio.h>
#include <linux/string.h> #include <linux/string.h>
#include <linux/hdreg.h> #include <linux/hdreg.h>
#include <linux/errno.h> #include <linux/errno.h>
......
...@@ -39,6 +39,7 @@ ...@@ -39,6 +39,7 @@
#include <linux/kernel.h> #include <linux/kernel.h>
#include <linux/sched.h> #include <linux/sched.h>
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/bio.h>
#include <linux/string.h> #include <linux/string.h>
#include <linux/errno.h> #include <linux/errno.h>
#include <linux/cdrom.h> #include <linux/cdrom.h>
......
...@@ -3,7 +3,7 @@ ...@@ -3,7 +3,7 @@
Copyright 1995-2000 Kai Makisara. Copyright 1995-2000 Kai Makisara.
Last modified: Tue Jan 22 21:52:34 2002 by makisara Last modified: Sun May 5 15:09:56 2002 by makisara
*/ */
#ifndef _ST_OPTIONS_H #ifndef _ST_OPTIONS_H
...@@ -30,22 +30,17 @@ ...@@ -30,22 +30,17 @@
SENSE. */ SENSE. */
#define ST_DEFAULT_BLOCK 0 #define ST_DEFAULT_BLOCK 0
/* The tape driver buffer size in kilobytes. Must be non-zero. */ /* The minimum tape driver buffer size in kilobytes in fixed block mode.
#define ST_BUFFER_BLOCKS 32 Must be non-zero. */
#define ST_FIXED_BUFFER_BLOCKS 32
/* The number of kilobytes of data in the buffer that triggers an /* The number of kilobytes of data in the buffer that triggers an
asynchronous write in fixed block mode. See also ST_ASYNC_WRITES asynchronous write in fixed block mode. See also ST_ASYNC_WRITES
below. */ below. */
#define ST_WRITE_THRESHOLD_BLOCKS 30 #define ST_WRITE_THRESHOLD_BLOCKS 30
/* The maximum number of tape buffers the driver tries to allocate at
driver initialisation. The number is also constrained by the number
of drives detected. If more buffers are needed, they are allocated
at run time and freed after use. */
#define ST_MAX_BUFFERS 4
/* Maximum number of scatter/gather segments */ /* Maximum number of scatter/gather segments */
#define ST_MAX_SG 16 #define ST_MAX_SG 64
/* The number of scatter/gather segments to allocate at first try (must be /* The number of scatter/gather segments to allocate at first try (must be
smaller or equal to the maximum). */ smaller or equal to the maximum). */
......
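
ST_WRITE_THRESHOLD_BLOCKS above is the fixed-block-mode trigger for asynchronous writes described in the documentation earlier in this commit. A minimal sketch of the intended check; should_start_async_write is an invented name:

#include <stdio.h>

#define ST_WRITE_THRESHOLD_BLOCKS 30    /* 1024 byte units, from above */

/* invented helper: start an asynchronous write once enough data has
 * accumulated in the tape buffer */
static int should_start_async_write(unsigned long buffered_bytes)
{
    return buffered_bytes >= ST_WRITE_THRESHOLD_BLOCKS * 1024UL;
}

int main(void)
{
    printf("20 KB buffered: %d\n", should_start_async_write(20UL * 1024)); /* 0 */
    printf("30 KB buffered: %d\n", should_start_async_write(30UL * 1024)); /* 1 */
    return 0;
}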
...@@ -17,6 +17,7 @@ ...@@ -17,6 +17,7 @@
* *
*/ */
#include <linux/mm.h> #include <linux/mm.h>
#include <linux/bio.h>
#include <linux/blk.h> #include <linux/blk.h>
#include <linux/slab.h> #include <linux/slab.h>
#include <linux/iobuf.h> #include <linux/iobuf.h>
...@@ -284,8 +285,8 @@ struct bio *bio_copy(struct bio *bio, int gfp_mask, int copy) ...@@ -284,8 +285,8 @@ struct bio *bio_copy(struct bio *bio, int gfp_mask, int copy)
vto = kmap(bbv->bv_page); vto = kmap(bbv->bv_page);
} else { } else {
local_irq_save(flags); local_irq_save(flags);
vfrom = kmap_atomic(bv->bv_page, KM_BIO_IRQ); vfrom = kmap_atomic(bv->bv_page, KM_BIO_SRC_IRQ);
vto = kmap_atomic(bbv->bv_page, KM_BIO_IRQ); vto = kmap_atomic(bbv->bv_page, KM_BIO_DST_IRQ);
} }
memcpy(vto + bbv->bv_offset, vfrom + bv->bv_offset, bv->bv_len); memcpy(vto + bbv->bv_offset, vfrom + bv->bv_offset, bv->bv_len);
...@@ -293,8 +294,8 @@ struct bio *bio_copy(struct bio *bio, int gfp_mask, int copy) ...@@ -293,8 +294,8 @@ struct bio *bio_copy(struct bio *bio, int gfp_mask, int copy)
kunmap(bbv->bv_page); kunmap(bbv->bv_page);
kunmap(bv->bv_page); kunmap(bv->bv_page);
} else { } else {
kunmap_atomic(vto, KM_BIO_IRQ); kunmap_atomic(vto, KM_BIO_DST_IRQ);
kunmap_atomic(vfrom, KM_BIO_IRQ); kunmap_atomic(vfrom, KM_BIO_SRC_IRQ);
local_irq_restore(flags); local_irq_restore(flags);
} }
} }
......
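
The KM_BIO_IRQ to KM_BIO_SRC_IRQ/KM_BIO_DST_IRQ change above matters because bio_copy() keeps the source and destination pages mapped at the same time: each kmap_atomic() slot is one fixed per-CPU mapping window, so pushing both pages through a single slot would let the second mapping displace the first. A toy userspace model of that slot discipline; all names and the assert are invented for illustration:

#include <assert.h>
#include <stddef.h>
#include <string.h>

enum km_type { KM_SRC, KM_DST, KM_NR_SLOTS };

static void *slot[KM_NR_SLOTS];         /* one virtual window per slot */

static void *kmap_atomic_model(void *page, enum km_type type)
{
    assert(slot[type] == NULL);         /* reuse would clobber a live mapping */
    slot[type] = page;
    return page;                        /* real kmap returns the window address */
}

static void kunmap_atomic_model(enum km_type type)
{
    slot[type] = NULL;
}

int main(void)
{
    char src[4096] = "payload", dst[4096];
    char *vfrom = kmap_atomic_model(src, KM_SRC);   /* distinct slots, */
    char *vto = kmap_atomic_model(dst, KM_DST);     /* both live at once */

    memcpy(vto, vfrom, sizeof(src));
    kunmap_atomic_model(KM_DST);
    kunmap_atomic_model(KM_SRC);
    return 0;
}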
...@@ -1650,7 +1650,7 @@ ext3_clear_blocks(handle_t *handle, struct inode *inode, struct buffer_head *bh, ...@@ -1650,7 +1650,7 @@ ext3_clear_blocks(handle_t *handle, struct inode *inode, struct buffer_head *bh,
struct buffer_head *bh; struct buffer_head *bh;
*p = 0; *p = 0;
bh = sb_get_hash_table(inode->i_sb, nr); bh = sb_find_get_block(inode->i_sb, nr);
ext3_forget(handle, 0, inode, bh, nr); ext3_forget(handle, 0, inode, bh, nr);
} }
} }
......