Commit 1b2569fb authored by Andrew Morton, committed by Linus Torvalds

[PATCH] slab: use order 0 for vfs caches

We have interesting deadlocks when slab decides to use order-1 allocations for
ext3_inode_cache.  This is because ext3_alloc_inode() then needs to perform an
order-1 allocation under GFP_NOFS.
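
For context, the allocation in question comes from the inode slab.  This is
roughly what ext3_alloc_inode() looks like in this era (a sketch with error
paths trimmed; SLAB_NOFS is the slab-layer alias for GFP_NOFS):

	static struct inode *ext3_alloc_inode(struct super_block *sb)
	{
		struct ext3_inode_info *ei;

		/* Caller holds fs locks, so slab must not recurse into fs
		 * reclaim here; if this cache uses order-1 slabs, refilling
		 * it becomes an order-1 GFP_NOFS page allocation. */
		ei = kmem_cache_alloc(ext3_inode_cachep, SLAB_NOFS);
		if (!ei)
			return NULL;
		return &ei->vfs_inode;
	}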

Sometimes the order-1 allocation needs to free a huge number of pages (tens of
megabytes) before an order-1 grouping becomes available.  But the GFP_NOFS
allocator cannot free dcache (and hence icache) due to the deadlock problems
identified in shrink_dcache_memory().
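
The deadlock avoidance lives in the shrinker itself: shrink_dcache_memory()
refuses to prune dentries for any caller that cannot re-enter the filesystem.
A paraphrased sketch of the 2.6-era logic (the exact return-value scaling
varies by release):

	static int shrink_dcache_memory(int nr, unsigned int gfp_mask)
	{
		if (nr) {
			/* GFP_NOFS callers hold fs locks; pruning dentries
			 * could re-enter the fs and deadlock, so bail. */
			if (!(gfp_mask & __GFP_FS))
				return -1;
			prune_dcache(nr);
		}
		return dentry_stat.nr_unused;
	}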

So change slab so that it forces order-0 allocations for shrinkable VFS
objects.  We can handle those OK.
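
The new branch keys off SLAB_RECLAIM_ACCOUNT, which the shrinkable VFS caches
already pass at creation time.  For illustration, this is roughly how ext3
registers its inode cache against the 2.6-era six-argument kmem_cache_create()
prototype (a sketch, not part of the patched code):

	ext3_inode_cachep = kmem_cache_create("ext3_inode_cache",
				sizeof(struct ext3_inode_info),
				0,	/* align */
				SLAB_HWCACHE_ALIGN | SLAB_RECLAIM_ACCOUNT,
				init_once,	/* ctor */
				NULL);		/* dtor */

With this patch, any cache created with SLAB_RECLAIM_ACCOUNT whose objects fit
in a page is pinned to order-0 slabs instead of entering the order-selection
loop in the hunk below.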
parent b3f25c2b
@@ -1220,41 +1220,55 @@ kmem_cache_create (const char *name, size_t size, size_t align,
 	size = ALIGN(size, align);
 
-	/* Cal size (in pages) of slabs, and the num of objs per slab.
-	 * This could be made much more intelligent.  For now, try to avoid
-	 * using high page-orders for slabs.  When the gfp() funcs are more
-	 * friendly towards high-order requests, this should be changed.
-	 */
-	do {
-		unsigned int break_flag = 0;
-cal_wastage:
-		cache_estimate(cachep->gfporder, size, align, flags,
-				&left_over, &cachep->num);
-		if (break_flag)
-			break;
-		if (cachep->gfporder >= MAX_GFP_ORDER)
-			break;
-		if (!cachep->num)
-			goto next;
-		if (flags & CFLGS_OFF_SLAB && cachep->num > offslab_limit) {
-			/* Oops, this num of objs will cause problems. */
-			cachep->gfporder--;
-			break_flag++;
-			goto cal_wastage;
-		}
-
-		/*
-		 * Large num of objs is good, but v. large slabs are currently
-		 * bad for the gfp()s.
-		 */
-		if (cachep->gfporder >= slab_break_gfp_order)
-			break;
-
-		if ((left_over*8) <= (PAGE_SIZE<<cachep->gfporder))
-			break;	/* Acceptable internal fragmentation. */
+	if ((flags & SLAB_RECLAIM_ACCOUNT) && size <= PAGE_SIZE) {
+		/*
+		 * A VFS-reclaimable slab tends to have most allocations
+		 * as GFP_NOFS and we really don't want to have to be allocating
+		 * higher-order pages when we are unable to shrink dcache.
+		 */
+		cachep->gfporder = 0;
+		cache_estimate(cachep->gfporder, size, align, flags,
+				&left_over, &cachep->num);
+	} else {
+		/*
+		 * Calculate size (in pages) of slabs, and the num of objs per
+		 * slab.  This could be made much more intelligent.  For now,
+		 * try to avoid using high page-orders for slabs.  When the
+		 * gfp() funcs are more friendly towards high-order requests,
+		 * this should be changed.
+		 */
+		do {
+			unsigned int break_flag = 0;
+cal_wastage:
+			cache_estimate(cachep->gfporder, size, align, flags,
+					&left_over, &cachep->num);
+			if (break_flag)
+				break;
+			if (cachep->gfporder >= MAX_GFP_ORDER)
+				break;
+			if (!cachep->num)
+				goto next;
+			if (flags & CFLGS_OFF_SLAB &&
+					cachep->num > offslab_limit) {
+				/* This num of objs will cause problems. */
+				cachep->gfporder--;
+				break_flag++;
+				goto cal_wastage;
+			}
+
+			/*
+			 * Large num of objs is good, but v. large slabs are
+			 * currently bad for the gfp()s.
+			 */
+			if (cachep->gfporder >= slab_break_gfp_order)
+				break;
+
+			if ((left_over*8) <= (PAGE_SIZE<<cachep->gfporder))
+				break;	/* Acceptable internal fragmentation. */
 next:
-		cachep->gfporder++;
-	} while (1);
+			cachep->gfporder++;
+		} while (1);
+	}
 
 	if (!cachep->num) {
 		printk("kmem_cache_create: couldn't create cache %s.\n", name);
...