How to conserve battery power using laptop-mode
-----------------------------------------------

Document Author: Bart Samwel (bart@samwel.tk)
Date created: January 2, 2004
Last modified: April 3, 2004

Introduction
------------

Laptop mode is used to minimize the time that the hard disk needs to be spun
up, in order to conserve battery power on laptops. It has been reported to
yield significant power savings.

Contents
--------

* Introduction
* The short story
* Caveats
* The details
* Tips & Tricks
* Control script
* ACPI integration
* Monitoring tool


The short story
---------------

If you just want to use it, run the laptop_mode control script (which is included
in the "Control script" section of this document) as follows:

# laptop_mode start

Then set your hard disk's spindown time to a relatively low value with hdparm:

# hdparm -S 4 /dev/hda

The value -S 4 means 20 seconds of idle time before spindown. Your hard disk
will now only spin up when a disk cache miss occurs, or at least once every 10
minutes to write back any pending changes.
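
The -S encoding is not linear, so it can help to decode a value before using
it. The sketch below handles the two ranges described in the hdparm manual page
(1 to 240 are multiples of 5 seconds, 241 to 251 are multiples of 30 minutes).
The function name spindown_seconds is made up for this example, and all other
values are simply reported as "special" rather than decoded.

```shell
#!/bin/sh
# Decode an "hdparm -S" value into seconds of idle time before spindown.
# spindown_seconds is a hypothetical helper, not part of hdparm.
spindown_seconds () {
	v="$1"
	if [ "$v" -ge 1 ] && [ "$v" -le 240 ]; then
		# 1..240: units of 5 seconds
		echo $((v * 5))
	elif [ "$v" -ge 241 ] && [ "$v" -le 251 ]; then
		# 241..251: 1..11 units of 30 minutes
		echo $(((v - 240) * 1800))
	else
		# 0 and 252..255 have special meanings; see the hdparm man page
		echo "special"
		return 1
	fi
}

spindown_seconds 4	# prints 20
spindown_seconds 244	# prints 7200 (2 hours)
```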

To stop laptop_mode, run "laptop_mode stop".


Caveats
-------

* The downside of laptop mode is that you have a chance of losing up
  to 10 minutes of work. If you cannot afford this, don't use it! It's
  wise to turn OFF laptop mode when you're almost out of battery --
  although this will make the battery run out faster, at least you'll
  lose less work when it actually runs out. I'm still looking for someone
  to submit instructions on how to turn off laptop mode when battery is low,
  e.g., using ACPI events. I don't have a laptop myself, so if you do and
  you care to contribute such instructions, please do.

* Most desktop hard drives have a very limited lifetime measured in spindown
  cycles, typically about 50,000 cycles (it's usually listed on the spec sheet).
  Check your drive's rating, and don't wear down your drive's lifetime if you
  don't need to.

* If you mount some of your ext3/reiserfs filesystems with the -n option, then
  the control script will not be able to remount them correctly. You must set
  DO_REMOUNTS=0 in the control script, otherwise it will remount them with the
  wrong options -- or it will fail because it cannot write to /etc/mtab.

* If you have your filesystems listed as type "auto" in fstab, like I did, then
  the control script will not recognize them as filesystems that need remounting.

* It has been reported that some versions of the mutt mail client use file access
  times to determine whether a folder contains new mail. If you use mutt and
  experience this, you must disable the noatime remounting in the control script
  by setting DO_REMOUNT_NOATIME=0.
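
To put the spindown-cycle caveat in numbers, the following sketch estimates how
many days a drive's rated cycle count would last at a given spindown rate. The
function name and the example figures (one spindown every 10 minutes, 8 hours
of battery use per day) are assumptions chosen for illustration only; take the
real rating from your drive's spec sheet.

```shell
#!/bin/sh
# Estimate how many days a rated number of spindown cycles will last.
# spindown_budget_days is a hypothetical helper for this document.
#   $1 = rated spindown cycles
#   $2 = spindowns per hour
#   $3 = hours of use per day
spindown_budget_days () {
	rated_cycles="$1"
	cycles_per_hour="$2"
	hours_per_day="$3"
	# Integer division, so the result is rounded down.
	echo $((rated_cycles / (cycles_per_hour * hours_per_day)))
}

# 50,000 cycles, 6 spindowns per hour, 8 hours per day
spindown_budget_days 50000 6 8	# prints 1041
```

At that rate the rating would be used up in a little under three years, which
is why a very short spindown time is not always the best choice.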


The details
-----------

Laptop-mode is controlled by the flag /proc/sys/vm/laptop_mode. When this
flag is set, any physical disk read operation (that might have caused the
hard disk to spin up) causes Linux to flush all dirty blocks. The result
of this is that after a disk has spun down, it will not be spun up anymore
to write dirty blocks, because those blocks had already been written
immediately after the most recent read operation.
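
A quick way to see whether the flag is currently set is to read it back. This
is only a sketch: the function name laptop_mode_state is made up here, and the
optional file argument exists so that the function can be tried out on an
ordinary file.

```shell
#!/bin/sh
# Report the state of the laptop_mode flag.
# laptop_mode_state is a hypothetical helper; $1 optionally overrides the
# path of the flag file.
laptop_mode_state () {
	flag="${1:-/proc/sys/vm/laptop_mode}"
	if [ ! -r "$flag" ]; then
		# No such file: the kernel has no laptop mode support
		echo "unsupported"
	elif [ "$(cat "$flag")" -gt 0 ]; then
		echo "enabled"
	else
		echo "disabled"
	fi
}

laptop_mode_state
```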

To increase the effectiveness of the laptop_mode strategy, the laptop_mode
control script increases dirty_expire_centisecs and dirty_writeback_centisecs in
/proc/sys/vm to about 10 minutes (by default), which means that pages that are
dirtied are not forced to be written to disk as often. The control script also
changes the dirty background ratio, so that background writeback of dirty pages
is not done anymore. Combined with a higher commit value (also 10 minutes) for
ext3 or ReiserFS filesystems (also done automatically by the control script),
this results in concentration of disk activity in a small time interval which
occurs only once every 10 minutes, or whenever the disk is forced to spin up by
a cache miss. The disk can then be spun down in the periods of inactivity.
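
The unit conversion is easy to get wrong: the control script takes its maximum
age in seconds, while the two /proc/sys/vm files are in centiseconds. The
sketch below only prints the values that would be written on 2.6 for a given
maximum age; it changes nothing. The function name show_vm_settings is an
invention of this example.

```shell
#!/bin/sh
# Print the 2.6 writeback settings for a maximum disk downtime in seconds.
# show_vm_settings is a hypothetical dry-run helper; it writes nothing.
show_vm_settings () {
	max_age="$1"			# seconds, like MAX_AGE in the control script
	age=$((100 * max_age))		# the *_centisecs files use 1/100 s units
	echo "dirty_writeback_centisecs=$age"
	echo "dirty_expire_centisecs=$age"
}

show_vm_settings 600	# 10 minutes: both values come out as 60000
```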

If you want to find out which process caused the disk to spin up, you can
gather information by setting the flag /proc/sys/vm/block_dump. When this flag
is set, Linux reports all disk read and write operations that take place, and
all block dirtyings done to files. This makes it possible to debug why a disk
needs to spin up, and to increase battery life even more. The output of
block_dump is written to the kernel output, and it can be retrieved using
"dmesg". When you use block_dump, you may want to turn off klogd, otherwise
the output of block_dump will be logged, causing disk activity that is not
normally there.
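
The raw block_dump output can be long, so it may help to count the entries per
process. The sketch below does that with awk. It assumes log lines of the form
"name(pid): READ block N on dev", "name(pid): WRITE block N on dev" and
"name(pid): dirtied inode N (file) on dev"; if your kernel prints something
different, the patterns need adjusting. The function name
summarize_block_dump is made up for this example.

```shell
#!/bin/sh
# Count block_dump entries per process name, busiest process first.
# summarize_block_dump is a hypothetical helper; it reads log lines on stdin.
summarize_block_dump () {
	awk '
		{ sub(/^<[0-9]+>/, "") }	# drop a log level prefix, if any
		/(READ|WRITE) block|dirtied inode/ {
			# Find the "name(pid):" token and strip "(pid):" from it
			if (match($0, /[^ ]+\([0-9]+\):/)) {
				proc = substr($0, RSTART, RLENGTH - 1)
				sub(/\([0-9]+\)$/, "", proc)
				count[proc]++
			}
		}
		END { for (p in count) print count[p], p }
	' | sort -rn
}
```

After setting the flag with "echo 1 > /proc/sys/vm/block_dump" and waiting for
some disk activity, run "dmesg | summarize_block_dump" to see which processes
touched the disk most often.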

If 10 minutes is too much or too little downtime for you, you can configure
this downtime as follows. In the control script, set the MAX_AGE value to the
maximum number of seconds of disk downtime that you would like. You should
then set your filesystem's commit interval to the same value. The dirty ratio
is also configurable from the control script.

If you don't like the idea of the control script remounting your filesystems
for you, you can change DO_REMOUNTS to 0 in the script.

Thanks to Kiko Piris, the control script can be used to enable laptop mode on
both the Linux 2.4 and 2.6 series.


Tips & Tricks
-------------

* Bartek Kania reports getting up to 50 minutes of extra battery life (on top
  of his regular 3 to 3.5 hours) using very aggressive power management (hdparm
  -B1) and a spindown time of 5 seconds (hdparm -S1).

* You can spin down the disk while playing MP3, by setting the disk readahead
  to 8MB (hdparm -a 16384). Effectively, the disk will read a complete MP3 at
  once, and will then spin down while the MP3 is playing. (Thanks to Bartek
  Kania.)

* Drew Scott Daniels observed: "I don't know why, but when I decrease the number
  of colours that my display uses it consumes less battery power. I've seen
  this on powerbooks too. I hope that this is a piece of information that
  might be useful to the Laptop Mode patch or it's users."

* One thing which will cause disks to spin up is not-present application
  and dynamic library text pages.  The kernel will load program text off disk
  on-demand, so each time you invoke an application feature for the first
  time, the kernel needs to spin the disk up to go and fetch that part of the
  application.

  So it is useful to increase the disk readahead parameter greatly, so that
  the kernel will pull all of the executable's pages into memory on the first
  pagefault.

  The supplied script does this.

* In syslog.conf, you can prefix entries with a dash ``-'' to omit syncing the
  file after every logging. When you're using laptop-mode and your disk doesn't
  spin down, this is a likely culprit.

* Richard Atterer observed that laptop mode does not work well with noflushd
  (http://noflushd.sourceforge.net/); it seems that noflushd prevents laptop-mode
  from doing its thing.
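
The readahead values in the tips above are given in 512-byte sectors, which is
the unit taken by both "hdparm -a" and "blockdev --setra". The conversion from
kilobytes is a factor of two, as this small sketch shows; the function name
readahead_sectors is made up for the example.

```shell
#!/bin/sh
# Convert a readahead size in kilobytes to 512-byte sectors.
# readahead_sectors is a hypothetical helper for this document.
readahead_sectors () {
	echo $(($1 * 2))
}

readahead_sectors 8192	# 8MB, prints 16384 (the hdparm -a value above)
readahead_sectors 4096	# 4MB, prints 8192
```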


Control script
--------------

Please note that this control script works for the Linux 2.4 and 2.6 series.

--------------------CONTROL SCRIPT BEGIN------------------------------------------
#! /bin/sh

# start or stop laptop_mode, best run by a power management daemon when
# ac gets connected/disconnected from a laptop
#
# install as /sbin/laptop_mode
#
# Contributors to this script:   Kiko Piris
#				 Bart Samwel
#				 Micha Feigin
#				 Andrew Morton
#				 Dax Kelson
#
# Original Linux 2.4 version by: Jens Axboe

# Remove an option (the first parameter) of the form option=<number> from
# a mount options string (the rest of the parameters).
parse_mount_opts () {
	OPT="$1"
	shift
	echo "$*"			| \
	sed 's/.*/,&,/'			| \
	sed 's/,'"$OPT"'=[0-9]*,/,/g'	| \
	sed 's/,,*/,/g'			| \
	sed 's/^,//'			| \
	sed 's/,$//'			| \
	cat -
}

# Remove an option (the first parameter) without any arguments from
# a mount option string (the rest of the parameters).
parse_nonumber_mount_opts () {
	OPT="$1"
	shift
	echo "$*" 			| \
	sed 's/.*/,&,/'			| \
	sed 's/,'"$OPT"',/,/g'		| \
	sed 's/,,*/,/g'			| \
	sed 's/^,//'			| \
	sed 's/,$//'			| \
	cat -
}

# Find out the state of a yes/no option (e.g. "atime"/"noatime") in
# fstab for a given filesystem, and use this state to replace the
# value of the option in another mount options string. The device
# is the first argument, the option name the second, and the default
# value the third. The remainder is the mount options string.
#
# Example:
# parse_yesno_opts_wfstab /dev/hda1 atime atime defaults,noatime
#
# If fstab contains, say, "rw" for this filesystem, then the result
# will be "defaults,atime".
parse_yesno_opts_wfstab () {
	L_DEV=$1
	shift
	OPT=$1
	shift
	DEF_OPT=$1
	shift
	L_OPTS="$*"
	PARSEDOPTS1="$(parse_nonumber_mount_opts $OPT $L_OPTS)"
	PARSEDOPTS1="$(parse_nonumber_mount_opts no$OPT $PARSEDOPTS1)"
	# Watch for a default atime in fstab
	FSTAB_OPTS="$(cat /etc/fstab | tr '\t' ' ' | grep ^\ *"$L_DEV " | awk '{ print $4 }')"
	if [ -z "$(echo "$FSTAB_OPTS" | grep "$OPT")" ] ; then
		# option not specified in fstab -- choose the default.
		echo "$PARSEDOPTS1,$DEF_OPT"
	else
		# option specified in fstab: extract the value and use it
		if [ -z "$(echo "$FSTAB_OPTS" | grep "no$OPT")" ] ; then
			# no$OPT not found -- so we must have $OPT.
			echo "$PARSEDOPTS1,$OPT"
		else
			echo "$PARSEDOPTS1,no$OPT"
		fi
	fi
}

# Find out the state of a numbered option (e.g. "commit=NNN") in
# fstab for a given filesystem, and use this state to replace the
# value of the option in another mount options string. The device
# is the first argument, and the option name the second. The
# remainder is the mount options string in which the replacement
# must be done.
#
# Example:
# parse_mount_opts_wfstab /dev/hda1 commit defaults,commit=7
#
# If fstab contains, say, "commit=3,rw" for this filesystem, then the
# result will be "defaults,commit=3".
parse_mount_opts_wfstab () {
	L_DEV=$1
	shift
	OPT=$1
	shift
	L_OPTS="$*"

	PARSEDOPTS1="$(parse_mount_opts $OPT $L_OPTS)"
	# Watch for a default commit in fstab
	FSTAB_OPTS="$(cat /etc/fstab | sed 's/	/ /g' | grep ^\ *"$L_DEV " | awk '{ print $4 }')"
	if [ -z "$(echo "$FSTAB_OPTS" | grep "$OPT=")" ] ; then
		# option not specified in fstab: set it to 0
		echo "$PARSEDOPTS1,$OPT=0"
	else
		# option specified in fstab: extract the value, and use it
		echo -n "$PARSEDOPTS1,$OPT="
		echo "$FSTAB_OPTS"	| \
		sed 's/.*/,&,/'		| \
		sed 's/.*,'"$OPT"'=//'	| \
		sed 's/,.*//'		| \
		cat -
	fi
}

KLEVEL="$(uname -r | cut -c1-3)"
case "$KLEVEL" in
	"2.4"|"2.6")
		true
		;;
	*)
		echo "Unhandled kernel version: $KLEVEL ('uname -r' = '$(uname -r)')"
		exit 1
		;;
esac

# Shall we remount journaled fs. with appropriate commit interval? (1=yes)
DO_REMOUNTS=1

# And shall we add the "noatime" option to that as well? (1=yes)
DO_REMOUNT_NOATIME=1

# age time, in seconds. should be put into a sysconfig file
MAX_AGE=600

# Dirty synchronous ratio.  At this percentage of dirty pages the process which
# calls write() does its own writeback
DIRTY_RATIO=40

#
# Allowed dirty background ratio, in percent.  Once DIRTY_RATIO has been
# exceeded, the kernel will wake pdflush which will then reduce the amount
# of dirty memory to dirty_background_ratio.  Set this nice and low, so once
# some writeout has commenced, we do a lot of it.
#
DIRTY_BACKGROUND_RATIO=5

READAHEAD=4096		# kilobytes

# kernel default dirty buffer age
DEF_AGE=30
DEF_UPDATE=5
DEF_DIRTY_BACKGROUND_RATIO=10
DEF_DIRTY_RATIO=40
DEF_XFS_AGE_BUFFER=15
DEF_XFS_SYNC_INTERVAL=30

# This must be adjusted manually to the value of HZ in the running kernel
# on 2.4, until the XFS people change their 2.4 external interfaces to work in
# centisecs. This can be automated, but it's a work in progress that still needs
# some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for external
# interfaces, and that is currently always set to 100. So you don't need to
# change this on 2.6.
XFS_HZ=100

if [ ! -e /proc/sys/vm/laptop_mode ]; then
	echo "Kernel is not patched with laptop_mode patch."
	exit 1
fi

if [ ! -w /proc/sys/vm/laptop_mode ]; then
	echo "You do not have enough privileges to enable laptop_mode."
	exit 1
fi

if [ $DO_REMOUNT_NOATIME -eq 1 ]; then
	NOATIME_OPT=",noatime"
fi

case "$1" in
	start)
		AGE=$((100*$MAX_AGE))
		XFS_AGE=$(($XFS_HZ*$MAX_AGE))
		echo -n "Starting laptop_mode"

		if [ -d /proc/sys/vm/pagebuf ] ; then
			# (For 2.4 and early 2.6.)
			# This only needs to be set, not reset -- it is only used when
			# laptop mode is enabled.
			echo $XFS_AGE > /proc/sys/vm/pagebuf/lm_flush_age
			echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval
		elif [ -f /proc/sys/fs/xfs/lm_age_buffer ] ; then
			# (A couple of early 2.6 laptop mode patches had these.)
			# The same goes for these.
			echo $XFS_AGE > /proc/sys/fs/xfs/lm_age_buffer
			echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval
		elif [ -f /proc/sys/fs/xfs/age_buffer ] ; then
			# (2.6.6)
			# But not for these -- they are also used in normal
			# operation.
			echo $XFS_AGE > /proc/sys/fs/xfs/age_buffer
			echo $XFS_AGE > /proc/sys/fs/xfs/sync_interval
		elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then
			# (2.6.7 upwards)
			# And not for these either. These are in centisecs,
			# not USER_HZ, so we have to use $AGE, not $XFS_AGE.
			echo $AGE > /proc/sys/fs/xfs/age_buffer_centisecs
			echo $AGE > /proc/sys/fs/xfs/xfssyncd_centisecs
		fi

		case "$KLEVEL" in
			"2.4")
				echo "1"				> /proc/sys/vm/laptop_mode
				echo "30 500 0 0 $AGE $AGE 60 20 0"	> /proc/sys/vm/bdflush
				;;
			"2.6")
				echo "5"				> /proc/sys/vm/laptop_mode
				echo "$AGE"				> /proc/sys/vm/dirty_writeback_centisecs
				echo "$AGE"				> /proc/sys/vm/dirty_expire_centisecs
				echo "$DIRTY_RATIO"			> /proc/sys/vm/dirty_ratio
				echo "$DIRTY_BACKGROUND_RATIO"		> /proc/sys/vm/dirty_background_ratio
				;;
		esac
		if [ $DO_REMOUNTS -eq 1 ]; then
			cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do
				case "$FST" in
					"ext3"|"reiserfs")
						PARSEDOPTS="$(parse_mount_opts commit "$OPTS")"
						mount $DEV -t $FST $MP -o remount,$PARSEDOPTS,commit=$MAX_AGE$NOATIME_OPT
						;;
					"xfs")
						mount $DEV -t $FST $MP -o remount,$OPTS$NOATIME_OPT
						;;
				esac
				if [ -b $DEV ] ; then
					blockdev --setra $(($READAHEAD * 2)) $DEV
				fi
			done
		fi
		echo "."
		;;
	stop)
		U_AGE=$((100*$DEF_UPDATE))
		B_AGE=$((100*$DEF_AGE))
		echo -n "Stopping laptop_mode"
		echo "0" > /proc/sys/vm/laptop_mode
		if [ -f /proc/sys/fs/xfs/age_buffer ] && [ ! -f /proc/sys/fs/xfs/lm_age_buffer ] ; then
			# These need to be restored, if there are no lm_*.
			echo "$(($XFS_HZ*$DEF_XFS_AGE_BUFFER))" 	> /proc/sys/fs/xfs/age_buffer
			echo "$(($XFS_HZ*$DEF_XFS_SYNC_INTERVAL))" 	> /proc/sys/fs/xfs/sync_interval
		elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then
			# These need to be restored as well.
			echo "$((100*$DEF_XFS_AGE_BUFFER))" > /proc/sys/fs/xfs/age_buffer_centisecs
			echo "$((100*$DEF_XFS_SYNC_INTERVAL))" > /proc/sys/fs/xfs/xfssyncd_centisecs
		fi
		case "$KLEVEL" in
			"2.4")
				echo "30 500 0 0 $U_AGE $B_AGE 60 20 0"	> /proc/sys/vm/bdflush
				;;
			"2.6")
				echo "$U_AGE"				> /proc/sys/vm/dirty_writeback_centisecs
				echo "$B_AGE"				> /proc/sys/vm/dirty_expire_centisecs
				echo "$DEF_DIRTY_RATIO"			> /proc/sys/vm/dirty_ratio
				echo "$DEF_DIRTY_BACKGROUND_RATIO"	> /proc/sys/vm/dirty_background_ratio
				;;
		esac
		if [ $DO_REMOUNTS -eq 1 ]; then
			cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do
				# Reset commit and atime options to defaults.
				case "$FST" in
					"ext3"|"reiserfs")
						PARSEDOPTS="$(parse_mount_opts_wfstab $DEV commit $OPTS)"
						PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $PARSEDOPTS)"
						mount $DEV -t $FST $MP -o remount,$PARSEDOPTS
						;;
					"xfs")
						PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $OPTS)"
						mount $DEV -t $FST $MP -o remount,$PARSEDOPTS
						;;
				esac
				if [ -b $DEV ] ; then
					blockdev --setra 256 $DEV
				fi
			done
		fi
		echo "."
		;;
	*)
		echo "Usage: $0 {start|stop}"
		;;

esac

exit 0

--------------------CONTROL SCRIPT END--------------------------------------------


ACPI integration
----------------

Dax Kelson submitted this so that the ACPI acpid daemon will
kick off the laptop_mode script and run hdparm.

---------------------------/etc/acpi/events/ac_adapter BEGIN-------------------------------------------
event=ac_adapter
action=/etc/acpi/actions/battery.sh
---------------------------/etc/acpi/events/ac_adapter END-------------------------------------------

---------------------------/etc/acpi/actions/battery.sh BEGIN-------------------------------------------
#!/bin/sh

# cpu throttling
# cat /proc/acpi/processor/CPU0/throttling for more info
ACAD_THR=0
BATT_THR=2

# spindown time for HD (man hdparm for valid values)
# I prefer 2 hours for acad and 20 seconds for batt
ACAD_HD=244
BATT_HD=4

# ac/battery event handler

status=`awk '/^state: / { print $2 }' /proc/acpi/ac_adapter/AC/state`

case $status in
        "on-line")
                echo "Setting HD spindown to 2 hours"
                /sbin/laptop_mode stop
                /sbin/hdparm -S $ACAD_HD /dev/hda > /dev/null 2>&1
                /sbin/hdparm -B 255 /dev/hda > /dev/null 2>&1
                #echo -n $ACAD_CPU:$ACAD_THR > /proc/acpi/processor/CPU0/limit
                exit 0
        ;;
        "off-line")
                echo "Setting HD spindown to 20 seconds"
                /sbin/laptop_mode start
                /sbin/hdparm -S $BATT_HD /dev/hda > /dev/null 2>&1
                /sbin/hdparm -B 1 /dev/hda > /dev/null 2>&1
                #echo -n $BATT_CPU:$BATT_THR > /proc/acpi/processor/CPU0/limit
                exit 0
        ;;
esac
---------------------------/etc/acpi/actions/battery.sh END-------------------------------------------
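
As a possible starting point for turning laptop mode off when the battery runs
low (see the first caveat), the sketch below computes the remaining charge as a
percentage. It rests on several assumptions: a battery named BAT0, the
/proc/acpi/battery interface, and "remaining capacity:" and
"last full capacity:" lines in its state and info files. The function name
battery_percent and the 10% threshold in the usage comment are made up for
this example, and none of this has been tested on real hardware.

```shell
#!/bin/sh
# Print the remaining battery charge as a percentage.
# battery_percent is a hypothetical helper. $1 and $2 optionally override
# the state and info files; the BAT0 default is an assumption, since the
# battery name differs between machines.
battery_percent () {
	state="${1:-/proc/acpi/battery/BAT0/state}"
	info="${2:-/proc/acpi/battery/BAT0/info}"
	# "remaining capacity:      3000 mAh" -> 3000
	remaining="$(awk -F: '/^remaining capacity/ { print $2 + 0 }' "$state")"
	# "last full capacity:      4000 mAh" -> 4000
	full="$(awk -F: '/^last full capacity/ { print $2 + 0 }' "$info")"
	[ -n "$remaining" ] && [ -n "$full" ] && [ "$full" -gt 0 ] || return 1
	echo $((100 * remaining / full))
}

# Possible use from a periodic job: stop laptop mode below 10% charge.
# pct="$(battery_percent)" && [ "$pct" -lt 10 ] && /sbin/laptop_mode stop
```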

Monitoring tool
---------------

Bartek Kania submitted this tool, which can be used to measure how much time
your disk spends spun up/down.

---------------------------dslm.c BEGIN-------------------------------------------
/*
 * Simple Disk Sleep Monitor
 *  by Bartek Kania
 * Licensed under the GPL
 */
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <fcntl.h>
#include <errno.h>
#include <time.h>
#include <string.h>
#include <signal.h>
#include <sys/ioctl.h>
#include <linux/hdreg.h>

#ifdef DEBUG
#define D(x) x
#else
#define D(x)
#endif

volatile sig_atomic_t endit = 0; /* set from the SIGINT handler */

/* Check if the disk is in powersave-mode
 * Most of the code is stolen from hdparm.
 * 1 = active, 0 = standby/sleep, -1 = unknown */
int check_powermode(int fd)
{
    unsigned char args[4] = {WIN_CHECKPOWERMODE1,0,0,0};
    int state;

    if (ioctl(fd, HDIO_DRIVE_CMD, &args)
	&& (args[0] = WIN_CHECKPOWERMODE2) /* try again with 0x98 */
	&& ioctl(fd, HDIO_DRIVE_CMD, &args)) {
	if (errno != EIO || args[0] != 0 || args[1] != 0) {
	    state = -1; /* "unknown"; */
	} else
	    state = 0; /* "sleeping"; */
    } else {
	state = (args[2] == 255) ? 1 : 0;
    }
    D(printf(" drive state is:  %d\n", state));

    return state;
}

char *state_name(int i)
{
    if (i == -1) return "unknown";
    if (i == 0) return "sleeping";
    if (i == 1) return "active";

    return "internal error";
}

char *myctime(time_t time)
{
    char *ts = ctime(&time);
    ts[strlen(ts) - 1] = 0;

    return ts;
}

void measure(int fd)
{
    time_t start_time;
    int last_state;
    time_t last_time;
    int curr_state;
    time_t curr_time = 0;
    time_t time_diff;
    time_t active_time = 0;
    time_t sleep_time = 0;
    time_t unknown_time = 0;
    time_t total_time = 0;
    int changes = 0;
    float tmp;

    printf("Starting measurements\n");

    last_state = check_powermode(fd);
    start_time = last_time = time(0);
    printf("  System is in state %s\n\n", state_name(last_state));

    while(!endit) {
	sleep(1);
	curr_state = check_powermode(fd);

	if (curr_state != last_state || endit) {
	    changes++;
	    curr_time = time(0);
	    time_diff = curr_time - last_time;

	    if (last_state == 1) active_time += time_diff;
	    else if (last_state == 0) sleep_time += time_diff;
	    else unknown_time += time_diff;

	    last_state = curr_state;
	    last_time = curr_time;

	    printf("%s: State-change to %s\n", myctime(curr_time),
		   state_name(curr_state));
	}
    }
    changes--; /* Compensate for SIGINT */

    total_time = time(0) - start_time;
    printf("\nTotal running time:  %lus\n", (unsigned long)total_time);
    printf(" State changed %d times\n", changes);

    /* time_t is cast to unsigned long to match the %lu conversions */
    tmp = (float)sleep_time / (float)total_time * 100;
    printf(" Time in sleep state:   %lus (%.2f%%)\n",
	   (unsigned long)sleep_time, tmp);
    tmp = (float)active_time / (float)total_time * 100;
    printf(" Time in active state:  %lus (%.2f%%)\n",
	   (unsigned long)active_time, tmp);
    tmp = (float)unknown_time / (float)total_time * 100;
    printf(" Time in unknown state: %lus (%.2f%%)\n",
	   (unsigned long)unknown_time, tmp);
}

void ender(int s)
{
    endit = 1;
}

void usage()
{
    puts("usage: dslm [-w <time>] <disk>");
    exit(0);
}

int main(int ac, char **av)
{
    int fd;
    char *disk = 0;
    int settle_time = 60;

    /* Parse the simple command-line */
    if (ac == 2)
	disk = av[1];
    else if (ac == 4) {
	settle_time = atoi(av[2]);
	disk = av[3];
    } else
	usage();

    if ((fd = open(disk, O_RDONLY|O_NONBLOCK)) < 0) {
	printf("Can't open %s, because: %s\n", disk, strerror(errno));
	exit(-1);
    }

    if (settle_time) {
	printf("Waiting %d seconds for the system to settle down to "
	       "'normal'\n", settle_time);
	sleep(settle_time);
    } else
	puts("Not waiting for system to settle down");

    signal(SIGINT, ender);

    measure(fd);

    close(fd);

    return 0;
}
---------------------------dslm.c END---------------------------------------------