Re: [Bug 1806] New: disks stats not kept for DM (device mapper) devices

From: Kevin Corry (kevcorry_at_us.ibm.com)
Date: 01/07/04

  • Next message: Matt Mackall: "Re: 2.6.1-rc1-tiny2"
    To: linux-kernel <linux-kernel@vger.kernel.org>, DevMapper <dm-devel@sistina.com>
    Date:	Wed, 7 Jan 2004 14:04:03 -0600
    
    

    > http://bugme.osdl.org/show_bug.cgi?id=1806
    >
    > Summary: disks stats not kept for DM (device mapper) devices
    > Kernel Version: 2.6.0
    > Status: NEW
    > Severity: normal
    > Owner: axboe@suse.de
    > Submitter: slpratt@us.ibm.com
    >
    >
    > Distribution:all
    > Hardware Environment:all
    > Software Environment:all
    > Problem Description:
    > Disk stats as reported through sysfs are empty for all DM (device mapper)
    > devices. This appears to be due to the fact that the stats are traced via
    > request structs which are not generated until below the device mapper
    > layer. It seems it would be possible to add code to device mapper to track
    > the stats since the actual location of the stats is in the gendisk entry
    > which does exsist for DM deivices. Only problem I see is in tracking ticks
    > for IO since in the non DM case this is done by storing a start time in the
    > request struct on driving the request. Since DM has no request struct
    > (only the BIO) it has no place to record the start time.
    >
    > Steps to reproduce:
    > create a DM device using dmsetup, lvm2 or EVMS. Do IO to device, look at
    > /sys/block/dm-xxx/stat.

    Steve and I noticed this behavior this morning. I poked around in ll_rw_blk.c
    and genhd.[ch] and some of the individual block drivers to get an idea for
    how I/O statistics are managed. I eventually came up with this patch for
    Device-Mapper.

    Some things to note:

    - In the lower-level drivers (IDE, SCIS, etc), statistics are calculated based
    on request structs. DM never uses/sees request structs, so these statistics
    are based on bio structs. Meaning...a "reads" count of 1 means DM completed 1
    bio, not 1 request.

    - The statistics calculations are done when a bio is completed. As some brief
    background, when DM receives a bio, it holds on to that bio, and creates one
    or more clones (in case the original bio needs to be split across some
    internal DM boundary). DM manually submits each of those clones, and when all
    clones complete, DM completes the original bio. I've added the statistics
    calculations at this completion point. It could just as easily be done before
    the clones are submitted. I'm not sure if this is the desired behavior, but
    it seemed to be based on examining the other places where the statistics are
    updated. One side-effect of this decision is that an I/O that causes an error
    within DM may not show up at all in the statistics. E.g., if the DM device
    currently has no active mapping, DM will simply call bio_io_error() on that
    bio and thus never update the stats for the device - which may actually be
    the desired behavior.

    - Device-Mapper doesn't do much of anything with the request_queue struct
    that's attached to its gendisk entry. However, all of the code that updates
    the I/O stats has comments that say the request_queue must be locked before
    the stats can be updated. So this patch allocates a spinlock for each DM
    request_queue, and this lock is taken while updating the stats. This
    introduces some new contention on DM's I/O completion path. I don't know yet
    whether it will be a significant amount.

    I'm not sure if this is the best way to do this, but it's probably the
    simplest. On a related note, MD/Software-RAID also doesn't currently track
    I/O statistics. If we decide if/how to track statistics for DM, doing the
    same in MD ought to be pretty easy.

    This patch is against 2.6.1-rc2. I also have a slightly different version for
    Joe's 2.6.0-udm3 patchset.

    -- 
    Kevin Corry
    kevcorry@us.ibm.com
    http://evms.sourceforge.net/
    Update I/O statistics before completing the original, incoming bio.
    --- a/drivers/md/dm.c	2004-01-07 13:53:55.000000000 -0600
    +++ b/drivers/md/dm.c	2004-01-07 13:59:30.000000000 -0600
    @@ -25,6 +25,7 @@
     	int error;
     	struct bio *bio;
     	atomic_t io_count;
    +	unsigned long start_time;
     };
     
     struct deferred_io {
    @@ -44,7 +45,7 @@
     
     	unsigned long flags;
     
    -	request_queue_t *queue;
    +	struct request_queue *queue;
     	struct gendisk *disk;
     
     	/*
    @@ -243,6 +244,29 @@
     	return sector << SECTOR_SHIFT;
     }
     
    +static inline void update_io_stats(struct dm_io *io)
    +{
    +	unsigned long flags;
    +	unsigned long duration = jiffies - io->start_time;
    +
    +	spin_lock_irqsave(io->md->queue->queue_lock, flags);
    +
    +	switch (bio_data_dir(io->bio)) {
    +	case READ:
    +		disk_stat_inc(dm_disk(io->md), reads);
    +		disk_stat_add(dm_disk(io->md), read_sectors, bio_sectors(io->bio));
    +		disk_stat_add(dm_disk(io->md), read_ticks, duration);
    +		break;
    +	case WRITE:
    +		disk_stat_inc(dm_disk(io->md), writes);
    +		disk_stat_add(dm_disk(io->md), write_sectors, bio_sectors(io->bio));
    +		disk_stat_add(dm_disk(io->md), write_ticks, duration);
    +		break;
    +	}
    +
    +	spin_unlock_irqrestore(io->md->queue->queue_lock, flags);
    +}
    +
     /*
      * Decrements the number of outstanding ios that a bio has been
      * cloned into, completing the original io if necc.
    @@ -259,6 +283,8 @@
     	}
     
     	if (atomic_dec_and_test(&io->io_count)) {
    +		update_io_stats(io);
    +
     		if (atomic_dec_and_test(&io->md->pending))
     			/* nudge anyone waiting on suspend queue */
     			wake_up(&io->md->wait);
    @@ -462,6 +488,7 @@
     	atomic_set(&ci.io->io_count, 1);
     	ci.io->bio = bio;
     	ci.io->md = md;
    +	ci.io->start_time = jiffies;
     	ci.sector = bio->bi_sector;
     	ci.sector_count = bio_sectors(bio);
     	ci.idx = bio->bi_idx;
    @@ -607,6 +634,14 @@
     		return NULL;
     	}
     
    +	md->queue->queue_lock = kmalloc(sizeof(spinlock_t), GFP_KERNEL);
    +	if (!md->queue->queue_lock) {
    +		free_minor(minor);
    +		blk_put_queue(md->queue);
    +		kfree(md);
    +		return NULL;
    +	}
    +
     	md->queue->queuedata = md;
     	blk_queue_make_request(md->queue, dm_request);
     
    @@ -614,6 +649,7 @@
     				     mempool_free_slab, _io_cache);
     	if (!md->io_pool) {
     		free_minor(minor);
    +		kfree(md->queue->queue_lock);
     		blk_put_queue(md->queue);
     		kfree(md);
     		return NULL;
    @@ -623,6 +659,7 @@
     	if (!md->disk) {
     		mempool_destroy(md->io_pool);
     		free_minor(minor);
    +		kfree(md->queue->queue_lock);
     		blk_put_queue(md->queue);
     		kfree(md);
     		return NULL;
    @@ -649,6 +686,7 @@
     	mempool_destroy(md->io_pool);
     	del_gendisk(md->disk);
     	put_disk(md->disk);
    +	kfree(md->queue->queue_lock);
     	blk_put_queue(md->queue);
     	kfree(md);
     }
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at  http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at  http://www.tux.org/lkml/
    

  • Next message: Matt Mackall: "Re: 2.6.1-rc1-tiny2"

    Relevant Pages

    • Re: We have the pitching, what about offense?
      ... "As an engineer I've always enjoyed ... suggest you look it up in a statistics book. ... although I took a course in it in college and have used/read stats ... Pedroia will do next year. ...
      (alt.sports.baseball.bos-redsox)
    • RE: How to link tables via the field names themselves, not their c
      ... you can query the database to do what you want easily. ... There are approximately 30 different aspects (or statistics) which are ... Beacuse there are so many stats per contract, ...
      (microsoft.public.access.queries)
    • RE: How to link tables via the field names themselves, not their c
      ... you can query the database to do what you want easily. ... There are approximately 30 different aspects (or statistics) which are ... Mulitple audits are made per contract. ... Beacuse there are so many stats per contract, ...
      (microsoft.public.access.queries)
    • Re: Just Arod being Arod
      ... they will lead to better or worse statistics. ... Player B has very little heart, guts, and nerve. ... ARod has a hell of a long way to go to match Reggie in Yankee lore ... pitches, it's in his stats. ...
      (alt.sports.baseball.ny-yankees)
    • Re: OT: response to Jim
      ... Lizzardwoman wrote: ... | Yes but most of them are statistical (I can do statistics:) so aren't ... I'm not a bio type but I think the scaling law between body size or heart ...
      (rec.equestrian)