Re: [bug] block subsystem related crash with latest -git
- From: Jens Axboe <jens.axboe@xxxxxxxxxx>
- Date: Wed, 17 Oct 2007 19:29:32 +0200
On Wed, Oct 17 2007, Jens Axboe wrote:
On Wed, Oct 17 2007, Jens Axboe wrote:
On Wed, Oct 17 2007, Jens Axboe wrote:
On Wed, Oct 17 2007, Linus Torvalds wrote:
On Wed, 17 Oct 2007, Ingo Molnar wrote:
Jens, just got this crash on a testbox:
The code in question is:
mov %edx,0xc(%esp)
mov (%ebx),%edi
mov %edi,%edx
sub %eax,%edx
mov %edx,%eax
sar $0x5,%eax
shl $0xc,%eax
add 0x8(%ebx),%eax
cmp %eax,0xc(%esp)
je +126
mov 0x10(%esi),%eax <----- Oops
lea 0x10(%esi),%edx
test $0x1,%al
jne +76
mov %edi,(%esi)
mov %ebp,0xc(%esi)
mov 0x8(%ebx),%eax
mov %eax,0x4(%esi)
and it looks like %esi is overflowing from one page to the next one, ie:
BUG: unable to handle kernel paging request at virtual address 7ca76000
ESI: 7ca75ff0
and you caught this thanks to page-alloc debugging again.
I think I can match that up with the source code: that's "sg_next()". It's
doing:
sg++;
if (unlikely(sg_is_chain(sg)))
sg = sg_chain_ptr(sg);
return sg;
and the oopsing instruction is that load of "sg->page" in the assembly
code:
mov 0x10(%esi),%eax # %eax = sg->page
lea 0x10(%esi),%edx # %edx = sg+1;
test $0x1,%al # if (unlikely(sg_is_chain()))
jne +76
Jens?
Yep, that's what I came up with as well - I asked Ingo for a dump in
private, but ended up just using ksymoops to decode the line.
The way blk_rq_map_sg() operates is that it ends up doing a
next_sg = sg_next(sg);
even though sg may be the last entry. Perhaps this is crapping out,
although if sg is a valid address, then sg + 1 should be as well.
next_sg may end up being crap, in fact it will, but we'll never use that
unless there are more entries to fill. And if there is, then both sg and
next_sg were valid.
So nothing in for-linus should fix it, I'll try and come up with an
alternate way to assign next_sg so it's always valid.
OK, the below should actually be safe, I don't know why I talked myself
into the next_sg stuff in the beginning. It's always safe to zero sg,
since it's a valid entry - nothing to save in ->page. Ingo, does this
work for you?
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 9e3f3cc..3935469 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1322,8 +1322,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
struct scatterlist *sglist)
{
struct bio_vec *bvec, *bvprv;
- struct scatterlist *next_sg, *sg;
struct req_iterator iter;
+ struct scatterlist *sg;
int nsegs, cluster;
nsegs = 0;
@@ -1333,7 +1333,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
* for each bio in rq
*/
bvprv = NULL;
- sg = next_sg = &sglist[0];
+ sg = NULL;
rq_for_each_segment(bvec, rq, iter) {
int nbytes = bvec->bv_len;
@@ -1349,8 +1349,10 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
sg->length += nbytes;
} else {
new_segment:
- sg = next_sg;
- next_sg = sg_next(sg);
+ if (!sg)
+ sg = sglist;
+ else
+ sg = sg_next(sg);
memset(sg, 0, sizeof(*sg));
sg->page = bvec->bv_page;
Scratch that, it cannot work... I'll think up a different approach.
OK, it is fine, as long as the sglist is cleared initially. And I don't
think there's anyway around that, clearly I didn't think long enough
before including the memset() removal from Tomo.
Ingo, please try this rolled up version.
Linus, this should work. It would probably be best if you first did a
git revert on f5c0dde4c66421a3a2d7d6fa604a712c9b0744e5 and then applied
the ll_rw_blk.c bit alone. Do you want me to stuff that (revert + patch)
into a branch for you to pull?
diff --git a/block/ll_rw_blk.c b/block/ll_rw_blk.c
index 9e3f3cc..3935469 100644
--- a/block/ll_rw_blk.c
+++ b/block/ll_rw_blk.c
@@ -1322,8 +1322,8 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
struct scatterlist *sglist)
{
struct bio_vec *bvec, *bvprv;
- struct scatterlist *next_sg, *sg;
struct req_iterator iter;
+ struct scatterlist *sg;
int nsegs, cluster;
nsegs = 0;
@@ -1333,7 +1333,7 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
* for each bio in rq
*/
bvprv = NULL;
- sg = next_sg = &sglist[0];
+ sg = NULL;
rq_for_each_segment(bvec, rq, iter) {
int nbytes = bvec->bv_len;
@@ -1349,8 +1349,10 @@ int blk_rq_map_sg(struct request_queue *q, struct request *rq,
sg->length += nbytes;
} else {
new_segment:
- sg = next_sg;
- next_sg = sg_next(sg);
+ if (!sg)
+ sg = sglist;
+ else
+ sg = sg_next(sg);
memset(sg, 0, sizeof(*sg));
sg->page = bvec->bv_page;
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 0c86be7..aac8a02 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -764,6 +764,8 @@ struct scatterlist *scsi_alloc_sgtable(struct scsi_cmnd *cmd, gfp_t gfp_mask)
if (unlikely(!sgl))
goto enomem;
+ memset(sgl, 0, sizeof(*sgl) * sgp->size);
+
/*
* first loop through, set initial index and return value
*/
--
Jens Axboe
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
- Follow-Ups:
- Re: [bug] block subsystem related crash with latest -git
- From: Luca Tettamanti
- Re: [bug] block subsystem related crash with latest -git
- From: Linus Torvalds
- Re: [bug] block subsystem related crash with latest -git
- From: Ingo Molnar
- Re: [bug] block subsystem related crash with latest -git
- References:
- [bug] block subsystem related crash with latest -git
- From: Ingo Molnar
- Re: [bug] block subsystem related crash with latest -git
- From: Linus Torvalds
- Re: [bug] block subsystem related crash with latest -git
- From: Jens Axboe
- Re: [bug] block subsystem related crash with latest -git
- From: Jens Axboe
- Re: [bug] block subsystem related crash with latest -git
- From: Jens Axboe
- [bug] block subsystem related crash with latest -git
- Prev by Date: Re: [bug] block subsystem related crash with latest -git
- Next by Date: Re: [bug] block subsystem related crash with latest -git
- Previous by thread: Re: [bug] block subsystem related crash with latest -git
- Next by thread: Re: [bug] block subsystem related crash with latest -git
- Index(es):
Relevant Pages
|