Re: [patch] scsi: revert "[SCSI] Get rid of scsi_cmnd->done"

On Wed, 2008-01-02 at 12:45 -0800, Linus Torvalds wrote:
> On Wed, 2 Jan 2008, James Bottomley wrote:
> > OK ... I'll revert it. However, I still think it's the wrong course of
> > action, because as far as my analysis goes, this code is functionally
> > equivalent to what went before with the exception that we now rely on
> > the request->cmd_type information in the post processing (previously we
> > just relied on the cmnd->done pointer).
>
> To say that another way:
>
>   "the code is functionally equivalent, EXCEPT IT ISN'T, and it's
>    known to be broken".
>
> wouldn't you say my version is more honest and correct?

No. Just because a bug appears when a particular piece of code is in
and disappears when it is reverted doesn't automatically equate to the
code in question being buggy. We seem to get a lot of these second
order effect type things; sometimes it's just a problem caused by a
particular routine compiling to a longer byte sequence and pushing
something else out.

Do give us credit for thinking "functional equivalency problem" when
this bug report first came in ... I've had myself and several other
people over the code. If there's an inequivalency somewhere I'm damned
if I can spot it. The most promising other failure mode we tried was
request type changes over the lifetime of the command, but we can't make
that one fly either.

Look at the taxonomy of the bug. This is the form of the error:

buffer I/O error on device sr0, logical block 20304
attempt to access beyond end of device
sr0: rw=0, want=81224, limit=40944

The last limit is the most suggestive; that comes straight from
bdev->bd_inode->i_size>>9 and is supposed to be the size of the block
device in 512 byte blocks. For a 4.7GB DVD, it's a little small.
Nothing in the sr code sets this directly (although it does come from
get_blkdev() for the first opener). pktcdvd does set it, though ... and
probably wrongly if the drive in question isn't UDF formatted.
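
As a rough sanity check on those numbers, here is a small sketch of the
arithmetic. It assumes 512 byte sectors (as in the i_size>>9 shift) and,
purely as an assumption the log does not state, that the logical blocks
reported for sr0 are 2048 bytes. The helper names are made up for
illustration.

```python
# Sketch only: unit conversions on the figures from the error message.
SECTOR_SIZE = 512          # bytes per sector, matching the i_size >> 9 shift
ASSUMED_BLOCK_SIZE = 2048  # assumption: logical block size, not given in the log


def sectors_to_bytes(sectors):
    """Convert a count of 512 byte sectors to bytes."""
    return sectors * SECTOR_SIZE


def bytes_to_sectors(size_bytes):
    """Same shift the kernel applies to i_size to get the limit."""
    return size_bytes >> 9


# Values copied from the error message
limit = 40944
want = 81224
logical_block = 20304

limit_bytes = sectors_to_bytes(limit)                 # 20963328
start_sector = logical_block * (ASSUMED_BLOCK_SIZE // SECTOR_SIZE)  # 81216
request_sectors = want - start_sector                 # 8, under the assumption above
dvd_sectors = bytes_to_sectors(4_700_000_000)         # 9179687

print("limit in bytes:", limit_bytes)
print("start sector of failing block:", start_sector)
print("sectors past that start covered by want:", request_sectors)
print("sectors expected for a 4.7GB disc:", dvd_sectors)
```

So the reported limit corresponds to roughly 20 MB, where a 4.7GB disc
should come out at around 9.18 million sectors.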

I have also tried on many occasions to reproduce this without success
(there's a simple recipe in the bug report, but it just doesn't work for
me). My setup is with an aic94xx->expander->SATAPI DVD, whereas the
original reporter is ata_piix -> PATA DVD, so it could be stack
differences---but again, if it is, the bug itself can't be a simple one
in the generic code. The fact that there are no other reporters of
problems like this also indicates to me that it isn't a widespread
problem (again, pointing to something more specific in the setup of the
reporter).

The irreproducibility coupled with the lack of other error reports
leads me to be about 90% confident the problem isn't in the code you
want reverted. However, I grant that we cannot seem to find the root
cause, and reverting the code will cause our bug metrics to go down by
one (at least until something else causes it to reappear), so it is the
corporate thing to do, I suppose. I'll send in a reversion with the
sr_mod removal bug fix.

James

