Re: [PATCH] mmc_spi: do propper retry managment in the block layer - 3rd try



(I haven't compared to your latest versions some some of my comments
here might not be relevant for the existing patch)

On Thu, 28 May 2009 10:28:29 +0200
Wolfgang Mües <wolfgang.mues@xxxxxxxxxxxx> wrote:

Am Mittwoch, 27. Mai 2009 schrieb Pierre Ossman:
FYI, the situation that we must avoid to not screw up the assumptions
of the filesystems is to end up with a situation like this on the card:

NNNNBNNNOOOO

N = block with new data
O = block with previous data
B = block with damaged data

This could in theory happen when the card has some internal problem
(e.g. bad flash cell) as well go through these states:

NNNNNNNNOOOO (first write, response to stop command is lost)
NNNNBNNNOOOO (second write, card has internal problem halfway through)

For SPI and for any controller which gives you a correct bytes_xfered, this
will only be NNNNNNNNNNNNBOOOOOOOOOOO, because after each transfered sector,
you get a response from the card. So you will not start from the beginning
any more.

If a host controller does not give a correct bytes_xfered, you might end in
the scenario you have pointed out, only after exhausting all retries. This is
equivalent to a bad of multiple sectors. NBNBNBNB000 == NBBBBBBBOOO.


Hmm... unfortunately this is the norm as I don't think any of the other
drivers provide a proper bytes_xfered. That also means that the verdict
for this patch is NAK for now.

I think we should code according to the expectations of the user.
A nonrecoverable write results in a data loss for the user. And I think that
the wish of the user to get the newly aquired data down to the card has
preceedence against the possible loss of some old data on the card. Remember
that the "old data" will be unused sectors most of the time. In embedded
systems, the user often has no chance to repeat the storage process or choose
another medium, and the data ist lost if the write to SD card fails.


The argument was that this method probably worked fine for classical
file systems, but it wrecked havoc if it was done in the log area of
journaling file systems.

I'd like to reiterate that I like the idea and I really think the
kernel should be able to do this. If you can get the VFS and block layer
people to look at this and sign off on it, then we can move forward and
include this. I don't hive time for that particular crusade myself though.


I think we can keep the logic where it uses single blocks for the
remainder of the request.

But this will increase wear leveling problems for the rest of the request.
And this is for a mean value of 64 sectors!

Why on earth should we do this if it is so easy to avoid?

As transmission errors are seldom, we get a request splitted out in 3 blocks:
n blocks before the error, one single-block transfer, m blocks after the
error. And you almost will not notice a slowdown in transmission speed.


That code was not written for transmission errors but for media errors.
That means that if block 8 is the bad one, your method (provided
bytex_xfered is useless, which is the norm) means that it will transfer
n blocks, 1 block, n-1 blocks, 1 block, etc. failing every second
attempt until it gets to the actually damaged sector.


Perhaps this fix can be moved into its own patch? I think it might keep
things cleaner.

Which fix? The code was only moved and restructured. No new behaviour.


Never mind, I wasn't paying proper attention. :)

+
+ /* Wait until card is ready for new data. This information is
+ * only available in non-SPI mode. In SPI mode, the busy
+ * handling is done in the SPI driver.
+ */

Strictly speaking it should be handled in all drivers. Some hardware
doesn't do it properly though so we have this safeguard.

IMHO no driver and framework is doing this correct. It should not be done
after a write request, but before the _next_ request. All I have done here
was to add some documentation about SPI, and leave the previous code intact.

Why on the next request? For performance reasons? The downside to that
is increased complexity. The current system means that the card is
always in an idle state after each request.

(the hardware handling also gets trickier as the host controller also
is kept in some "active" state until busy ends)

+ /* Do retries for all sort of transmission errors */
+ switch (error) {

- /*
- * A block was successfully transferred.
+ case 0: /* no error: continue, reset error variables */
+ disable_multi = 0;
+ retries = 3;
+ break;
+
+ /* Card has not understand command. As we do only send
+ * valid commands, this must be a transmission error. */
+ case -EPROTO: /* fall through */

This indicates a layering problem. The host driver should not be aware
of anything but pure bit errors.

Also, this special meaning should be documented in core.h should we
decide to keep it.

This is not MY error code. EPROTO is send from mmc_spi_writeblock(), and I
have listed it here because it is a transmission error.

So there seems to be contrary objections: Matt Flemming has requested that the
exact cause of error is reported by the driver (because otherwise the caller
will loose information), and you requested to distinguish only between
transmission errors and the rest.

Matt Flemming has pointed out that further changes in the code will request to
get the exact cause of error in the block layer, and that error codes might
have interpreted different according to the command class, so IMHO it is
better to transport the exact error cause into block.c and do the error
handling here, according to the type of request.


Right. As much as possible of the result should be parsed by the same
layer that assembles the request. So I think EPROTO should be left out
here and we can look at getting better integration with mmc_spi
responses in another patch.


This looks wrong. What happened to the original logic of retrying reads
on any error?

You mean error codes like EINVAL, ENOMEDIUM, ENODEV?
I there any sense in retrying non-recoverable errors? Or should I list the
non-recoverable errors inside the switch (and do nothing), and do a retry for
read calls on any other error code, so a new coded driver with a new error
code will get a read retry (but no write retry!) for this new code regardless
of making sense?


Yes. E.g. some limited hardware might not be able to report a timeout
properly so the driver will report EIO for many types of errors. We
should be robust enough to handle that.

And chunk seems unrelated to the rest of the patch.

Ah... no. This is part of sending the STOP command in case of an error.
And sending the STOP command in case of an error is needed for a working retry
of write commands in SPI mode. Without a STOP command, you can not retry a
broken write sequence because the card missed the STOP token.


Ok, "unrelated" might be a bit strong. But it isn't necessary to commit
them together. For the sake of keeping the patch small, I think this
should be a separate patch which will be committed before the retry
changes.

Could you also move those modifications to its own patch?

Why? It is meaningless without a retry logic for block writes.


See above. Reduces the size of this patch.

Rgds
--
-- Pierre Ossman

WARNING: This correspondence is being monitored by the
Swedish government. Make sure your server uses encryption
for SMTP traffic and consider using PGP for end-to-end
encryption.

Attachment: signature.asc
Description: PGP signature



Relevant Pages

  • Re: [PATCH] mmc_spi: do propper retry managment in the block layer - rewritten
    ... This patch was tested with induced transmission errors ... When you say "proper retry management", ... Writes were not retried at all, so every transmission error created a data ... the card will miss the stop token. ...
    (Linux-Kernel)
  • Re: [PATCH] mmc_spi: do propper retry managment in the block layer - 3rd try
    ... This could in theory happen when the card has some internal problem ... response to stop command is lost) ... But this will increase wear leveling problems for the rest of the request. ... didn't get a lot of noise in dmesg for the cases where a retry solved ...
    (Linux-Kernel)
  • Re: [PATCH] Correctly release and allocate a new request on TUR retries
    ... Jens Axboe wrote: ... I/O request being re-used, without being re-initialized properly. ... Prior to this patch, boots were failing consistently as in: ... repeated of this retry usage model. ...
    (Linux-Kernel)
  • Re: ath patch+hal for test
    ... couple of days ago I tried this exact process and wasn't successfull ... Also, the card I am using is probably a pretty common one, so I'd ... The patch worked EXCELLENT, not a single error ... laptop and use it all over the house ...
    (freebsd-current)
  • [RFC PATCH 0/8] rqbased-dm: request-based device-mapper
    ... I'm working on device-mapper multipath. ... and enables mapping at request level instead of bio level. ... The patch could be a basis of better dynamic load balancing. ... I/O mapping after bio merged is needed for better ...
    (Linux-Kernel)