Re: pci error recovery procedure



On Thu, 2006-09-07 at 04:39, Linas Vepstas wrote:
On Wed, Sep 06, 2006 at 10:04:31AM +0800, Zhang, Yanmin wrote:
On Wed, 2006-09-06 at 03:17, Linas Vepstas wrote:
On Mon, Sep 04, 2006 at 01:47:30PM +0800, Zhang, Yanmin wrote:

Again, consider the multi-function cards. On pSeries, I can only enable
DMA on a per-slot basis, not a per-function basis. So if one driver
enables DMA before some other driver has reset appropriately, everything
breaks.
Does here 'reset' mean hardware slot reset?

I should have said: If one driver of a multi-function card enables DMA before
another driver has stabilized its harware, then everything breaks.
What's another driver's hardware? A function of the previous multi-function
card? Or a function of another device?

Yes. Either. Both. Doesn't matter. Enabling DMA is "granular" at a
different size scale than pci functions, and possibly even pci devices
or slots, dependeing on the architecture. Before DMA can be enabled,
*all* affected device drivers have to be approve, and have to be ready
for it.
If we enabled both DMA and MMIO at the same time, there are many cases
where the card will immediately trap again -- for example, if its
DMA'ing to some crazy address. Thus, typically, one wants DMA disabled
until after the card reset. Without the mmio_enabled() reset, there
is no way of doing this.

Did you asume the card reset is executed by callback mmio_enabled?

I am assuming that, when a driver receives the mmio_enabled() callback,
it will perform some sort of register i/o. For example, I am currently
planning to modify the e1000 driver to do the following:

-- The error_occurred() callback returns PCI_ERS_RESULT_CAN_RECOVER
-- The arch enables mmio, and then calls the mmio_enabled() callback.
-- The mmio_enabled() callback in the driver takes a full dump of all
of the regsters on the card. It then returns PCI_ERS_RESULT_NEED_RESET
Such dumping are random data and might be useless. The error recovery procedures
are to process pci hardware errors instead of device driver bug.

-- The arch performs the full electrical #RST of device. Recovery from
this point proceeds as before.
The steps are exquisite. Scenario:

The e1000 NIC and another device (maybe a function) are on the same bus. The
error_detected of the second device returns PCI_ERS_RESULT_NEED_RESET, so although
error_detected of e1000 returns PCI_ERS_RESULT_CAN_RECOVER, the slot will
be reset immediately, then error recovery will go to call slot_reset callback
directly. The mmio_enabled is not called.

My above scenario is just to say something is easy to be out of control if the steps
are complicated.


Again, consider the multi-function cards. On pSeries, I can only enable
DMA on a per-slot basis, not a per-function basis. So if one driver
enables DMA before some other driver has reset appropriately, everything
breaks.

What does 'I' above stand for? The platform error recovery procedure

Yes. The pSeries platform error recovery procedure can only enable DMA
on a per-slot basis.

I guess it means platform, that is,
only platform enables DMA for the whole slot.

Yes.

But why does the last sentence
become driver enables DMA?

In your proposal, you were suggesting that MMIO and DMA be enabled with
one and the same routine, and I was attempting to explain why that can't
work.
Thanks for your explanations. My point is that if driver could enable DMA,
it could do so in the new error_resume. Driver should do more checking before
enabling DMA.

Your scenario only exists when:
1) Only platform could enable DMA and enable it per-slot instead of per-function.
2) And at least one device doesn't want a hard slot reset to recover while
all other impacted devices also don't want a hard slot; Because if one device want a
hard reset, mmio_enabled of all impacted drivers won't be called.
3) And at least one device's DMA is crazy.

If using my new API, I just need destroy one condition above. My requirement is:
Only if a device uses DMA and the driver is not sure or sure if DMA is pending,
its error_detected should return PCI_ERS_RESULT_NEED_RESET. Otherwise, error_detected
is allowed to return whatever.


Could driver enable DMA for a function?

No, not on pSeries hardware.

If mmio_enabled is not used currently, I think we could delete it firstly. Later on,
if a platform really need it, we could add it, so we could keep the simplied codes.

It would be very difficult to add it later. And it would be especially
silly, given that someone would find this discussion in the mailing list
archives.
You stick to keep mmio_enabled which is not used currently, but if there will be
a new platform who uses a more fine-grained steps to recover pci/pci-e, would
you say 'it would be very difficut' and refuse add new callbacks?

Yes.
It's not fare to such other platforms although I have no such example now.


It doesn't prevent software from merging some steps. And, we want
to implement pci/pci-e error recovery for more platforms instead of just
pSeries.

Yes. The API was designed so that it could be supported on every
current and future platform we could think of. You haven't yet
claimed that "pci-e can't be supported".
Current error handler infrastructure could support pci-e, but I want a better
solution to faciliate driver developers to add error handlers more easily. My
startpoint is driver developer. If they are not willing to add error handlers,
it's impossible to do so for all drivers by you and me.


Based on what
I understand, changing the API wouldn't make the implementation
any easier. (It is very easy to call a callback, and then
examine its return value.
It's not easy. Just like above scenario, mmio_enabled might be jumped over when
coordinating 2 more devices.
Checking current e100/e1000/ipr error handlers, they look ugly.

Removing a few callbacks does not
materially simplify the recovery mechanism. Managing these
callbacks is *not* the hard part of implementing this thing.)
Above comments are totally from error recovery design point of view. No considering
for driver developers.

BTW, most discussion is about if mmio_enabled should be deleted. As for merging
slot_reset and resume, my reason is that there is no platform specific operation
between calling slot_reset and resume.

Yanmin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • dma timeout
    ... my *old hdd* got up with dma mode. ... PIIX4: IDE controller at PCI slot 00:04.1 ... ide0 at 0x1f0-0x1f7,0x3f6 on irq 14 ... hda: attached ide-disk driver. ...
    (Debian-User)
  • Re: [RFC v2 2/5] dmaengine: Add slave DMA interface
    ... DMA engine driver is free to decide on its own. ... with, than slave transfers, which may be quite slow depending on the ...
    (Linux-Kernel)
  • Re: [PATCH] Blackfin: blackfin on-chip SPI controller driver
    ... patch, rather than initial-plus-cleanups. ... Please put this in Kconfig up with the other SPI controller drivers, ... relevant points in the driver. ... place to reverse any DMA mappings ... ...
    (Linux-Kernel)
  • [PATCH RFC v5] net: add PCINet driver
    ... PCI bus as its transport mechanism. ... RFC v4 -> RFC v5: ... use seperate DMA channels for RX and TX ... Thanks to all of those who have posted comments about the driver. ...
    (Linux-Kernel)
  • Re: can device drivers return non-ram via vm_ops->nopage?
    ... so for them mmap'd PIO is easy. ... Don't give driver writers the opportunity to think ... If there are special DMA requirements of a particular bus or platform, ...
    (Linux-Kernel)

Loading