Re: [PATCH] 8250 UART backup timer



Alex Williamson wrote:
The patch below works around a minor bug found in the UART of the
remote management card used in many HP ia64 and parisc servers (aka the
Diva UARTs). The problem is that the UART does not reassert the THRE
interrupt if it has been previously cleared and the IER THRI bit is
re-enabled. This can produce a very annoying failure mode when used as
a serial console, allowing a boot/reboot to hang indefinitely until an
RX interrupt kicks it into working again (i.e. an unattended reboot could
stall).

To solve this problem, a backup timer is introduced that runs
alongside the standard interrupt driven mechanism. This timer wakes up
periodically, checks for a hang condition and gets characters moving
again. This backup mechanism is only enabled if the UART is detected as
having this problem, so systems without these UARTs will have no
additional overhead.
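
The backup timer idea described above can be sketched in user-space C. This is only an illustration, not the code from the patch: every name prefixed with sim_ is hypothetical, struct sim_port is a stand-in rather than the kernel's port structure, and the exact "hang condition" tested here (no interrupt reported, THRI enabled, output queued, transmitter empty) is an assumption about what such a check would look at. The bit values follow the usual 16550 register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SIM_IER_THRI   0x02u  /* IER: THRE interrupt enabled */
#define SIM_IIR_NO_INT 0x01u  /* IIR: set when no interrupt is pending */
#define SIM_LSR_THRE   0x20u  /* LSR: transmit holding register empty */

/* Hypothetical stand-in for a serial port, used only for this sketch. */
struct sim_port {
    uint8_t ier;               /* interrupt enables as last written */
    uint8_t iir;               /* what the device currently reports */
    uint8_t lsr;               /* line status */
    size_t  pending;           /* characters still queued for output */
    size_t  sent;              /* characters handed to the device */
    bool    use_backup_timer;  /* set only for UARTs detected as affected */
};

/* Build a port stuck in the state described above: the transmitter is
 * idle and data is queued, but the device never raises THRE again. */
static struct sim_port sim_stalled_port(const char *msg, bool use_backup_timer)
{
    struct sim_port p;

    assert(msg != NULL);
    p.ier = SIM_IER_THRI;
    p.iir = SIM_IIR_NO_INT;
    p.lsr = SIM_LSR_THRE;
    p.pending = strlen(msg);
    p.sent = 0;
    p.use_backup_timer = use_backup_timer;
    return p;
}

/* The transmit work the THRE interrupt handler would normally do.
 * The model sends one character and is immediately idle again. */
static void sim_tx_one(struct sim_port *p)
{
    if (p->pending > 0) {
        p->pending--;
        p->sent++;
    }
}

/* Periodic timer callback: if the port looks hung, do the handler's job. */
static void sim_backup_timeout(struct sim_port *p)
{
    bool hung;

    if (!p->use_backup_timer)
        return;  /* unaffected UARTs never run the extra check */

    hung = (p->iir & SIM_IIR_NO_INT) &&  /* device reports nothing pending */
           (p->ier & SIM_IER_THRI) &&    /* yet we asked for THRE */
           p->pending > 0 &&             /* and there is output waiting */
           (p->lsr & SIM_LSR_THRE);      /* and the transmitter is idle */
    if (hung)
        sim_tx_one(p);
}

/* Fire the timer a number of times and report how much was sent. */
static size_t sim_run_ticks(struct sim_port *p, unsigned int ticks)
{
    unsigned int i;

    for (i = 0; i < ticks; i++)
        sim_backup_timeout(p);
    return p->sent;
}
```

In this model a stalled port with the timer enabled drains its queue one character per tick, while the same port without the timer stays stuck until something else (such as an RX interrupt on real hardware) restarts it.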

This version of the patch incorporates previous comments from Pavel
and removes races in the bug detection code. The test is now done
before the irq linking to prevent races with the interrupt handler
clearing the THRE interrupt. Short delays and syncs are also added to
ensure the device is able to update register state before the result is
tested. Comments? Thanks,
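
For readers following along, the detection step described above might look roughly like the following user-space sketch in C. It is not the patch itself: struct sim_uart and all sim_ functions are hypothetical, and the device model (a THRE event that is latched once and only re-latched on enable by a well-behaved part) is an assumption made to show the probe sequence. The bit values follow the usual 16550 register layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SIM_IER_THRI   0x02u  /* IER: enable the THRE interrupt */
#define SIM_IIR_NO_INT 0x01u  /* IIR: set when no interrupt is pending */
#define SIM_IIR_THRI   0x02u  /* IIR: THRE is the pending interrupt */

/* Hypothetical device model, used only for this illustration. */
struct sim_uart {
    bool    quirky;        /* true: does not reassert THRE on re-enable */
    bool    thr_empty;     /* transmitter is idle */
    bool    thre_latched;  /* a THRE event is waiting to be reported */
    uint8_t ier;           /* interrupt enables as last written */
};

static void sim_uart_reset(struct sim_uart *u, bool quirky)
{
    assert(u != NULL);
    u->quirky = quirky;
    u->thr_empty = true;
    u->thre_latched = true;  /* the idle transmitter raised THRE once */
    u->ier = 0;
}

static void sim_write_ier(struct sim_uart *u, uint8_t val)
{
    bool enabling = !(u->ier & SIM_IER_THRI) && (val & SIM_IER_THRI);

    u->ier = val;
    /* A well-behaved part raises THRE again when idle and enabled. */
    if (enabling && u->thr_empty && !u->quirky)
        u->thre_latched = true;
}

static uint8_t sim_read_iir(struct sim_uart *u)
{
    if ((u->ier & SIM_IER_THRI) && u->thre_latched) {
        u->thre_latched = false;  /* reading IIR clears a pending THRE */
        return SIM_IIR_THRI;
    }
    return SIM_IIR_NO_INT;
}

/*
 * Probe, meant to run before the interrupt handler is hooked up so that
 * nothing else can clear THRE in between. On real hardware each register
 * write would be followed by a short delay; the model responds instantly.
 */
static bool sim_needs_backup_timer(struct sim_uart *u)
{
    uint8_t iir;

    sim_write_ier(u, SIM_IER_THRI);  /* enable: THRE should be pending */
    (void)sim_read_iir(u);           /* reading IIR clears it */
    sim_write_ier(u, 0);
    sim_write_ier(u, SIM_IER_THRI);  /* re-enable: should reassert */
    iir = sim_read_iir(u);
    sim_write_ier(u, 0);             /* leave interrupts off afterwards */

    return (iir & SIM_IIR_NO_INT) != 0;
}
```

With this model, a UART reset with quirky set to false reports the interrupt again on the second enable and needs no timer, while one reset with quirky set to true reports "no interrupt" and would get the backup timer.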


I have seen this same bug in soft UART IP from "a major vendor."
Did you have a chance to test this patch on these machines to see if it
solves the problem?

Thanks,

--
Aristeu
