Madwifi 'stuck beacon' problem

'Madwifi' is a partially open source driver for various 802.11
chipsets produced by Atheros. There are actually three drivers for
these chips, with Madwifi being the oldest and most
feature-complete. Specifically, it is the only one which supports
using such a card as a wireless access point. The hardware itself is
fairly 'simplistic', with most of the AP-functionality provided by the
device driver, including the periodic transmission of beacon frames.
The way this is (presumably) supposed to work is that the card
provides three different timers, the TBTT-timer (target beacon
transmission time), the DBA-timer (DMA beacon alert) and the
SWBA-timer (software beacon alert). All of these are programmed to
fire once every beacon period, with the DBA-timer starting a little
before the TBTT-timer (2us) and the SWBA-timer a little more (10us).
When the SWBA-timer fires, the card generates an interrupt, upon
reception of which the driver prepares the next beacon frame for
transmission. When the DBA-timer expires, the card (presumably) starts
to DMA the beacon data from RAM to the TX-FIFO of the NIC (I have no
real idea what the TBTT-timer is used for, except that it must be
active). The driver uses a specific hardware transmission queue for
beacons (9 for 'my' card). The beacon send routine contains some logic
to detect whether the last beacon was actually sent before queueing
the next one: it checks that the pending frames count for the beacon
queue is zero and that the 'DMA enable' flag associated with it is
clear. When either test fails, a 'stuck beacon' is reported and the
current beacon period is skipped. After eleven 'stuck
beacons' in sequence, a reset of the card is done in order to get
things going again. A well-known issue with these chipsets and the
driver is that the 'stuck beacon' condition will persist indefinitely,
which causes the card to effectively cease to function as an AP (cf

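The stuck-beacon check described above can be sketched roughly as
follows. This is a standalone model, not code from the Madwifi driver:
all type and function names are made up for illustration, the queue
state is passed in as a plain struct instead of being read from the
hardware, and it assumes that the reset happens on the eleventh
consecutive stuck beacon.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the beacon queue as seen by the driver. */
struct beaconq_state {
    uint32_t pending_frames; /* frames still pending on the beacon queue */
    bool     dma_enabled;    /* 'DMA enable' flag of the queue is still set */
};

enum beacon_action {
    BEACON_QUEUE_NEXT,  /* last beacon went out, queue the next one */
    BEACON_SKIP_PERIOD, /* stuck beacon, skip the current beacon period */
    BEACON_RESET_CARD   /* too many stuck beacons in sequence, reset */
};

/* "Eleven in sequence" as stated in the text; the exact comparison the
 * real driver uses is an assumption here. */
#define STUCK_BEACON_RESET_COUNT 11u

struct beacon_tracker {
    unsigned consecutive_stuck; /* stuck beacons seen in a row so far */
};

/* Called once per SWBA interrupt in this model. */
static enum beacon_action
beacon_check(struct beacon_tracker *t, const struct beaconq_state *q)
{
    assert(t != NULL && q != NULL);

    /* Both tests must pass for the last beacon to count as sent. */
    if (q->pending_frames == 0 && !q->dma_enabled) {
        t->consecutive_stuck = 0;
        return BEACON_QUEUE_NEXT;
    }

    /* Either test failed: this is a 'stuck beacon'. */
    if (++t->consecutive_stuck >= STUCK_BEACON_RESET_COUNT) {
        t->consecutive_stuck = 0;
        return BEACON_RESET_CARD;
    }
    return BEACON_SKIP_PERIOD;
}
```

In this model the persistent condition corresponds to beacon_check
returning BEACON_SKIP_PERIOD again right after every
BEACON_RESET_CARD, because the queue never drains.
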
Judging from

* Is it possible that this could fix the terrible stuck beacon nightmare?

answer: in our own integration madwifi version we dont have
this stuck beacon nightmare anymore, patches for madwifi are
commited by nbd as far as i know

the only people (I know of) having access to the source code of the
Atheros hardware abstraction layer believe the issue to be fictitious.
One of the Atheros chipsets supported by the driver (AR5416) is
supposed to provide 802.11n support for the 'next generation' hardware
of my employer (which I believe to be fictitious :->>). The persistent
'stuck beacon' condition can reliably be triggered by trying to use
the wlan interface of an iPhone with this AR5416-based AP, which
somewhat decreases the 'commercial viability' of this approach.

It is possible (for the card I have here, the two iPhones used for
testing and a random laptop with an integrated Atheros 802.11 NIC
causing the same issue to occur) to work around the problem by not
using the DBA-timer (which only adds an unspecified delay to the
beacon transmission for a reason I cannot presently imagine). This
means not configuring it during 'beaconinit' (name of the routine
programming the timers) and clearing the AR_Q_MISC_FSP_DBA_GATED flag
of the beacon transmission queue after the ath_beaconq_config-routine
(if_ath.c) has finished configuring it (after the call to
ath_hal_resettxqueue near its end has returned). This causes beacon
transmissions to start immediately after the ath_beacon_send routine
has enabled DMA for the beacon queue in response to a SWBA-interrupt.
The modified driver has survived having two iPhones associated with it
(and actually being used) for five to six hours.
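
The register manipulation involved can be sketched as below. This is
only a model operating on a plain 32-bit value, not a patch against
if_ath.c: the helper name is made up, and the numeric constants are
assumptions based on how the frame scheduling policy field appears to
be defined in the ath9k register header (a multi-bit field in the
per-queue 'misc' register, with DBA_GATED being one of its values), so
they should be checked against the ath9k source before use.

```c
#include <stdint.h>

/* Assumed values; verify against the register definitions in ath9k. */
#define AR_Q_MISC_FSP            0x0000000fu /* frame scheduling policy field */
#define AR_Q_MISC_FSP_ASAP       0u          /* start as soon as DMA is enabled */
#define AR_Q_MISC_FSP_DBA_GATED  2u          /* hold transmission until DBA fires */

/*
 * Hypothetical helper: given the current content of the beacon queue's
 * 'misc' register, return the value to write back so that the queue is
 * no longer gated by the DBA-timer. All other bits are left untouched.
 */
static uint32_t qmisc_clear_dba_gating(uint32_t qmisc)
{
    if ((qmisc & AR_Q_MISC_FSP) == AR_Q_MISC_FSP_DBA_GATED)
        qmisc = (qmisc & ~AR_Q_MISC_FSP) | AR_Q_MISC_FSP_ASAP;
    return qmisc;
}
```

In a real driver the value would be read from and written back to the
misc register of the beacon queue (queue 9 on the card described
here) after ath_hal_resettxqueue has returned; the register address
itself is deliberately not reproduced in this sketch.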

NB: This information has been determined by perusing the publicly
available 'hardware abstraction code' in the open source ath9k-linux
driver and by experiments with the card. The modified driver has had
very little testing and if someone manages to, e.g., fry his NIC by
trying to use my suggestion above, 'let his blood be on his own
hands'. The necessary register addresses and constants can be found in
the ath9k source code, which is part of the 'wireless testing' Linux
kernel tree.