Re: acpi_evaluate_integer broken by design



On Wed, 26 Nov 2008 17:37:29 -0500 (EST)
Len Brown <lenb@xxxxxxxxxx> wrote:



Now I know why I had strange "scheduling in atomic" problems:
acpi_evaluate_integer() does malloc(..., irqs_disabled() ? GFP_ATOMIC
: GFP_KERNEL)... which is (of course) broken.

That is kinda weird. When did this all start happening?

There's no way to reliably tell if we need GFP_ATOMIC or not from
code, this one for example fails to detect spinlocks held.

Len, this looks like 2.6.28 material. But given the poor quality of
the changelog it is hard to be sure about this. Why isn't everyone
seeing these warnings? What did Pavel do to provoke these alleged
warnings? Nobody knows...

I don't know know why pavel sees this and nobody else --
maybe something unusual he's doing with suspend?

The reason that the ACPI code is littered with bogus
irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL)
is because, like boot, resume starts life with interrupts off.

yes, suspend's disabling of interrupts causes problems all over the place.

If we really do need to inspect global state to handle this, it would
be much better to create a new

bool resume_is_running_so_you_cant_sleep;

and to test that. Something which is clear, highly specific and which
cannot be accidentally triggered via other means.

I would prefer that resume and boot handle this the same way,
with system_state. However, a few years ago when I suggested
using system_state for resume, Andrew thought that was a very
bad idea. Andrew, do you still feel that way?

yep. We've had problems in the past with system_state, where people add
new states, and old code breaks. Plus as we add more dependencies on
system_state, that reduces our ability to add new states and makes it
harder to alter (ie: fix) system_state transitions, etc. Just run
away!

As I said above, if we have a piece of code which wants to know when a
separate piece of code is in a particualr state, it is better to add a
new state flag just for that application, rather than trying to infer
things from the heavily overloaded system_state.


Obviously, it is even better to do neither. It is a basic and
oft-reoccurring design mistake for a low-level piece of code to
hardwire its GFP_foo decision. The gfp_t should be passed in from the
callers, all the way down the chain from the top-level code which
actually knows what state this thread of control is in.

-Len

ps. I'll put this particular fix in my tree now.

bewdy. It's a good change.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • Re: Paragraph alignment
    ... and, six months after its release, not have a fix for it yet. ... Steve Rindsberg, PPT MVP ... have I been able to force it to occur no matter what fonts or other changes ... Nobody has included version specifics for OS ...
    (microsoft.public.mac.office.powerpoint)
  • Re: 463 kernel developers missing!
    ... Andrew was getting credit for Len Brown's patches, ... patches to fix the errors. ...
    (Linux-Kernel)
  • Re: Help!!!!! on FRG-7700
    ... Here is an actual *shortwave* post! ... either nobody will fix it cheap, or maybe they think it is ready to give ...
    (rec.radio.shortwave)
  • 2.6.2-rc2 Hangs on boot (was: [patch] Re: Kernels > 2.6.1-mm3 do not boot. - SOLVED)
    ... Andrew> Does this fix the test_wp_bit oops? ... nor does this fix my booting problemwith 2.6.2-rc2. ... hm, page 000fe000 reserved twice. ... send the line "unsubscribe linux-kernel" in ...
    (Linux-Kernel)
  • Re: Iyonix upgrades
    ... groups163@xxxxxxxxxxxxxx (Andrew) wrote: ... Hardware mods - to fix known issues ... Firmware mod - To fix known screen/crash issues ... this is called the PAL upgrade. ...
    (comp.sys.acorn.hardware)