Re: [RFC] killing the NR_IRQS arrays.



Arnd Bergmann <arnd@xxxxxxxx> writes:


I have to admit I still don't really understand how this works
at all. Can a driver that uses msi-x have different handlers
for each of those interrupts registered simultaneously?

Yes, and the irqs can be routed to different cpus independently.
However, not all hardware supports all 4K irqs. 4K is the implementation
defined maximum. Infiniband HCAs are rumored to support a lot of irqs,
and it isn't uncommon for simpler nics to support 4 or so.

Conceptually think of it as having an irq controller embedded in your
pci device.

The MSI messages are writes to special addresses that are then
converted into CPU interrupts.

I would expect that instead there should be only one 'struct irq'
for the device, with the handler getting a 12 bit number argument.

No. That would be unnecessary coalescing. Even if that was what the
hardware layer gave us (and it doesn't), the generic layers should
demux these things as much as is reasonable.
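
To make the "irq controller embedded in your pci device" picture concrete, here is a minimal user-space sketch, not kernel code. Every name in it (msix_model, msix_model_request, msix_model_deliver, rx_handler, tx_handler) is hypothetical, and the 4096 limit is simply the 4K figure from the text above. It only illustrates that each vector carries its own handler and that delivery is demuxed per vector.

```c
#include <assert.h>
#include <stddef.h>

#define MSIX_MAX_VECTORS 4096	/* the 4K figure from the discussion */

typedef void (*vec_handler_t)(unsigned int vector, void *dev_id);

struct vec_slot {
	vec_handler_t handler;
	void *dev_id;
};

/* Hypothetical model of one device's embedded "irq controller". */
struct msix_model {
	unsigned int nr_vectors;	/* how many this device implements */
	struct vec_slot slot[MSIX_MAX_VECTORS];
};

/* Register a handler for one vector; returns 0, or -1 on failure. */
static int msix_model_request(struct msix_model *m, unsigned int vector,
			      vec_handler_t handler, void *dev_id)
{
	assert(m->nr_vectors <= MSIX_MAX_VECTORS);
	if (vector >= m->nr_vectors || m->slot[vector].handler)
		return -1;
	m->slot[vector].handler = handler;
	m->slot[vector].dev_id = dev_id;
	return 0;
}

/* Deliver a message for one vector; returns 1 if a handler ran. */
static int msix_model_deliver(struct msix_model *m, unsigned int vector)
{
	if (vector >= m->nr_vectors || !m->slot[vector].handler)
		return 0;
	m->slot[vector].handler(vector, m->slot[vector].dev_id);
	return 1;
}

/* Two distinct example handlers that only count their invocations. */
struct counters {
	unsigned int rx;
	unsigned int tx;
};

static void rx_handler(unsigned int vector, void *dev_id)
{
	(void)vector;
	((struct counters *)dev_id)->rx++;
}

static void tx_handler(unsigned int vector, void *dev_id)
{
	(void)vector;
	((struct counters *)dev_id)->tx++;
}
```

A device with four vectors would register rx_handler on vector 0 and tx_handler on vector 1, and each delivery reaches only the handler for its own vector.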


s390: got rid of irq numbers already

Yes. I should really look at that more and see if I could bring
s390 into the generic irq code with my planned changes.

I don't think there is much point in changing the s390 code, but
the way it is solved there may be interesting for other buses
as well. The interrupt handler there is not being registered
explicitly, but is part of the driver (in case of subchannel)
or of the device (in case of ccw_device) data structure.

Similarly, in a pci device, one could imagine that the
struct pci_driver contains a irq_handler_t member that
is registered from the pci_device_probe() function
if present.

Yes. There is some potential there. Although we would have to go
through an extra hoop to make it a pci specific handler type.
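
A rough user-space sketch of that idea, under the assumption that the bus probe path installs a handler found in the driver structure. None of these names (model_pci_driver, model_pci_dev, model_pci_device_probe, demo_driver) exist in the kernel; the bus-specific handler type stands in for the "extra hoop" mentioned above.

```c
#include <assert.h>
#include <stddef.h>

struct model_pci_dev;

/* Bus-specific handler type: it gets the device, not a bare void *. */
typedef int (*model_pci_irq_handler_t)(struct model_pci_dev *dev);

struct model_pci_driver {
	const char *name;
	int (*probe)(struct model_pci_dev *dev);
	model_pci_irq_handler_t irq_handler;	/* optional, may be NULL */
};

struct model_pci_dev {
	unsigned int irq;			/* 0 means no interrupt */
	const struct model_pci_driver *driver;
	model_pci_irq_handler_t handler;	/* set by the bus code */
	unsigned int events;
};

/* Bus-level probe: bind the driver, then hook up its handler if any. */
static int model_pci_device_probe(struct model_pci_dev *dev,
				  const struct model_pci_driver *drv)
{
	int ret;

	assert(dev && drv);
	ret = drv->probe ? drv->probe(dev) : 0;
	if (ret)
		return ret;
	dev->driver = drv;
	if (drv->irq_handler && dev->irq)
		dev->handler = drv->irq_handler;
	return 0;
}

/* Example driver: it never calls a request function itself. */
static int demo_irq(struct model_pci_dev *dev)
{
	dev->events++;
	return 1;
}

static const struct model_pci_driver demo_driver = {
	.name = "demo",
	.irq_handler = demo_irq,
};
```

The driver only fills in a member; whether the interrupt is identified by a number or by a pointer stays hidden in the bus code.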


Note that we can even start converting device drivers first, before
moving away from irq numbers. A typical PCI driver should get
somewhat simpler by the conversion, and when they are all converted,
we can replace pci_dev->irq with a struct irq* under the covers.

Reasonable if it is easy and straightforward.
Something like pci_request_irq(dev, ...), where the helper looks at
dev->irq under the covers and calls request_irq or whatever makes
sense. Is this what you are thinking? Examples would help me here.

Ok, I had an example in one of my previous posts, but based on the
discussion since then, it has become significantly simpler, basically
reducing the work to

struct irq *pci_irq_request(struct pci_dev *dev,
                            irq_handler_t handler)
{
        /* No interrupt assigned to this device. */
        if (!dev->irq)
                return ERR_PTR(-ENODEV);

        /* The device itself is always passed as dev_id. */
        return irq_request(dev->irq, handler, IRQF_SHARED,
                           dev->driver->name, dev);
}

int pci_irq_free(struct pci_dev *dev)
{
        return irq_free(dev->irq, dev);
}

The most significant change of this to the current code
would be that we can pass arguments down to irq_request
automatically, e.g. the irq handler can always get the
pci_device as its dev_id.

Yes. Mostly. Since dev_id is what is passed back to the irq handler,
it makes sense to pass the device when the irq is registered.
Passing the driver a pointer to the driver specific structure (not
the pci device) makes a lot more sense from an efficiency
standpoint. Now it may make sense to remove the irq parameter
from irq_handler_t, and require drivers to look at their dev_id to
see which irq they really are processing.
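
The efficiency point can be sketched in user space as follows. This is an assumption-laden model, not the kernel API: sketch_pci_dev, nic_priv, sketch_irq_handler_t and both handlers are made-up names, and the handler type deliberately has no irq parameter, so the driver caches the irq in its own structure.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the bus-level device structure. */
struct sketch_pci_dev {
	unsigned int irq;
	void *drvdata;		/* points at the driver specific structure */
};

/* Driver specific structure: what the handler actually works on. */
struct nic_priv {
	struct sketch_pci_dev *pdev;
	unsigned int irq;	/* cached, since the handler gets no irq */
	unsigned long rx_packets;
};

/* Hypothetical handler type with the irq parameter removed. */
typedef int (*sketch_irq_handler_t)(void *dev_id);

/* dev_id is the private structure: the state is reached directly. */
static int nic_irq_via_priv(void *dev_id)
{
	struct nic_priv *priv = dev_id;

	assert(priv != NULL);
	priv->rx_packets++;
	return 1;
}

/* dev_id is the pci device: one extra dereference on every interrupt. */
static int nic_irq_via_pdev(void *dev_id)
{
	struct sketch_pci_dev *pdev = dev_id;
	struct nic_priv *priv = pdev->drvdata;

	assert(priv != NULL);
	priv->rx_packets++;
	return 1;
}
```

Both variants do the same work; the second just pays for the extra pointer chase through the device on each interrupt.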

There is a danger here of making things so generic you don't have good
performance, or the code becomes unnecessarily complex.

For talking to user space I expect we will have numbers for a long time
to come yet.

I was wondering about that. Do you only mean /proc/interrupts or
are there other user interfaces we need to worry about?

Yes. There are other interfaces like /proc/irq/XXX/smp_affinity,
for irq migration. There are also device specific ioctls. There is
lspci. I don't know what all else, and given the current state of the
kernel it is hard to grep for.

For /proc/interrupts, what could break if we have interrupt numbers
only local to each controller and potentially duplicate numbers
in the list? It's good to be paranoid about changes to proc files,
but I can definitely see value in having meaningful interrupt
numbers in there instead of making up a more or less random mapping
to a flat number space.

Well, I can have meaningful flat numbers, and on i386 and x86_64,
except for msi, I have that. The problem is that for the numbers
to have meaning I get a very sparse usage of the numbers, because
very frequently the hardware interrupt controllers have pins that
are not connected up, so I have about an order of magnitude more
numbers than I actually have irqs in use. That is before I start
reserving irq numbers for MSI.

For MSI (since they cannot be shared) it would actually make a lot of
sense to make the numbers domain,bus,device,func,(Nth device irq) but
I can't because bus,device,func is 16 bits worth of number. Add in
12 more bits for the worst case device assignment and I am up to
28 bits, and the rest of the 32 bits can be used for domain or
something like that. So while the current unsigned int irq works fine
for that, backing it with a fixed size array allocated at compile
time is just hopeless.
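
The bit budget above can be written out as a small sketch. The field order and the names (msi_irq_id, msi_irq_pack, msi_irq_unpack) are assumptions made for illustration; only the widths come from the text: 8+5+3 = 16 bits for bus,device,func, 12 bits for the Nth device irq, and the remaining 4 bits for domain.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical layout, not a kernel interface:
 *   bits 31..28  domain   (the 4 bits left over)
 *   bits 27..20  bus      (8 bits)
 *   bits 19..15  device   (5 bits)
 *   bits 14..12  function (3 bits)
 *   bits 11..0   Nth irq of the device (12 bits, up to 4K)
 */
struct msi_irq_id {
	unsigned int domain, bus, device, func, nth;
};

/* Pack the fields into one 32-bit irq number. */
static uint32_t msi_irq_pack(struct msi_irq_id id)
{
	assert(id.domain < 16 && id.bus < 256 && id.device < 32);
	assert(id.func < 8 && id.nth < 4096);
	return ((uint32_t)id.domain << 28) | ((uint32_t)id.bus << 20) |
	       ((uint32_t)id.device << 15) | ((uint32_t)id.func << 12) |
	       (uint32_t)id.nth;
}

/* Recover the fields from a packed irq number. */
static struct msi_irq_id msi_irq_unpack(uint32_t irq)
{
	struct msi_irq_id id = {
		.domain = irq >> 28,
		.bus    = (irq >> 20) & 0xff,
		.device = (irq >> 15) & 0x1f,
		.func   = (irq >> 12) & 0x7,
		.nth    = irq & 0xfff,
	};
	return id;
}
```

With this layout the number fits in an unsigned int, but an array indexed by it would need 2^28 entries per domain, nearly all of them unused.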

Eric


