RE: refcount leak in pci_get_device()?



And the pci_get_dev_by_id() is not safe again the PCI device removal. It might fire a warning in bus_find_device() when reference count of the knode_bus is decreased to 0 by pci_remove_bus().

Need to document this or a fix?

Thanks,
Yu

On Friday, August 22, 2008 6:15 AM, Alex Chiang wrote:
* Greg KH <gregkh@xxxxxxx>:
On Thu, Aug 21, 2008 at 02:25:04PM -0600, Matthew Wilcox wrote:
On Thu, Aug 21, 2008 at 02:19:18PM -0600, Alex Chiang wrote:
#define for_each_pci_dev(d) while ((d = pci_get_device(PCI_ANY_ID,
PCI_ANY_ID, d)) != NULL)

That eventually calls pci_get_dev_by_id(), which increases the
refcount on the device, but never decrements it.

Looks like that change in behavior happened here:

PCI: clean up search.c a lot
95247b57ed844511a212265b45cf9a919753aea1

pci_get_device() used to decrement the refcount, but no longer
does.

Thanks to Matthew Wilcox for helping me get this far...

Like I said, I'm still trying to track down my particular issue,
but I'd like to get your opinion on this.

In particular, I'd like to know whether this should be fixed by
pci_get_dev_by_id() decrementing the refcount of from/dev_start,
pci_get_subsys() decrementing 'from', or by bus_find_device()
decrementing 'start'. It looks like bus_find_device() is the place
where this should logically happen, but the kerneldoc doesn't document
the intended behaviour.

Ah, no the driver core isn't supposed to do this, it's something the pci
functions do out of "niceness" as that's how we can use them in an
iterator properly.

Does the following (untested) patch fix the issue for you all?

Perfect, yes.

I applied it, and observed the refcount on the device I was using
to debug this problem.

I was able to modprobe acpiphp successfully, and echo 0 into the
device's 'power' file.

I then watched the rest of the hotplug core do its thing,
decrementing the refcount properly along the way, and at the end,
we did call pci_release_dev(), as I originally expected to. :)

Just to verify, I toggled the device's power a few times (echoing
a 1 and then a 0, etc.) and watched the refcounts. After each
offline, we properly called pci_release_dev().

Thanks for fixing this. It fixes a pretty bad leak in the hotplug
core (we were leaking an entire struct pci_dev * # of functions
for each offlined card, the first time around; subsequent
onlines/offlines were ok).

Tested-by: Alex Chiang <achiang@xxxxxx>

/ac


thanks,

greg k-h

--------------
Subject: PCI: fix reference leak in pci_get_dev_by_id()

From: Greg Kroah-Hartman <gregkh@xxxxxxx>

Alex Chiang and Matthew Wilcox pointed out that pci_get_dev_by_id() does
not properly decrement the reference on the from pointer if it is
present, like the documentation for the function states it will.

Cc: Matthew Wilcox <matthew@xxxxxx>
Cc: Alex Chiang <achiang@xxxxxx>
Cc: stable <stable@xxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxx>


diff --git a/drivers/pci/search.c b/drivers/pci/search.c
index 217814f..3b3b5f1 100644
--- a/drivers/pci/search.c
+++ b/drivers/pci/search.c
@@ -280,6 +280,8 @@ static struct pci_dev *pci_get_dev_by_id(const struct
pci_device_id *id,
match_pci_dev_by_id);
if (dev)
pdev = to_pci_dev(dev);
+ if (from)
+ pci_dev_put(from);
return pdev;
}


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages

  • acquire barrier for atomic reference counting
    ... which case you need acquire semantics. ... The reasoning on the atomic decrement has been that release semantics ... before the refcount is decremented (since that could potentially allow ... with the destruction code interfering with earlier operations on the ...
    (comp.programming.threads)
  • Re: refcount leak in pci_get_device()?
    ... refcount on the device, but never decrements it. ... Ah, no the driver core isn't supposed to do this, it's something the pci ... not properly decrement the reference on the from pointer if it is ...
    (Linux-Kernel)
  • Re: [BUG] 2.6.22-rc3-mm1 remove bluetooth usb adapter caused kmalloc bug
    ... Note that the corruption seems to have its cause in a decrement done at ... offset 16 into the object pointing to the refcount in struct hci_dev. ... it looks like the refcount was decremented after the object was freed. ... And if this is a bug, should I write another post to list? ...
    (Linux-Kernel)
  • Re: [BUG] 2.6.22-rc3-mm1 remove bluetooth usb adapter caused kmalloc bug
    ... Note that the corruption seems to have its cause in a decrement done at ... offset 16 into the object pointing to the refcount in struct hci_dev. ... it looks like the refcount was decremented after the object was freed. ...
    (Linux-Kernel)
  • Re: [PATCH] fix page leak during core dump
    ... This patch looks OK to me, although a refcount leak on the ZERO_PAGE is ... It used to be the case that we'd ignore attempts to change the refcount on ... Nick, Hugh: agree? ... Easier to fix than work out ...
    (Linux-Kernel)