Re: [BUG] re-modprobe a nand controller driver module will cause system crash.



Hi folks,

Has anyone run into this issue before?

-Bryan

On Thu, Oct 16, 2008 at 6:56 PM, Bryan Wu <cooloney@xxxxxxxxxx> wrote:
Hi folks,

I recently found a subtle bug that appears to be in the mtdcore layers.
The full details are at
https://blackfin.uclinux.org/gf/project/uclinux-dist/tracker/?action=TrackerItemEdit&tracker_id=141&tracker_item_id=4463.

Briefly speaking,
1) modprobe a nand controller driver, which calls add_mtd_partition().
2) add_mtd_partition->add_devices->blktrans_notify_add->mtdblock_add_mtd->add_mtd_blktrans_dev
3) in add_mtd_blktrans_dev, alloc_disk will be called to create a new
gendisk structure according to the partition setting.
4) "gd->queue = tr->blkcore_priv->rq;"
No matter how many partitions there are (2 in my test), each one
gets its own gendisk structure, but there is just 1 queue.
They all share the request_queue that is created in
register_mtd_blktrans.
5) mtdblockd kthread handles this request_queue for mtdblock layer.
6) request_queue contains a backing_dev_info structure as a member
(not a pointer), so several mtd partitions (several gendisks) share
a single bdi structure instance.
7) So the problem is in add_disk():
bdi_register_dev(bdi, MKDEV(disk->major, disk->first_minor));
For the 1st partition, mtdblock0, it creates /sys/class/bdi/31:0
and records the registration in the bdi structure instance.
Then for the 2nd partition, mtdblock1, the bdi structure instance is
the same one used by the 1st partition, so the call overwrites that
record and creates /sys/class/bdi/31:1.
The bdi info for the 1st partition is therefore completely lost.
8) When we rmmod the nand controller driver, del_mtd_partition only
removes /sys/class/bdi/31:1 and leaves the 1st partition's
/sys/class/bdi/31:0 behind.
9) Running modprobe again then triggers the crash.
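
The sequence in steps 1) to 9) can be pictured with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every name in it (sysfs_model, bdi_model, queue_model, model_bdi_register and so on) is hypothetical, the table of names merely stands in for /sys/class/bdi, and the "remember only the last registration" behaviour is taken from the description above rather than from the block layer source.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8
#define NAME_LEN    16
#define MTD_MAJOR   31  /* major number seen in /sys/class/bdi/31:0 above */

/* Hypothetical stand-in for /sys/class/bdi: a flat table of entry names. */
struct sysfs_model {
    char names[MAX_ENTRIES][NAME_LEN];
    int  used[MAX_ENTRIES];
};

/* Hypothetical stand-in for backing_dev_info: it can remember one entry. */
struct bdi_model {
    int slot;  /* index into sysfs_model, or -1 when nothing is registered */
};

/* Hypothetical request_queue: the bdi is embedded by value, not a pointer. */
struct queue_model {
    struct bdi_model bdi;
};

/* Returns 0 on success, -1 if an entry with the same name already exists. */
int model_bdi_register(struct sysfs_model *fs, struct bdi_model *bdi,
                       int major, int minor)
{
    char name[NAME_LEN];
    int free_slot = -1;

    snprintf(name, sizeof(name), "%d:%d", major, minor);
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (fs->used[i] && strcmp(fs->names[i], name) == 0)
            return -1;               /* stale entry from an earlier load */
        if (!fs->used[i] && free_slot < 0)
            free_slot = i;
    }
    assert(free_slot >= 0);
    strcpy(fs->names[free_slot], name);
    fs->used[free_slot] = 1;
    bdi->slot = free_slot;           /* overwrites any earlier registration */
    return 0;
}

/* Removes only the entry the shared bdi currently remembers. */
void model_bdi_unregister(struct sysfs_model *fs, struct bdi_model *bdi)
{
    if (bdi->slot >= 0) {
        fs->used[bdi->slot] = 0;
        bdi->slot = -1;
    }
}

/* Counts the entries currently present in the model table. */
int model_count(const struct sysfs_model *fs)
{
    int n = 0;
    for (int i = 0; i < MAX_ENTRIES; i++)
        n += fs->used[i];
    return n;
}

/* Returns 1 if an entry with this name is present, 0 otherwise. */
int model_has(const struct sysfs_model *fs, const char *name)
{
    for (int i = 0; i < MAX_ENTRIES; i++)
        if (fs->used[i] && strcmp(fs->names[i], name) == 0)
            return 1;
    return 0;
}

/* One driver load: one registration per partition, all through the same
 * queue. Returns the number of registrations that failed. */
int model_modprobe(struct sysfs_model *fs, struct queue_model *q, int nparts)
{
    int failures = 0;
    for (int minor = 0; minor < nparts; minor++)
        if (model_bdi_register(fs, &q->bdi, MTD_MAJOR, minor) != 0)
            failures++;
    return failures;
}

/* One driver unload: one unregister call per partition, same shared bdi. */
void model_rmmod(struct sysfs_model *fs, struct queue_model *q, int nparts)
{
    for (int i = 0; i < nparts; i++)
        model_bdi_unregister(fs, &q->bdi);
}
```

Starting from an empty table and a queue whose bdi slot is -1, calling model_modprobe() with 2 partitions, then model_rmmod(), then model_modprobe() again leaves "31:0" behind after the unload and makes the second load report one failed registration. With 1 partition the same cycle is clean, which matches the single-partition observation in this report.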

This bug is not specific to my nand flash controller driver; it should
be fixed in the mtdblock layer.
With only one partition, the bug does not occur at all. I tried to fix
it, but the fix touches mtdblock, mtd_blktrans, block and bdi, and it
is difficult for me to find a way to satisfy all of these parts with
minimal changes.

IMHO, could we simply remove the bdi_register_dev call (in add_disk)
and the bdi_unregister_dev call (in unlink_disk)?

P.S. I also see this bug in the latest 2.6.27 mainline kernel.

Thanks
-Bryan
