Re: tbench regression in 2.6.25-rc1
- From: David Miller <davem@xxxxxxxxxxxxx>
- Date: Fri, 15 Feb 2008 15:22:00 -0800 (PST)
From: Eric Dumazet <dada1@xxxxxxxxxxxxx>
Date: Fri, 15 Feb 2008 15:21:48 +0100
> On linux-2.6.25-rc1 x86_64 :
> offsetof(struct dst_entry, lastuse)=0xb0
> offsetof(struct dst_entry, __refcnt)=0xb8
> offsetof(struct dst_entry, __use)=0xbc
> offsetof(struct dst_entry, next)=0xc0
> So it should be optimal... I don't know why tbench prefers __refcnt being
> on 0xc0, since in this case lastuse will be on a different cache line...
> Each incoming IP packet will need to change lastuse, __refcnt and __use,
> so keeping them in the same cache line is a win.
> I suspect then that even this patch could help tbench, since it avoids
> writing lastuse...
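
As a quick check of the offsets quoted above, the sketch below maps a field offset to a cache-line index. It assumes 64-byte cache lines and a structure that starts on a cache-line boundary; neither is stated in the thread, and the helper names cache_line_index and same_cache_line are made up for illustration.

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines (not stated in the thread). */
#define CACHE_LINE_SIZE 64u

/* Index of the cache line that a byte offset within a struct falls in,
 * assuming the struct itself starts on a cache-line boundary. */
static inline size_t cache_line_index(size_t offset)
{
    return offset / CACHE_LINE_SIZE;
}

/* Nonzero if two field offsets land in the same cache line. */
static inline int same_cache_line(size_t a, size_t b)
{
    return cache_line_index(a) == cache_line_index(b);
}
```

Under these assumptions 0xb0, 0xb8 and 0xbc all fall in line 2 (bytes 0x80 to 0xbf), while 0xc0 is the first byte of line 3. That matches the remark that moving __refcnt to 0xc0 would put it on a different cache line from lastuse.
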
I think your suspicions are right, and even more so,
it helps to keep __refcnt out of the same cache line
as input/output/ops, which are read-almost-entirely :-)
I haven't done an exhaustive analysis, but it seems that
the write traffic to lastuse and to __refcnt is about the
same. However, if we find that __refcnt gets hit more
than lastuse in this workload, that would explain the regression.
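
One way to picture the separation described above is a structure whose per-packet counters start on their own cache line, away from the read-mostly pointers. This is only a sketch: struct route_entry_sketch is hypothetical and is not the real struct dst_entry, the 64-byte line size is an assumption, and it uses C11 _Alignas rather than any kernel annotation.

```c
#include <stddef.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64

/* Hypothetical layout for illustration only; NOT the kernel's dst_entry. */
struct route_entry_sketch {
    /* Read-mostly fields: consulted per packet, rarely written. */
    void *input;
    void *output;
    void *ops;

    /* Write-hot fields: updated per packet. Aligning the first one to a
     * cache-line boundary keeps these writes off the line holding the
     * read-mostly pointers above. */
    _Alignas(CACHE_LINE_SIZE) unsigned long lastuse;
    int refcnt;
    int use;
};

/* Compile-time check that the write-hot group starts a new cache line. */
_Static_assert(offsetof(struct route_entry_sketch, lastuse) % CACHE_LINE_SIZE == 0,
               "write-hot fields must start on a cache-line boundary");
```

The trade-off is padding: the structure grows so that the hot counters share one line with each other but not with input/output/ops. Whether that helps a given workload such as tbench would still have to be measured.
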