Re: [PATCH]Avoid the overflow when calculate the proportion of bdi quota




On Fri, 2007-12-14 at 16:26 +0800, zhejiang wrote:
On Thu, 2007-12-13 at 22:54 +0100, Peter Zijlstra wrote:
On Thu, 2007-12-13 at 11:30 +0800, zhejiang wrote:
__percpu_counter_add() cache the result in percpu variable until it
exceeds the batch.
The prop_norm_percpu() use the percpu_counter_read(&pl->events) to read
the counter ,and use percpu_counter_add(&pl->events, -half) to half the
counter.

There are potential problems:
1.The counter may be negative
2.After some calculation, it may be converted to a big positive number.

1.
For example, the batch is 32, when the bdi add 32 to the pl->events,
the pl->events->count will be 32.Suppose one of the percpu counter is 1.

In the prop_norm_percpu(),the half will be 16.Because it is under the
batch, the pl->events->count won't be modified and one of the percpu
counter may be -15. If call the prop_norm_percpu() again, the half will
still be 16,though it should be 8.The percpu counter may be -31.
Now, there pl->events->count is still 32.
If do the third calculation, the percpu counter will be -47, it will
be merged into the pl->evnets->count.Then pl->events->count will be
negative.

2.When the pl->events->count is negative,
unsigned long val = percpu_counter_read(&pl->events);
This statement may return a negative number, so the val would be a big
number.Because of the overflow, the pl->events->count will be converted
into a big positive number after some calculation.

Because of the overflow, I catch some very big numerators when call the
prop_fraction_percpu().

I think that it should use percpu_counter_sum() instead of the
percpu_counter_read() to be more robust.

diff -Nur a/proportions.c b/proportions.c
--- a/proportions.c 2007-12-12 11:05:59.000000000 +0800
+++ b/proportions.c 2007-12-13 11:05:40.000000000 +0800
@@ -241,7 +241,7 @@
* can never result in a negative number.
*/
while (pl->period != global_period) {
- unsigned long val = percpu_counter_read(&pl->events);
+ unsigned long val = percpu_counter_sum(&pl->events);
unsigned long half = (val + 1) >> 1;

/*


_sum is unacceptably slow here. I'm thinking something like
bdi_stat_error() as used in balance_dirty_pages() would also solve this
issue, no?

_sum is slow, but it can avoid over-subtraction when the counter is less
than 2 * BDI_STAT_BATCH.

How about do the calculation like this patch?

Change the val and half into long type, if we found the val is negative,
it means that we did subtraction too much.
So we actually add some value to it.

diff -Nur a/proportions.c b/proportions.c
--- a/proportions.c 2007-12-12 11:05:59.000000000 +0800
+++ b/proportions.c 2007-12-14 16:21:08.000000000 +0800
@@ -241,8 +241,18 @@
* can never result in a negative number.
*/
while (pl->period != global_period) {
- unsigned long val = percpu_counter_read(&pl->events);
- unsigned long half = (val + 1) >> 1;
+ long val = percpu_counter_read(&pl->events);
+ long half;
+
+ if (val >= 0) {
+ half = (val + 1) >> 1;
+ } else {
+ /*
+ *If val is negative, the counter has been
subtracted too much.
+ *Do some compensation here.
+ */
+ half = (val - 1) / 2;
+ }

/*
* Half of zero won't be much less, break out.

How about something like this:

diff --git a/lib/proportions.c b/lib/proportions.c
index 332d8c5..4afb330 100644
--- a/lib/proportions.c
+++ b/lib/proportions.c
@@ -190,6 +190,8 @@ prop_adjust_shift(int *pl_shift, unsigned long *pl_period, int new_shift)
* PERCPU
*/

+#define PROP_BATCH (8*(1+ilog2(nr_cpu_ids)))
+
int prop_local_init_percpu(struct prop_local_percpu *pl)
{
spin_lock_init(&pl->lock);
@@ -203,6 +205,16 @@ void prop_local_destroy_percpu(struct prop_local_percpu *pl)
percpu_counter_destroy(&pl->events);
}

+static unsigned long prop_percpu_read(struct prop_local_percpu *pl)
+{
+ unsigned long val = percpu_counter_read(&pl->events);
+
+ if (val < (nr_cpu_ids * PROP_BATCH))
+ val = percpu_counter_sum(&pl->events);
+
+ return val;
+}
+
/*
* Catch up with missed period expirations.
*
@@ -241,7 +253,7 @@ void prop_norm_percpu(struct prop_global *pg, struct prop_local_percpu *pl)
* can never result in a negative number.
*/
while (pl->period != global_period) {
- unsigned long val = percpu_counter_read(&pl->events);
+ unsigned long val = prop_percpu_read(pl);
unsigned long half = (val + 1) >> 1;

/*
@@ -252,7 +264,7 @@ void prop_norm_percpu(struct prop_global *pg, struct prop_local_percpu *pl)
if (!val)
break;

- percpu_counter_add(&pl->events, -half);
+ __percpu_counter_add(&pl->events, -half, PROP_BATCH);
pl->period += period;
}
pl->period = global_period;
@@ -267,7 +279,7 @@ void __prop_inc_percpu(struct prop_descriptor *pd, struct prop_local_percpu *pl)
struct prop_global *pg = prop_get_global(pd);

prop_norm_percpu(pg, pl);
- percpu_counter_add(&pl->events, 1);
+ __percpu_counter_add(&pl->events, 1, PROP_BATCH);
percpu_counter_add(&pg->events, 1);
prop_put_global(pd, pg);
}

Attachment: signature.asc
Description: This is a digitally signed message part



Relevant Pages

  • [PATCH]Avoid the overflow when calculate the proportion of bdi quota
    ... 2.After some calculation, it may be converted to a big positive number. ... For example, the batch is 32, when the bdi add 32 to the pl->events, ... If do the third calculation, the percpu counter will be -47, it will ... use a regular subtraction loop. ...
    (Linux-Kernel)
  • Re: [PATCH]Avoid the overflow when calculate the proportion of bdi quota
    ... 2.After some calculation, it may be converted to a big positive number. ... For example, the batch is 32, when the bdi add 32 to the pl->events, ... _sum is unacceptably slow here. ... it means that we did subtraction too much. ...
    (Linux-Kernel)
  • Re: [PATCH]Avoid the overflow when calculate the proportion of bdi quota
    ... There are potential problems: ... 2.After some calculation, it may be converted to a big positive number. ... For example, the batch is 32, when the bdi add 32 to the pl->events, ... If do the third calculation, the percpu counter will be -47, it will ...
    (Linux-Kernel)
  • Re: Odd VLOOKUP result.
    ... In the same way for this calculation, A3 and A4 are correctly not equal to ... Since Excel ... unwanted residue of finite precision approximations. ... that the subtraction is not the final operation, ...
    (microsoft.public.excel.worksheet.functions)
  • RE: Adjust Date if Start time -> Stop time is overnight
    ... subtraction. ... "Renraf" wrote: ... After I perform the calculation, the field displays as the full date. ... Dim NewStop As Date ...
    (microsoft.public.access.formscoding)

Loading