Re: [updated] [Patch 6/8] delay accounting usage of taskstats interface



On Tue, May 09, 2006 at 11:46:48AM +0000, Thomas Gleixner wrote:
On Fri, 2006-05-05 at 00:14 +0530, Balbir Singh wrote:

#else
static inline void delayacct_set_flag(int flag)
{}

Please make this

static inline void delayacct_set_flag(int flag) { }

Same in all other files

Will do.


@@ -89,6 +99,9 @@ static inline void delayacct_blkio_start
{}
static inline void delayacct_blkio_end(void)
{}
+static inline int delayacct_add_tsk(struct taskstats *d,
+ struct task_struct *tsk)
+{ return 0; }

{
return 0;
}

+ *
+ * xxx_count is the number of delay values recorded
+ * xxx_delay_total is the corresponding cumulative delay in nanoseconds
+ *
+ * xxx_delay_total wraps around to zero on overflow
+ * xxx_count incremented regardless of overflow
+ */

Please use a structure for that.

struct delay_account {
ktime_t delay;
u64 count;
};

Also instead of having tons of fields added, please use something like

enum {
CPU_ACCT,
BLKIO_ACCT,
....,
MAX_ACCT_TYPES
};
struct delay_account stats[MAX_ACCT_TYPES];


Yes, this would make sense if all the fields were of type delay_account.
currently, only CPU, BLKIO and SWAPIN are of that type. The new
fields that will be added to this structure (since this is a part of
a statistics interface now) could potentially belong to other subsystems.


+ /* Delay waiting for cpu, while runnable
+ * count, delay_total NOT updated atomically
+ */
+ __u64 cpu_count;
+ __u64 cpu_delay_total;
+ /* Following four fields atomically updated using task->delays->lock */
+
+ /* Delay waiting for synchronous block I/O to complete
+ * does not account for delays in I/O submission
+ */
+ __u64 blkio_count;
+ __u64 blkio_delay_total;
+
+ /* Delay waiting for page fault I/O (swap in only) */
+ __u64 swapin_count;
+ __u64 swapin_delay_total;
+
+ /* cpu "wall-clock" running time
+ * On some architectures, value will adjust for cpu time stolen
+ * from the kernel in involuntary waits due to virtualization.
+ * Value is cumulative, in nanoseconds, without a corresponding count
+ * and wraps around to zero silently on overflow

When will this overflow happen ? 2^64 nsec ~= 584 years


Detecting a potential overflow is not a bad thing IMHO. Currently
the existing data type makes this overflow very rare.


+int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
+{

Please add some comment what this code does.

Will do


+ s64 tmp;
+ struct timespec ts;
+ unsigned long t1,t2,t3;
+
+ tmp = (s64)d->cpu_run_real_total;
+ tmp += (u64)(tsk->utime + tsk->stime) * TICK_NSEC;
+ d->cpu_run_real_total = (tmp < (s64)d->cpu_run_real_total) ? 0 : tmp;
+
+ /*
+ * No locking available for sched_info (and too expensive to add one)
+ * Mitigate by taking snapshot of values
+ */
+ t1 = tsk->sched_info.pcnt;
+ t2 = tsk->sched_info.run_delay;
+ t3 = tsk->sched_info.cpu_time;
+
+ d->cpu_count += t1;
+
+ jiffies_to_timespec(t2, &ts);
+ tmp = (s64)d->cpu_delay_total + timespec_to_ns(&ts);
+ d->cpu_delay_total = (tmp < (s64)d->cpu_delay_total) ? 0 : tmp;
+
+ tmp = (s64)d->cpu_run_virtual_total + (s64)jiffies_to_usecs(t3) * 1000;
+ d->cpu_run_virtual_total =
+ (tmp < (s64)d->cpu_run_virtual_total) ? 0 : tmp;
+
+ /* zero XXX_total, non-zero XXX_count implies XXX stat overflowed */
+
+ spin_lock(&tsk->delays->lock);
+ tmp = d->blkio_delay_total + tsk->delays->blkio_delay;
+ d->blkio_delay_total = (tmp < d->blkio_delay_total) ? 0 : tmp;

I really do not get whatfor all these comparisons are. How can those
values get > 2^63 within a reasonable timeframe ?

Its defensive programming, granted that it is probably not required.


+ tmp = d->swapin_delay_total + tsk->delays->swapin_delay;
+ d->swapin_delay_total = (tmp < d->swapin_delay_total) ? 0 : tmp;
+ d->blkio_count += tsk->delays->blkio_count;
+ d->swapin_count += tsk->delays->swapin_count;
+ spin_unlock(&tsk->delays->lock);
+ return 0;
+}
diff -puN kernel/taskstats.c~delayacct-taskstats kernel/taskstats.c
--- linux-2.6.17-rc3/kernel/taskstats.c~delayacct-taskstats 2006-05-04 09:31:59.000000000 +0530
+++ linux-2.6.17-rc3-balbir/kernel/taskstats.c 2006-05-04 11:27:53.000000000 +0530
@@ -18,6 +18,7 @@

#include <linux/kernel.h>
#include <linux/taskstats_kern.h>
+#include <linux/delayacct.h>
#include <net/genetlink.h>
#include <asm/atomic.h>

@@ -120,7 +121,10 @@ static int fill_pid(pid_t pid, struct ta
* goto err;
*/

-err:
+ rc = delayacct_add_tsk(stats, tsk);
+ stats->version = TASKSTATS_VERSION;
+
+ /* Define err: label here if needed */

Please remove that comment. The err label goes to a place where it makes
sense.

Will fix this.


put_task_struct(tsk);
return rc;

@@ -152,8 +156,14 @@ static int fill_tgid(pid_t tgid, struct
* break;
*/

+ rc = delayacct_add_tsk(stats, tsk);
+ if (rc)
+ break;
+
} while_each_thread(first, tsk);
read_unlock(&tasklist_lock);
+ stats->version = TASKSTATS_VERSION;
+

/*
* Accounting subsytems can also add calls here if they don't
_
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/


Thanks for your review,
Balbir Singh,
Linux Technology Center,
IBM Software Labs
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



Relevant Pages