Re: sched_domains + NUMA issue

From: William Lee Irwin III (wli_at_holomorphy.com)
Date: 08/29/04

    Date:	Sun, 29 Aug 2004 09:48:25 -0700
    To: Anton Blanchard <anton@samba.org>
    
    

    On Sun, Aug 29, 2004 at 09:18:55PM +1000, Anton Blanchard wrote:
    > We are seeing errors in the sched domains debug code when SMT + NUMA is
    > enabled. Nathan pointed out that the recent change to limit the number
    > of nodes in a scheduling group may be causing this - in particular
    > sched_domain_node_span.
    > It looks like ia64 are the only ones implementing a reasonable
    > node_distance, the others just do:
    > #define node_distance(from,to) (from != to)
    > On these architectures I wonder if we should disable the
    > sched_domain_node_span code since we will just get a random grouping of
    > cpus.

    For fsck's sake... macro writers need to exercise more discipline.

    Index: wait-2.6.9-rc1-mm1/include/linux/topology.h
    ===================================================================
    --- wait-2.6.9-rc1-mm1.orig/include/linux/topology.h 2004-08-24 00:03:18.000000000 -0700
    +++ wait-2.6.9-rc1-mm1/include/linux/topology.h 2004-08-29 09:44:35.932705488 -0700
    @@ -55,7 +55,7 @@
             for (node = 0; node < numnodes; node = __next_node_with_cpus(node))
     
     #ifndef node_distance
    -#define node_distance(from,to) (from != to)
    +#define node_distance(from,to) ((from) != (to))
     #endif
     #ifndef PENALTY_FOR_NODE_WITH_CPUS
     #define PENALTY_FOR_NODE_WITH_CPUS (1)
    Index: wait-2.6.9-rc1-mm1/include/asm-ia64/numa.h
    ===================================================================
    --- wait-2.6.9-rc1-mm1.orig/include/asm-ia64/numa.h 2004-08-24 00:02:26.000000000 -0700
    +++ wait-2.6.9-rc1-mm1/include/asm-ia64/numa.h 2004-08-29 09:45:07.223948496 -0700
    @@ -59,7 +59,7 @@
      */
     
     extern u8 numa_slit[MAX_NUMNODES * MAX_NUMNODES];
    -#define node_distance(from,to) (numa_slit[from * numnodes + to])
    +#define node_distance(from,to) (numa_slit[(from) * numnodes + (to)])
     
     extern int paddr_to_nid(unsigned long paddr);
     
    Index: wait-2.6.9-rc1-mm1/include/asm-i386/topology.h
    ===================================================================
    --- wait-2.6.9-rc1-mm1.orig/include/asm-i386/topology.h 2004-08-24 00:02:20.000000000 -0700
    +++ wait-2.6.9-rc1-mm1/include/asm-i386/topology.h 2004-08-29 09:45:24.973250192 -0700
    @@ -67,7 +67,7 @@
     }
     
     /* Node-to-Node distance */
    -#define node_distance(from, to) (from != to)
    +#define node_distance(from, to) ((from) != (to))
     
     /* Cross-node load balancing interval. */
     #define NODE_BALANCE_RATE 100

