RE: [patch 1/1] Hot plug CPU to support physical add of new processors (i386)

From: Protasevich, Natalie (Natalie.Protasevich_at_UNISYS.com)
Date: 09/01/05

  • Next message: Nigel Cunningham: "Re: reboot vs poweroff"
    Date:	Thu, 1 Sep 2005 16:09:09 -0500
    To: "Ashok Raj" <ashok.raj@intel.com>
    
    

    > Hi Natalie,
    >
    > sorry for the late catchup...
    >
    > Just to make sure we use the cpu maps properly here is the
    > current definition iam not sure if you are breaking any of
    > these assumptions, cause if you do you will break existing
    > implementations... so pardon for the verbosity.
    >
    > cpu_possible_map - must include all possible cpus that can
    > ever exist in this
    > instance of boot. This map is static, i.e
    > you can grow
    > or remove bits without causing confution
    > to the rest of the
    > subsystem. We attempted to change, but
    > there was not much
    > in return for the savings, for the
    > complexity we would
    > add to make this map dynamic.
    >
    > cpu_present_map - indicates that these are present and operational. OS
    > can decide if they should be brought up or not.
    >
    > when CONFIG_HOTPLUG_CPU is *not* defined, then
    > cpu_possible_map and cpu_present_map are the same. When
    > CONFIG_HOTPLUG_CPU=y, then we extend cpu_possible_map to
    > cover the range of NR_CPUS, but cpu_present_map still shows
    > what is real.
    >
    > When a physical hotplug happens, cpu_present_map is updated
    > by who ever manages the physical hotplug.
    > >
    > > Current IA32 CPU hotplug code doesn't allow bringing up
    > processors that were not present in the boot configuration.
    > > To make existing hot plug facility more practical for physical hot
    > > plug, possible processors should be encountered during boot for
    > > potentual hot add/replace/remove. On ES7000, ACPI marks all the
    > > sockets that are empty or not assigned to the partitionas as
    > > "disabled". The patch allows arrays/masks with APIC info
    > for disabled
    > > processors to be
    >
    > This sounds like a cluge to me. The correct implementation
    > would be you would need some sysmgmt deamon or something that
    > works with the kernel to notify of new cpus and populate
    > apicid and grow cpu_present_map. Making an assumption that
    > disabled APICID are valid for ES7000 sake is not a safe assumption.

    Yes, this is a kludge, I realize that. The AML code was not there so far
    (it will be in the next one). I have a point here though that if the
    processor is there, but is unusable (what "disabled" means as the ACPI
    spec says), meaning bad maybe, then with physical hot plug it can
    certainly be made usable and I think it should be taken into
    consideration (and into configuration). It should be counted as possible
    at least, with hot plug, because it represent existing socket.

    > > initialized. Then the OS can bring up a processor that was
    > inserted in
    > > the socked and brought into configuration during runtime.
    > > To test the code, one can boot the system with maxcpu=1 and
    > then bring
    > > the rest of the processors up, which was not possible so
    > far (only maxcpus number of nodes were created).
    > > The patch also makes proc entry for interrupts dynamically
    > change to only show current onlined processors.
    > >
    > > Signed-off-by: Natalie Protasevich <Natalie.Protasevich@unisys.com>
    > > ---
    > >
    > > arch/i386/kernel/acpi/boot.c | 2 ++
    > > arch/i386/kernel/irq.c | 8 ++++----
    > > arch/i386/kernel/mpparse.c | 4 ++++
    > > arch/i386/kernel/smpboot.c | 3 ++-
    > > arch/i386/mach-default/topology.c | 4 ++--
    > > kernel/cpu.c | 4 ++++
    > > 6 files changed, 18 insertions(+), 7 deletions(-)
    > >
    > > diff -puN arch/i386/kernel/acpi/boot.c~hotcpu-i386
    > arch/i386/kernel/acpi/boot.c
    > > ---
    > linux-2.6.13-rc6-mm2/arch/i386/kernel/acpi/boot.c~hotcpu-i386
    > 2005-08-31 04:17:20.832038600 -0700
    > > +++ linux-2.6.13-rc6-mm2-root/arch/i386/kernel/acpi/boot.c
    > 2005-08-31 04:19:35.259602504 -0700
    > > @@ -255,9 +255,11 @@ acpi_parse_lapic(acpi_table_entry_header
    > >
    > > acpi_table_print_madt_entry(header);
    > >
    > > +#ifndef CONFIG_HOTPLUG_CPU
    > > /* no utility in registering a disabled processor */
    > > if (processor->flags.enabled == 0)
    > > return 0;
    > > +#endif
    >
    > On same lines, the above config code seems not appropriate.
    > When your platform needs to hot-add a cpu, find a way to set
    > these flags, you cant break current behaviour as for a platform quirk!
    > >
    > > x86_acpiid_to_apicid[processor->acpi_id] = processor->id;
    > >
    > > diff -puN arch/i386/kernel/mpparse.c~hotcpu-i386
    > arch/i386/kernel/mpparse.c
    > > ---
    > linux-2.6.13-rc6-mm2/arch/i386/kernel/mpparse.c~hotcpu-i386
    > 2005-08-31 04:17:20.877031760 -0700
    > > +++ linux-2.6.13-rc6-mm2-root/arch/i386/kernel/mpparse.c
    > 2005-08-31 04:20:49.297347056 -0700
    > > @@ -125,8 +125,10 @@ static void __init MP_processor_info (st
    > > int ver, apicid, cpu, found_bsp = 0;
    > > physid_mask_t tmp;
    > >
    > > +#ifndef CONFIG_HOTPLUG_CPU
    > > if (!(m->mpc_cpuflag & CPU_ENABLED))
    > > return;
    > > +#endif
    >
    > Same here... platform quirk!

    Yes it is indeed :) because it is masking absent processors as disabled.
    And again I believe (correct me if I'm wrong) the disabled processors
    should be counted as possible with hot plug (physical) in any case. This
    patch is not actually covering physical hot plug, all it does is making
    room for say hot swap of failed processors. How about making platform
    hooks here for non-conforming (but not prohibited!) hot plug methods. In
    fact the one used now on ES7000 is very intuitive and based on
    assumption that "disabled" case is handled properly.
     
    > >
    > > apicid = mpc_apic_id(m, translation_table[mpc_record]);
    > >
    > > @@ -190,11 +192,13 @@ static void __init MP_processor_info (st
    > > return;
    > > }
    > >
    > > +#ifndef CONFIG_HOTPLUG_CPU
    > > if (num_processors >= maxcpus) {
    > > printk(KERN_WARNING "WARNING: maxcpus limit of
    > %i reached."
    > > " Processor ignored.\n", maxcpus);
    > > return;
    > > }
    > > +#endif
    > > num_processors++;
    > > ver = m->mpc_apicver;
    > >
    > > diff -puN arch/i386/kernel/smpboot.c~hotcpu-i386
    > arch/i386/kernel/smpboot.c
    > > ---
    > linux-2.6.13-rc6-mm2/arch/i386/kernel/smpboot.c~hotcpu-i386
    > 2005-08-31 04:17:20.924024616 -0700
    > > +++ linux-2.6.13-rc6-mm2-root/arch/i386/kernel/smpboot.c
    > 2005-08-31 04:21:49.474198784 -0700
    > > @@ -1003,9 +1003,10 @@ int __devinit smp_prepare_cpu(int cpu)
    > > struct warm_boot_cpu_info info;
    > > struct work_struct task;
    > > int apicid, ret;
    > > + extern u8 bios_cpu_apicid[NR_CPUS];
    > >
    > > lock_cpu_hotplug();
    > > - apicid = x86_cpu_to_apicid[cpu];
    > > + apicid = bios_cpu_apicid[cpu];
    > > if (apicid == BAD_APICID) {
    > > ret = -ENODEV;
    > > goto exit;
    > > diff -puN arch/i386/mach-default/topology.c~hotcpu-i386
    > arch/i386/mach-default/topology.c
    > > ---
    > linux-2.6.13-rc6-mm2/arch/i386/mach-default/topology.c~hotc
    > pu-i386 2005-08-31 04:17:20.957019600 -0700
    > > +++
    > linux-2.6.13-rc6-mm2-root/arch/i386/mach-default/topology.c
    > 2005-08-31 04:22:13.020619184 -0700
    > > @@ -76,7 +76,7 @@ static int __init topology_init(void)
    > > for_each_online_node(i)
    > > arch_register_node(i);
    > >
    > > - for_each_present_cpu(i)
    > > + for_each_cpu(i)
    >
    > Nope. Should still be for present_cpus. with NR_CPUS large we
    > would see way too many files in sysfs than what is really available.

    Actually, it will only create as many nodes as listed in ACPI (enabled
    or disabled). If you run it on regular system with maxcpus=1, you'll
    only see the actual processor nodes (offlined beyond maxcpus).

    > > arch_register_cpu(i);
    > > return 0;
    > > }
    > > @@ -87,7 +87,7 @@ static int __init topology_init(void) {
    > > int i;
    > >
    > > - for_each_present_cpu(i)
    > > + for_each_cpu(i)
    > > arch_register_cpu(i);
    > > return 0;
    > > }
    > > diff -puN kernel/cpu.c~hotcpu-i386 kernel/cpu.c
    > > --- linux-2.6.13-rc6-mm2/kernel/cpu.c~hotcpu-i386
    > 2005-08-31 04:17:21.002012760 -0700
    > > +++ linux-2.6.13-rc6-mm2-root/kernel/cpu.c 2005-08-31
    > 04:23:34.378250944 -0700
    > > @@ -158,7 +158,11 @@ int __devinit cpu_up(unsigned int cpu)
    > > if ((ret = down_interruptible(&cpucontrol)) != 0)
    > > return ret;
    > >
    > > +#ifdef CONFIG_HOTPLUG_CPU
    > > + if (cpu_online(cpu)) {
    > > +#else
    > > if (cpu_online(cpu) || !cpu_present(cpu)) {
    > > +#endif
    >
    > ditto.....

    I'm quite agree with you on proper implementation of physical hot plug,
    but "disabled" case has to be handled correctly, and it should be some
    flexibility built in. I don't think there is enough right now even with
    proper ACPI based implementation.

    Thanks,
    --Natalie

    > > ret = -EINVAL;
    > > goto out;
    > > }
    > > diff -puN arch/i386/kernel/irq.c~hotcpu-i386 arch/i386/kernel/irq.c
    > > --- linux-2.6.13-rc6-mm2/arch/i386/kernel/irq.c~hotcpu-i386
    > 2005-08-31 04:17:21.047005920 -0700
    > > +++ linux-2.6.13-rc6-mm2-root/arch/i386/kernel/irq.c
    > 2005-08-31 04:25:21.761926144 -0700
    > > @@ -248,7 +248,7 @@ int show_interrupts(struct seq_file *p,
    > >
    > > if (i == 0) {
    > > seq_printf(p, " ");
    > > - for_each_cpu(j)
    > > + for_each_online_cpu(j)
    > > seq_printf(p, "CPU%d ",j);
    > > seq_putc(p, '\n');
    > > }
    > > @@ -262,7 +262,7 @@ int show_interrupts(struct seq_file *p,
    > #ifndef
    > > CONFIG_SMP
    > > seq_printf(p, "%10u ", kstat_irqs(i)); #else
    > > - for_each_cpu(j)
    > > + for_each_online_cpu(j)
    > > seq_printf(p, "%10u ",
    > kstat_cpu(j).irqs[i]); #endif
    > > seq_printf(p, " %14s",
    > irq_desc[i].handler->typename); @@ -276,12
    > > +276,12 @@ skip:
    > > spin_unlock_irqrestore(&irq_desc[i].lock, flags);
    > > } else if (i == NR_IRQS) {
    > > seq_printf(p, "NMI: ");
    > > - for_each_cpu(j)
    > > + for_each_online_cpu(j)
    > > seq_printf(p, "%10u ", nmi_count(j));
    > > seq_putc(p, '\n');
    > > #ifdef CONFIG_X86_LOCAL_APIC
    > > seq_printf(p, "LOC: ");
    > > - for_each_cpu(j)
    > > + for_each_online_cpu(j)
    > > seq_printf(p, "%10u ",
    > > per_cpu(irq_stat,j).apic_timer_irqs);
    > > seq_putc(p, '\n');
    > > _
    >
    > --
    > Cheers,
    > Ashok Raj
    > - Open Source Technology Center
    >
    -
    To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
    the body of a message to majordomo@vger.kernel.org
    More majordomo info at http://vger.kernel.org/majordomo-info.html
    Please read the FAQ at http://www.tux.org/lkml/


  • Next message: Nigel Cunningham: "Re: reboot vs poweroff"

    Relevant Pages

    • Re: PC is Freezing after a few hours
      ... There is no plug in necessary to make Azureus work, ... Instead, just make Azureus ... > new CPU the bios should jsut automatically set the ... >>I would recommend sticking that Corsair memory back in as ...
      (microsoft.public.windowsxp.hardware)
    • Re: Slow death of a PC
      ... After I pull the plug out and put in again, ... Motherboard(with on-board graphic card) with a Pentium 4 CPU, ... this is what happens when I press the ON switch: ...
      (alt.comp.hardware.pc-homebuilt)
    • Slow death of a PC
      ... After I pull the plug out and put in again, ... Motherboard(with on-board graphic card) with a Pentium 4 CPU, ... this is what happens when I press the ON switch: ...
      (alt.comp.hardware.pc-homebuilt)
    • Re: need help....XP just shut down on me
      ... the cpu temp is around 74C. ... > You could download motherboard monitor and set it to monitor your temps. ... > graphics work to check the CPU usage. ... >> way I can get the computer back on is to physically pull the plug and put ...
      (microsoft.public.windowsxp.help_and_support)
    • Re: Strange Street Fighter 2 jamma problems
      ... i'm going to make all the proper ... If the outlet on the bottom of the case is always on, and I plug ... in the power supply, and connect it to the jamma wires, then.... ...
      (rec.games.video.arcade.collecting)