CPU spikes when porting proprietary code from AS 2.1 to AS 3.0

From: Sean Kirkpatrick (Sean.Kirkpatrick_at_PipelineTrading.com)
Date: 12/23/04

    Date: Thu, 23 Dec 2004 11:27:28 -0500
    To: <redhat-list@redhat.com>
    
    

    Hello All,

    We have observed CPU utilization spikes when porting our software from our current platform (Red Hat AS 2.1) to our new development platform (Red Hat AS 3.0). On the AS 2.1 multi-processor boxes, running kernel 2.4.9-e.3smp #1 SMP, everything behaves normally. On the AS 3.0 multi-processor boxes, running kernel 2.4.21-27.ELsmp #1 SMP or 2.4.21-20.ELsmp #1 SMP, CPU utilization periodically spikes to the point where one of the processors is 100% consumed. This occurs whether hyper-threading is turned on or off. If we switch to the non-SMP kernel (2.4.21-27.EL #1), there are no CPU spikes.

    The attached program demonstrates the problem. It can be run as a daemon (the default) or in the foreground (command-line option --no-daemon). In each cycle it stats a nonexistent file 990 times (99 loop iterations of 10 calls each) and then sleeps for 10 milliseconds using select. (I have also tried nanosleep, with the same results.) On the 2.4.21 SMP kernels it quickly begins to accumulate CPU time, and the accumulation comes from these intermittent spikes rather than from a steady build-up. On the 2.4.9 SMP kernel or the 2.4.21 non-SMP kernel, there are no CPU spikes and no accumulation of CPU time.

    My guess is that the problem ultimately has to do with the frequent 10 ms sleeps. However, I would like to know why it occurs only with the 2.4.21 SMP kernels and not with the 2.4.9 kernel or the 2.4.21 non-SMP kernel. I am not sure whether it is related, but I have noticed that on the 2.4.9 kernel or the 2.4.21 non-SMP kernel the test app retains a priority of 15, whereas on the 2.4.21 SMP kernels it ends up at 25.
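    One way to probe the sleep hypothesis, offered only as a sketch and not as part of the original test program, is to time each select-based sleep against the wall clock and see whether the 10 ms request returns much earlier or later than asked. The helper name timed_select_sleep is hypothetical; it assumes gettimeofday has enough resolution for this purpose.

```c
#include <stdio.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper (not in the original test program).
 * Sleeps via select() for `usec` microseconds and returns the elapsed
 * wall-clock time in microseconds, or -1 if gettimeofday() fails. */
long timed_select_sleep(long usec)
{
    struct timeval before, after, timeout;

    timeout.tv_sec = usec / 1000000;
    timeout.tv_usec = usec % 1000000;

    if (gettimeofday(&before, NULL) != 0)
        return -1;

    /* Same sleep idiom as the test program: no descriptors, timeout only. */
    select(0, NULL, NULL, NULL, &timeout);

    if (gettimeofday(&after, NULL) != 0)
        return -1;

    return (after.tv_sec - before.tv_sec) * 1000000L
         + (after.tv_usec - before.tv_usec);
}
```

    Calling timed_select_sleep(10000) in place of the bare select call and logging the shortest and longest values seen on each kernel would show whether the sleeps themselves behave differently, or whether the extra CPU time is being spent elsewhere.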

    The test program was compiled with gcc version 3.2.3, using the following command lines:
    gcc -o testd_c -x c testd.cpp
    g++ -o testd_cpp testd.cpp

    The test code is as follows:

    #include <stdlib.h>
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <sys/time.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <errno.h>
    #include <time.h>
    #include <string.h>

    void makeDaemon()
    {
            pid_t pid = fork();
            if (pid < 0)
            {
                    printf("failed to fork1 for Daemon: %d\n", errno);
                    exit(-1);
            }
            else
            if ( pid > 0)
            {
                    exit(0);
            }

            if (setsid() < 0)
            {
                    printf("error: cannot disassociate from controlling TTY: %d\n", errno);
                    exit(-1);
            }
            
            pid = fork();
            if (pid < 0)
            {
                    printf("failed to fork2 for Daemon: %d\n", errno);
                    exit(-1);
            }
            else
            if ( pid > 0)
            {
                    exit(0);
            }
            
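            // pid is zero, so we are the final child: redirect stdio to /dev/null
            int i = open("/dev/null", O_RDWR );
            if (i >= 0)
            {
                    dup2(i, 0);
                    dup2(i, 1);
                    dup2(i, 2);
                    // only close the extra descriptor if it is not one of 0, 1, 2
                    if (i > 2)
                    {
                            close(i);
                    }
            }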
    }

    int checkFile(const char* fileName)
    {
            struct stat tmpStat;
            memset(&tmpStat,0,sizeof(tmpStat));

            if (stat(fileName,&tmpStat) == -1)
            {
                    return 0;
            }

            return tmpStat.st_size;
    }

    int main(int argc,char* argv[])
    {
            int noDaemon = 0;

            if ((argc > 1) && (0 == strcmp(argv[1],"--no-daemon")))
            {
                    printf("Forgo daemonization.\n");
                    noDaemon = 1;
            }
            else
            {
                    printf("Run as daemon.\n");
                    makeDaemon();
            }

            struct timeval tm;
            time_t now = 0;
            time_t lasttime = 0;
            time_t startTime = time(NULL);
            int i = 0;
            while(1)
            {
                    for (i = 0; i < 99; i++)
                    {
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                            checkFile("abcdefghikabcdefghik");
                    }
                    
                    tm.tv_usec = 10000;
                    tm.tv_sec = 0;
                    // alternative 60-second timeout, left disabled:
                    // tm.tv_usec = 0;
                    // tm.tv_sec = 60;
                    select(0,NULL,NULL,NULL,&tm);
                    
                    now = time(NULL);
                    if (noDaemon && ((now - lasttime) > 30))
                    {
                            // cast first so the division keeps fractional seconds
                            double accum = (double)clock() / CLOCKS_PER_SEC;
                            printf("Running time: %ld, ", ((now - startTime)));
                            printf("Accumulated time: %.02f\n",accum);
                            lasttime = now;
                    }
            }
            
            return 0;
    }
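
    The clock() figure above lumps user and system CPU time together. As a sketch only, and not part of the program as tested, getrusage can report the two separately, which might help show whether the spikes are spent in the kernel (for example in stat or in the sleep path) or in user space. The names tv_to_seconds and cpu_time_split are hypothetical.

```c
#include <stdio.h>
#include <sys/time.h>
#include <sys/resource.h>

/* Hypothetical helper: convert a struct timeval to seconds. */
static double tv_to_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Hypothetical helper (not in the original test program).
 * Stores the user and system CPU time consumed so far by this process,
 * in seconds. Returns 0 on success, -1 if getrusage() fails. */
int cpu_time_split(double *user_s, double *sys_s)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;

    *user_s = tv_to_seconds(ru.ru_utime);
    *sys_s = tv_to_seconds(ru.ru_stime);
    return 0;
}
```

    In the 30-second report, printing both values next to the running time would show which of the two grows during a spike.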

    Regards,

    Sean Kirkpatrick
    Pipeline Trading Systems LLC
    Software Engineer
    (212) 370 - 8328
    sean.kirkpatrick@pipelinetrading.com

    
