[RFT] Port 0x80 I/O speed

Good day.

Would some people on x86 (both 32-bit and 64-bit) be kind enough to compile and run the attached program? It tests how long an I/O access to port 0x80 takes. The result is in CPU cycles, so please include your CPU speed when reporting.

I posted a previous incarnation of this before, buried in the outb 0x80 thread, but that version had a serialising problem. As far as I can see, this one measures the right thing. Please yell if you disagree...

For me, on a Duron 1300 (AMD756 chipset) I have a constant:

rene@7ixe4:~/src/port80$ su -c ./port80
cycles: out 2400, in 2400

and on a PII 400 (Intel 440BX chipset) a constant:

rene@6bap:~/src/port80$ su -c ./port80
cycles: out 553, in 251

Results are (mostly) independent of compiler optimisation, but testing with an -O2 compile should be most useful. Thanks!

Rene.

/* gcc -W -Wall -O2 -o port80 port80.c */

#include <stdlib.h>
#include <stdio.h>

#include <sys/io.h>

#define LOOPS 10000

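static inline unsigned long long rdtsc(void)
{
	unsigned int lo, hi;

	/*
	 * "=A" only means the edx:eax pair on 32-bit; read the two
	 * halves separately so the result is also right on x86-64.
	 */
	asm volatile ("rdtsc" : "=a" (lo), "=d" (hi));

	return ((unsigned long long)hi << 32) | lo;
}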

/* cpuid is a serialising instruction: earlier instructions complete first */
static inline void serialize(void)
{
	asm volatile ("cpuid": : : "eax", "ebx", "ecx", "edx");
}

int main(void)
{
	unsigned long long start;
	unsigned long long overhead;
	unsigned long long output;
	unsigned long long input;
	int i;

	/* Port I/O from user space needs I/O privilege level 3 (run as root) */
	if (iopl(3) < 0) {
		perror("iopl");
		return EXIT_FAILURE;
	}

	/* Keep interrupts from disturbing the measurement */
	asm volatile ("cli");

	/* Calibrate: cost of the loop with the two serialize() calls alone */
	start = rdtsc();
	for (i = 0; i < LOOPS; i++) {
		serialize();
		serialize();
	}
	overhead = rdtsc() - start;

	/* Adding overhead to start subtracts it from the result below */
	start = rdtsc() + overhead;
	for (i = 0; i < LOOPS; i++) {
		serialize();
		asm volatile ("outb %al, $0x80");
		serialize();
	}
	output = rdtsc() - start;

	start = rdtsc() + overhead;
	for (i = 0; i < LOOPS; i++) {
		serialize();
		asm volatile ("inb $0x80, %%al": : : "al");
		serialize();
	}
	input = rdtsc() - start;
	asm volatile ("sti");

	/* Average cycles per access */
	output /= LOOPS;
	input /= LOOPS;
	printf("cycles: out %llu, in %llu\n", output, input);

	return EXIT_SUCCESS;
}

