[PATCH 4/4] optimize and simplify get_cycles_sync()
- From: "Joerg Roedel" <joerg.roedel@xxxxxxx>
- Date: Wed, 28 Feb 2007 15:25:54 +0100
From: Joerg Roedel <joerg.roedel@xxxxxxx>
This patch simplifies get_cycles_sync() by removing the #ifdefs from
it. It also adds an optimization for AMD processors: where available,
the RDTSCP instruction is used instead of CPUID;RDTSC. This is helpful
when the kernel runs as a KVM guest, because CPUID causes an intercept
of the guest and is therefore very expensive there.
Signed-off-by: Joerg Roedel <joerg.roedel@xxxxxxx>
--
Joerg Roedel
Operating System Research Center
AMD Saxony LLC & Co. KG
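
For readers who want to try the idea outside the kernel, here is a minimal
user-space sketch of the choice described above. It is not the code from
this patch: the patch selects the instruction sequence once through the
alternatives mechanism, while this sketch uses an ordinary runtime branch.
The names tsc_from_halves(), cpu_has_rdtscp() and read_cycles_sync() are
hypothetical, and the sketch assumes GCC or Clang (for <cpuid.h> and the
inline asm syntax). On non-x86 builds the reader simply returns 0.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>	/* GCC/Clang helper: __get_cpuid() */
#endif

/* Bit 27 of EDX from CPUID leaf 0x80000001 reports RDTSCP support. */
#define EXT_EDX_RDTSCP (1u << 27)

/* Combine the EAX (low) and EDX (high) halves into one 64-bit value. */
static inline uint64_t tsc_from_halves(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/* Returns 1 if the CPU reports RDTSCP, 0 otherwise (always 0 on non-x86). */
static inline int cpu_has_rdtscp(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int a, b, c, d;

	if (!__get_cpuid(0x80000001u, &a, &b, &c, &d))
		return 0;
	return (d & EXT_EDX_RDTSCP) != 0;
#else
	return 0;
#endif
}

/* Hypothetical user-space analogue of get_cycles_sync(). */
static inline uint64_t read_cycles_sync(void)
{
#if defined(__x86_64__) || defined(__i386__)
	static int has_rdtscp = -1;	/* cached so CPUID is not run per read */
	uint32_t lo, hi;

	if (has_rdtscp < 0)
		has_rdtscp = cpu_has_rdtscp();

	if (has_rdtscp) {
		/* RDTSCP waits for earlier instructions and also writes ECX. */
		__asm__ volatile("rdtscp"
				 : "=a" (lo), "=d" (hi)
				 :
				 : "ecx", "memory");
	} else {
		/* CPUID serializes, then RDTSC reads; CPUID writes EBX and ECX. */
		__asm__ volatile("cpuid\n\trdtsc"
				 : "=a" (lo), "=d" (hi)
				 : "0" (1u)
				 : "ebx", "ecx", "memory");
	}
	return tsc_from_halves(lo, hi);
#else
	return 0;	/* no TSC on this architecture */
#endif
}
```

Caching the feature check matters for the same reason the patch exists:
running CPUID on every read would bring back the intercept cost in a guest.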
diff --git a/include/asm-i386/cpufeature.h b/include/asm-i386/cpufeature.h
index 3f92b94..a9f1f01 100644
--- a/include/asm-i386/cpufeature.h
+++ b/include/asm-i386/cpufeature.h
@@ -49,6 +49,7 @@
#define X86_FEATURE_MP (1*32+19) /* MP Capable. */
#define X86_FEATURE_NX (1*32+20) /* Execute Disable */
#define X86_FEATURE_MMXEXT (1*32+22) /* AMD MMX extensions */
+#define X86_FEATURE_RDTSCP (1*32+27) /* RDTSCP */
#define X86_FEATURE_LM (1*32+29) /* Long Mode (x86-64) */
#define X86_FEATURE_3DNOWEXT (1*32+30) /* AMD 3DNow! extensions */
#define X86_FEATURE_3DNOW (1*32+31) /* 3DNow! */
diff --git a/include/asm-x86_64/tsc.h b/include/asm-x86_64/tsc.h
index 9a0a368..05df3f6 100644
--- a/include/asm-x86_64/tsc.h
+++ b/include/asm-x86_64/tsc.h
@@ -34,22 +34,15 @@ static inline cycles_t get_cycles(void)
/* Like get_cycles, but make sure the CPU is synchronized. */
static __always_inline cycles_t get_cycles_sync(void)
{
- unsigned long long ret;
-#ifdef X86_FEATURE_SYNC_RDTSC
- unsigned eax;
+ unsigned int a, d;
- /*
- * Don't do an additional sync on CPUs where we know
- * RDTSC is already synchronous:
- */
- alternative_io("cpuid", ASM_NOP2, X86_FEATURE_SYNC_RDTSC,
- "=a" (eax), "0" (1) : "ebx","ecx","edx","memory");
-#else
- sync_core();
-#endif
- rdtscll(ret);
+ alternative_io_two("cpuid\nrdtsc",
+ "rdtsc", X86_FEATURE_SYNC_RDTSC,
+ "rdtscp", X86_FEATURE_RDTSCP,
+ ASM_OUTPUT2("=a" (a), "=d" (d)),
+ "0" (1) : "ebx", "ecx", "memory");
- return ret;
+ return ((unsigned long long)a) | (((unsigned long long)d)<<32);
}
extern void tsc_init(void);
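
As a reading aid for the cpufeature.h hunk: the value (1*32+27) encodes a
capability word index and a bit position within that word. As I understand
the layout, word 1 holds the AMD-defined flags from CPUID leaf 0x80000001
EDX, where bit 27 is RDTSCP. The sketch below shows how such an encoded
value could be decoded and tested. The SKETCH_ and sketch_ names and the
array size are hypothetical and are not the kernel's own helpers.

```c
#include <stdint.h>

#define SKETCH_NCAPWORDS	8		/* assumed number of 32-bit words */
#define SKETCH_FEATURE_RDTSCP	(1*32 + 27)	/* same encoding as the patch */
#define SKETCH_FEATURE_LM	(1*32 + 29)

/* Index of the 32-bit capability word that holds the feature. */
static inline unsigned int sketch_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the feature inside its capability word. */
static inline unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Returns 1 if the feature bit is set in the capability array, else 0. */
static inline int sketch_has_feature(const uint32_t *caps, unsigned int feature)
{
	return (caps[sketch_feature_word(feature)] >>
		sketch_feature_bit(feature)) & 1u;
}
```

With word 1 of the array filled from CPUID 0x80000001 EDX, a check such as
sketch_has_feature(caps, SKETCH_FEATURE_RDTSCP) corresponds to the condition
under which the "rdtscp" alternative would be chosen.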