Re: [patch 00/21] mutex subsystem, -V14



> ISYNC_ON_SMP flushes all speculative reads currently in the queue - and is hence a smp_rmb_backwards() primitive [per my previous mail] - but does not affect writes - correct?
>
> if that's the case, what prevents a store from within the critical section going up to right after the EIEIO_ON_SMP, but before the atomic-dec instructions? Does any of those instructions imply some barrier perhaps? Are writes always ordered perhaps (like on x86 CPUs), and hence the store before the bne is an effective write-barrier?

It really makes more sense after reading PowerPC Book II, which was written by people who explain this for a living. You can find it here: http://www-128.ibm.com/developerworks/eserver/articles/archguide.html


While isync technically doesn't order stores, it does order instructions. Nothing after the isync can execute until the preceding bne- has completed, and that bne- depends on the preceding stwcx. having completed. So no store from the critical section can slip up ahead of the atomic update. To get a better explanation you will have to read the document yourself.
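
As a rough portable analogy for the ordering argument above (this is not the PowerPC code), the lwarx/stwcx./bne-/isync sequence behaves like an atomic update with acquire semantics, and a sync followed by the atomic update behaves like one with release semantics. The sketch below assumes C11 <stdatomic.h>; the names sketch_lock_fast and sketch_unlock_fast are hypothetical, and it assumes the count convention of 1 for unlocked, 0 for locked and negative for contended.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helpers. Assumed count convention:
 * 1 = unlocked, 0 = locked, negative = locked with waiters. */

/* Returns true if the fast path took the lock, false if a slow path is needed. */
static bool sketch_lock_fast(atomic_int *count)
{
    /* Acquire: later loads and stores cannot be reordered before this update. */
    int old = atomic_fetch_sub_explicit(count, 1, memory_order_acquire);
    return old - 1 >= 0;
}

/* Returns true if nobody was waiting, false if a slow path must wake a waiter. */
static bool sketch_unlock_fast(atomic_int *count)
{
    /* Release: earlier loads and stores cannot be reordered after this update. */
    int old = atomic_fetch_add_explicit(count, 1, memory_order_release);
    return old + 1 > 0;
}
```

The point of the analogy is only where the barrier sits: after the atomic update when locking, before it when unlocking.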

Here is a first pass at a powerpc file for the fast paths just as an FYI/RFC. It is completely untested, but compiles.

Signed-off-by: Joel Schopp <jschopp@xxxxxxxxxxxxxx>



Index: 2.6.15-mutex14/include/asm-powerpc/mutex.h
===================================================================
--- 2.6.15-mutex14.orig/include/asm-powerpc/mutex.h 2006-01-04 14:46:31.%N -0600
+++ 2.6.15-mutex14/include/asm-powerpc/mutex.h 2006-01-05 16:25:41.%N -0600
@@ -1,9 +1,83 @@
/*
- * Pull in the generic implementation for the mutex fastpath.
+ * include/asm-powerpc/mutex.h
*
- * TODO: implement optimized primitives instead, or leave the generic
- * implementation in place, or pick the atomic_xchg() based generic
- * implementation. (see asm-generic/mutex-xchg.h for details)
+ * PowerPC optimized mutex locking primitives
+ *
+ * Please look into asm-generic/mutex-xchg.h for a formal definition.
+ * Copyright (C) 2006 Joel Schopp <jschopp@xxxxxxxxxxxxxx>, IBM
*/
+#ifndef _ASM_MUTEX_H
+#define _ASM_MUTEX_H
+#define __mutex_fastpath_lock(count, fail_fn)\
+do{ \
+ long tmp; \
+ __asm__ __volatile__( \
+"1: lwarx %0,0,%1\n" \
+" addic %0,%0,-1\n" \
+" stwcx. %0,0,%1\n" \
+" bne- 1b\n" \
+" isync \n" \
+ : "=&r" (tmp) \
+ : "r" (&(count)->counter) \
+ : "cr0", "memory"); \
+ if (unlikely(tmp < 0)) \
+ fail_fn(count); \
+} while (0)
+
+#define __mutex_fastpath_unlock(count, fail_fn)\
+do{ \
+ long tmp; \
+ __asm__ __volatile__(SYNC_ON_SMP \
+"1: lwarx %0,0,%1\n" \
+" addic %0,%0,1\n" \
+" stwcx. %0,0,%1\n" \
+" bne- 1b\n" \
+ : "=&r" (tmp) \
+ : "r" (&(count)->counter) \
+ : "cr0", "memory"); \
+ if (unlikely(tmp <= 0)) \
+ fail_fn(count); \
+} while (0)
+
+
+static inline int
+__mutex_fastpath_trylock(atomic_t* count, int (*fail_fn)(atomic_t*))
+{
+ long tmp;
+ __asm__ __volatile__(
+"1: lwarx %0,0,%1\n"
+" cmpwi 0,%0,1\n"
+" bne- 2f\n"
+" stwcx. %0,0,%1\n"
+" bne- 1b\n"
+" isync\n"
+"2:"
+ : "=&r" (tmp)
+ : "r" (&(count)->counter)
+ : "cr0", "memory");
+
+ return (int)tmp;
+
+}
+
+#define __mutex_slowpath_needs_to_unlock() 1

-#include <asm-generic/mutex-dec.h>
+static inline int
+__mutex_fastpath_lock_retval(atomic_t* count, int (*fail_fn)(atomic_t *))
+{
+ long tmp;
+ __asm__ __volatile__(
+"1: lwarx %0,0,%1\n"
+" addic %0,%0,-1\n"
+" stwcx. %0,0,%1\n"
+" bne- 1b\n"
+" isync \n"
+ : "=&r" (tmp)
+ : "r" (&(count)->counter)
+ : "cr0", "memory");
+ if (unlikely(tmp < 0))
+ return fail_fn(count);
+ else
+ return 0;
+}
+#endif
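
One thing that may be worth checking when this gets tested: in the trylock above, the stwcx. appears to write back the value that was just loaded, whereas a trylock would normally move the count from 1 to 0. A minimal portable sketch of that transition follows, assuming C11 <stdatomic.h>. The name sketch_trylock_fast is hypothetical and this is not a replacement for the PowerPC asm.

```c
#include <stdatomic.h>

/* Hypothetical helper: returns 1 if the lock was taken, 0 otherwise. */
static int sketch_trylock_fast(atomic_int *count)
{
    int expected = 1;   /* only an unlocked mutex (count == 1) can be taken */

    /* On success the count becomes 0, with acquire ordering so the critical
     * section cannot move above the update. On failure the count is left
     * untouched and no ordering is required. */
    if (atomic_compare_exchange_strong_explicit(count, &expected, 0,
                                                memory_order_acquire,
                                                memory_order_relaxed))
        return 1;
    return 0;
}
```

With this shape the caller sees 1 only when the count really went from 1 to 0, and a locked or contended count is never modified.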

