Memory Barriers for (Ubuntu) Linux (i686)
Post 302430438 by Corona688 on Thursday 17th of June 2010 03:00:49 PM
Small world, how so?
Quote:
Originally Posted by gorga
I hadn't heard of futexes until you mentioned them, but I did some reading and it seems they still use atomic instructions to update shared variables.
Well, yes. It has to synchronize somehow. One way or another you must interrupt other cores with this change in status, or they may never know.
Quote:
In that case I could just use one of GCC's built-in atomic operations like "__sync_fetch_and_add" or "__sync_bool_compare_and_swap" as described here...

Atomic Builtins - Using the GNU Compiler Collection (GCC)
Wow, those are nice.

Quote:
The thing with these is they use the asm op-code "lock", which issues a hardware lock on the data-bus effectively locking every other process out of memory.
I think you're overreacting... Any memory I/O monopolizes the bus*; LOCK just guarantees that one instruction gets its two memory operations (the read and the write) back to back.

Also, the original 8088 has precisely one instruction worth of cache, so locking the bus stalls it instantly. The huge caches, multiple independent memory buses, and cache communication systems in recent NUMA systems usually let cores keep going or find something else to do. I'm not sure LOCK XCHG even forces a real memory fetch anymore (might be simple to test; I'll try to get back to you on that).

Lastly, if you're doing no mutexing, what are you doing instead -- polling? That's not going to be more efficient; untold amounts of CPU will be expended on what amounts to a while(1) loop.

I really think pthreads is still what you're looking for. They've made it as fast as they know how, significantly changing the kernel to accommodate it.

* Exceptions exist for very special-purpose memory chips like video RAM.
 

ATOMIC_DEPRECATED(3)					   BSD Library Functions Manual 				      ATOMIC_DEPRECATED(3)

NAME
OSAtomicAdd32, OSAtomicAdd32Barrier, OSAtomicIncrement32, OSAtomicIncrement32Barrier, OSAtomicDecrement32, OSAtomicDecrement32Barrier, OSAtomicOr32, OSAtomicOr32Barrier, OSAtomicOr32Orig, OSAtomicOr32OrigBarrier, OSAtomicAnd32, OSAtomicAnd32Barrier, OSAtomicAnd32Orig, OSAtomicAnd32OrigBarrier, OSAtomicXor32, OSAtomicXor32Barrier, OSAtomicXor32Orig, OSAtomicXor32OrigBarrier, OSAtomicAdd64, OSAtomicAdd64Barrier, OSAtomicIncrement64, OSAtomicIncrement64Barrier, OSAtomicDecrement64, OSAtomicDecrement64Barrier, OSAtomicCompareAndSwapInt, OSAtomicCompareAndSwapIntBarrier, OSAtomicCompareAndSwapLong, OSAtomicCompareAndSwapLongBarrier, OSAtomicCompareAndSwapPtr, OSAtomicCompareAndSwapPtrBarrier, OSAtomicCompareAndSwap32, OSAtomicCompareAndSwap32Barrier, OSAtomicCompareAndSwap64, OSAtomicCompareAndSwap64Barrier, OSAtomicTestAndSet, OSAtomicTestAndSetBarrier, OSAtomicTestAndClear, OSAtomicTestAndClearBarrier, OSMemoryBarrier -- deprecated atomic add, increment, decrement, or, and, xor, compare and swap, test and set, test and clear, and memory barrier

SYNOPSIS
#include <libkern/OSAtomic.h>

int32_t OSAtomicAdd32(int32_t theAmount, volatile int32_t *theValue);
int32_t OSAtomicAdd32Barrier(int32_t theAmount, volatile int32_t *theValue);
int32_t OSAtomicIncrement32(volatile int32_t *theValue);
int32_t OSAtomicIncrement32Barrier(volatile int32_t *theValue);
int32_t OSAtomicDecrement32(volatile int32_t *theValue);
int32_t OSAtomicDecrement32Barrier(volatile int32_t *theValue);
int32_t OSAtomicOr32(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicOr32Barrier(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicAnd32(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicAnd32Barrier(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicXor32(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicXor32Barrier(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicOr32Orig(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicOr32OrigBarrier(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicAnd32Orig(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicAnd32OrigBarrier(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicXor32Orig(uint32_t theMask, volatile uint32_t *theValue);
int32_t OSAtomicXor32OrigBarrier(uint32_t theMask, volatile uint32_t *theValue);
int64_t OSAtomicAdd64(int64_t theAmount, volatile OSAtomic_int64_aligned64_t *theValue);
int64_t OSAtomicAdd64Barrier(int64_t theAmount, volatile OSAtomic_int64_aligned64_t *theValue);
int64_t OSAtomicIncrement64(volatile OSAtomic_int64_aligned64_t *theValue);
int64_t OSAtomicIncrement64Barrier(volatile OSAtomic_int64_aligned64_t *theValue);
int64_t OSAtomicDecrement64(volatile OSAtomic_int64_aligned64_t *theValue);
int64_t OSAtomicDecrement64Barrier(volatile OSAtomic_int64_aligned64_t *theValue);
bool OSAtomicCompareAndSwapInt(int oldValue, int newValue, volatile int *theValue);
bool OSAtomicCompareAndSwapIntBarrier(int oldValue, int newValue, volatile int *theValue);
bool OSAtomicCompareAndSwapLong(long oldValue, long newValue, volatile long *theValue);
bool OSAtomicCompareAndSwapLongBarrier(long oldValue, long newValue, volatile long *theValue);
bool OSAtomicCompareAndSwapPtr(void* oldValue, void* newValue, void* volatile *theValue);
bool OSAtomicCompareAndSwapPtrBarrier(void* oldValue, void* newValue, void* volatile *theValue);
bool OSAtomicCompareAndSwap32(int32_t oldValue, int32_t newValue, volatile int32_t *theValue);
bool OSAtomicCompareAndSwap32Barrier(int32_t oldValue, int32_t newValue, volatile int32_t *theValue);
bool OSAtomicCompareAndSwap64(int64_t oldValue, int64_t newValue, volatile OSAtomic_int64_aligned64_t *theValue);
bool OSAtomicCompareAndSwap64Barrier(int64_t oldValue, int64_t newValue, volatile OSAtomic_int64_aligned64_t *theValue);
bool OSAtomicTestAndSet(uint32_t n, volatile void *theAddress);
bool OSAtomicTestAndSetBarrier(uint32_t n, volatile void *theAddress);
bool OSAtomicTestAndClear(uint32_t n, volatile void *theAddress);
bool OSAtomicTestAndClearBarrier(uint32_t n, volatile void *theAddress);
bool OSAtomicEnqueue(OSQueueHead *list, void *new, size_t offset);
void* OSAtomicDequeue(OSQueueHead *list, size_t offset);
void OSMemoryBarrier(void);

DESCRIPTION
These are deprecated interfaces for atomic and synchronization operations, provided for compatibility with legacy code. New code should use the C11 <stdatomic.h> interfaces described in stdatomic(3).

These functions are thread and multiprocessor safe. For each function, there is a version which incorporates a memory barrier and another version which does not. Barriers strictly order memory access on a weakly-ordered architecture such as ARM. All loads and stores executed in sequential program order before the barrier will complete before any load or store executed after the barrier. On some platforms, such as ARM, the barrier operation can be quite expensive.

Most code will want to use the barrier functions to ensure that memory shared between threads is properly synchronized. For example, if you want to initialize a shared data structure and then atomically increment a variable to indicate that the initialization is complete, then you must use OSAtomicIncrement32Barrier() to ensure that the stores to your data structure complete before the atomic add. Likewise, the consumer of that data structure must use OSAtomicDecrement32Barrier(), in order to ensure that their loads of the structure are not executed before the atomic decrement. On the other hand, if you are simply incrementing a global counter, then it is safe and potentially much faster to use OSAtomicIncrement32(). If you are unsure which version to use, prefer the barrier variants as they are safer.

The logical (and, or, xor) and bit test operations are layered on top of the OSAtomicCompareAndSwap() primitives. There are four versions of each logical operation, depending on whether or not there is a barrier, and whether the return value is the result of the operation (e.g., OSAtomicOr32()) or the original value before the operation (e.g., OSAtomicOr32Orig()).

The memory address theValue must be "naturally aligned", i.e. 32-bit aligned for 32-bit operations and 64-bit aligned for 64-bit operations. Note that this is not the default alignment of the int64_t in the iOS ARMv7 ABI; the OSAtomic_int64_aligned64_t type can be used to declare variables with the required alignment.

The OSAtomicCompareAndSwap() operations compare oldValue to *theValue, and set *theValue to newValue if the comparison is equal. The comparison and assignment occur as one atomic operation.

OSAtomicTestAndSet() and OSAtomicTestAndClear() operate on bit (0x80 >> (n & 7)) of byte ((char*) theAddress + (n >> 3)). They set the named bit to either 1 or 0, respectively. theAddress need not be aligned.

The OSMemoryBarrier() function strictly orders memory accesses in a weakly ordered memory model such as with ARM, by creating a barrier. All loads and stores executed in sequential program order before the barrier will complete with respect to the memory coherence mechanism, before any load or store executed after the barrier. Used with an atomic operation, the barrier can be used to create custom synchronization protocols as an alternative to the spinlock or queue/dequeue operations. Note that this barrier does not order uncached loads and stores. On a uniprocessor, the barrier operation is typically optimized into a no-op.

RETURN VALUES
The arithmetic operations return the new value, after the operation has been performed. The boolean operations come in two styles, one of which returns the new value, and one of which (the "Orig" versions) returns the old. The compare-and-swap operations return true if the comparison was equal, i.e. if the swap occurred. The bit test and set/clear operations return the original value of the bit.

SEE ALSO
stdatomic(3), atomic(3), spinlock_deprecated(3)

HISTORY
Most of these functions first appeared in Mac OS 10.4 (Tiger). The "Orig" forms of the boolean operations, the "int", "long" and "ptr" forms of compare-and-swap first appeared in Mac OS 10.5 (Leopard).

Darwin Mar 7, 2016 Darwin