Re: [opensuse] Re: RAM



Randall R Schulz wrote:
On Wednesday 19 September 2007 15:53, Aaron Kulkis wrote:
Randall R Schulz wrote:
On Wednesday 19 September 2007 12:53, JJB wrote:
Price for the system is the same with quad core 1.6 GHz or dual core
3.0 GHz,
That's an interesting pair of options. I wonder how to analyze
one's applications to make the better choice.
The more cores contending for memory access, the more your system
falls short of theoretical maximum throughput.

So, go with the high-speed dual core rather than the low-speed
quad-core. Memory contention issues will more than destroy the
theoretical 0.4 GHz*CPU advantage of the quad core.
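
To put a number on that "theoretical" advantage: it comes from simply multiplying core count by clock speed, which assumes perfect scaling and no memory contention at all. A quick sketch of the arithmetic (the function name is only for illustration):

```python
def aggregate_ghz(cores, clock_ghz):
    """Naive aggregate clock rate: cores times GHz, ignoring contention."""
    return cores * clock_ghz

quad = aggregate_ghz(4, 1.6)   # quad core at 1.6 GHz
dual = aggregate_ghz(2, 3.0)   # dual core at 3.0 GHz
advantage = quad - dual        # the quad core's edge on paper

print(f"quad: {quad:.1f}  dual: {dual:.1f}  advantage: {advantage:.1f} GHz*CPU")
print(f"relative edge: {advantage / dual:.1%}")
```

On paper the quad core is ahead by less than 7%, so it does not take much contention to wipe that out.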

Not necessarily. Some instruction mixes have a much higher ratio of CPU internal instruction cycles to memory accesses than others.

For example, tight inner loops (where all the instructions remain in the level 1 cache) that perform lots of floating-point operations on values that were computed by the immediately preceding instruction (again benefitting from the on-chip cache) will exhibit relatively few memory accesses per clock cycle.


You would be surprised.
For a couple of years, I worked with the supercomputing group at the
General Motors Tech Center in Warren, MI, and even programs
that were specifically DESIGNED to run as parallel instances (usually
using a divide & conquer strategy across 2^N processors) took more
wall-clock CPU-hours using high parallelism than low parallelism.
This was on high-end SGI (16 to 64 CPU) and IBM P-5 series computers
(up to 96 CPUs) which were specifically designed for parallel
processing, running programs specifically designed for such
environments (code from Lawrence Livermore Labs, for example).

The President of Platform Computing (http://www.platformcomputing.com)
described similar results at other customer sites in his speech at
a day-long presentation sponsored by Compaq shortly after HP
took over Compaq.
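
One textbook way to see why total CPU-hours climb with the processor count is Amdahl's law: if some fraction of a job is inherently serial, the extra processors sit idle during that part but are still held for the whole run. The sketch below is just that model, reading "CPU-hours" as wall time multiplied by the number of CPUs held. The 10-hour job and 5% serial fraction are made-up numbers, not figures from the GM runs, and the model does not even include memory contention, which would only make things worse.

```python
def wall_hours(single_cpu_hours, serial_fraction, n_cpus):
    """Amdahl's law: the serial part doesn't shrink, the rest divides by N."""
    serial = single_cpu_hours * serial_fraction
    parallel = single_cpu_hours * (1.0 - serial_fraction)
    return serial + parallel / n_cpus

def cpu_hours(single_cpu_hours, serial_fraction, n_cpus):
    """Total CPU-hours consumed: every CPU is held for the whole wall time."""
    return n_cpus * wall_hours(single_cpu_hours, serial_fraction, n_cpus)

# Hypothetical job: 10 hours on one CPU, 5% of it inherently serial.
for n in (1, 2, 4, 8, 16, 32, 64):
    print(f"{n:3d} CPUs: wall {wall_hours(10.0, 0.05, n):6.2f} h, "
          f"total {cpu_hours(10.0, 0.05, n):6.2f} CPU-h")
```

Wall time keeps dropping as CPUs are added, but the total CPU-hours billed go up with every doubling.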


In contrast, processing mixes that involve a lot of data movement and relatively little calculation (especially mixes that use few instructions that take multiple clock cycles) will benefit most from fast RAM.
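
One rough way to picture the two extremes is a back-of-the-envelope model in which a loop's run time is the larger of its compute time and its memory-transfer time. This is only a sketch: the function and every number in it are made up for illustration, and real CPUs (caches, prefetching, partial overlap) are messier.

```python
def estimated_seconds(flops, bytes_moved, peak_flops_per_s, mem_bytes_per_s):
    """Crude bound: time is set by whichever resource is slower.

    Assumes compute and memory traffic overlap perfectly, which real
    hardware only approximates.
    """
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / mem_bytes_per_s
    return max(compute_time, memory_time)

# Hypothetical machine: 10 GFLOP/s peak, 5 GB/s memory bandwidth.
PEAK = 10e9
BW = 5e9

# Compute-heavy mix: lots of arithmetic per byte fetched from RAM.
compute_heavy = estimated_seconds(100e9, 1e9, PEAK, BW)
# Data-movement mix: little arithmetic, lots of RAM traffic.
data_heavy = estimated_seconds(1e9, 100e9, PEAK, BW)

# Same two mixes with RAM twice as fast.
compute_heavy_fast_ram = estimated_seconds(100e9, 1e9, PEAK, 2 * BW)
data_heavy_fast_ram = estimated_seconds(1e9, 100e9, PEAK, 2 * BW)

print(f"compute-heavy: {compute_heavy:.1f} s -> {compute_heavy_fast_ram:.1f} s")
print(f"data-heavy:    {data_heavy:.1f} s -> {data_heavy_fast_ram:.1f} s")
```

In this toy model, doubling the RAM bandwidth halves the run time of the data-movement mix and does nothing for the compute-heavy one.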


There's no one-size-fits-all answer. If you really want to optimize a particular application, you must understand and analyze it carefully. For "general-purpose" applications (not really meaningful without _some_ characterization of the processing mix), there presumably are some kinds of rules of thumb, but I'm not sure what they are.

(I know that for the application that absorbs most of my attention these days the dominant factor is definitely RAM speed. I've observed that a 2.0 GHz Core Duo (_not_ Core 2) beats a 3.0 GHz Pentium 4 HT simply because the former has faster memory. In fact, the ratio of my current project's speed on the two systems is almost exactly the ratio of their RAM speeds. It's as if the CPU speed didn't even matter!)

For most every application out there, RAM speed is the primary bottleneck,
followed by disk access (caused by RAM shortages). The first byte of any
RAM access is on the order of 1,000 to 100,000 times faster than disk
access. Network filesystem access tends to be even slower.
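
For a feel of where a range like that comes from, here is the arithmetic with ballpark latencies. All three figures are assumptions picked for illustration (roughly typical of RAM and spinning disks), not measurements from any particular system.

```python
RAM_LATENCY_S = 100e-9        # assumed: ~100 ns for the first byte from RAM
DISK_LATENCY_FAST_S = 100e-6  # assumed: ~0.1 ms, e.g. a hit in the drive's cache
DISK_LATENCY_SLOW_S = 10e-3   # assumed: ~10 ms, seek plus rotation on a spinning disk

def slowdown(disk_latency_s, ram_latency_s=RAM_LATENCY_S):
    """How many times longer the disk access takes than the RAM access."""
    return disk_latency_s / ram_latency_s

print(f"best case:  {slowdown(DISK_LATENCY_FAST_S):,.0f}x slower than RAM")
print(f"worst case: {slowdown(DISK_LATENCY_SLOW_S):,.0f}x slower than RAM")
```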

[I have a long story about how an analyst's execution times for
some Computational Fluid Dynamics problems were gradually cut
from 16 hours (wall-clock time) down to 5 ~ 10 minutes,
all by getting rid of disk access over the network (not only the data,
but the executable being remotely located ALSO had a significant
impact on wall-clock times).]


