The comparison to Nehalem is instructive for many reasons, not least of which are the very different approaches AMD and Intel have taken with their latest CPU architectures. From a certain way of looking at things, they reach similar destinations by different paths. Istanbul, of course, has six execution cores, each of which can issue three instructions per clock.
Nehalem has four cores, but they are true four-issue cores, capable of issuing, executing, and retiring four instructions per clock. Chip-wide, then, Istanbul can issue 18 instructions per clock, while Nehalem can issue 16—closer than one might think when just considering core counts. AMD expects Istanbul to give it a clear lead in this space, at least until Nehalem-EX arrives later this year with eight native cores and four memory channels per socket. Istanbul Opterons will populate the new Opteron and series lineups, and their introduction brings with it some price reductions on existing Shanghai Opterons.
The three 2P versions of Istanbul run at 2. AMD is quick to point out that its entire product lineup shares the same basic feature set—including cache sizes, memory speeds, and virtualization support—in contrast to the breathtaking variety of the Xeon series, which can be rather daunting to keep sorted. One can see here how AMD intends for the quad- and six-core Opterons to coexist.
The top Shanghai model, the at 2. The other Shanghais tumble in reaction. So the customer is faced with a fairly straightforward choice between four cores at 2. The 4P-and-greater Opteron series presents the same choice, with higher stakes.
The first wave of Istanbuls all occupy standard power envelopes, but the six-core chips will proliferate to the other Opteron power grades this summer. These Nehalem CPUs have a core clock of 2. Hence the development of AMD's ACP (average CPU power) metric.
After that, in early , will come the bifurcation of Opteron socket types into two classes, the higher-end G34 with four memory channels and the mid-range C32 socket with dual-channel memory. These new sockets will enable some features already present in 45nm Opteron silicon, including DDR3 memory support and a fourth HyperTransport link. The two socket types will overlap in the 2P space, while only the G34 will serve 4P and beyond. However, without the need to traverse a longer distance over a motherboard, Goddard said AMD should be able to tune the synchronizers on the HT links to achieve much lower latencies than a socket-to-socket connection.
Beyond that, mapping out the multi-chip-per-package future of the Opteron becomes rather tricky. Magny-Cours, for instance, will be fully connected on a per-chip basis, not just per socket, in a 2P system. Details about this one are sketchy, but Goddard told us that platform would include on-die PCI Express connectivity.

As ever, we did our best to deliver clean benchmark numbers. Tests were run at least three times, and the results were averaged.
The tests and methods we employ are usually publicly available and reproducible. If you have questions about our methods, hit our forums to talk with us about them.

This bandwidth test gives a nice visual for the different levels of the cache and memory hierarchy. This version of Stream is multithreaded and can be told how many threads to create. As you can see, the Nehalem Xeons have a clear lead in available bandwidth, thanks primarily to their three channels of DDR3 memory.
With no real changes to the memory subsystem, Istanbul achieves no more throughput than Shanghai.

We can get a closer look at access latencies throughout the memory hierarchy with the 3D graphs below. The continuity between Istanbul and Shanghai carries over here. The Xeon X looks pretty similar, too, but it has smaller L1 and L2 caches, a larger, quicker L3 cache (8MB), and much shorter access times to main memory.
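Stream itself is a small C benchmark, but the idea behind it is simple enough to sketch. Below is a minimal Java analogue of a multithreaded triad kernel, written purely for illustration (it is not the Stream code we ran, and the class name and array sizes are our own choices): each thread sweeps its own slice of three large arrays, and aggregate bandwidth falls out of bytes moved per second.

```java
// Illustrative only; run with a generous heap (e.g. -Xmx1g), since the three
// arrays total roughly 192MB, enough to spill well out of any CPU cache.
public class TriadSketch {
    static final int N = 1 << 23;                              // 8M doubles per array
    static final double[] a = new double[N], b = new double[N], c = new double[N];

    public static void main(String[] args) throws InterruptedException {
        int threads = args.length > 0 ? Integer.parseInt(args[0])
                                      : Runtime.getRuntime().availableProcessors();
        java.util.Arrays.fill(b, 1.0);
        java.util.Arrays.fill(c, 2.0);

        int chunk = N / threads;
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            final int lo = t * chunk;
            final int hi = (t == threads - 1) ? N : lo + chunk;
            workers[t] = new Thread(() -> {
                for (int pass = 0; pass < 20; pass++)          // repeat for a stable reading
                    for (int i = lo; i < hi; i++)
                        a[i] = b[i] + 3.0 * c[i];              // triad: two reads and one write per element
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();

        double seconds = (System.nanoTime() - start) / 1e9;
        double gigabytes = 20.0 * 3 * 8 * (double) N / 1e9;    // 20 passes x 3 arrays x 8 bytes x N elements
        System.out.printf("~%.1f GB/s with %d threads%n", gigabytes / seconds, threads);
    }
}
```

Varying the thread count in a kernel like this is what exposes how quickly each platform's memory controllers saturate.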
The logic executed by the test is written in Java and runs in a JVM. This benchmark tests scaling with one to many threads, although its main score is largely a measure of peak throughput. As you may know, system vendors spend tremendous effort attempting to achieve peak scores in benchmarks like this one, which they then publish via SPEC.
We did not intend to challenge the best published scores with our results, but we did hope to achieve reasonably optimal tuning for our test systems. We used two JVM instances on all systems (one per socket), with the following command-line options:
Those options are specifically the ones used with the Istanbul Opteron system. They varied for the other two systems in a couple of ways. We also adjusted the number of garbage collector threads (-XXgcthreads) for each JVM to match the number of hardware threads per socket. In keeping with the SPECjbb run rules, we tested at up to twice the optimal number of warehouses per system, with the optimal count being the total number of hardware threads.
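The arithmetic behind that tuning is easy to spell out. The little helper below is purely illustrative, not part of SPECjbb or our test harness; it just shows how the GC thread count and the warehouse sweep follow from the hardware thread count on a 2P box.

```java
// Hypothetical helper; the class name, variable names, and two-socket assumption are ours.
public class JbbTuningMath {
    public static void main(String[] args) {
        int sockets = 2;                                           // one JVM instance per socket
        int hwThreads = Runtime.getRuntime().availableProcessors();
        int gcThreadsPerJvm = hwThreads / sockets;                 // value we would hand to -XXgcthreads
        int optimalWarehouses = hwThreads;                         // optimal warehouse count = total hardware threads
        int maxWarehouses = 2 * optimalWarehouses;                 // run rules: test up to twice the optimal count

        System.out.printf("GC threads per JVM: %d%n", gcThreadsPerJvm);
        System.out.printf("Warehouses tested: 1 to %d (optimal at %d)%n", maxWarehouses, optimalWarehouses);
    }
}
```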
Istanbul does bring substantial progress over Shanghai, however, closing the gap quite a bit.

Like SPECjbb, this benchmark is based on multithreaded Java workloads and uses similar tuning parameters, but its workloads are somewhat different.
The benchmark then reports power-performance ratios at each load level.

Although it generally works well enough, our Extech power meter occasionally produces a clearly wrong reading, which is either approximately one half or twice the prior reading—apparently a simple serial communications quirk.
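That sort of quirk is easy to screen for after the fact. The sketch below is not our actual logging code, just one way a post-processing step could flag readings that land at roughly half or double the previous sample and carry the prior value forward instead.

```java
import java.util.ArrayList;
import java.util.List;

// Crude heuristic for the serial-communications quirk described above; a real
// jump in load can legitimately double power draw, so the thresholds matter.
public class PowerGlitchFilter {
    static List<Double> clean(List<Double> watts) {
        List<Double> out = new ArrayList<>();
        for (double w : watts) {
            if (!out.isEmpty()) {
                double prev = out.get(out.size() - 1);
                double ratio = w / prev;
                boolean halved = Math.abs(ratio - 0.5) < 0.05;    // within ~10% of exactly half
                boolean doubled = Math.abs(ratio - 2.0) < 0.2;    // within ~10% of exactly double
                if (halved || doubled) {
                    out.add(prev);                                // treat as a glitch; repeat the prior reading
                    continue;
                }
            }
            out.add(w);
        }
        return out;
    }
}
```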
Our results would not be accepted for publication by SPEC unless we used an approved and much costlier power meter. They should, however, be good enough for our purposes. We used the same basic performance tuning and system setup parameters here that we did with SPECjbb, with the exception that we lowered the JVM heap size slightly to avoid a memory allocation error.
Like I said, the heap size is the only real change. The Istanbul Opteron-based system looks awfully good here; its power consumption is similar to the Shanghai system at each load level, but with substantially higher performance. The Xeon X system is a little different; at active idle, it draws W, versus W for the two Opteron boxes.
Beyond that, the Xeon X system draws more power but achieves higher performance at each step than the Opteron.

Now we can see just how incredibly close a race this is. The performance-power curves for the Opteron and Xeon X systems almost perfectly overlap, amazingly enough.
Multi-core processors tend to offer very strong power efficiency propositions with highly parallel workloads. Adding two more cores and dialing back clock speeds in order to fit into the same power envelopes as Shanghai proves to be a very effective strategy in this case.
The overall result takes power draw at active idle into account, which is probably what puts the Xeon over the top. Make no mistake, though: this Istanbul system is very much a match for the Xeon in terms of power-efficient performance.

We can take another look at power consumption and energy-efficient performance by using a test whose time to completion varies with performance. As the multithreaded version of this test ran, we measured power draw at the wall socket for each of our test systems across a set time period.
A quick look at the data tells us much of what we need to know. Still, we can quantify these things with more precision. Only one watt separates it from the Istanbul box. Next, we can look at peak power draw by taking an average over the ten-second span from 15 to 25 seconds into our test period, during which the processors were rendering.
One way to gauge power efficiency is to look at total energy use over our time span. This method takes into account power use both during the render and during the idle time. We can express the result in terms of watt-seconds, also known as joules. The Istanbul system consumes less energy over the course of the test period than the Xeon X.

We can quantify efficiency even better by considering specifically the amount of energy used to render the scene. This method should account for both power use and, to some degree, performance, because shorter render times may lead to less energy consumption. The energy efficiency picture comes into sharper focus with this final metric.
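Both of those energy numbers come from the same simple integration of the power samples over time. The sketch below uses made-up readings and a one-second sample interval purely to show the arithmetic; it is not our logging or analysis code.

```java
public class EnergyFromSamples {
    public static void main(String[] args) {
        double[] watts = {180, 182, 310, 312, 309, 311, 184, 181}; // hypothetical once-per-second readings
        double intervalSec = 1.0;
        int renderStart = 2, renderEnd = 6;                        // sample indices during which the render ran

        double totalJoules = 0, renderJoules = 0;
        for (int i = 0; i < watts.length; i++) {
            totalJoules += watts[i] * intervalSec;                 // watts x seconds = watt-seconds = joules
            if (i >= renderStart && i < renderEnd) {
                renderJoules += watts[i] * intervalSec;
            }
        }
        System.out.printf("Total energy: %.0f J, energy used to render: %.0f J%n", totalJoules, renderJoules);
    }
}
```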
Our benchmarks sometimes come from unexpected places, and such is the case with this one. David Tabb is a friend of mine from high school and a long-time TR reader. His field is shotgun proteomics, in which researchers digest complex mixtures of proteins into peptides, separate them by liquid chromatography, and analyze them with tandem mass spectrometers.
This creates data sets containing tens of thousands of spectra that can be identified to peptide sequences drawn from the known genomes for most lab organisms. Recently, David Tabb and Matthew Chambers at Vanderbilt University developed MyriMatch , an algorithm that can exploit multiple cores and multiple computers for this matching. Source code and binaries of MyriMatch are publicly available. In this test, tandem mass spectra from a Thermo LTQ mass spectrometer are identified to peptides generated from the proteins of S.
The data set was provided by Andy Link at Vanderbilt University. MyriMatch uses threading to accelerate the handling of protein sequences. The database read into memory is separated into a number of jobs, typically a multiple of the thread count. When a thread finishes handling all proteins in the current job, it accepts another job from the queue.
This technique is intended to minimize both synchronization overhead between threads and CPU idle time. The most important news for us is that MyriMatch is a widely multithreaded real-world application that we can use with a relevant data set. MyriMatch also offers control over the number of threads used. I should mention that performance scaling in MyriMatch tends to be limited by several factors, including memory bandwidth, as David explains:
Inefficiencies in scaling occur from a variety of sources. First, each thread is comparing to a common collection of tandem mass spectra in memory. Although most peptides will be compared to different spectra within the collection, sometimes multiple threads attempt to compare to the same spectra simultaneously, necessitating a mutex mechanism for each spectrum.
Second, the number of spectra in memory far exceeds the capacity of processor caches, and so the memory controller gets a fair workout during execution.
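To make that job-queue pattern concrete, here is a generic sketch of the approach. It is not MyriMatch's code; the class name, the use of strings for proteins, and the empty score() placeholder are all stand-ins. The point is simply that workers contend only briefly when claiming the next job, which is what keeps synchronization overhead and idle time low.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class JobQueueSketch {
    static void searchAll(List<List<String>> proteinJobs, int threads) throws InterruptedException {
        Queue<List<String>> queue = new ConcurrentLinkedQueue<>(proteinJobs);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                List<String> job;
                while ((job = queue.poll()) != null) {   // claim the next job; finish when the queue runs dry
                    for (String protein : job) {
                        score(protein);                  // compare candidate peptides against the shared spectra
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) w.join();
    }

    static void score(String protein) { /* placeholder for the spectrum-matching work */ }
}
```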
This benchmark has been available to the public for some time in single-threaded form, but Charles was kind enough to put together a multithreaded version of the benchmark for us with a larger data set. He has also put a web page online with a downloadable version of the multithreaded benchmark, a description, and some results here. In this test, the application is basically doing analysis of airflow over an aircraft wing. I will step out of the way and let Charles explain the rest:

The CFD grid contains 1. The benchmark executable advances the Mach 0. A benchmark score is reported as a CFD cycle frequency in Hertz. So the higher the score, the faster the computer.
Charles tells me these CFD solvers are very floating-point intensive, but oftentimes limited primarily by memory bandwidth. He has modified the benchmark for us in order to enable control over the number of threads used. I thought it might be nice to plot the performance at different thread counts, which I did in the graph above. Just for kicks, I decided to try running two instances of this benchmark concurrently, with each one affinitized to a socket, and adding the results into an aggregate compute rate. Doing so proved to offer a nice performance boost. Both the Xeons and Opterons benefited from the change. However you run this test, though, the Nehalem Xeons are simply faster, probably due to their superior memory bandwidth.
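On a Linux box, one way to pin each instance to a socket is numactl. The launcher below is a sketch under that assumption: numactl's --cpunodebind and --membind switches are real, but the ./cfd_bench name and its --threads option are placeholders rather than the benchmark's actual interface.

```java
import java.io.IOException;

public class PerSocketRun {
    public static void main(String[] args) throws IOException, InterruptedException {
        // One copy of the solver per socket, each bound to that socket's cores and local memory.
        Process socket0 = new ProcessBuilder("numactl", "--cpunodebind=0", "--membind=0",
                                             "./cfd_bench", "--threads=6").inheritIO().start();
        Process socket1 = new ProcessBuilder("numactl", "--cpunodebind=1", "--membind=1",
                                             "./cfd_bench", "--threads=6").inheritIO().start();
        socket0.waitFor();
        socket1.waitFor();
        // Each instance reports its own CFD cycle frequency in Hz; the aggregate
        // compute rate is simply the sum of the two.
    }
}
```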
Overall, Folding@Home should be a great example of real-world scientific computing. The benchmark processes a sample work unit of each type.
When either of those WUs is finished, the benchmark moves on to additional WU types, always keeping the cores occupied with some sort of calculation. Should the benchmark run out of new WUs to test, it simply processes another WU in order to prevent any of the cores from going idle as the others finish.
Once all four of the WU types have been tested, the benchmark averages the points per day among them. That points-per-day average is then multiplied by the total number of cores (or threads, in the case of SMT) in the system in order to estimate the total number of points per day that CPU might achieve. This may be a somewhat quirky method of estimating overall performance, but my sense is that it generally ought to work. I have included results for each of the individual WU types below, so you can see how the different CPUs perform on each.
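The projection arithmetic itself is trivial; a worked example with made-up numbers spells it out:

```java
public class PpdEstimate {
    public static void main(String[] args) {
        double[] perCorePpdByWuType = {950, 1210, 880, 1040}; // hypothetical per-core scores for the four WU types
        int totalCoresOrThreads = 12;                         // e.g. a 2P six-core box; use thread count with SMT

        double sum = 0;
        for (double ppd : perCorePpdByWuType) sum += ppd;
        double averagePerCore = sum / perCorePpdByWuType.length;

        System.out.printf("Projected total: %.0f points per day%n", averagePerCore * totalCoresOrThreads);
    }
}
```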
Once we get to the final analysis, though, its total projected points per day look much stronger. Istanbul is once again a respectable improvement on Shanghai, but not quite fast enough to catch the new Xeons.

The POV-Ray benchmark scene has a large single-threaded component, so it produces very different results. The Opteron is faster in this case, giving us a peek at the other side of the core-versus-frequency tradeoff.
Valve uses VRAD to pre-compute lighting that goes into its games.

This benchmark tests performance with one of the most popular H.264 encoders. The results come in two parts, for the two passes the encoder makes through the video file. These scores come from the newer, faster version 0.
True to form, the Shanghai Opterons are faster in pass one, while Istanbul is faster the second time around. This works by interlacing, i. That honor would fall to the Xeon W processor that appeared in some of our benchmark results. In terms of raw performance in a 2P system, Nehalem still reigns supreme. Yet Istanbul should be a clear improvement over Shanghai for many workstation-class workloads and most server-class workloads—i.