Primate Labs: Discussion

Relative performance of 4770K across Linux and Windows

2013-09-20T01:31:01Z

Any reply to this email? I would think the same hardware should perform similarly, but evidently there's a very large discrepancy, and it looks like a serious performance issue when comparing different platforms. I have no idea what Apple's done with their LLVM, but I'd imagine it's not very comparable to what is available on Linux. Also, for x86 processors, there are optimized math libraries available on Linux and Windows (and I'd imagine Apple) with *GEMM and *FFT routines you could call. That way you're testing something relevant to performance other than memory latency. Your *GEMM should be close to what the hardware is theoretically capable of. Lastly, there are optimized AES and other encryption implementations, even ISA instructions, so why not use them?
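To make "theoretically capable" concrete, here is a rough sketch of how one might compare a measured GEMM time against the peak rate. The function names are made up for illustration, and the 4770K figures (4 cores, 3.5 GHz base clock, 16 double-precision FLOPs per cycle per core with FMA) are assumptions plugged in by hand, not numbers reported by the benchmark.

```python
def peak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak in GFLOP/s: cores x clock (GHz) x FLOPs per cycle per core."""
    return cores * ghz * flops_per_cycle


def gemm_gflops(n, seconds):
    """Achieved GFLOP/s for an n x n by n x n multiply (about 2*n^3 operations)."""
    return 2.0 * n ** 3 / seconds / 1e9


def efficiency(n, seconds, cores, ghz, flops_per_cycle):
    """Fraction of the theoretical peak that a measured GEMM run achieved."""
    return gemm_gflops(n, seconds) / peak_gflops(cores, ghz, flops_per_cycle)


# Assumed 4770K figures (hypothetical inputs, adjust for the real machine):
# 4 cores, 3.5 GHz, 16 double-precision FLOPs per cycle per core.
PEAK_4770K = peak_gflops(4, 3.5, 16)
```

A well-tuned library GEMM would be expected to land at a high fraction of that peak, while a memory-bound kernel would sit far below it.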

Tim

2013-09-20T20:27:52Z

Hi Tim,

Thank you for your questions, and sorry for the delay in getting back to you. We appreciate when users consider the scores so carefully.

During development of the benchmark we run tests similar to what you are doing here: we run the benchmark on the same hardware across different OSes. We see small performance differences, but are unable to reproduce the severe discrepancy that you have observed on either Sandy Bridge or Ivy Bridge (we do not have a Haswell system to test).

We enable auto-vectorization on all compilers. However, SFFT is not vectorized on any platform, so I do not believe that vectorization is causing the differences you are observing.

We certainly agree that the same hardware should perform similarly, and we strive for this. We want the scores to represent the hardware performing well, but we also intend the scores to reflect the execution performance of real-world application code. We expect that a programmer will write their code once and compile that same code for each platform they support. We choose not to use optimized vendor libraries for workloads such as GEMM and FFT, since we expect such libraries to be optimized for each target architecture, making a direct comparison of the scores troublesome. Furthermore, if the libraries are proprietary, we don't know exactly what optimizations the library performs.

We use the AES-NI instructions on systems that support them. Similarly, we use the SHA1 instructions when they are available. In this way we have encryption and hash functions implemented in both hardware and software: AES and SHA1 in hardware, and Twofish and SHA2 in software.
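As an illustration only (this is not the benchmark's code), choosing between a hardware and a software path at run time could look roughly like the sketch below. It assumes an x86 Linux system where `/proc/cpuinfo` has a `flags` line containing `aes` when AES-NI is present; the function names and backend labels are hypothetical.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def choose_aes_backend(flags):
    """Pick the hardware path when the 'aes' flag is present, else software."""
    return "aes-ni" if "aes" in flags else "software"


def read_flags(path="/proc/cpuinfo"):
    """Read flags from the running system; empty set if the file is unavailable."""
    try:
        with open(path) as f:
            return parse_cpu_flags(f.read())
    except OSError:
        return set()
```

Calling `choose_aes_backend(read_flags())` on a machine without `/proc/cpuinfo` falls back to the software label, which mirrors the "use the instructions when they are available" behaviour described above.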

I hope this addresses your concerns and helps to explain our design choices. Let me know if you have any further questions or concerns and I'd be happy to help out.

Best,
John

2013-09-20T20:31:05Z

Hi Tim,

One thing I forgot to mention regarding the low AES performance you observed on Linux: we're aware of situations where Linux is slow to "ramp up" the frequency of the processor (relative to Windows). Since AES is the first workload executed, that might be what is happening here. In our testing, AES performance is almost identical between Windows and Linux on the same hardware.
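If you want to rule out the ramp-up effect in your own measurements, one option is to keep the CPU busy for a moment before timing anything. The following is a minimal sketch of that idea, not a description of how the benchmark itself works; `spin`, `timed`, and the default durations are assumptions chosen for illustration.

```python
import time


def spin(seconds):
    """Busy-loop for roughly `seconds` so the frequency governor can raise the clock."""
    deadline = time.perf_counter() + seconds
    count = 0
    while time.perf_counter() < deadline:
        count += 1
    return count


def timed(workload, warmup_seconds=0.5, repeats=5):
    """Warm up first, then return the best wall-clock time over `repeats` runs."""
    spin(warmup_seconds)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best
```

Taking the best of several runs after a warm-up makes the first workload less sensitive to whatever clock speed the processor happened to be at when the test started.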

Best,
John

2017-01-08T18:43:52Z

A big disappointment here, also. Similar test runs on the same hex-core hardware showed Windows 8 faster (by about the same factor you showed), in both integer and floating point, than a "Scientific Linux" version we tested.

A number of knowledgeable people in computer science told me that Linux would be fastest for scientific/math computing. That turned out not to be true.

We need to complete three continuous weeks (24x7) of multivariate statistical computations for 3/4 million cancer patients.

Would you have found better FP and INT performance had you tested the commercial Red Hat version of Linux instead?