Why Xeon E5-2697 v4 Can Deliver Better Performance Than Xeon Platinum 8160
In Intel’s technical documentation and marketing materials, the Xeon Platinum line appears to be an obvious upgrade over Broadwell-EP processors. More cores, a new interconnect, AVX-512, higher memory bandwidth — everything suggests that Xeon Platinum 8160 should be faster.
But in practice, we often see the opposite: older dual Xeon E5-2697 v4 systems outperform dual Xeon Platinum 8160 systems in several real-world workloads.
In this article, we’ll break down why this happens and which scenarios favor the older platform.
Per-core performance still matters
Let’s compare the frequencies:
| CPU | Base Clock | Turbo (1–2 cores) | Turbo (all-core) |
|---|---|---|---|
| E5-2697 v4 | 2.3 GHz | ~3.6 GHz | ~2.7–2.8 GHz |
| Platinum 8160 | 2.1 GHz | ~3.7 GHz | ~2.3–2.5 GHz |
In real workloads, the E5-2697 v4 typically sustains an all-core frequency roughly 200–500 MHz higher than the 8160 (compare the all-core column in the table above), which is crucial for tasks that do not scale perfectly.
For video streaming, FFmpeg pipelines, container parsing/muxing, cryptography, and traffic handling, the frequency of individual cores often matters more than the total number of cores.
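To check the sustained all-core frequency on your own hardware, one option is to sample the Linux cpufreq interface while the workload is running. The sketch below assumes a Linux host that exposes `scaling_cur_freq` (reported in kHz) under `/sys/devices/system/cpu`. The function names are illustrative, and depending on the cpufreq driver the value may be an estimate rather than a precise measurement.

```python
import glob
import os
import statistics


def read_core_freqs_mhz(sysfs_root="/sys/devices/system/cpu"):
    """Return the current frequency of each core in MHz, as reported by cpufreq."""
    freqs = []
    pattern = os.path.join(sysfs_root, "cpu[0-9]*", "cpufreq", "scaling_cur_freq")
    for path in sorted(glob.glob(pattern)):
        try:
            with open(path) as f:
                freqs.append(int(f.read().strip()) / 1000.0)  # file holds kHz
        except (OSError, ValueError):
            continue  # core offline or value unreadable: skip it
    return freqs


def summarize(freqs):
    """Return min/mean/max in MHz, or None if nothing could be read."""
    if not freqs:
        return None
    return {"min": min(freqs), "mean": statistics.mean(freqs), "max": max(freqs)}


if __name__ == "__main__":
    print(summarize(read_core_freqs_mhz()))
```

Running this several times under steady load on both platforms gives a rough picture of the all-core clock each one actually holds.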
Workload scalability ≠ number of cores
The Xeon Platinum 8160 offers 24 cores per socket (48 in a dual-socket system), against 18 per socket (36 in total) for the E5-2697 v4.
But most real Flussonic workloads, CDN pushes, metadata processing, and IO-heavy tasks do not scale efficiently across 48 cores.
Typical reasons:
- networking processes are limited by internal synchronization,
- core Flussonic logic (ingest, DVR, edge proxy) generally uses a limited number of threads.
If the workload keeps only 10–25 threads busy but depends on high clock speed, the E5-2697 v4 often outperforms the 8160.
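The trade-off between clock speed and core count can be illustrated with a small Amdahl's-law model. This is a simplified sketch, not a benchmark: it assumes equal work per clock on both CPUs (which ignores architectural differences), uses the lower bounds of the all-core frequencies from the table above, and treats `parallel_fraction` and `max_threads` as hypothetical workload parameters.

```python
def amdahl_speedup(parallel_fraction, threads):
    """Classic Amdahl's-law speedup for a given parallel fraction and thread count."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)


def relative_throughput(freq_ghz, cores, parallel_fraction, max_threads):
    """Frequency times Amdahl speedup, in arbitrary units.

    The workload can use at most `max_threads` threads, so extra cores
    beyond that number contribute nothing in this model.
    """
    threads = min(cores, max_threads)
    return freq_ghz * amdahl_speedup(parallel_fraction, threads)


if __name__ == "__main__":
    # (label, all-core GHz, total cores in a dual-socket system)
    systems = [("2x E5-2697 v4", 2.7, 36), ("2x Platinum 8160", 2.3, 48)]
    for parallel_fraction, max_threads in [(0.90, 24), (0.90, 48), (0.99, 48)]:
        for label, ghz, cores in systems:
            score = relative_throughput(ghz, cores, parallel_fraction, max_threads)
            print(f"p={parallel_fraction} threads<={max_threads} {label}: {score:.2f}")
```

Under these assumptions the older system comes out ahead whenever the thread count is capped at 24 or the parallel fraction is 0.90, while the 8160 only pulls ahead once the workload is almost perfectly parallel (0.99) and can use all 48 cores.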
Architectural latency: ring bus vs. mesh
Broadwell-EP uses a ring bus, providing predictable inter-core communication.
Skylake-SP moved to a mesh topology, which scales well for HPC and ML but can add extra hops, and therefore latency, for typical server workloads.
As a result:
- inter-core latencies are higher,
- NUMA balancing affects performance more often,
- workloads with many small data structures may run slower.
This is particularly noticeable in processing numerous short video fragments, metadata operations, and high-frequency IO interactions.
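One common way to limit cross-socket traffic for a latency-sensitive process is to keep it on the cores of a single NUMA node. The sketch below is a minimal illustration for Linux, assuming the standard `/sys/devices/system/node/nodeN/cpulist` files are present. The helper names are hypothetical, and note that this pins CPU placement only, not memory allocation.

```python
import os


def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-17,36-53' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty input or trailing comma
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_cpus(node, sysfs_root="/sys/devices/system/node"):
    """Return the set of CPU ids that belong to the given NUMA node."""
    with open(os.path.join(sysfs_root, f"node{node}", "cpulist")) as f:
        return parse_cpulist(f.read())


def pin_to_node(node, pid=0):
    """Restrict a process (0 means the current one) to one node's CPUs. Linux only."""
    os.sched_setaffinity(pid, node_cpus(node))
```

Calling `pin_to_node(0)` early in a worker's startup would keep its threads on socket 0. Whether that helps depends on the workload, so it is worth measuring before and after.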
AVX-512 reduces frequency — and that matters for video workloads
Xeon Platinum 8160 supports AVX-512. Sounds great, but in practice:
- heavy AVX-512 code can reduce core frequency by 600–900 MHz,
- x264, x265, and most filters rarely benefit enough to offset that drop,
- many transcoding libraries use AVX2, not AVX-512.
Broadwell-EP has no AVX-512, so it never pays this frequency penalty: its clocks stay more stable under load and it does not throttle as aggressively.
For Flussonic workloads, AVX-512 rarely provides benefits, and it often harms performance.
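A rough break-even calculation shows why a frequency drop of this size is hard to recover. The model below is a deliberately pessimistic sketch: it assumes the whole process runs at the reduced clock, and `vector_fraction` (share of run time in code that benefits from AVX-512) and `vector_gain` (speedup of that code over the existing AVX2 path) are hypothetical inputs, not measured values.

```python
def net_speedup(base_ghz, avx512_drop_mhz, vector_fraction, vector_gain):
    """Overall speedup from enabling AVX-512 under a simple two-part model.

    A result below 1.0 means the frequency loss outweighs the vector gain.
    """
    reduced_ghz = base_ghz - avx512_drop_mhz / 1000.0
    # Run time at an unchanged clock, normalized so the original time is 1.0.
    time_at_same_clock = (1.0 - vector_fraction) + vector_fraction / vector_gain
    # Everything then slows down in proportion to the clock reduction.
    time_with_avx512 = time_at_same_clock * (base_ghz / reduced_ghz)
    return 1.0 / time_with_avx512


if __name__ == "__main__":
    # Illustrative inputs: 2.5 GHz all-core clock, 600 MHz drop.
    print(net_speedup(2.5, 600, vector_fraction=0.3, vector_gain=1.5))
    print(net_speedup(2.5, 600, vector_fraction=0.9, vector_gain=2.0))
```

With these illustrative inputs, a workload that spends 30% of its time in code that runs 1.5 times faster ends up slower overall, while one that is 90% vectorizable with a 2x gain still benefits. Most streaming pipelines are expected to sit much closer to the first case.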
Thermal behavior: Skylake-SP runs hotter and throttles more often
- Xeon Platinum 8160 TDP: 150 W
- Xeon E5-2697 v4 TDP: 145 W
Despite similar TDP values, Skylake-SP tends to run hotter in practice, especially under AVX load, and in older servers:
- thermal throttling happens frequently,
- coolers struggle with AVX workloads,
- high temperatures reduce all-core turbo levels.
In real tests, E5-2697 v4 often stays at 2.7–2.8 GHz, while 8160 may drop to ~2.2–2.3 GHz.
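Thermal throttling can be confirmed rather than guessed at. On Linux, Intel CPUs usually expose per-core throttle counters under `thermal_throttle/core_throttle_count`. The sketch below assumes those files exist on your system; the helper names are illustrative. It takes two snapshots and reports which cores were throttled in between.

```python
import glob
import os


def read_throttle_counts(sysfs_root="/sys/devices/system/cpu"):
    """Return {cpu_name: cumulative core throttle count} for all readable cores."""
    counts = {}
    pattern = os.path.join(
        sysfs_root, "cpu[0-9]*", "thermal_throttle", "core_throttle_count"
    )
    for path in glob.glob(pattern):
        cpu = os.path.basename(os.path.dirname(os.path.dirname(path)))
        try:
            with open(path) as f:
                counts[cpu] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # counter missing or unreadable on this core
    return counts


def throttle_delta(before, after):
    """Return only the cores whose counter increased between two snapshots."""
    return {
        cpu: after[cpu] - before.get(cpu, 0)
        for cpu in after
        if after[cpu] > before.get(cpu, 0)
    }
```

Take one snapshot before a load test and one after. A non-empty result from `throttle_delta` suggests the cooling, not the CPU model alone, is limiting the all-core clock.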
BIOS and power management matter
Skylake-SP requires:
- disabling deep C-states,
- enabling the Performance power profile,
- disabling NUMA balancing in Linux,
- tuning the frequency scaling governor.
Without these settings, the CPU does not reach its expected performance levels.
E5-2697 v4 is less sensitive to BIOS/OS tuning and works close to optimal out of the box.
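Some of the OS-side settings from the list above can be checked from a script. The sketch below reads a few standard Linux files (`/proc/sys/kernel/numa_balancing`, the cpufreq `scaling_governor` of cpu0, and the `intel_idle` `max_cstate` module parameter) and flags values that differ from the recommendations in this section. It assumes those files exist on your kernel; BIOS settings are not visible this way, and the function names are illustrative.

```python
import os


def read_value(path):
    """Return the stripped file contents, or None if the file cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def tuning_report(root="/"):
    """Collect the current values of a few performance-related settings."""
    paths = {
        "numa_balancing": "proc/sys/kernel/numa_balancing",
        "governor_cpu0": "sys/devices/system/cpu/cpu0/cpufreq/scaling_governor",
        "intel_idle_max_cstate": "sys/module/intel_idle/parameters/max_cstate",
    }
    return {name: read_value(os.path.join(root, rel)) for name, rel in paths.items()}


def tuning_findings(report):
    """Return human-readable notes for settings that differ from the advice above."""
    findings = []
    if report.get("numa_balancing") not in (None, "0"):
        findings.append("automatic NUMA balancing is enabled")
    if report.get("governor_cpu0") not in (None, "performance"):
        findings.append(f"governor is '{report['governor_cpu0']}', not 'performance'")
    return findings


if __name__ == "__main__":
    report = tuning_report()
    print(report)
    for note in tuning_findings(report):
        print("check:", note)
```

The C-state limit is reported but not judged, since the appropriate value depends on the platform and on what the BIOS already enforces.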
What this means for Flussonic and video streaming
For typical Flussonic workloads:
- ingesting 50–200 incoming streams,
- DVR on disk,
- partial transcoding,
- mux/demux,
- edge proxying,
- HLS/DASH segmentation,
per-core performance is far more important than simply having many cores.
As a result, the E5-2697 v4 often delivers:
- more stable throughput,
- lower latency,
- better load distribution,
- and no AVX-512-induced throttling.
Yes, older dual Xeon E5-2697 v4 systems can outperform dual Xeon Platinum 8160 systems, and the explanation is completely logical.
Main reasons:
- Higher and more stable per-core frequency.
- Better behavior for workloads with limited scalability.
- Predictable inter-core latency.
- AVX-512 is often overrated in video workloads.
- Less thermal throttling.
- Less dependency on precise BIOS/OS tuning.
For latency-sensitive and throughput-critical services like Flussonic, Broadwell-EP often remains an optimal and reliable choice, outperforming newer but more temperamental Skylake-SP systems.