Wind River Linux Performance Studio
for Intel Architecture

Make Your Embedded Linux Applications Sing

Faster algorithms. Lower power consumption. Increased throughput. Greater responsiveness. It all adds up to a better return on investment.

Performance Studio integrates the latest generation of Intel development tools with Wind River Linux to deliver dramatic performance and productivity enhancements for teams developing embedded applications on Intel® Architecture platforms. Optimizing your code means you can tap the full power of your hardware on the Intel embedded platform of your choice, including platforms running the latest embedded Core™, Xeon®, or Atom™ processors.

Performance Studio consists of the following tools:

  • Intel® C/C++ Compiler, to boost performance on Intel architectures
  • Intel® Integrated Performance Primitives, which provide platform-optimized algorithms, code samples and APIs for high-bandwidth applications
  • Intel® VTune™ Amplifier XE, which delivers actionable analysis of code behavior and performance without having to instrument the source code

All three tools are united in a single build system layer, so Wind River Linux users can install and use them easily. Integration with Wind River Workbench offers enhanced visibility into your software at all stages of the development life cycle.

Performance Studio is designed to optimize performance of CPU throughput–hungry applications like these:

  • Multimedia
  • Data processing
  • Signal processing
  • Cryptography

All Performance Studio components are tested and supported across all Intel board support packages (BSPs), all Linux kernel types (including preempt_rt) and all root file system types (including cgl and small).

Intel C/C++ Compiler for Wind River Linux: Remove the roadblocks to peak performance.

Nobody's happy when an application hangs. For high-bandwidth tasks like streaming video, signal processing, or encrypting and decrypting data, applications that work well depend on code that works well. Performance Studio's Intel C/C++ Compiler (ICC) works out of the box with Wind River Linux to optimize your embedded code for Intel embedded platforms. ICC takes advantage of the latest IA platform optimizations and outperforms other compilers on all Intel embedded platforms, including Atom SoC platforms, particularly for processor-intensive algorithms or algorithms involving floating-point arithmetic. Simply recompiling your code with ICC for Wind River Linux can yield significant performance gains.

Intel C/C++ Compiler for Wind River Linux provides the following:

  • Multi-threading capabilities
  • Optimized performance libraries
  • Error-checking and security
  • Profiling tools

Here are examples of the optimizations you can achieve with ICC:

Architectural Optimizations (Intel Atom, Intel Core)
  • -xavx
  • -xSSE3_ATOM
  • In-order scheduler
  • SIMD support

Vectorizer
  • SIMD parallelism
  • Key to loop performance
  • Great multimedia processing

Interprocedural Optimizations
  • In-lining
  • Passing arguments in registers
  • Dead-code elimination

Profile-guided Optimizations
  • Execution time feedback
  • Iterative optimization process
  • Use-case-driven optimizations
  • Better cache behavior
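
To make the vectorizer entry concrete, here is a minimal sketch of the kind of loop an optimizing compiler can typically turn into SIMD code. The function name scale_add is hypothetical and chosen only for illustration; the two target flags in the comment are the ones listed above, and how they are passed through the Wind River Linux build system is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i] over n elements.
 *
 * The loop has a fixed trip count, unit-stride access, and no
 * dependency between iterations. The 'restrict' qualifiers promise
 * the compiler that x and y do not overlap, which removes a common
 * obstacle to SIMD vectorization.
 *
 * Target flags named in the list above: -xSSE3_ATOM and -xavx.
 * The exact compiler invocation is an assumption left to your
 * build setup. */
void scale_add(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(x != NULL && y != NULL);
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

Whether a given loop is actually vectorized depends on the compiler version, the target flags, and the surrounding code, so treat this as an illustration of the pattern rather than a guarantee.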

Intel Integrated Performance Primitives for Wind River Linux: Smart kids use the library.

With Intel Integrated Performance Primitives (IPP), you don't have to reinvent the wheel—just drive fast in the right direction. Intel IPP provides literally thousands of multi-core-ready, highly optimized algorithms and code samples to speed development of sophisticated high-bandwidth media-processing, data-processing, and signal-processing applications. Intel IPP provides cross-platform APIs to ensure that your code can be reused on current and future Intel embedded platforms.

The Intel IPP libraries are designed to deliver performance beyond what can be achieved with an optimizing compiler by implementing low-level optimizations that take full advantage of specific Intel platform features such as the SSE extensions, the 256-bit Intel AVX SIMD instructions, or other optimized instruction sets on Intel Core, Xeon, and Atom processors.

Intel VTune Amplifier XE for Wind River Linux: Better analysis leads to better performance.

Workbench-integrated VTune gives you additional insight into your software running on Intel embedded hardware, helping you to pinpoint areas that would benefit from optimization. VTune provides the following:

  • Hotspot, concurrency, and profile analysis
  • Low overhead data collection
  • Tuning of threaded and non-threaded code

VTune extends the profiling and analysis capabilities in Workbench, providing you with a powerful toolkit to examine your application algorithms, Wind River Linux, and performance counters on your Intel hardware platform—without requiring special builds.

VTune gives you actionable answers to questions like these:

Where is my application...
Spending Time?
  • Focus tuning on functions taking time
  • See call stacks
  • See time on source
Wasting Time?
  • See cache misses on your source
  • See functions sorted by # of cache misses
Waiting Too Long?
  • See locks by wait time
  • See CPU utilization during wait

Integration Means Your Development Team Gets Optimized, Too

  • Integration with Wind River Linux and Wind River Workbench: Eliminates compiler configuration hassles and expands the toolset for profiling and diagnosing your software

  • 600+ validated user space packages: Automatically cross-compiles user space packages

  • Performance boosts and power savings: Runs applications faster, so your processor can spend more time at low-power idle

  • Instruction set level optimizations: For each compatible processor, the performance primitives detect the instruction set level and dispatch optimized code to take advantage of SIMD instructions

  • Intel IPP libraries templatized for Wind River Linux: Enhances developer productivity with more than 12,000 algorithms for data, image, audio, and signal processing, easy to access and use in Wind River Linux Projects

  • Multi-core support: Includes thread-safe primitives to let you get the most from Intel Architecture multi-core platforms, and VTune Amplifier XE to help you analyze and diagnose multi-core and concurrency issues

  • GNU and GCC compatibility: Recompiles existing code with ICC to take advantage of optimizations available for your Intel Architecture platform

  • Source code usage samples: Shows you how to use the performance primitives and VTune Amplifier XE with Workbench-integrated code samples
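
The instruction-set dispatch mentioned in the list above can be pictured with a small sketch. This is not Intel IPP code or its API: the isa_level enum, select_sum, and both sum routines are hypothetical, and the "tuned" variant is an unrolled loop standing in for a real SIMD implementation. A real library would query the processor at run time instead of taking the level as a parameter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction-set levels, ordered from least to most capable. */
typedef enum { ISA_GENERIC = 0, ISA_SSE3 = 1, ISA_AVX = 2 } isa_level;

/* All variants share one signature so callers never see the difference. */
typedef long (*sum_fn)(const int *data, size_t n);

/* Baseline variant: works on any processor. */
static long sum_generic(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}

/* Stand-in for a tuned variant: four independent accumulators,
 * followed by a scalar loop for the leftover elements. */
static long sum_unrolled4(const int *data, size_t n)
{
    long s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += data[i];
        s1 += data[i + 1];
        s2 += data[i + 2];
        s3 += data[i + 3];
    }
    long total = s0 + s1 + s2 + s3;
    for (; i < n; ++i) {
        total += data[i];
    }
    return total;
}

/* Dispatch: choose the variant once, based on the detected level. */
sum_fn select_sum(isa_level level)
{
    assert(level >= ISA_GENERIC && level <= ISA_AVX);
    return (level >= ISA_SSE3) ? sum_unrolled4 : sum_generic;
}
```

A caller would pick the function once at startup, for example with select_sum(ISA_AVX), and reuse the returned pointer, so the selection cost is not paid on every call.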

Let the Performance Games Begin

The tools that make up Wind River Linux Performance Studio for Intel Architecture were rigorously tested throughout the development process. In addition, we performed preliminary benchmarking using the Phoronix Test Suite. All benchmarks were run on Intel hardware, and all compare the performance of ICC with that of the GCC compiler.

Our benchmarking study produced these conclusions:

  • Applications with Computationally Intense Workload (CIW) characteristics benefit from Performance Studio
    • ICC is designed for compute-intensive applications
    • Applications that are I/O-based can reap performance benefits when Intel IPP is used in conjunction with ICC

We invite users of Wind River Linux Performance Studio for Intel Architecture to run additional benchmarks and share the results with us.

The C-Ray Benchmark

This is a test of C-Ray, a simple ray tracer designed to test floating-point CPU performance. The test is multi-threaded (16 threads per core), shoots 8 rays per pixel for anti-aliasing, and generates a 1600x1200 image.
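
To give a sense of the workload size, the parameters above can be multiplied out. The helper names below are hypothetical and only illustrate the arithmetic; the core count is left as a parameter because it depends on the hardware under test.

```c
#include <assert.h>

/* Primary rays cast for one frame: one ray per anti-aliasing
 * sample per pixel. */
unsigned long primary_rays(unsigned long width, unsigned long height,
                           unsigned long rays_per_pixel)
{
    assert(width > 0 && height > 0 && rays_per_pixel > 0);
    return width * height * rays_per_pixel;
}

/* Worker threads started for a given core count. */
unsigned long worker_threads(unsigned long cores,
                             unsigned long threads_per_core)
{
    return cores * threads_per_core;
}
```

With the settings used here, primary_rays(1600, 1200, 8) evaluates to 15,360,000 primary rays per rendered image.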

In this test, we compared the performance of ICC versus the GCC compiler.

Measured in seconds; fewer is better

C-Ray Benchmark

(Hardware: Intel Atom E660T @1.30 GHz; OS: Wind River Linux, glibc_std; Std. Err (GCC): +/- 0.11, Std. Err (ICC): +/- 0.01)

The Himeno Benchmark

The Himeno benchmark is a linear solver for the pressure Poisson equation in fluid analysis, using the point-Jacobi method. It measures how many million floating-point operations per second (MFLOPS) can be processed.
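
For readers unfamiliar with the point-Jacobi method, the sketch below applies it to a one-dimensional Poisson problem, -u'' = f, on a uniform grid. It is a simplified illustration under that assumption and not the Himeno source, which works on a three-dimensional grid; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One Jacobi sweep for -u'' = f on a uniform grid with spacing h.
 * Every interior point is recomputed from the OLD values of its two
 * neighbours, which is what distinguishes Jacobi from Gauss-Seidel.
 * The two boundary values are copied unchanged.
 * Returns the largest absolute change made to any point. */
double jacobi_sweep(size_t n, double h, const double *f,
                    const double *u_old, double *u_new)
{
    assert(n >= 3);
    double max_change = 0.0;
    u_new[0] = u_old[0];
    u_new[n - 1] = u_old[n - 1];
    for (size_t i = 1; i + 1 < n; ++i) {
        u_new[i] = 0.5 * (u_old[i - 1] + u_old[i + 1] + h * h * f[i]);
        double change = u_new[i] - u_old[i];
        if (change < 0.0) {
            change = -change;
        }
        if (change > max_change) {
            max_change = change;
        }
    }
    return max_change;
}

/* Repeats sweeps until the largest change drops below tol or
 * max_iters sweeps have run. The solution is left in u; scratch is
 * caller-provided workspace of n elements.
 * Returns the number of sweeps performed. */
size_t jacobi_solve(size_t n, double h, const double *f, double *u,
                    double *scratch, double tol, size_t max_iters)
{
    size_t iters = 0;
    while (iters < max_iters) {
        double change = jacobi_sweep(n, h, f, u, scratch);
        for (size_t i = 0; i < n; ++i) {
            u[i] = scratch[i];
        }
        ++iters;
        if (change < tol) {
            break;
        }
    }
    return iters;
}
```

The inner loop is dominated by floating-point additions and multiplications over neighbouring array elements, which is why a benchmark built on this method is sensitive to how well the compiler schedules and vectorizes floating-point code.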

In this test, we measured the performance of ICC versus the GCC compiler.

Measured in MFLOPS; more is better

Himeno Benchmark

(Hardware: Intel Atom E660T @1.30 GHz; OS: Wind River Linux, glibc_std; Std. Err (GCC): +/- 0.06, Std. Err (ICC): +/- 0.06)

OpenSSL Benchmark

OpenSSL is an open-source toolkit that implements SSL (Secure Sockets Layer) and TLS (Transport Layer Security) protocols. This test measures the RSA 4096-bit performance of OpenSSL.

In this test, we measured the performance of ICC versus the GCC compiler, and the performance of ICC and Intel IPP versus the GCC compiler.

Measured in signing operations per second; more is better

1. This test shows little difference in compiler performance.

OpenSSL Benchmark

(Hardware: Intel Xeon (Sandy Bridge) @2.00 GHz; OS: Wind River Linux, glibc_std; Std Error: 0.08, Std. Dev: 0.54%)

2. This test shows significant performance enhancement when ICC is used with Intel IPP.

OpenSSL Benchmark

(Hardware: Intel Xeon 5645 @2.40 GHz; OS: Wind River Linux, glibc_std; Std Error: 0.03, Std. Dev: 0.13%)

Learn more: Boosting OpenSSL AES Encryption with Intel IPP