What happens when we increase cache memory size?
How does it impact my system performance?
Does increasing cache size/RAM always improve my system performance?
The answers to these questions are not straightforward, because they depend on several factors such as the type of application we run and the processor architecture. Trying out different hardware combinations (in this case, cache configurations) on actual systems is expensive and often impractical. In such cases, we can use computer architectural simulators to analyze various aspects of computer systems.
A computer architectural simulator is a software tool used to model and mimic the behavior of a computer system’s architecture. It allows researchers, developers, and engineers to study and analyze various aspects of computer systems, such as processors, memory hierarchies, and interconnections, without needing physical hardware. Simulators provide a platform to design, experiment with, and evaluate computer systems in a virtual environment.
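To make the idea concrete, here is a toy sketch of what "modeling a cache in software" can look like. It is not how gem5 works internally; it is a minimal fully associative LRU cache model, and the workload (a loop that repeatedly walks a 16 KiB array) is a made-up assumption chosen for illustration.

```python
from collections import OrderedDict

def miss_rate(addresses, cache_lines, line_bytes=64):
    """Simulate a fully associative LRU cache and return its miss rate."""
    cache = OrderedDict()  # keys are line numbers, least recently used first
    misses = 0
    for addr in addresses:
        line = addr // line_bytes
        if line in cache:
            cache.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses / len(addresses)

# Hypothetical workload: walk a 256-line (16 KiB) array ten times.
trace = [i * 64 for i in range(256)] * 10

for lines in (64, 128, 256, 512):
    size_kib = lines * 64 // 1024
    print(f"{size_kib:3d} KiB cache: miss rate {miss_rate(trace, lines):.2f}")
```

For this particular access pattern, doubling the cache from 4 KiB to 8 KiB does not help at all, and growing it beyond 16 KiB does not help either; only the step that makes the working set fit changes the miss rate. This is why the answer to "does a bigger cache always help?" depends on the application.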
A computer architectural simulator helps users to:
Based on the simulation behavior simulators can be classified into:
Example: Simulating the sequence of instructions executed by a CPU.
Example: Evaluating the latency and throughput of a pipeline.
Example: Testing how an operating system performs on new hardware.
Let us look at some of the popular computer architectural simulators used for academic and industrial research:
In this blog we will discuss gem5 further as it stands out among these simulators due to several key advantages:
gem5 is a state-of-the-art open-source computer architecture simulator widely used in academia and industry for modeling and evaluating computer systems. It provides a flexible and modular framework for simulating diverse architectures, from simple single-core systems to complex multi-core and heterogeneous setups. gem5 is primarily used for research and development in computer architecture, system software, and hardware-software co-design. It is written primarily in C++ and Python. It can simulate a system with devices and an operating system in full system mode (FS mode), or user-space-only programs, where system services are provided directly by the simulator, in syscall emulation mode (SE mode). gem5 supports executing Alpha, ARM, MIPS, Power, SPARC, RISC-V, and 64-bit x86 binaries on CPU models including two simple single-CPI models, an out-of-order model, and an in-order pipelined model. It can also run precompiled binaries for performance evaluation.
gem5 provides two memory models for simulating memory systems: Classic and Ruby. The table below summarizes their key features.
Feature | Classic Model | Ruby Model |
Cache Coherence Protocols | Predefined (MOESI, MESI) | Fully customizable |
Ease of Use | Simple to configure and use | Complex, requires expertise |
Simulation Speed | Faster | Slower |
Flexibility | Limited | High |
Custom Protocol Support | No | Yes |
Use Case | General-purpose simulations | Advanced research and experiments |
Let’s discuss how to use gem5 and try some small exercises to get familiar with the tool.
First, we need to install gem5. The following steps walk through the installation.
Step 1: Install dependencies
sudo apt install build-essential git m4 scons zlib1g zlib1g-dev libprotobuf-dev protobuf-compiler libprotoc-dev libgoogle-perftools-dev python3-dev
Step 2: Clone gem5 repo
git clone https://github.com/gem5/gem5
Step 3: Build the system
We can build gem5 for any supported ISA; here I use RISC-V as an example:
scons build/RISCV/gem5.opt -j9
Now, let’s try out some experiments with the RISCV system we just built and analyze its performance.
Let’s start by measuring two metrics: the level 2 (L2) cache miss rate and IPC (Instructions Per Cycle). We will change the L2 cache size and observe the impact on both.
For this, we need to select an application and build its binary for the required ISA, in this case RISC-V.
I used Canneal from the PARSEC benchmark suite. The binaries need to be built with the RISC-V toolchain.
You can find the source code and the steps to build RISC-V binaries for various applications, including Canneal, at the link below:
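As a quick reminder of how these two metrics are defined, here is a small sketch. The counter values are made-up numbers used only to show the arithmetic; they are not results from any gem5 run.

```python
def ipc(instructions, cycles):
    """Instructions Per Cycle: committed instructions divided by elapsed cycles."""
    return instructions / cycles

def miss_rate(misses, accesses):
    """Fraction of cache accesses that missed."""
    return misses / accesses

# Hypothetical counters, for illustration only.
instructions, cycles = 1_000_000, 2_500_000
l2_misses, l2_accesses = 300, 1_200

print(f"IPC          = {ipc(instructions, cycles):.2f}")
print(f"L2 miss rate = {miss_rate(l2_misses, l2_accesses):.2f}")
print(f"L2 hit rate  = {1 - miss_rate(l2_misses, l2_accesses):.2f}")
```

A higher IPC means the CPU completes more work per clock cycle, and the hit rate is simply one minus the miss rate.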
https://github.com/RALC88/riscv-vectorized-benchmark-suite
After building the benchmark binaries, run them with different cache sizes; in this case, I am experimenting with the L2 cache size. To run the Canneal benchmark with a 512 kB L2 cache, use the following command:
./build/RISCV/gem5.opt configs/deprecated/example/se.py --cmd=/home/siva/gem5/canneal_serial.exe --options="1 15000 2000 input_can/200000.nets 64" --caches --l2cache --l2_size=512kB --cpu-type=RiscvO3CPU
The available command-line arguments are defined in the gem5/configs/common/Options.py file.
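To repeat this run for several L2 sizes, a small driver script can help. The following is a minimal sketch and not part of gem5. It reuses only the flags from the command above, plus gem5's --outdir option so that each run writes its statistics to a separate directory. The list of sizes and the output directory names are illustrative assumptions.

```python
import shlex

GEM5 = "./build/RISCV/gem5.opt"
CONFIG = "configs/deprecated/example/se.py"

def build_command(l2_size, binary, options, cpu_type="RiscvO3CPU"):
    """Return the gem5 argument list for one run with the given L2 size."""
    return [
        GEM5,
        f"--outdir=m5out_l2_{l2_size}",  # assumed naming scheme, one dir per run
        CONFIG,
        f"--cmd={binary}",
        f"--options={options}",
        "--caches",
        "--l2cache",
        f"--l2_size={l2_size}",
        f"--cpu-type={cpu_type}",
    ]

# Illustrative sweep; pick the sizes that matter for your experiment.
for size in ("256kB", "512kB", "1MB", "2MB"):
    cmd = build_command(
        size,
        "/home/siva/gem5/canneal_serial.exe",
        "1 15000 2000 input_can/200000.nets 64",
    )
    print(shlex.join(cmd))
    # To actually launch the run from the gem5 directory:
    # import subprocess; subprocess.run(cmd, check=True)
```

The script only prints the commands by default, so you can inspect them before starting long simulations.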
Once the simulation ends you can check the stats file (gem5/m5out/stats.txt) for required parameters. I have given the values for the L2 hit rate and IPC for different L2 cache sizes as references. The results demonstrate the impact of cache size on IPC and hit rate for a given application.
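Reading values out of stats.txt by hand gets tedious across many runs. Below is a minimal parsing sketch, assuming the usual stats.txt layout of a name, a value, and a `#` comment on each line. The stat names in the sample (system.cpu.ipc, system.l2.overallMissRate::total) and their values are assumptions for illustration; the exact names differ between gem5 versions and configurations, so check your own stats.txt.

```python
def parse_stats(text):
    """Parse gem5 stats text into a {name: float} dictionary."""
    stats = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop the trailing description
        if len(fields) < 2:
            continue
        try:
            stats[fields[0]] = float(fields[1])
        except ValueError:
            continue  # skips banner lines and non-numeric entries
    return stats

def find(stats, fragment):
    """Return all stats whose name contains the given fragment."""
    return {name: val for name, val in stats.items() if fragment in name}

# Made-up sample in the stats.txt layout; names and values are hypothetical.
SAMPLE = """
---------- Begin Simulation Statistics ----------
system.cpu.ipc                          0.500000    # instructions per cycle
system.l2.overallMissRate::total        0.250000    # miss rate for overall accesses
---------- End Simulation Statistics   ----------
"""

stats = parse_stats(SAMPLE)
for name, miss in find(stats, "MissRate").items():
    print(f"{name}: miss rate {miss:.2f}, hit rate {1 - miss:.2f}")
print(find(stats, "ipc"))

# For a real run: stats = parse_stats(open("m5out/stats.txt").read())
```

Searching by name fragment rather than by a fixed key keeps the script usable when a gem5 release renames a statistic.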
In the same way, we can verify other architectural concepts and evaluate our own designs.
The gem5 simulator is a versatile and powerful tool for computer architecture research, enabling detailed simulation and analysis of hardware and software interactions. Its flexibility, support for multiple ISAs, and modular design make it invaluable for exploring emerging technologies, optimizing performance, and studying complex systems.
The gem5 community has been active over the last few years, and the simulator is frequently updated with new features. As a next exercise to get started with gem5, try comparing different CPU models (Timing, Atomic, and Out-of-Order) and analyze the difference in their IPC values.
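One possible way to set up this comparison is to generate one command per CPU model and keep everything else fixed. This is a sketch under assumptions: RiscvO3CPU is the name used in the command earlier, while RiscvAtomicSimpleCPU and RiscvTimingSimpleCPU are assumed names that you should verify against your gem5 build (see configs/common/Options.py). The output directory names are also illustrative.

```python
import shlex

# RiscvO3CPU comes from the earlier command; the other two names are
# assumptions to verify against your gem5 version.
CPU_TYPES = ["RiscvAtomicSimpleCPU", "RiscvTimingSimpleCPU", "RiscvO3CPU"]

def cpu_sweep(binary, options, l2_size="512kB"):
    """Return one gem5 command line (as a string) per CPU type."""
    commands = []
    for cpu in CPU_TYPES:
        args = [
            "./build/RISCV/gem5.opt",
            f"--outdir=m5out_{cpu}",  # assumed naming, one directory per model
            "configs/deprecated/example/se.py",
            f"--cmd={binary}",
            f"--options={options}",
            "--caches", "--l2cache", f"--l2_size={l2_size}",
            f"--cpu-type={cpu}",
        ]
        commands.append(shlex.join(args))
    return commands

for line in cpu_sweep("/home/siva/gem5/canneal_serial.exe",
                      "1 15000 2000 input_can/200000.nets 64"):
    print(line)
```

After the runs finish, compare the IPC reported in each directory's stats.txt.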
Feel free to reach out to our team for any further discussion. Write to us at sales@vayavyalabs.com or you can reach us here.