Computer architecture at a high level

Preface

Ever since I started reading more about malware reverse engineering, the topic of compiler optimization has come up a few times. Being fascinated by the topic, when I had the opportunity to take the computer architecture class for my Masters, I signed up! Most of my time has been spent in the software world, so this was a great chance to get to know hardware a bit better!

Since I’ll be dedicating lots of time to studying this topic over the next semester, I thought I’d write an intro as a review for myself (as I begin to study for the midterm), and for those who may be interested as well!

Computer Hardware

If you have ever built a computer yourself, you probably remember handling chips like the CPU, RAM, etc. When we plug our power cord into the outlet, current flows and gives us power. The number of transistors connected to an output, along with the fabrication technology, determines the capacitance of the wires and transistors. For a fixed task, slowing the clock rate (frequency) reduces power but not energy, since the task just takes proportionally longer to finish. So to save energy and dynamic power, most CPUs now turn off the clock of inactive modules.

Then you get into memory: generally, the more RAM you have, the faster your computer can run programs, because less data has to be swapped out to much slower storage. I started thinking about memory placement like cars in a parking lot: how you park your car, and whether you can find it each time. It's easier if you have ONE spot you always park in and know exactly where to find it each time, versus being able to park anywhere but taking longer to find your car. (In cache terms, that's roughly the trade-off between a direct-mapped and a fully associative design.) Computers struggle with the same search-and-place problems that we do. So the better the architectural design we can come up with to optimize how fast instructions execute and how quickly data in registers can be accessed, the more seamless the user experience will be. So let's talk architecture!
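To make the power-vs-energy distinction concrete, here is a rough sketch using the standard textbook model of dynamic power (ignoring static/leakage power; the symbols are the usual ones, not anything specific to this class):

    P_{\text{dyn}} \approx \alpha \, C \, V^{2} \, f
    E = P_{\text{dyn}} \cdot t \approx \alpha \, C \, V^{2} \cdot (f \cdot t)

Here α is the activity factor, C the switched capacitance, V the supply voltage, and f the clock frequency. For a fixed task, f · t is just the total number of cycles the task needs, which doesn't change when you slow the clock. So power drops but energy stays the same, which is why lowering the voltage (it enters quadratically) or gating the clock off entirely for idle modules are the real energy wins.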

Instruction Set Architecture (ISA)

In order to have a functioning computing system, the instruction set architecture (ISA) serves as the interface between hardware and software. You've probably heard of companies such as Intel and AMD. They both make processors implementing the x86 instruction set with very different internal designs. Depending on the use case, engineers often have to make trade-offs between added overhead and speed (and cost). Intel chips are the ones frequently found in our computers, while ARM-based chips are in most mobile phones.

A lot of the new hotness in computing systems is all about optimization. When you optimize something, you can increase speed, save time, cut power consumption, save money, etc. So there is a lot of active research in this space. Amdahl's Law defines the limit of optimization: overall speedup is bounded by the part of the system that cannot be optimized. There are also physical trade-offs; when optimizing a processor, speeding up one type of instruction can slow down another (incorporating a fast floating-point unit takes die area, and may push another unit, like the ALU, further away).
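For the curious, here is Amdahl's Law written out, with a quick worked example (the numbers are made up purely for illustration):

    \text{Speedup} = \frac{1}{(1 - p) + \frac{p}{s}}

where p is the fraction of execution time the enhancement applies to, and s is the speedup of that fraction. If p = 0.8 and s = 4, the overall speedup is 1 / (0.2 + 0.2) = 2.5x. Even as s goes to infinity, the speedup can never exceed 1 / (1 - p) = 5x; the 20% you can't optimize caps everything.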

The different types of instruction sets focus on different types of optimization. So let's get into it!

RISC

RISC stands for Reduced Instruction Set Computer and uses a load/store architecture, which divides instructions into those that access memory (loads and stores) and those that perform arithmetic logic unit (ALU) operations on registers. RISC came out of Berkeley and was commercialized as SPARC by Sun Microsystems. MIPS is another project based on the RISC approach (out of Stanford) that was also commercialized. There are MANY implementations of RISC, each optimized for different things. So let's talk optimizations! The two main performance techniques are instruction-level parallelism and the use of caches.
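To see what "load/store" means in practice, here is a sketch of how a simple C statement might be handled on a RISC-style machine. The MIPS-flavored mnemonics in the comments are for illustration only, not the exact output of any particular compiler:

    /* a = b + c on a load/store (RISC-style) machine: only loads and
     * stores touch memory; the ALU operation works purely on registers.
     *
     *   lw  $t0, b          # load b from memory into register $t0
     *   lw  $t1, c          # load c from memory into register $t1
     *   add $t2, $t0, $t1   # ALU op, registers only
     *   sw  $t2, a          # store the result back to memory
     *
     * A CISC machine (like x86) could instead perform the add with a
     * memory operand in a single, more complex instruction. */
    int a, b, c;

    void compute(void) {
        a = b + c;
    }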

Optimization techniques

Pipelining

Sometimes, when code runs, one instruction has to wait for another instruction to finish so it can take that result and continue computing its own value. Sometimes, optimization is simply moving instructions around so that instructions which don't depend on each other can run in parallel and save time. Pipelining divides instruction execution into sub-steps (such as fetch, decode, execute, and write-back) so that multiple instructions can properly be in flight at once. On top of that, there are additional techniques such as out-of-order execution, which executes instructions in an order different from the program's, and instruction-level parallelism (ILP), which is a measure of how many instructions in a computer program can be executed simultaneously. Superscalar architectures dispatch individual instructions to be executed independently in different parts of the processor.
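Here is a minimal C sketch of the dependency idea (it illustrates the concept, not any specific CPU). The first loop is one long chain of dependent adds, while the second splits the work into two independent chains that a superscalar, out-of-order core can overlap:

    /* Two ways to sum an array. sum_serial forms one long dependency
     * chain: every add must wait for the previous one to finish.
     * sum_ilp splits the work into two independent accumulators, so a
     * superscalar / out-of-order core can keep both chains in flight
     * at the same time. */
    double sum_serial(const double *a, int n) {
        double s = 0.0;
        for (int i = 0; i < n; i++)
            s += a[i];               /* each add depends on the last */
        return s;
    }

    double sum_ilp(const double *a, int n) {
        double s0 = 0.0, s1 = 0.0;
        int i;
        for (i = 0; i + 1 < n; i += 2) {
            s0 += a[i];              /* chain 0 and chain 1 are       */
            s1 += a[i + 1];          /* independent: they can overlap */
        }
        if (i < n)
            s0 += a[i];              /* handle an odd element count */
        return s0 + s1;
    }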

Principle of Locality

Fun fact: to tie this back to security, in most investigations we pivot off of these same two principles in order to find additional evidence and artifacts. So... same thing with hardware: the concepts of locality apply there too! Programs tend to reuse data and instructions they have used recently (see the sketch after the list below).

  • Temporal locality: locality in time - referenced item will tend to be referenced again soon (loops, reuse)
  • Spatial locality: locality in space - items close to referenced item tend to be referenced soon (array access, straight line code)
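As a concrete (and classic) illustration, here is a C sketch that sums a 2-D array two ways. The row-major loop exploits spatial locality, the running accumulator exploits temporal locality, and the array size is arbitrary:

    #include <stdio.h>

    #define N 1024

    /* C stores 2-D arrays row by row. The row-major loop walks memory
     * sequentially, so every cache line it pulls in gets fully used
     * (spatial locality). The column-major loop jumps N * sizeof(double)
     * bytes between accesses and misses the cache far more often. The
     * running `sum` is touched on every iteration, so it stays in a
     * register; that's temporal locality. */
    static double grid[N][N];

    double sum_row_major(void) {
        double sum = 0.0;
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                sum += grid[i][j];   /* adjacent addresses: cache-friendly */
        return sum;
    }

    double sum_col_major(void) {
        double sum = 0.0;
        for (int j = 0; j < N; j++)
            for (int i = 0; i < N; i++)
                sum += grid[i][j];   /* strided addresses: cache-hostile */
        return sum;
    }

    int main(void) {
        printf("%f %f\n", sum_row_major(), sum_col_major());
        return 0;
    }

Both functions compute the same result; only the order of memory accesses differs, and on a large enough array that ordering alone makes a measurable difference in runtime.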

That’s all I have for this blog (the interesting stuff, anyway!). Enjoy, and I shall go back to studying the (less interesting) stuff! ;)