Understanding computer fundamentals is critical for system design interviews. This guide covers the core concepts you need to know.

Memory Hierarchy

Types of Memory

Memory types vary by speed, size, and function, creating a multi-layered architecture that balances cost against the need for rapid data access. Understanding the role and capabilities of each memory type lets developers and system architects play to the strengths of each storage layer, improving overall system performance and user experience.

Common Memory Types

Registers

Tiny, ultra-fast storage within the CPU for immediate data access.
  • Speed: Fastest
  • Size: Smallest (bytes)
  • Location: Inside CPU
  • Use Case: Active instruction execution

Cache

Small, quick memory located close to the CPU to speed up data retrieval.
  • Speed: Very fast
  • Size: Small (KB to MB)
  • Levels: L1, L2, L3 cache
  • Use Case: Frequently accessed data

RAM (Main Memory)

Larger, primary storage for currently executing programs and data.
  • Speed: Fast
  • Size: Moderate (GB)
  • Volatile: Data lost on power off
  • Use Case: Active program execution

SSD (Solid-State Drive)

Fast, reliable storage with no moving parts, used for persistent data.
  • Speed: Moderate
  • Size: Large (GB to TB)
  • Persistent: Data retained
  • Use Case: Operating systems, applications

HDD (Hard Disk Drive)

Mechanical drives with large capacities for long-term storage.
  • Speed: Slower
  • Size: Very large (TB)
  • Cost: Lower per GB
  • Use Case: Bulk storage, backups

Cloud Storage

Offsite storage for data backup and archiving, accessible over a network.
  • Speed: Slowest
  • Size: Unlimited
  • Access: Network-dependent
  • Use Case: Backups, archives, disaster recovery
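
The trade-offs above can be made concrete with a small simulation. This is only an illustrative sketch: the latencies are made-up, order-of-magnitude numbers and the layer contents are toy data, but it shows how a read falls through successive layers until the data is found, accumulating cost along the way.

```python
# Each layer: (name, approximate access latency in ns, contents).
# Latencies are illustrative orders of magnitude, not measurements.
LAYERS = [
    ("registers", 1, {}),                               # fastest, tiny
    ("l1_cache", 1, {"x": 10}),
    ("ram", 100, {"x": 10, "y": 20}),
    ("ssd", 100_000, {"x": 10, "y": 20, "z": 30}),      # persistent
]

def read(key):
    """Return (value, total simulated latency) from the first layer holding key."""
    cost = 0
    for name, latency_ns, data in LAYERS:
        cost += latency_ns          # pay this layer's access cost
        if key in data:
            return data[key], cost  # hit: stop descending the hierarchy
    raise KeyError(key)

value, cost = read("z")  # misses registers, L1, and RAM before hitting SSD
```

A value cached near the CPU costs a few simulated nanoseconds, while one that lives only on the SSD pays for every miss above it, which is exactly why hot data is kept high in the hierarchy.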

Process vs Thread

Understanding Programs, Processes, and Threads

To answer this question, let’s first look at what a program is. A program is an executable file containing a set of instructions, stored passively on disk. A process is a program in execution: when a program is loaded into memory and becomes active, it becomes a process. A process requires essential resources such as registers, a program counter, and a stack. One program can have multiple processes; for example, the Chrome browser creates a separate process for every single tab. A thread is the smallest unit of execution within a process.

The Relationship Flow

1

Program Creation

The program contains a set of instructions.
2

Loading into Memory

The program is loaded into memory. It becomes one or more running processes.
3

Thread Execution

When a process starts, it is assigned memory and resources. A process can have one or more threads. For example, in the Microsoft Word app, one thread might handle spell checking while another inserts text into the document.
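
The Microsoft Word analogy can be sketched in Python: two threads inside one process read and write the same memory, here a shared document list. The function names and the "teh" typo are illustrative, not from any real application.

```python
import threading

document = []              # shared state: every thread in the process sees it
lock = threading.Lock()    # coordinates access to the shared memory

def insert_text(words):
    """One thread's job: append words to the shared document."""
    for w in words:
        with lock:
            document.append(w)

def spell_check():
    """Another thread's job: scan the same memory for a known typo."""
    with lock:
        return [w for w in document if w == "teh"]

writer = threading.Thread(target=insert_text, args=(["teh", "cat"],))
checker_result = []
checker = threading.Thread(target=lambda: checker_result.extend(spell_check()))
writer.start()
writer.join()              # wait for the writer before checking
checker.start()
checker.join()
```

Note that neither thread copies the document: both operate on the single `document` list owned by the process, which is exactly the memory-sharing property discussed below.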

Key Differences

Independence

Processes are usually independent, while threads exist as subsets of a process.

Memory

Each process has its own memory space. Threads that belong to the same process share the same memory.

Weight

A process is a heavyweight unit of execution: it takes more time to create and terminate than a thread.

Context Switching

Context switching between processes is more expensive than between threads.
Performance Tip: Inter-thread communication is fast because threads share the same memory space.

Memory Management

Paging vs Segmentation

Paging is a memory management scheme that eliminates the need for contiguous allocation of physical memory. The process’s address space is divided into fixed-size blocks called pages, while physical memory is divided into fixed-size blocks called frames.

Address Translation Process:
  1. Logical Address Space: The logical address (generated by the CPU) is divided into a page number and a page offset.
  2. Page Table Lookup: The page number is used as an index into the page table to find the corresponding frame number.
  3. Physical Address Formation: The frame number is combined with the page offset to form the physical address in memory.
Advantages:
  • Eliminates external fragmentation
  • Simplifies memory allocation
  • Supports efficient swapping and virtual memory
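
The three-step translation above can be sketched directly. The page size and page-table entries below are hypothetical values chosen for illustration; real hardware does this split with bit masks in the memory management unit.

```python
PAGE_SIZE = 4096  # 4 KiB pages, so the offset occupies the low 12 bits

# Hypothetical page table: page number -> frame number
page_table = {0: 5, 1: 2, 2: 7}

def translate(logical_addr):
    page = logical_addr // PAGE_SIZE    # 1. split logical address into page number...
    offset = logical_addr % PAGE_SIZE   #    ...and page offset
    frame = page_table[page]            # 2. page-table lookup: page -> frame
    return frame * PAGE_SIZE + offset   # 3. frame base + offset = physical address
```

For example, logical address 5000 lies in page 1 at offset 904; page 1 maps to frame 2, so the physical address is 2 * 4096 + 904 = 9096.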

Concurrency vs Parallelism

In system design, it is important to understand the difference between concurrency and parallelism. As Rob Pike (one of the creators of the Go language) put it: “Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.” This distinction emphasizes that concurrency is about the design of a program, while parallelism is about its execution.

Concurrency

About Dealing With

Concurrency is about dealing with multiple things at once. It involves structuring a program to handle multiple tasks whose execution can start, run, and complete in overlapping time periods, but not necessarily at the same instant.

Key Points:
  • About program design and composition
  • Can work on single-core processors
  • Useful for I/O-bound operations
  • Enables responsiveness
Use Cases:
  • Web servers handling multiple requests
  • UI applications remaining responsive
  • File I/O operations
  • Network communication
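
As a sketch of I/O-bound concurrency, the example below uses Python's ThreadPoolExecutor to overlap five simulated 0.1-second waits. The fake_io function is an illustrative stand-in for a real network or disk call.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(task_id):
    time.sleep(0.1)      # stand-in for waiting on a network or disk response
    return task_id

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=5) as pool:
    # While one task is blocked waiting, the others make progress.
    results = list(pool.map(fake_io, range(5)))
elapsed = time.perf_counter() - start
# The five 0.1 s waits overlap, so total time is close to 0.1 s, not 0.5 s.
```

This speedup comes purely from overlapping waits, so it works even on a single core: the tasks are concurrent, not necessarily parallel.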

Parallelism

About Doing

Parallelism refers to the simultaneous execution of multiple computations. It is the technique of running two or more tasks at the same time, using multiple processors or cores within a computer to perform several operations literally in parallel.

Key Points:
  • About program execution
  • Requires multiple processing units
  • Useful for CPU-bound operations
  • Increases throughput
Use Cases:
  • Heavy mathematical computations
  • Data analysis
  • Image processing
  • Real-time processing
Concurrency enables a program to remain responsive to input, perform background tasks, and handle multiple operations in a seemingly simultaneous manner, even on a single-core processor.
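
For CPU-bound work, a process pool is the usual tool in Python, since CPython's GIL prevents threads from executing Python bytecode in parallel. The sketch below uses an illustrative prime-counting workload; each chunk can run on its own core.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    # Four independent chunks of pure computation; a process pool can
    # run them on separate cores for true parallelism.
    with ProcessPoolExecutor() as pool:
        totals = list(pool.map(count_primes, [20_000] * 4))
```

The `if __name__ == "__main__"` guard is required on platforms where new processes are spawned rather than forked, because each worker re-imports the module.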

How Computer Programs Execute

How Programs Run

Execution Flow

1

User Interaction

By double-clicking a program, a user is instructing the operating system to launch an application via the graphical user interface.
2

Program Preloading

Once the execution request has been initiated, the operating system locates the program’s executable file through the file system and loads it into memory in preparation for execution.
3

Dependency Resolution

Most modern applications rely on a number of shared libraries, such as dynamic link libraries (DLLs). These dependencies must be resolved and loaded.
4

Memory Allocation

The operating system is responsible for allocating space in memory for the program’s code, data, and stack.
5

Runtime Initialization

After allocating memory, the operating system and execution environment (e.g., Java’s JVM or the .NET Framework) will initialize various resources needed to run the program.
6

Execution Begins

The entry point of a program (usually a function named main) is called to begin execution of the code written by the programmer.
7

Von Neumann Architecture

In the Von Neumann architecture, the CPU executes instructions stored in memory, following the fetch-decode-execute cycle.
8

Program Termination

When the program has completed its task, or the user actively terminates the application, the program begins a cleanup phase. This includes closing open file descriptors, freeing up network resources, and returning memory to the system.
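
The fetch-decode-execute cycle from step 7 can be illustrated with a toy Von Neumann machine, in which instructions and data share a single memory. The instruction set here is invented for illustration, not any real ISA.

```python
def run(memory):
    """Repeat fetch -> decode -> execute until HALT, then return memory."""
    pc = 0                          # program counter
    acc = 0                         # accumulator register
    while True:
        op, arg = memory[pc]        # fetch the instruction at pc
        pc += 1
        if op == "LOAD":            # decode + execute
            acc = memory[arg]
        elif op == "ADD":
            acc += memory[arg]
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory

program = [
    ("LOAD", 4),    # acc = memory[4]
    ("ADD", 5),     # acc += memory[5]
    ("STORE", 6),   # memory[6] = acc
    ("HALT", None),
    2, 3, 0,        # data lives alongside code (addresses 4-6)
]
result = run(program)   # result[6] == 5
```

Because code and data occupy the same memory, the CPU distinguishes them only by how it uses an address, which is the defining trait of the Von Neumann design.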

Key Takeaways

Memory Hierarchy

Understanding the trade-offs between speed, size, and cost across different memory types is crucial for system design.

Process Management

Know when to use processes vs threads based on isolation needs, resource sharing, and performance requirements.

Memory Management

Both paging and segmentation have their place: paging for simplicity and efficiency, segmentation for logical organization.

Concurrency Patterns

Design for concurrency to handle multiple tasks; leverage parallelism to speed up computation-heavy operations.

Interview Tips

Common Pitfall: Don’t confuse concurrency with parallelism. You can have concurrency without parallelism (single-core systems), but parallelism always implies some form of concurrency.
When discussing system design, always consider the memory hierarchy. Decisions about caching, data structures, and access patterns should account for the speed differences between memory levels.
