CPU Caches

A cache is a high-speed storage layer that holds a subset of data, typically transient, so that future requests for that data are served faster than they would be from the data's primary storage location.

Info

In modern computer systems, CPU caches are typically implemented in SRAM, because SRAM has lower access latency than the DRAM used for main memory. Frequently accessed data is kept in the cache so that it can be retrieved quickly. Loosely speaking, the SRAM serves as an extension of the processor's register set, although unlike registers it is managed by the hardware rather than addressed directly by the program.
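To make "frequently accessed data is served from the cache" concrete, here is a minimal sketch of a toy cache that only counts hits and misses. The direct-mapped organisation, the 64-byte line size, the 8-line capacity, and the names `DirectMappedCache` and `access` are all assumptions chosen for illustration; they do not come from the source, and real caches are far larger and usually set-associative.

```python
LINE_SIZE = 64  # bytes per cache line (assumed)
NUM_LINES = 8   # lines in the cache (assumed, unrealistically small)


class DirectMappedCache:
    """Toy direct-mapped cache that tracks hits and misses, not data."""

    def __init__(self, num_lines=NUM_LINES, line_size=LINE_SIZE):
        self.num_lines = num_lines
        self.line_size = line_size
        self.tags = [None] * num_lines  # tag held by each line, None = empty
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Look up a byte address; return True on a hit, False on a miss."""
        block = address // self.line_size  # memory block containing the byte
        index = block % self.num_lines     # the single line this block may use
        tag = block // self.num_lines      # tells apart blocks mapping to that line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: the block would be fetched from slower memory,
        # replacing whatever occupied the line before.
        self.tags[index] = tag
        self.misses += 1
        return False


cache = DirectMappedCache()
# Read a 256-byte region twice, 4 bytes at a time.
for _ in range(2):
    for addr in range(0, 256, 4):
        cache.access(addr)
print(cache.hits, cache.misses)  # 124 4
```

Only the first touch of each 64-byte line misses; every other access, including the whole second pass, is a hit. In this model, two addresses that are `NUM_LINES * LINE_SIZE` bytes apart compete for the same line and evict each other.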

A minimum cache configuration

Source: what-every-programmer-should-know-about-memory, p. 14

A modern cache configuration

![[what-every-programmer-should-know-about-memory.pdf#page=15&rect=84,588,267,767&color=note]]

A modern configuration of memory cache with SMP architecture

![[what-every-programmer-should-know-about-memory.pdf#page=15&rect=58,222,294,371&color=note]]

In this figure, we have two processors, each with two cores (the darker gray blocks), each of which has two threads. The two threads of a core share that core's level 1 caches (L1i and L1d). All cores of a processor share the higher-level caches, such as L2 and L3. The two processors do not share any caches with each other.
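The sharing rules described for this figure can be sketched as a small lookup. This is only a model of the figure as described here, not of any particular CPU: the `HwThread` tuple and the `closest_shared_cache` function are hypothetical names, and many real processors give each core a private L2.

```python
from typing import NamedTuple, Optional


class HwThread(NamedTuple):
    """A hardware thread identified by its position in the figure."""
    processor: int
    core: int
    thread: int


def closest_shared_cache(a: HwThread, b: HwThread) -> Optional[str]:
    """Return the lowest cache level both hardware threads share, if any."""
    if a.processor != b.processor:
        return None  # separate processors share no cache, only main memory
    if a.core != b.core:
        return "L2"  # cores of one processor share the higher-level caches
    return "L1"      # threads of one core share that core's L1i and L1d


print(closest_shared_cache(HwThread(0, 0, 0), HwThread(0, 0, 1)))  # L1
print(closest_shared_cache(HwThread(0, 0, 0), HwThread(0, 1, 0)))  # L2
print(closest_shared_cache(HwThread(0, 0, 0), HwThread(1, 0, 0)))  # None
```

The closer the shared level, the cheaper it is for two threads to exchange data through the cache hierarchy.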

Relative access latency by memory type (lowest first)

Register < SRAM (cache) < DRAM (main memory)