Cache Operation
How does a cache operate?
By default, most loads and stores to cacheable memory are served by the cache hierarchy. Exceptions include MMIO, uncacheable or write-combining regions, and non-temporal operations. A core looks up a cache line in L1 first and, on a miss, proceeds to the lower levels.
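The lookup order described above can be illustrated with a toy model. This is a minimal sketch, not a description of any real microarchitecture: the `Level` class, the `read` helper, the 64-byte line size, full associativity, LRU replacement, and the tiny capacities are all assumptions chosen for illustration.

```python
from collections import OrderedDict

LINE_SIZE = 64  # bytes per cache line (assumed value)


class Level:
    """One fully associative cache level with LRU replacement (toy model)."""

    def __init__(self, name, capacity_lines):
        self.name = name
        self.capacity = capacity_lines
        self.lines = OrderedDict()  # line address -> True, oldest first

    def lookup(self, line):
        """Return True on a hit and mark the line most recently used."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return True
        return False

    def fill(self, line):
        """Insert a line, evicting the least recently used one if full."""
        if line in self.lines:
            self.lines.move_to_end(line)
            return
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[line] = True


def read(levels, address):
    """Probe each level in order; return the name of whatever served the read."""
    line = address // LINE_SIZE
    served_by, missed = "memory", levels
    for i, level in enumerate(levels):
        if level.lookup(line):
            served_by, missed = level.name, levels[:i]
            break
    # Fill every level that missed, as the line travels back to the core.
    for level in missed:
        level.fill(line)
    return served_by


hierarchy = [Level("L1", 2), Level("L2", 4)]  # hypothetical capacities, in lines
trace = [0, 8, 64, 128, 0]                    # byte addresses
results = [read(hierarchy, address) for address in trace]
print(results)
```

In this trace, address 8 hits in L1 because it shares a line with address 0, and the final read of address 0 is served by L2 because the two-line L1 has evicted it in the meantime.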
sequence plot for a read operation
Can an application programmer directly touch the cache?
Not by directly addressing sets or ways: caches are hardware-managed. However, user-mode code can influence cache behavior with prefetch instructions and cache maintenance or streaming operations (e.g., CLFLUSH/CLFLUSHOPT/CLWB and non-temporal loads/stores on x86), and via OS-exposed page cacheability attributes. Use these carefully; their semantics are architecture- and OS-dependent.
sequence plot for a write operation
Write policy variants
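- Write-through: every write is also forwarded to the next level; lines in this cache are never dirty, at the cost of more write bandwidth.
- No-write-allocate: on a write miss, do not fill the line; write around the cache to the lower level.
- Non-temporal stores: bypass some cache levels and use write-combining buffers; they must still invalidate (or gain ownership of) any cached copy of the line, but they avoid filling lines.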
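The bandwidth trade-off between these variants can be sketched by counting writes that reach the next level. This is a toy model under stated assumptions: the `Cache` class and `run` helper are hypothetical, capacity is unlimited so no evictions occur, and write-back with write-allocate is assumed as the baseline for comparison. Non-temporal stores are not modeled.

```python
class Cache:
    """Single cache level with unlimited capacity (toy model, no evictions)."""

    def __init__(self, write_through, write_allocate):
        self.write_through = write_through
        self.write_allocate = write_allocate
        self.lines = {}             # line address -> dirty flag
        self.next_level_writes = 0  # writes forwarded to the lower level

    def write(self, line):
        hit = line in self.lines
        if not hit and not self.write_allocate:
            # No-write-allocate: write around the cache, do not fill.
            self.next_level_writes += 1
            return
        if self.write_through:
            # Write-through: update the cached copy and forward the write.
            self.lines[line] = False
            self.next_level_writes += 1
        else:
            # Write-back: keep the line dirty until it is flushed or evicted.
            self.lines[line] = True

    def flush(self):
        """Write back every dirty line, as an eviction eventually would."""
        for line in list(self.lines):
            if self.lines[line]:
                self.next_level_writes += 1
                self.lines[line] = False

    def dirty_lines(self):
        return sum(1 for dirty in self.lines.values() if dirty)


def run(cache, trace):
    for line in trace:
        cache.write(line)
    return cache


trace = [0, 0, 0, 1]  # three writes to line 0, one write to line 1

write_back = run(Cache(write_through=False, write_allocate=True), trace)
write_through = run(Cache(write_through=True, write_allocate=True), trace)
write_around = run(Cache(write_through=True, write_allocate=False), trace)

print(write_back.next_level_writes, write_back.dirty_lines())        # 0 2
print(write_through.next_level_writes, write_through.dirty_lines())  # 4 0
print(write_around.next_level_writes, len(write_around.lines))       # 4 0
```

In this model the write-back cache coalesces the three writes to line 0 into a single later write-back, while write-through forwards all four writes immediately. The no-write-allocate run leaves the cache empty because nothing ever reads the lines in.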
Cache Inclusion Policy
Exclusive vs. inclusive caches:
- Inclusive: Lower level (e.g., LLC) is a superset of upper levels. Evicting from LLC invalidates copies in L1/L2, simplifying snoops but duplicating data.
- Exclusive: A line resides in only one level at a time; capacities add, and evictions typically move lines between levels (victim behavior).
- Non-inclusive (NINE): Neither strictly inclusive nor exclusive; common in modern designs. Policy varies by microarchitecture.
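The difference between the inclusive and exclusive policies above can be made concrete with a small two-level simulation. This is a sketch under assumptions, not a model of any specific processor: the `TwoLevel` class is hypothetical, both levels are fully associative with LRU replacement, all lines are clean, and L1 hits do not update LLC recency. The NINE policy is not modeled.

```python
from collections import OrderedDict


class TwoLevel:
    """Toy L1 + LLC hierarchy with an inclusive or exclusive policy."""

    def __init__(self, policy, l1_lines, llc_lines):
        assert policy in ("inclusive", "exclusive")
        self.policy = policy
        self.l1_cap, self.llc_cap = l1_lines, llc_lines
        self.l1 = OrderedDict()   # line -> True, oldest first
        self.llc = OrderedDict()

    def _insert_llc(self, line):
        if len(self.llc) >= self.llc_cap:
            victim, _ = self.llc.popitem(last=False)
            if self.policy == "inclusive":
                # Back-invalidate: L1 may not hold a line the LLC lost.
                self.l1.pop(victim, None)
        self.llc[line] = True

    def _insert_l1(self, line):
        if len(self.l1) >= self.l1_cap:
            victim, _ = self.l1.popitem(last=False)
            if self.policy == "exclusive":
                # Victim behavior: the evicted L1 line moves down to the LLC.
                self._insert_llc(victim)
        self.l1[line] = True

    def access(self, line):
        """Access a line; return which level (or memory) served it."""
        if line in self.l1:
            self.l1.move_to_end(line)
            return "L1"
        if line in self.llc:
            source = "LLC"
            if self.policy == "exclusive":
                del self.llc[line]          # the line moves up, not copied
            else:
                self.llc.move_to_end(line)
        else:
            source = "memory"
            if self.policy == "inclusive":
                self._insert_llc(line)      # fill the LLC as well as L1
        self._insert_l1(line)
        return source

    def unique_lines(self):
        return len(set(self.l1) | set(self.llc))


# Capacity: touch six distinct lines with a 2-line L1 and a 4-line LLC.
inclusive = TwoLevel("inclusive", 2, 4)
exclusive = TwoLevel("exclusive", 2, 4)
for line in range(6):
    inclusive.access(line)
    exclusive.access(line)
print(inclusive.unique_lines(), exclusive.unique_lines())  # 4 6

# Back-invalidation: an LLC eviction removes a line that L1 used recently.
small = TwoLevel("inclusive", 2, 2)
outcome = [small.access(line) for line in (0, 1, 0, 2, 0)]
print(outcome)
```

The first experiment shows capacities adding under the exclusive policy (six unique lines) while the inclusive hierarchy holds only as many unique lines as the LLC (four). In the second, line 0 hits in L1, is then back-invalidated when the LLC evicts it, and must be fetched from memory again.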