Caches: the Theory
The theory of caching is very simple. Put a small amount of fast, expensive memory in a computer, and arrange for that memory automatically to store the data which are accessed frequently. One can then define a cache hit rate, that is, the number of memory accesses which are satisfied by the cache divided by the total number of memory accesses. This is usually expressed as a percentage.
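The hit rate definition above can be written out as a small calculation. This is only an illustrative sketch: the function name hit_rate and the example counts are assumptions made for the illustration, not measurements from any real machine.

```python
def hit_rate(hits, total_accesses):
    """Return the cache hit rate as a percentage.

    hits:           memory accesses satisfied by the cache
    total_accesses: all memory accesses, hits and misses together
    """
    if total_accesses <= 0:
        raise ValueError("total_accesses must be positive")
    if not 0 <= hits <= total_accesses:
        raise ValueError("hits must lie between 0 and total_accesses")
    return 100.0 * hits / total_accesses


# Hypothetical counts: 950 of 1000 accesses found in the cache.
print(hit_rate(950, 1000))  # prints 95.0
```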
One of the reasons for the expense of the fast SRAM used in caches is that it requires around six transistors per bit, not the single transistor (plus a capacitor) per bit of the DRAM used for main memory.
The first paper to describe caches was published in 1965 by Maurice Wilkes (Cambridge).
The first commercial computer to use a cache was the IBM 360/85 in 1968.

Caches: the Problem
Main memory may be slow compared to a CPU, but it is not that slow. Any cache control logic must be very fast (and therefore very simple) or there will be no improvement in speed.



A badly designed cache controller can be worse than no cache at all.
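A rough way to see this is to estimate the average access time. The sketch below assumes a deliberately simple model in which every access first pays the cache lookup time and a miss then pays the full main memory time as well. The function name and all the timings are hypothetical, chosen only to illustrate the point.

```python
def average_access_time(hit_fraction, cache_time_ns, memory_time_ns):
    """Average access time in ns under a simple sequential model.

    Assumed model: every access costs cache_time_ns for the lookup,
    and each miss additionally costs memory_time_ns.
    hit_fraction is the hit rate as a fraction between 0 and 1.
    """
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be between 0 and 1")
    miss_fraction = 1.0 - hit_fraction
    return cache_time_ns + miss_fraction * memory_time_ns


MEMORY_NS = 60.0  # hypothetical main memory access time, no cache

# A fast, simple controller: about 11 ns on average, far better than 60 ns.
fast = average_access_time(0.9, 5.0, MEMORY_NS)

# A slow controller: about 61 ns on average, worse than having no cache.
slow = average_access_time(0.9, 55.0, MEMORY_NS)

print(fast < MEMORY_NS, slow > MEMORY_NS)  # prints True True
```

Under this model a cache only helps when the lookup time is less than the hit fraction multiplied by the main memory time, so even a 90% hit rate cannot rescue a controller that is nearly as slow as main memory.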