I've read that CPU caches have a fixed line size; e.g., Haswell was rumored to expand the cache line size to 128 B, compared with the 64 B lines of Sandy Bridge and Ivy Bridge.
What is the smallest block of data that can be read from dual-channel DDR3? If this block is larger than 64 B, what mechanisms are in place to optimize cache line fills? For example, if the CPU intends to fill two cache lines from a region of RAM that can be read in a single access/burst, will the memory controller "waste" cache lines on data it doesn't intend to use, so that only a single RAM access is needed? Or will the controller discard the unwanted part of the burst, optimizing cache utilization instead? Or is there some other scheme I haven't considered?
Is the current Hennessy/Patterson the best resource here, or does my 2005 Patterson/Hennessy already cover the basic design methodology? Or would some other text be better?