Tuesday, February 7, 2017

PENTIUM 4 AND POWERPC CACHE ORGANIZATIONS

  • 80386 – no on chip cache 
  • 80486 – 8k using 16 byte lines and four way set associative organization

    Pentium (all versions) – two on chip L1 caches

  • Data & instructions 


    Pentium 4 – L1 caches

  • 8k bytes 
  • 64 byte lines 
  • four way set associative 

    L2 cache

  • Feeding both L1 caches 
  • 256k 
  • 128 byte lines 
  • 8 way set associative 
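The sizes above fix how many lines and sets each cache has, and how an address splits into tag, set index, and byte offset. A minimal sketch of that arithmetic follows; the class name `CacheGeometry`, its fields, and the sample address are hypothetical, chosen only for illustration, and the sketch assumes power-of-two sizes and a plain set-associative mapping.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CacheGeometry:
    """Hypothetical helper: describes a set-associative cache by its sizes."""
    size_bytes: int
    line_bytes: int
    ways: int

    @property
    def lines(self) -> int:
        # Total number of cache lines.
        return self.size_bytes // self.line_bytes

    @property
    def sets(self) -> int:
        # Lines are grouped into sets of `ways` lines each.
        return self.lines // self.ways

    def split(self, address: int) -> Tuple[int, int, int]:
        """Return (tag, set_index, byte_offset); assumes power-of-two sizes."""
        offset_bits = self.line_bytes.bit_length() - 1
        set_bits = self.sets.bit_length() - 1
        offset = address & (self.line_bytes - 1)
        set_index = (address >> offset_bits) & (self.sets - 1)
        tag = address >> (offset_bits + set_bits)
        return tag, set_index, offset


# Figures taken from the notes above.
L1_DATA = CacheGeometry(size_bytes=8 * 1024, line_bytes=64, ways=4)
L2 = CacheGeometry(size_bytes=256 * 1024, line_bytes=128, ways=8)

if __name__ == "__main__":
    print(L1_DATA.lines, L1_DATA.sets)   # 128 lines in 32 sets
    print(L2.lines, L2.sets)             # 2048 lines in 256 sets
    print(L1_DATA.split(0x12345))        # (36, 13, 5)
```

So the 8k L1 cache holds 128 lines in 32 sets, and the 256k L2 cache holds 2048 lines in 256 sets.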


Figure 4.1 Pentium 4 Block Diagram

    Fetch/Decode Unit

  • Fetches instructions from L2 cache 
  • Decode into micro-ops 
  • Store micro-ops in L1 cache 

    Out of order execution logic

  • Schedules micro-ops 
  • Based on data dependence and resources 
  • May speculatively execute 
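To make "schedules micro-ops based on data dependence and resources" concrete, here is a toy sketch, not the Pentium 4's actual logic. It assumes every micro-op takes one cycle, each register is written by at most one micro-op (as if registers were already renamed), and speculation is ignored. The names `MicroOp`, `schedule`, and `EXAMPLE_OPS` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class MicroOp(NamedTuple):
    name: str
    dest: str                  # register written by this micro-op
    sources: Tuple[str, ...]   # registers or memory operands read


def schedule(ops: List[MicroOp], units: int = 2) -> List[List[str]]:
    """Greedy issue: each cycle, issue up to `units` micro-ops whose sources are ready."""
    produced = {op.dest for op in ops}   # values that some micro-op must compute
    ready = set()                        # values computed in earlier cycles
    pending = list(ops)                  # kept in program order
    cycles = []
    while pending:
        issued = []
        for op in pending:
            if len(issued) == units:     # resource limit: no free execution unit
                break
            # A source is available if computed already or never produced here.
            if all(s in ready or s not in produced for s in op.sources):
                issued.append(op)
        if not issued:
            raise ValueError("no micro-op can issue: circular dependence")
        for op in issued:
            pending.remove(op)
        # Results become visible only in the next cycle.
        ready.update(op.dest for op in issued)
        cycles.append([op.name for op in issued])
    return cycles


EXAMPLE_OPS = [
    MicroOp("load_a", "r1", ("mem_a",)),
    MicroOp("add", "r2", ("r1", "r0")),
    MicroOp("load_b", "r3", ("mem_b",)),
    MicroOp("mul", "r4", ("r3", "r3")),
]

if __name__ == "__main__":
    print(schedule(EXAMPLE_OPS))   # [['load_a', 'load_b'], ['add', 'mul']]
```

With two units, `load_b` issues ahead of `add` even though it comes later in program order, because `add` is still waiting on `r1`.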

    Execution units

  • Execute micro-ops 
  • Data from L1 cache 
  • Results in registers 

    Memory subsystem

  • L2 cache and system bus



Pentium 4 Design Reasoning

  • Decodes instructions into RISC like micro-ops before L1 cache
  • Micro-ops are fixed length 
        Enables superscalar pipelining and scheduling

  • Pentium instructions long & complex 
  • Performance improved by separating decoding from scheduling & pipelining 
        (More later – ch14) 
  • Data cache is write back 
        Can be configured to write through 
  • L1 cache controlled by 2 bits in a control register 
        CD = cache disable 
        NW = not write through 
        2 instructions: INVD to invalidate (flush) the cache, and WBINVD to write back then invalidate
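The effect of the CD and NW bits can be sketched as a small decoder. This is an illustration only: the bit positions (CD = bit 30, NW = bit 29 of x86 control register CR0) and the table of resulting modes are not stated in the notes above and are assumed here from the x86 architecture documentation. The function name `cache_mode` is hypothetical.

```python
# Assumed bit positions in x86 control register CR0 (not given in the notes).
CR0_NW = 1 << 29   # NW = not write through
CR0_CD = 1 << 30   # CD = cache disable


def cache_mode(cr0: int) -> dict:
    """Decode which cache behaviours are enabled for a given CR0 value."""
    cd = bool(cr0 & CR0_CD)
    nw = bool(cr0 & CR0_NW)
    if nw and not cd:
        # CD = 0 with NW = 1 is treated here as an invalid setting.
        raise ValueError("CD=0 with NW=1 is not a valid combination")
    return {
        "cache_fills": not cd,      # new lines are brought into the cache
        "write_throughs": not nw,   # writes are propagated to memory
        "invalidates": not nw,      # invalidation requests are honoured
    }


if __name__ == "__main__":
    for value in (0, CR0_CD, CR0_CD | CR0_NW):
        print(f"CD={int(bool(value & CR0_CD))} NW={int(bool(value & CR0_NW))}", cache_mode(value))
```

With both bits clear the cache runs normally; setting CD alone stops new fills but keeps memory consistent; setting both bits turns off fills, write-throughs, and invalidates.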


PowerPC Cache Organization


    The PowerPC cache organization has evolved with the overall architecture of the PowerPC family, reflecting the relentless pursuit of performance that is the driving force for all microprocessor designers.

The original model

  • 601 – single 32kB 8 way set associative 
  • 603 – 16kB (2 x 8kB) two way set associative 
  • 604 – 32kB 
  • 620 – 64kB 
  • G3 & G4 
        64kB L1 cache – 8 way set associative

        256kB, 512kB or 1M L2 cache – two way set associative



PowerPC Internal Caches


Table 2 PowerPC Internal caches


PowerPC G4



Figure 4.2 PowerPC G4 Block Diagram

    Figure 4.2 provides a simplified view of the PowerPC G4 organization, highlighting the placement of the two caches. The core execution units are two integer arithmetic and logic units, which can execute in parallel, and a floating-point unit with its own multiply, add, and divide components. The data cache feeds both integer and floating-point operations via a load/store unit. The instruction cache, which is read only, feeds into an instruction unit, whose operation is discussed in Chapter 14.

    The L1 caches are eight-way set associative. The L2 cache is a two-way set associative cache with 256K, 512K, or 1MB of memory.


Comparison of Cache Sizes


Table 3 Cache Sizes of Some Processors





Reference: William Stallings. (2003). Computer Organization & Architecture: Designing for Performance (6th ed.): Pentium 4 and PowerPC Cache Organizations (pp. 120-125). Upper Saddle River, NJ: Pearson.
