Listing 2. psprocess Output from the Cache-Unfriendly Version of the Loop PerfSuite Hardware Performance Summary Report Version : 1.0 Created : Thu Feb 19 22:43:01 2004 Generator : psprocess 0.2 XML Source : psrun.22865.xml Processor and System Information ==================================================== Node CPUs : 2 Vendor : Intel Family : Pentium Pro (P6) CPU Revision : 6 Clock (MHz) : 997.173 Memory (MB) : 1510.82 Pagesize (KB) : 4 Cache Information ==================================================== Cache levels : 2 -------------------------------- Level 1 Type : instruction Size (KB) : 16 Linesize (B) : 32 Assoc : 4 Type : data Size (KB) : 16 Linesize (B) : 32 Assoc : 4 -------------------------------- Level 2 Type : unified Size (KB) : 256 Linesize (B) : 32 Assoc : 8 Index Description Counter Value =================================================== 1 Conditional branch instructions........ 52663367 2 Branch instructions.................... 52650952 3 Conditional branch ins mispredicted...... 112009 4 Conditional branch instructions taken.. 52610596 5 Branch target address cache misses........ 31020 6 Requests for excl acc to clean cache line.. 1165 7 Requests for cache line invalidation.......... 0 8 Requests for cache line intervention...... 32801 9 Requests for excl acc to shared cache ln.. 26537 10 Floating point multiply instructions.......... 0 11 Floating point divide instructions............ 0 12 Floating point instructions........... 208155552 13 Hardware interrupts....................... 22134 14 Total cycles........................ 21407855039 15 Instructions issued.................. 2010041200 16 Instructions completed................ 624104056 17 Vector/SIMD instructions...................... 0 18 Level 1 data cache accesses........... 678945043 19 Level 1 data cache misses............. 244760094 20 Level 1 instruction cache accesses.. 21332388384 21 Level 1 instruction cache misses.......... 22546 22 Level 1 instruction cache reads..... 21309322857 23 Level 1 load misses................... 244318153 24 Level 1 store misses....................... 9852 25 Level 1 cache misses.................. 243826788 26 Level 2 data cache reads.............. 243745402 27 Level 2 data cache writes................. 10317 28 Level 2 instruction cache accesses........ 24335 29 Level 2 instruction cache reads........... 21362 30 Level 2 cache misses.................. 212665026 31 Cycles stalled on any resource...... 21057880641 32 Instruction TLB misses....................... 64 Statistics =================================================== Counting domain............................... user Multiplexed.................................... yes Graduated floating point ins. per cycle...... 0.010 Vector ins. per cycle........................ 0.000 Floating point ins per graduated ins ........ 0.334 Vector ins per graduated ins ................ 0.000 Floating point ins per L1 data cache access.. 0.307 Graduated ins per cycle...................... 0.029 Issued ins per cycle......................... 0.094 Graduated ins per issued ins................. 0.310 Issued ins per L1 ins cache miss......... 89152.896 Graduated ins per L1 ins cache miss...... 27681.365 Level 1 ins cache miss ratio................. 0.000 Level 1 data cache access per graduated ins.. 1.088 % floating point ins of all graduated ins... 33.353 % cycles stalled on any resource............ 98.365 Level 1 ins cache misses per issued ins...... 0.000 Level 1 cache read miss ratio (instruction).. 0.000 Level 1 cache miss ratio (data).............. 0.361 Level 1 cache miss ratio (instruction)....... 0.000 Bandwidth used to level 1 cache (MB/s)..... 363.437 Bandwidth used to level 2 cache (MB/s)..... 316.988 MFLIPS (cycles).............................. 9.696 MFLIPS (wall clock).......................... 9.530 MVOPS (cycles)............................... 0.000 MVOPS (wall clock)........................... 0.000 MIPS (cycles)............................... 29.071 MIPS (wall clock)........................... 28.572 CPU time (seconds).......................... 21.469 Wall clock time (seconds)................... 21.843 % CPU utilization........................... 98.285