1 | How to profile R4300 instructions with mupen64plus: |
2 | |
3 | Prerequisites: |
4 | - A 32-bit (x86) or 64-bit (amd64) Linux system |
5 | - Plenty of memory: a machine with 2GB of RAM runs without swapping, but one with 1GB probably would swap |
6 | - OProfile (http://oprofile.sourceforge.net/) |
7 | - Mupen64plus source code |
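The prerequisites above can be checked before starting. The following is a minimal sketch, not part of mupen64plus: the helper names (enough_memory, have_tool, check_prereqs) are hypothetical, and the 2000000 kB threshold is an assumption derived from the 2GB note, with some slack for memory the kernel reserves. It reads MemTotal from /proc/meminfo and looks for the commands the procedure uses.

```shell
#!/bin/sh
# Hypothetical helpers for checking the prerequisites listed above.

# enough_memory KB: succeed if KB (MemTotal in kB) is roughly 2GB or more.
# The threshold is an assumption based on the 2GB note, not a measured limit.
enough_memory() {
    [ "$1" -ge 2000000 ]
}

# have_tool NAME: succeed if NAME is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Print a short report; reads /proc/meminfo, so Linux only.
check_prereqs() {
    mem_kb=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo 2>/dev/null)
    if enough_memory "${mem_kb:-0}"; then
        echo "memory: ok (${mem_kb} kB)"
    else
        echo "memory: low (${mem_kb:-unknown} kB), expect swapping"
    fi
    for tool in opcontrol opreport gcc make; do
        if have_tool "$tool"; then
            echo "$tool: found"
        else
            echo "$tool: missing"
        fi
    done
}
```

Calling check_prereqs from the top of the source tree prints one line per check; nothing is installed or changed.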
8 | |
9 | Procedure: |
10 | 1. Install the OProfile Linux tool |
11 | |
12 | 2. Build the r4300 profile tool with "gcc -o r4300prof r4300prof.c" |
13 | |
14 | 3. Build mupen64plus with "make DBGSYM=1 DBG_PROFILE=1" |
15 | |
16 | 4. Delete any pre-existing profiling files: "rm instructionaddrs.dat" |
17 | |
18 | 5. Clear any residual profiling data in oprofile: "sudo opcontrol --reset" |
19 | |
20 | 6. Start a profiling run by typing in a console: |
21 | sudo opcontrol --start ; ./mupen64plus --nogui --emumode 2 --audio ./plugins/dummyaudio.so <path-to-n64-rom> ; sudo opcontrol --stop |
22 | |
23 | 7. Exit the emulator with the Escape key after running for the desired time |
24 | |
25 | 8. Move the data file into the tools folder: "mv instructionaddrs.dat tools/" |
26 | |
27 | 9. Dump instruction-level profiling data for mupen64plus with oprofile: |
28 | opreport -d -l ./mupen64plus > ./tools/prof-mupen64-detail.txt |
29 | |
30 | 10. Run the tool to generate the r4300 instruction profile report: |
31 | cd tools ; ./r4300prof instructionaddrs.dat prof-mupen64-detail.txt |
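The run itself, from deleting old data through generating the report, can be collected into one script. This is a sketch under assumptions: the function names (run, run_sh, profile_rom) and the DRY_RUN variable are hypothetical, both tools are assumed to be built already, and the script is assumed to start in the directory containing ./mupen64plus. The commands are the ones listed in the procedure, except that rm is given -f so a missing data file is not an error.

```shell
#!/bin/sh
# Hypothetical wrapper around the profiling procedure above.
# Set DRY_RUN=1 to print each command instead of executing it.

# run CMD ARGS...: execute a simple command, or print it in dry-run mode.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# run_sh STRING: same, for command lines that need redirection or cd.
run_sh() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $1"
    else
        sh -c "$1"
    fi
}

# profile_rom ROM: one full profiling run for the given ROM path.
profile_rom() {
    rom=$1
    run rm -f instructionaddrs.dat          # drop stale address data
    run sudo opcontrol --reset              # clear old oprofile samples
    run sudo opcontrol --start
    # Blocks until the emulator is closed with the Escape key.
    run ./mupen64plus --nogui --emumode 2 --audio ./plugins/dummyaudio.so "$rom"
    run sudo opcontrol --stop
    run mv instructionaddrs.dat tools/
    run_sh "opreport -d -l ./mupen64plus > ./tools/prof-mupen64-detail.txt"
    run_sh "cd tools && ./r4300prof instructionaddrs.dat prof-mupen64-detail.txt"
}
```

Setting DRY_RUN=1 before calling profile_rom with a ROM path prints the eight commands without touching oprofile, which is a quick way to review the sequence before running it with sudo.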
32 | |
33 | |
34 | Example profile output: |
35 | |
36 | Loading instructionaddrs.dat... |
37 | 283844 r4300 instruction locations read. |
38 | 247911 non-empty MIPS instructions. |
39 | Loading prof-mupen64-detail.txt... |
40 | 118181 lines in sample data file. |
41 | Found 117905 profile hits. |
42 | |
43 | Instruction time (samples): |
44 | reserved: 00007515 NI: 00000000 J: 00000043 JAL: 00003345 BEQ: 00004498 |
45 | BNE: 00003424 BLEZ: 00000222 BGTZ: 00000029 ADDI: 00000585 ADDIU: 00017439 |
46 | SLTI: 00000765 SLTIU: 00000089 ANDI: 00003847 ORI: 00000381 XORI: 00000035 |
47 | LUI: 00010389 BEQL: 00001873 BNEL: 00002617 BLEZL: 00000013 BGTZL: 00000010 |
48 | DADDI: 00000000 DADDIU: 00000000 LDL: 00000000 LDR: 00000000 LB: 00002600 |
49 | LH: 00006653 LW: 00024840 LWL: 00000001 LBU: 00003525 LHU: 00004954 |
50 | LWU: 00000000 LWR: 00000003 SB: 00003007 SH: 00004133 SW: 00023221 |
51 | SWL: 00000000 SWR: 00000000 SDL: 00000000 SDR: 00000000 LWC1: 00012405 |
52 | LDC1: 00001855 LD: 00000815 LL: 00000000 SWC1: 00007671 SDC1: 00001326 |
53 | SD: 00001233 SC: 00000000 BLTZ: 00000383 BGEZ: 00000331 BLTZL: 00000161 |
54 | BGEZL: 00000168 BLTZAL: 00000000 BGEZAL: 00000115 BLTZALL: 00000000 BGEZALL: 00000000 |
55 | SLL: 00003376 SRL: 00000604 SRA: 00000686 SLLV: 00000015 SRLV: 00000039 |
56 | SRAV: 00000000 JR: 00001358 JALR: 00000004 SYSCALL: 00000000 MFHI: 00000058 |
57 | MTHI: 00000002 MFLO: 00000277 MTLO: 00000004 DSLLV: 00000000 DSRLV: 00000000 |
58 | DSRAV: 00000000 MULT: 00000000 MULTU: 00000583 DIV: 00000941 DIVU: 00000000 |
59 | DMULT: 00000000 DMULTU: 00000000 DDIV: 00000000 DDIVU: 00000000 ADD: 00000170 |
60 | ADDU: 00001706 SUB: 00000037 SUBU: 00000833 AND: 00000909 OR: 00006360 |
61 | XOR: 00000079 NOR: 00000004 SLT: 00000925 SLTU: 00001401 DADD: 00000000 |
62 | DADDU: 00000000 DSUB: 00000000 DSUBU: 00000000 DSLL: 00000000 DSRL: 00000000 |
63 | DSRA: 00000000 TEQ: 00000000 DSLL32: 00000000 DSRL32: 00000000 DSRA32: 00000000 |
64 | BC1F: 00000286 BC1T: 00000032 BC1FL: 00000775 BC1TL: 00000077 TLBWI: 00000000 |
65 | TLBP: 00000000 TLBR: 00000000 TLBWR: 00000000 ERET: 00000036 MFC0: 00000396 |
66 | MTC0: 00000034 MFC1: 00000929 DMFC1: 00000000 CFC1: 00000141 MTC1: 00003081 |
67 | DMTC1: 00000000 CTC1: 00000163 f.CVT: 00001113 f.CMP: 00002149 f.ADD: 00001387 |
68 | f.SUB: 00000947 f.MUL: 00002696 f.DIV: 00000315 f.SQRT: 00000025 f.ABS: 00000000 |
69 | f.MOV: 00000631 f.NEG: 00000323 f.ROUND: 00000000 f.TRUNC: 00000867 f.CEIL: 00000000 |
70 | f.FLOOR: 00000000 |
71 | |
72 | Special code samples: |
73 | Regcache flushing: 12371 |
74 | Jump wrappers: 15520 |
75 | NOTCOMPILED: 33 |
76 | block postfix & link samples: 619 |
77 | |
78 | Unaccounted samples: 19929 |
79 | Total accounted instruction samples: 221836 |
80 | Load: 35.2% (68040) |
81 | Store: 21.0% (40591) |
82 | Data move/convert: 03.5% (6829) |
83 | 32-bit math: 16.0% (30861) |
84 | 64-bit math: 05.7% (10948) |
85 | Float Math: 04.5% (8709) |
86 | Jump: 02.5% (4750) |
87 | Branch: 07.8% (15014) |
88 | Exceptions: 00.0% (36) |
89 | Reserved: 03.9% (7515) |
90 | Other: 00.0% (0) |
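A note on reading the summary: in this example the category counts (Load through Other) add up to 193293, and adding the four "Special code samples" counts gives the 221836 reported as total accounted samples. The percentages appear to be taken relative to the category sum rather than the total. This is inferred from the numbers in the example, not from the r4300prof source. The sketch below redoes that arithmetic; the variable names and the pct helper are hypothetical.

```shell
#!/bin/sh
# Sample counts copied from the example summary above.
load=68040; store=40591; move=6829; math32=30861; math64=10948
float=8709; jump=4750; branch=15014; except=36; reserved=7515; other=0

# "Special code samples" counts from the same example.
special=$((12371 + 15520 + 33 + 619))

# Sum of the per-category counts; the percentages appear to use this base.
categories=$((load + store + move + math32 + math64 + float + jump + branch + except + reserved + other))
total=$((categories + special))

# pct COUNT BASE: print COUNT as a percentage of BASE with one decimal.
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", 100 * n / d }'
}

echo "category samples: $categories"
echo "total accounted:  $total"
echo "load share:       $(pct "$load" "$categories")%"
```

With the example data this reproduces the 35.2% shown for Load, so the same check can be used to sanity-test a report from another run.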
91 | |