| 1 | \r |
| 2 | _____ __ \r |
| 3 | / ___/__ __ ____ / /___ ___ ___ ___________________ \r |
| 4 | / /__ / // // __// // _ \ / _ \/ -_) ___________________ \r |
| 5 | \___/ \_, / \__//_/ \___//_//_/\__/ ___________________ \r |
| 6 | /___/ \r |
| 7 | ___________________ ____ ___ ___ ___ ___ \r |
| 8 | ___________________ / __// _ \ / _ \ / _ \ / _ \ \r |
| 9 | ___________________ / _ \/ _ // // // // // // / \r |
| 10 | \___/\___/ \___/ \___/ \___/ \r |
| 11 | \r |
| 12 | ___________________________________________________________________________\r |
| 13 | \r |
| 14 | \r |
| 15 | This code is licensed under the GNU General Public License version 2.0 and the MAME License.\r |
| 16 | You can choose the license that has the most advantages for you.\r |
| 17 | \r |
| 18 | ___________________________________________________________________________\r |
| 19 | \r |
| 20 | \r |
| 21 | What is it?\r |
| 22 | -----------\r |
| 23 | \r |
| 24 | Cyclone 68000 is an emulator for the 68000 microprocessor, written in ARM 32-bit assembly.\r |
| 25 | It is aimed at chips such as ARM7 and ARM9 cores, StrongARM and XScale, to interpret 68000\r |
| 26 | code as fast as possible.\r |
| 27 | \r |
| 28 | Flags are mapped onto ARM flags whenever possible, which speeds up the processing of opcodes.\r |
| 29 | \r |
| 30 | Developers:\r |
| 31 | -----------\r |
| 32 | \r |
| 33 | Dave / FinalDave: emudave(atsymbol)gmail.com\r |
| 34 | \r |
| 35 | \r |
| 36 | What's New\r |
| 37 | ----------\r |
| 38 | v0.069\r |
| 39 | + Added SBCD and the flags for ABCD/SBCD. Score and time now work in games such as\r |
| 40 | Rolling Thunder 2, Ghouls 'N Ghosts\r |
| 41 | + Fixed a problem with addx and subx with 8-bit and 16-bit values.\r |
| 42 | Ghouls 'N' Ghosts now works!\r |
| 43 | \r |
| 44 | v0.068\r |
| 45 | + Added ABCD opcode (Streets of Rage works now!)\r |
| 46 | \r |
| 47 | v0.067\r |
| 48 | + Added dbCC (After Burner)\r |
| 49 | + Added asr EA (Sonic 1 Boss/Labyrinth Zone)\r |
| 50 | + Added andi/ori/eori ccr (Altered Beast)\r |
| 51 | + Added trap (After Burner)\r |
| 52 | + Added special case for move.b (a7)+ and -(a7), stepping by 2\r |
| 53 | After Burner is playable! Eternal Champions shows more\r |
| 54 | + Fixed lsr.b/w zero flag (Ghostbusters)\r |
| 55 | Rolling Thunder 2 now works!\r |
| 56 | + Fixed N flag for .b and .w arithmetic. Golden Axe works!\r |
| 57 | \r |
| 58 | v0.066\r |
| 59 | + Fixed a stupid typo for exg (orr r10,r10, not orr r10,r8), which caused alignment\r |
| 60 | crashes on Strider\r |
| 61 | \r |
| 62 | v0.065\r |
| 63 | + Fixed a problem with immediate values - they weren't being shifted up correctly for some\r |
| 64 | opcodes. Spiderman works, After Burner shows a bit of graphics.\r |
| 65 | + Fixed a problem with EA:"110nnn" extension word. 32-bit offsets were being decoded as 8-bit\r |
| 66 | offsets by mistake. Castlevania Bloodlines seems fine now.\r |
| 67 | + Added exg opcode\r |
| 68 | + Fixed asr opcode (Sonic jumping left is fixed)\r |
| 69 | + Fixed a problem with the carry bit in rol.b (Marble Madness)\r |
| 70 | \r |
| 71 | v0.064\r |
| 72 | + Added rtr\r |
| 73 | + Fixed addq/subq.l (all An opcodes are 32-bit) (Road Rash)\r |
| 74 | + Fixed various little timings\r |
| 75 | \r |
| 76 | v0.063\r |
| 77 | + Added link/unlk opcodes\r |
| 78 | + Fixed various little timings\r |
| 79 | + Fixed a problem with dbCC opcode being emitted at set opcodes\r |
| 80 | + Improved long register access, the EA fetch now does ldr r0,[r7,r0,lsl #2] whenever\r |
| 81 | possible, saving 1 or 2 cycles on many opcodes, which should give a nice speed up.\r |
| 82 | + May have fixed N flag on ext opcode?\r |
| 83 | + Added dasm for link opcode.\r |
| 84 | \r |
| 85 | v0.062\r |
| 86 | * I was a bit too keen with the Arithmetic opcodes! Some of them should have been abcd,\r |
| 87 | exg and addx. Removed the incorrect opcodes, pending re-adding them as abcd, exg and addx.\r |
| 88 | + Changed unknown opcodes to act as nops.\r |
| 89 | Not very technical, but fun - a few more games show more graphics ;)\r |
| 90 | \r |
| 91 | v0.060\r |
| 92 | + Fixed divu (EA intro)\r |
| 93 | + Added sf (set false) opcode - SOR2\r |
| 94 | * Todo: pea/link/unlk opcodes\r |
| 95 | \r |
| 96 | v0.059: Added remainder to divide opcodes.\r |
| 97 | \r |
| 98 | \r |
| 99 | ARM Register Usage\r |
| 100 | ------------------\r |
| 101 | \r |
| 102 | See the source code for up-to-date register usage; a summary follows:\r |
| 103 | \r |
| 104 | r0-3: Temporary registers\r |
| 105 | r4 : Current PC + Memory Base (i.e. pointer to next opcode)\r |
| 106 | r5 : Cycles remaining\r |
| 107 | r6 : Pointer to Opcode Jump table\r |
| 108 | r7 : Pointer to Cpu Context\r |
| 109 | r8 : Current Opcode\r |
| 110 | r9 : Flags (NZCV) in highest four bits\r |
| 111 | (r10 : Temporary source value or Memory Base)\r |
| 112 | (r11 : Temporary register)\r |
| 113 | \r |
| 114 | \r |
| 115 | How to Compile\r |
| 116 | --------------\r |
| 117 | \r |
| 118 | Like Starscream and A68K, Cyclone uses a 'Core Creator' program which calculates and outputs\r |
| 119 | all possible 68000 Opcodes and a jump table into files called Cyclone.s and .asm\r |
| 120 | It then assembles these files into Cyclone.o and .obj\r |
| 121 | \r |
| 122 | Cyclone.o is the GCC assembled version and Cyclone.obj is the Microsoft assembled version.\r |
| 123 | \r |
| 124 | First unzip "Cyclone.zip" into a "Cyclone" directory.\r |
| 125 | If you are compiling for Windows CE, find ARMASM.EXE (the Microsoft ARM assembler) and\r |
| 126 | put it in the directory as well or put it on your path.\r |
| 127 | \r |
| 128 | Open up Cyclone.dsw in Visual Studio 6.0, compile and run the project.\r |
| 129 | Cyclone.obj and Cyclone.o will be created.\r |
| 130 | \r |
| 131 | Compiling without Visual C++\r |
| 132 | ----------------------------\r |
| 133 | If you aren't using Visual C++, it still shouldn't be too hard to compile: just get a C/C++ compiler,\r |
| 134 | compile all the CPP files and the C file, link them into an EXE, and run the EXE.\r |
| 135 | \r |
| 136 | e.g. gcc -o Main.exe Main.cpp OpAny.cpp OpArith.cpp OpBranch.cpp OpLogic.cpp OpMove.cpp Disa.c\r |
| 137 | Main.exe\r |
| 138 | \r |
| 139 | \r |
| 140 | Adding to your project\r |
| 141 | ----------------------\r |
| 142 | \r |
| 143 | To add Cyclone to your project, add Cyclone.o or Cyclone.obj, and include Cyclone.h\r |
| 144 | There is one structure: 'struct Cyclone', and one function: CycloneRun\r |
| 145 | \r |
| 146 | Don't worry if this seems very minimal - it's all you need to run as many 68000s as you want.\r |
| 147 | It works with both C and C++.\r |
| 148 | \r |
| 149 | Byteswapped Memory\r |
| 150 | ------------------\r |
| 151 | \r |
| 152 | If you have used Starscream, A68K or Turbo68K or similar emulators you'll be familiar with this!\r |
| 153 | \r |
| 154 | Any memory which the 68000 can access directly must have every two bytes swapped around.\r |
| 155 | This is to speed up 16-bit memory accesses, because the 68000 has Big-Endian memory\r |
| 156 | and ARM has Little-Endian memory.\r |
| 157 | \r |
| 158 | Now you may think you only technically have to byteswap ROM, not RAM, because\r |
| 159 | 16-bit RAM reads go through a memory handler and you could just return (mem[a]<<8) | mem[a+1].\r |
| 160 | \r |
| 161 | This would work, but remember some systems can execute code from RAM as well as ROM, and\r |
| 162 | that would fail.\r |
| 163 | So it's best to use byteswapped ROM and RAM if the 68000 can access it directly.\r |
| 164 | It's also faster for the memory handlers, because you can do this:\r |
| 165 | \r |
| 166 | return *(unsigned short *)(mem+a)\r |
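
As a rough sketch of the byteswapping step itself, a ROM image could be swapped once after
loading, as below. ByteswapWords is a hypothetical helper name and is not part of Cyclone.
Once the image is swapped, the byte at 68000 address a is found at index a^1.

```c
#include <stddef.h>

// Swap every pair of bytes in place, so 68000 big-endian words become
// little-endian words. (Hypothetical helper, not part of Cyclone.)
static void ByteswapWords(unsigned char *mem, size_t len)
{
  size_t i;
  for (i = 0; i + 1 < len; i += 2)
  {
    unsigned char t = mem[i];
    mem[i] = mem[i + 1];
    mem[i + 1] = t;
  }
}
```

Call it once on each ROM or RAM image before the 68000 runs. A trailing odd byte, if any,
is left untouched.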
| 167 | \r |
| 168 | \r |
| 169 | Declaring Memory Handlers\r |
| 170 | -------------------------\r |
| 171 | \r |
| 172 | Before you can reset or execute 68000 opcodes you must first set up a set of memory handlers.\r |
| 173 | There are 7 functions you have to set up per CPU, like this:\r |
| 174 | \r |
| 175 | static unsigned int MyCheckPc(unsigned int pc)\r |
| 176 | static unsigned char MyRead8 (unsigned int a)\r |
| 177 | static unsigned short MyRead16 (unsigned int a)\r |
| 178 | static unsigned int MyRead32 (unsigned int a)\r |
| 179 | static void MyWrite8 (unsigned int a,unsigned char d)\r |
| 180 | static void MyWrite16(unsigned int a,unsigned short d)\r |
| 181 | static void MyWrite32(unsigned int a,unsigned int d)\r |
| 182 | \r |
| 183 | You can think of these functions as representing the 68000's memory bus.\r |
| 184 | The Read and Write functions are called whenever the 68000 reads or writes memory.\r |
| 185 | For example you might set MyRead8 like this:\r |
| 186 | \r |
| 187 | unsigned char MyRead8(unsigned int a)\r |
| 188 | {\r |
| 189 | a&=0xffffff; // Clip address to 24-bits\r |
| 190 | \r |
| 191 | if (a<RomLength) return RomData[a^1]; // ^1 because the memory is byteswapped\r |
| 192 | if (a>=0xe00000) return RamData[(a^1)&0xffff];\r |
| 193 | return 0xff; // Out of range memory access\r |
| 194 | }\r |
| 195 | \r |
| 196 | The other 5 read/write functions are similar. I'll describe the CheckPc function later on.\r |
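
For illustration, here is one way the 16-bit and 32-bit handlers and the write handlers
might look, following the same layout as the MyRead8 example (ROM at address 0, 64KB of
RAM from 0xe00000 upwards). The array sizes and the HostWord helper are assumptions made
up for this sketch, and it assumes a little-endian host with byteswapped memory:

```c
#include <string.h>

#define ROM_LENGTH 0x1000u                 // hypothetical ROM size (even)
static unsigned char RomData[ROM_LENGTH];  // byteswapped ROM image
static unsigned char RamData[0x10000];     // byteswapped 64KB RAM

// Load one 16-bit word straight from byteswapped memory (little-endian host)
static unsigned short HostWord(const unsigned char *p)
{
  unsigned short w;
  memcpy(&w, p, sizeof(w)); // memcpy avoids alignment and aliasing problems
  return w;
}

static unsigned short MyRead16(unsigned int a)
{
  a &= 0xfffffe; // Clip to 24 bits and force word alignment (a simplification)
  if (a < ROM_LENGTH) return HostWord(RomData + a);
  if (a >= 0xe00000)  return HostWord(RamData + (a & 0xffff));
  return 0xffff; // Out of range memory access
}

static unsigned int MyRead32(unsigned int a)
{
  // 68000 longs are stored high word first
  return ((unsigned int)MyRead16(a) << 16) | MyRead16(a + 2);
}

static void MyWrite8(unsigned int a, unsigned char d)
{
  a &= 0xffffff;
  if (a >= 0xe00000) RamData[(a ^ 1) & 0xffff] = d; // Writes to ROM are ignored
}

static void MyWrite16(unsigned int a, unsigned short d)
{
  a &= 0xfffffe;
  if (a >= 0xe00000) memcpy(RamData + (a & 0xffff), &d, sizeof(d));
}

static void MyWrite32(unsigned int a, unsigned int d)
{
  MyWrite16(a, (unsigned short)(d >> 16));        // High word first
  MyWrite16(a + 2, (unsigned short)(d & 0xffff)); // Then low word
}
```

Building the 32-bit handlers from the 16-bit ones keeps the sketch short. A real system
would probably add cases for hardware registers in the same way.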
| 197 | \r |
| 198 | Declaring a CPU Context\r |
| 199 | -----------------------\r |
| 200 | \r |
| 201 | To declare a CPU simply declare a struct Cyclone in your code. For example to declare\r |
| 202 | two 68000s:\r |
| 203 | \r |
| 204 | struct Cyclone MyCpu;\r |
| 205 | struct Cyclone MyCpu2;\r |
| 206 | \r |
| 207 | It's probably a good idea to initialise the memory to zero:\r |
| 208 | \r |
| 209 | memset(&MyCpu, 0,sizeof(MyCpu));\r |
| 210 | memset(&MyCpu2,0,sizeof(MyCpu2));\r |
| 211 | \r |
| 212 | Next point to your memory handlers:\r |
| 213 | \r |
| 214 | MyCpu.checkpc=MyCheckPc;\r |
| 215 | MyCpu.read8 =MyRead8;\r |
| 216 | MyCpu.read16 =MyRead16;\r |
| 217 | MyCpu.read32 =MyRead32;\r |
| 218 | MyCpu.write8 =MyWrite8;\r |
| 219 | MyCpu.write16=MyWrite16;\r |
| 220 | MyCpu.write32=MyWrite32;\r |
| 221 | \r |
| 222 | You also need to set the fetch handlers - for most systems you can just\r |
| 223 | point them at the read handlers:\r |
| 224 | MyCpu.fetch8 =MyRead8;\r |
| 225 | MyCpu.fetch16 =MyRead16;\r |
| 226 | MyCpu.fetch32 =MyRead32;\r |
| 227 | \r |
| 228 | ( Why a different set of function pointers for fetch?\r |
| 229 | Well there are some systems, the main one being CPS2, which return different data\r |
| 230 | depending on whether the 'fetch' line on the 68000 bus is high or low.\r |
| 231 | If this is the case, you can set up different functions for fetch reads.\r |
| 232 | Generally though you don't need to. )\r |
| 233 | \r |
| 234 | Now you are nearly ready to reset the 68000, except you need one more function: checkpc().\r |
| 235 | \r |
| 236 | The checkpc() function\r |
| 237 | ----------------------\r |
| 238 | \r |
| 239 | When Cyclone reads opcodes, it doesn't use a memory handler every time, as this would be\r |
| 240 | far too slow. Instead it uses a direct pointer to ARM memory.\r |
| 241 | For example if your Rom image was at 0x3000000 and the program counter was $206,\r |
| 242 | Cyclone's program counter would be 0x3000206.\r |
| 243 | \r |
| 244 | The difference between an ARM address and a 68000 address is also stored in a variable called\r |
| 245 | 'membase'. In the above example it's 0x3000000. To retrieve the real PC, Cyclone just\r |
| 246 | subtracts 'membase'.\r |
| 247 | \r |
| 248 | When a long jump happens, Cyclone calls checkpc(). If the PC is in a different bank,\r |
| 249 | for example Ram instead of Rom, change 'membase', recalculate the new PC and return it:\r |
| 250 | \r |
| 251 | static unsigned int MyCheckPc(unsigned int pc)\r |
| 252 | {\r |
| 253 | pc-=MyCpu.membase; // Get the real program counter\r |
| 254 | \r |
| 255 | if (pc<RomLength) MyCpu.membase=(int)RomMem; // Jump to Rom\r |
| 256 | if (pc>=0xff0000) MyCpu.membase=(int)RamMem-0xff0000; // Jump to Ram\r |
| 257 | \r |
| 258 | return MyCpu.membase+pc; // New program counter\r |
| 259 | }\r |
| 260 | \r |
| 261 | Notice that the membase is always ARM address minus 68000 address.\r |
| 262 | \r |
| 263 | The above example doesn't consider mirrored ram, but for an example of what to do see\r |
| 264 | PicoDrive (in Memory.cpp).\r |
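
As a rough idea of how mirrored RAM could be handled (this sketch is not taken from
PicoDrive), the version below assumes 64KB of RAM repeating every 0x10000 bytes between
0xe00000 and 0xffffff. The names DemoCheckPc and ROM_LENGTH are made up for the example,
and the plain 'membase' variable stands in for MyCpu.membase. It uses uintptr_t so that
it also runs on a 64-bit host:

```c
#include <stdint.h>

#define ROM_LENGTH 0x1000u                // hypothetical ROM size
static unsigned char RomMem[ROM_LENGTH];
static unsigned char RamMem[0x10000];     // 64KB, mirrored from 0xe00000 up
static uintptr_t membase;                 // stand-in for MyCpu.membase

static uintptr_t DemoCheckPc(uintptr_t pc)
{
  pc -= membase;   // Get the real program counter
  pc &= 0xffffff;  // Clip to 24 bits

  if (pc < ROM_LENGTH)
  {
    membase = (uintptr_t)RomMem; // Jump to Rom
  }
  else if (pc >= 0xe00000)
  {
    // Jump to Ram: subtract the start of whichever 64KB mirror the PC is in,
    // so membase+pc always lands inside RamMem
    membase = (uintptr_t)RamMem - (pc & ~(uintptr_t)0xffff);
  }

  return membase + pc; // New program counter
}
```

Because membase is still ARM address minus 68000 address, subtracting it from the returned
value gives back the original (mirrored) 68000 PC.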
| 265 | \r |
| 266 | \r |
| 267 | Almost there - Reset the 68000!\r |
| 268 | -------------------------------\r |
| 269 | \r |
| 270 | Next we need to Reset the 68000 to get the initial Stack Pointer and Program Counter. These\r |
| 271 | are read from addresses 000000 and 000004 respectively.\r |
| 272 | \r |
| 273 | Here is code which resets the 68000 (using your memory handlers):\r |
| 274 | \r |
| 275 | MyCpu.srh=0x27; // Set supervisor mode\r |
| 276 | MyCpu.a[7]=MyCpu.read32(0); // Get Stack Pointer\r |
| 277 | MyCpu.membase=0;\r |
| 278 | MyCpu.pc=MyCpu.checkpc(MyCpu.read32(4)); // Get Program Counter\r |
| 279 | \r |
| 280 | And that's ready to go.\r |
| 281 | \r |
| 282 | \r |
| 283 | Executing the 68000\r |
| 284 | -------------------\r |
| 285 | \r |
| 286 | To execute the 68000, set the 'cycles' variable to the number of cycles you wish to execute,\r |
| 287 | and then call CycloneRun with a pointer to the Cyclone structure.\r |
| 288 | \r |
| 289 | e.g.:\r |
| 290 | // Execute 1000 cycles on the 68000:\r |
| 291 | MyCpu.cycles=1000; CycloneRun(&MyCpu);\r |
| 292 | \r |
| 293 | For each opcode, the number of cycles it took is subtracted and the function returns when\r |
| 294 | the counter reaches 0 or goes below it.\r |
| 295 | \r |
| 296 | e.g.\r |
| 297 | // Execute one instruction on the 68000:\r |
| 298 | MyCpu.cycles=0; CycloneRun(&MyCpu);\r |
| 299 | printf(" The opcode took %d cycles\n", -MyCpu.cycles);\r |
| 300 | \r |
| 301 | You should try to execute as many cycles as you can for maximum speed.\r |
| 302 | The number actually executed may be slightly more than requested, i.e. cycles may come\r |
| 303 | out with a small negative value:\r |
| 304 | \r |
| 305 | e.g.\r |
| 306 | int todo=12000000/60; // 12MHz, for one 60Hz frame\r |
| 307 | MyCpu.cycles=todo; CycloneRun(&MyCpu);\r |
| 308 | printf(" Actually executed %d cycles\n", todo-MyCpu.cycles);\r |
| 309 | \r |
| 310 | To calculate the number of cycles executed, use this formula:\r |
| 311 | Number of cycles requested - Cycle counter at the end\r |
| 312 | \r |
| 313 | \r |
| 314 | Interrupts\r |
| 315 | ----------\r |
| 316 | \r |
| 317 | Causing an interrupt is very simple: set the irq variable in the Cyclone structure\r |
| 318 | to the IRQ number.\r |
| 319 | To lower the IRQ line, set it to zero.\r |
| 320 | \r |
| 321 | e.g:\r |
| 322 | MyCpu.irq=6; // Interrupt level 6\r |
| 323 | MyCpu.cycles=20000; CycloneRun(&MyCpu);\r |
| 324 | \r |
| 325 | Note that the interrupt is not actually processed until the next call to CycloneRun,\r |
| 326 | and the interrupt may not be taken until the 68000 interrupt mask is changed to allow it.\r |
| 327 | \r |
| 328 | ( The IRQ isn't checked on exiting from a memory handler: I don't think this will cause\r |
| 329 | me any trouble because I've never needed to trigger an interrupt from a memory handler,\r |
| 330 | but if someone needs to, let me know...)\r |
| 331 | \r |
| 332 | Accessing Cycle Counter\r |
| 333 | -----------------------\r |
| 334 | \r |
| 335 | The cycle counter in the Cyclone structure is not, by default, updated before\r |
| 336 | calling a memory handler, only at the end of an execution.\r |
| 337 | \r |
| 338 | If you do need to read the cycle counter inside memory handlers, there is a\r |
| 339 | bitfield called 'Debug' in Cyclone/Main.cpp.\r |
| 340 | You can try setting Debug to 1 and then making the Cyclone library.\r |
| 341 | This will add extra instructions so Cyclone writes register r5 back into the structure.\r |
| 342 | \r |
| 343 | If you need to *modify* cycles in a memory handler, set Debug to 3; this will also make\r |
| 344 | Cyclone read the cycle counter back in after the handler returns.\r |
| 345 | \r |
| 346 | Accessing Program Counter and registers\r |
| 347 | ---------------------------------------\r |
| 348 | \r |
| 349 | You can read Cyclone's registers directly from the structure at any time (as far as I know).\r |
| 350 | \r |
| 351 | The Program Counter, should you need to read or write it, is stored with membase\r |
| 352 | added on. So use this formula to calculate the real 68000 program counter:\r |
| 353 | \r |
| 354 | pc = MyCpu.pc - MyCpu.membase;\r |
| 355 | \r |
| 356 | The program counter is stored in r4 during execution, and isn't written back to the\r |
| 357 | structure until the end of execution, which means you can't normally read it from\r |
| 358 | a memory handler.\r |
| 359 | However, you can try setting Debug to 4 and then making the Cyclone library. This will\r |
| 360 | write back r4 to the structure.\r |
| 361 | \r |
| 362 | You can't access the flags from a handler either. I can't imagine why anyone would particularly\r |
| 363 | need to do this, but if you do, e-mail me and I'll add another bit to 'Debug' ;)\r |
| 364 | \r |
| 365 | \r |
| 366 | Emulating more than one CPU\r |
| 367 | ---------------------------\r |
| 368 | \r |
| 369 | Since everything is based on the structures, emulating more than one CPU at the same time\r |
| 370 | is just a matter of declaring more than one structure and timeslicing. You can emulate\r |
| 371 | as many 68000s as you want.\r |
| 372 | Just set up the memory handlers for each cpu and run each cpu for a certain number of cycles.\r |
| 373 | \r |
| 374 | e.g.\r |
| 375 | // Execute 1000 cycles on 68000 #1:\r |
| 376 | MyCpu.cycles=1000; CycloneRun(&MyCpu);\r |
| 377 | \r |
| 378 | // Execute 1000 cycles on 68000 #2:\r |
| 379 | MyCpu2.cycles=1000; CycloneRun(&MyCpu2);\r |
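
If the two CPUs talk to each other, it may help to interleave them in small slices rather
than running a whole frame on one and then the other. The sketch below shows the idea
using a stand-in structure and run function (DemoCpu and DemoRun are made up here so the
example runs on its own; with Cyclone you would use struct Cyclone and CycloneRun).
Adding to 'cycles' instead of overwriting it carries any overshoot into the next slice:

```c
// Stand-in for struct Cyclone: only the fields this sketch needs
struct DemoCpu
{
  int cycles; // Cycles remaining, as in Cyclone
  int total;  // Cycles executed so far (for the demo only)
  int step;   // Pretend every opcode takes this many cycles
};

// Stand-in for CycloneRun: always runs at least one opcode,
// then stops once the counter reaches 0 or below
static void DemoRun(struct DemoCpu *cpu)
{
  do
  {
    cpu->cycles -= cpu->step;
    cpu->total  += cpu->step;
  } while (cpu->cycles > 0);
}

// Run two CPUs for frameCycles each, alternating every 'slice' cycles
static void RunFrameInterleaved(struct DemoCpu *a, struct DemoCpu *b,
                                int frameCycles, int slice)
{
  int done;
  for (done = 0; done < frameCycles; done += slice)
  {
    int n = frameCycles - done;
    if (n > slice) n = slice;

    a->cycles += n; DemoRun(a); // += keeps the (negative) overshoot
    b->cycles += n; DemoRun(b);
  }
}
```

Smaller slices keep the CPUs more closely in step but cost more calls, so the slice size
is a trade-off between accuracy and speed.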
| 380 | \r |
| 381 | \r |
| 382 | Thanks to...\r |
| 383 | ------------\r |
| 384 | \r |
| 385 | * All the previous code-generating assembler cpu core guys!\r |
| 386 | Who are iirc... Neill Corlett, Neil Bradley, Mike Coates, Darren Olafson\r |
| 387 | and Bart Trzynadlowski\r |
| 388 | \r |
| 389 | * Charles Macdonald, for researching just about every console ever\r |
| 390 | * MameDev+FBA, for keeping on going and going and going\r |