; This can be correctly assembled into binary data with the java-based assembler found here:
; http://www.obelisk.demon.co.uk/dev65/index.html

; CREDITS:
; Disassembly and comments of code from $C1, credit to: Imzogelmo, assassin, Lenophis, Novalia Spirit
; New code and comments by Eggers

;===============
; IF YOU WANT TO ASSEMBLE THIS YOURSELF IN ORDER TO MOVE ITS ROM LOCATION, EDIT THE CONSTANTS
; JUST BELOW HERE.

MY_DATA_BANK .EQU $F0  ; <---- CHANGE THIS CONSTANT
; The data bank is the most significant byte of the address where the code will be stored. This should
; be something between $C0 - $EF for a non-expanded ROM, and can also be between $F0-$FF if you've
; expanded your rom to 32 Mb.
; Note that this is a CPU address. The CPU data bank is offset from the actual ROM data address by +$C0.
; So, in other words, if you want to paste this code into $05XXXX of your ROM (as seen in a hex editor),
; you would want to use $C5 as your data bank. $F0 corresponds to $30. And so on.

MY_CODE_OFFSET .EQU $0000  ; <---- CHANGE THIS CONSTANT
; The less significant two bytes of the location you want to paste the new code. If you plan to paste this
; into the very start of a new data bank, it should be $0000. If you want to paste it into unused space in
; an existing bank, this should be the offset where the new code will start, within that bank.

;===============
; Below, we have code which essentially duplicates a function originally found at C1/3D43,
; followed by new code that does the byte switching that allows expanded palettes.

    .65816
    .CODE

bank_beginning:
    .REPEAT MY_CODE_OFFSET
    NOP
    .ENDR

; The routine that loads character battle graphics and palettes
C1_3D43: ;C9FF CMP #$FF (from C1_316F, C1_3B1A, C1_3B2A, C1_3B3A, C1_3B4A)
    CMP #$FF  ; Accumulator is expected to hold the character sprite ID.
        ; if the character ID is FF, that means empty.
C1_3D45: ;D001 BNE $3D48
    BNE C1_3D48
C1_3D47: ;60 RTS
    RTL  ; rts replaced with rtl
C1_3D48: ;8514 STA $14 (from only C1_3D45)
    STA $14
C1_3D4A: ;861C STX $1C
    STX $1C
C1_3D4C: ;AA TAX
    TAX
C1_3D4D: ;A510 LDA $10
    LDA $10
C1_3D4F: ;48 PHA
    PHA
C1_3D50: ;48 PHA
    PHA
C1_3D51: ;DA PHX
    PHX
C1_3D52: ;A61C LDX $1C
    LDX $1C
C1_3D54: ;861A STX $1A
    STX $1A
C1_3D56: ;A514 LDA $14
    LDA $14
;=======\\new code\\
    STA $13  ; There is a sprite ID stored in accum right now
        ; (just loaded from $14). Want to store it here for later
        ; use. I believe nothing should interfere with this location.
        ; 8-bit.
;=======
C1_3D58: ;0A ASL A
    ASL A
C1_3D59: ;18 CLC
    CLC
C1_3D5A: ;6514 ADC $14
    ADC $14
C1_3D5C: ;AA TAX
    TAX  ; X now holds 3 * the character ID passed in in the accumulator. This is to
        ; offset a 3-byte long pointer.
C1_3D5D: ;A9C2 LDA #$C2
    LDA #$C2  ; This will be the high byte of a 3-byte pointer (to data in C2)
C1_3D5F: ;8516 STA $16
    STA $16
C1_3D61: ;8B PHB
    PHB
C1_3D62: ;A97F LDA #$7F
    LDA #$7F
C1_3D64: ;48 PHA
    PHA
C1_3D65: ;AB PLB
    PLB  ; We're now using $7f as our data bank
;   .DBREG $7F
C1_3D66: ;BF45CEC2 LDA $C2CE45,X (High byte of pointer to start of character battle graphics)
    LDA >$C2CE45,X
C1_3D6A: ;8512 STA $12
    STA $12
C1_3D6C: ;C220 REP #$20
    REP #$20  ; set the accumulator to 16 bit
    .LONGA ON
C1_3D6E: ;BF43CEC2 LDA $C2CE43,X (Pointer to start of character battle graphics)
    LDA >$C2CE43,X
C1_3D72: ;8510 STA $10
    STA $10
C1_3D74: ;A945C7 LDA #$C745
    LDA #$C745  ; Note that we already stored #$C2 in $16
C1_3D77: ;8514 STA $14
    STA $14  ; Combined with the new 16-bit address, $14-$16 now hold a
        ; 24-bit address. This is a data section in C2. I believe it contains
        ; the data on how to compose 8x8 tiles into sprite tiles.
C1_3D79: ;A61A LDX $1A
    LDX $1A  ; This re-loads the original value which was stored in X when we entered the function.
        ; This should be $0000, $2000, $4000, or $6000, depending on whether this is
        ; battle-character 1, 2, 3 or 4.
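; Worked example of the 3-byte pointer lookup above. Sprite ID $05 is an arbitrary value chosen
; for illustration; it is not taken from the game data.
;   A = $05  ->  ASL A gives $0A  ->  CLC / ADC $14 gives $0F, so TAX leaves 3 * 5 = $0F in X
;                (assuming the hidden high byte of the accumulator is zero at this point)
;   LDA >$C2CE45,X          reads the bank byte of the pointer from $C2CE54
;   LDA >$C2CE43,X (16-bit) reads the low two bytes of the pointer from $C2CE52-$C2CE53
; Together, $10-$12 then hold the 24-bit address of that sprite's graphics.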
C1_3D7B: ;A90001 LDA #$0100
    LDA #$0100  ; $1A will be a counter for an outer loop, counting down from 256
C1_3D7E: ;851A STA $1A
    STA $1A
C1_3D80: ;A91000 LDA #$0010
    LDA #$0010  ; This begins an inner loop...
C1_3D83: ;8518 STA $18
    STA $18  ; And the inner loop counter counts down from 16
C1_3D85: ;A714 LDA [$14]
    LDA [$14]  ; Load from the pointer stored in DP addresses $14-16. Again, these point to data in
        ; C2.
C1_3D87: ;C9FFFF CMP #$FFFF
    CMP #$FFFF  ; #$FFFF appears to represent a blank tile (or blank line?). Special code handles this
        ; case.
C1_3D8A: ;D00C BNE $3D98
    BNE C1_3D98  ; Otherwise, skip ahead to the normal case.
C1_3D8C: ;7B TDC
    TDC  ; (the =#$FFFF case)
        ; I think this loads 0 into the accumulator. I'm not sure why they didn't just use
        ; the literal zero?
C1_3D8D: ;9D0000 STA $0000,X
    STA |$0000,X  ; Now we write #$0000 16 times (for a total of 32 bytes)
C1_3D90: ;E8 INX
    INX
C1_3D91: ;E8 INX
    INX
C1_3D92: ;C618 DEC $18
    DEC $18
C1_3D94: ;D0F7 BNE $3D8D
    BNE C1_3D8D  ; loop...
C1_3D96: ;800E BRA $3DA6
    BRA C1_3DA6  ; when loop ends, skip over the "normal" code and continue
C1_3D98: ;A8 TAY
    TAY  ; Here begins the normal != #$FFFF case
C1_3D99: ;B710 LDA [$10],Y
    LDA [$10],Y  ; Load into A from the long address stored in $10-$12 (the pointer to the start of
        ; the character battle graphics), offset by Y. Y holds the value just read from
        ; [$14] and moved over by the TAY above, not 3 * the character sprite ID.
C1_3D9B: ;9D0000 STA $0000,X
    STA |$0000,X  ; Store the loaded graphics in memory
C1_3D9E: ;E8 INX
    INX
C1_3D9F: ;E8 INX
    INX
C1_3DA0: ;C8 INY
    INY
C1_3DA1: ;C8 INY
    INY  ; ... and increment the counters by 2 bytes each
C1_3DA2: ;C618 DEC $18
    DEC $18  ; loop 16 times
C1_3DA4: ;D0F3 BNE $3D99
    BNE C1_3D99
C1_3DA6: ;E614 INC $14
    INC $14
C1_3DA8: ;E614 INC $14
    INC $14  ; Increment this pointer into C2 (tile-related data)
C1_3DAA: ;C61A DEC $1A
    DEC $1A  ; Decrement the larger, outer counter, which had the initial value of 256 (#$0100)
C1_3DAC: ;D0D2 BNE $3D80
    BNE C1_3D80  ; This ends the outer loop.
        ; In total, with the inner and outer loop, we will load data
        ; 256 * 16 times, * 2 bytes per loop, = 8192 bytes (#$2000).
C1_3DAE: ;7B TDC
    TDC  ; This, again is meant to set accum to 0, I think
C1_3DAF: ;E220 SEP #$20
    SEP #$20  ; Set the accumulator to 8-bit mode
    .LONGA OFF
C1_3DB1: ;A61C LDX $1C
    LDX $1C  ; Restore X to its original value, viz. #$0000, #$2000, #$4000, or #$6000, depending
        ; on which battle character this is.
C1_3DB3: ;A940 LDA #$40
    LDA #$40  ; We're going to perform this loop 64 times
C1_3DB5: ;8512 STA $12
    STA $12  ; Loop counter stored in $12

; Below was code to reverse the order of the bits in two bytes, back-to-back. However, by *interspersing*
; these tasks, each shift or rotate now does double duty, and we get our goals done in about half the time!
; (7 * 8) - 5 + 3 = 54 cycles are saved per loop iteration (of 64).
C1_3DB7: ;BDC003 LDA $03C0,X
    LDA $03C0,X  ; Starting with byte $03C0 ...
C1_3DBA: ;0A ASL A
;   ASL A  ; ... we repeatedly ASL A ...
C1_3DBB: ;6610 ROR $10
;   ROR $10  ; ... and ROR $10, with carry ...
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A  ; ... repeated 8 times ...
;   ROR $10  ; ... this has the effect of reversing the order of the bits.
C1_3DD2: ;A510 LDA $10
;   LDA $10  ; Then we take the reversed-bit result...
C1_3DD4: ;9DC003 STA $03C0,X
;   STA $03C0,X  ; ... and save it back to this memory location.
C1_3DD7: ;BDC010 LDA $10C0,X
;   LDA $10C0,X  ; Then we repeat the same thing for another memory location.
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
;   ASL A
;   ROR $10
    STA $10
    LDA $10C0,X
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    ROR $10
    ROL
    STA $03C0,X
C1_3DF2: ;A510 LDA $10
    LDA $10
C1_3DF4: ;9DC010 STA $10C0,X
    STA $10C0,X  ; ...and save it
C1_3DF7: ;E8 INX
    INX
C1_3DF8: ;C612 DEC $12
    DEC $12
C1_3DFA: ;D0BB BNE $3DB7
    BNE C1_3DB7  ; Perform this loop 64 times, for 64 bytes in each of two locations.
C1_3DFC: ;AB PLB
    PLB
C1_3DFD: ;FA PLX
    PLX
C1_3DFE: ;68 PLA
    PLA  ; Restoring a bunch of stuff from the stack
C1_3DFF: ;0A ASL A
    ASL A  ; Shifting bits by 5. I think this has to do with multiplying the character ID by
        ; 32, in order to get an offset.
C1_3E00: ;0A ASL A
    ASL A
C1_3E01: ;0A ASL A
    ASL A
C1_3E02: ;0A ASL A
    ASL A
C1_3E03: ;0A ASL A
    ASL A

load_battle_graphics_new:
    ; new code begins, in which we mess with the tile and palette values to make it
    ; so we can use the last four palette colors for battle PCs.
    PHP  ; push status register
    REP #$30  ; Make accum/index 16 bit
    .LONGA ON
    .LONGI ON
    PHA  ; push A
    PHX  ; and X
    PHY  ; and Y

; We need to know which palette entries are used by which battle sprites. Reading the whole sprite sheet takes time,
; and we should try to skip it by loading data we've pre-calculated.
; This info can be stored for all 16 battle sprites. I'm storing it in a location in SRAM that isn't a part of any
; save file and is (I'm almost certain) unused. Using this location has positive side-effects: it's retained when
; the emulator/system is turned off, but is also independent of a particular save-game.
; A negative side effect is that a "developer" has to be able to clear it when he edits his sprites.
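; Worked layout of the cache described above, assuming the constants defined just below are left
; unchanged. The addresses are computed by hand here for illustration; the assembler derives the
; real ones from the .EQU expressions.
;   $307E00-$307E2D  23 words of "palette used" bits, one per basic sprite ID 0-22   (23 * 2 = 46 bytes)
;   $307E2E-$307E4D  32 bytes, the list of cached extra sprite IDs                   (32 * 1 = 32 bytes)
;   $307E4E-$307E8D  32 words of "palette used" bits, one per extra-sprite list slot (32 * 2 = 64 bytes)
;   $307E8E          the "cache non-empty" marker byte
; Total before the marker: 46 + 32 + 64 = 142 bytes ($8E), matching SRAM_DATA_BYTES_USED.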
SRAM_START .EQU $306000  ; Start of SRAM location in snes memory map
        ; 6000-7FFF: first 8 KB of SRAM (which is entire SRAM, for this game)
BSPRITE_DATA_START .EQU SRAM_START + $1E00  ; Our data starts here, right after the end of the third/final save file
LAST_BASIC_SPRITE_OFFSET .EQU 22  ; 22 sprites are used as battle sprites in the original game. Gestahl
        ; is number 22.
NUMBER_EXTRA_SPRITES .EQU 32  ; In case someone adds new battle sprites, we'll handle this (in a very
        ; slightly less efficient way). 32 should be more than enough.
SRAM_DATA_BYTES_USED .EQU (LAST_BASIC_SPRITE_OFFSET+1)*2 + NUMBER_EXTRA_SPRITES*3
        ; If the constants are unchanged, we'll use 142 bytes of SRAM (at max)
SRAM_CACHE_NONEMPTY_BYTE .EQU BSPRITE_DATA_START + SRAM_DATA_BYTES_USED

    LDA #0
    SEP #$30
    .LONGA OFF  ; indices and accum temporarily 8 bit
    .LONGI OFF
    LDA >SRAM_CACHE_NONEMPTY_BYTE
    BNE lbg_load_sprite_id
    LDA #$FF
    STA >SRAM_CACHE_NONEMPTY_BYTE  ; This just marks that we have at least one cache entry

lbg_load_sprite_id:
    LDA $13  ; This is the character sprite ID we need
    REP #$30
    .LONGA ON  ; back to 16 bit
    .LONGI ON
    CMP #(LAST_BASIC_SPRITE_OFFSET + 1)
    BPL do_extra_sprite_offset  ; If this is a new sprite, do the slightly more complicated thing
        ; to handle that
        ; Otherwise, the data location is just based on the sprite
        ; number.
    ASL  ; Multiply by 2, since there is a word of data per sprite
    TAX  ; Transfer to X
    PHA  ; Push this onto the stack, so we can get it back later
    LDA >BSPRITE_DATA_START,X  ; Get the data from SRAM
    BEQ lbg_find_palette_entries_used_unused  ; If it's all zeros, it still needs to be calculated, so do that now.
    STA $1A  ; Otherwise, we have this data. Store it where our code can use it
    PLX  ; Pull the unneeded value of X from the stack
    JMP lbg_calculate_replacement_colors  ; Jump to where we use this info

do_extra_sprite_offset:
; After the basic sprite cache data ends, we'll have a list of extra sprite IDs which we have cached.
; We'll look through this list, and use the index to find the location of the data.
EXTRA_SPRITE_LIST_START .EQU BSPRITE_DATA_START + ( (LAST_BASIC_SPRITE_OFFSET+1)*2)
EXTRA_SPRITE_DATA_START .EQU EXTRA_SPRITE_LIST_START + NUMBER_EXTRA_SPRITES

; First check if we've already cached this data
    SEP #$20
    .LONGA OFF
    LDX #0
    TAY  ; Store ID in Y temporarily
loop_through_extra_sprite_list:
    TXA
    CMP #(NUMBER_EXTRA_SPRITES-1)  ; If we've reached the last entry (unlikely) ...
    BEQ not_yet_cached_extra_sprite  ; Store in this location (whether or not it's previously used)
    TYA  ; Load ID
    CMP >EXTRA_SPRITE_LIST_START,X  ; Check this entry for a match
    BEQ extra_sprite_entry_found
    LDA >EXTRA_SPRITE_LIST_START,X  ; load the data from memory to check if it's zero
    BEQ not_yet_cached_extra_sprite  ; ...which means we'll store a new entry here
    INX  ; Move on to the next entry
    BRA loop_through_extra_sprite_list

not_yet_cached_extra_sprite:
    TYA  ; transfer ID back to A
    STA >EXTRA_SPRITE_LIST_START,X  ; Store the ID to this location
    REP #$20
    .LONGA ON
    TXA
    ASL A  ; Double the offset stored in X (to index word data)
    CLC
    ADC #( (LAST_BASIC_SPRITE_OFFSET+1)*2 + NUMBER_EXTRA_SPRITES )  ; Add the bytes that come before the extra bytes data section
    PHA  ; Push this onto the stack to be retrieved below, as the location to save data
    BRA lbg_find_palette_entries_used_unused

extra_sprite_entry_found:
    REP #$20
    .LONGA ON
    TXA
    ASL A  ; Double the value of X, because we're now indexing word data
    TAX
    LDA >EXTRA_SPRITE_DATA_START,X  ; load the word data from the corresponding offset
    STA $1A
    JMP lbg_calculate_replacement_colors  ; Jump to where we use this info

lbg_find_palette_entries_used_unused:
; We're going to loop through every byte of graphics data we just stored in memory, checking for each palette value whether that value
; is used. When we're done, we'll have information on which palette values are used anywhere in the sprite data.
; We do this by looping through palette values 0 thru 15, and looping through the four bitplanes associated with a single set
; of pixels. In the end, we'll know whether a palette value was used anywhere in that "row" of 8 pixels.

        ; $1A-$1B will be 16 bits, each representing a palette entry, telling us which
        ; palette numbers have been used in any pixel of this sprite data.
        ; Palette entries 1, 2, 7 & 8 should always be considered used...
    LDA #%0000000011000011  ; (these are clear, black / outline, and two skin colors)
    STA $1A  ; Since these are manipulated dynamically in battle, we shouldn't mess with them.
    LDX #30  ; X holds the palette number we are checking for, times two.
        ; Indexed starting with zero, so #15 (x2 = #30) is the last. We check the last first.
    STZ $18  ; $18 will represent the number of definitely unused colors in the main palette,
        ; which can safely be replaced with expanded colors.
        ; $19 will represent the number of used colors in the expanded palette which we
        ; have found, and need to switch in to the main palette.
;   SEP #$20  ; 8 bit accumulator
;   .LONGA OFF
;   STZ $19  ; $19 will represent the number of used colors in the expanded palette which we
;        ; have found, and need to switch in to the main palette.
;   STZ $18  ; $18 will represent the number of definitely unused colors in the main palette,
;        ; which can safely be replaced with expanded colors.

lbg_check_memory_for_palette_loop:
;   REP #$20  ; 16 bit accumulator
;   .LONGA ON
    LDA $1C  ; Load into A the value originally stored in X, viz. $0000, $2000... &c
        ; depending on which of the four battle characters this is.
    CLC
    ADC #$2000  ; Point to the end of the data we just loaded for this character
    STA $10  ; Store into dp $10-$11, to use as index
        ; an added instruction and directive:
    SEP #$20  ; 8 bit accumulator
    .LONGA OFF
    LDA #$7F  ; the third byte of the pointer
    STA $12  ; the original code overwrote $13 as well here, due to being in 16-bit mode!
        ; i suspect Eggers intended 8-bit mode, given the format and comment of the
        ; previous instruction, and the upcoming, otherwise-redundant "REP #$20".
        ; now, because $13 apparently gets zeroed before being read from again below,
        ; it may well have been harmless. and chucking the "SEP #$20" and "REP #$20"
        ; surrounding this instruction pair would speed up execution a little...

load_battle_graphics_outer_byte_loop:
    REP #$20  ; 16 bit accumulator
    .LONGA ON
    LDA $10
;   AND #$000F  ; Check if this is a multiple of 16
    BIT #$000F  ; Check if this is a multiple of 16
    BNE load_battle_graphics_outer_byte_loop_2  ; if not, don't do anything special
;   LDA $10
    SEC
    SBC #16  ; If so, subtract an additional 16 (to account for the weird bitplane format)
;   STA $10
load_battle_graphics_outer_byte_loop_2:
;   DEC $10
;   DEC $10
    DEC A
    DEC A
    STA $10
    LDY #0  ; Y is the byte offset for the bitplane (0, 1, 16, or 17)
    LDA #%1111111100000010
    STA $14  ; $14 is a temporary bitmask, which gives us one bit, and shifts left with each
        ; iteration of Y. Its purpose is to give us the Nth bit stored in X.
        ; $15 is a temporary value, composed of the logical AND of all bitplanes, telling us
        ; whether each one of them has the "correct" value for the palette value we're
        ; currently checking. In other words, it tells us if a given palette value is used
        ; in the current set of pixels.
    SEP #$20  ; 8 bit accumulator
    .LONGA OFF
;   LDA #%10
;   STA $14  ; $14 is a temporary bitmask, which gives us one bit, and shifts left with each
;        ; iteration of Y. Its purpose is to give us the Nth bit stored in X.
;   LDA #%11111111  ; $15 is a temporary value, composed of the logical AND of all bitplanes, telling us
;   STA $15  ; whether each one of them has the "correct" value for the palette value we're
;        ; currently checking. In other words, it tells us if a given palette value is used
;        ; in the current set of pixels.
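; Worked example of the test the loop below performs on one row of 8 pixels. The four bitplane
; bytes are made-up values chosen for illustration, and we check palette value 5 (X = 10 = %01010,
; so bits 1-4 of X, read through the $14 mask, are 1, 0, 1, 0).
;   plane 0 (Y = 0):  %10110000   want 1 -> EOR #$00 = %10110000   $15 = %11111111 AND it = %10110000
;   plane 1 (Y = 1):  %00100000   want 0 -> EOR #$FF = %11011111   $15 = %10110000 AND it = %10010000
;   plane 2 (Y = 16): %10010000   want 1 -> EOR #$00 = %10010000   $15 = %10010000 AND it = %10010000
;   plane 3 (Y = 17): %00000000   want 0 -> EOR #$FF = %11111111   $15 = %10010000 AND it = %10010000
; $15 is still non-zero after all four planes, so value 5 is used in this row (by the two pixels
; whose bits survived). Had $15 reached zero earlier, the remaining planes would have been skipped.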
lbg_palette_check_loop:
    TXA
    AND $14  ; Find out if the Nth bit of X is a zero (N=1..4, depending on bitplane)
    BEQ lbg_palette_check_not_set
lbg_palette_check_set:  ; If the bit is a 1...
    LDA #%00000000  ; Load all 0's into 8-bit A, so EOR will do nothing
    BRA lbg_palette_check_loop_2
lbg_palette_check_not_set:  ; If the bit is a 0...
    LDA #%11111111  ; Load all 1's into 8-bit A, so EOR will flip all bits
lbg_palette_check_loop_2:
    EOR [$10],Y  ; EOR with the gfx byte, either getting its value or the logical-NOT of its value,
        ; depending on whether we "want" these bits to be on, or off.
    AND $15  ; AND with $15. The resulting byte will have 1's for only those bits/pixels which
        ; have the "right" value across all bitplanes so far tested.
    STA $15  ; And store back.
;   BEQ lbg_palette_check_bitplane_done  ; If all bits are already zero, we don't have to check the rest of the bitplanes.
    BEQ lbg_palette_check_bitplane_done_2  ; If all bits are already zero, we don't have to check the rest of the bitplanes.
    ASL $14  ; shift the bitmask left, so we check the next bit next iteration
    INY  ; move on to the next bitplane
    TYA
    CMP #18  ; check if we're done with all four bitplanes
;   BEQ lbg_palette_check_bitplane_done
    BEQ lbg_palette_check_bitplane_done_1b
    CMP #2  ; If Y increments past 1, we want to jump straight to 16, to account for the weird
    BNE lbg_palette_check_loop  ; offsets of the four bitplanes
    LDY #16
    BRA lbg_palette_check_loop

lbg_palette_check_bitplane_done:
;   SEP #$20
;   .LONGA OFF
        ; I believe this and the "BEQ lbg_palette_check_bitplane_done_2" have been orphaned by my lengthened, more targeted
        ; branches above. :'(
    LDA $15  ; Load the data for this row of pixels
;   REP #$20  ; 16 bit accumulator . This was a space-saving move (replacing two other
        ; "REP #$20"s), but nixed when lengthening the branch after "STA $15".
;   .LONGA ON
    BEQ lbg_palette_check_bitplane_done_2  ; If all are 0, the palette color was not found in any of these 8 bits, so do not set
        ; Otherwise, it was found, and we should set it as "used"
lbg_palette_check_bitplane_done_1b:
    REP #$20  ; 16 bit accumulator
    .LONGA ON
;   LDA $1A  ; Load the cumulative "palette used" data, 16 bits
;   ORA >BITMASK_LONG,X  ; Use the appropriate bitmask to set the bit
;   STA $1A  ; Store the new value, with the bit turned on
    LDA >BITMASK_LONG,X  ; Use the appropriate bitmask to set the bit
    TSB $1A  ; Enable it in the cumulative "palette used" data, 16 bits
    TXA
    CMP #24  ; Check if this is one of the expanded palette numbers
;   BMI all_gfx_bytes_checked_done  ; If not, go ahead, we're finished here
    BMI all_gfx_bytes_checked_done_2b  ; If not, go ahead, we're finished here
    SEP #$20
    .LONGA OFF
    INC $19  ; If so, increment the counter for used expanded colors before continuing
;   BRA all_gfx_bytes_checked_done  ; Since we just found this palette #, no need to keep searching
    BRA all_gfx_bytes_checked_done_2  ; Since we just found this palette #, no need to keep searching

lbg_palette_check_bitplane_done_2:
    REP #$20  ; 16 bit accumulator
    .LONGA ON
    LDA $10  ; Load the data pointer
    CMP $1C  ; Have we reached the start of this character's gfx data?
    BNE load_battle_graphics_outer_byte_loop  ; If not, then loop

all_gfx_bytes_checked_done:  ; After we've checked all bytes for a particular palette value...
    REP #$20  ; 16 bit accumulator
    .LONGA ON
    TXA
    CMP #24  ; Check if this was an expanded palette color
;   BPL all_gfx_bytes_checked_done_2  ; If so, continue....
    BPL all_gfx_bytes_checked_done_2b  ; If so, continue....
    LDA $1A  ; If a main palette color, check if it was set
    AND >BITMASK_LONG,X  ; If so, continue...
;   BNE all_gfx_bytes_checked_done_2
    BNE all_gfx_bytes_checked_done_2b
    SEP #$20
    .LONGA OFF
    INC $18  ; If an unused main palette color, increment the number of unused palette numbers

all_gfx_bytes_checked_done_2:
    REP #$20  ; 16 bit accumulator
    .LONGA ON
all_gfx_bytes_checked_done_2b:  ; entry point without the CPU flag write
    DEX  ; Decrement X twice, so we check the previous palette #
    DEX
;   TXA
;   CMP #0
    BEQ lbg_palette_check_completely_done  ; If X is 0 (which we don't need to check) we are done.
    LDA $1A  ; Check if the new X represents a palette color that is already "used"
    AND >BITMASK_LONG,X  ; (This will happen for colors we set to 1 at the beginning, because they are
        ; manipulated by the gfx engine and we don't want to mess with them)
;   BNE all_gfx_bytes_checked_done_2  ; If so, decrement again
    BNE all_gfx_bytes_checked_done_2b  ; If so, decrement again
    TXA
    CMP #24  ; Check if this was an expanded palette color
    BPL all_gfx_bytes_checked_continue_loop  ; If so, just continue....
    SEP #$20
    .LONGA OFF
    LDA $18  ; If this is a normal palette number...
    CMP $19  ; Check the used expanded against the unused normal...
    BNE all_gfx_bytes_checked_continue_loop  ; If not equal, we can't stop yet.

lbg_palette_check_fill_with_ones_pre:
    TDC  ; start with an empty list of used colors to add
    REP #$20  ; 16 bit accumulator
    .LONGA ON
lbg_palette_check_fill_with_ones:
;   REP #$20  ; 16 bit accumulator
;   .LONGA ON
;   LDA $1A  ; But if equal, we can stop.
;   ORA >BITMASK_LONG,X  ; Mark all remaining colors as "used", since we don't need them.
;   STA $1A
    ORA >BITMASK_LONG,X  ; Mark all remaining colors as "used", since we don't need them.
        ; But if equal, we can stop. (to ???)
    DEX
    DEX
;   TXA
    BPL lbg_palette_check_fill_with_ones  ; Loop until we reach 0
    TSB $1A  ; Add our marked remaining colors to list of "used" colors
    BRA lbg_palette_check_completely_done

all_gfx_bytes_checked_continue_loop:
    REP #$20  ; 16 bit accumulator
    .LONGA ON
    JMP lbg_check_memory_for_palette_loop

lbg_palette_check_completely_done:  ; Now we're completely done with all bytes of graphics data
;   REP #$20
;   .LONGA ON
    LDA $1A
    PLX  ; Pull the index into SRAM we pushed earlier
    STA >BSPRITE_DATA_START,X  ; Save this into SRAM, so we won't have to get it again.

; $13, $14, $15 and $16 (single-byte values) will store data on which bitmap / palette values to switch. The low 4-bits should contain
; the original value (found in the first 12 entries, but unused), and the high 4-bits should contain the switched value (found in the
; last 4 entries, but used)
lbg_calculate_replacement_colors:
;   SEP #$20  ; 8-bit accumulator
;   .LONGA OFF
;   LDA #0
;   STA $13  ; initialize the four replacements to zero
;   STA $14
;   STA $15
;   STA $16
    STZ $13  ; initialize the four replacements in $13 - $16 to zero
    STZ $15
;   REP #$20  ; 16-bit accumulator
;   .LONGA ON
lbg_find_unused_color_pre:
    LDX #0  ; X will be used to count 1-12 (word data) for the first 12 palette entries
lbg_find_unused_color:
    LDA $1A
    AND >BITMASK_LONG,X  ; Check against the appropriate bitmask, to see if the color is used
    BEQ lbg_find_expanded_color_pre  ; If zero (unused), look for a replacement in palette numbers 13-16
lbg_find_unused_color_resume:
    INX
    INX  ; If not, move to the next palette entry
    TXA
    CMP #24  ; Check if we've reached the end of the first 12 (non-expanded colors)
    BNE lbg_find_unused_color  ; If not, continue to check
    JMP lbg_done_replacement_colors  ; If so, there are no more unused colors, and we're done.
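; Worked example of how one of the replacement bytes in $13-$16 gets packed by the code below.
; The two indices are hypothetical values picked for illustration; palette numbers here are
; counted from zero.
;   Y = 6   (unused main colour, palette number 6 / 2 = 3)
;   X = 26  (used expanded colour, palette number 26 / 2 = 13 = $0D)
;   TXA / LSR            ->  A = $0D
;   ASL / ASL / ASL / ASL ->  A = $D0   (expanded number in the high four bits)
;   STA $13              ->  $13 = $D0
;   TYA / LSR            ->  A = $03
;   TSB $13              ->  $13 = $D3  (original number in the low four bits)
; The high nibble is always at least $C, so a filled slot is never zero, which is what the
; "LDA $13 / BNE" style checks rely on to find the next free slot.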
lbg_find_expanded_color_pre:
    TXY  ; Store the original (1-12) color in Y temporarily
    LDX #24  ; now X will be used to index the "expanded" colors
lbg_find_expanded_color:
    LDA $1A
    AND >BITMASK_LONG,X  ; Check against the appropriate bitmask, to see if the color is used
    BNE lbg_set_replacement  ; If non-zero (used), store the replacement
    INX
    INX  ; else, move to the next palette entry
    TXA
    CMP #32  ; Check if we've reached the last palette entry
    BNE lbg_find_expanded_color  ; If not, continue to loop through the expanded colors
;   JMP lbg_done_replacement_colors  ; If so, there are no more used expanded colors, and we are done
    BRA lbg_done_replacement_colors  ; If so, there are no more used expanded colors, and we are done

lbg_set_replacement:
    SEP #$30  ; 8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF
lbg_replace_in_13:
    LDA $13
    BNE lbg_replace_in_14  ; If it's != 0 (value already set), check the next one
    TXA
    LSR  ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL  ; Shift left by 4, because we want to store this in the higher four bits
    STA $13
    TYA
    LSR  ; Divide Y (the original palette index) by 2, to get the palette number
;   ORA $13  ; Transfer the high bits we already stored
;   STA $13  ; Store the total number
    TSB $13  ; Combine with the high bits we already stored
;   JMP lbg_replace_done
    BRA lbg_replace_done
lbg_replace_in_14:
    LDA $14
    BNE lbg_replace_in_15  ; If it's != 0 (value already set), check the next one
    TXA
    LSR  ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL  ; Shift left by 4, because we want to store this in the higher four bits
    STA $14
    TYA
    LSR  ; Divide Y (the original palette index) by 2, to get the palette number
;   ORA $14  ; Transfer the high bits we already stored
;   STA $14  ; Store the total number
    TSB $14  ; Combine with the high bits we already stored
;   JMP lbg_replace_done
    BRA lbg_replace_done
lbg_replace_in_15:
    LDA $15
    BNE lbg_replace_in_16  ; If it's != 0 (value already set), check the next one
    TXA
    LSR  ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL  ; Shift left by 4, because we want to store this in the higher four bits
    STA $15
    TYA
    LSR  ; Divide Y (the original palette index) by 2, to get the palette number
;   ORA $15  ; Transfer the high bits we already stored
;   STA $15  ; Store the total number
    TSB $15  ; Combine with the high bits we already stored
;   JMP lbg_replace_done
    BRA lbg_replace_done
lbg_replace_in_16:
    LDA $16
    BNE lbg_done_replacement_colors  ; If it's != 0 (value already set), we can't do any more replacements
    TXA
    LSR  ; Divide X (the expanded palette index) by 2, to get the palette number
    ASL
    ASL
    ASL
    ASL  ; Shift left by 4, because we want to store this in the higher four bits
    STA $16
    TYA
    LSR  ; Divide Y (the original palette index) by 2, to get the palette number
;   ORA $16  ; Transfer the high bits we already stored
;   STA $16  ; Store the total number
    TSB $16  ; Combine with the high bits we already stored
;   JMP lbg_replace_done

lbg_replace_done:
    REP #$30  ; 16-bit accumulator/index
    .LONGA ON
    .LONGI ON
    LDA >BITMASK_LONG,X  ; Get the bitmask for the replacement color
;   EOR $1A  ; Switch this value from used to unused, so we don't make the same replacement
;   STA $1A
    TRB $1A  ; Switch this value from used to unused, so we don't make the same
        ; replacement.
        ; NOT necessarily identical to previous code, but since we know the $1A bit
        ; was originally set, it'll work correctly.
;   PHX
    TYX
    LDA >BITMASK_LONG,X  ; Get the bitmask for the original color
;   EOR $1A  ; Switch this value from unused to used, so we don't replace the same one
;   STA $1A
    TSB $1A  ; Switch this value from unused to used, so we don't replace the same one.
        ; NOT necessarily identical to previous code, but since we know the $1A bit
        ; was originally clear, it'll work correctly.
;   PLX
;   JMP lbg_find_unused_color_pre  ; Return to the beginning of the loop
    JMP lbg_find_unused_color_resume  ; Resume checking for unused non-expanded palettes

lbg_done_replacement_colors:

;======================
; This part of the code goes through each pixel of the sprite, checks if it is one that we need to change, and changes it
; if necessary.
lbg_pixel_changing_loop_pre:
    SEP #$30  ; 8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF
    LDA $13  ; First, check if we found ANY colors we need to switch
;   AND #$00FF  ; look only at bottom byte
;        ; (was added instruction)
    BNE lbg_pixel_changing_loop_pre_2  ; If we did find any, continue
    JMP lbg_bitmap_loop_done  ; Otherwise, skip this whole thing
lbg_pixel_changing_loop_pre_2:
    LDA #$7F  ; the third byte of the pointer
    STA $12
    REP #$30  ; 16-bit accumulator/index
    .LONGA ON
    .LONGI ON
    LDA $1C  ; Load into A the value originally stored in X, viz. $0000, $2000... &c
        ; depending on which of the four battle characters this is.
    CLC
    ADC #$2000  ; Point to the end of the data we loaded for this character
    STA $10  ; Store into dp $10-$11, to use as index
;   SEP #$30  ; 8-bit accumulator/index
;   .LONGA OFF
;   .LONGI OFF
;   LDA #$7F  ; the third byte of the pointer
;   STA $12
;   REP #$30  ; 16-bit accumulator/index
;   .LONGA ON
;   .LONGI ON

lbg_bitmap_outer_byte_loop:
;   REP #$30  ; 16-bit accumulator/index
;   .LONGA ON
;   .LONGI ON
    LDA $10
;   AND #$000F  ; Check if this is a multiple of 16
    BIT #$000F  ; Check if this is a multiple of 16
    BNE lbg_bitmap_outer_byte_loop_2  ; if not, don't do anything special
;   LDA $10
    SEC
    SBC #16  ; If so, subtract an additional 16 (to account for the weird bitplane format)
;   STA $10
lbg_bitmap_outer_byte_loop_2:
;   DEC $10
;   DEC $10
    DEC A
    DEC A
    STA $10
    SEP #$30  ; 8-bit accumulator/index
    .LONGA OFF
    .LONGI OFF
    LDX #0  ; X holds an offset for the "color to switch" information. This information is stored
        ; in direct page memory $13-$16, and X indexes into these values.
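; Worked trace of how the pointer in $10 walks backwards through the graphics, assuming this is
; the first battle character (so $1C = $0000 and the walk starts at $0000 + $2000 = $2000).
;   $2000  multiple of 16  ->  SBC #16 gives $1FF0  ->  DEC A / DEC A gives $1FEE
;          (this row's planes 0/1 are at $1FEE/$1FEF, its planes 2/3 at +16/+17 = $1FFE/$1FFF)
;   $1FEE  not a multiple  ->  $1FEC, then $1FEA, ... down to $1FE0   (8 rows of one 32-byte tile)
;   $1FE0  multiple of 16  ->  $1FD0  ->  $1FCE                        (first row visited in the tile below)
;   ...
;   $0002  not a multiple  ->  $0000, which equals $1C once that row is processed, so the loop ends.
; Each pass handles one row of 8 pixels, so 256 tiles * 8 rows = 2048 passes cover the $2000 bytes.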
lbg_bitmap_bitplane_loop_start:
    LDY #17  ; Y holds the bitplane (0, 1, 16, or 17). We start at 17 and decrement, for efficiency
        ; reasons.
    LDA #%11111111  ; $18 holds a cumulative logical-AND of all the bitplanes we've checked so far,
    STA $18  ; telling us which bits match the palette number we're looking for
    LDA #%10000000  ; $19 holds a bitmask telling us which bit to check for equality in the "color to
    STA $19  ; switch" data. Starts at #%10000000 because the "extended" palette number is in the
        ; high bits. Shifts right when Y decreases.
lbg_bitmap_check_replacement:
    LDA >$13,X  ; Load dp byte 13+X, which tells us colors to switch
;   BNE lbg_bitmap_check_replacement_cont  ; Only continue if there is a value set here
;   JMP lbg_bitmap_outer_byte_done  ; If it's all zero, there are no more replacements to check
    BEQ lbg_bitmap_outer_byte_done  ; If it's all zero, there are no more replacements to check.
        ; Only continue if there is a value set here.
lbg_bitmap_check_replacement_cont:
    AND $19  ; Use the bitmask to check the value of the Nth bit, which corresponds to the Yth
        ; bitplane
    BEQ lbg_bitmap_check_replacement_not_set  ; If it's zero, we are looking for bits NOT set
lbg_bitmap_check_replacement_set:
    LDA #%00000000  ; Load all 0's, so an EOR will do nothing
    BRA lbg_bitmap_check_replacement_2
lbg_bitmap_check_replacement_not_set:
    LDA #%11111111  ; Load all 1's, so an EOR will flip all bits
lbg_bitmap_check_replacement_2:
    EOR [$10],Y  ; EOR with the gfx byte, current bitplane. We either get the byte, or the negation
        ; of the byte. This tells us whether each bit's value is "right".
    AND $18  ; AND with the cumulative check from previous bitplanes.
    BEQ lbg_bitmap_bitplane_loop_done  ; If all bits are 0, there are no matches in this row of gfx data.
    STA $18  ; Store this value back
;   TYA
;   CMP #0
    CPY #0
    BEQ lbg_bitmap_match_found  ; If we've reached the lowest bitplane, and there are still 1's in $18, we have
        ; matching pixels.
lbg_bitmap_check_replacement_3:  ; If we haven't reached the lowest bitplane, continue...
;   CMP #16
    CPY #16
    BNE lbg_bitmap_check_replacement_4
;   LDY #1  ; If y is currently 16, we want to skip down to 1, due to the format of gfx data
;   BRA lbg_bitmap_check_replacement_5
    LDY #2  ; If y is currently 16, we want to skip down to 1, due to the format of gfx data
lbg_bitmap_check_replacement_4:
    DEY  ; Decrement (from previous index value, or from 2)
lbg_bitmap_check_replacement_5:
    LSR $19  ; Finally, shift the bitmask in $19 right
;   JMP lbg_bitmap_check_replacement  ; And reiterate loop
    BRA lbg_bitmap_check_replacement  ; And reiterate loop

lbg_bitmap_match_found:  ; If some of the bits in $18 are still on after checking all bitplanes, a match
        ; Now switch the matching bits in all bitplanes
;   LDY #0  ; Y holds the bitplane offset
        ; this is only reached when Y was already 0
    LDA #1
    STA $19  ; $19 holds a bitmask
lbg_bitmap_match_found_loop:
    LDA >$13,X  ; load the "color to replace" data
    AND $19  ; check the appropriate bit with the bitmask
    BEQ lbg_bitmap_match_found_loop_not_set
lbg_bitmap_match_found_loop_set:  ; If the bit is supposed to be 1
    LDA [$10],Y  ; Load the data for this bitplane
    ORA $18  ; Turn all the "matched" bits on
    STA [$10],Y  ; Store it back
    BRA lbg_bitmap_match_found_loop_2  ; And continue
lbg_bitmap_match_found_loop_not_set:  ; If the bit is supposed to be 0
    LDA #%11111111
    EOR $18  ; Load the logical negation of the "matched" bits
    AND [$10],Y  ; AND this with the data, setting all the matched bits to 0
    STA [$10],Y  ; store it back
        ; and continue
lbg_bitmap_match_found_loop_2:
    INY
;   TYA
;   CMP #18
    CPY #18
    BEQ lbg_bitmap_bitplane_loop_done  ; If we've surpassed 17, we're done
;   CMP #2
    ASL $19  ; Shift the bitmask
    CPY #2
;   BNE lbg_bitmap_match_found_loop_3  ; If we've just surpassed offset 1, skip to offset 16
    BNE lbg_bitmap_match_found_loop  ; Continue looping normally, unless we've just surpassed ...
    LDY #16  ; ...
offset 1, in which case we skip to offset 16 first lbg_bitmap_match_found_loop_3: ; ASL $19 ; Shift the bitmask ; JMP lbg_bitmap_match_found_loop ; continue to loop BRA lbg_bitmap_match_found_loop ; continue to loop lbg_bitmap_bitplane_loop_done: INX ; Increment the offset of the "color to switch" data ; TXA ; CMP #4 ; Check if we've done all four bytes of data CPX #4 ; Check if we've done all four bytes of data ; BEQ lbg_bitmap_outer_byte_done ; If so, we're done with this loop ; JMP lbg_bitmap_bitplane_loop_start ; Otherwise, continue BNE lbg_bitmap_bitplane_loop_start ; if not, continue with this loop lbg_bitmap_outer_byte_done: REP #$30 ; 16 bit accumulator/index .LONGA ON .LONGI ON LDA $10 ; Load the data pointer CMP $1C ; Have we reached the start of this character's gfx data? ; BEQ lbg_bitmap_loop_done BEQ lbg_bitmap_loop_done_2 JMP lbg_bitmap_outer_byte_loop ; If not, then loop lbg_bitmap_loop_done: REP #$30 ; 16 bit accumulator/index .LONGA ON .LONGI ON lbg_bitmap_loop_done_2: PLY PLX PLA ; Pull everything we pushed onto the stack PLP ;====================== ; Finally, first section of new code is done, and we pick up where we left off .LONGA OFF C1_3E04: ;DA PHX PHX C1_3E05: ;AA TAX TAX C1_3E06: ;BDAE2E LDA $2EAE,X LDA $2EAE,X ; This checks the character ID... C1_3E09: ;C90E CMP #$0E CMP #$0E ; I believe this section has something to do with handling the palette in the ; special case where the character has imp status, and possibly other special status. C1_3E0B: ;D012 BNE $3E1F BNE C1_3E1F ; In any case, we skip it in the general case... 
C1_3E0D: ;BDC62E LDA $2EC6,X
        LDA $2EC6,X
C1_3E10: ;C901 CMP #$01
        CMP #$01
C1_3E12: ;D00B BNE $3E1F
        BNE C1_3E1F
C1_3E14: ;ADA01E LDA $1EA0
        LDA $1EA0
C1_3E17: ;2908 AND #$08
        AND #$08
C1_3E19: ;F004 BEQ $3E1F
        BEQ C1_3E1F
C1_3E1B: ;FA PLX
        PLX
C1_3E1C: ;7B TDC
        TDC
C1_3E1D: ;8005 BRA $3E24
        BRA C1_3E24
C1_3E1F: ;FA PLX
        PLX
C1_3E20: ;BF2BCEC2 LDA $C2CE2B,X
        LDA >$C2CE2B,X          ; $C2CE2B: Battle Character Palette Assignments (1 byte each)
C1_3E24: ;C220 REP #$20
        REP #$20                ; Accumulator back to 16 bit
        .LONGA ON
C1_3E26: ;0A ASL A
        ASL A                   ; Again ASL 5 times. A is the character battle palette assignment. This means we will
C1_3E27: ;0A ASL A              ; multiply that number by 32 when using it as an offset.
        ASL A
C1_3E28: ;0A ASL A
        ASL A
C1_3E29: ;0A ASL A
        ASL A
C1_3E2A: ;0A ASL A
        ASL A
C1_3E2B: ;AA TAX                ; store the offset in X
        TAX
        STA $1A                 ; save it for later. added instruction.
C1_3E2C: ;7B TDC
        TDC
C1_3E2D: ;E220 SEP #$20
        SEP #$20                ; Long A off again
        .LONGA OFF
C1_3E2F: ;68 PLA
        PLA
C1_3E30: ;0A ASL A
        ASL A
C1_3E31: ;0A ASL A
        ASL A
C1_3E32: ;0A ASL A
        ASL A
C1_3E33: ;0A ASL A
        ASL A
C1_3E34: ;0A ASL A
        ASL A
C1_3E35: ;A8 TAY
        TAY
C1_3E36: ;5A PHY
        PHY
C1_3E37: ;A918 LDA #$18
        LDA #$18                ; Load the number #$18 -- 24 in decimal. Probably 24 bytes for 12 two-byte palette colors.
C1_3E39: ;8510 STA $10
        STA $10                 ; $10 is being used as the loop index
C1_3E3B: ;BF0063ED LDA $ED6300,X
        LDA >$ED6300,X
C1_3E3F: ;99AD81 STA $81AD,Y
        STA $81AD,Y
C1_3E42: ;E8 INX
        INX
C1_3E43: ;C8 INY
        INY
C1_3E44: ;C610 DEC $10
        DEC $10
C1_3E46: ;D0F3 BNE $3E3B
        BNE C1_3E3B

; new code for switching the palette assignments
lbg_new_switch_palette:
        PHP                     ; push status register
        REP #$30
        .LONGA ON
        .LONGI ON
        PHA                     ; push A
        PHX                     ; and X
        PHY                     ; and Y
; X and Y have both been incremented #$18 (24) times, and we want to restore them to their original state
;       TXA
;       SEC
;       SBC #$18
;       STA $1A                 ; this one's now dealt with above in Square-Land, no math needed.
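; ---- Worked example (illustrative only; the register values below are made up) ----
; A sketch of the offset arithmetic applied to each of the four bytes in $13-$16.
; It assumes the switch byte is %11000011 ($C3), that $1A (X's starting value) is
; $0040, and that $1C (Y's starting value) is $0020:
;
;   source: $C3 AND #$00F0 = $00C0, LSR x3 = $0018 (12 * 2), + $1A = $0058
;           so the color word is read from $ED6300 + $0058 = $ED6358
;   dest:   $C3 AND #$000F = $0003, ASL    = $0006 ( 3 * 2), + $1C = $0026
;           so it is written to $81AD + $0026 = $81D3 in the current data bank
;
; In other words, under these assumptions color 3 of the palette copy in RAM is
; overwritten with color 12 of the palette in ROM.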
        TYA
        SEC
        SBC #$18
        STA $1C
        LDA $13                 ; load the first switcheroo, located in $13
;       AND #%0000000011111111  ; we want to use only the low byte
        AND #%0000000011110000  ; isolate the upper nibble of the low byte, which holds the "expanded palette" value
        LSR
        LSR
        LSR                     ; get "expanded palette" * 2 (as it's a word per color), and a cleared Carry
;       LSR                     ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
;       ASL                     ; shift left, since it's a word per color
;       CLC
        ADC $1A
        TAX
        LDA $13
        AND #%00001111          ; use only the low bits, which have the "original palette" information
        ASL                     ; again, multiply by 2. This also clears Carry.
;       CLC
        ADC $1C
        TAY
;       SEP #$20                ; Long A off
;       .LONGA OFF
        LDA >$ED6300,X          ; load the expanded colors...
        STA $81AD,Y             ; ...and replace the colors at the correct locations
;       INX
;       INY
;       LDA >$ED6300,X          ; load the expanded color...
;       STA $81AD,Y             ; ...and replace the color at the correct location
;       REP #$20                ; Long A on again
;       .LONGA ON

; Repeat for $14
        LDA $14                 ; now the one located in $14
;       AND #%0000000011111111  ; we want to use only the low byte
        AND #%0000000011110000  ; isolate the upper nibble of the low byte, which holds the "expanded palette" value
        LSR
        LSR
        LSR                     ; get "expanded palette" * 2 (as it's a word per color), and a cleared Carry
;       LSR                     ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
;       ASL                     ; shift left, since it's a word per color
;       CLC
        ADC $1A                 ; Add to this the "starting" value of X, stored earlier
        TAX
        LDA $14
        AND #%00001111          ; use only the low bits, which have the "original palette" information
        ASL                     ; again, multiply by 2. This also clears Carry.
;       CLC
        ADC $1C                 ; Add the "starting" value of Y
        TAY
;       SEP #$20                ; Long A off
;       .LONGA OFF
        LDA >$ED6300,X          ; load the expanded colors...
        STA $81AD,Y             ; ...and replace the colors at the correct locations
;       INX
;       INY
;       LDA >$ED6300,X          ; load the expanded color...
;       STA $81AD,Y             ; ...and replace the color at the correct location
;       REP #$20                ; Long A on again
;       .LONGA ON

; Repeat for $15
        LDA $15                 ; now the one located in $15
;       AND #%0000000011111111  ; we want to use only the low byte
        AND #%0000000011110000  ; isolate the upper nibble of the low byte, which holds the "expanded palette" value
        LSR
        LSR
        LSR                     ; get "expanded palette" * 2 (as it's a word per color), and a cleared Carry
;       LSR                     ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
;       ASL                     ; shift left, since it's a word per color
;       CLC
        ADC $1A                 ; Add to this the "starting" value of X, stored earlier
        TAX
        LDA $15
        AND #%00001111          ; use only the low bits, which have the "original palette" information
        ASL                     ; again, multiply by 2. This also clears Carry.
;       CLC
        ADC $1C                 ; Add the "starting" value of Y
        TAY
;       SEP #$20                ; Long A off
;       .LONGA OFF
        LDA >$ED6300,X          ; load the expanded colors...
        STA $81AD,Y             ; ...and replace the colors at the correct locations
;       INX
;       INY
;       LDA >$ED6300,X          ; load the expanded color...
;       STA $81AD,Y             ; ...and replace the color at the correct location
;       REP #$20                ; Long A on again
;       .LONGA ON

; Repeat for $16
        LDA $16                 ; now the one located in $16
;       AND #%0000000011111111  ; we want to use only the low byte
        AND #%0000000011110000  ; isolate the upper nibble of the low byte, which holds the "expanded palette" value
        LSR
        LSR
        LSR                     ; get "expanded palette" * 2 (as it's a word per color), and a cleared Carry
;       LSR                     ; shift right 4 times, to get the "expanded palette" value stored in the upper bits
;       ASL                     ; shift left, since it's a word per color
;       CLC
        ADC $1A                 ; Add to this the "starting" value of X, stored earlier
        TAX
        LDA $16
        AND #%00001111          ; use only the low bits, which have the "original palette" information
        ASL                     ; again, multiply by 2. This also clears Carry.
;       CLC
        ADC $1C                 ; Add the "starting" value of Y
        TAY
;       SEP #$20                ; Long A off
;       .LONGA OFF
        LDA >$ED6300,X          ; load the expanded colors...
        STA $81AD,Y             ; ...and replace the colors at the correct locations
;       INX
;       INY
;       LDA >$ED6300,X          ; load the expanded color...
;       STA $81AD,Y             ; ...and replace the color at the correct location
;       REP #$20                ; Long A on again
;       .LONGA ON
        PLY
        PLX
        PLA
        PLP
; done with palette switching code
C1_3E48: ;FA PLX
        PLX
C1_3E49: ;FEC461 INC $61C4,X
        INC $61C4,X
C1_3E4C: ;60 RTS
        RTL                     ; changed to rtl from rts

lbg_data_shifted_bit:
        .WORD %1, %10, %100, %1000, %10000, %100000, %1000000, %10000000, %100000000, %1000000000, %10000000000, %100000000000, %1000000000000, %10000000000000, %100000000000000, %1000000000000000

BITMASK .EQU lbg_data_shifted_bit
BITMASK_LONG .EQU (MY_DATA_BANK*$10000)+(BITMASK-bank_beginning)

; After this, the new code to paste into C3
        .REPEAT 64
        NOP
        .ENDR                   ; NOPs just for a visible separator between sections of code

; Here's the code that will clear the gfx cache with a press of L+R+SELECT
; We should JSR here from C3/1DA4. The first command of that function (which happens to be a JSR)
; should be modified to jump here, to this new function. I am assuming that this new code will
; be located in bank C3 with the old code. So these will be short jumps.
; I pasted this to C3/F0A0
main_menu_cache_clear_code:
        .LONGA OFF
        JSR $3548               ; This is just the first line of the original function, which we've moved here
        LDA $04                 ; We're just checking three bytes of memory that store button presses
        AND #$30                ; To see if the L+R+SELECT combination is pressed
        CMP #$30
        BNE cache_clear_go_home
        LDA $05
        AND #$20
        CMP #$20
        BNE cache_clear_go_home
        LDA $06
        AND #$30
        CMP #$30
        BNE cache_clear_go_home
        LDA >SRAM_CACHE_NONEMPTY_BYTE
        BEQ cache_clear_go_home ; If this byte is zero, it means there is nothing cached yet. Go home.
        PHX                     ; If all the buttons were pressed, loop through the SRAM cache
        LDX #0                  ; Zeroing everything
clear_cache_loop:
        LDA #0
        STA >BSPRITE_DATA_START,X
        INX
        TXA
        CMP #SRAM_DATA_BYTES_USED
        BNE clear_cache_loop
        LDA #0
        STA >SRAM_CACHE_NONEMPTY_BYTE
        PLX
        JSR $0EB9               ; Play the "healing" sound
cache_clear_go_home:
        RTS
        .END