For the last couple of years I’ve been working (on and off) on reverse engineering a DOS game from my childhood - Maupiti Island.
I still haven’t made enough progress, but the other evening I was playing around with the disassembler Cutter, just to see what the newer versions are like, and noticed it was showing me a group of strings that Ghidra and IDA didn’t.
These strings looked like debug symbols, especially since they mention source code files.
And… turns out they actually are!
Googling around, I found that debug information for MZ executables was usually stuck at the end of the file, and this Stack Overflow answer pointed me to a page that had the structure of the debug information header produced by Borland tools.
Format of Borland debugging information header (following load image):

Offset  Size     Description
00h     WORD     signature 52FBh
02h     WORD     version ID
04h     DWORD    size of name pool in bytes
08h     WORD     number of names in name pool
0Ah     WORD     number of type entries
0Ch     WORD     number of structure members
0Eh     WORD     number of symbols
10h     WORD     number of global symbols
12h     WORD     number of modules
14h     WORD     number of locals (optional)
16h     WORD     number of scopes in table
18h     WORD     number of line-number entries
1Ah     WORD     number of include files
1Ch     WORD     number of segment records
1Eh     WORD     number of segment/file correlations
20h     DWORD    size of load image after removing uninitialized data and debug information
24h     DWORD    debugger hook; pointer into debugged program whose meaning depends on program flags
28h     BYTE     program flags
                   bit 0: case-sensitive link
                   bit 1: pascal overlay program
29h     WORD     no longer used
2Bh     WORD     size of data pool in bytes
2Dh     BYTE     padding
2Eh     WORD     size of following header extension (currently 00h, 10h, or 20h)
30h     WORD     number of classes
32h     WORD     number of parents
34h     WORD     number of global classes (currently unused)
36h     WORD     number of overloads (currently unused)
38h     WORD     number of scope classes
3Ah     WORD     number of module classes
3Ch     WORD     number of coverage offsets
3Eh     DWORD    offset relative to symbol base of name pool
42h     WORD     number of browser information records
44h     WORD     number of optimized symbol records
46h     WORD     debugging flags
48h     8 BYTEs  padding

Note: additional information on the Borland debugging info may be found in Borland's Open Architecture Handbook
SeeAlso: #01600
Pretty well documented. And the extra data at the end of the game’s executable did start with the 52FBh magic number!
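For anyone who wants to check their own executable, here is a rough Python sketch of that step. It is not the 010 Editor template, just an illustration: it works out where the load image ends from the MZ header's page counts, then reads the first 30h bytes of the table above and checks the signature. The function names and the dictionary keys are made up for this example, and it assumes the debug information starts immediately after the load image and that no header extension fields are needed.

```python
import struct

# Base header, offsets 00h..2Fh from the table above, little-endian, unaligned.
BASE_HEADER = struct.Struct("<HHI12HIIBHHBH")
SIGNATURE = 0x52FB

# Illustrative key names (not from any Borland documentation).
FIELD_NAMES = (
    "signature", "version", "name_pool_size",
    "names", "types", "members", "symbols", "globals", "modules",
    "locals", "scopes", "line_numbers", "include_files",
    "segments", "correlations",
    "image_size", "debugger_hook",
    "program_flags", "unused", "data_pool_size", "padding",
    "extension_size",
)


def mz_image_end(data):
    """Return the file offset just past the MZ load image."""
    if data[:2] not in (b"MZ", b"ZM"):
        raise ValueError("not an MZ executable")
    # Offset 02h: bytes used on the last 512-byte page; 04h: page count.
    last_page_bytes, pages = struct.unpack_from("<HH", data, 2)
    if pages == 0:
        raise ValueError("MZ header reports zero pages")
    if last_page_bytes == 0:
        return pages * 512
    return (pages - 1) * 512 + last_page_bytes


def parse_borland_debug_header(data):
    """Return (offset, fields) for the debug header after the load image."""
    offset = mz_image_end(data)
    if offset + BASE_HEADER.size > len(data):
        raise ValueError("no data after the load image")
    fields = dict(zip(FIELD_NAMES, BASE_HEADER.unpack_from(data, offset)))
    if fields["signature"] != SIGNATURE:
        raise ValueError("Borland debug signature 52FBh not found")
    return offset, fields
```

To try it, read the whole executable with `open(path, "rb").read()` and pass the bytes to `parse_borland_debug_header`. The counts it returns (symbols, modules and so on) say how many records of each kind follow, but parsing those records is outside this sketch.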
After playing around with it for an evening, I got to a point where I can see all the symbol debug information for functions and globals!
I’ll probably write a Ghidra script to use this info, but for now I’m happy to manually modify the labels when I need to.
If anyone is interested in the template for 010 Editor, it is here. It’s not a full MZ template, as it only parses whatever is needed to parse Borland’s debug information.
The debug info in the game isn’t complete, but it had enough to clarify some things I was struggling with, and probably saved me a lot of time.
Hope you find this useful :)