Sherlock, and the Symtab library upon which it is based, are intended as diagnostic tools to aid software development on RISC OS, particularly for debugging awkward failures on released software/OS builds.
The linker included in the Norcroft toolchain has always had the capacity to emit a list of symbol addresses/offsets alongside the generated binary code and this can be invaluable when mapping the raw address of a failure into a location within the source/data of a program. This has always been a laborious manual process, however, which the Symtab library aims to simplify greatly.
The library concerns itself with the loading, parsing and searching of a number of symbol tables, potentially one for each and every loaded RISC OS relocatable module, including the ROM modules, and one for the application code itself. From this set of symbol tables it may then be used to look up the address of a failing instruction/memory access and map that address into a symbol triplet of 'module name', 'symbol name' and 'address offset'. Without additional debugging information built into the failing binary, this is about as far as the mapping can really proceed, but it is usually enough to locate the line/datum in question very easily.
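The address-to-triplet mapping described above can be pictured with a short sketch. This is an illustration in Python, not Symtab's own code or API: the `SymbolTable` type and `lookup` function are hypothetical names, and a sorted list searched with `bisect` stands in for whatever search structure the library really uses. The sample addresses are taken from the *SMemoryI listing of the 'test' table further down this page.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional, Tuple


class SymbolTable(NamedTuple):
    module: str        # table name, e.g. 'test' or 'Sherlock'
    start: int         # lowest address covered by the table
    end: int           # one past the highest address covered
    addrs: List[int]   # symbol addresses, sorted ascending
    names: List[str]   # symbol names, parallel to addrs


def lookup(tables: List[SymbolTable], address: int) -> Optional[Tuple[str, str, int]]:
    """Return (module, symbol, offset) for address, or None if nothing covers it."""
    for table in tables:
        if table.start <= address < table.end:
            # Index of the last symbol whose address is <= the target address.
            i = bisect_right(table.addrs, address) - 1
            if i >= 0:
                return (table.module, table.names[i], address - table.addrs[i])
    return None


# Two symbols from the ReadOnly block of the 'test' table (0x8080 to 0xCDF4).
sample = SymbolTable("test", 0x8080, 0xCDF4,
                     [0x8134, 0x81A0], ["file_close", "file_gets"])

module, symbol, offset = lookup([sample], 0x817C)
print(f"{module}##{symbol}+0x{offset:X}")  # test##file_close+0x48
```

An address below the first symbol of a table, or outside every table, yields no triplet at all, which corresponds to the blank annotations seen in the dumps on this page.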
Symtab is deliberately designed to be self-contained using only a bare minimum of ISO C run-time library functions, with callbacks into the client code for any required complex functionality, so that it may easily be incorporated into very low-level code and used in circumstances when the C runtime may not be (fully) usable.
Sherlock is a RISC OS relocatable module that employs the Symtab library to provide a number of *commands which are rough parallels of those provided by the standard Debugger module, with the addition of symbolic information:
Load a symbol file called 'tsym' into memory, naming the table 'test' (in this case an application).
*SymLoad test##tsym
Loaded symbol tables for test from 'tsym'
List information on the symbol table(s) that have been loaded. Information on specific tables, rather than all loaded tables, may be requested by specifying the table name(s) on the command-line. A '-V' parameter lists verbose information including all of the symbols within the table.
*SymTables
Symbol table 'test' at c90d9dec 3 blocks
Block 0 (ReadOnly): 0x8080 to 0xCDF4 476 symbols
Block 1 (ReadWrite): 0xCDF4 to 0xCE70 6 symbols
Block 2 (ZeroInit): 0xCE70 to 0xDF74 77 symbols
To produce a symbolic disassembly of an address range, the *SMemoryI command parallels the standard *MemoryI command of the Debugger module, supporting the same syntax, but the first address may also be symbolic, as shown in the second example below.
*smemoryi 817c + 40
0000817C : E3550000 : file_close+0x48 : CMP R5,#0
00008180 : 11A00005 : file_close+0x4C : MOVNE R0,R5
00008184 : E91BA870 : file_close+0x50 : LDMDB R11,{R4-R6,R11,R13,PC}
00008188 : 0000CC00 : file_close+0x54 : ANDEQ R12,R0,R0,LSL #24
0000818C : 00000066 : file_close+0x58 : ANDEQ R0,R0,R6,RRX
00008190 : 656C6966 : file_close+0x5C : STRVSB R6,[R12,#-2406]!
00008194 : 7465675F : file_close+0x60 : STRVCBT R6,[R5],#-1887
00008198 : 00000073 : file_close+0x64 : ANDEQ R0,R0,R3,ROR R0
0000819C : FF00000C : file_close+0x68 : Undefined instruction
000081A0 : E1A0C00D : file_gets : MOV R12,R13
000081A4 : E92D000F : file_gets+0x4 : STMDB R13!,{R0-R3}
000081A8 : E92DDBF0 : file_gets+0x8 : STMDB R13!,{R4-R9,R11,R12,R14,PC}
000081AC : E24CB014 : file_gets+0xC : SUB R11,R12,#
000081B0 : E15D000A : file_gets+0x10 : CMP R13,R10
000081B4 : 4B000FB5 : file_gets+0x14 : BLMI __rt_stkovf_split_small
000081B8 : E1B04001 : file_gets+0x18 : MOVS R4,R1
*smemoryi area_names + 40
0000CE20 : 00008454 : area_names : ANDEQ R8,R0,R4,ASR R4
0000CE24 : 00008460 : area_names+0x4 : ANDEQ R8,R0,R0,ROR #8
0000CE28 : 0000846C : area_names+0x8 : ANDEQ R8,R0,R12,ROR #8
0000CE2C : 6E6E553C : area_names+0xC : MCRVS CP5,3,R5,C14,C12,1
0000CE30 : 64656D61 : area_names+0x10 : STRVSBT R6,[R5],#-3425
0000CE34 : 3028203E : area_names+0x14 : EORCC R2,R8,R14,LSR R0
0000CE38 : 58585878 : area_names+0x18 : LDMPLDA R8,{R3-R6,R11,R12,R14}^
0000CE3C : 58585858 : area_names+0x1C : LDMPLDA R8,{R3,R4,R6,R11,R12,R14}^
0000CE40 : 00002958 : area_names+0x20 : ANDEQ R2,R0,R8,ASR R9 ; *** Not R8-R14
0000CE44 : 00000000 : area_names+0x24 : ANDEQ R0,R0,R0
0000CE48 : 00000000 : area_names+0x28 : ANDEQ R0,R0,R0
0000CE4C : 00000000 : test_handle : ANDEQ R0,R0,R0
0000CE50 : 0000B2A0 : fn : ANDEQ R11,R0,R0,LSR #5
0000CE54 : 0000B390 : fn+0x4 : Undefined instruction
0000CE58 : 00008090 : fn+0x8 : Undefined instruction
0000CE5C : 00008134 : fn+0xC : ANDEQ R8,R0,R4,LSR R1
To load the symbols for a relocatable module, it is useful to specify the name of the module as a prefix to the filename. For example, the following command instructs the Sherlock module to load its own symbol table from the file 'sym'. When the table is loaded, Sherlock will check for the presence of a loaded Relocatable Module with the given name and, if it finds one, map the offsets specified in the symbol file into absolute addresses. It will also do this if the module is later loaded/reloaded, so the order in which the table and the module itself are loaded does not matter.
*SymLoad Sherlock##sym
Setting base of 539800212 'ReadOnly' to 202CB294
Setting base of 539431508 'ReadWrite' to 20271254
Setting base of 539431712 'ZeroInit' to 20271320
Loaded symbol tables for Sherlock from 'sym'
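The order-independent rebasing described above can be sketched as follows. This is a minimal Python illustration rather than Sherlock's actual implementation: the `ModuleTable` class and its methods are hypothetical, and the two symbol offsets are inferred from addresses printed elsewhere on this page (the 'ReadOnly' base of 202CB294 and file_close at 202CB7D8), not read from a real symbol file.

```python
class ModuleTable:
    """Symbol offsets for one module, resolved once the module base is known."""

    def __init__(self, name, offsets):
        self.name = name
        self.offsets = dict(offsets)  # symbol name -> offset from the module base
        self.base = None              # unknown until the module is found in memory

    def set_base(self, base):
        # Called when the module is found at load time, or later (re)loaded.
        self.base = base

    def address_of(self, symbol):
        # No absolute address exists until a base has been supplied.
        if self.base is None:
            return None
        return self.base + self.offsets[symbol]


table = ModuleTable("Sherlock", {"__module_header": 0x0, "file_close": 0x544})
print(table.address_of("file_close"))         # None: module not yet seen
table.set_base(0x202CB294)                    # 'ReadOnly' base from the log above
print(f"{table.address_of('file_close'):08X}")  # 202CB7D8
```

Keeping offsets and base separate is what makes the load order irrelevant: a reload simply supplies a new base and every symbol moves with it.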
The *SMemoryS command provides a crude backtrace/dump of the given stack/address range. If a CPU mode is specified on the command-line, rather than an address range as for the other *commands, the current stack pointer for that mode is read and used as the start address. Here is part of the output produced when *SMemoryS is called for the SVC stack, and we can see that the Sherlock module is itself threaded and its addresses appear on the Supervisor stack because it is processing the *command. Clearly this is of limited utility at present, and requires a SWI/lower-level interface to achieve its true potential.
*SMemoryS SVC
FA207F40 : 202745BD -> : .E'
FA207F44 : FA208000 -> : .. .
FA207F48 : FA207F40 -> : @. .
FA207F4C : 202745BD -> : .E'
FA207F50 : 00000001 -> : ....
FA207F54 : 00000003 -> : ....
FA207F58 : FB407C0C -> : .|@.
FA207F5C : 23F60D4C -> : L..#
FA207F60 : 23DBFF9C -> : ...#
FA207F64 : 00000003 -> : ....
FA207F68 : FFFFFFFF -> : ....
FA207F6C : 00000000 -> : ....
FA207F70 : 00000000 -> : ....
FA207F74 : FA207F80 -> : .. .
FA207F78 : 202CB478 -> Sherlock##__module_header+0x1E4 : x.,
FA207F7C : 202CE120 -> Sherlock##module_command+0xC : .,
FA207F80 : 202745BD -> : .E'
FA207F84 : 00000053 -> : S...
FA207F88 : FB407BF4 -> : .{@.
FA207F8C : FC02389C -> : .8..
FA207F90 : FFFFFFFF -> : ....
FA207F94 : 202CB3F0 -> Sherlock##__module_header+0x15C : ..,
FA207F98 : 202745B4 -> : .E'
FA207F9C : 00000110 -> : ....
...
*smemoryi file_close
202CB7D8 : E1A0C00D : file_close : MOV R12,R13
202CB7DC : E92DD873 : file_close+0x4 : STMDB R13!,{R0,R1,R4-R6,R11,R12,R14,PC}
202CB7E0 : E24CB004 : file_close+0x8 : SUB R11,R12,#4
202CB7E4 : E15D000A : file_close+0xC : CMP R13,R10
202CB7E8 : 4B00190F : file_close+0x10 : BLMI __rt_stkovf_split_small
202CB7EC : E1B06001 : file_close+0x14 : MOVS R6,R1
202CB7F0 : E1A04000 : file_close+0x18 : MOV R4,R0
202CB7F4 : 059F1030 : file_close+0x1C : LDREQ R1,file_close+0x54
202CB7F8 : 024F2F11 : file_close+0x20 : ADREQ R2,file_open+0x88
202CB7FC : 028F0F0B : file_close+0x24 : ADREQ R0,file_close+0x58
202CB800 : 03A0303D : file_close+0x28 : MOVEQ R3,#
202CB804 : 0B001AB3 : file_close+0x2C : BLEQ __assert2
202CB808 : E5960000 : file_close+0x30 : LDR R0,[R6,#0]
202CB80C : EB0016E4 : file_close+0x34 : BL xosfind_closew
202CB810 : E1A05000 : file_close+0x38 : MOV R5,R0
202CB814 : E1A01006 : file_close+0x3C : MOV R1,R6
202CB818 : E1A00004 : file_close+0x40 : MOV R0,R4
202CB81C : EB00014A : file_close+0x44 : BL mem_free
202CB820 : E3550000 : file_close+0x48 : CMP R5,#0
202CB824 : 11A00005 : file_close+0x4C : MOVNE R0,R5
202CB828 : E91BA870 : file_close+0x50 : LDMDB R11,{R4-R6,R11,R13,PC}
202CB82C : 202D279C : file_close+0x54 : MLACS R13,R12,R7,R2
202CB830 : 00000066 : file_close+0x58 : ANDEQ R0,R0,R6,RRX
202CB834 : 656C6966 : file_close+0x5C : STRVSB R6,[R12,#-2406]!
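The crude stack scan performed by *SMemoryS amounts to treating every word on the stack as a possible address and annotating those that resolve to a symbol. The Python sketch below illustrates that idea only; `annotate_stack` and `resolve` are hypothetical names, the symbol addresses are inferred from the SVC stack listing above, and `MODULE_END` is an invented upper bound used purely for the demonstration.

```python
SYMBOLS = [(0x202CB294, "__module_header"), (0x202CE114, "module_command")]
MODULE_END = 0x202D4000  # hypothetical end of the module, for illustration only


def resolve(word):
    """Return 'Sherlock##symbol+0xNN' if word lies within the module, else None."""
    if not (SYMBOLS[0][0] <= word < MODULE_END):
        return None
    # Pick the highest symbol address that does not exceed the word.
    addr, name = max((a, n) for a, n in SYMBOLS if a <= word)
    offset = word - addr
    return f"Sherlock##{name}" + (f"+0x{offset:X}" if offset else "")


def annotate_stack(start, words, resolve):
    """Format one line per stack word, adding a symbol wherever one resolves."""
    lines = []
    for i, word in enumerate(words):
        label = resolve(word) or ""
        lines.append(f"{start + 4 * i:08X} : {word:08X} -> {label}".rstrip())
    return lines


for line in annotate_stack(0xFA207F74, [0xFA207F80, 0x202CB478, 0x202CE120], resolve):
    print(line)
# FA207F74 : FA207F80 ->
# FA207F78 : 202CB478 -> Sherlock##__module_header+0x1E4
# FA207F7C : 202CE120 -> Sherlock##module_command+0xC
```

A word that happens to resolve to a symbol is not necessarily a return address, which is why the result is only a rough backtrace.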
Sherlock installs handlers on all of the processor exception vectors so that it can catch and log invalid memory accesses, attempts to execute Undefined Instructions etc. When an exception occurs, Sherlock writes the contents of the registers, stack and code disassembly/source into its internal log buffer, which may subsequently be streamed out to disk for a more permanent record of the failure.
A snippet from the log output produced by Sherlock in response to a Data Abort exception occurring within its own code is shown below. Looking at the log we can immediately see from the FAR (Fault Address Register), register context and disassembly that the problem is an invocation of strlen(NULL). Looking at the stack dump that follows the disassembly, we see the return addresses indicating a call to printf() which is invoking strlen, and that the printf() call occurs within the dump_bintree() function used by dump_table()/list_symbol_tables(). This is indeed what happened; the crash occurred when dumping -V(erbose) information on a loaded symbol table.
Data Abort at &20225848
Register dump (stored at &C94DEFB4) is:
R0 = E59FF464 ->
R1 = FA207EFC ->
R2 = E59FF464 ->
R3 = FA207EFC ->
R4 = 00000000 ->
R5 = 00000000 ->
R6 = FA207EEC ->
R7 = 00000008 ->
R8 = 2022FE45 -> Sherlock##dump_bintree+0x99
R9 = FFFFFFFF ->
R10 = FFFFFFFF ->
R11 = 202307B8 -> Sherlock##out_chars
R12 = 00000008 ->
R13 = FA207EBC ->
R14 = 2022BC30 -> Sherlock##rts_intern_printf+0x73
R15 = 20225848 -> Sherlock##strlen+0x8
Mode SVC32 flags set: NzCvqjggggeAift PSR = A0000113
FAR = E59FF464 FSR = 00000005
...
2022582C : .`.. : 85C36001 : Sherlock##memcpy+0x85C : STRHIB R6,[R3,#1]
20225830 : .`.. : E49D6004 : Sherlock##memcpy+0x860 : LDR R6,[R13],#4
20225834 : .P.. : E49D5004 : Sherlock##memcpy+0x864 : LDR R5,[R13],#4
20225838 : .@.. : E49D4004 : Sherlock##memcpy+0x868 : LDR R4,[R13],#4
2022583C : .... : E49DF004 : Sherlock##memcpy+0x86C : LDR PC,[R13],#4
--> 20225840 : .... : E590C000 : Sherlock##strlen : LDR R12,[R0,#0]
20225844 : .0.. : E3A03001 : Sherlock##strlen+0x4 : MOV R3,#1
20225848 : .... : E1A01000 : Sherlock##strlen+0x8 : MOV R1,R0
2022584C : .4.. : E1833403 : Sherlock##strlen+0xC : ORR R3,R3,R3,LSL #8
20225850 : . .. : E2102003 : Sherlock##strlen+0x10 : ANDS R2,R0,#3
20225854 : .8.. : E1833803 : Sherlock##strlen+0x14 : ORR R3,R3,R3,LSL #16
2022585C : . \. : E05C2003 : Sherlock##strlen+0x1C : SUBS R2,R12,R3
...
...
FA207E94 : .F.. : FC1B46EC -> Debugger##Code+0x514
FA207E98 : .\# : 20235CB0 -> Sherlock##mem_blocks+0x300C
FA207E9C : .... : FC020610 ->
FA207EA0 : ...` : 60000193 ->
FA207EA4 : .... : 00060380 ->
FA207EA8 : .N" : 20224E98 -> Sherlock##xdebugger_disassemble_
FA207EAC : |.# : 20231D7C -> Sherlock##barmenu_defn+0xC
FA207EB0 : ... : 00000020 ->
FA207EB4 : .M. : C94DEF20 ->
FA207EB8 : 0." : 2022BC30 -> Sherlock##rts_intern_printf+0x73
--> FA207EBC : d... : E59FF464 ->
FA207EC0 : .... : 00000000 ->
FA207EC4 : .... : 00000001 ->
FA207EC8 : .... : 00000000 ->
FA207ECC : .... : 00000000 ->
FA207ED0 : .Z" : 20225A0C -> Sherlock##printf
FA207ED4 : .... : 00000003 ->
FA207ED8 : ..R. : C9520FA8 ->
FA207EDC : ..R. : C9520FDC ->
FA207EE0 : T... : 00000054 ->
FA207EE4 : .... : 00000002 ->
FA207EE8 : ..# : 20230894 -> Sherlock##_printf+0x24
FA207EEC : .. . : FA207F00 ->
FA207EF0 : .." : 2022FDD8 -> Sherlock##dump_bintree+0x2C
FA207EF4 : <." : 2022FE3C -> Sherlock##dump_bintree+0x90
FA207EF8 : .... : E28FD0D8 ->
FA207EFC : d... : E59FF464 ->
FA207F00 : .." : 2022FD96 -> Sherlock##out_spaces+0x46
FA207F04 : .... : 00000000 ->
FA207F08 : t2. : 20003274 -> FPEmulator##Work
FA207F0C : .Z" : 20225A0C -> Sherlock##printf
FA207F10 : .... : 00000000 ->
FA207F14 : .." : 2022FFF0 -> Sherlock##dump_table+0x188
FA207F18 : ..R. : C9520FA8 ->
FA207F1C : .Z" : 20225A0C -> Sherlock##printf
FA207F20 : .... : 00000001 ->
FA207F24 : .m# : 20236DAB -> Sherlock##emuda_handler+0x27
FA207F28 : .... : 00000000 ->
FA207F2C : .m# : 20236DA1 -> Sherlock##emuda_handler+0x1D
FA207F30 : .... : 00000001 ->
FA207F34 : .... : 00000001 ->
FA207F38 : .... : 00001000 ->
FA207F3C : .... : 00000004 ->
FA207F40 : .." : 202294A8 -> Sherlock##direct_out
FA207F44 : D." : 20229D44 -> Sherlock##list_symbol_tables+0x8
FA207F48 : .... : 00000001 ->
FA207F4C : .... : 00000002 ->
...
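The 'flags set' string in the register dump can be read against the standard ARM program status register layout, where an upper-case letter means the bit is set. The Python sketch below decodes a PSR value that way. It is an illustration based on the architectural bit positions, not Sherlock's own formatting code, and the names `FLAG_BITS`, `MODES` and `decode_psr` are made up for the example.

```python
# (letter, bit position) in the order the letters appear in the log.
FLAG_BITS = [("N", 31), ("Z", 30), ("C", 29), ("V", 28), ("Q", 27), ("J", 24),
             ("G", 19), ("G", 18), ("G", 17), ("G", 16),
             ("E", 9), ("A", 8), ("I", 7), ("F", 6), ("T", 5)]

# Processor mode field (bits 4..0) for the common 32-bit modes.
MODES = {0x10: "USR32", 0x11: "FIQ32", 0x12: "IRQ32", 0x13: "SVC32",
         0x17: "ABT32", 0x1B: "UND32", 0x1F: "SYS32"}


def decode_psr(psr):
    """Return (mode name, flags string) for a 32-bit PSR value."""
    flags = "".join(ch if psr & (1 << bit) else ch.lower()
                    for ch, bit in FLAG_BITS)
    return MODES.get(psr & 0x1F, "unknown"), flags


mode, flags = decode_psr(0xA0000113)
print(f"Mode {mode} flags set: {flags}")  # Mode SVC32 flags set: NzCvqjggggeAift
```

For the PSR value A0000113 in the log this gives N and C set, the A (imprecise abort disable) bit set, and Supervisor mode, which agrees with the line Sherlock printed.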
Although the original goal of the Sherlock module was to assist with diagnosis and debugging of released code that contains no debugging information, it also supports application/module images that have been built with ASD-format Debug information included. This permits viewing code at the level of C/assembler source, greatly easing the task of locating a failure within the source code.
Sherlock will automatically spot when an application/module containing ASD information is loaded and will retain a copy of that information in case of a subsequent failure. In subsequent exception logs, or in the output of *commands such as *SMemoryI, program source can be displayed alongside addresses for which ASD information is available, as illustrated below:
204e8aa8 : Sherlock##module_init+0x8 : SUB R13,R13,#4 :
204e8aac : Sherlock##module_init+0xC : MOV R1,R13 : err = mem_init(max_size, &mem_base);
204e8ab0 : Sherlock##module_init+0x10 : MOV R0,#&02000000 :
204e8ab4 : Sherlock##module_init+0x14 : BL mem_init :
204e8ab8 : Sherlock##module_init+0x18 : MOVS R6,R0 : if (!err)
204e8abc : Sherlock##module_init+0x1C : BNE module_init+0x98 : {
204e8ac0 : Sherlock##module_init+0x20 : LDR R0,[R13,#0] :
204e8ac4 : Sherlock##module_init+0x24 : MOV R1,#&00040000 : err = log_init(mem_base + MEM_LOG_OFFSET, LOG_MAX...
204e8ac8 : Sherlock##module_init+0x28 : ADD R0,R0,#&6000 :
204e8acc : Sherlock##module_init+0x2C : BL log_init :
204e8ad0 : Sherlock##module_init+0x30 : MOVS R6,R0 : if (!err)
204e8ad4 : Sherlock##module_init+0x34 : BNE module_init+0x94 : {
204e8ad8 : Sherlock##module_init+0x38 : BL symtab_init : err = symtab_init();
204e8adc : Sherlock##module_init+0x3C : MOVS R6,R0 : if (!err)
204e8ae0 : Sherlock##module_init+0x40 : BNE module_init+0x90 : {
204e8ae4 : Sherlock##module_init+0x44 : MOV R0,R4 : err = process_symbols_cmd(cmd_tail);
204e8ae8 : Sherlock##module_init+0x48 : BL process_symbols_cmd :
204e8aec : Sherlock##module_init+0x4C : MOVS R6,R0 : if (err)
204e8af0 : Sherlock##module_init+0x50 : BEQ module_init+0x5C :
204e8af4 : Sherlock##module_init+0x54 : BL symtab_fin : (void)symtab_fin();
The Sherlock module has the ability to save the memory/state of the system to a 'dump' file which may then be studied later by redirecting all of Sherlock's *commands to operate upon the file rather than the live system upon which Sherlock is running. This facility makes it possible to capture a (possibly intermittent) failure occurring on the user's machine and then study it on the remote machine of the software developer.
*SMemSave <filename> captures the current state of processor registers, memory contents, loaded symbol tables and source code etc. The result is a single 'memory dump' file which may then be loaded into Sherlock on any machine using the *SMemLoad <filename> command.
To unload a memory dump from Sherlock and return the *commands to operation upon the live system simply issue the *SMemLoad command without a filename parameter.
An in-progress development build - binary only for now, whilst I continue working on the source and tidying a few loose ends - may be downloaded here
For module code that fails but leaves the system sufficiently usable that *commands may still be entered, it should be simple to use the Sherlock module even in its current nascent state of development, since the code will necessarily already be in memory.
Investigating a fault induced within application code currently requires manual loading of the symbol table and application binary into memory, e.g. from a TaskWindow, issue *SymLoad <symbol file>, followed by *Load <executable image>, bearing in mind that the binary will not be executed, and must thus be a raw (not compressed/encrypted) copy of the in-memory executable at the point of failure. If you have a utility that will produce a copy of the application memory at the point of failure, or can grab the memory contents using your favourite source editor, then you may choose to load that instead using a similar *Load command.
A possible future extension to Sherlock may be to introduce a SWI/direct interface to the routines which perform these operations. It could also be beneficial to introduce calls from the ZeroPain module into Sherlock or the underlying Symtab library, so that non-faulted accesses to zero page may be logged in a symbolic form whilst the application continues running.
Please get in touch if you have any suggestions for further development of the Sherlock module or the underlying Symtab library, to make it more useful as a diagnostic/development tool. In due course it is my intention to release all of the code as open source for the benefit of all developers, and so that the library may readily be incorporated into other tools.