Before I bore you with more text, you can get my Fault Decoder from GitHub here. I put in a lot of comments, so hopefully you can understand how it works. Please feel free to use it. I welcome suggestions for improvement.
If you have spent much time at all debugging using a Cortex-M based MCU, then you have probably encountered the Hard Fault. I'm not going to explain the Cortex-M fault system here, but it is a very useful mechanism that traps the system state when something goes wrong and provides clues needed to figure out the cause of the fault. When I worked at TI, I wrote an Application Note explaining basic debug of CM faults. Even though it is oriented to TI MCUs, it applies to any Cortex-M MCU.
App note from TI site: http://www.ti.com/lit/an/spma043/spma043.pdf
Oftentimes a debugger is available and you can use it to diagnose the fault. But what if no debugger is available? Recently, while working on a hobby project, I was using a "Mini-M4" board from MikroElektronika. These boards have a USB boot loader, and it is super easy to load your compiled and linked code from your development machine onto the target board. But unless you want to go to some trouble to connect up an FTDI adapter (using OpenOCD) or a J-Link or something like that, you have no debugger available. For simple programs this is usually okay - if you have a serial port then you just do printf debug.
In this particular case the serial output of my program would stop at a certain point. After looking at the code and considering some changes I made, I suspected I was faulting but I wasn't exactly sure why. I knew if I had the fault state information I would be able to figure out the problem in short order. I needed a way to send the fault info to my serial console - I needed a Fault Decoder.
I have probably written several fault decoders while working at different MCU companies. I thought there would be something I could use. I googled around but didn't see anything that was really portable and not tied to a specific MCU vendor. So I decided to take a few minutes and write another one.
Features
Here are the features that were important to me, and how I satisfied them:
- Vendor independent - should not use any particular vendor header files, APIs or register definitions. My implementation only uses standard headers and no vendor headers. It does not use any vendor provided functions.
- As self-contained as possible - you should just be able to add this to your project and go, minimal integration headaches. Instead of trying to find and include a header with the register definitions, I just made macro definitions at the top of the file. As much as possible I used ARM-defined names for all the registers.
- Portable output - decoded fault information can be routed to any kind of output device. The fault decoder relies on the existence of an application provided printf-style function named DbgPrintf(). You implement this and it can be anything you like. Most commonly you will probably redirect or redefine this to your own console serial output function. But it could go to a telnet console or some wireless connection or maybe even log to a file on a SD-card. This is the only external dependency besides the C standard header files.
- Toolchain independent - should be able to use any of several popular compilers. I didn't really accomplish this, mainly because I did not take the time to test other compilers. But there is only one small portion that is compiler dependent and that is a small function that has inline assembly. You should be able to adapt this to any Cortex-M compiler. If you send me a note on github or a pull request I'll add other compiler support.
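To make the DbgPrintf() dependency concrete, here is a minimal sketch of one way an application might provide it. This is not the code from the repository: the exact signature (an int return and a const char * format) is an assumption, and console_putc() is a hypothetical byte sink that here just appends to a RAM buffer so the sketch can run on a host. On a real board you would replace it with your UART transmit routine.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical output sink: a RAM log standing in for a UART. */
static char console_log[256];
static size_t console_len;

static void console_putc(char c)
{
    if (console_len < sizeof console_log - 1) {
        console_log[console_len++] = c;
        console_log[console_len] = '\0';
    }
}

/* Assumed signature: printf-style, returns the number of bytes sent. */
int DbgPrintf(const char *fmt, ...)
{
    char line[128];
    va_list ap;

    va_start(ap, fmt);
    int n = vsnprintf(line, sizeof line, fmt, ap);
    va_end(ap);

    if (n < 0) {
        return n;                       /* formatting error */
    }
    if ((size_t)n >= sizeof line) {
        n = (int)(sizeof line - 1);     /* output was truncated */
    }
    for (int i = 0; i < n; i++) {
        console_putc(line[i]);
    }
    return n;
}
```

Note that this version formats into a 128-byte buffer on the stack, so it is only usable when the stack is still healthy at the time of the fault.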
Caveats
If a fault occurs, your system may not even be stable enough for the fault decoder to run. So, it will not decode all kinds of faults. In particular if you have a stack-related fault, it will probably not work because it relies on being able to make a function call that will use the stack. And the decoder function itself will push working registers on the stack.
Fault Decoder in Action
When I added this to my program, this was the output I saw:
*** Fault occurred ***
Stack Frame
----------
R0 R1 R2 R3 R12 LR PC xPSR
40038000 0000000c 00000000 00000000 4000c000 00000623 000017c0 01000200
MMFSR:
MMFAR: 40038014
BFSR: BFARVALID PRECISERR
BFAR: 40038014
UFSR :
The first part of the decoded output is the exception stack frame. This alone can provide useful information. For example, PC = 0x17C0, so that is most likely the instruction that caused the fault. And the LR probably (but not always) indicates the most recent function call.
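For readers who have not looked at the stack frame before, here is a rough sketch of how the eight stacked words map to the columns in the output. The struct and function names are my own for illustration and are not taken from the decoder. It assumes the basic eight-word frame (no floating point state stacked), and on the target the word pointer would come from the active stack pointer at exception entry.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* The eight words the core pushes on exception entry, lowest address first. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, xpsr;
} ExceptionFrame;

/* Copy the frame out of the stacked words (sp points at the stacked R0). */
ExceptionFrame frame_from_stack(const uint32_t *sp)
{
    ExceptionFrame f = { sp[0], sp[1], sp[2], sp[3], sp[4], sp[5], sp[6], sp[7] };
    return f;
}

/* Format the frame as one line of eight hex words, like the output above. */
int frame_to_string(const ExceptionFrame *f, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "%08" PRIx32 " %08" PRIx32 " %08" PRIx32 " %08" PRIx32
                    " %08" PRIx32 " %08" PRIx32 " %08" PRIx32 " %08" PRIx32,
                    f->r0, f->r1, f->r2, f->r3, f->r12, f->lr, f->pc, f->xpsr);
}

/* A stacked LR holding a return address has bit 0 set to mark Thumb state;
 * clear it to get the address to look up in the disassembly. */
uint32_t code_address(uint32_t value)
{
    return value & ~(uint32_t)1;
}
```

With the values from my fault, the stacked LR of 0x623 corresponds to a return address of 0x622 in the disassembly.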
With this information I ran arm-none-eabi-objdump -d myprogram.axf on the linked output of my program. Then I scanned through the program disassembly until I got to the PC address and saw this:
000017ba <ADCSequenceConfigure>:
17ba: b530 push {r4, r5, lr}
17bc: 0089 lsls r1, r1, #2
17be: 250f movs r5, #15
17c0: 6944 ldr r4, [r0, #20]
17c2: 408d lsls r5, r1
That instruction at 0x17C0 is an LDR instruction, and it probably means that something about the load address is bad. The instruction is using R0 as the base. From the fault decoder I see that R0=40038000. Now I look in the TI data sheet and see that address is part of the A/D converter peripheral. Notice in the disassembly that I was in an ADC function. I know from experience that with a TI Tiva or Stellaris MCU this usually means that the peripheral has not been enabled.
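As a worked check of that reasoning, the sketch below decodes the 16-bit opcode 0x6944 from the disassembly and computes the address the load touches. It assumes the 16-bit Thumb "LDR (immediate)" layout as I understand it from the ARM architecture manual (top five bits 01101, then a 5-bit word offset, a 3-bit base register and a 3-bit destination register). The type and function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Fields of a 16-bit Thumb LDR Rt, [Rn, #imm] instruction. */
typedef struct {
    uint8_t  rt;      /* destination register number (0..7) */
    uint8_t  rn;      /* base register number (0..7) */
    uint32_t offset;  /* byte offset, always a multiple of 4 */
} ThumbLdrImm;

/* Returns true and fills 'out' if 'insn' is a 16-bit LDR (immediate). */
bool decode_thumb_ldr_imm(uint16_t insn, ThumbLdrImm *out)
{
    if ((insn >> 11) != 0x0Du) {        /* opcode bits 15:11 must be 01101 */
        return false;
    }
    out->offset = (uint32_t)((insn >> 6) & 0x1Fu) * 4u;  /* imm5 is in words */
    out->rn     = (uint8_t)((insn >> 3) & 0x7u);
    out->rt     = (uint8_t)(insn & 0x7u);
    return true;
}

/* Address accessed by the load, given the value held in the base register. */
uint32_t ldr_effective_address(const ThumbLdrImm *ldr, uint32_t base_value)
{
    return base_value + ldr->offset;
}
```

For 0x6944 this gives base register R0, destination R4 and offset 20. With R0 = 0x40038000 from the stack frame, the load address is 0x40038014.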
I now have enough information to solve my problem, but I'll explain the rest of the fault decoder output because it shows me the same information in a different way.
MMFSR:
MMFAR: 40038014
This shows the Memory Management Fault Status Register (MMFSR). Any flags set would be listed here. The MMFAR is the Memory Management Fault Address Register. If there were a memory management fault, then this register could contain the faulting address. In this case no MMFSR flags are set (in particular, the MMARVALID bit is not set), so the address register is not meaningful.
BFSR: BFARVALID PRECISERR
BFAR: 40038014
This is the Bus Fault Status Register and the Bus Fault Address Register. Two flags are set: the Bus Fault Address Valid bit and the Precise Error bit. The valid bit means the address register holds a valid address, and the precise bit means that the address is exactly the address that caused the fault and the stacked PC points to the faulting instruction. There can also be an imprecise error, in which case no fault address is recorded and the stacked PC is only somewhere after the faulting instruction (but it still provides a clue).
In my case, this indicates that I had a bus fault at address 0x40038014. This is one of the ADC registers, which just confirms what I concluded from looking at the exception stack frame (described above).
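The flag names in that output come from individual bits of the status register. As a rough sketch of the idea (not the decoder's actual code), the three status fields can be pulled out of the 32-bit Configurable Fault Status Register and the BFSR bits mapped to names with a small table. The macro, table and function names are my own. The CFSR value 0x00008200 used in the note below is reconstructed from the two flags shown in the output, not read from real hardware.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* CFSR packs three status fields: MMFSR [7:0], BFSR [15:8], UFSR [31:16]. */
#define CFSR_MMFSR(cfsr) ((uint8_t)((cfsr) & 0xFFu))
#define CFSR_BFSR(cfsr)  ((uint8_t)(((cfsr) >> 8) & 0xFFu))
#define CFSR_UFSR(cfsr)  ((uint16_t)(((cfsr) >> 16) & 0xFFFFu))

typedef struct {
    uint8_t     mask;
    const char *name;
} FlagName;

/* BFSR bits, listed high to low to match the order of the output above. */
static const FlagName bfsr_flags[] = {
    { 0x80u, "BFARVALID"   },
    { 0x20u, "LSPERR"      },
    { 0x10u, "STKERR"      },
    { 0x08u, "UNSTKERR"    },
    { 0x04u, "IMPRECISERR" },
    { 0x02u, "PRECISERR"   },
    { 0x01u, "IBUSERR"     },
};

/* Write the names of all set flags into buf, separated by spaces.
 * Returns the number of characters written (excluding the terminator). */
size_t bfsr_to_string(uint8_t bfsr, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0) {
        return 0;
    }
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof bfsr_flags / sizeof bfsr_flags[0]; i++) {
        if ((bfsr & bfsr_flags[i].mask) == 0) {
            continue;
        }
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? " " : "", bfsr_flags[i].name);
        if (n < 0 || (size_t)n >= len - used) {
            break;                      /* error or out of room: stop here */
        }
        used += (size_t)n;
    }
    return used;
}
```

A CFSR of 0x00008200 splits into MMFSR = 0x00, BFSR = 0x82 and UFSR = 0x0000, and the BFSR decodes to the same two flags that appeared on my console. The other two fields can be handled with tables of the same shape.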
UFSR :
This last line is the Usage Fault Status Register. It also has a set of flags that indicate various kinds of usage faults, none of which occurred in this case.
Finally, here is the code that caused the problem:
// Set up ADC ...
SysCtlPeripheralEnable(SYSCTL_PERIPH_ADC0);
ADCSequenceConfigure(ADC0_BASE, 3, ADC_TRIGGER_PROCESSOR, 0);
ADCSequenceStepConfigure(ADC0_BASE, 3, 0, ADC_CTL_TS | ADC_CTL_END);
ADCSequenceEnable(ADC0_BASE, 3);
The call to ADCSequenceConfigure() is causing the fault. I saw this in the disassembled code. For Tiva and Stellaris MCUs, you need to wait a few cycles after enabling a peripheral before you can use it. Because the "Configure" function executes several other instructions before actually accessing the peripheral, I thought it would not be a problem, but I was wrong and I still needed a little more time. The fix is either to insert a delay or some other code after the "Enable" function, or to move the enable to earlier in the initialization code so that the peripheral is fully enabled by the time I actually start to initialize it.
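One way to make the wait explicit, rather than relying on a guessed delay, is to poll a "ready" check with an upper bound on the number of polls. The sketch below is only an illustration: wait_peripheral_ready() and fake_ready() are hypothetical names, and fake_ready() exists only so the helper can be exercised on a host. On a Tiva part I believe TivaWare's SysCtlPeripheralReady() can serve as the ready check, but confirm that your library version provides it.

```c
#include <stdbool.h>
#include <stdint.h>

/* Any function that reports whether a given peripheral can be accessed. */
typedef bool (*ReadyFn)(uint32_t peripheral);

/* Poll is_ready() up to max_polls times.
 * Returns true as soon as the peripheral reports ready, false on timeout. */
bool wait_peripheral_ready(ReadyFn is_ready, uint32_t peripheral,
                           uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++) {
        if (is_ready(peripheral)) {
            return true;
        }
    }
    return false;
}

/* Host-side stand-in for a hardware ready check: reports "not ready"
 * for the first fake_polls_left calls, then "ready". */
static uint32_t fake_polls_left;

static bool fake_ready(uint32_t peripheral)
{
    (void)peripheral;
    if (fake_polls_left == 0) {
        return true;
    }
    fake_polls_left--;
    return false;
}

/* Intended use on the target (assumed, not compiled here):
 *
 *   SysCtlPeripheralEnable(SYSCTL_PERIPH_ADC0);
 *   if (!wait_peripheral_ready(SysCtlPeripheralReady, SYSCTL_PERIPH_ADC0, 1000)) {
 *       // report the timeout instead of touching the ADC registers
 *   }
 *   ADCSequenceConfigure(ADC0_BASE, 3, ADC_TRIGGER_PROCESSOR, 0);
 */
```

The poll limit of 1000 in the usage comment is an arbitrary placeholder. The point of the bound is that a peripheral that never comes up produces a reportable timeout instead of a hang or another bus fault.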
So that's it. There is another Cortex-M fault decoder function out in the world. Maybe it will help you. Thanks for reading.