Limitations

The debugging system, consisting of the hardware debugger, the GDB server, and GDB itself, has a number of inherent limitations. Some aspects of the hardware may not be debuggable at all or only with some extra effort. And sometimes the behavior of the MCU in a debugging environment is significantly different from the behavior shown in a non-debugging environment.

Bootloader

Bootloaders will usually be erased when running the debugger.

In a debugWIRE context, the entire chip needs to be erased if any lock bits are set. Further, the BOOTRST fuse is disabled so that execution always starts at location 0x0000. If one wants to have a bootloader present, for example because it provides services such as writing to flash memory, one needs to load it before starting a debugging session, without setting any lock bits. If one, in addition, wants to debug the bootloader, one can prevent PyAvrOCD from managing the BOOTRST fuse by using the command-line option --manage nobootrst.

When debugging with JTAG, the chip will be erased each time a new binary is loaded. Suppose you want to keep the bootloader in memory. In that case, you can request not to erase the chip before loading a binary, erasing each flash page only when some code needs to be loaded into this page: --erasebeforeload disable. However, this will severely slow down the process of loading a binary.
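The two options mentioned above can be combined into a command line as in the following minimal sketch. Only the option strings --manage nobootrst and --erasebeforeload disable come from the text. The executable name pyavrocd and the helper function are assumptions made for illustration, and a real invocation will probably need further options that are not covered in this section.

```python
import shlex


def pyavrocd_command(keep_bootrst=False, erase_before_load=True):
    """Build an argument list for a bootloader-friendly debugging session."""
    argv = ["pyavrocd"]  # assumption: name of the executable on the PATH
    if keep_bootrst:
        # debugWIRE: do not let PyAvrOCD manage the BOOTRST fuse
        argv += ["--manage", "nobootrst"]
    if not erase_before_load:
        # JTAG: erase flash page by page instead of erasing the whole chip
        argv += ["--erasebeforeload", "disable"]
    return argv


if __name__ == "__main__":
    print(shlex.join(pyavrocd_command(keep_bootrst=True)))
    print(shlex.join(pyavrocd_command(erase_before_load=False)))
```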

Low CPU clock frequency

Low CPU clock frequencies can make the debugging process sluggish or even impossible.

When using debugWIRE, the communication speed with the target is determined by the clock frequency of the target. It is usually the clock frequency divided by 8, but not higher than 250 kbps. If you use a clock frequency of 128 kHz, then the communication speed will be 16000 bps, which is quite slow. If the CKDIV8 fuse is programmed as well, the speed drops to only 2000 bps, at which point some of the hardware programmers may time out.
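The arithmetic behind these numbers can be reproduced with a few lines of Python. This is only a sketch of the rule of thumb stated above (clock divided by 8, capped at 250000 bps). The function name is hypothetical, and the actual rate negotiated by a given hardware debugger may differ.

```python
DEBUGWIRE_MAX_BPS = 250_000  # upper limit mentioned in the text


def debugwire_bps(oscillator_hz, ckdiv8=False):
    """Estimate the debugWIRE bit rate for a given oscillator frequency."""
    # A programmed CKDIV8 fuse divides the system clock by 8.
    cpu_hz = oscillator_hz // 8 if ckdiv8 else oscillator_hz
    # Rule of thumb from the text: CPU clock / 8, but at most 250000 bps.
    return min(cpu_hz // 8, DEBUGWIRE_MAX_BPS)


if __name__ == "__main__":
    for hz, div in [(16_000_000, False), (1_000_000, False),
                    (128_000, False), (128_000, True)]:
        print(f"{hz:>10} Hz, CKDIV8={div}: {debugwire_bps(hz, div)} bps")
```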

With JTAG, things are similar. While the JTAG programming clock frequency is independent of the clock frequency of the target, the JTAG debugging frequency should not be higher than one-quarter of the MCU clock frequency.
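As a rough illustration of the one-quarter rule, the following sketch computes the highest JTAG debugging clock that still satisfies it. The function names are made up for this example and are not part of any tool.

```python
def max_jtag_debug_clock_hz(cpu_hz):
    """Highest JTAG debugging clock allowed by the one-quarter rule."""
    return cpu_hz // 4


def jtag_debug_clock_ok(jtag_hz, cpu_hz):
    """True if the chosen JTAG debugging clock respects the rule."""
    return jtag_hz <= max_jtag_debug_clock_hz(cpu_hz)


if __name__ == "__main__":
    for cpu_hz in (16_000_000, 1_000_000, 128_000):
        print(f"CPU {cpu_hz} Hz -> JTAG debug clock <= "
              f"{max_jtag_debug_clock_hz(cpu_hz)} Hz")
```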

In general, one should not choose CPU clock frequencies below 1 MHz while debugging.

Low-power properties

Almost all power-saving features of the MCU are disabled while debugging.

The reason for this is that all clocks need to be running so that the communication between the hardware debugger and the OCD does not break down. So, the functional behavior of the MCU can be debugged, but low-power properties cannot be tested while debugging.

Compiler optimizations

When trying to debug a program compiled with the 'usual' compiler options, one often ends up in unexpected places or receives warnings.

Usually, the compiler optimizes for space, trying to fit as much program code as possible into the limited amount of flash memory. This, however, might imply that some code is reordered and inlined. This means that single-stepping can be confusing, that one cannot stop at some places, or that a finish command will lead to an error message.

When using the -Og compiler optimization option, the compiler aims at preserving the structure of the program, possibly at the expense of using more flash memory. In the Arduino IDE 2, this option is selected by enabling Optimize for Debugging in the Sketch menu.

Disappearing bugs

The -Og compiler optimization option supports debugging as long as one is hunting bugs in the program logic. However, some bugs may silently disappear, or the effects of a bug may change. These bugs, which are called Heisenbugs, often appear in connection with (illegal) access to data on the stack, stack overflows, with volatile data, or with race conditions. Thus, if a bug disappears when optimizing for debugging is enabled, one should watch out for such a Heisenbug and debug a binary that has been compiled without the optimization for debugging option.

Bugs appearing out of thin air

Another effect of using the -Og compiler optimization option can be that new bugs suddenly show up. Since code generation is different, the timing of time-critical parts of the code can change and lead to erroneous behavior. This happened, for instance, in the case of the FastLED library. Here, a chain of inline assembly statements was compiled differently with the -Og option. This is arguably a weakness of the original library code, since inline assembly is supposed to shield you from differences in code generation. In this case, however, a large number of inline assembly statements were chained, which gave the compiler the freedom to do some 'crazy' things between these statements.

Link-time optimization

Link-time optimization can optimize away important structural debug information about C++ objects and global variables.

Link-time optimization is a relatively new technique and was introduced into the Arduino IDE only in 2020. It optimizes across all compilation units and is able to prune away unused functions and data structures, as well as to inline functions across compilation units.

The disadvantage is that link-time optimization prunes away essential information about C++ objects, so that class instances suddenly appear to be variables of a structure type. Furthermore, it prunes away the information that variables are global, which means that no variables are displayed in the VARIABLES debugging pane of the Arduino IDE 2. Finally, because of aggressive inlining, this technique can provoke stack overflows.

All these problems disappear when link-time optimization is disabled. However, in this case, much more code space is needed.

Breakpoints in interrupt routines

Breakpoints in interrupt routines can throw off the timing of time-critical code.

It is a good thing that one can put breakpoints in interrupt service routines. However, usually interrupt routines are meant to react fast to time-critical events from the environment. After having stopped in an interrupt routine, it might not be meaningful to continue executing, because the events that the routine was supposed to handle are long gone.

Conditional breakpoints

Using conditional breakpoints can slow down execution significantly.

One can attach conditions to breakpoints using the GDB command condition or by right-clicking on the breakpoint in an IDE/GUI. Every time the breakpoint is hit, the condition is evaluated, and execution stops only when the expression evaluates to true. Similarly, with the GDB command ignore, one can request that a stop be performed only after a given number of breakpoint hits. Again, this is also possible through an IDE/GUI. While this is a handy tool, it is also very costly in terms of execution time. Each breakpoint hit can take 100 milliseconds or more, even when the condition is false, meaning that a simple loop with 1000 iterations can easily take 100 seconds (more than a minute and a half). In other words, never try this with a loop that iterates 10000 times.
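To get a feeling for the cost, one can estimate the overhead before setting such a breakpoint. The following sketch simply multiplies the number of breakpoint hits by the 100 milliseconds per hit mentioned above. The function name and the default value are illustrative, and the real cost per hit depends on the hardware debugger and the target.

```python
def conditional_breakpoint_overhead_s(hits, ms_per_hit=100):
    """Estimated time in seconds spent evaluating a breakpoint 'hits' times."""
    return hits * ms_per_hit / 1000


if __name__ == "__main__":
    for hits in (10, 1000, 10_000):
        seconds = conditional_breakpoint_overhead_s(hits)
        print(f"{hits:>6} hits: about {seconds:.0f} s ({seconds / 60:.1f} min)")
```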

Single-stepping

Single-stepping is not the same as executing the instruction in its usual context.

Single-stepping throws off timing

If you single-step over instructions (either explicitly or implicitly), each of them takes much longer than it would when executed directly. This means that under some circumstances, code succeeds when single-stepped but fails when executed at full speed:

LDI R16, 0x66
OUT PORTB, R16
IN R2, PINB

When running this code normally, the value 0x66 would not be read back into the R2 register. However, when single-stepping this code, there is enough time for the new value to settle and become readable. In order to make 0x66 readable under normal execution, a NOP is necessary between OUT and IN.
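The effect can be reproduced with a deliberately simplified toy model, in which the value visible in PINB lags the PORTB latch by one extra clock cycle. This is not a cycle-accurate description of the AVR hardware. The class, the function, and the number of idle cycles are assumptions chosen only to illustrate why extra time between OUT and IN changes the result.

```python
class ToyPort:
    """Toy I/O port: the PIN value lags the PORT latch via a synchronizer."""

    def __init__(self):
        self.port = 0x00      # output latch, written by OUT
        self.pin = 0x00       # synchronized pin value, read by IN
        self._sampled = 0x00  # value still travelling through the synchronizer

    def tick(self, cycles=1):
        """Advance the clock; each cycle moves the data one stage further."""
        for _ in range(cycles):
            self.pin = self._sampled
            self._sampled = self.port


def run(program, idle_cycles_between=0):
    """Execute a list of mnemonics and return the value that IN reads."""
    port = ToyPort()
    result = None
    for instr in program:
        if instr == "OUT":
            port.port = 0x66
        elif instr == "IN":
            result = port.pin
        elif instr != "NOP":
            raise ValueError(f"unknown instruction {instr!r}")
        # One cycle for the instruction itself plus the time spent stopped.
        port.tick(1 + idle_cycles_between)
    return result


if __name__ == "__main__":
    print(hex(run(["OUT", "IN"])))                            # full speed
    print(hex(run(["OUT", "NOP", "IN"])))                     # with a NOP
    print(hex(run(["OUT", "IN"], idle_cycles_between=1000)))  # single-stepped
```

In this model, the full-speed run reads back the old value, while both the version with the NOP and the single-stepped version read 0x66.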

On the other hand, often instructions need to be executed closely together. Since the I/O clock and peripherals continue to run at full speed in stopped mode, single-stepping through such code will not meet the timing requirements. To successfully read or write registers with such timing requirements, the whole read or write sequence should be performed as an atomic operation running the device at full speed.

Single-stepping BREAK instructions

BREAK instructions are used to implement software breakpoints. However, it can happen that the debugger is asked to single-step over a BREAK instruction that has not been inserted as a software breakpoint. Either the user has placed the instruction explicitly into the code (for unknown reasons), or, more likely, the instruction is left over from a previous debugging session that ended abruptly. In either case, it does not make sense to continue executing the code, and this is reported back to the user.

Single-stepping SLEEP instructions

Single-stepping means that a single instruction is executed and then control is immediately returned to the debugger. This does not work with a SLEEP instruction since executing it means waiting for some external event to end it. For this reason, when single-stepping a SLEEP instruction, it is treated as a NOP instruction. When you want to debug the sleep state, use a breakpoint.
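The special treatment of BREAK and SLEEP during single-stepping can be summarized as a small decision function. This is a sketch of the behavior described above, not PyAvrOCD's actual implementation. The names StepAction and plan_single_step are hypothetical, while the two opcode values are the standard AVR encodings of BREAK and SLEEP.

```python
from enum import Enum

BREAK_OPCODE = 0x9598  # AVR encoding of BREAK
SLEEP_OPCODE = 0x9588  # AVR encoding of SLEEP


class StepAction(Enum):
    EXECUTE = "execute the instruction"
    SKIP_AS_NOP = "advance the program counter without executing"
    REFUSE = "stop and report the stray BREAK to the user"


def plan_single_step(opcode, pc, software_breakpoints):
    """Decide how to single-step the instruction word found at address pc."""
    if opcode == BREAK_OPCODE and pc not in software_breakpoints:
        # A BREAK the debugger did not insert: continuing makes no sense.
        return StepAction.REFUSE
    if opcode == SLEEP_OPCODE:
        # Executing SLEEP would wait for an external event; treat it as NOP.
        return StepAction.SKIP_AS_NOP
    # Anything else, including a BREAK at a known software breakpoint
    # (where the original instruction is executed instead), is stepped.
    return StepAction.EXECUTE


if __name__ == "__main__":
    known = {0x0100}
    print(plan_single_step(BREAK_OPCODE, 0x0200, known))
    print(plan_single_step(SLEEP_OPCODE, 0x0300, known))
    print(plan_single_step(0x0000, 0x0400, known))
```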

I/O register access

Some I/O registers cannot be accessed from the debugging UI.

Certain I/O registers cannot be read without side effects, such as clearing flags or consuming buffered data (e.g., the registers UDR and SPDR). These registers are write-only for the debugger and will always show 0x00 when read in the debugging user interface. If you use the Arduino IDE 2 or PlatformIO, then the PERIPHERALS debugger pane will show you a comment to this effect.

Other I/O registers cannot be written to without side effects, e.g., registers where a flag is cleared by writing a '1' to a particular bit. These are read-only to the debugger, and any write attempt will fail silently (but PyAvrOCD will issue a warning). Again, if you use the Arduino IDE 2 or PlatformIO, the PERIPHERALS pane will inform you that the register is read-only to the debugger.
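The resulting access policy can be pictured as a thin guard in front of the register file. The following sketch is purely illustrative: the class and the two sets are hypothetical, UDR and SPDR are taken from the text, and TIFR1 is only an assumed example of a register whose flags are cleared by writing a '1'. The actual register lists depend on the MCU.

```python
import warnings

WRITE_ONLY_FOR_DEBUGGER = {"UDR", "SPDR"}  # reading would have side effects
READ_ONLY_FOR_DEBUGGER = {"TIFR1"}         # assumption: example flag register


class GuardedRegisters:
    """Register access as seen from the debugging user interface."""

    def __init__(self, backing):
        self._regs = backing  # maps register name to its 8-bit value

    def read(self, name):
        if name in WRITE_ONLY_FOR_DEBUGGER:
            return 0x00  # never touch the real register, always show 0x00
        return self._regs[name]

    def write(self, name, value):
        if name in READ_ONLY_FOR_DEBUGGER:
            warnings.warn(f"{name} is read-only for the debugger; write ignored")
            return
        self._regs[name] = value & 0xFF


if __name__ == "__main__":
    regs = GuardedRegisters({"UDR": 0x41, "TIFR1": 0x01, "PORTB": 0x00})
    print(hex(regs.read("UDR")))    # shows 0x0 regardless of the real content
    regs.write("PORTB", 0x20)       # an ordinary register is written normally
    print(hex(regs.read("PORTB")))
    regs.write("TIFR1", 0x01)       # ignored, with a warning
```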

Unsafe exits from debugging

MCU and hardware debugger can be in an undefined state after abrupt exits.

When a debugging session is abruptly ended by removing power, removing the connection to the debugger, or by other means, the MCU might end up in a 'dirty' state:

  • Some fuses, such as OCDEN, might still be in the programmed state.
  • Software breakpoints (that are implemented as BREAK instructions) may not have been removed.

It may be enough to start a new debugging session and reflash the program. However, the hardware debugger might also be in a confused state and reject any communication attempt. In this case, the best way to proceed is to disconnect and reconnect all the devices before starting a new debugging session.
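If a dump of the flash contents is available, leftover software breakpoints can be looked for by scanning for the BREAK opcode. The sketch below assumes a raw binary image with little-endian 16-bit instruction words. The function name is hypothetical. A match is only a hint, since the same bit pattern can also occur as constant data or as the second word of a two-word instruction.

```python
import struct

BREAK_OPCODE = 0x9598  # AVR encoding of BREAK


def find_break_words(flash):
    """Return the byte addresses of all 16-bit words equal to BREAK."""
    hits = []
    for addr in range(0, len(flash) - 1, 2):
        (word,) = struct.unpack_from("<H", flash, addr)
        if word == BREAK_OPCODE:
            hits.append(addr)
    return hits


if __name__ == "__main__":
    # NOP, BREAK, NOP as they would appear in a raw flash image.
    image = bytes([0x00, 0x00, 0x98, 0x95, 0x00, 0x00])
    print([hex(a) for a in find_break_words(image)])
```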

Undebuggable MCUs

Some AVR MCUs are not debuggable or offer only limited debug support.

MCUs without a debugging interface (e.g., ATtiny15, ATmega8) can, of course, not be debugged. In addition, there exist a few variants that cannot be debugged because they have special features that make them undebuggable by GDB. These are:

  • ATmega48,
  • ATmega88,
  • ATmega16,

all without any A- or P-suffix. These MCUs have a stuck-at-1 bit in their program counter, which confuses GDB. The Microchip debugging solutions have apparently found a way around it. Since these chips have the same chip signature as their cousins with an A-suffix, it takes some effort to identify and reject them. Furthermore, I have seen a relabeled ATmega16 chip that was sold as an ATmega16A, although its internal revision number did not match.

Finally, we have the ATmega128(A), which offers only hardware breakpoints. This is a bit odd, since the data sheet explicitly states that the BREAK instruction can be used to implement software breakpoints. However, all manuals of the more recent Atmel debuggers note that one can use only the hardware breakpoints on an ATmega128(A), and a call to software_breakpoint_set indeed throws an exception. For this reason, PyAvrOCD will automatically select the 'hardware breakpoint only' mode (not yet implemented).