One of the most common problem reports related to Virtual Platforms running Linux goes something like:
I run the simulation and the terminal says "Uncompressing Linux... done, booting the kernel" and nothing happens.
One of my favorite books is Embedded Linux Primer: A Practical Real-World Approach (2nd Edition) by Chris Hallinan. I was happy to see this same situation also occurs on real boards running Linux, and it is covered in Chapter 14 with a section called "When It Doesn't Boot".
Here is a picture of the problem:
During system boot the kernel uses printk() calls to display messages. These messages are useful clues to monitor boot progress. Unfortunately, for the problem described above, there is nothing on the screen to provide any clues.
The reason is that printk() messages go into the Linux kernel message ring buffer and are not immediately printed on the terminal. The messages are available, but they do not appear on the screen until the buffer is flushed.
Once the system is booted, the dmesg command can be used to display the kernel messages.
Other daemons such as syslogd may also be used to save the messages into a file such as /var/log/* or even log the messages over the network to another machine.
Specifics of how a particular Linux implementation is configured and how messages are stored can be very different. Some products may save logs into files and some may not. Some hardware may have network interfaces to retrieve log files and some may not.
There are actually two challenges to consider when simulating embedded Linux systems:
- Sometimes there is no way to capture the messages from a simulation to the host machine. For systems using a ramdisk as the root file system, the messages may be lost on simulation exit. If the UART output is not saved, it is lost as well, which prevents inspection of the messages after the simulation has finished.
- Messages are created during the early phases of the Linux boot, but until important device drivers are initialized, the messages sit in the memory buffer and are not visible on the screen. Seeing them sooner would indicate if the kernel is progressing or not and how far it gets before something goes wrong.
Details of Operation:
One approach that requires no Linux knowledge is to make sure models that interface with UART models also write the characters to a log file or special log channel at the same time they are written to the terminal. This solution requires the model creator to put in logging functionality.
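As a rough illustration of that idea, here is a minimal C sketch of a terminal model that writes each character to a log stream at the same time as the terminal. The function name terminal_put_char and the use of stdio streams are assumptions made for this example; they are not the API of any particular UART or terminal model.

```c
#include <stdio.h>

/* Hypothetical terminal-model hook, called for every character the UART
 * model transmits. Nothing here is taken from a real model's API. */
static int terminal_put_char(FILE *terminal, FILE *log, char c)
{
    int sinks = 0;

    /* Normal path: show the character on the terminal. */
    if (terminal != NULL && fputc(c, terminal) != EOF)
        sinks++;

    /* Logging path: write the same character to the log stream. */
    if (log != NULL && fputc(c, log) != EOF) {
        /* Flush right away so the log survives an abrupt simulation exit. */
        fflush(log);
        sinks++;
    }

    return sinks; /* number of streams that received the character */
}
```

Flushing after every character is slow, but it keeps the log file complete even if the simulation is killed, which is the situation the first challenge describes.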
The Cadence Virtual Platform for the Xilinx Zynq-7000 EPP does exactly this. The terminal model that connects to the UARTs can log the output written to the terminal and can feed input into the terminal for test automation. To help with timing, the model can wait for expected output before sending input, run forward a defined simulation time, send a Control-C to kill a program, and synchronize with embedded software breakpoints. Because output only reaches the UART model once the UART device driver is running, this approach addresses only the first challenge; early printk() messages are still not visible. For more information on the UART, refer to my five-article series.
The printk() function is located in kernel/printk.c in the Linux source tree.
The ring buffer is a static array of characters:
static char __log_buf[__LOG_BUF_LEN];
static char *log_buf = __log_buf;
The size of the buffer can be configured at build time via kernel configuration or at boot time by passing log_buf_len=n on the kernel command line, where n is the size in bytes, kbytes, or Mbytes.
The variable log_buf_len holds the size of the buffer and the default is 0x4000 (16k).
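To make the buffer behavior concrete, here is a simplified user-space model of a character ring buffer of this kind. The writer is named after the kernel's emit_log_char(), but the body, the log_start and log_end indices, and the read_log() helper are assumptions for illustration, not a copy of kernel/printk.c.

```c
#include <stddef.h>

/* Simplified user-space model of a kernel-style log ring buffer.
 * This is a sketch for illustration, not kernel code. */
#define LOG_BUF_LEN  0x4000u               /* 16k, must be a power of two */
#define LOG_BUF_MASK (LOG_BUF_LEN - 1u)

static char log_buf[LOG_BUF_LEN];
static unsigned log_start; /* index of the oldest character still stored */
static unsigned log_end;   /* index one past the newest character */

/* Store one character, dropping the oldest one when the buffer is full. */
static void emit_log_char(char c)
{
    log_buf[log_end & LOG_BUF_MASK] = c;
    log_end++;
    if (log_end - log_start > LOG_BUF_LEN)
        log_start = log_end - LOG_BUF_LEN;
}

/* Copy the buffered text, oldest first, into out as a NUL-terminated
 * string. Returns the number of characters copied. */
static size_t read_log(char *out, size_t out_size)
{
    size_t n = 0;
    unsigned i;

    if (out_size == 0)
        return 0;
    for (i = log_start; i != log_end && n + 1 < out_size; i++)
        out[n++] = log_buf[i & LOG_BUF_MASK];
    out[n] = '\0';
    return n;
}
```

Nothing in this model touches a console: characters accumulate in memory until something reads them out, which is why the messages exist long before they show up on the terminal.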
To find the memory buffer we can use a software debugger like gdb to see the address of the global variable. This can be done during a simulation session when gdb is connected, or offline by loading vmlinux into gdb without running the simulation.
For example, for the Zynq Virtual Platform with the dual-core ARM Cortex-A9 CPU I used gdb to find the address of the log_buf array as shown below:
Finding the address of the ring buffer requires vmlinux (with debug symbols) to be available. This is not required to run the system, but is required for source level debugging so I normally have a kernel with debug symbols available during development.
The messages can be seen in the buffer before any messages appear on the screen using the Virtual System Platform Memory Viewer. Since gdb prints the virtual address of the buffer, the leading 0xC must be removed (that is, 0xC0000000 subtracted) to get the physical address to use in the memory viewer.
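The address adjustment can be written down as a small C helper. This sketch assumes the kernel is mapped at virtual address 0xC0000000 and that RAM starts at physical address 0, which is what dropping the leading 0xC implies; the function name and the sample address in the usage note are made up for the example.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xC0000000u /* assumed start of the kernel's virtual mapping */
#define PHYS_OFFSET 0x00000000u /* assumed physical start of RAM */

/* Convert a kernel lowmem virtual address (as printed by gdb) to the
 * physical address to enter in a memory viewer. */
static uint32_t lowmem_virt_to_phys(uint32_t vaddr)
{
    return vaddr - PAGE_OFFSET + PHYS_OFFSET;
}
```

For a made-up virtual address of 0xC04A1B2C, lowmem_virt_to_phys() returns 0x004A1B2C. On a platform where RAM does not start at 0, PHYS_OFFSET would have to change accordingly.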
I ran the Zynq Virtual Platform through some of the early startup functions. Nothing is on the terminal yet, but I can already see the messages in the buffer. It's common for anybody doing source-level debug on the kernel to wonder why they are stepping through printk() calls while nothing appears on the terminal.
After some time the ring buffer is flushed to the UART and you can see the same messages on the terminal.
The buffer can also be seen in the Embedded Software Debug Memory Sidebar. This memory view looks at the system from the CPU point of view, so the full virtual addresses are used. Since the display is 32 bits per line, it does not turn out to be very readable as an ASCII view.
One way to automate capturing the messages as soon as they are written to the buffer is to put a breakpoint on the C function that writes each character into the ring buffer. The function is emit_log_char(char c), and the character is passed as its only argument. Since the ARM procedure call standard places the first function argument in R0, the character can be read from R0 and printed when the breakpoint is hit. Software debuggers can be set to continue immediately after the breakpoint action, so there is no need to actually stop on every character or hit run again.
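The breakpoint action itself is tiny. Here is a sketch in C of what a debugger-side hook might do each time the breakpoint is hit. The arm_regs and klog_capture structures and the on_emit_log_char() callback are hypothetical; a real debugger exposes registers and breakpoint actions through its own scripting interface.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the ARM core registers at a breakpoint. */
struct arm_regs {
    uint32_t r[16];
};

/* Hypothetical host-side buffer that collects the captured text. */
struct klog_capture {
    char   text[1024];
    size_t len;
};

/* Called when the breakpoint on emit_log_char() is hit. The debugger is
 * assumed to resume the CPU as soon as this returns. */
static void on_emit_log_char(const struct arm_regs *regs,
                             struct klog_capture *cap)
{
    /* The first argument is in R0; a char occupies the low byte. */
    char c = (char)(regs->r[0] & 0xFFu);

    /* Append while leaving room for the terminating NUL. */
    if (cap->len + 1 < sizeof cap->text) {
        cap->text[cap->len++] = c;
        cap->text[cap->len] = '\0';
    }
}
```

Masking R0 to its low byte is a defensive choice in this sketch, so the upper bits of the register never affect the captured character.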
When you are running the Zynq Virtual System Platform, a command is available to turn on kernel message logging and it can be turned on and off at any time during the simulation. The kernel messages are shown in the simulation console and also go into the simulation log file. The name of the command is klog, and I circled the buttons on the console used to turn it on and off.
The Imperas VAP (Verification, Analysis, and Processing) tools are also available for the Zynq Virtual Platform and provide another way to automate early access to the message buffer. The Zynq Virtual Platform provides a button on the simulation console to turn on kernel message logging. This application of VAP tools is only a small example of what is possible. More advanced features provide Linux process tracing, profiling, and kernel code coverage, but these are topics for future articles.
VAP tools don't place a breakpoint on the C function, but use a more advanced technique to intercept function calls. This technique is faster than the breakpoint technique, and doesn't require a debugger such as gdb or the built-in VSP embedded software debugger to be connected. VAP tool commands are extensions of the VSP-provided tlmcpu command as shown in the console below.
Once a new system is booting successfully and reaching the first message flush, usually nobody cares about this topic any more. But I have found that even after a system is working well, there is always a chance of something going wrong and returning to the fatal "Uncompressing Linux..." blank screen. Sometimes it is due to a change in the kernel configuration, a change in the Linux device tree, or the introduction of a new device driver. Because of the dynamic nature of Linux, change is inevitable and the risk of boot failure is always there, so learning how to recover is essential to keep projects on track and avoid time-consuming debug sessions.
Feel free to share your experiences debugging early system bring-up and any techniques you use to do it productively.