Compilation process with Linker and Startup code
Compilation is the process of converting source code into object code, and it takes several steps to produce the final executable file. The linker and the startup code are two important parts of building and booting: they make sure the binary is organized correctly and the system can start running.
Last update: 2022-06-04
Overview#
Following are the steps that a program goes through until it is translated into an executable form:
- Preprocessing
- Compilation
- Assembly
- Linking
In general, compilation for microcontrollers (MCUs) is the same as compilation for executables on an operating system. However, there are some main differences:
- Cross-compilation: An MCU cannot run a compiler itself, so the code must be built on a host machine with a cross-compiler
- Library: MCUs use lightweight versions of the libraries to reduce the program footprint, which might also improve performance
- Hardware dependence: Many library functions, such as standard I/O, ship with only a minimal implementation that mostly does nothing. The actual implementation must be provided for the specific hardware.
- Linking: The executable has to define sections that are stored in different memory spaces at run time (Flash/RAM). On an MCU, the CPU can execute instructions directly from Flash.
For the general steps in compilation, refer to the Compilation for C/C++ on OS.
ARM Toolchain#
If you use STM32Cube IDE, the IDE already has a toolchain for STM32 MCUs. If you start without the IDE, you can start with Arm GNU Toolchain.
A good alternative toolchain package is The xPack Build Framework:
The xPack project aims to provide a set of cross-platform tools to manage, configure and build complex, modular, multi-target (multi-architecture, multi-board, multi-toolchain) projects, with an emphasis on C/C++ and bare-metal embedded projects.
- @xpack-dev-tools/arm-none-eabi-gcc - the xPack Arm Embedded GCC toolchain
- @xpack-dev-tools/openocd - the xPack OpenOCD
Download and install them, and remember to add their binary folders to the system PATH environment variable.
EABI
The default ARM tool chain application binary interface is the Embedded Application Binary Interface (EABI). It defines the conventions for files, data types, register mapping, stack frame and parameter passing rules. The EABI is commonly used on ARM and PowerPC CPUs.
Example program#
This is the Blink program, the Hello World of embedded systems: it blinks an LED at 10 Hz.
You can use any STM32 board because this is a very simple project. I chose a Nucleo-64 board with an STM32F411RE.
The main code blinks the LED on PA5 by accessing the registers directly:
#include <stdint.h>
#include "delay.h"
/* Clock */
#define RCC_AHB1ENR *((volatile uint32_t*) (0x40023830))
/* GPIO A */
#define GPIOA_MODER *((volatile uint32_t*) (0x40020000))
#define GPIOA_BSRR *((volatile uint32_t*) (0x40020018))
/* Global initialized variable */
uint32_t isLoop = 1;
int main() {
/* turn on clock on GPIOA */
RCC_AHB1ENR |= (1 << 0);
/* set PA5 to output mode */
GPIOA_MODER &= ~(1 << 11);
GPIOA_MODER |= (1 << 10);
while(isLoop) {
/* set HIGH on PA5 */
GPIOA_BSRR |= (1 << 5);
delay();
/* set LOW on PA5 */
GPIOA_BSRR |= (1 << (5+16));
delay();
}
return 0;
}
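The bit arithmetic in main() can be factored into small helpers. The sketch below is only an illustration, not part of the original project: the names moder_set_mode, bsrr_set and bsrr_reset are hypothetical, and the helpers only compute register values so they can be checked on a host machine without hardware.

```c
#include <stdint.h>

/* Hypothetical helpers (not in the original project): they only compute
 * register values, so they can be checked on a PC without hardware. */

/* Each pin owns a 2-bit field in GPIOx_MODER; mode 1 is general-purpose output. */
static inline uint32_t moder_set_mode(uint32_t moder, unsigned pin, uint32_t mode) {
    moder &= ~(UINT32_C(3) << (pin * 2));        /* clear both bits of the field */
    moder |= (mode & UINT32_C(3)) << (pin * 2);  /* write the new mode */
    return moder;
}

/* GPIOx_BSRR: bits 0..15 set a pin, bits 16..31 reset it. */
static inline uint32_t bsrr_set(unsigned pin)   { return UINT32_C(1) << pin; }
static inline uint32_t bsrr_reset(unsigned pin) { return UINT32_C(1) << (pin + 16); }
```

On the target, GPIOA_BSRR = bsrr_set(5); would have the same effect as the |= form used above, because BSRR is a write-only register whose bits read back as zero.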
The delay function using a busy loop:
#include <stdint.h>
/* Global Read-only variable */
const uint32_t DELAY_MAX = 0x0000BEEF;
/* Global uninitialized variable */
uint32_t delay_counter;
void delay() {
for(delay_counter=DELAY_MAX; delay_counter--;);
}
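This busy loop relies on the compiler emitting every decrement, which holds for the unoptimized builds used in this article. With optimization enabled, a compiler may collapse a loop over a non-volatile counter. A minimal sketch of a more robust variant follows; delay_cycles and delay_ticks are hypothetical names, not part of the original project.

```c
#include <stdint.h>

/* Sketch only: delay_cycles is a hypothetical variant of delay().
 * The volatile qualifier forces every read and write of the counter
 * to be performed, so the loop is not removed when optimization is on. */
static volatile uint32_t delay_ticks;

void delay_cycles(uint32_t count) {
    for (delay_ticks = count; delay_ticks != 0; delay_ticks--) {
        /* busy wait */
    }
}
```

The actual delay still depends on the clock frequency and on the generated instructions, so the count has to be tuned on the board.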
Firstly, just try to compile the program without any specific option:
arm-none-eabi-gcc \
main.c delay.c \
-o main.elf
/arm-none-eabi/bin/ld.exe:
/arm-none-eabi/lib\libc.a(lib_a-exit.o): in function `exit':
(.text.exit+0x2c): undefined reference to `_exit'
Of course, the build fails at the link step!
By default, GCC tries to link the application with libc from the newlib package, and there is no implementation of the function _exit.
For now, we tell the compiler not to use the standard libraries:
arm-none-eabi-gcc \
-nostdlib \
main.c delay.c \
-o main.elf
Just ignore the warning about entry symbol. We’ll fix it later.
Compiler options#
In the Step 5: Check the compilation settings, the STM32Cube IDE automatically sets some compilation flags. How do you select those flags?
GNU Online Documentation is available for different versions.
- Target Architecture
  We can use either the -march= or the -mcpu= option, but -mcpu=cortex-m4 is easier to understand and remember. Note that Cortex-M only supports the Thumb instruction set, so you have to use the -mthumb option. Let's use software floating point for now.
  -mcpu=cortex-m4 -mthumb -mfloat-abi=soft
- Target GNU standard
  -std=gnu11
- Target Libraries
  The GNU Arm toolchain uses newlib as its implementation of the C standard library. To reduce the code size, MCUs commonly use a lightweight variant named newlib-nano.
  However, newlib-nano does not implement the low-level system calls that C library functions such as printf() or scanf() rely on. To make the application link, a library named nosys should be added. This library only provides simple stub implementations of the low-level system calls, which mostly return a placeholder value.
  To use the newlib-nano and nosys libraries:
  --specs=nano.specs --specs=nosys.specs
  For now, we ignore the standard libraries and come back to them later.
- Compilation warnings
  To see potential errors, enable all common warnings:
  -Wall
- Debug level
  Turn on debug information if needed:
  -g
Hence, the build command will be:
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
-nostdlib \
-Wall \
main.c delay.c \
-o main.elf
Just ignore the warning about the entry symbol Reset_Handler. We’ll fix it later.
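Regarding the Target Libraries item above: when the standard libraries are brought back, output from functions such as printf() reaches the hardware through a low-level _write system call, and the nosys stub discards it. A project can supply its own version. The sketch below shows the general shape under stated assumptions: uart_putc is a hypothetical transmit hook that fills a buffer instead of touching a UART, and the function is named retarget_write instead of _write so that it can be compiled and checked on a host machine.

```c
#include <stddef.h>

/* Hypothetical transmit hook: on real hardware this would write one byte
 * to a UART data register. Here it fills a buffer so the sketch runs on a PC. */
static char   tx_log[64];
static size_t tx_len;

static void uart_putc(char c) {
    if (tx_len < sizeof tx_log) {
        tx_log[tx_len++] = c;
    }
}

/* Same shape as newlib's _write(file, ptr, len) system call.
 * It is named retarget_write here (an assumption for this sketch) so it
 * does not clash with the host C library. */
int retarget_write(int file, const char *ptr, int len) {
    (void)file;                 /* stdout and stderr share one output */
    for (int i = 0; i < len; i++) {
        uart_putc(ptr[i]);
    }
    return len;                 /* report that every byte was consumed */
}
```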
Program sections#
Run arm-none-eabi-objdump to see the sections and code of the output:
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
-nostdlib \
-Wall \
-c main.c \
-o main.o
arm-none-eabi-objdump -h main.o > main.o.obj_h
main.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 00000068 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000004 00000000 00000000 0000009c 2**2
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000000 00000000 00000000 000000a0 2**0
ALLOC
3 .comment 0000003a 00000000 00000000 000000a0 2**0
CONTENTS, READONLY
4 .ARM.attributes 0000002e 00000000 00000000 000000da 2**0
CONTENTS, READONLY
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb \
-nostdlib \
-std=gnu11 \
-Wall \
-c delay.c \
-o delay.o
arm-none-eabi-objdump -h delay.o > delay.o.obj_h
delay.o: file format elf32-littlearm
Sections:
Idx Name Size VMA LMA File off Algn
0 .text 0000002c 00000000 00000000 00000034 2**2
CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
1 .data 00000000 00000000 00000000 00000060 2**0
CONTENTS, ALLOC, LOAD, DATA
2 .bss 00000004 00000000 00000000 00000060 2**2
ALLOC
3 .rodata 00000004 00000000 00000000 00000060 2**2
CONTENTS, ALLOC, LOAD, READONLY, DATA
4 .comment 0000003a 00000000 00000000 00000064 2**0
CONTENTS, READONLY
5 .ARM.attributes 0000002e 00000000 00000000 0000009e 2**0
CONTENTS, READONLY
- .text: Code and data
  This section contains the instructions, which are located in Flash. It also stores constant values that are encoded as raw bytes at the end of a function.
- .data: Initialized variables
  Variables can change their values, so they must live in RAM. Their initial values are stored in Flash and copied to RAM by the startup code.
  In this example, main.o has 4 bytes for uint32_t isLoop = 1;.
- .bss: Uninitialized variables
  These variables also live in RAM. However, because they have no initial values, nothing is stored in Flash for them; we only need to reserve memory for them.
  The entire .bss section is described by a single number, probably 4 or 8 bytes, that gives its size in the running process, whereas the .data section is as big as the sum of the sizes of the initialized variables.
  In this example, delay.o has 4 bytes for uint32_t delay_counter;.
- .rodata: Read-only data
  Constant variables are stored in Flash.
  In this example, delay.o has 4 bytes for const uint32_t DELAY_MAX = 0x0000BEEF;.
Data (variable) | Load time | Run time | Section | Note |
---|---|---|---|---|
Global initialized | Flash | RAM | .data | Copy from Flash to RAM by startup code |
Global static initialized | Flash | RAM | .data | Copy from Flash to RAM by startup code |
Local static initialized | Flash | RAM | .data | Copy from Flash to RAM by startup code |
Global uninitialized | - | RAM | .bss | Reserved space by startup code |
Global static uninitialized | - | RAM | .bss | Reserved space by startup code |
Local static uninitialized | - | RAM | .bss | Reserved space by startup code |
All global constants | Flash | - | .rodata | |
All other local | - | RAM (Stack) | - | App code uses stack to store |
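To make the table concrete, the sketch below annotates a few declarations with the section they would typically land in. All names are made up for this illustration, and the placement comments assume a GCC build like the one shown earlier; checking with arm-none-eabi-objdump -h remains the reliable way to confirm.

```c
#include <stdint.h>

uint32_t        boot_count = 1;        /* global initialized       -> .data   */
static uint32_t retry_limit = 3;       /* static initialized       -> .data   */
uint32_t        error_count;           /* global uninitialized     -> .bss    */
const uint32_t  TIMEOUT_MAX = 0xBEEF;  /* global constant          -> .rodata */

uint32_t next_ticket(void) {
    static uint32_t ticket;            /* local static, no initial -> .bss    */
    uint32_t step = 1;                 /* ordinary local           -> stack   */
    ticket += step;
    return ticket;
}

uint32_t retries_left(uint32_t used) {
    return (used < retry_limit) ? retry_limit - used : 0;
}
```

Note that ticket keeps its value between calls because it lives in .bss, not on the stack.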
Linker and Locator#
The linker merges the sections of all object files into the final executable file, and the locator step assigns memory addresses to them.
main.c --> main.o {
.text,
.data,
.bss,
.rodata
}
delay.c --> delay.o {
.text,
.data,
.bss,
.rodata}
main.elf = main.o + delay.o = {
.text = .text(main) + .text(delay)
.data = .data(main) + .data(delay)
.bss = .bss(main) + .bss(delay)
.rodata = .rodata(main) + .rodata(delay)
}
A linker script is used to describe the memory layout:
- ENTRY command
  Sets the entry point address in the ELF header, which tells tools such as GDB which instruction is executed first.
  ENTRY(symbol)
- MEMORY command
  Describes the different memory regions in the system. The linker uses this information to calculate addresses.
  MEMORY { name (attribute): ORIGIN = <address>, LENGTH = <size> }
- SECTIONS command
  Creates the memory layout by defining the output sections and their order. For each section, it selects which input data goes in, and where that data is stored and loaded.
  The location counter is a special symbol denoted by a dot (.). The linker automatically updates it with the current location. A symbol can be assigned from it to mark a boundary, and the location counter can also be set explicitly.
  SECTIONS
  {
    <symbol> = LOADADDR(.<section>);
    .<section> :
    {
      <symbol> = .;
      *(.sub_section)
      . = ALIGN(n);
    } > <Run Location> [AT> <Storage Location>]
  }
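The ALIGN(n) expression used with the location counter rounds an address up to the next multiple of n. The following sketch reproduces that arithmetic in C so it can be tried on a host machine; align_up is a hypothetical helper name, and the code assumes n is a power of two.

```c
#include <stdint.h>

/* Sketch of the arithmetic behind ". = ALIGN(n)": round an address up to
 * the next multiple of n. Assumes n is a power of two. align_up is a
 * hypothetical helper name. */
uint32_t align_up(uint32_t addr, uint32_t n) {
    return (addr + (n - 1)) & ~(n - 1);
}
```

For example, an address of 0x08000095 aligned to 4 bytes becomes 0x08000098, while an already aligned address is left unchanged.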
Here is the linker script:
ENTRY(Reset_Handler)
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 512K
}
_estack = ORIGIN(RAM) + LENGTH(RAM);
SECTIONS
{
.isr_vector :
{
*(.isr_vector)
} >FLASH
.text :
{
*(.text)
_etext = .;
} >FLASH
.rodata :
{
*(.rodata)
} >FLASH
_lddata = LOADADDR(.data);
.data :
{
_sdata = .;
*(.data)
_edata = .;
} >RAM AT> FLASH
.bss :
{
_sbss = .;
*(.bss)
_ebss = .;
} >RAM
}
In the linker script, we define some symbols:
- _etext: End address of the .text section
- _lddata: Load address (in Flash) of the .data section
- _sdata: Start address of the .data section
- _edata: End address of the .data section
- _sbss: Start address of the .bss section
- _ebss: End address of the .bss section
To build with the linker script, use -T <linkerfile>. The option -Wl,-Map=<output> writes a map file that shows the full memory mapping.
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
-nostdlib \
-Wall \
-T linker.ld -Wl,-Map=main.tmp.map \
main.o delay.o \
-o main.tmp
Open the file main.tmp.map to see the addresses assigned to the symbols in the linker script.
Memory Configuration
Name Origin Length Attributes
RAM 0x0000000020000000 0x0000000000020000 xrw
FLASH 0x0000000008000000 0x0000000000080000 xr
*default* 0x0000000000000000 0xffffffffffffffff
Linker script and memory map
LOAD main.o
LOAD delay.o
0x0000000020020000 _estack = (ORIGIN (RAM) + LENGTH (RAM))
.isr_vector
*(.isr_vector)
.text 0x0000000008000000 0x94
*(.text)
.text 0x0000000008000000 0x68 main.o
0x0000000008000000 main
.text 0x0000000008000068 0x2c delay.o
0x0000000008000068 delay
0x0000000008000094 _etext = .
.glue_7 0x0000000008000094 0x0
.glue_7 0x0000000008000094 0x0 linker stubs
.glue_7t 0x0000000008000094 0x0
.glue_7t 0x0000000008000094 0x0 linker stubs
.vfp11_veneer 0x0000000008000094 0x0
.vfp11_veneer 0x0000000008000094 0x0 linker stubs
.v4_bx 0x0000000008000094 0x0
.v4_bx 0x0000000008000094 0x0 linker stubs
.iplt 0x0000000008000094 0x0
.iplt 0x0000000008000094 0x0 main.o
.rodata 0x0000000008000094 0x4
*(.rodata)
.rodata 0x0000000008000094 0x4 delay.o
0x0000000008000094 DELAY_MAX
0x0000000008000098 _lddata = LOADADDR (.data)
.rel.dyn 0x0000000008000098 0x0
.rel.iplt 0x0000000008000098 0x0 main.o
.data 0x0000000020000000 0x4 load address 0x0000000008000098
0x0000000020000000 _sdata = .
*(.data)
.data 0x0000000020000000 0x4 main.o
0x0000000020000000 isLoop
.data 0x0000000020000004 0x0 delay.o
0x0000000020000004 _edata = .
.igot.plt 0x0000000020000004 0x0 load address 0x000000000800009c
.igot.plt 0x0000000020000004 0x0 main.o
.bss 0x0000000020000004 0x4 load address 0x000000000800009c
0x0000000020000004 _sbss = .
*(.bss)
.bss 0x0000000020000004 0x0 main.o
.bss 0x0000000020000004 0x4 delay.o
0x0000000020000004 delay_counter
0x0000000020000008 _ebss = .
OUTPUT(main.tmp elf32-littlearm)
LOAD linker stubs
.comment 0x0000000000000000 0x39
.comment 0x0000000000000000 0x39 main.o
0x3a (size before relaxing)
.comment 0x0000000000000039 0x3a delay.o
.ARM.attributes
0x0000000000000000 0x2e
.ARM.attributes
0x0000000000000000 0x2e main.o
.ARM.attributes
0x000000000000002e 0x2e delay.o
The section .data starts at 0x20000000 (loaded at 0x08000098) and is only 4 bytes, for uint32_t isLoop = 1;.
The section .bss starts at 0x20000004 and is 4 bytes, for uint32_t delay_counter. The map also prints a load address (0x0800009c) for it, but .bss occupies no space in Flash.
The section .rodata starts at 0x08000094 and is 4 bytes, for const uint32_t DELAY_MAX = 0x0000BEEF;.
To find the symbols and their addresses:
arm-none-eabi-nm main.tmp
20000008 B _ebss
20000004 D _edata
20020000 T _estack
08000094 T _etext
08000098 A _lddata
20000004 B _sbss
20000000 D _sdata
08000068 T delay
20000004 B delay_counter
08000094 R DELAY_MAX
20000000 D isLoop
08000000 T main
U Reset_Handler
Linker Symbols#
Accessing a linker script defined variable from source code is not intuitive. In particular a linker script symbol is not equivalent to a variable declaration in a high level language, it is instead a symbol that does not have a value.
Before going further, it is important to note that compilers often transform names in the source code into different names when they are stored in the symbol table. For example, Fortran compilers commonly prepend or append an underscore, and C++ performs extensive name mangling. Therefore there might be a discrepancy between the name of a variable as it is used in source code and the name of the same variable as it is defined in a linker script. For example in C a linker script variable might be referred to as:
extern int foo;
But in the linker script it might be defined as:
_foo = 1000;
In the remaining examples however it is assumed that no name transformation has taken place.
When a symbol is declared in a high level language such as C, two things happen:
- The first is that the compiler reserves enough space in the program’s memory to hold the value of the symbol.
- The second is that the compiler creates an entry in the program’s symbol table which holds the symbol’s address. ie the symbol table contains the address of the block of memory holding the symbol’s value.
So for example the following C declaration, at file scope:
int foo = 1000;
creates an entry called foo in the symbol table. This entry holds the address of an int sized block of memory where the number 1000 is initially stored.
When a program references a symbol the compiler generates code that first accesses the symbol table to find the address of the symbol’s memory block and then code to read the value from that memory block. So:
foo = 1;
looks up the symbol foo in the symbol table, gets the address associated with this symbol and then writes the value 1 into that address. Whereas:
int * a = & foo;
looks up the symbol foo in the symbol table, gets its address and then copies this address into the block of memory associated with the variable a.
Linker scripts symbol declarations, by contrast, create an entry in the symbol table but do not assign any memory to them. Thus they are an address without a value. So for example the linker script definition:
foo = 1000;
creates an entry in the symbol table called foo which holds the address of memory location 1000, but nothing special is stored at address 1000. This means that you cannot access the value of a linker script defined symbol - it has no value - all you can do is access the address of a linker script defined symbol.
Hence, when you are using a linker script defined symbol in source code you should always take the address of the symbol, and never attempt to use its value. For example suppose you want to copy the contents of a section of memory called .ROM into a section called .FLASH and the linker script contains these declarations:
start_of_ROM = .ROM;
end_of_ROM = .ROM + sizeof (.ROM);
start_of_FLASH = .FLASH;
Then the C source code to perform the copy would be as below. Note the use of the & operators. These are correct.
extern char start_of_ROM, end_of_ROM, start_of_FLASH;
memcpy (& start_of_FLASH, & start_of_ROM, & end_of_ROM - & start_of_ROM);
Alternatively the symbols can be treated as the names of vectors or arrays and then the code will again work as expected:
extern char start_of_ROM[], end_of_ROM[], start_of_FLASH[];
memcpy (start_of_FLASH, start_of_ROM, end_of_ROM - start_of_ROM);
Note how using this method does not require the use of & operators.
Vector Table#
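On reset, the processor loads the Main Stack Pointer (MSP) with the value stored at address 0x00000000, then loads the Program Counter with the value stored at address 0x00000004, which must be the address of the Reset_Handler function, and starts executing from there. When the STM32 boots from main Flash, address 0x00000000 is aliased to the Flash base address 0x08000000, which is why the vector table is placed at the start of Flash.
There are 15 system exceptions, including Reset, and up to 240 interrupts.
Table 37. Vector table for STM32F411xC/E in the document RM0383: Reference manual STM32F411xC/E advanced Arm®-based 32-bit MCUs shows the supported exceptions and interrupts: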
#include <stdint.h>
#define RAM_START 0x20000000
#define RAM_SIZE (128 * 1024)
#define RAM_END ((RAM_START) + (RAM_SIZE))
void Default_Handler(void) {
while(1) {}
}
void Reset_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void NMI_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void HardFault_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void MemManage_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void BusFault_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void UsageFault_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void SVC_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void DebugMon_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void PendSV_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void SysTick_Handler(void) __attribute__ ((weak, alias("Default_Handler")));
void WWDG_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void PVD_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TAMP_STAMP_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void RTC_WKUP_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void FLASH_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void RCC_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI0_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI1_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI2_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI3_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI4_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream0_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream1_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream2_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream3_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream4_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream5_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream6_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void ADC_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI9_5_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM1_BRK_TIM9_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM1_UP_TIM10_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM1_TRG_COM_TIM11_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM1_CC_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM2_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM3_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM4_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void I2C1_EV_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void I2C1_ER_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void I2C2_EV_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void I2C2_ER_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void SPI1_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void SPI2_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void USART1_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void USART2_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void EXTI15_10_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void RTC_Alarm_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void OTG_FS_WKUP_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA1_Stream7_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void SDIO_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void TIM5_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void SPI3_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream0_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream1_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream2_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream3_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream4_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void OTG_FS_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream5_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream6_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void DMA2_Stream7_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void USART6_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void I2C3_EV_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void I2C3_ER_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void FPU_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void SPI4_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
void SPI5_IRQHandler(void) __attribute__ ((weak, alias("Default_Handler")));
__attribute__ ((section(".isr_vector")))
uint32_t vector_table[] = {
(uint32_t) RAM_END,
(uint32_t) Reset_Handler,
(uint32_t) NMI_Handler,
(uint32_t) HardFault_Handler,
(uint32_t) MemManage_Handler,
(uint32_t) BusFault_Handler,
(uint32_t) UsageFault_Handler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) SVC_Handler,
(uint32_t) DebugMon_Handler,
(uint32_t) 0,
(uint32_t) PendSV_Handler,
(uint32_t) SysTick_Handler,
(uint32_t) WWDG_IRQHandler,
(uint32_t) PVD_IRQHandler,
(uint32_t) TAMP_STAMP_IRQHandler,
(uint32_t) RTC_WKUP_IRQHandler,
(uint32_t) FLASH_IRQHandler,
(uint32_t) RCC_IRQHandler,
(uint32_t) EXTI0_IRQHandler,
(uint32_t) EXTI1_IRQHandler,
(uint32_t) EXTI2_IRQHandler,
(uint32_t) EXTI3_IRQHandler,
(uint32_t) EXTI4_IRQHandler,
(uint32_t) DMA1_Stream0_IRQHandler,
(uint32_t) DMA1_Stream1_IRQHandler,
(uint32_t) DMA1_Stream2_IRQHandler,
(uint32_t) DMA1_Stream3_IRQHandler,
(uint32_t) DMA1_Stream4_IRQHandler,
(uint32_t) DMA1_Stream5_IRQHandler,
(uint32_t) DMA1_Stream6_IRQHandler,
(uint32_t) ADC_IRQHandler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) EXTI9_5_IRQHandler,
(uint32_t) TIM1_BRK_TIM9_IRQHandler,
(uint32_t) TIM1_UP_TIM10_IRQHandler,
(uint32_t) TIM1_TRG_COM_TIM11_IRQHandler,
(uint32_t) TIM1_CC_IRQHandler,
(uint32_t) TIM2_IRQHandler,
(uint32_t) TIM3_IRQHandler,
(uint32_t) TIM4_IRQHandler,
(uint32_t) I2C1_EV_IRQHandler,
(uint32_t) I2C1_ER_IRQHandler,
(uint32_t) I2C2_EV_IRQHandler,
(uint32_t) I2C2_ER_IRQHandler,
(uint32_t) SPI1_IRQHandler,
(uint32_t) SPI2_IRQHandler,
(uint32_t) USART1_IRQHandler,
(uint32_t) USART2_IRQHandler,
(uint32_t) 0,
(uint32_t) EXTI15_10_IRQHandler,
(uint32_t) RTC_Alarm_IRQHandler,
(uint32_t) OTG_FS_WKUP_IRQHandler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) DMA1_Stream7_IRQHandler,
(uint32_t) 0,
(uint32_t) SDIO_IRQHandler,
(uint32_t) TIM5_IRQHandler,
(uint32_t) SPI3_IRQHandler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) DMA2_Stream0_IRQHandler,
(uint32_t) DMA2_Stream1_IRQHandler,
(uint32_t) DMA2_Stream2_IRQHandler,
(uint32_t) DMA2_Stream3_IRQHandler,
(uint32_t) DMA2_Stream4_IRQHandler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) OTG_FS_IRQHandler,
(uint32_t) DMA2_Stream5_IRQHandler,
(uint32_t) DMA2_Stream6_IRQHandler,
(uint32_t) DMA2_Stream7_IRQHandler,
(uint32_t) USART6_IRQHandler,
(uint32_t) I2C3_EV_IRQHandler,
(uint32_t) I2C3_ER_IRQHandler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) FPU_IRQHandler,
(uint32_t) 0,
(uint32_t) 0,
(uint32_t) SPI4_IRQHandler,
(uint32_t) SPI5_IRQHandler,
};
weak and alias attributes
The exception handlers are user defined, so the Default_Handler is only used when the corresponding handler is not implemented.
section attribute
Code and data can be assigned to a memory location by labeling them with a section name.
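The effect of the weak and alias attributes can be tried on a host machine. The sketch below assumes a GCC-compatible compiler and an ELF target, and uses made-up names: Fallback_Handler plays the role of Default_Handler but returns instead of looping forever, and Demo_IRQHandler stands for any handler in the vector table.

```c
#include <stdint.h>

static uint32_t default_hits;   /* counts calls that fell back to the default */

/* Hypothetical fallback used by every handler that has no definition of its own.
 * Unlike Default_Handler above, it returns so the sketch can run on a PC. */
void Fallback_Handler(void) {
    default_hits++;
}

/* Weak alias: the name resolves to Fallback_Handler unless another file
 * provides a strong (normal) definition of Demo_IRQHandler. */
void Demo_IRQHandler(void) __attribute__((weak, alias("Fallback_Handler")));
```

Calling Demo_IRQHandler() here runs Fallback_Handler. If another source file defined void Demo_IRQHandler(void) normally, the linker would pick that definition instead and no change to this file would be needed.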
Now include the vector table in the link. You will see that the section .isr_vector is now filled:
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
-nostdlib \
-Wall \
-T linker.ld -Wl,-Map=vector.tmp.map \
main.o delay.o vector.o \
-o vector.tmp
.isr_vector 0x0000000008000000 0x198
*(.isr_vector)
.isr_vector 0x0000000008000000 0x198 vector.o
0x0000000008000000 vector_table
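The size 0x198 reported in the map can be cross-checked with a little arithmetic: the table starts with 16 words (the initial MSP plus the system exception slots), followed by one word per interrupt. The helper names below are hypothetical, and the check assumes that SPI5_IRQHandler, the last entry in the table above, sits at interrupt position 85.

```c
#include <stdint.h>

#define SYSTEM_EXCEPTION_SLOTS 16u   /* initial MSP + 15 system exception slots */

/* Byte offset of the vector for external interrupt number irq. */
uint32_t irq_vector_offset(uint32_t irq) {
    return (SYSTEM_EXCEPTION_SLOTS + irq) * 4u;
}

/* Table size in bytes when the highest implemented interrupt is last_irq. */
uint32_t vector_table_size(uint32_t last_irq) {
    return irq_vector_offset(last_irq) + 4u;
}
```

With last_irq set to 85, the computed size is 0x198 bytes, which matches the map file.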
Startup code#
The startup code is responsible for setting up the right environment for the main code to run:
- Provide the vector table
- Implement the Reset Handler
  - Copy the .data section from Flash to RAM
  - Reserve memory for the .bss section and fill it with zeros
  - Call the main function
#include <stdint.h>
/* Symbols defined in the linker script: only their addresses are meaningful */
extern uint32_t _sdata;
extern uint32_t _edata;
extern uint32_t _lddata;
extern uint32_t _sbss;
extern uint32_t _ebss;
/* main is defined in main.c and returns int */
extern int main(void);
void Reset_Handler(void) {
// copy .data section from flash to ram
uint32_t size = (uint32_t)&_edata - (uint32_t)&_sdata;
uint8_t *pRAM = (uint8_t*)&_sdata;
uint8_t *pFlash = (uint8_t*)&_lddata;
for(uint32_t i=0; i<size; i++) {
pRAM[i] = pFlash[i];
}
// initialize .bss section with zeros
size = (uint32_t)&_ebss - (uint32_t)&_sbss;
pRAM = (uint8_t*)&_sbss;
for(uint32_t i=0; i<size; i++) {
pRAM[i] = 0;
}
// call to main
main();
}
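The two loops in Reset_Handler can also be written as small reusable functions that work on whole words. This is a sketch under stated assumptions, not the author's startup code: copy_words and zero_words are hypothetical names, and the word-wise access assumes that the section boundaries are 4-byte aligned, which the linker script above does not enforce explicitly.

```c
#include <stdint.h>

/* Copy an initialized-data image word by word, like the .data step. */
void copy_words(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src) {
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Zero-fill a region word by word, like the .bss step. */
void zero_words(uint32_t *dst, const uint32_t *dst_end) {
    while (dst < dst_end) {
        *dst++ = 0;
    }
}
```

On the target, the calls would be copy_words(&_sdata, &_edata, &_lddata); followed by zero_words(&_sbss, &_ebss);. Because the functions take plain pointers, they can be tested on a host machine with ordinary arrays standing in for Flash and RAM.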
Examine the binary file#
Build all files:
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
-nostdlib \
-Wall \
-T linker.ld -Wl,-Map=main.elf.map \
main.o delay.o vector.o startup.o \
-o main.elf
The ELF file wraps the binary content with extra metadata, such as the symbol table:
arm-none-eabi-nm main.elf > main.elf.sym
20000008 B _ebss
20000004 D _edata
20020000 D _estack
080002b8 T _etext
080002bc A _lddata
20000004 B _sbss
20000000 D _sdata
0800022c W ADC_IRQHandler
0800022c W BusFault_Handler
0800022c W DebugMon_Handler
0800022c T Default_Handler
08000200 T delay
20000004 B delay_counter
080002b8 R DELAY_MAX
0800022c W DMA1_Stream0_IRQHandler
0800022c W DMA1_Stream1_IRQHandler
0800022c W DMA1_Stream2_IRQHandler
0800022c W DMA1_Stream3_IRQHandler
0800022c W DMA1_Stream4_IRQHandler
0800022c W DMA1_Stream5_IRQHandler
0800022c W DMA1_Stream6_IRQHandler
0800022c W DMA1_Stream7_IRQHandler
0800022c W DMA2_Stream0_IRQHandler
0800022c W DMA2_Stream1_IRQHandler
0800022c W DMA2_Stream2_IRQHandler
0800022c W DMA2_Stream3_IRQHandler
0800022c W DMA2_Stream4_IRQHandler
0800022c W DMA2_Stream5_IRQHandler
0800022c W DMA2_Stream6_IRQHandler
0800022c W DMA2_Stream7_IRQHandler
0800022c W EXTI0_IRQHandler
0800022c W EXTI1_IRQHandler
0800022c W EXTI15_10_IRQHandler
0800022c W EXTI2_IRQHandler
0800022c W EXTI3_IRQHandler
0800022c W EXTI4_IRQHandler
0800022c W EXTI9_5_IRQHandler
0800022c W FLASH_IRQHandler
0800022c W FPU_IRQHandler
0800022c W HardFault_Handler
0800022c W I2C1_ER_IRQHandler
0800022c W I2C1_EV_IRQHandler
0800022c W I2C2_ER_IRQHandler
0800022c W I2C2_EV_IRQHandler
0800022c W I2C3_ER_IRQHandler
0800022c W I2C3_EV_IRQHandler
20000000 D isLoop
08000198 T main
0800022c W MemManage_Handler
0800022c W NMI_Handler
0800022c W OTG_FS_IRQHandler
0800022c W OTG_FS_WKUP_IRQHandler
0800022c W PendSV_Handler
0800022c W PVD_IRQHandler
0800022c W RCC_IRQHandler
08000234 T Reset_Handler
0800022c W RTC_Alarm_IRQHandler
0800022c W RTC_WKUP_IRQHandler
0800022c W SDIO_IRQHandler
0800022c W SPI1_IRQHandler
0800022c W SPI2_IRQHandler
0800022c W SPI3_IRQHandler
0800022c W SPI4_IRQHandler
0800022c W SPI5_IRQHandler
0800022c W SVC_Handler
0800022c W SysTick_Handler
0800022c W TAMP_STAMP_IRQHandler
0800022c W TIM1_BRK_TIM9_IRQHandler
0800022c W TIM1_CC_IRQHandler
0800022c W TIM1_TRG_COM_TIM11_IRQHandler
0800022c W TIM1_UP_TIM10_IRQHandler
0800022c W TIM2_IRQHandler
0800022c W TIM3_IRQHandler
0800022c W TIM4_IRQHandler
0800022c W TIM5_IRQHandler
0800022c W UsageFault_Handler
0800022c W USART1_IRQHandler
0800022c W USART2_IRQHandler
0800022c W USART6_IRQHandler
08000000 D vector_table
0800022c W WWDG_IRQHandler
The Reset_Handler is at 0x08000234, and the Default_Handler is at 0x0800022c.
Extract binary content:
arm-none-eabi-objcopy -O binary main.elf main.bin
Examine binary file
Check .isr_vector at 0x08000000, which is offset 0 in the binary file:
xxd -g4 -e -s0 -l32 main.bin
00000000: 20020000 08000235 0800022d 0800022d ... 5...-...-...
00000010: 0800022d 0800022d 0800022d 00000000 -...-...-.......
You will notice that:
- The MSP value at address 0x00000000 is the RAM_END value 0x20020000.
- The Reset Handler address is written at 0x00000004 and is 0x08000235 (note that the least significant bit is 1 to indicate Thumb state).
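The same two values can be extracted programmatically from the first 8 bytes of the binary image. The sketch below is a host-side illustration with hypothetical names (read_le32, reset_info_t, decode_reset); it reads the words as little-endian and separates the Thumb bit from the handler address.

```c
#include <stdint.h>

/* Read a 32-bit little-endian word from a raw image. */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

typedef struct {
    uint32_t initial_sp;   /* word 0: value loaded into MSP */
    uint32_t entry;        /* word 1 with the Thumb bit cleared */
    int      thumb;        /* 1 if bit 0 of word 1 is set */
} reset_info_t;

/* Decode the first two vector table entries of a Cortex-M image. */
reset_info_t decode_reset(const uint8_t *image) {
    reset_info_t info;
    uint32_t vector = read_le32(image + 4);
    info.initial_sp = read_le32(image);
    info.entry      = vector & ~UINT32_C(1);
    info.thumb      = (int)(vector & 1u);
    return info;
}
```

Fed with the bytes shown by xxd above, the function yields an initial stack pointer of 0x20020000 and an entry address of 0x08000234, the address of Reset_Handler in the symbol table.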
Let's check the value of DELAY_MAX at the address 0x080002b8, which the symbol table above reports for it:
xxd -g4 -e -s0x2b8 -l4 main.bin
000002b8: 0000beef
You will notice that the constant value 0x0000BEEF is stored at that address.
Review assembly code
You can read the assembly code from the ELF file using:
arm-none-eabi-objdump -S main.elf
main.elf: file format elf32-littlearm
Disassembly of section .text:
08000198 <main>:
8000198: b580 push {r7, lr}
800019a: af00 add r7, sp, #0
800019c: 4b14 ldr r3, [pc, #80] ; (80001f0 <main+0x58>)
800019e: 681b ldr r3, [r3, #0]
80001a0: 4a13 ldr r2, [pc, #76] ; (80001f0 <main+0x58>)
80001a2: f043 0301 orr.w r3, r3, #1
80001a6: 6013 str r3, [r2, #0]
80001a8: 4b12 ldr r3, [pc, #72] ; (80001f4 <main+0x5c>)
80001aa: 681b ldr r3, [r3, #0]
80001ac: 4a11 ldr r2, [pc, #68] ; (80001f4 <main+0x5c>)
80001ae: f423 6300 bic.w r3, r3, #2048 ; 0x800
80001b2: 6013 str r3, [r2, #0]
80001b4: 4b0f ldr r3, [pc, #60] ; (80001f4 <main+0x5c>)
80001b6: 681b ldr r3, [r3, #0]
80001b8: 4a0e ldr r2, [pc, #56] ; (80001f4 <main+0x5c>)
80001ba: f443 6380 orr.w r3, r3, #1024 ; 0x400
80001be: 6013 str r3, [r2, #0]
80001c0: e00f b.n 80001e2 <main+0x4a>
80001c2: 4b0d ldr r3, [pc, #52] ; (80001f8 <main+0x60>)
80001c4: 681b ldr r3, [r3, #0]
80001c6: 4a0c ldr r2, [pc, #48] ; (80001f8 <main+0x60>)
80001c8: f043 0320 orr.w r3, r3, #32
80001cc: 6013 str r3, [r2, #0]
80001ce: f000 f817 bl 8000200 <delay>
80001d2: 4b09 ldr r3, [pc, #36] ; (80001f8 <main+0x60>)
80001d4: 681b ldr r3, [r3, #0]
80001d6: 4a08 ldr r2, [pc, #32] ; (80001f8 <main+0x60>)
80001d8: f443 1300 orr.w r3, r3, #2097152 ; 0x200000
80001dc: 6013 str r3, [r2, #0]
80001de: f000 f80f bl 8000200 <delay>
80001e2: 4b06 ldr r3, [pc, #24] ; (80001fc <main+0x64>)
80001e4: 681b ldr r3, [r3, #0]
80001e6: 2b00 cmp r3, #0
80001e8: d1eb bne.n 80001c2 <main+0x2a>
80001ea: 2300 movs r3, #0
80001ec: 4618 mov r0, r3
80001ee: bd80 pop {r7, pc}
80001f0: 40023830 .word 0x40023830
80001f4: 40020000 .word 0x40020000
80001f8: 40020018 .word 0x40020018
80001fc: 20000000 .word 0x20000000
08000200 <delay>:
8000200: b480 push {r7}
8000202: af00 add r7, sp, #0
8000204: f64b 62ef movw r2, #48879 ; 0xbeef
8000208: 4b07 ldr r3, [pc, #28] ; (8000228 <delay+0x28>)
800020a: 601a str r2, [r3, #0]
800020c: bf00 nop
800020e: 4b06 ldr r3, [pc, #24] ; (8000228 <delay+0x28>)
8000210: 681b ldr r3, [r3, #0]
8000212: 1e5a subs r2, r3, #1
8000214: 4904 ldr r1, [pc, #16] ; (8000228 <delay+0x28>)
8000216: 600a str r2, [r1, #0]
8000218: 2b00 cmp r3, #0
800021a: d1f8 bne.n 800020e <delay+0xe>
800021c: bf00 nop
800021e: bf00 nop
8000220: 46bd mov sp, r7
8000222: bc80 pop {r7}
8000224: 4770 bx lr
8000226: bf00 nop
8000228: 20000004 .word 0x20000004
0800022c <Default_Handler>:
800022c: b480 push {r7}
800022e: af00 add r7, sp, #0
8000230: e7fe b.n 8000230 <Default_Handler+0x4>
...
08000234 <Reset_Handler>:
8000234: b580 push {r7, lr}
8000236: b086 sub sp, #24
8000238: af00 add r7, sp, #0
800023a: 4a1a ldr r2, [pc, #104] ; (80002a4 <Reset_Handler+0x70>)
800023c: 4b1a ldr r3, [pc, #104] ; (80002a8 <Reset_Handler+0x74>)
800023e: 1ad3 subs r3, r2, r3
8000240: 60fb str r3, [r7, #12]
8000242: 4b19 ldr r3, [pc, #100] ; (80002a8 <Reset_Handler+0x74>)
8000244: 60bb str r3, [r7, #8]
8000246: 4b19 ldr r3, [pc, #100] ; (80002ac <Reset_Handler+0x78>)
8000248: 607b str r3, [r7, #4]
800024a: 2300 movs r3, #0
800024c: 617b str r3, [r7, #20]
800024e: e00a b.n 8000266 <Reset_Handler+0x32>
8000250: 697b ldr r3, [r7, #20]
8000252: 687a ldr r2, [r7, #4]
8000254: 441a add r2, r3
8000256: 697b ldr r3, [r7, #20]
8000258: 68b9 ldr r1, [r7, #8]
800025a: 440b add r3, r1
800025c: 7812 ldrb r2, [r2, #0]
800025e: 701a strb r2, [r3, #0]
8000260: 697b ldr r3, [r7, #20]
8000262: 3301 adds r3, #1
8000264: 617b str r3, [r7, #20]
8000266: 697b ldr r3, [r7, #20]
8000268: 68fa ldr r2, [r7, #12]
800026a: 429a cmp r2, r3
800026c: d8f0 bhi.n 8000250 <Reset_Handler+0x1c>
800026e: 4a10 ldr r2, [pc, #64] ; (80002b0 <Reset_Handler+0x7c>)
8000270: 4b10 ldr r3, [pc, #64] ; (80002b4 <Reset_Handler+0x80>)
8000272: 1ad3 subs r3, r2, r3
8000274: 60fb str r3, [r7, #12]
8000276: 4b0f ldr r3, [pc, #60] ; (80002b4 <Reset_Handler+0x80>)
8000278: 60bb str r3, [r7, #8]
800027a: 2300 movs r3, #0
800027c: 613b str r3, [r7, #16]
800027e: e007 b.n 8000290 <Reset_Handler+0x5c>
8000280: 693b ldr r3, [r7, #16]
8000282: 68ba ldr r2, [r7, #8]
8000284: 4413 add r3, r2
8000286: 2200 movs r2, #0
8000288: 701a strb r2, [r3, #0]
800028a: 693b ldr r3, [r7, #16]
800028c: 3301 adds r3, #1
800028e: 613b str r3, [r7, #16]
8000290: 693b ldr r3, [r7, #16]
8000292: 68fa ldr r2, [r7, #12]
8000294: 429a cmp r2, r3
8000296: d8f3 bhi.n 8000280 <Reset_Handler+0x4c>
8000298: f7ff ff7e bl 8000198 <main>
800029c: bf00 nop
800029e: 3718 adds r7, #24
80002a0: 46bd mov sp, r7
80002a2: bd80 pop {r7, pc}
80002a4: 20000004 .word 0x20000004
80002a8: 20000000 .word 0x20000000
80002ac: 080002bc .word 0x080002bc
80002b0: 20000008 .word 0x20000008
80002b4: 20000004 .word 0x20000004
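In the listing, each ldr rX, [pc, #imm] instruction loads a constant from the literal pool placed after the function (the .word lines), and objdump prints the resolved address in the comment. As a reading aid, the sketch below reproduces that address calculation on the host. The function name thumb_ldr_literal_address is made up for this illustration.

```c
#include <stdint.h>

/* Address read by a Thumb `ldr rX, [pc, #imm]` located at insn_addr.
 * In Thumb state the PC reads as the instruction address plus 4, and the
 * base of the load is that value rounded down to a multiple of 4. */
uint32_t thumb_ldr_literal_address(uint32_t insn_addr, uint32_t imm) {
    uint32_t base = (insn_addr + 4u) & ~(uint32_t)3u;
    return base + imm;
}
```

For the instruction at 0x0800019c with an offset of 80, the result is 0x080001f0, the literal that holds 0x40023830. The rounding matters for instructions that are not 4-byte aligned, such as the one at 0x0800023a, which resolves to 0x080002a4.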
Download and Debug#
Run OpenOCD
Each target has its own configuration, such as _CPUTAPID, _ENDIAN, or debug registers. OpenOCD reads these settings from a target configuration file, which you need in order to work with your target.
For example, the target configuration for an STM32F411 MCU:
# script for stm32f4x family
#
# stm32 devices support both JTAG and SWD transports.
#
source [find target/swj-dp.tcl]
source [find mem_helper.tcl]
if { [info exists CHIPNAME] } {
set _CHIPNAME $CHIPNAME
} else {
set _CHIPNAME stm32f4x
}
set _ENDIAN little
# Work-area is a space in RAM used for flash programming
# By default use 32kB (Available RAM in smallest device STM32F410)
if { [info exists WORKAREASIZE] } {
set _WORKAREASIZE $WORKAREASIZE
} else {
set _WORKAREASIZE 0x8000
}
#jtag scan chain
if { [info exists CPUTAPID] } {
set _CPUTAPID $CPUTAPID
} else {
if { [using_jtag] } {
# See STM Document RM0090
# Section 38.6.3 - corresponds to Cortex-M4 r0p1
set _CPUTAPID 0x4ba00477
} {
set _CPUTAPID 0x2ba01477
}
}
swj_newdap $_CHIPNAME cpu -irlen 4 -ircapture 0x1 -irmask 0xf -expected-id $_CPUTAPID
dap create $_CHIPNAME.dap -chain-position $_CHIPNAME.cpu
tpiu create $_CHIPNAME.tpiu -dap $_CHIPNAME.dap -ap-num 0 -baseaddr 0xE0040000
if {[using_jtag]} {
jtag newtap $_CHIPNAME bs -irlen 5
}
set _TARGETNAME $_CHIPNAME.cpu
target create $_TARGETNAME cortex_m -endian $_ENDIAN -dap $_CHIPNAME.dap
$_TARGETNAME configure -work-area-phys 0x20000000 -work-area-size $_WORKAREASIZE -work-area-backup 0
set _FLASHNAME $_CHIPNAME.flash
flash bank $_FLASHNAME stm32f2x 0 0 0 0 $_TARGETNAME
flash bank $_CHIPNAME.otp stm32f2x 0x1fff7800 0 0 0 $_TARGETNAME
if { [info exists QUADSPI] && $QUADSPI } {
set a [llength [flash list]]
set _QSPINAME $_CHIPNAME.qspi
flash bank $_QSPINAME stmqspi 0x90000000 0 0 0 $_TARGETNAME 0xA0001000
}
# JTAG speed should be <= F_CPU/6. F_CPU after reset is 16MHz, so use F_JTAG = 2MHz
#
# Since we may be running of an RC oscilator, we crank down the speed a
# bit more to be on the safe side. Perhaps superstition, but if are
# running off a crystal, we can run closer to the limit. Note
# that there can be a pretty wide band where things are more or less stable.
adapter speed 2000
adapter srst delay 100
if {[using_jtag]} {
jtag_ntrst_delay 100
}
reset_config srst_nogate
if {![using_hla]} {
# if srst is not fitted use SYSRESETREQ to
# perform a soft reset
cortex_m reset_config sysresetreq
}
$_TARGETNAME configure -event examine-end {
# Enable debug during low power modes (uses more power)
# DBGMCU_CR |= DBG_STANDBY | DBG_STOP | DBG_SLEEP
mmw 0xE0042004 0x00000007 0
# Stop watchdog counters during halt
# DBGMCU_APB1_FZ |= DBG_IWDG_STOP | DBG_WWDG_STOP
mmw 0xE0042008 0x00001800 0
}
proc proc_post_enable {_chipname} {
targets $_chipname.cpu
if { [$_chipname.tpiu cget -protocol] eq "sync" } {
switch [$_chipname.tpiu cget -port-width] {
1 {
mmw 0xE0042004 0x00000060 0x000000c0
mmw 0x40021020 0x00000000 0x0000ff00
mmw 0x40021000 0x000000a0 0x000000f0
mmw 0x40021008 0x000000f0 0x00000000
}
2 {
mmw 0xE0042004 0x000000a0 0x000000c0
mmw 0x40021020 0x00000000 0x000fff00
mmw 0x40021000 0x000002a0 0x000003f0
mmw 0x40021008 0x000003f0 0x00000000
}
4 {
mmw 0xE0042004 0x000000e0 0x000000c0
mmw 0x40021020 0x00000000 0x0fffff00
mmw 0x40021000 0x00002aa0 0x00003ff0
mmw 0x40021008 0x00003ff0 0x00000000
}
}
} else {
mmw 0xE0042004 0x00000020 0x000000c0
}
}
$_CHIPNAME.tpiu configure -event post-enable "proc_post_enable $_CHIPNAME"
$_TARGETNAME configure -event reset-init {
# Configure PLL to boost clock to HSI x 4 (64 MHz)
mww 0x40023804 0x08012008 ;# RCC_PLLCFGR 16 Mhz /8 (M) * 128 (N) /4(P)
mww 0x40023C00 0x00000102 ;# FLASH_ACR = PRFTBE | 2(Latency)
mmw 0x40023800 0x01000000 0 ;# RCC_CR |= PLLON
sleep 10 ;# Wait for PLL to lock
mmw 0x40023808 0x00001000 0 ;# RCC_CFGR |= RCC_CFGR_PPRE1_DIV2
mmw 0x40023808 0x00000002 0 ;# RCC_CFGR |= RCC_CFGR_SW_PLL
# Boost JTAG frequency
adapter speed 8000
}
$_TARGETNAME configure -event reset-start {
# Reduce speed since CPU speed will slow down to 16MHz with the reset
adapter speed 2000
}
For a board with an ST-Link debugger, the board configuration file (board.cfg) is:
source [find interface/stlink.cfg]
transport select hla_swd
# increase working area to 64KB
set WORKAREASIZE 0x10000
source [find target/stm32f4x.cfg]
reset_config srst_only
You can use any STM32-compatible debugger, such as ST-Link V2/V3 or J-Link, to connect to the Serial Wire Debug (SWD) interface on the target MCU. Then start OpenOCD with the board configuration:
openocd -f board.cfg
xPack OpenOCD x86_64 Open On-Chip Debugger 0.11.0+dev (2022-03-25-17:32)
Licensed under GNU GPL v2
For bug reports, read
http://openocd.org/doc/doxygen/bugs.html
Info : The selected transport took over low-level target control. The results might differ compared to plain JTAG/SWD
srst_only separate srst_nogate srst_open_drain connect_deassert_srst
Info : Listening on port 6666 for tcl connections
Info : Listening on port 4444 for telnet connections
Info : clock speed 2000 kHz
Info : STLINK V2J39M27 (API v2) VID:PID 0483:374B
Info : Target voltage: 3.276040
Info : [stm32f4x.cpu] Cortex-M4 r0p1 processor detected
Info : [stm32f4x.cpu] target has 6 breakpoints, 4 watchpoints
Info : starting gdb server for stm32f4x.cpu on 3333
Info : Listening on port 3333 for gdb connections
The OpenOCD commands, with examples, are documented online.
Telnet client
Run the Telnet client:
telnet 127.0.0.1 4444
Telnet gives access to the OpenOCD server, so you can run OpenOCD commands directly:
flash write_image erase main.elf
reset halt
resume
GDB Client
Prepare a debug build with the -g option and name it main-debug.elf:
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-nostdlib \
-std=gnu11 \
-Wall \
-g \
-T linker.ld -Wl,-Map=main-debug.elf.map \
main.c delay.c vector.c startup.c \
-o main-debug.elf
Run the GDB client with the debug version:
arm-none-eabi-gdb main-debug.elf
Then connect to OpenOCD server:
target extended-remote localhost:3333
Every OpenOCD command must be prefixed with monitor:
monitor flash write_image erase main.elf
monitor reset halt
monitor resume
GDB also has its own command set, which you can use as well:
br main
step
Use standard library#
Let's look at a new example that uses the standard library.
The source code is based on the example above, with some modifications:
- Add the stdio.h library to use the printf() function
- Use semihosting for output if the macro USE_SEMIHOSTING is defined
#include <stdint.h>
#include <stdio.h>
#include "delay.h"
/* Clock */
#define RCC_AHB1ENR *((volatile uint32_t*) (0x40023830))
/* GPIO A */
#define GPIOA_MODER *((volatile uint32_t*) (0x40020000))
#define GPIOA_BSRR *((volatile uint32_t*) (0x40020018))
/* Global initialized variable */
uint32_t isLoop = 1;
#ifdef USE_SEMIHOSTING
/* Semihosting */
extern void initialise_monitor_handles(void);
#endif
int main() {
char counter = 0;
#ifdef USE_SEMIHOSTING
initialise_monitor_handles();
#endif
/* turn on clock on GPIOA */
RCC_AHB1ENR |= (1 << 0);
/* set PA5 to output mode */
GPIOA_MODER &= ~(1 << 11);
GPIOA_MODER |= (1 << 10);
while(isLoop) {
/* set HIGH on PA5 */
GPIOA_BSRR |= (1 << 5);
delay();
/* set LOW on PA5 */
GPIOA_BSRR |= (1 << (5+16));
delay();
/* output */
printf("counter = %d\n", counter);
counter++;
}
return 0;
}
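The bit operations in main() follow the register layout that the comments describe: two mode bits per pin in GPIOA_MODER, and one set bit plus one reset bit per pin in GPIOA_BSRR. The sketch below expresses the same masks as pure functions so they can be checked on a host machine. The helper names moder_set_output, bsrr_set, and bsrr_reset are hypothetical and are not used by the project.

```c
#include <stdint.h>

/* Return a MODER value with `pin` configured as general-purpose output (01). */
uint32_t moder_set_output(uint32_t moder, unsigned pin) {
    moder &= ~((uint32_t)3u << (pin * 2u));  /* clear both mode bits of the pin */
    moder |=  ((uint32_t)1u << (pin * 2u));  /* 01 = output mode */
    return moder;
}

/* BSRR word that drives `pin` high (bits 0..15 set a pin). */
uint32_t bsrr_set(unsigned pin) {
    return (uint32_t)1u << pin;
}

/* BSRR word that drives `pin` low (bits 16..31 reset a pin). */
uint32_t bsrr_reset(unsigned pin) {
    return (uint32_t)1u << (pin + 16u);
}
```

For PA5 these helpers produce 0x400 for the mode bit, 0x20 for set, and 0x200000 for reset, the same constants used in main(). On the target, a plain assignment such as GPIOA_BSRR = bsrr_set(5); should be enough, since BSRR only acts on bits that are written as 1.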
Let's try to compile:
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
-Wall \
-T linker.ld -Wl,-Map=main.elf.map \
main.c delay.c vector.c startup.c \
-o main.elf
\crt0.o: in function `_mainCRTStartup':
(.text+0x64): undefined reference to `__bss_start__'
(.text+0x68): undefined reference to `__bss_end__'
\libc.a(lib_a-exit.o): in function `exit':
(.text.exit+0x16): undefined reference to `_exit'
\libc.a(lib_a-sbrkr.o): in function `_sbrk_r':
(.text._sbrk_r+0xc): undefined reference to `_sbrk'
\libc.a(lib_a-writer.o): in function `_write_r':
(.text._write_r+0x14): undefined reference to `_write'
\libc.a(lib_a-closer.o): in function `_close_r':
(.text._close_r+0xc): undefined reference to `_close'
\libc.a(lib_a-fstatr.o): in function `_fstat_r':
(.text._fstat_r+0x12): undefined reference to `_fstat'
\libc.a(lib_a-isattyr.o): in function `_isatty_r':
(.text._isatty_r+0xc): undefined reference to `_isatty'
\libc.a(lib_a-lseekr.o): in function `_lseek_r':
(.text._lseek_r+0x14): undefined reference to `_lseek'
\libc.a(lib_a-readr.o): in function `_read_r':
(.text._read_r+0x14): undefined reference to `_read'
\libc.a(lib_a-abort.o): in function `abort':
(.text.abort+0xa): undefined reference to `_exit'
\libc.a(lib_a-signalr.o): in function `_kill_r':
(.text._kill_r+0x12): undefined reference to `_kill'
\libc.a(lib_a-signalr.o): in function `_getpid_r':
(.text._getpid_r+0x0): undefined reference to `_getpid'
\libgcc.a(unwind-arm.o): in function `get_eit_entry':
(.text.get_eit_entry+0x90): undefined reference to `__exidx_start'
(.text.get_eit_entry+0x94): undefined reference to `__exidx_end'
The errors show that the compiler links against libc by default.
Standard C libraries
The GNU Arm toolchain uses newlib as its implementation of the standard C library. To reduce code size, there is a lightweight version, newlib-nano, intended for MCUs.
However, newlib-nano does not implement the low-level system calls that standard library functions such as printf() or scanf() rely on, because those calls depend on the hardware. To make the application link, add the nosys library. It provides stub implementations of the low-level system calls, most of which only return a placeholder or error value.
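To illustrate what such a stub looks like, here is a minimal sketch of two system calls written in the style of nosys. It is an illustration under that assumption, not the actual libnosys source; the function names match the undefined references reported by the linker above.

```c
#include <errno.h>

/* Stub for the write system call: no output device exists yet,
 * so report that the call is not implemented. */
int _write(int file, char *ptr, int len) {
    (void)file;
    (void)ptr;
    (void)len;
    errno = ENOSYS;
    return -1;
}

/* Stub for the read system call: no input device exists either. */
int _read(int file, char *ptr, int len) {
    (void)file;
    (void)ptr;
    (void)len;
    errno = ENOSYS;
    return -1;
}
```

With stubs like these, printf() links but produces no output. To get real output, _write would have to forward the buffer to an actual device, which is hardware-specific.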
newlib-nano is enabled with the linker option --specs=nano.specs, and nosys is enabled with the linker option --specs=nosys.specs. Both options are included by default in the GCC linker options of a generated project.
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
--specs=nano.specs --specs=nosys.specs \
-Wall \
-T linker.ld -Wl,-Map=main.elf.map \
main.c delay.c vector.c startup.c \
-o main.elf
\crt0.o: in function `_mainCRTStartup':
(.text+0x64): undefined reference to `__bss_start__'
(.text+0x68): undefined reference to `__bss_end__'
\libnosys.a(sbrk.o): in function `_sbrk':
(.text._sbrk+0x18): undefined reference to `end'
A few errors remain, and they must be fixed in the linker script.
Download the source code of newlib from the newlib FTP directory.
Search for __bss_start__ and __bss_end__, and you will find this note:
* `mcore/crt0.S`: Renamed file from `crt0.s`.
Only invoke `init()` and `fini()` routines for ELF builds.
Use `__bss_start__` and `__bss_end__` to locate `.bss` section.
Search for the function _sbrk and you will find a note in the source code saying that the end symbol is defined by the linker. The heap starts at that symbol:
void * _sbrk (ptrdiff_t incr) {
extern char end asm ("end"); /* Defined by the linker. */
static char *heap_end;
char *prev_heap_end;
if (heap_end == NULL)
heap_end = &end;
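The excerpt above stops before the heap pointer is moved. The sketch below models the rest of the idea on a host machine: it keeps a program break inside a fixed region and refuses requests that leave it. It is not the newlib implementation. The array heap_model stands in for the RAM that begins at end, and the names sbrk_model and HEAP_MODEL_SIZE are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define HEAP_MODEL_SIZE 64

/* Stand-in for the RAM between the linker symbol `end` and the heap limit. */
static char heap_model[HEAP_MODEL_SIZE];

/* Model of _sbrk: move the program break by incr bytes and return the
 * previous break, or (void *)-1 with errno = ENOMEM on failure. */
void *sbrk_model(ptrdiff_t incr) {
    static char *heap_end = NULL;          /* current program break */
    char *const heap_start = heap_model;   /* on the target: &end */
    char *const heap_limit = heap_model + HEAP_MODEL_SIZE;
    char *prev_heap_end;

    if (heap_end == NULL) {
        heap_end = heap_start;
    }

    /* Refuse requests that would move the break outside the region. */
    if (incr > heap_limit - heap_end || incr < heap_start - heap_end) {
        errno = ENOMEM;
        return (void *)-1;
    }

    prev_heap_end = heap_end;
    heap_end += incr;
    return prev_heap_end;
}
```

malloc() calls _sbrk in this way to obtain memory, which is why the linker script has to provide the end symbol.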
Update linker sections#
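The linker script needs the following updates:
- add alignment to each section
- include subsections, e.g. *(.text*)
- add a new section that reserves memory for the heap and stack, so the linker can check that they fit in RAM
- define the symbols that the libraries expect: __bss_start__, __bss_end__, end, __exidx_start, and __exidx_end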
ENTRY(Reset_Handler)
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 128K
FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 512K
}
_estack = ORIGIN(RAM) + LENGTH(RAM);
_Min_Heap_Size = 0x200; /* required amount of heap */
_Min_Stack_Size = 0x400; /* required amount of stack */
SECTIONS
{
.isr_vector :
{
. = ALIGN(4);
KEEP(*(.isr_vector)) /* Startup code */
. = ALIGN(4);
} >FLASH
.text :
{
. = ALIGN(4);
*(.text)
*(.text*)
*(.glue_7) /* glue arm to thumb code */
*(.glue_7t) /* glue thumb to arm code */
*(.eh_frame)
KEEP (*(.init))
KEEP (*(.fini))
. = ALIGN(4);
_etext = .;
} >FLASH
.rodata :
{
. = ALIGN(4);
*(.rodata)
*(.rodata*)
. = ALIGN(4);
} >FLASH
.ARM : {
. = ALIGN(4);
__exidx_start = .;
*(.ARM.exidx*)
__exidx_end = .;
. = ALIGN(4);
} >FLASH
_lddata = LOADADDR(.data);
.data :
{
. = ALIGN(4);
_sdata = .;
*(.data)
*(.data*)
. = ALIGN(4);
_edata = .;
} >RAM AT> FLASH
.bss :
{
. = ALIGN(4);
_sbss = .;
__bss_start__ = _sbss;
*(.bss)
*(.bss.*)
*(COMMON)
. = ALIGN(4);
_ebss = .;
__bss_end__ = _ebss;
} >RAM
._user_heap_stack :
{
. = ALIGN(8);
end = .;
_end = .;
__end__ = .;
. = . + _Min_Heap_Size;
. = . + _Min_Stack_Size;
. = ALIGN(8);
} >RAM
/* Remove information from the compiler libraries */
/DISCARD/ :
{
libc.a ( * )
libm.a ( * )
libgcc.a ( * )
}
.ARM.attributes 0 : { *(.ARM.attributes) }
}
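The ._user_heap_stack section does not hold any data. It only advances the location counter by the minimum heap and stack sizes, so the link fails if RAM cannot hold them. The sketch below repeats that arithmetic on the host with the values from the script above. The function names estack, align8, and reservation_fits are hypothetical.

```c
#include <stdint.h>

#define RAM_ORIGIN     0x20000000u
#define RAM_LENGTH     (128u * 1024u)
#define MIN_HEAP_SIZE  0x200u
#define MIN_STACK_SIZE 0x400u

/* _estack = ORIGIN(RAM) + LENGTH(RAM) */
uint32_t estack(void) {
    return RAM_ORIGIN + RAM_LENGTH;
}

/* Round addr up to the next multiple of 8, like `. = ALIGN(8)`. */
static uint32_t align8(uint32_t addr) {
    return (addr + 7u) & ~(uint32_t)7u;
}

/* Return 1 if the heap and stack reservation placed after bss_end
 * still ends inside RAM, otherwise 0. */
int reservation_fits(uint32_t bss_end) {
    uint32_t top = align8(bss_end) + MIN_HEAP_SIZE + MIN_STACK_SIZE;
    return align8(top) <= estack();
}
```

With a 128K RAM, estack() is 0x20020000, the same value that appears as the initial MSP in the first word of the vector table.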
Update startup code#
#include <stdint.h>
extern uint32_t _sdata;
extern uint32_t _edata;
extern uint32_t _lddata;
extern uint32_t _sbss;
extern uint32_t _ebss;
extern int main(void);
extern void __libc_init_array(void);
void Reset_Handler(void) {
// copy .data section from flash to ram
uint32_t size = (uint32_t)&_edata - (uint32_t)&_sdata;
uint8_t *pRAM = (uint8_t*)&_sdata;
uint8_t *pFlash = (uint8_t*)&_lddata;
for(uint32_t i=0; i<size; i++) {
pRAM[i] = pFlash[i];
}
// initialize .bss section
size = (uint32_t)&_ebss - (uint32_t)&_sbss;
pRAM = (uint8_t*)&_sbss;
for(uint32_t i=0; i<size; i++) {
pRAM[i] = 0;
}
// init libc
__libc_init_array();
// call to main
main();
}
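Because the linker script aligns the start and end of .data and .bss to 4 bytes, the two loops could also work on whole words instead of bytes. The sketch below shows that variant as host-testable helpers. It is an optional alternative, not what the startup code above does, and the names copy_words and zero_words are hypothetical.

```c
#include <stdint.h>

/* Copy words from src into the range [dst, dst_end). */
void copy_words(uint32_t *dst, const uint32_t *dst_end, const uint32_t *src) {
    while (dst < dst_end) {
        *dst++ = *src++;
    }
}

/* Fill the range [dst, dst_end) with zero words. */
void zero_words(uint32_t *dst, const uint32_t *dst_end) {
    while (dst < dst_end) {
        *dst++ = 0u;
    }
}
```

In Reset_Handler, these would be called as copy_words(&_sdata, &_edata, &_lddata) and zero_words(&_sbss, &_ebss), using the same linker symbols as the byte-wise loops.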
Run OpenOCD with Semihosting#
Compile with semihosting:
- Use --specs=rdimon.specs
- Use -DUSE_SEMIHOSTING
- Use -lrdimon
arm-none-eabi-gcc \
-mcpu=cortex-m4 -mthumb -mfloat-abi=soft \
-std=gnu11 \
--specs=rdimon.specs \
-Wall \
-DUSE_SEMIHOSTING \
-T linker.ld -Wl,-Map=main.elf.map -lrdimon \
main.c delay.c vector.c startup.c \
-o main-semi.elf
Run OpenOCD, connect with Telnet, and run the commands below:
arm semihosting enable
halt
flash write_image erase main-semi.elf
reset halt
resume