A simple implementation of a Task Scheduler
Multitasking is the ability to execute more than one task in a fashion that is seemingly simultaneous. A scheduler is a part of operating system that is responsible for selecting, switching from a task to another one. This guide presents a simple way to implement a Round-Robin task scheduler to handle multiple tasks without priority.
Last update: 2022-06-29
Table of Content
STM32-Tutorials F411RE_Task_Scheduler.zip F411RE_Task_Scheduler_PendSV.zip
Task#
A Task is a piece of code, or a function, that does a specific job when it is allowed to run.
A Task has its own Stack to create its local variables. Its stack can be used to store addition information which is needed to save or restore the task from running state.
Usually, a task is an infinite loop which can repeatedly do multiple steps.
void task_main(void *param) {
// init task
...
// main loop
while(1) {
// do things over and over
}
}
Scheduler#
Round-Robin scheduling method is the time slices assigned to each task are equal, and all tasks are done in a circular order.
The time slice is defined by a special timer - SysTick which is a free-run timer to produce a periodical interrupt.
In the SysTick Handler, we will to the context switching to move to a next task.
There are some others steps to initialize a scheduler:
Task Table#
Task Table is the place to store information of all tasks. Because each task should have its own stack to save its local variable, the task table must include the PSP values which point to the current stack pointer of each task.
The scheduler also needs to know which task is currently running to do save/ restore stack to the corresponding task.
Here is a minimal Task Table implementation, without using struct:
#define TASK_NUMBER_MAX (16)
uint32_t __uCurrentTaskIdx = 0;
uint32_t __puTasksPSP[TASK_NUMBER_MAX] = {0};
The helper functions to access to the Task Table:
// return PSP value stored in slot at __uCurrentTaskIdx index
uint32_t get_current_psp() {
return __puTasksPSP[__uCurrentTaskIdx];
}
// save PSP value to the slot at __uCurrentTaskIdx index
void save_current_psp(uint32_t psp) {
__puTasksPSP[__uCurrentTaskIdx] = psp;
}
Add a Task#
We need to set how much RAM is reserved for a Task Stack. Let say 1 KB. The RAM space for Tasks should not overlap the Main Stack because Handler mode always use the Main Stack.
#define RAM_START (0x20000000u)
#define RAM_SIZE (128 * 1024) // 128 KB
#define MAIN_STACK (RAM_START + RAM_SIZE)
#define TASK_STACK_SIZE (1024u)
For the i
-th task, the beginning Stack address should be:
uint32_t* psp = (uint32_t*)(MAIN_STACK - (i+1)*TASK_STACK_SIZE);
Set Task Stack#
When adding a task, we initialize the PSP value of the new task at the start address of new 1 KB RAM. We reserve 1 KB for Main Stack, then every 1 KB for a new task. The Task Table will be filled a new slot when a new task is added.
void init_task(void (*handler)) {
int i=0;
// find an empty slot
for(; i<TASK_NUMBER_MAX; i++) {
if (__puTasksPSP[i] == 0) break;
}
if(i >= TASK_NUMBER_MAX) {
printf("Can not register a new task anymore!\n");
return;
} else {
printf("Register a task %p at slot %i\n", handler, i);
}
// calculate new PSP
uint32_t* psp = (uint32_t*)(MAIN_STACK - (i+1)*TASK_STACK_SIZE);
// fill stack frame
...
Fill Stack Frame#
When a task is switched in (load and run), the CPU will use PSP pointer to find below information:
- Last CPU status → Load
xPSR
register - Next instruction to be executed → Load
PC
register - Return mode → Load
LR
register - Last execution status → Load all
R0
toR12
registers
As we use the SysTick exception to trigger Context Switching, the below registers are automatically saved at exception:
xPSR
,PC
,LR
,R12
,R3
toR0
, in fixed order
Therefore, we will prepare additional registers in the order R11
to R4
.
What are the value of those registers???
xPSR
=0x01000000
: no special status should be set at initial stage of a task, except the Thumb State bit must be setPC
= task’ handler function: should be the address of the task’s functionLR
=0xFFFFFFFD
: when exception occurs, LR contains theEXC_RETURN
value to control exception exit. This should make exception returns Thread mode, non-floating-point state from PSPR12
toR0
= any dummy data: can use a pattern for debugging
void init_task(void (*handler)) {
...
// fill dummy stack frame
*(--psp) = 0x01000000u; // Dummy xPSR, just enable Thumb State bit;
*(--psp) = (uint32_t) handler; // PC
*(--psp) = 0xFFFFFFFDu; // LR with EXC_RETURN to return to Thread using PSP
*(--psp) = 0x12121212u; // Dummy R12
*(--psp) = 0x03030303u; // Dummy R3
*(--psp) = 0x02020202u; // Dummy R2
*(--psp) = 0x01010101u; // Dummy R1
*(--psp) = 0x00000000u; // Dummy R0
*(--psp) = 0x11111111u; // Dummy R11
*(--psp) = 0x10101010u; // Dummy R10
*(--psp) = 0x09090909u; // Dummy R9
*(--psp) = 0x08080808u; // Dummy R8
*(--psp) = 0x07070707u; // Dummy R7
*(--psp) = 0x06060606u; // Dummy R6
*(--psp) = 0x05050505u; // Dummy R5
*(--psp) = 0x04040404u; // Dummy R4
// save PSP
__puTasksPSP[i] = (uint32_t)psp;
Start scheduling#
The scheduler has to do some extra steps before it can run the first task.
Enable PSP#
Task must use its own stack, therefore, we have to enable Process Stack mode. The initial PSP value should be the PSP of the first task. To access PSP
and CONTROL
register, we have to use inline assembly code.
void start_scheduler() {
printf("Start Scheduler!\n");
// start with the first task
__uCurrentTaskIdx = 0;
// prepare PSP of the first task
__asm volatile("BL get_current_psp"); // return PSP in R0
__asm volatile("MSR PSP, R0"); // set PSP
// change to use PSP
__asm volatile("MRS R0, CONTROL");
__asm volatile("ORR R0, R0, #2"); // set bit[1] SPSEL
__asm volatile("MSR CONTROL, R0");
...
}
Start SysTick#
We need to configure the SysTick to trigger every 1 ms. The application is still have Privileged Access level, therefore, we can start a timer at this point.
/* SysTick register address */
#define SCSR *((volatile uint32_t*) 0xE000E010u)
#define SRVR *((volatile uint32_t*) 0xE000E014u)
void start_scheduler() {
...
// clear and set the period
SRVR &= ~0xFFFFFFFF;
SRVR |= 16000-1; // 1000 Hz ~ 1 ms
// enable SysTick
SCSR |= (1 << 1); // enable SysTick Exception request
SCSR |= (1 << 2); // select system clock
...
}
Set Access Level (optional)#
This is an optional step to move the user task to Unprivileged Access level. If user tasks are set to Unprivileged, they are limited to access system memory, then you have to implement System Call through SVC exception.
void start_scheduler() {
...
// Move to Unprivileged level
__asm volatile("MRS R0, CONTROL");
__asm volatile("ORR R0, R0, #1"); // Set bit[0] nPRIV
__asm volatile("MSR CONTROL, R0");
...
}
Run the first task#
Finally, we run the first task. Note the PC
value we have pushed to the task’s Stack Frame.
void start_scheduler() {
...
// get the handler of the first task by tracing back from PSP which is at R4 slot
void (*handler)() = (void (*))((uint32_t*)__puTasksPSP[__uCurrentTaskIdx])[14];
// execute the handler
handler();
}
The SysTick will be triggered to do Context Switching.
Context Switching#
The Context Switching is implemented in the SysTick_Handler()
as it is triggered periodically.
- Note 1:
- The
xPSR
,PC
,LR
,R12
,R3
toR0
registers are automatically saved to Task’s PSP stack when an exception occurs. They are also restored automatically when the exception exits.
- Note 2:
-
The processor use Fill Descending Stack, therefore, when pushing registers to stack, the PSP is decreased, while popping register from stack, the PSP is increased.
Therefore, we can use
STMDB
(Store memory using Decrement Before) to save registers at PSP. Note to use the exclamation to update the changed address (new PSP value).__asm volatile("STMDB R0!, {R4-R11}"); // R0 is updated after decrement
The same note is applied for
LDMIA
(Load memory using Increment After) to pop registers from stack. - Note 3:
- To note change the PSP stack of the current task,
SysTick_Handler()
must be a naked function to remove prologue and epilogue sequence of a function. - Note 4:
-
The scheduler will decode which task is run next. A Round-Robin, it is easy to make a circular index.
void select_next_task() { /* Round-Robin scheduler */ __uCurrentTaskIdx++; // check if a task is register at current slot if (__uCurrentTaskIdx >= TASK_NUMBER_MAX || __puTasksPSP[__uCurrentTaskIdx] == 0) { __uCurrentTaskIdx=0; } }
The implementation of Context Switching:
__attribute__ ((naked)) void SysTick_Handler() {
// save LR back to main, must do this firstly
__asm volatile("PUSH {LR}");
printf("****\n");
/* Save the context of current task */
// get current PSP
__asm volatile("MRS R0, PSP");
// save R4 to R11 to PSP Frame Stack
__asm volatile("STMDB R0!, {R4-R11}"); // R0 is updated after decrement
// save current value of PSP
__asm volatile("BL save_current_psp"); // R0 is first argument
/* Do scheduling */
// select next task
__asm volatile("BL select_next_task");
/* Retrieve the context of next task */
// get its past PSP value
__asm volatile("BL get_current_psp"); // return PSP is in R0
// retrieve R4-R11 from PSP Fram Stack
__asm volatile("LDMIA R0!, {R4-R11}"); // R0 is updated after increment
// update PSP
__asm volatile("MSR PSP, R0");
// exit
__asm volatile("POP {LR}");
__asm volatile("BX LR");
}
Main function#
In the main, we just need to do:
- Declare Tasks’ main function
- Add tasks into the Scheduler
- Start the scheduler
int main() {
// some initialization
...
// add tasks
init_task(task1_main);
init_task(task2_main);
init_task(task3_main);
init_task(task4_main);
// start
start_scheduler();
// should never go here
for(;;);****
}
Debug and Run#
Set a breakpoint before the scheduler starts:
- Check the Stack values
- Check the dummy Stack Frame
Run the scheduler, and open the output of the system, such as SWV to see the log.
All tasks should be running!
There is no gurantee that the messages in output stream are completed. You mostly see the messages are interrupted into piecese because different tasks will try to use one sharing output. You will learn to synchronize output to avoid interruption later.
Scheduler with PendSV#
The delay problem
Let say task A has to delay 1 ms in its execution, how long does it wait actually?
If there are N tasks running in above scheduler, task A will have to wait N ms, not 1 ms.
System Ticks#
System keeps tracking the number of ticks using a global counter, and increases it by 1 at every SysTick exception. As discussed in Pendable Service Call, SysTick Handler only needs to trigger the Context Switching by setting up the pending bit for PendSV. Note that, reschedule()
is called from an Exception which is in Handler mode with Privileged access level.
uint32_t __gSysTicks = 0;
void SysTick_Handler(void) {
printf("****\n");
__gSysTicks++;
reschedule(1); // privileged
}
Context Switching in PendSV#
We move the Context Switching from SysTick Handler to the PendSV Handler. The sequence is kept as the same as before:
- Save the context of current task
- Select next task
- Retrieve the context of next task
- Resume next task execution
__attribute__ ((naked)) void PendSV_Handler(void) {
// save LR back to main, must do this firstly
__asm volatile("PUSH {LR}");
/* Save the context of current task */
// get current PSP
__asm volatile("MRS R0, PSP");
// save R4 to R11 to PSP Frame Stack
__asm volatile("STMDB R0!, {R4-R11}"); // R0 is updated after decrement
// save current value of PSP
__asm volatile("BL save_current_psp"); // R0 is first argument
/* Do scheduling */
// select next task
__asm volatile("BL select_next_task");
/* Retrieve the context of next task */
// get its past PSP value
__asm volatile("BL get_current_psp"); // return PSP is in R0
// retrieve R4-R11 from PSP Fram Stack
__asm volatile("LDMIA R0!, {R4-R11}"); // R0 is updated after increment
// update PSP
__asm volatile("MSR PSP, R0");
// exit
__asm volatile("POP {LR}");
__asm volatile("BX LR");
}
Task state#
When a task has got nothing to do, it should go to the Blocked State in which it actually does nothing instead of consuming CPU to do a meaningless loop.
To make it simple, a task has 2 states: RUNNING
and BLOCKED
.
The scheduler should schedule only RUNNING
tasks, and unblock BLOCKED
tasks if their blocking period is over.
typedef enum {
RUNNING,
BLOCKED
} TaskState_t;
Task Control Block#
There is a structure called Task Control Block to group all information related to a task:
Handler
: the main function of the taskPSP
: the current process stack pointerState
: task is running or blockedWaitToTick
: task is blocked until the system ticks meet this value
typedef struct {
void (*handler)(void);
uint32_t psp;
TaskState_t state;
uint32_t waitToTick;
} TCB_t;
uint32_t __uCurrentTaskIdx = 0;
TCB_t __pTasks[TASK_NUMBER_MAX] = {0};
uint32_t get_current_psp() {
return __pTasks[__uCurrentTaskIdx].psp;
}
void save_current_psp(uint32_t psp) {
__pTasks[__uCurrentTaskIdx].psp = psp;
}
Blocked Task#
A task_delay()
function will set the current task to BLOCKED
state, and set its waitToTick
value. When a task is blocked, it should tell scheduler to run a next task. Note that, reschedule()
is called from a user task which is in Thread mode with Unprivileged access level.
void task_delay(uint32_t ticks) {
if(__uCurrentTaskIdx) {
__pTasks[__uCurrentTaskIdx].state = BLOCKED;
__pTasks[__uCurrentTaskIdx].waitToTick = __gSysTicks + ticks;
reschedule(0); // unprivileged
}
}
Idle Task#
This is the default and the first task of the scheduler. In case all user tasks are blocked, the Idle task will be run. As soon as a user task is unblocked, that task is scheduled to run on next SysTick exception.
void idle_main(void) {
while(1) {
__asm volatile("NOP");
}
}
The idle_main
is hidden to user, when a user tasks is added, scheduler will try to add the idle_main
at the first slot:
void init_task(void (*handler)) {
// init idle if it is not found
if (handler != idle_main) {
if (__pTasks[0].handler != idle_main) {
init_task(idle_main);
}
}
// find an empty slot
int i=0;
for(; i<TASK_NUMBER_MAX; i++) {
if (__pTasks[i].psp == 0) break;
}
if(i >= TASK_NUMBER_MAX) {
printf("Can not register a new task anymore!\n");
return;
} else {
printf("Register a task %p at slot %i\n", handler, i);
}
// calculate new PSP
uint32_t* psp = (uint32_t*)(MAIN_STACK - (i+1)*TASK_STACK_SIZE);
// fill dummy stack frame
*(--psp) = 0x01000000u; // Dummy xPSR, just enable Thumb State bit;
*(--psp) = (uint32_t) handler; // PC
*(--psp) = 0xFFFFFFFDu; // LR with EXC_RETURN to return to Thread using PSP
*(--psp) = 0x12121212u; // Dummy R12
*(--psp) = 0x03030303u; // Dummy R3
*(--psp) = 0x02020202u; // Dummy R2
*(--psp) = 0x01010101u; // Dummy R1
*(--psp) = 0x00000000u; // Dummy R0
*(--psp) = 0x11111111u; // Dummy R11
*(--psp) = 0x10101010u; // Dummy R10
*(--psp) = 0x09090909u; // Dummy R9
*(--psp) = 0x08080808u; // Dummy R8
*(--psp) = 0x07070707u; // Dummy R7
*(--psp) = 0x06060606u; // Dummy R6
*(--psp) = 0x05050505u; // Dummy R5
*(--psp) = 0x04040404u; // Dummy R4
// save PSP
__pTasks[i].psp = (uint32_t)psp;
// initial state
__pTasks[i].state = RUNNING;
__pTasks[i].handler = handler;
}
Select next task#
- At every context switching request, a task is selected to be run only when it is in
RUNNING
state - All tasks will be checked to unblock if its
waitToTick
meets the global tick
void select_next_task() {
/* Check all task state */
for(int i=0; i<TASK_NUMBER_MAX; i++) {
if(__pTasks[i].state == BLOCKED) {
if(__gSysTicks >= __pTasks[i].waitToTick) {
__pTasks[i].state = RUNNING;
}
}
}
/* Round-Robin scheduler */
while(1) {
__uCurrentTaskIdx++;
// check if a task is register at current slot
if (__uCurrentTaskIdx >= TASK_NUMBER_MAX || __pTasks[__uCurrentTaskIdx].psp == 0) {
__uCurrentTaskIdx=0;
}
if(__pTasks[__uCurrentTaskIdx].state == RUNNING) break;
}
}
Reschedule the task#
From above, we know that there are 2 moments that requests to reschedule the tasks:
-
In SysTick Handler: call from Handler mode with Privileged access level
We can directly access the Interrupt Control and State Register to set the pending bit for PendSV
-
In Delay Task: call from Thread mode with Unprivileged access level
We have limited access to the Interrupt Control and State Register, therefore, we have to request a Supervisor Call
SVC
Fault Exception
- Calling
SVC
in an SysTick handler causes Hard Fault (FORCED) - Direct access ICSR register in Thread Mode with unprivileged access causes Bus Fault (PRECISERR)
void reschedule(uint32_t priv) {
if(priv) {
// trigger PendSV directly
ICSR |= (1 << 28);
} else {
// call Supervior exception to get Privileged access
__asm volatile("SVC #255");
}
}
To call to SVC exception, refer to Supervisor Call (SVC):
__attribute__ ((naked)) void SVC_Handler(void) {
__asm volatile("TST LR, 4"); // check LR to know which stack is used
__asm volatile("ITE EQ"); // 2 next instructions are conditional
__asm volatile("MRSEQ R0, MSP"); // save MSP if bit 2 is 0
__asm volatile("MRSNE R0, PSP"); // save PSP if bit 2 is 1
__asm volatile("B SVC_Handler_main"); // pass R0 as the argument
}
void SVC_Handler_main(uint32_t* SP) {
// get the address of the instruction saved in PC
uint8_t *pInstruction = (uint8_t*)(SP[6]);
// go back 2 bytes (16-bit opcode)
pInstruction -= 2;
// get the opcode, in little endian
uint8_t svc_num = *pInstruction;
switch(svc_num) {
case 0xFF:
// trigger PendSV
ICSR |= (1 << 28);
break;
default:
break;
}
}
User Tasks and Main function#
Here is the time we use task_delay()
to delay a task in non-blocking way. The expected result is: print 1111
after 16 ticks, print 2222
after 8 ticks, and so on.
/* Tasks */
void task1_main(void) {
while(1) {
printf("1111\n");
task_delay(16);
}
}
void task2_main(void) {
while(1) {
printf("2222\n");
task_delay(8);
}
}
void task3_main(void) {
while(1) {
printf("3333\n");
task_delay(4);
}
}
void task4_main(void) {
while(1) {
printf("4444\n");
task_delay(2);
}
}
int main(void)
{
// some initialization
...
// add tasks
init_task(task1_main);
init_task(task2_main);
init_task(task3_main);
init_task(task4_main);
// start
start_scheduler();
// should never go here
for(;;);
}
Debug and Run#
Run the scheduler, and open the output of the system, such as SWV to see the log.
All tasks should be running!
You can count the number of ticks ****
to see how many ticks are passed before a task prints out again.