Real-Time Bluetooth Networks Shape the World Course Notes
Last Updated: October 01, 2018 by Pepe Sandoval
A Real-Time Operating System (RTOS) is software that manages a computer system's resources (memory, I/O, data, processor time, etc.) like a traditional OS and, in addition, guarantees that all timing constraints are satisfied. Main characteristics of an RTOS:
An RTOS has very specific metrics, so considering the peak load that needs to be handled is essential
Latency is the difference between the time a task is scheduled to run and the time when the task is actually run
Real-time means the system guarantees that important tasks get run at the correct time and also are completed at the right time
The T bit in the PSR register will always be 1, indicating the ARM Cortex-M processor is executing Thumb instructions.
The ARM Architecture Procedure Call Standard (AAPCS), part of the ARM Application Binary Interface (ABI), uses registers R0, R1, R2, and R3 to pass input parameters into a C function or an assembly subroutine, and places the return parameter in register R0.
The contents of R4-R11 must be preserved, so if any of these registers are needed during a subroutine, the function must save R4-R11, use them, and then restore them, making sure to push and pop an even number of registers to maintain 8-byte alignment on the stack.
ARM Architecture Procedure Call Standard (AAPCS) is used to properly connect C and Assembly languages
ARM's Cortex Microcontroller Software Interface Standard (CMSIS) is a standardized hardware abstraction layer for the Cortex-M processor; it allows the standardization of I/O functions.
The software abstraction for the I/O layer is what is commonly known as the hardware abstraction layer (HAL), device driver, or board support package (BSP).
The stack is just a region of RAM managed as a Last-In First-Out (LIFO) data structure by the stack pointer, which initially points to the bottom of that region. The stack grows toward lower addresses, so the stack pointer always points to the top of the stack, i.e., the last inserted or newest pushed element (this item is actually stored at the lowest address!)
PUSH operations decrement the stack pointer SP and then store 32-bit data at the address in SP. POP operations first retrieve the 32-bit data and then increment the stack pointer SP.
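As a minimal C model of this full-descending stack behavior (illustrative only, not course code):

```c
#include <stdint.h>

// PUSH pre-decrements the stack pointer, POP post-increments it,
// so the stack pointer always points at the newest element.
void Push(uint32_t **sp, uint32_t value){
  *sp = *sp - 1; // decrement SP first (stack grows toward lower addresses)
  **sp = value;  // then store the data at the new top of the stack
}

uint32_t Pop(uint32_t **sp){
  uint32_t value = **sp; // retrieve the newest element first
  *sp = *sp + 1;         // then increment SP
  return value;
}
```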
The ARM Cortex-M processor has two stack pointers: the main stack pointer (MSP) and the process stack pointer (PSP). Only one stack pointer is active at a time.
In a high-reliability operating system, we could activate the PSP for user software running at the unprivileged level and the MSP for operating system software running at the privileged level. This way the user program could crash without disturbing the operating system.
Perform push/pop operations only in the allocated area and don't violate the linear growth of the stack, which means DO NOT read or write to the stack region using any mechanism other than PUSH/POP operations.
Perform aligned operations: PUSH/POP must be 32-bit operations, so the least significant two bits of SP must always be 0.
On reset, the processor initializes the MSP stack pointer (SP) with the 32-bit value at address 0, loads the 32-bit value at address 4 into the PC (this value is called the reset vector), sets LR to 0xFFFFFFFF, and sets the T bit to 1.
The ARM Cortex-M processor has two privilege levels called privileged and unprivileged.
Bit 0 of the CONTROL register is the thread privilege level (TPL). Bit 1 of the CONTROL register is the active stack pointer selection (ASPSEL): if ASPSEL is 1, the processor uses the PSP; if 0, the MSP is used.
Privilege modes:
TPL = 0: access to all the features (privileged)
TPL = 1: prevents access to various features, like the system timer and the interrupt controller (unprivileged)
ARM defines the following running modes:
The processor begins in thread mode, signified by ISR_NUMBER=0. Whenever it is servicing an interrupt, it switches to handler mode, signified by setting ISR_NUMBER to specify which interrupt is being processed. All interrupt service routines run using the MSP. In particular, the context is saved onto whichever stack pointer is active, but during the execution of the ISR, the MSP is used.
An interface is defined as the collection of the I/O port, external electronics, physical devices, and software, which combine to allow the computer to communicate with the external world. An example of an input interface is a switch.
In a system with memory-mapped I/O, the I/O ports are connected to the processor in a manner similar to memory. I/O ports are assigned addresses, and the software accesses I/O using reads and writes to the specific I/O addresses. These addresses appear like regular memory addresses, except that accessing them manipulates the functionality of the mapped I/O port.
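For example, on the TM4C123 an I/O register is just a fixed address accessed through a volatile pointer; the definition below matches the GPIO_PORTF_DATA_R register name used later in these notes:

```c
#include <stdint.h>

// The Port F data register is mapped at a fixed address; volatile tells the
// compiler every access has a side effect and must not be optimized away.
#define GPIO_PORTF_DATA_R (*((volatile uint32_t *)0x400253FC))

void LED_Toggle(void){
  GPIO_PORTF_DATA_R ^= 0x02; // a normal-looking write manipulates the port (PF1)
}
```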
Commonly used ARM instructions: LDR, STR, MOV, PUSH, POP, B, BL, BX, ADD, SUB, CPSID, and CPSIE. Commonly used assembler directives: AREA, EQU, IMPORT, EXPORT, and ALIGN.
We use the BL instruction to call a subroutine. At run time, the BL instruction will save the return address in the LR register, so the last instruction in a subroutine will be BX LR, which we use to return from the subroutine.

| Addressing Mode | Description | Example |
|---|---|---|
| No addressing mode | Instructions operate completely within the processor and require no memory data fetches | ADD R1,R2,R3 |
| Immediate | Data within the instruction | MOV R0,#1 |
| Indexed | Data pointed to by register | LDR R0,[R1] |
| Indexed with offset | Data pointed to by register plus offset | LDR R0,[R1,#4] |
| PC-relative | Location is offset relative to PC | BL Incr |
| Register-list | List of registers | PUSH {R4,LR} |
Profiling is a type of performance debugging that collects the time history of program execution. Profiling measures where and when our software executes.
With profiling we can determine both a time profile (when) and an execution profile (where) of the software execution.
A program is a sequence of software commands connected together to effect a desired outcome. Programs are static and lifeless entities.
A thread is a piece of software (a program) in its state of execution.
A thread is defined as either the execution itself or the action caused by the execution. A thread is a program in action; threads are dynamic.
Threads abstracted by an OS or RTOS should each appear to have a separate stack, and each thread's local variables are private, meaning it alone has access to them. In reality there is just one set of registers that is switched between the threads as the thread scheduler operates. Threads do share resources such as global memory and I/O devices.
The thread switcher will suspend the currently running thread, run the scheduler to choose the next thread, and launch it (the detailed steps are listed later in these notes).
Thread categories:
Types of threads in a simple RTOS:
A producer thread is one that creates or produces data
A consumer thread is a thread that consumes (and removes) data
TCB (Thread Control Block) is a common name for a data structure to implement an RTOS thread which holds the attributes of a thread.
Active: the thread is ready to run but waiting for its turn
Run: the thread is currently executing
Sleep: the thread is waiting for an amount of time, after which the OS will make the thread active again
Blocked: the thread is waiting for an event or something else, e.g., some external event like input/output
Latency (Δi) refers to the time between when an event occurs and the completion of the response to that event: Δi = Ti - Ei for i = 0,1,2,...,n-1, where Ei is the time when event i occurs in our system and Ti is the time when the event was serviced.
Jitter (δti) is the difference between the desired time a task is supposed to run and the actual time it is run: δti = Ti - Di for i = 0,1,2,...,n-1, where Ti is the actual time the task is run and Di is the desired time to run the periodic task, Di = To + i*Δt, where To is the starting time for the system and Δt is the desired period, Δt = 1/fs.
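A quick worked example with made-up numbers: sampling at fs = 1000 Hz with the system starting at To = 0, suppose the 5th execution actually occurs at T5 = 5.2 ms. Then

```latex
\Delta t = \frac{1}{f_s} = 1\,\text{ms},\qquad
D_5 = T_0 + 5\Delta t = 5\,\text{ms},\qquad
\delta t_5 = T_5 - D_5 = 0.2\,\text{ms}
```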
A scheduler is an OS function that gives threads the notion of concurrent processing, where multiple threads are active.
Only a single thread can run at any given time while other ready threads contend for processing.
The scheduler runs the ready threads one by one, switching between them to give us the illusion that all are running simultaneously.
Let Ej be the time to execute each task, and let Tj be the time between executions of each task. In general, Ej/Tj will be the percentage of time task j needs to run. The sum of these percentages over all tasks of the system yields a parameter that estimates processor utilization; an effective system will operate in the 5 to 50% range of processor utilization. We also tend to check that utilization is < ln(2); since ln(2) ≈ 0.69, utilization < 0.69.
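For example, with illustrative numbers, three tasks needing 1 ms every 10 ms, 2 ms every 20 ms, and 5 ms every 50 ms give

```latex
U = \sum_{j} \frac{E_j}{T_j} = \frac{1}{10} + \frac{2}{20} + \frac{5}{50} = 0.30
```

which sits inside the 5 to 50% range and below ln(2) ≈ 0.69.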
In a preemptive scheduler threads are suspended by a periodic interrupt, the scheduler chooses a new thread to run, and the return from interrupt will launch this new thread.
In a cooperative or non-preemptive scheduler, the threads themselves must decide when to stop running or give up control usually through a call to a specific function like OS_Suspend
A round robin scheduler simply runs the ready threads in circular fashion, giving each the same amount of time to execute.
A weighted round robin scheduler runs the ready threads in circular fashion, but gives threads unequal weighting.
A priority scheduler assigns each thread a priority number, priority may be statically assigned or can be changed dynamically
An exponential queue scheduler uses a dynamic scheduling algorithm, with varying priorities and time slices.
In an aging scheduler threads have a permanent fixed priority and a temporary working priority, the temporary priority is used to actually schedule threads. Periodically the OS increases the temporary priority of threads that have not been run in a long time. Once a thread is run, its temporary priority is reset back to its permanent priority.
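A minimal sketch of the aging idea, assuming hypothetical FixedP/WorkP fields added to the TCB (smaller number = higher priority, matching the priority scheduler shown later):

```c
#include <stdint.h>
#define NUMTHREADS 3

typedef struct tcb {
  struct tcb *next;
  uint32_t FixedP; // permanent priority (hypothetical field)
  uint32_t WorkP;  // temporary working priority (hypothetical field)
} tcb_t;
tcb_t tcbs[NUMTHREADS];

void Age(void){ // called periodically by the OS
  for(int i = 0; i < NUMTHREADS; i++){
    if(tcbs[i].WorkP > 0){
      tcbs[i].WorkP--; // gradually raise the priority of threads that haven't run
    }
  }
}

void OnDispatch(tcb_t *thread){ // called when a thread is actually run
  thread->WorkP = thread->FixedP; // reset temporary priority to the permanent one
}
```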
Race Conditions
A program segment is reentrant if it can be interrupted in the middle of its execution and then safely be called again; more formally, if it can be concurrently executed by two (or more) threads without causing issues.
A non-reentrant subroutine will have a section of code called a vulnerable window or critical section; if the code is interrupted during the execution of this section, errors could occur.
Critical-section issues are rooted in the non-atomicity of the read-modify-write operation (or a derivative of it) involved in reading and writing a shared resource.
Usually in embedded systems all I/O ports are considered global variables
An atomic operation is one that once started is guaranteed to finish. In most computers, once an assembly instruction has begun, the instruction must be finished before the computer can process an interrupt.
To avoid critical-section issues we can either remove the access to global variables or implement mutual exclusion, which means only one thread at a time is allowed to execute in the critical section.
A simple way to implement mutual exclusion is to disable interrupts while executing the critical section. When making code atomic with this method, make sure one critical section is not nested inside another: if we blindly disable interrupts at the beginning and re-enable them at the end of each critical section, then with two disable/enable pairs, interrupts will be incorrectly enabled as soon as the innermost critical section finishes.
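The usual fix, and what the StartCritical()/EndCritical() pair used later in these notes accomplishes, is to save and restore the interrupt state instead of blindly re-enabling it. A minimal sketch using CMSIS intrinsics (provided by the core header, e.g. core_cm4.h):

```c
#include <stdint.h>

// Save PRIMASK, then disable interrupts; return the previous state.
static inline int32_t StartCritical(void){
  int32_t primask = __get_PRIMASK(); // 1 if interrupts were already disabled
  __disable_irq();                   // CPSID I
  return primask;
}

// Restore the saved state: interrupts are re-enabled only if they were
// enabled on entry, so nested critical sections compose correctly.
static inline void EndCritical(int32_t primask){
  __set_PRIMASK(primask);
}
```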
When we look for critical sections, we look for a global variable that is shared and that we have a non-atomic operation that involves a write.
The RTOS init consists of two main parts: 1) thread creation/initialization and 2) OS launch/start. This init, plus 3) a thread switcher, are the essential elements needed to implement an RTOS.
Thread Creation/Initialization
Definition of TCB
Disable interrupts and config clock (essential peripherals)
Init of linked lists
Init thread stacks
OS Launch/Start
Init SysTick, making sure to set the priority of SysTick to the lowest priority so it never interrupts other ISRs
Implement StartOS in assembly and run it; in this subroutine the first user thread is launched by setting the stack pointer to the stack-pointer value of the first thread, then pulling all the registers off the stack explicitly.
Thread Switcher: done in the SysTick ISR (SysTick_Handler), we switch between tasks/threads taking advantage of the registers automatically pushed onto the stack and the return-from-interrupt functionality
The processor automatically saves eight registers (R0-R3, R12, LR, PC and PSR) on the stack as it suspends execution of the main program and launches the ISR
Disable interrupts, because the thread switcher has read-modify-write operations on the SP and on RunPt
Explicitly save the remaining registers (R4-R11)
Get the value of the thread being interrupted, which is in RunPt; register R1 is loaded with this value
Save the actual SP into the sp field of the TCB
Choose the next thread in the circular linked list by updating RunPt with the new value
Set the new thread stack pointer using the newly set RunPt
Explicitly pull the remaining registers (R4-R11) from the stack
Enable interrupts
LR will contain 0xFFFFFFF9, so the BX LR instruction will automatically pull the remaining 8 registers from the stack, and now the processor will be running the new thread
There are 17 total registers that make up the state of the thread: 16 are saved on the thread stack, and the stack pointer itself is stored in the thread control block (TCB).
.c Code Implementation:

#include <stdint.h>

#define NUMTHREADS 3   // maximum number of threads
#define STACKSIZE  100 // number of 32-bit words in stack, 4*100 = 400 bytes for each thread stack
#define THREADFREQ 500 // frequency in Hz

int32_t Count0, Count1, Count2; // counters incremented by the example threads
void StartOS(void);             // implemented in assembly (see below)

// 1.0. Definition of TCB
typedef struct tcb {
  int32_t *sp;      // pointer to stack, valid for threads not running
  struct tcb *next; // linked-list pointer
} tcb_t;
tcb_t tcbs[NUMTHREADS];
tcb_t *RunPt;
int32_t Stacks[NUMTHREADS][STACKSIZE];

// Function SetInitialStack for step 1.3. Init thread stacks
void SetInitialStack(int i){
  tcbs[i].sp = &Stacks[i][STACKSIZE-16]; // thread stack pointer: point to the top of its stack, considering we are pushing 16 words
  // Order of registers should match as if a context switch happened
  Stacks[i][STACKSIZE-1] = 0x01000000;   // Thumb bit
  // We are missing the PC, which would be at Stacks[i][STACKSIZE-2]; it will be initialized later
  Stacks[i][STACKSIZE-3] = 0x14141414;   // R14 == LR
  // The initial values for the rest of the registers do not matter
  Stacks[i][STACKSIZE-4] = 0x12121212;   // R12
  Stacks[i][STACKSIZE-5] = 0x03030303;   // R3
  Stacks[i][STACKSIZE-6] = 0x02020202;   // R2
  Stacks[i][STACKSIZE-7] = 0x01010101;   // R1
  Stacks[i][STACKSIZE-8] = 0x00000000;   // R0
  Stacks[i][STACKSIZE-9] = 0x11111111;   // R11
  Stacks[i][STACKSIZE-10] = 0x10101010;  // R10
  Stacks[i][STACKSIZE-11] = 0x09090909;  // R9
  Stacks[i][STACKSIZE-12] = 0x08080808;  // R8
  Stacks[i][STACKSIZE-13] = 0x07070707;  // R7
  Stacks[i][STACKSIZE-14] = 0x06060606;  // R6
  Stacks[i][STACKSIZE-15] = 0x05050505;  // R5
  Stacks[i][STACKSIZE-16] = 0x04040404;  // R4
}

void OS_Init(void){
  // 1.1. Disable interrupts and config clock
  DisableInterrupts();
  BSP_Clock_InitFastest(); // set processor clock to fastest speed, 80 MHz in this case
}

int OS_AddThreads(void(*task0)(void), void(*task1)(void), void(*task2)(void)){
  int32_t status = StartCritical();
  // 1.2. Init of linked lists
  tcbs[0].next = &tcbs[1]; // 0 points to 1
  tcbs[1].next = &tcbs[2]; // 1 points to 2
  tcbs[2].next = &tcbs[0]; // 2 points to 0
  // 1.3. Init thread stacks
  SetInitialStack(0); Stacks[0][STACKSIZE-2] = (int32_t)(task0); // PC
  SetInitialStack(1); Stacks[1][STACKSIZE-2] = (int32_t)(task1); // PC
  SetInitialStack(2); Stacks[2][STACKSIZE-2] = (int32_t)(task2); // PC
  RunPt = &tcbs[0]; // thread 0 will run first
  EndCritical(status);
  return 1; // successful
}

void OS_Launch(uint32_t theTimeSlice){
  // 2.1. Init SysTick
  STCTRL = 0;                  // disable SysTick during setup
  STCURRENT = 0;               // any write to current clears it
  SYSPRI3 = (SYSPRI3 & 0x00FFFFFF) | 0xE0000000; // priority 7
  STRELOAD = theTimeSlice - 1; // reload value
  STCTRL = 0x00000007;         // enable, core clock and interrupt arm
  // 2.2. Implement `StartOS` in assembly and run
  StartOS();                   // start on the first task
}

void Scheduler(void){
  RunPt = RunPt->next; // Round Robin
}

void Task0(void){
  Count0 = 0;
  while(1){
    Count0++;
  }
}

void Task1(void){
  Count1 = 0;
  while(1){
    Count1++;
  }
}

void Task2(void){
  Count2 = 0;
  while(1){
    Count2++;
  }
}

int main(void){
  // 1. Thread Creation/Initialization
  OS_Init(); // initialize clock, disable interrupts
  OS_AddThreads(&Task0, &Task1, &Task2);
  // 2. OS Launch/start
  OS_Launch(BSP_Clock_GetFreq()/THREADFREQ); // interrupts enabled in here
  return 0;  // this never executes
}
.asm Code Implementation:

; 2.2. Implement `StartOS` in assembly and run
StartOS
    LDR R0, =RunPt  ; currently running thread
    LDR R1, [R0]    ; R1 = value of RunPt
    LDR SP, [R1]    ; new thread SP; SP = RunPt->sp;
    POP {R4-R11}    ; restore regs r4-11
    POP {R0-R3}     ; restore regs r0-3
    POP {R12}
    ADD SP, SP, #4  ; discard LR from initial stack
    POP {LR}        ; start location
    ADD SP, SP, #4  ; discard PSR
    CPSIE I         ; Enable interrupts at processor level
    BX LR           ; start first thread

    IMPORT Scheduler
; 3. Thread Switcher with C
SysTick_Handler     ; 3.1) Saves R0-R3,R12,LR,PC,PSR
    CPSID I         ; 3.2) Prevent interrupt during switch
    PUSH {R4-R11}   ; 3.3) Save remaining regs r4-11
    LDR R0, =RunPt  ; 3.4) R0 = pointer to RunPt, old thread
    LDR R1, [R0]    ; R1 = RunPt
    STR SP, [R1]    ; 3.5) Save SP into TCB
    PUSH {R0,LR}
    BL Scheduler    ; 3.6) RunPt = RunPt->next
    POP {R0,LR}
    LDR R1, [R0]    ; 3.6) R1 = RunPt, new thread, because R0 still points to RunPt
    LDR SP, [R1]    ; 3.7) new thread SP; SP = RunPt->sp;
    POP {R4-R11}    ; 3.8) restore regs r4-11
    CPSIE I         ; 3.9) tasks run with interrupts enabled
    BX LR           ; 3.10) restore R0-R3,R12,LR,PC,PSR

; 3. Alternative Thread Switcher implementation in pure assembly
SysTick_Handler_asm ; 3.1) Saves R0-R3,R12,LR,PC,PSR
    CPSID I         ; 3.2) Prevent interrupt during switch
    PUSH {R4-R11}   ; 3.3) Save remaining regs r4-11
    LDR R0, =RunPt  ; 3.4) R0 = pointer to RunPt, old thread
    LDR R1, [R0]    ; R1 = RunPt
    STR SP, [R1]    ; 3.5) Save SP into TCB
    LDR R1, [R1,#4] ; 3.6) R1 = RunPt->next
    STR R1, [R0]    ; 3.6) RunPt = R1
    LDR SP, [R1]    ; 3.7) new thread SP; SP = RunPt->sp;
    POP {R4-R11}    ; 3.8) restore regs r4-11
    CPSIE I         ; 3.9) tasks run with interrupts enabled
    BX LR           ; 3.10) restore R0-R3,R12,LR,PC,PSR
A semaphore is a counter with three functions: OS_InitSemaphore to initialize the semaphore, plus OS_Wait and OS_Signal (also called OS_Post or OS_Set), which are used at run time to provide synchronization between threads
Semaphores are used to implement:
A semaphore that can only be 0 or 1 is called a binary semaphore.
A spin-lock semaphore is the simplest way of implementing a semaphore in which a thread just waits for the semaphore to be posted
// If s is initialized as 0 the Semaphore can be used as a event flag, wait to wait for event and post to signal event
// If s is initialized as 1 the Semaphore can be used as a mutex, wait to lock and post to unlock the mutex
void OS_SpinLock_Init(int32_t *s, int32_t value){
*s = value;
}
void OS_SpinLock_Wait(int32_t *s) {
DisableInterrupts();
while((*s) == 0) { // Spin-lock loop
EnableInterrupts(); // interrupts can occur here
DisableInterrupts();
}
(*s) = (*s) - 1;
EnableInterrupts();
}
void OS_SpinLock_Signal(int32_t *s) {
DisableInterrupts();
(*s) = (*s) + 1;
EnableInterrupts();
}
Mailbox Semaphore Synchronization: we can use semaphores to implement a more structured mailbox synchronization mechanism, which is generally used in a producer-consumer situation between two threads
//// Spin-Lock Blocking implementation
// OS_MailBox init
uint32_t Mail; // shared data
int32_t Send = 0; // semaphore
int32_t Ack = 0; // semaphore
// Spin-Lock Blocking send implementation using Ack
void OS_MailBox_Send(uint32_t data) {
Mail = data;
OS_Signal(&Send);
OS_Wait(&Ack);
}
// Spin-Lock Blocking recv implementation using Ack
uint32_t OS_MailBox_Recv(void) {
uint32_t theData;
OS_Wait(&Send);
theData = Mail; // read mail
OS_Signal(&Ack);
return theData;
}
//// Spin-Lock Blocking, Non-Blocking mixed implementation
// OS_MailBox init
uint32_t Mail = 0; // shared data
int32_t Send = 0;  // semaphore
// Spin-lock non-blocking send implementation: if the mailbox is already full, the old data is overwritten and lost.
void OS_MailBox_Send(uint32_t data){
Mail = data;
if(!Send) {
OS_Signal(&Send);
}
}
// Spin-Lock Blocking recv
uint32_t OS_MailBox_Recv(void){
OS_Wait(&Send);
return Mail;
}
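A hypothetical producer/consumer pair using the mailbox above (BSP_ReadSensor and Process are placeholders, not course functions):

```c
void ProducerTask(void){
  while(1){
    uint32_t sample = BSP_ReadSensor(); // placeholder input source
    OS_MailBox_Send(sample);
  }
}

void ConsumerTask(void){
  while(1){
    uint32_t data = OS_MailBox_Recv(); // blocks until data is available
    Process(data);                     // placeholder handler
  }
}
```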
When a thread can no longer make progress, the smart thing to do is give up control so other tasks can run; to implement this, an OS usually has an OS_Suspend function or a similar method to give up control.
A cooperative semaphore is one that, instead of spinning while it waits, gives up control to other tasks, for example by calling the OS_Suspend function.
One way to suspend a thread is to trigger a scheduler interrupt (e.g., trigger a SysTick interrupt by writing 0x04000000 to the INTCTRL register and resetting the counter, to give a full time slice to the next thread).
void OS_Suspend(void){
STCURRENT = 0; // reset counter
INTCTRL = 0x04000000; // trigger SysTick
}
Blocking in an OS comes from the idea of letting a thread block and wake up only when the resource is available.
A blocking semaphore is a semaphore that will prevent a thread from running when the thread needs a resource that is unavailable; it puts threads in a blocked state. Reasons to use blocking semaphores:
A thread is in the blocked state when it is waiting for some external event like input/output; it is the semaphore function OS_Wait that will block a thread if it needs to wait.
A counting semaphore is a way of implementing a blocking semaphore which holds a meaning in its count, usually the number of threads blocked on this resource:
1: the resource is free/available.
0: the resource is not available, but nobody is blocked.
<0: the count represents the number of threads blocked waiting for the resource guarded by this semaphore, besides the thread using the resource. E.g., if the semaphore is -2, one thread is using the resource and two other threads are blocked, waiting to use it.
// Wait Decrements
void OS_CountingSem_Wait(int32_t *s){
DisableInterrupts();
(*s) = (*s) - 1;
if((*s) < 0) {
RunPt->blocked = s; // reason it is blocked
EnableInterrupts();
OS_Suspend(); // run thread switcher
}
EnableInterrupts();
}
// Signal Increments
void OS_CountingSem_Signal(int32_t *s){
tcb_t *pt;
DisableInterrupts();
(*s) = (*s) + 1;
if((*s) <= 0) {
pt = RunPt->next; // search for a thread blocked on this semaphore
while(pt->blocked != s) {
pt = pt->next;
}
pt->blocked = 0; // wakeup this one
}
EnableInterrupts();
}
// In this implementation calling the OS_CountingSem_Signal will not invoke the thread switcher.
// So during the thread switch, the OS needs to search the circular linked-list for a thread with a blocked field equal to zero
// the woken up thread in the signal call is just a possible candidate in the next scheduler iteration.
// Because of this we need to update the Scheduler accordingly
void Scheduler(void){
RunPt = RunPt->next; // run next thread not blocked
while(RunPt->blocked) { // skip if blocked
RunPt = RunPt->next;
}
}
FIFO Example: If a thread needs information from a FIFO (calls Get), then it will be blocked if the FIFO is empty (because it cannot retrieve any information.) Also, if a thread outputs information to a FIFO (calls Put), then it will be blocked if the FIFO is full (because it cannot save its information.)
The function Put will store data in the FIFO, and the function Get will remove data (returning the oldest data).
#define FIFOSIZE 10 // FIFO Size
uint32_t volatile *PutPt; // put next. Points to the next location to be put to.
uint32_t volatile *GetPt; // get next. Points to the oldest item
uint32_t static Fifo[FIFOSIZE];
int32_t CurrentSize; // Specifies the number of elements currently in the FIFO. 0 means FIFO empty
int32_t RoomLeft; // Specifies the how many more elements could be put into the FIFO. 0 means FIFO full
int32_t FIFOmutex; // exclusive access to FIFO, mutex to protect the pointers
// Initialize FIFO
void OS_Fifo_Init(void){
PutPt = GetPt = &Fifo[0]; // Empty
OS_InitSemaphore(&CurrentSize, 0);
OS_InitSemaphore(&RoomLeft, FIFOSIZE);
OS_InitSemaphore(&FIFOmutex, 1);
}
void OS_Fifo_Put(uint32_t data){
OS_Wait(&RoomLeft); // RoomLeft is decremented by Fifo_Put signifying there is space for one less element. Check if FIFO is full and block if it is
OS_Wait(&FIFOmutex);
*(PutPt) = data; // Put
PutPt++; // place to put next
if(PutPt == &Fifo[FIFOSIZE]){
PutPt = &Fifo[0]; // wrap
}
OS_Signal(&FIFOmutex);
OS_Signal(&CurrentSize); // CurrentSize is incremented by Fifo_Put signifying one more element
}
uint32_t OS_Fifo_Get(void){
uint32_t data;
OS_Wait(&CurrentSize); // CurrentSize is decremented by Fifo_Get signifying one less element. Check if FIFO is empty and block if it is
OS_Wait(&FIFOmutex);
data = *(GetPt); // get data
GetPt++; // points to next data to get
if(GetPt == &Fifo[FIFOSIZE]){
GetPt = &Fifo[0]; // wrap
}
OS_Signal(&FIFOmutex);
OS_Signal(&RoomLeft); // RoomLeft incremented by Fifo_Get signifying there is space for one more element.
return data;
}
In an alternative implementation, Put is non-blocking; we assume a single producer, so no mutex is needed to protect PutPt.
#define FIFOSIZE 10 // FIFO Size
uint32_t volatile *PutPt; // put next. Points to the next location to be put to.
uint32_t volatile *GetPt; // get next. Points to the oldest item
uint32_t static Fifo[FIFOSIZE];
int32_t CurrentSize; // Specifies the number of elements currently in the FIFO. 0 means FIFO empty
int32_t FIFOmutex; // Exclusive access to FIFO, needed to prevent two consumers from reading the same data
uint32_t LostData;
// initialize FIFO
void OS_Fifo_Init(void) {
PutPt = GetPt = &Fifo[0]; // Empty
OS_InitSemaphore(&CurrentSize, 0);
OS_InitSemaphore(&FIFOmutex, 1);
LostData = 0;
}
int OS_FIFO_Put(uint32_t data) {
if(CurrentSize == FIFOSIZE) {
LostData++; // Returns an error (-1) if the data was not saved because the FIFO was full
return -1;
}
*(PutPt) = data; // Put
PutPt++; // place for next
if(PutPt == &Fifo[FIFOSIZE]){
PutPt = &Fifo[0]; // wrap
}
OS_Signal(&CurrentSize);
return 0;
}
uint32_t OS_FIFO_Get(void) {
uint32_t data;
OS_Wait(&CurrentSize); // Check if FIFO is empty and block if it is empty
OS_Wait(&FIFOmutex);
data = *(GetPt); // get data
GetPt++; // points to next data to get
if(GetPt == &Fifo[FIFOSIZE]){
GetPt = &Fifo[0]; // wrap
}
OS_Signal(&FIFOmutex);
return data;
}
If there is only one consumer (besides the single producer), we don't need the FIFOmutex, so we can remove it from OS_FIFO_Get and convert this to a one-semaphore implementation of a FIFO.
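A minimal sketch of that one-semaphore Get, assuming a single producer and a single consumer so that neither pointer is shared between threads:

```c
uint32_t OS_FIFO_Get(void){
  uint32_t data;
  OS_Wait(&CurrentSize); // block if the FIFO is empty
  data = *(GetPt);       // only this one consumer touches GetPt
  GetPt++;
  if(GetPt == &Fifo[FIFOSIZE]){
    GetPt = &Fifo[0];    // wrap
  }
  return data;
}
```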
Sleeping is another way a thread gives up control; it implements a 'periodic' task that runs approximately every certain amount of time (it doesn't have an exact period of execution). It is usually used for debouncing or other operations that need a delay, but not a strict one.
Usually an OS implements an OS_Sleep function that will make a thread dormant for a finite time, after which the thread will be active again (not run, just active, to be considered to run again by the scheduler).
Implementation considerations:
OS_Sleep sets the thread's sleep field and calls the scheduler to switch tasks.
The sleep field is not an exact time delay, because on reaching 0 the thread is not immediately run, it just becomes active, so it will be considered to run again by the scheduler.
void Timer_ISR(void) { // periodic ISR that decrements the sleep counters
  tcb_t *pt = RunPt; // walk the circular list with a local pointer, leaving RunPt untouched
  uint32_t i = 0;
  while (i < NUMTHREADS) {
    if(pt->sleep > 0) {
      pt->sleep--;
    }
    pt = pt->next;
    i++;
  }
}
void OS_Sleep(uint32_t n) {
RunPt->sleep = n;
OS_Suspend();
}
void Scheduler(void) {
RunPt = RunPt->next; // skip at least one
while((RunPt->sleep) || (RunPt->blocked)) {
RunPt = RunPt->next; // find one not sleeping and not blocked
}
}
int32_t SW1,SW2;
uint8_t last1,last2;
void Switch_Init(void) {
SYSCTL_RCGCGPIO_R |= 0x20; // activate clock for Port F
OS_InitSemaphore(&SW1,0); // initialize semaphores
OS_InitSemaphore(&SW2,0);
GPIO_PORTF_LOCK_R = 0x4C4F434B; // unlock GPIO Port F
GPIO_PORTF_CR_R = 0x1F; // allow changes to PF4-0
GPIO_PORTF_DIR_R &= ~0x11; // make PF4,PF0 in
GPIO_PORTF_DEN_R |= 0x11; // enable digital I/O on PF4,PF0
GPIO_PORTF_PUR_R |= 0x11; // pullup on PF4,PF0
GPIO_PORTF_IS_R &= ~0x11; // PF4,PF0 are edge-sensitive
GPIO_PORTF_IBE_R |= 0x11; // PF4,PF0 are both edges
GPIO_PORTF_ICR_R = 0x11; // clear flags
GPIO_PORTF_IM_R |= 0x11; // arm interrupts on PF4,PF0
NVIC_PRI7_R = (NVIC_PRI7_R&0xFF00FFFF)|0x00A00000; // priority 5
NVIC_EN0_R = 0x40000000; // enable interrupt 30 in NVIC
}
void GPIOPortF_Handler(void) {
if(GPIO_PORTF_RIS_R&0x10){ // poll PF4
GPIO_PORTF_ICR_R = 0x10; // acknowledge flag4
OS_Signal(&SW1); // signal SW1 occurred
GPIO_PORTF_IM_R &= ~0x10; // disarm interrupt on PF4
}
if(GPIO_PORTF_RIS_R&0x01) { // poll PF0
GPIO_PORTF_ICR_R = 0x01; // acknowledge flag0
OS_Signal(&SW2); // signal SW2 occurred
GPIO_PORTF_IM_R &= ~0x01; // disarm interrupt on PF0
}
OS_Suspend(); // Explicit task switch
}
void Switch1Task(void) { // high priority main thread
last1 = GPIO_PORTF_DATA_R&0x10;
while(1) {
OS_Wait(&SW1); // wait for SW1 to be touched/released
if(last1) { // was previously not touched
Touch1(); // user software associated with touch
} else {
Release1(); // user software associated with release
}
OS_Sleep(10); // wait for bouncing to be over
last1 = GPIO_PORTF_DATA_R&0x10;
GPIO_PORTF_IM_R |= 0x10; // rearm interrupt on PF4
GPIO_PORTF_ICR_R = 0x10; // acknowledge flag4
}
}
void Switch2Task(void) { // high priority main thread
last2 = GPIO_PORTF_DATA_R&0x01;
while(1) {
OS_Wait(&SW2); // wait for SW2 to be touched/released
if(last2) { // was previously not touched
Touch2(); // user software associated with touch
}else {
Release2(); // user software associated with release
}
OS_Sleep(10); // wait for bouncing to be over
last2 = GPIO_PORTF_DATA_R&0x01;
GPIO_PORTF_IM_R |= 0x01; // rearm interrupt on PF0
GPIO_PORTF_ICR_R = 0x01; // acknowledge flag0
}
}
Turnaround time is the time elapsed from when a thread arrives until it completes execution.
Response time is the time elapsed from when a thread arrives until it starts execution. Round robin minimizes response time.
Priority inversion is a condition usually caused when a high-priority thread is waiting on a resource owned by a low-priority thread. In this situation a low-priority task effectively outranks a high-priority task for some time, so the priorities are "inverted". A solution is priority inheritance: once a high-priority thread blocks on a resource, the thread holding that resource is granted a temporary priority equal to the priority of the blocked high-priority thread; once the thread releases the resource, its priority is returned to its original value. This can also be addressed with a semaphore protocol called priority ceiling.
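A hedged sketch of priority inheritance on a mutex-style semaphore; the owner, Priority, and FixedPriority fields are hypothetical additions to the TCB, not the course implementation (smaller number = higher priority):

```c
typedef struct {
  int32_t value; // 1 = free, 0 = taken
  tcb_t *owner;  // thread currently holding the mutex
} pmutex_t;

void PMutex_Wait(pmutex_t *m){
  DisableInterrupts();
  while(m->value == 0){ // mutex taken by someone else
    if(RunPt->Priority < m->owner->Priority){
      m->owner->Priority = RunPt->Priority; // owner inherits the higher priority
    }
    EnableInterrupts();
    OS_Suspend();       // let the (boosted) owner run and release the mutex
    DisableInterrupts();
  }
  m->value = 0;
  m->owner = RunPt;
  EnableInterrupts();
}

void PMutex_Signal(pmutex_t *m){
  DisableInterrupts();
  RunPt->Priority = RunPt->FixedPriority; // drop any inherited priority
  m->value = 1;
  m->owner = 0;
  EnableInterrupts();
}
```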
The Rate Monotonic Theorem assigns priorities by rate: tasks with shorter periods (higher rates) get higher priority. A related rule of thumb is to assign high priority to tasks that generate a lot of blocking (I/O) and low priority to tasks that are processor intensive.
A multi-level feedback queue has good performance for both response time and turnaround time.
void Scheduler(void) { // every time slice
uint32_t max = 255; // max
tcbType *pt;
tcbType *bestPt;
pt = RunPt; // search for highest thread not blocked or sleeping
do{
pt = pt->next; // skips at least one
if((pt->Priority < max) && ((pt->BlockPt) == 0) && ((pt->Sleep) == 0)){
max = pt->Priority;
bestPt = pt;
}
} while(RunPt != pt); // look at all possible threads
RunPt = bestPt;
}
A priority scheduler is flexible in two ways.
We use OS_Signal and OS_Suspend to signal the semaphore on the appropriate event, and the user code runs as a main thread. With a priority scheduler, we can place time-critical tasks as high-priority threads. We block these time-critical tasks waiting on an event (semaphore), and when the event occurs we signal its semaphore. Because we now have a high-priority thread that is not blocked, the scheduler will run it immediately. This makes such tasks run periodically with very little jitter.
int32_t TakeSoundData; // binary semaphore
void RealTimeEvents(void){
OS_Signal(&TakeSoundData);
OS_Suspend();
}
void Task0(void){
while(1){
OS_Wait(&TakeSoundData); // signaled every 1ms
Profile_Toggle0(); // viewed by the logic analyzer to know Task0 started
// time-critical software here
}
}
int main(void){
OS_Init();
// other initialization
OS_InitSemaphore(&TakeSoundData,0);
OS_AddThreads(&Task0,0, &Task1,1, &Task2,2);
BSP_PeriodicTask_Init(&RealTimeEvents, 1000,0);
OS_Launch(BSP_Clock_GetFreq()/THREADFREQ); // doesn't return
return 0; // this never executes
}
A more robust implementation can perform the context switch in the PendSV ISR.
A File System allows the software to store data and to retrieve previously stored data. Essentially a file system supports:
File System metrics:
Fragmentation is a problem related to wasted space
File systems are usually implemented as a sequence of sectors
A sector is a unit of storage; each sector is the smallest amount of storage we can use to save data and has a fixed size (usually ~512 bytes), so whole sectors will be allocated to a file.
One of the responsibilities of the file system is to translate the logical address to the physical address.
Internal fragmentation
External fragmentation
Algorithms to manage storage:
Contiguous allocation:
Linked allocation:
Indexed allocation:
Memory Categories
Solid-state disks can be made from any nonvolatile memory
Secure digital (SD) cards use Flash EEPROM together with interface logic to read and write data
SD file-system implementations combine a low-level eDisk driver and a high-level FAT16 file system
The TM4C123 and MSP432 have 256 kB of internal flash, at addresses 0 to 0x0003FFFF
Normally, internal flash is used to save the machine code of our software. However, we can allocate part of it (half, for example, which is 2^17 = 128 kibibytes = 128 kB) to create a solid-state disk.
Let each sector be 2^p bytes; we will partition the 2^17-byte = 128 kB disk into 2^m sectors, so m + p = 17.
The smallest block that we can erase on the TM4C123 is 1024 bytes = 1 kB.
Operations:
int Flash_Erase(uint32_t addr);
int Flash_Write(uint32_t addr, uint32_t data);
and int Flash_WriteArray(uint32_t *source, uint32_t addr, uint16_t count);
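A hedged usage sketch of those calls; the base address assumes the upper half of flash is reserved for the disk, and the data values are arbitrary:

```c
#define DISK_BASE 0x00020000 // illustrative: second half of the 256 kB flash

uint32_t buf[16] = {0}; // data to save

void SaveBlock(void){
  Flash_Erase(DISK_BASE);                 // erase the 1 kB block first
  Flash_Write(DISK_BASE, 0x12345678);     // program one 32-bit word
  Flash_WriteArray(buf, DISK_BASE+4, 16); // program 16 words starting right after it
}
```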
Implementation details of a write-once FS:
Use 255 = 0xFF to mean a null pointer, and use sector number 255 as the directory.
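A minimal sketch of one possible layout under those conventions (the linked-sector scheme below is an assumption for illustration, not necessarily the exact course design):

```c
#include <stdint.h>

#define NUM_SECTORS 256
#define SECTOR_SIZE 512
#define NULLPTR     0xFF // 255 means "no sector"

uint8_t Directory[SECTOR_SIZE]; // RAM copy of sector 255; Directory[i] = first data sector of file i
uint8_t Buffer[SECTOR_SIZE];    // RAM copy of one data sector

// Follow a file's sector chain: the last byte of each data sector holds
// the number of the next sector, and NULLPTR terminates the chain.
uint8_t NextSector(const uint8_t *sector){
  return sector[SECTOR_SIZE-1];
}
```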
A network is a collection of interfaces that share a physical medium (the actual physical medium, like a wire) and a data protocol (how data is encoded in the communication channel). It provides the transfer of information as well as the mechanisms for process synchronization.
The topology of a network defines how the components are interconnected, i.e., the shape of the network's connections.
A full duplex channel allows data to transfer in both directions at the same time (E.g. Ethernet, SPI, UART...). In a half duplex system, data can transfer in both directions but only in one direction at a time (E.g. CAN, I2C, etc.). A simplex channel allows data to flow in only one direction.
The International Standards Organization (ISO) defines a 7-layer model called the Open Systems Interconnection (OSI) model, which provides a standard way to classify network components and operations.
CAN is a high-integrity serial data communications bus that is used for real-time applications
In CAN there are dominant and recessive states on the transmitter, if one or more nodes are sending a dominant state, it will override any nodes attempting to send a recessive state.
CAN components
A transceiver is a device capable of transmitting and receiving on the same channel.
In a CAN system, messages are identified by their contents rather than by addresses. Each message sent on the bus has a unique identifier, which defines both the content and the priority of the message.
CAN Message types:
The Data Frame includes the RTR bit:
0 (dominant): for data frames
1 (recessive): for remote request frames
To transmit a message, the software must set the identifier (11-bit or 29-bit), set the 4-bit DLC, and give the 0 to 8 bytes of data. Receivers can define filters on the identifier field, so only certain message types will be accepted; when a message is accepted, the software can read the identifier, length, and data.
The Intermission Frame Space (IFS) separates one frame from the next.
The number of bits in a CAN message frame is determined by the ID (11 or 29 bits) and the Data field (from 0 to 64 bits in multiples of 8, in other words from 0 to 8 bytes)
In CAN, bandwidth and response time are affected by message priority; the identifier with the lowest binary number has the highest priority.
In order to resolve a bus access conflict, each node in the network observes the bus level bit by bit, a process known as bit-wise arbitration. The dominant state overwrites the recessive state. All nodes with a recessive transmission but a dominant observation immediately lose the competition for bus access and become receivers of the higher-priority message (become listeners only). For example, if two nodes simultaneously start sending IDs 0x100 and 0x180, the bit streams match until the bit where 0x180 sends a recessive 1 while 0x100 sends a dominant 0; at that point the node sending 0x180 loses arbitration and just listens.
Bluetooth is a wireless medium and a data protocol that connects devices together over a short distance.
At the highest level, Bluetooth devices implement profiles. A profile is a suite of functionalities that support a certain type of communication, e.g., the Advanced Audio Distribution Profile (A2DP) can be used to stream data; the most generic one is the Generic Attribute Profile (GATT). Within the GATT there can be one or more services.
Within a service there may be one or more characteristics. A characteristic is user or application data that is transmitted from one device to another across the network (characteristic = data). Each characteristic has a universally unique identifier (UUID), which is a 128-bit (16-byte) number. A characteristic also has properties, which define what can be done with it (like read, write, or notify), and one or more descriptors. Descriptors may be information like its name and its units. UUIDs are passed across the network.
Handles are a mechanism to identify characteristics within the device. A handle is a pointer to an internal data structure within the GATT that contains all the information about that characteristic, they are not passed across the Bluetooth network; rather, they are used by the host and controller to keep track of characteristics.
Bluetooth LE uses the range from 2.40 to 2.48 GHz of the electromagnetic spectrum, which lies in the microwave region. It can use any of the 40 narrow bands (LL 0 to 39) at 2.4 GHz (each band is ±1 MHz).
LL channels 37, 38 and 39 are used to advertise, and LL channels 9-10, 21-23 and 33-36 are used for BLE communication
BLE has good performance in congested/noisy environments because it can hop from one frequency to another. Frequency Hopping Spread Spectrum (FHSS) rapidly switches the carrier among many frequency channels, using a pseudorandom sequence known to both transmitter and receiver
The overriding theme of Bluetooth communication is the exchange of data between paired devices. A service is a mechanism to exchange data. A collection of services is a profile.
The BLE protocol stack includes a controller and a host. Layers
Connection flow
The client-server paradigm is the dominant communication pattern for network protocols. The client can request information from the server, or the client can send data to the server.
In BLE, client-server read-write-notify operations:
BLE Devices:
There are three controllers on the CC2650:
The CC2650 BoosterPack comes preprogrammed with the simple network processor described in the next section. With a JTAG debugger, other programs can be loaded onto this CC2650, allowing for single-chip solutions using BLE.
Each SNP message frame begins with a start-of-frame (SOF) byte of 254 (0xFE).
SPI handshake, master-initiated transfer: MRDY low, SRDY low, then after the transfer MRDY high, SRDY high signaling ack.
SPI handshake, slave-initiated transfer: SRDY low, MRDY low, then after the transfer SRDY high, MRDY high signaling ack.
To reset the network processor, the master sets MRDY high and pulses reset low for 10 ms.
Each of the commands has an acknowledgement response.