The Embedded New Testament

The "Holy Bible" for embedded engineers


Project maintained by theEmbeddedGeorge Hosted on GitHub Pages — Theme by mattgraham

Memory Models in Embedded Systems

Understanding memory layout, segmentation, and access patterns for efficient embedded programming

📋 Table of Contents


🎯 Overview

Concept: Sections map to cost at startup and at runtime

Know which data ends up in Flash vs RAM and what the startup code must zero or copy. Use the map file to make footprint visible and deliberate.

Why it matters in embedded

Try it

  1. Build with a map file; identify largest contributors to .data and .bss.
  2. Move large tables to static const and observe .rodata vs .data changes.

Takeaways


🧪 Guided Labs

✅ Check Yourself

Understanding memory models is crucial for embedded systems programming. Memory layout, segmentation, and access patterns directly impact performance, reliability, and security of embedded applications.

Key Concepts for Embedded Development

🤔 What are Memory Models?

Memory models define how memory is organized, accessed, and managed in embedded systems. They specify the layout of different memory regions, how data is stored and retrieved, and how memory operations are synchronized across different components of the system.

Core Concepts

Memory Organization:

Memory Access Patterns:

Memory Management:

Memory Model Types

Flat Memory Model:

Segmented Memory Model:

Paged Memory Model:

🎯 Why are Memory Models Important?

Embedded System Requirements

Performance Optimization:

Reliability and Safety:

Resource Constraints:

Real-world Impact

Performance Impact:

// Poor memory access pattern - cache misses
void poor_memory_access(uint32_t* data, size_t size) {
    for (size_t i = 0; i < size; i += 16) {
        // Accessing every 16th element - poor cache utilization
        data[i] = process_value(data[i]);
    }
}

// Good memory access pattern - cache-friendly
void good_memory_access(uint32_t* data, size_t size) {
    for (size_t i = 0; i < size; i++) {
        // Sequential access - good cache utilization
        data[i] = process_value(data[i]);
    }
}

Memory Layout Impact:

// Poor memory layout - fragmentation
typedef struct {
    uint8_t small_field;    // 1 byte
    uint32_t large_field;   // 4 bytes (3 bytes padding)
    uint8_t another_small;  // 1 byte (3 bytes padding)
} poor_layout_t;  // 12 bytes total

// Good memory layout - efficient
typedef struct {
    uint32_t large_field;   // 4 bytes
    uint8_t small_field;    // 1 byte
    uint8_t another_small;  // 1 byte (2 bytes padding)
} good_layout_t;  // 8 bytes total

Stack Management Impact:

// Poor stack usage - potential overflow
void poor_stack_usage(void) {
    uint8_t large_buffer[8192];  // 8KB on stack
    // Process large buffer...
    // May cause stack overflow
}

// Good stack usage - heap for large data
void good_stack_usage(void) {
    uint8_t* large_buffer = malloc(8192);  // Heap allocation
    if (large_buffer != NULL) {
        // Process large buffer...
        free(large_buffer);
    }
}

When Memory Models Matter

High Impact Scenarios:

Low Impact Scenarios:

🧠 Memory Model Concepts

How Memory Models Work

Address Space Organization:

  1. Logical Addresses: Addresses used by programs
  2. Physical Addresses: Actual memory locations
  3. Address Translation: Converting logical to physical addresses
  4. Memory Mapping: Mapping logical addresses to physical memory

Memory Segmentation:

Memory Protection:

Memory Access Patterns

Sequential Access:

Random Access:

Strided Access:

Memory Hierarchy

Cache Levels:

Memory Types:

🏗️ Memory Layout

What is Memory Layout?

Memory layout refers to how different memory regions are organized in the address space. It defines where code, data, stack, and heap are located and how they relate to each other.

Memory Layout Concepts

Address Space Organization:

Memory Region Types:

Typical Embedded Memory Layout

/*
High Address
    ┌─────────────────┐
    │     Stack       │ ← Grows downward
    │                 │
    ├─────────────────┤
    │     Heap        │ ← Grows upward
    │                 │
    ├─────────────────┤
    │     .bss        │ ← Uninitialized data
    ├─────────────────┤
    │     .data       │ ← Initialized data
    ├─────────────────┤
    │     .text       │ ← Code
    └─────────────────┘
Low Address
*/

Memory Address Space

// Memory address ranges for ARM Cortex-M
#define FLASH_BASE     0x08000000  // Code memory
#define SRAM_BASE      0x20000000  // Data memory
#define PERIPH_BASE    0x40000000  // Peripheral registers

// Memory sizes
#define FLASH_SIZE     (512 * 1024)  // 512KB
#define SRAM_SIZE      (64 * 1024)   // 64KB
#define STACK_SIZE     (8 * 1024)    // 8KB stack

📊 Memory Segments

What are Memory Segments?

Memory segments are logical divisions of memory that serve different purposes. They help organize memory efficiently and provide different access patterns and protection levels.

Memory Segment Concepts

Segment Organization:

Segment Characteristics:

.text Segment (Code)

// Code segment - contains executable instructions
void function_in_text(void) {
    // This function is stored in .text segment
    uint32_t local_var = 42;
    // Function code here...
}

// Constants in .text segment
const uint8_t lookup_table[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};

// Function pointers
typedef void (*callback_t)(void);
const callback_t callbacks[] = {function1, function2, function3};

.data Segment (Initialized Data)

// Initialized global variables
uint32_t global_counter = 0;
uint8_t sensor_data[64] = {0xAA, 0xBB, 0xCC, 0xDD};
const char* const_string = "Hello World";

// Initialized static variables
static uint16_t static_var = 0x1234;

// Initialized arrays
uint8_t buffer[1024] = {0};  // Zero-initialized

.bss Segment (Uninitialized Data)

// Uninitialized global variables (zeroed by startup code)
uint32_t uninitialized_var;
uint8_t large_buffer[8192];
static uint16_t static_uninit;

// These variables are automatically zeroed
// No space is used in the binary file

Stack Segment

// Stack variables
void stack_example(void) {
    int local_var = 42;           // Stack allocated
    uint8_t buffer[256];          // Stack array
    struct sensor_data data;       // Stack structure
    
    // Stack grows downward
    // Variables are automatically freed when function returns
}

// Stack overflow detection
void check_stack_usage(void) {
    uint8_t* stack_ptr;
    asm volatile ("mov %0, sp" : "=r" (stack_ptr));
    
    // Calculate stack usage
    uint32_t stack_used = STACK_BASE - (uint32_t)stack_ptr;
    if (stack_used > STACK_SIZE - 1024) {
        // Stack nearly full - take action
    }
}

Heap Segment

// Dynamic memory allocation
void heap_example(void) {
    uint8_t* buffer = malloc(1024);
    if (buffer != NULL) {
        // Use buffer...
        free(buffer);
    }
}

// Heap fragmentation monitoring
typedef struct {
    size_t total_blocks;
    size_t free_blocks;
    size_t largest_free_block;
} heap_stats_t;

heap_stats_t get_heap_stats(void) {
    heap_stats_t stats = {0};
    // Implementation depends on malloc implementation
    return stats;
}

🔧 Linker Scripts

What are Linker Scripts?

Linker scripts define how the linker organizes memory and creates the final executable. They specify memory layout, section placement, and symbol definitions.

Linker Script Concepts

Memory Definition:

Section Placement:

Basic Linker Script

/* STM32F4 Linker Script */
MEMORY
{
    FLASH (rx) : ORIGIN = 0x08000000, LENGTH = 512K
    SRAM (rwx) : ORIGIN = 0x20000000, LENGTH = 64K
}

SECTIONS
{
    /* Code section */
    .text : {
        *(.text)
        *(.text*)
        *(.rodata)
        *(.rodata*)
    } > FLASH
    
    /* Initialized data */
    .data : {
        _sdata = .;
        *(.data)
        *(.data*)
        _edata = .;
    } > SRAM AT> FLASH
    
    /* Uninitialized data */
    .bss : {
        _sbss = .;
        *(.bss)
        *(.bss*)
        *(COMMON)
        _ebss = .;
    } > SRAM
}

Custom Sections

// Custom section for critical data
__attribute__((section(".critical_data")))
uint8_t critical_buffer[256];

// Custom section for fast code
__attribute__((section(".fast_code")))
void fast_function(void) {
    // Fast code implementation
}

🛡️ Memory Protection

What is Memory Protection?

Memory protection prevents unauthorized access to memory regions. It ensures that programs can only access memory they are supposed to access.

Memory Protection Concepts

Protection Mechanisms:

Protection Levels:

Memory Protection Implementation

MPU Configuration

// Memory Protection Unit configuration
typedef struct {
    uint32_t region_number;
    uint32_t base_address;
    uint32_t size;
    uint32_t access_permissions;
    uint32_t attributes;
} mpu_region_t;

void configure_mpu(void) {
    // Configure MPU regions
    mpu_region_t regions[] = {
        {0, 0x20000000, 0x1000, 0x03, 0x00},  // SRAM region
        {1, 0x08000000, 0x80000, 0x05, 0x00}, // Flash region
    };
    
    // Apply MPU configuration
    for (int i = 0; i < sizeof(regions)/sizeof(regions[0]); i++) {
        configure_mpu_region(&regions[i]);
    }
}

Memory Access Control

// Memory access control functions
void protect_memory_region(uint32_t start, uint32_t end, uint32_t permissions) {
    // Configure memory protection for region
    mpu_region_t region = {
        .base_address = start,
        .size = end - start,
        .access_permissions = permissions
    };
    configure_mpu_region(&region);
}

// Usage
protect_memory_region(0x20000000, 0x20001000, MPU_READ_WRITE);

Cache Behavior

What is Cache Behavior?

Cache behavior refers to how the CPU cache interacts with memory. Understanding cache behavior is crucial for optimizing memory access patterns.

Cache Behavior Concepts

Cache Organization:

Cache Operations:

Cache Optimization

Cache-friendly Access Patterns

// Cache-friendly array access
void cache_friendly_access(uint32_t* data, size_t size) {
    // Sequential access - good for cache
    for (size_t i = 0; i < size; i++) {
        data[i] = process_value(data[i]);
    }
}

// Cache-unfriendly access pattern
void cache_unfriendly_access(uint32_t* data, size_t size) {
    // Strided access - poor for cache
    for (size_t i = 0; i < size; i += 16) {
        data[i] = process_value(data[i]);
    }
}

Cache Line Alignment

// Cache line aligned data structure
typedef struct {
    uint32_t data[16];  // 64 bytes - cache line size
} __attribute__((aligned(64))) cache_aligned_t;

// Cache line aligned allocation
void* allocate_cache_aligned(size_t size) {
    void* ptr;
    posix_memalign(&ptr, 64, size);  // 64-byte alignment
    return ptr;
}

🔄 Memory Ordering

What is Memory Ordering?

Memory ordering refers to the order in which memory operations are performed. It’s important for multi-core systems and concurrent programming.

Memory Ordering Concepts

Memory Ordering Types:

Memory Barrier Types:

Memory Ordering Implementation

Memory Barriers

// Memory barrier functions
void full_memory_barrier(void) {
    __asm volatile (
        "dmb 0xF\n"  // Full system memory barrier
        : : : "memory"
    );
}

void data_memory_barrier(void) {
    __asm volatile (
        "dmb 0xE\n"  // Data memory barrier
        : : : "memory"
    );
}

void instruction_barrier(void) {
    __asm volatile (
        "isb 0xF\n"  // Instruction synchronization barrier
        : : : "memory"
    );
}

Atomic Operations

// Atomic operations with memory ordering
uint32_t atomic_add(uint32_t* ptr, uint32_t value) {
    uint32_t result;
    __asm volatile (
        "ldrex %0, [%1]\n"
        "add %0, %0, %2\n"
        "strex r1, %0, [%1]\n"
        "cmp r1, #0\n"
        "bne 1b\n"
        : "=r" (result)
        : "r" (ptr), "r" (value)
        : "r1", "cc"
    );
    return result;
}

🔧 Implementation

Complete Memory Model Example

#include <stdint.h>
#include <stdbool.h>

// Memory layout definitions
#define FLASH_BASE     0x08000000
#define SRAM_BASE      0x20000000
#define STACK_SIZE     (8 * 1024)
#define HEAP_SIZE      (16 * 1024)

// Memory protection definitions
#define MPU_READ_WRITE     0x03
#define MPU_READ_ONLY      0x05
#define MPU_NO_ACCESS      0x00

// Memory region structure
typedef struct {
    uint32_t start_address;
    uint32_t end_address;
    uint32_t permissions;
    const char* name;
} memory_region_t;

// Memory regions
static const memory_region_t memory_regions[] = {
    {FLASH_BASE, FLASH_BASE + 512*1024, MPU_READ_ONLY, "Flash"},
    {SRAM_BASE, SRAM_BASE + 64*1024, MPU_READ_WRITE, "SRAM"},
    {0x40000000, 0x40000000 + 1024*1024, MPU_READ_WRITE, "Peripherals"},
};

// Memory protection functions
void configure_memory_protection(void) {
    // Configure MPU for memory regions
    for (int i = 0; i < sizeof(memory_regions)/sizeof(memory_regions[0]); i++) {
        const memory_region_t* region = &memory_regions[i];
        configure_mpu_region(region->start_address, 
                           region->end_address - region->start_address,
                           region->permissions);
    }
}

// Stack monitoring
typedef struct {
    uint32_t stack_base;
    uint32_t stack_size;
    uint32_t current_usage;
} stack_monitor_t;

static stack_monitor_t stack_monitor = {
    .stack_base = SRAM_BASE + 64*1024 - STACK_SIZE,
    .stack_size = STACK_SIZE
};

void update_stack_usage(void) {
    uint32_t current_sp;
    __asm volatile ("mov %0, sp" : "=r" (current_sp));
    
    stack_monitor.current_usage = 
        stack_monitor.stack_base + stack_monitor.stack_size - current_sp;
    
    // Check for stack overflow
    if (stack_monitor.current_usage > stack_monitor.stack_size - 1024) {
        // Stack nearly full - take action
        handle_stack_overflow();
    }
}

// Heap monitoring
typedef struct {
    size_t total_allocated;
    size_t total_freed;
    size_t current_usage;
    size_t peak_usage;
} heap_monitor_t;

static heap_monitor_t heap_monitor = {0};

void* monitored_malloc(size_t size) {
    void* ptr = malloc(size);
    if (ptr != NULL) {
        heap_monitor.total_allocated += size;
        heap_monitor.current_usage += size;
        if (heap_monitor.current_usage > heap_monitor.peak_usage) {
            heap_monitor.peak_usage = heap_monitor.current_usage;
        }
    }
    return ptr;
}

void monitored_free(void* ptr) {
    if (ptr != NULL) {
        // Note: This is simplified - actual size tracking requires more complex implementation
        heap_monitor.total_freed += sizeof(void*);
        heap_monitor.current_usage -= sizeof(void*);
        free(ptr);
    }
}

// Cache optimization functions
void* allocate_cache_aligned(size_t size) {
    void* ptr;
    if (posix_memalign(&ptr, 64, size) != 0) {
        return NULL;
    }
    return ptr;
}

void cache_friendly_copy(uint8_t* dest, const uint8_t* src, size_t size) {
    // Copy data in cache-friendly manner
    for (size_t i = 0; i < size; i++) {
        dest[i] = src[i];
    }
}

// Memory barrier functions
void full_memory_barrier(void) {
    __asm volatile (
        "dmb 0xF\n"
        : : : "memory"
    );
}

void data_memory_barrier(void) {
    __asm volatile (
        "dmb 0xE\n"
        : : : "memory"
    );
}

// Main function
int main(void) {
    // Configure memory protection
    configure_memory_protection();
    
    // Monitor stack usage
    update_stack_usage();
    
    // Use monitored memory allocation
    uint8_t* buffer = monitored_malloc(1024);
    if (buffer != NULL) {
        // Use buffer
        monitored_free(buffer);
    }
    
    // Use cache-aligned allocation
    uint8_t* cache_buffer = allocate_cache_aligned(1024);
    if (cache_buffer != NULL) {
        // Use cache-aligned buffer
        free(cache_buffer);
    }
    
    return 0;
}

⚠️ Common Pitfalls

1. Stack Overflow

Problem: Stack grows beyond allocated space Solution: Monitor stack usage and allocate sufficient stack space

// ❌ Bad: Large stack allocation
void bad_stack_usage(void) {
    uint8_t large_buffer[8192];  // 8KB on stack
    // May cause stack overflow
}

// ✅ Good: Heap allocation for large data
void good_stack_usage(void) {
    uint8_t* large_buffer = malloc(8192);
    if (large_buffer != NULL) {
        // Use buffer
        free(large_buffer);
    }
}

2. Memory Fragmentation

Problem: Memory becomes fragmented over time Solution: Use memory pools and avoid frequent allocation/deallocation

// ❌ Bad: Frequent allocation/deallocation
void bad_memory_usage(void) {
    for (int i = 0; i < 1000; i++) {
        void* ptr = malloc(100);
        // Use ptr
        free(ptr);
    }
}

// ✅ Good: Reuse allocated memory
void good_memory_usage(void) {
    void* ptr = malloc(100);
    for (int i = 0; i < 1000; i++) {
        // Reuse ptr
    }
    free(ptr);
}

3. Cache-unfriendly Access

Problem: Poor cache utilization due to access patterns Solution: Use cache-friendly access patterns

// ❌ Bad: Cache-unfriendly access
void cache_unfriendly(uint32_t* data, size_t size) {
    for (size_t i = 0; i < size; i += 16) {
        data[i] = process_value(data[i]);
    }
}

// ✅ Good: Cache-friendly access
void cache_friendly(uint32_t* data, size_t size) {
    for (size_t i = 0; i < size; i++) {
        data[i] = process_value(data[i]);
    }
}

4. Memory Alignment Issues

Problem: Misaligned memory access causing performance penalties Solution: Ensure proper memory alignment

// ❌ Bad: Misaligned access
typedef struct {
    uint8_t a;     // 1 byte
    uint32_t b;    // 4 bytes (3 bytes padding)
    uint8_t c;     // 1 byte (3 bytes padding)
} misaligned_t;    // 12 bytes

// ✅ Good: Aligned access
typedef struct {
    uint32_t b;    // 4 bytes
    uint8_t a;     // 1 byte
    uint8_t c;     // 1 byte (2 bytes padding)
} aligned_t;       // 8 bytes

Best Practices

1. Understand Memory Layout

2. Optimize for Performance

3. Monitor Memory Usage

4. Use Appropriate Tools

5. Follow Standards

🎯 Interview Questions

Basic Questions

  1. What are memory models and why are they important?
    • Define how memory is organized and accessed
    • Important for performance, reliability, and security
    • Affect memory access patterns and efficiency
    • Critical for embedded systems
  2. What are the different memory segments?
    • .text: Code segment
    • .data: Initialized data segment
    • .bss: Uninitialized data segment
    • Stack: Function call stack
    • Heap: Dynamically allocated memory
  3. How do you optimize memory access for cache?
    • Use sequential access patterns
    • Align data to cache lines
    • Minimize cache misses
    • Use cache-friendly data structures

Advanced Questions

  1. How would you implement memory protection in an embedded system?
    • Use MPU for memory protection
    • Configure memory regions
    • Set access permissions
    • Monitor memory access
  2. How would you handle memory fragmentation?
    • Use memory pools
    • Implement defragmentation
    • Monitor fragmentation
    • Use appropriate allocation strategies
  3. How would you optimize memory usage in a memory-constrained system?
    • Minimize code size
    • Optimize data structures
    • Use memory pools
    • Monitor memory usage

Implementation Questions

  1. Write a function to monitor stack usage
  2. Implement a memory pool allocator
  3. Create a cache-friendly data structure
  4. Design a memory protection system

📚 Additional Resources

Books

Online Resources

Tools

Standards


Next Steps: Explore Advanced Memory Management to understand efficient memory management techniques, or dive into Hardware Fundamentals for hardware-specific programming.