The Embedded New Testament

The "Holy Bible" for embedded engineers

Project maintained by theEmbeddedGeorge Hosted on GitHub Pages — Theme by mattgraham

Multi-threading

Concurrent Execution in Linux Applications
Understanding pthread programming, thread synchronization, and concurrent programming for embedded systems

🏗️ Threading Fundamentals

What are Threads?

Threads are lightweight units of execution within a process that share the same memory space and resources. Unlike processes, which have separate memory spaces, threads can directly access shared data, making them ideal for concurrent programming in embedded systems.

The Thread Abstraction:

Lightweight: Much faster to create and destroy than processes
Shared Memory: All threads in a process share the same address space
Concurrent Execution: Multiple threads can run simultaneously
Resource Sharing: Threads share file descriptors, memory, and other resources
Efficient Communication: Direct access to shared data structures

Threads vs. Processes: Understanding the Distinction

Understanding when to use threads versus processes is crucial for efficient embedded programming:

Processes:

Memory Space: Separate virtual address space for each process
Resource Isolation: Complete isolation of resources
Communication: Requires IPC mechanisms (pipes, sockets, etc.)
Overhead: Higher creation and context switching overhead
Use Case: Independent applications, security isolation

Threads:

Memory Space: Shared address space within a process
Resource Sharing: Direct access to shared memory and resources
Communication: Direct access to shared data structures
Overhead: Lower creation and context switching overhead
Use Case: Concurrent tasks within an application, performance optimization

┌─────────────────────────────────────┐
│         Process Address Space       │
├─────────────────────────────────────┤
│         Code Segment                │ ← Shared by all threads
│         (Program instructions)      │
├─────────────────────────────────────┤
│         Data Segment                │ ← Shared by all threads
│         (Global variables)          │
├─────────────────────────────────────┤
│         Heap                        │ ← Shared by all threads
│         (Dynamic allocations)       │
├─────────────────────────────────────┤
│         Stack Segments              │ ← Separate for each thread
│  ┌─────────┬─────────┬─────────┐   │
│  │Thread 1 │Thread 2 │Thread 3 │   │
│  │ Stack   │ Stack   │ Stack   │   │
│  └─────────┴─────────┴─────────┘   │
└─────────────────────────────────────┘

Threading Philosophy

Threading follows the concurrent execution principle—enable multiple tasks to execute simultaneously while maintaining data consistency and system stability.

Threading Design Goals:

Concurrency: Enable parallel execution of multiple tasks
Efficiency: Minimize overhead of thread creation and management
Safety: Ensure thread-safe access to shared resources
Performance: Optimize for multi-core systems
Simplicity: Provide clear and intuitive programming model

🔧 POSIX Threads (pthread)

The Standard Threading Interface

POSIX Threads (pthread) is the standard threading interface for Unix-like systems, including Linux. It provides a comprehensive set of functions for thread creation, management, and synchronization.

POSIX Threads Philosophy

POSIX Threads follows the portability and consistency principle—provide a standard interface that works consistently across different systems while maintaining high performance and flexibility.

pthread Design Goals:

Portability: Work consistently across POSIX-compliant systems
Performance: Minimal overhead for thread operations
Flexibility: Support various threading models and patterns
Reliability: Robust error handling and resource management
Standards Compliance: Follow POSIX.1c thread standard

Core pthread Components

Thread Management:

pthread_create(): Create new threads
pthread_join(): Wait for thread completion
pthread_detach(): Make thread automatically clean up
pthread_exit(): Terminate thread execution

Synchronization:

pthread_mutex_t: Mutual exclusion locks
pthread_cond_t: Condition variables
pthread_sem_t: Semaphores
pthread_rwlock_t: Read-write locks

Thread Attributes:

pthread_attr_t: Thread creation attributes
Stack size and location: Control thread memory usage
Scheduling policy: Set thread scheduling parameters
Detach state: Control thread cleanup behavior

🔄 Thread Creation and Management

Creating and Managing Thread Lifecycles

Thread creation and management involves setting up threads, controlling their execution, and ensuring proper cleanup. Understanding these operations is essential for building robust multi-threaded applications.

Thread Creation Philosophy

Thread creation follows the controlled initialization principle—create threads with specific attributes and behaviors that match their intended purpose and system requirements.

Thread Creation Goals:

Efficiency: Minimize overhead of thread creation
Control: Set appropriate attributes for each thread
Resource Management: Control thread memory and scheduling
Error Handling: Handle creation failures gracefully
Cleanup: Ensure proper resource cleanup

Basic Thread Creation

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>
#include <unistd.h>
#include <string.h>

// Thread function signature: void *function_name(void *arg)
void *thread_function(void *arg) {
    int thread_id = *(int *)arg;
    
    printf("Thread %d starting\n", thread_id);
    
    // Simulate work
    for (int i = 0; i < 5; i++) {
        printf("Thread %d: iteration %d\n", thread_id, i);
        sleep(1);
    }
    
    printf("Thread %d completing\n", thread_id);
    
    // Return value (can be used by joining thread)
    int *result = malloc(sizeof(int));
    *result = thread_id * 100;
    return result;
}

// Basic thread creation example
int basic_thread_creation(void) {
    pthread_t threads[3];
    int thread_ids[3];
    void *thread_results[3];
    int i;
    
    printf("Main thread starting\n");
    
    // Create threads
    for (i = 0; i < 3; i++) {
        thread_ids[i] = i;
        
        if (pthread_create(&threads[i], NULL, thread_function, &thread_ids[i]) != 0) {
            perror("Failed to create thread");
            return -1;
        }
        
        printf("Created thread %d\n", i);
    }
    
    // Wait for all threads to complete
    for (i = 0; i < 3; i++) {
        if (pthread_join(threads[i], &thread_results[i]) != 0) {
            perror("Failed to join thread");
            return -1;
        }
        
        printf("Thread %d returned: %d\n", i, *(int *)thread_results[i]);
        free(thread_results[i]);  // Free returned memory
    }
    
    printf("All threads completed\n");
    return 0;
}

Thread Attributes and Control

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Thread function with attributes
void *attributed_thread_function(void *arg) {
    pthread_t self = pthread_self();
    
    // Get thread attributes
    int policy;
    struct sched_param param;
    
    if (pthread_getschedparam(self, &policy, &param) == 0) {
        printf("Thread %lu: policy=%d, priority=%d\n", 
               (unsigned long)self, policy, param.sched_priority);
    }
    
    // Simulate work
    sleep(2);
    
    return NULL;
}

// Thread creation with custom attributes
int attributed_thread_creation(void) {
    pthread_t thread;
    pthread_attr_t attr;
    struct sched_param param;
    
    // Initialize thread attributes
    if (pthread_attr_init(&attr) != 0) {
        perror("Failed to initialize thread attributes");
        return -1;
    }
    
    // Set stack size (1 MB)
    if (pthread_attr_setstacksize(&attr, 1024 * 1024) != 0) {
        perror("Failed to set stack size");
        pthread_attr_destroy(&attr);
        return -1;
    }
    
    // Set scheduling policy to SCHED_FIFO (real-time)
    if (pthread_attr_setschedpolicy(&attr, SCHED_FIFO) != 0) {
        perror("Failed to set scheduling policy");
        pthread_attr_destroy(&attr);
        return -1;
    }
    
    // Set priority
    param.sched_priority = 50;
    if (pthread_attr_setschedparam(&attr, &param) != 0) {
        perror("Failed to set scheduling parameters");
        pthread_attr_destroy(&attr);
        return -1;
    }
    
    // Set detach state to detached (auto-cleanup)
    if (pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0) {
        perror("Failed to set detach state");
        pthread_attr_destroy(&attr);
        return -1;
    }
    
    // Create thread with attributes
    if (pthread_create(&thread, &attr, attributed_thread_function, NULL) != 0) {
        perror("Failed to create thread");
        pthread_attr_destroy(&attr);
        return -1;
    }
    
    printf("Created detached thread with custom attributes\n");
    
    // Clean up attributes
    pthread_attr_destroy(&attr);
    
    // Wait a bit for thread to complete (since it's detached)
    sleep(3);
    
    return 0;
}

Thread Lifecycle Management

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>

// Global thread control
volatile int thread_running = 1;
pthread_t worker_thread;

// Worker thread function
void *worker_thread_function(void *arg) {
    printf("Worker thread starting\n");
    
    while (thread_running) {
        printf("Worker thread: processing...\n");
        sleep(1);
    }
    
    printf("Worker thread: shutdown signal received\n");
    return NULL;
}

// Signal handler for graceful shutdown
void signal_handler(int sig) {
    if (sig == SIGINT || sig == SIGTERM) {
        printf("Shutdown signal received\n");
        thread_running = 0;
    }
}

// Thread lifecycle management example
int thread_lifecycle_example(void) {
    void *thread_result;
    
    // Set up signal handling
    signal(SIGINT, signal_handler);
    signal(SIGTERM, signal_handler);
    
    printf("Creating worker thread\n");
    
    // Create worker thread
    if (pthread_create(&worker_thread, NULL, worker_thread_function, NULL) != 0) {
        perror("Failed to create worker thread");
        return -1;
    }
    
    printf("Worker thread created. Press Ctrl+C to stop\n");
    
    // Wait for shutdown signal
    while (thread_running) {
        sleep(1);
    }
    
    printf("Waiting for worker thread to complete\n");
    
    // Wait for thread to complete
    if (pthread_join(worker_thread, &thread_result) != 0) {
        perror("Failed to join worker thread");
        return -1;
    }
    
    printf("Worker thread completed\n");
    return 0;
}

🔒 Thread Synchronization

Coordinating Access to Shared Resources

Thread synchronization is essential for preventing race conditions and ensuring data consistency when multiple threads access shared resources. Understanding synchronization mechanisms is crucial for building reliable multi-threaded applications.

Synchronization Philosophy

Thread synchronization follows the safety and performance principle—ensure thread-safe access to shared resources while minimizing performance overhead and avoiding deadlocks.

Synchronization Goals:

Safety: Prevent race conditions and data corruption
Performance: Minimize synchronization overhead
Deadlock Prevention: Avoid circular waiting conditions
Scalability: Support increasing numbers of threads
Simplicity: Provide clear and intuitive synchronization primitives

Mutex-based Synchronization

Mutexes provide mutual exclusion, ensuring that only one thread can access a critical section at a time:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Shared data structure
struct shared_data {
    int counter;
    pthread_mutex_t mutex;
    char buffer[256];
};

// Thread function using mutex
void *mutex_thread_function(void *arg) {
    struct shared_data *data = (struct shared_data *)arg;
    int thread_id = *(int *)arg;
    
    for (int i = 0; i < 10; i++) {
        // Acquire mutex
        if (pthread_mutex_lock(&data->mutex) != 0) {
            perror("Failed to lock mutex");
            return NULL;
        }
        
        // Critical section
        data->counter++;
        snprintf(data->buffer, sizeof(data->buffer), 
                "Thread %d updated counter to %d", thread_id, data->counter);
        printf("%s\n", data->buffer);
        
        // Simulate work in critical section
        usleep(100000);  // 100ms
        
        // Release mutex
        if (pthread_mutex_unlock(&data->mutex) != 0) {
            perror("Failed to unlock mutex");
            return NULL;
        }
        
        // Non-critical work
        usleep(50000);  // 50ms
    }
    
    return NULL;
}

// Mutex synchronization example
int mutex_synchronization_example(void) {
    pthread_t threads[3];
    int thread_ids[3];
    struct shared_data shared_data = {0};
    
    // Initialize mutex
    if (pthread_mutex_init(&shared_data.mutex, NULL) != 0) {
        perror("Failed to initialize mutex");
        return -1;
    }
    
    printf("Starting mutex synchronization example\n");
    printf("Initial counter value: %d\n", shared_data.counter);
    
    // Create threads
    for (int i = 0; i < 3; i++) {
        thread_ids[i] = i;
        if (pthread_create(&threads[i], NULL, mutex_thread_function, &shared_data) != 0) {
            perror("Failed to create thread");
            return -1;
        }
    }
    
    // Wait for all threads to complete
    for (int i = 0; i < 3; i++) {
        if (pthread_join(threads[i], NULL) != 0) {
            perror("Failed to join thread");
            return -1;
        }
    }
    
    printf("Final counter value: %d\n", shared_data.counter);
    
    // Clean up mutex
    pthread_mutex_destroy(&shared_data.mutex);
    
    return 0;
}

Condition Variable Synchronization

Condition variables allow threads to wait for specific conditions to become true:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Producer-consumer data structure
struct producer_consumer {
    int buffer[10];
    int count;
    int head;
    int tail;
    pthread_mutex_t mutex;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
    int shutdown;
};

// Producer thread function
void *producer_function(void *arg) {
    struct producer_consumer *pc = (struct producer_consumer *)arg;
    int item = 0;
    
    while (!pc->shutdown) {
        // Acquire mutex
        pthread_mutex_lock(&pc->mutex);
        
        // Wait while buffer is full
        while (pc->count == 10 && !pc->shutdown) {
            printf("Producer: buffer full, waiting...\n");
            pthread_cond_wait(&pc->not_full, &pc->mutex);
        }
        
        if (pc->shutdown) {
            pthread_mutex_unlock(&pc->mutex);
            break;
        }
        
        // Add item to buffer
        pc->buffer[pc->tail] = item;
        pc->tail = (pc->tail + 1) % 10;
        pc->count++;
        
        printf("Producer: added item %d, count = %d\n", item, pc->count);
        
        // Signal consumer that buffer is not empty
        pthread_cond_signal(&pc->not_empty);
        
        // Release mutex
        pthread_mutex_unlock(&pc->mutex);
        
        item++;
        usleep(200000);  // 200ms
    }
    
    printf("Producer: shutting down\n");
    return NULL;
}

// Consumer thread function
void *consumer_function(void *arg) {
    struct producer_consumer *pc = (struct producer_consumer *)arg;
    
    while (!pc->shutdown) {
        // Acquire mutex
        pthread_mutex_lock(&pc->mutex);
        
        // Wait while buffer is empty
        while (pc->count == 0 && !pc->shutdown) {
            printf("Consumer: buffer empty, waiting...\n");
            pthread_cond_wait(&pc->not_empty, &pc->mutex);
        }
        
        if (pc->shutdown && pc->count == 0) {
            pthread_mutex_unlock(&pc->mutex);
            break;
        }
        
        // Remove item from buffer
        int item = pc->buffer[pc->head];
        pc->head = (pc->head + 1) % 10;
        pc->count--;
        
        printf("Consumer: removed item %d, count = %d\n", item, pc->count);
        
        // Signal producer that buffer is not full
        pthread_cond_signal(&pc->not_full);
        
        // Release mutex
        pthread_mutex_unlock(&pc->mutex);
        
        usleep(300000);  // 300ms
    }
    
    printf("Consumer: shutting down\n");
    return NULL;
}

// Condition variable example
int condition_variable_example(void) {
    pthread_t producer_thread, consumer_thread;
    struct producer_consumer pc = {0};
    
    // Initialize synchronization primitives
    if (pthread_mutex_init(&pc.mutex, NULL) != 0 ||
        pthread_cond_init(&pc.not_empty, NULL) != 0 ||
        pthread_cond_init(&pc.not_full, NULL) != 0) {
        perror("Failed to initialize synchronization primitives");
        return -1;
    }
    
    printf("Starting producer-consumer example\n");
    
    // Create producer and consumer threads
    if (pthread_create(&producer_thread, NULL, producer_function, &pc) != 0 ||
        pthread_create(&consumer_thread, NULL, consumer_function, &pc) != 0) {
        perror("Failed to create threads");
        return -1;
    }
    
    // Let threads run for a while
    sleep(5);
    
    // Signal shutdown
    pthread_mutex_lock(&pc.mutex);
    pc.shutdown = 1;
    pthread_cond_broadcast(&pc.not_empty);
    pthread_cond_broadcast(&pc.not_full);
    pthread_mutex_unlock(&pc.mutex);
    
    // Wait for threads to complete
    pthread_join(producer_thread, NULL);
    pthread_join(consumer_thread, NULL);
    
    printf("Producer-consumer example completed\n");
    
    // Clean up
    pthread_mutex_destroy(&pc.mutex);
    pthread_cond_destroy(&pc.not_empty);
    pthread_cond_destroy(&pc.not_full);
    
    return 0;
}

📡 Thread Communication

Thread communication involves sharing data and coordinating activities between threads. Understanding safe communication patterns is essential for building reliable multi-threaded applications.

Thread Communication Philosophy

Thread communication follows the safe sharing principle—enable efficient data sharing between threads while maintaining data consistency and preventing race conditions.

Communication Goals:

Efficiency: Minimize overhead of data sharing
Safety: Ensure thread-safe access to shared data
Clarity: Make communication patterns easy to understand
Performance: Optimize for common communication patterns
Reliability: Handle communication failures gracefully

Shared Memory Communication

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

// Shared data structure
struct shared_memory {
    int data[100];
    int data_count;
    pthread_mutex_t mutex;
    pthread_cond_t data_ready;
    int shutdown;
};

// Writer thread function
void *writer_function(void *arg) {
    struct shared_memory *shared = (struct shared_memory *)arg;
    
    for (int i = 0; i < 50; i++) {
        pthread_mutex_lock(&shared->mutex);
        
        // Add data to shared memory
        shared->data[shared->data_count] = i * 2;
        shared->data_count++;
        
        printf("Writer: added data %d, count = %d\n", i * 2, shared->data_count);
        
        // Signal that data is ready
        pthread_cond_signal(&shared->data_ready);
        
        pthread_mutex_unlock(&shared->mutex);
        
        usleep(100000);  // 100ms
    }
    
    // Signal shutdown
    pthread_mutex_lock(&shared->mutex);
    shared->shutdown = 1;
    pthread_cond_broadcast(&shared->data_ready);
    pthread_mutex_unlock(&shared->mutex);
    
    printf("Writer: completed\n");
    return NULL;
}

// Reader thread function
void *reader_function(void *arg) {
    struct shared_memory *shared = (struct shared_memory *)arg;
    int last_read = 0;
    
    while (!shared->shutdown || last_read < shared->data_count) {
        pthread_mutex_lock(&shared->mutex);
        
        // Wait for new data
        while (last_read >= shared->data_count && !shared->shutdown) {
            printf("Reader: waiting for data...\n");
            pthread_cond_wait(&shared->data_ready, &shared->mutex);
        }
        
        // Read available data
        while (last_read < shared->data_count) {
            printf("Reader: read data %d\n", shared->data[last_read]);
            last_read++;
        }
        
        pthread_mutex_unlock(&shared->mutex);
        
        usleep(150000);  // 150ms
    }
    
    printf("Reader: completed\n");
    return NULL;
}

// Shared memory communication example
int shared_memory_communication_example(void) {
    pthread_t writer_thread, reader_thread;
    struct shared_memory shared = {0};
    
    // Initialize synchronization primitives
    if (pthread_mutex_init(&shared.mutex, NULL) != 0 ||
        pthread_cond_init(&shared.data_ready, NULL) != 0) {
        perror("Failed to initialize synchronization primitives");
        return -1;
    }
    
    printf("Starting shared memory communication example\n");
    
    // Create writer and reader threads
    if (pthread_create(&writer_thread, NULL, writer_function, &shared) != 0 ||
        pthread_create(&reader_thread, NULL, reader_function, &shared) != 0) {
        perror("Failed to create threads");
        return -1;
    }
    
    // Wait for threads to complete
    pthread_join(writer_thread, NULL);
    pthread_join(reader_thread, NULL);
    
    printf("Shared memory communication example completed\n");
    printf("Total data items processed: %d\n", shared.data_count);
    
    // Clean up
    pthread_mutex_destroy(&shared.mutex);
    pthread_cond_destroy(&shared.data_ready);
    
    return 0;
}

🛡️ Thread Safety and Best Practices

Building Reliable Multi-threaded Applications

Thread safety involves designing code that can be safely executed by multiple threads simultaneously. Following best practices is essential for building reliable and maintainable multi-threaded applications.

Thread Safety Philosophy

Thread safety follows the defensive programming principle—assume that code will be executed by multiple threads and design it to handle concurrent access safely.

Thread Safety Goals:

Correctness: Ensure correct behavior under all thread interleavings
Performance: Minimize synchronization overhead
Maintainability: Make thread-safe code easy to understand and modify
Reliability: Prevent race conditions and deadlocks
Scalability: Support increasing numbers of threads

Thread Safety Patterns

Immutable Data:

// Thread-safe immutable data structure
struct immutable_config {
    const int max_threads;
    const char *log_level;
    const double timeout;
};

// Global immutable configuration
static const struct immutable_config config = {
    .max_threads = 10,
    .log_level = "INFO",
    .timeout = 5.0
};

// Thread function using immutable data
void *thread_function(void *arg) {
    printf("Thread using config: max_threads=%d, log_level=%s, timeout=%.1f\n",
           config.max_threads, config.log_level, config.timeout);
    
    // No synchronization needed - data is immutable
    return NULL;
}

Thread-Local Storage:

#include <pthread.h>

// Thread-local storage key
static pthread_key_t thread_local_key;

// Thread-local data structure
struct thread_local_data {
    int thread_id;
    char thread_name[64];
    int local_counter;
};

// Destructor function for thread-local data
void thread_local_destructor(void *value) {
    struct thread_local_data *data = (struct thread_local_data *)value;
    printf("Cleaning up thread-local data for thread %d\n", data->thread_id);
    free(data);
}

// Initialize thread-local storage
int init_thread_local_storage(void) {
    if (pthread_key_create(&thread_local_key, thread_local_destructor) != 0) {
        perror("Failed to create thread-local key");
        return -1;
    }
    return 0;
}

// Get thread-local data (creates if doesn't exist)
struct thread_local_data *get_thread_local_data(void) {
    struct thread_local_data *data = pthread_getspecific(thread_local_key);
    
    if (data == NULL) {
        // Create new thread-local data
        data = malloc(sizeof(struct thread_local_data));
        if (data == NULL) {
            perror("Failed to allocate thread-local data");
            return NULL;
        }
        
        data->thread_id = (int)(long)pthread_self();
        snprintf(data->thread_name, sizeof(data->thread_name), "Thread_%d", data->thread_id);
        data->local_counter = 0;
        
        pthread_setspecific(thread_local_key, data);
    }
    
    return data;
}

Thread Safety Best Practices

1. Minimize Shared State:

// Good: Minimal shared state
struct thread_safe_counter {
    pthread_mutex_t mutex;
    int count;
};

// Bad: Excessive shared state
struct thread_unsafe_counter {
    int count;
    char description[256];
    double last_update;
    int update_count;
    // ... many more fields
};

2. Use Appropriate Synchronization:

// Good: Use mutex for complex critical sections
pthread_mutex_lock(&mutex);
// Complex operation on shared data
pthread_mutex_unlock(&mutex);

// Good: Use atomic operations for simple operations
__sync_fetch_and_add(&counter, 1);

// Bad: No synchronization
shared_variable++;  // Race condition!

3. Avoid Deadlocks:

// Good: Consistent lock ordering
void safe_operation(struct data *data1, struct data *data2) {
    if (data1 < data2) {
        pthread_mutex_lock(&data1->mutex);
        pthread_mutex_lock(&data2->mutex);
    } else {
        pthread_mutex_lock(&data2->mutex);
        pthread_mutex_lock(&data1->mutex);
    }
    
    // Critical section
    
    pthread_mutex_unlock(&data1->mutex);
    pthread_mutex_unlock(&data2->mutex);
}

// Bad: Potential deadlock
void unsafe_operation(struct data *data1, struct data *data2) {
    pthread_mutex_lock(&data1->mutex);
    pthread_mutex_lock(&data2->mutex);  // Could deadlock!
    
    // Critical section
    
    pthread_mutex_unlock(&data2->mutex);
    pthread_mutex_unlock(&data1->mutex);
}

🚀 Advanced Threading Techniques

Beyond Basic Threading

Advanced threading techniques involve sophisticated patterns and optimizations for high-performance multi-threaded applications. These techniques are essential for building scalable embedded systems.

Thread Pools

Thread pools manage a fixed number of worker threads to handle tasks efficiently:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

// Task structure
struct task {
    void (*function)(void *);
    void *arg;
    struct task *next;
};

// Thread pool structure
struct thread_pool {
    pthread_t *threads;
    int thread_count;
    struct task *task_queue;
    pthread_mutex_t queue_mutex;
    pthread_cond_t queue_condition;
    int shutdown;
    int active_threads;
};

// Create thread pool
struct thread_pool *thread_pool_create(int thread_count) {
    struct thread_pool *pool = malloc(sizeof(struct thread_pool));
    if (!pool) return NULL;
    
    pool->threads = malloc(thread_count * sizeof(pthread_t));
    if (!pool->threads) {
        free(pool);
        return NULL;
    }
    
    pool->thread_count = thread_count;
    pool->task_queue = NULL;
    pool->shutdown = 0;
    pool->active_threads = 0;
    
    if (pthread_mutex_init(&pool->queue_mutex, NULL) != 0 ||
        pthread_cond_init(&pool->queue_condition, NULL) != 0) {
        free(pool->threads);
        free(pool);
        return NULL;
    }
    
    // Create worker threads
    for (int i = 0; i < thread_count; i++) {
        if (pthread_create(&pool->threads[i], NULL, worker_thread_function, pool) != 0) {
            thread_pool_destroy(pool);
            return NULL;
        }
    }
    
    return pool;
}

// Worker thread function
void *worker_thread_function(void *arg) {
    struct thread_pool *pool = (struct thread_pool *)arg;
    
    while (1) {
        pthread_mutex_lock(&pool->queue_mutex);
        
        // Wait for tasks
        while (pool->task_queue == NULL && !pool->shutdown) {
            pthread_cond_wait(&pool->queue_condition, &pool->queue_mutex);
        }
        
        if (pool->shutdown && pool->task_queue == NULL) {
            pthread_mutex_unlock(&pool->queue_mutex);
            break;
        }
        
        // Get task from queue
        struct task *task = pool->task_queue;
        pool->task_queue = task->next;
        
        pthread_mutex_unlock(&pool->queue_mutex);
        
        // Execute task
        task->function(task->arg);
        free(task);
    }
    
    return NULL;
}

// Submit task to thread pool
int thread_pool_submit(struct thread_pool *pool, void (*function)(void *), void *arg) {
    struct task *task = malloc(sizeof(struct task));
    if (!task) return -1;
    
    task->function = function;
    task->arg = arg;
    task->next = NULL;
    
    pthread_mutex_lock(&pool->queue_mutex);
    
    // Add task to queue
    if (pool->task_queue == NULL) {
        pool->task_queue = task;
    } else {
        struct task *current = pool->task_queue;
        while (current->next != NULL) {
            current = current->next;
        }
        current->next = task;
    }
    
    pthread_cond_signal(&pool->queue_condition);
    pthread_mutex_unlock(&pool->queue_mutex);
    
    return 0;
}

Lock-free Programming

Lock-free programming uses atomic operations to achieve synchronization without locks:

#include <stdint.h>
#include <stdatomic.h>

// Lock-free stack implementation
struct lock_free_stack_node {
    void *data;
    struct lock_free_stack_node *next;
};

struct lock_free_stack {
    _Atomic(struct lock_free_stack_node *) head;
};

// Push operation (lock-free)
void lock_free_stack_push(struct lock_free_stack *stack, void *data) {
    struct lock_free_stack_node *new_node = malloc(sizeof(struct lock_free_stack_node));
    new_node->data = data;
    
    struct lock_free_stack_node *old_head;
    do {
        old_head = atomic_load(&stack->head);
        new_node->next = old_head;
    } while (!atomic_compare_exchange_weak(&stack->head, &old_head, new_node));
}

// Pop operation (lock-free)
void *lock_free_stack_pop(struct lock_free_stack *stack) {
    struct lock_free_stack_node *old_head;
    struct lock_free_stack_node *new_head;
    
    do {
        old_head = atomic_load(&stack->head);
        if (old_head == NULL) return NULL;
        
        new_head = old_head->next;
    } while (!atomic_compare_exchange_weak(&stack->head, &old_head, new_head));
    
    void *data = old_head->data;
    free(old_head);
    return data;
}

📊 Performance and Debugging

Optimizing and Troubleshooting Multi-threaded Applications

Performance optimization and debugging are crucial for building efficient multi-threaded applications. Understanding how to measure performance and identify problems is essential for embedded development.

Performance Measurement

Thread Profiling:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>

// Performance measurement structure
struct thread_performance {
    pthread_t thread_id;
    struct timeval start_time;
    struct timeval end_time;
    int work_count;
};

// Get current time in microseconds
long long get_time_us(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return tv.tv_sec * 1000000LL + tv.tv_usec;
}

// Thread function with performance measurement
void *performance_thread_function(void *arg) {
    struct thread_performance *perf = (struct thread_performance *)arg;
    
    gettimeofday(&perf->start_time, NULL);
    
    // Simulate work
    for (int i = 0; i < 1000000; i++) {
        perf->work_count++;
        // Some computation
        volatile int x = i * i;
        (void)x;  // Suppress unused variable warning
    }
    
    gettimeofday(&perf->end_time, NULL);
    
    return NULL;
}

// Performance measurement example
int performance_measurement_example(void) {
    const int thread_count = 4;
    pthread_t threads[thread_count];
    struct thread_performance perf[thread_count];
    
    printf("Starting performance measurement with %d threads\n", thread_count);
    
    // Create threads
    for (int i = 0; i < thread_count; i++) {
        perf[i].work_count = 0;
        if (pthread_create(&threads[i], NULL, performance_thread_function, &perf[i]) != 0) {
            perror("Failed to create thread");
            return -1;
        }
    }
    
    // Wait for threads to complete
    for (int i = 0; i < thread_count; i++) {
        pthread_join(threads[i], NULL);
    }
    
    // Calculate and display performance metrics
    long long total_time = 0;
    int total_work = 0;
    
    for (int i = 0; i < thread_count; i++) {
        long long start_us = perf[i].start_time.tv_sec * 1000000LL + perf[i].start_time.tv_usec;
        long long end_us = perf[i].end_time.tv_sec * 1000000LL + perf[i].end_time.tv_usec;
        long long duration = end_us - start_us;
        
        printf("Thread %d: %lld us, %d work items, %.2f work/us\n",
               i, duration, perf[i].work_count, (double)perf[i].work_count / duration);
        
        total_time += duration;
        total_work += perf[i].work_count;
    }
    
    printf("Total: %lld us, %d work items, %.2f work/us\n",
           total_time, total_work, (double)total_work / total_time);
    
    return 0;
}

Common Threading Problems and Solutions

1. Race Conditions:

// Problem: Race condition
int counter = 0;

void *unsafe_thread_function(void *arg) {
    for (int i = 0; i < 1000; i++) {
        counter++;  // Race condition!
    }
    return NULL;
}

// Solution: Use mutex
pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;

void *safe_thread_function(void *arg) {
    for (int i = 0; i < 1000; i++) {
        pthread_mutex_lock(&counter_mutex);
        counter++;
        pthread_mutex_unlock(&counter_mutex);
    }
    return NULL;
}

2. Deadlocks:

// Problem: Potential deadlock
void unsafe_function(pthread_mutex_t *mutex1, pthread_mutex_t *mutex2) {
    pthread_mutex_lock(mutex1);
    pthread_mutex_lock(mutex2);  // Could deadlock!
    
    // Critical section
    
    pthread_mutex_unlock(mutex2);
    pthread_mutex_unlock(mutex1);
}

// Solution: Consistent lock ordering
void safe_function(pthread_mutex_t *mutex1, pthread_mutex_t *mutex2) {
    if (mutex1 < mutex2) {
        pthread_mutex_lock(mutex1);
        pthread_mutex_lock(mutex2);
    } else {
        pthread_mutex_lock(mutex2);
        pthread_mutex_lock(mutex1);
    }
    
    // Critical section
    
    pthread_mutex_unlock(mutex1);
    pthread_mutex_unlock(mutex2);
}

🎯 Conclusion

Multi-threading provides a powerful mechanism for building concurrent applications in Linux. Understanding pthread programming, thread synchronization, and best practices is essential for creating reliable and efficient embedded systems.

Key Takeaways:

Threads provide lightweight concurrency within a single process
POSIX Threads (pthread) offer a standard interface across systems
Proper synchronization prevents race conditions and ensures data consistency
Thread safety requires careful design and defensive programming
Advanced techniques like thread pools and lock-free programming optimize performance
Performance measurement and debugging are crucial for optimization

The Path Forward:

As embedded systems become more complex and multi-core processors become standard, the importance of multi-threading skills will only increase. Modern systems continue to evolve, providing new threading primitives and optimization techniques that enable more powerful and efficient concurrent applications.

Remember: Multi-threading is not just about creating threads—it’s about understanding how to coordinate concurrent execution, manage shared resources safely, and build applications that can efficiently utilize multiple CPU cores. The skills you develop here will serve you throughout your embedded systems career, enabling you to create robust, efficient, and scalable systems.