1. Threads in UNIX |
With the UNIX thread interface, an application has one or more execution streams within the same process. Each execution stream is a thread of execution, or simply a thread. A UNIX process with a single thread can create additional threads within the same process address space.
A thread separates a process's sequential execution stream from its other resources. The resources are shared between the multiple execution streams. The threads execute as concurrent execution streams sharing the same address space performing tasks associated with the desired services.
To support separate execution streams, each thread requires a piece of executable code with a stack and data. The user level stack provides for per-thread local variable administration. From the OS perspective, each of the executing threads must have a separate kernel stack and process context. This will allow each thread to execute system calls, and to service interrupts and page faults independently.
Collectively, the threads executing in the same address space are called sibling threads. The term does not imply a parent and child relationship; sibling threads are peers.
User supported threads are represented by data structures within a process's own address space. These threads don't require direct support from the OS. A user level thread has the potential to execute only when associated with a kernel process. The implementation must therefore multiplex a user level thread on to a kernel process for execution and later preempt it in favor of running a sibling thread. The advantage of this implementation is that this does not require the process to change its mode from user to kernel in order to schedule multiplexed threads. This can permit thread creation, scheduling and eventual termination to complete with better performance than an equivalent kernel supported thread implementation.
A process can use one kernel thread per user level thread. The kernel thread is a lightweight process (LWP). An LWP can be thought of as a virtual CPU that is available for executing code or system calls. Each LWP is dispatched separately by the kernel, may perform independent system calls, may incur independent page faults, and may run in parallel with other LWPs on a multiprocessor architecture. The user level thread is bound to the kernel LWP, so scheduling, dispatching and executing the kernel LWP results in the execution of the bound user level thread. The disadvantage of bound user threads is that each user level thread requires the creation of a kernel LWP, which consumes kernel processing time. Similarly, when the user level thread blocks, its kernel LWP also blocks and cannot do anything else.
The default thread creation assumes a multiplexed thread and allocates a data structure in the process's user address space to represent the newly created thread. User level scheduling of the thread will map it to an available kernel LWP for execution. The OS will schedule the kernel LWP based on the characteristics of the mapped thread. Some implementations maintain a group of LWPs for this purpose, called a thread pool.
With multiplexed threads implemented at the user level, the UNIX process can have more threads than the available kernel LWPs.
The daemon thread attribute modifies how process termination occurs. A UNIX process terminates when the last thread exits or when the process voluntarily terminates by calling the exit() function. The daemon attribute affects the first situation: it permits the UNIX process to terminate when the last non-daemon thread exits. When this happens, the UNIX process silently terminates the daemon threads.
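The termination rule above can be sketched as a small predicate. This is only an illustration: `struct thread_info` and `process_should_exit()` are hypothetical names for bookkeeping that the thread library keeps internally, not part of any thread interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping record; not a thread-library type. */
struct thread_info {
    bool alive;   /* the thread has not exited yet */
    bool daemon;  /* the thread was created with the daemon attribute */
};

/* Returns true when no live non-daemon thread remains. At that point
 * the process terminates and any remaining daemon threads go with it. */
bool process_should_exit(const struct thread_info *threads, size_t n)
{
    assert(threads != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (threads[i].alive && !threads[i].daemon)
            return false;   /* a non-daemon thread is still running */
    }
    return true;
}
```

With one live non-daemon thread and one live daemon thread the predicate is false; once the non-daemon thread exits it becomes true, even though the daemon thread is still alive.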
The detached thread attribute modifies the availability of the thread's termination status. It indicates that the application program does not require the exit status of the detached thread.
When a thread modifies the value of a global variable, the new value is immediately visible to all the siblings. Also if one thread issues a chdir() function, all threads effectively change their current working directory.
Shared Resources | Private Resources |
Global Memory | Thread Identifier |
Current Working Directory | Interval Timers |
User Identification | Signal Mask |
File Descriptors | Registers |
Shell Environment | Errno |
Umask | Instruction Pointer |
Thread specific data is different from thread private data. The UNIX threads implementation defines the thread private data, while the application defines the thread specific data.
For example, an application designed with multiple threads can use a thread specific global error variable to report error conditions, so that an error recorded by one thread is not overwritten by its siblings.
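A per-thread error variable of this kind might look like the sketch below. It uses POSIX thread keys (`pthread_key_create`, `pthread_setspecific`, `pthread_getspecific`) as a portable stand-in for the interface described in this document, which is an assumption made so the example builds on common systems. The names `app_errno_key`, `set_app_error`, `get_app_error` and `demo_thread_specific_error` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_key_t app_errno_key;              /* one slot per thread */
static pthread_once_t key_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&app_errno_key, NULL);
}

/* Record an error code in the calling thread's own slot. */
void set_app_error(int code)
{
    pthread_once(&key_once, make_key);
    pthread_setspecific(app_errno_key, (void *)(intptr_t)code);
}

/* Read back the calling thread's own error code (0 if never set). */
int get_app_error(void)
{
    pthread_once(&key_once, make_key);
    return (int)(intptr_t)pthread_getspecific(app_errno_key);
}

static void *worker(void *arg)
{
    set_app_error((int)(intptr_t)arg);           /* affects this thread only */
    return (void *)(intptr_t)get_app_error();
}

/* The caller records 7, a sibling records 42; each sees only its own value.
 * Returns the value the sibling read back. */
int demo_thread_specific_error(void)
{
    pthread_t tid;
    void *seen = NULL;
    int rc;

    set_app_error(7);
    rc = pthread_create(&tid, NULL, worker, (void *)(intptr_t)42);
    assert(rc == 0);
    rc = pthread_join(tid, &seen);
    assert(rc == 0);
    assert(get_app_error() == 7);                /* caller's slot untouched */
    return (int)(intptr_t)seen;
}
```

The variable is global in name, since every thread reaches it through the same key, but each thread has its own storage behind that key.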
From the OS point of view, the process has one or more kernel LWPs that provide the execution for the threads. The threads are either multiplexed to these LWPs or are bound to them.
Multiplexed threads can execute only when mapped on to a kernel LWP. The mapping service will propagate the characteristics of the thread to the kernel LWP. The OS uses these characteristics to schedule and subsequently dispatch the kernel LWP for execution.
When the kernel LWP issues a kernel call on behalf of the multiplexed thread, the thread remains bound to the LWP until the kernel completes the call. If the kernel call blocks, the multiplexed thread along with its kernel LWP will also block.
If the kernel LWP from the pool is preempted by the OS or it voluntarily gives up the processor, the state of the thread is saved in the UNIX process's user address space. A new thread can then be mapped to the kernel LWP.
There is, in general, no way to predict how the instructions of different threads will be interleaved.
All runnable, unbound threads are on a user level, prioritized dispatch queue. Thread priorities range from 0 to the largest value representable in 32 bits, which is effectively unbounded. A thread's priority can be changed only by the thread itself or by another thread in the same process. The priority of an unbound thread is known only to the user level scheduler and not to the kernel.
An LWP in the pool is either idling or running a thread. When an LWP is idle, it waits on a synchronization variable. When a thread is made runnable, it is added to the dispatch queue and an idle LWP from the pool is awakened by signaling the synchronization variable. After waking up, the LWP switches to the highest priority thread on the dispatch queue. If the running thread blocks on a synchronization variable in the course of its execution, e.g. a mutex lock, the LWP puts the thread on the sleep queue and switches to the highest priority thread on the dispatch queue. If the dispatch queue is empty, the LWP goes back to its idle state in the LWP pool. Threads on the sleep queue become runnable when their synchronization locks are freed. When a thread becomes runnable, it is put back on the dispatch queue. If all the LWPs in the pool are busy when a thread becomes runnable, it remains on the dispatch queue waiting for an LWP to become available. An LWP becomes available either when a new one is added to the pool or when one of the running threads blocks, exits, or is preempted.
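The dispatch queue behaviour described above can be modelled in a few lines. This is a simplified sketch, not the library's data structure: `struct dispatch_queue`, `dq_make_runnable()` and `dq_dispatch()` are hypothetical names, the queue is a small fixed array, and it assumes that a larger number means a higher priority and that ties go to the thread queued first.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RUNNABLE 16

struct uthread { int id; unsigned priority; };

struct dispatch_queue {
    struct uthread slots[MAX_RUNNABLE];
    size_t count;
};

/* A thread became runnable: put it on the dispatch queue.
 * Returns 0 on success, -1 if the queue is full. */
int dq_make_runnable(struct dispatch_queue *q, int id, unsigned priority)
{
    assert(q != NULL);
    if (q->count == MAX_RUNNABLE)
        return -1;
    q->slots[q->count].id = id;
    q->slots[q->count].priority = priority;
    q->count++;
    return 0;
}

/* An LWP looks for work: remove and return the id of the highest
 * priority thread, or -1 if the queue is empty (the LWP goes idle). */
int dq_dispatch(struct dispatch_queue *q)
{
    size_t best = 0;
    int id;

    assert(q != NULL);
    if (q->count == 0)
        return -1;
    for (size_t i = 1; i < q->count; i++) {
        if (q->slots[i].priority > q->slots[best].priority)
            best = i;
    }
    id = q->slots[best].id;
    /* Close the gap while keeping the arrival order of the rest. */
    for (size_t i = best; i + 1 < q->count; i++)
        q->slots[i] = q->slots[i + 1];
    q->count--;
    return id;
}
```

A thread that blocks would simply not be re-queued until its lock is freed, at which point `dq_make_runnable()` puts it back.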
When a bound thread blocks on a process local synchronization variable, its LWP must also stop running. It does so by waiting on an LWP semaphore associated with the thread. The LWP is now parked. When the bound thread unblocks, the parking semaphore is signaled so that the LWP can continue executing the thread.
When an unbound thread becomes blocked and there are no more runnable threads, the LWP that was running the thread also parks itself on the thread's semaphore rather than idling on the idle stack and the global condition variable. This optimizes the case where the blocked thread becomes runnable quickly, because it avoids the context switch to the idle stack and back to the same thread.
2. Using Threads |
#include <thread.h>
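int thr_create(
    void *stack_base,
    size_t stack_size,
    void *(*start_routine)(void *),
    void *arg,
    long flags,
    thread_t *new_thread ); |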
The size of the thread stack is specified by the stack_size parameter. A stack size of zero results in a default stack size determined by the thread library. A specific stack size can be requested by passing a non-zero stack_size; the library uses this value to initialize the stack. When using a user defined stack, stack_size specifies the number of bytes allocated. In either case, a non-zero stack_size must be greater than or equal to the minimum stack size required. Otherwise, the thr_create() function fails and returns an EINVAL error condition.
STACK ADDRESS | STACK SIZE | RESULTS |
NULL | 0 | A default stack allocation will be used and the stack will have a default size. |
NULL | >= Minimum Stack Size | A default stack allocation will be used to create a stack of the requested size. |
<ADDRESS> | 0, >= Minimum Stack Size | The stack for the new thread begins at the specified address and is greater than or equal to the minimum stack size required by the implementation. |
The void *arg parameter is a pointer to the argument that is passed to the newly created thread. It can point to any kind of data, from a string to an array. It is directly accessible from the new thread, and the thread's code can interpret it as it wishes. Because the pointer is of type void *, it usually has to be cast to the appropriate type before use.
The default action of the thr_create() function is to create a multiplexed thread and to transition the thread to a runnable state. This default behavior can be modified through the flags parameter, which is a bitwise OR of zero or more of the available flags shown in the table below.
THR_SUSPENDED | When the thread is created, it enters the RUNNABLE state by default. With this flag set, the new thread will enter the SUSPENDED state. The thread will not begin execution till an explicit call to thr_continue() is made by a sibling thread. Creating a thread in this mode allows the application to modify the scheduling parameters and other resources associated with the new thread prior to the execution of the start_routine. |
THR_DETACHED | The exit status of the created thread will not be available to the sibling threads. This permits the UNIX thread library to reuse all of the thread's resources immediately after it terminates. |
THR_NEW_LWP | In addition to creating a new thread, a new LWP is to be created and added to the pool of available LWPs. The thr_create function does not create an LWP but merely requests an extra LWP in the LWP pool. The request may never be fulfilled. |
THR_BOUND | The thread is bound to a new kernel LWP. Both the thread and the kernel LWP will be created as a result of the call. The thread will be bound only to this LWP and cannot execute on any other LWP. Similarly, this LWP will execute only this thread and not any other. The thread is said to be permanently bound to the LWP. |
THR_DAEMON | The created thread is a daemon thread. A UNIX process will not exit until all non-daemon threads have exited, an explicit exit() has been called or the initial thread completes without calling thr_exit(). The UNIX thread library will cause the process to exit if the only remaining threads are daemon threads. |
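The following sketch shows how a flags value built with bitwise OR can be interpreted according to the table above. The `DEMO_THR_*` bit values are invented for illustration and are not the values from <thread.h>; real programs should use the library's own constants. The helper names are hypothetical as well.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; NOT the constants from <thread.h>. */
enum {
    DEMO_THR_SUSPENDED = 1 << 0,
    DEMO_THR_DETACHED  = 1 << 1,
    DEMO_THR_NEW_LWP   = 1 << 2,
    DEMO_THR_BOUND     = 1 << 3,
    DEMO_THR_DAEMON    = 1 << 4
};

#define DEMO_THR_ALL (DEMO_THR_SUSPENDED | DEMO_THR_DETACHED | \
                      DEMO_THR_NEW_LWP | DEMO_THR_BOUND | DEMO_THR_DAEMON)

/* Guard against bits that are not in the table. */
static void check_known(long flags)
{
    assert((flags & ~(long)DEMO_THR_ALL) == 0);
}

/* The new thread waits for thr_continue() instead of becoming runnable. */
bool starts_suspended(long flags)
{
    check_known(flags);
    return (flags & DEMO_THR_SUSPENDED) != 0;
}

/* Only threads created without the detached flag can be joined. */
bool is_joinable(long flags)
{
    check_known(flags);
    return (flags & DEMO_THR_DETACHED) == 0;
}

/* Daemon threads do not keep the process from exiting. */
bool keeps_process_alive(long flags)
{
    check_known(flags);
    return (flags & DEMO_THR_DAEMON) == 0;
}
```

For example, `DEMO_THR_BOUND | DEMO_THR_DETACHED` describes a bound thread that starts runnable, cannot be joined, and still keeps the process alive. A flags value of zero selects all the defaults.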
When a new thread has been created successfully, its thread identifier is stored in the location pointed to by the new_thread parameter. If the new_thread parameter is NULL, the new thread identifier is not returned to the calling thread.
EAGAIN | A system limit is exceeded, e.g., too many LWPs were created. |
ENOMEM | Not enough memory was available to create the new thread. |
EINVAL | stack_base is not NULL and stack_size is less than the value returned by thr_min_stack(). |
EINVAL | stack_base is NULL and stack_size is not zero and is less than the value returned by thr_min_stack(). |
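The two EINVAL conditions listed above can be expressed as a small pre-check. This is a sketch of the rule as stated in the error list, not the library's code: `check_stack_args()` is a hypothetical helper, and `min_stack` stands in for the value that thr_min_stack() would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Returns 0 if the stack arguments would be accepted under the two
 * EINVAL rules above, or EINVAL otherwise. */
int check_stack_args(const void *stack_base, size_t stack_size,
                     size_t min_stack)
{
    assert(min_stack > 0);

    /* Caller-supplied stack: the size must cover the minimum. */
    if (stack_base != NULL && stack_size < min_stack)
        return EINVAL;

    /* Library-allocated stack: zero means "use the default size";
     * any other size must cover the minimum. */
    if (stack_base == NULL && stack_size != 0 && stack_size < min_stack)
        return EINVAL;

    return 0;
}
```

Running such a check before calling thr_create() lets an application report a precise message instead of a bare EINVAL.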
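#include <thread.h>
#include <stdio.h>
#include <stdlib.h>
void *function(void *arg);
int main()
{
    thread_t id; int error;
    char *str = "Hello";                    /* illustrative argument */
    size_t size = thr_min_stack() + 2048;   /* illustrative stack size */
    void *address = malloc(size);
    if( address == NULL ) return 1;
    /* Create thread using default stack and size */
    error = thr_create(NULL, 0, function, NULL, 0, &id);
    if( error != 0 ) return 1;
    /* Create thread using a defined stack and size */
    error = thr_create(address, size, function, NULL, 0, &id);
    if( error != 0 ) return 1;
    /* Create a bound thread and request a new LWP. Pass the string str
       as the argument of the new thread. */
    error = thr_create(NULL, 0, function, str, THR_BOUND | THR_NEW_LWP, &id);
    if( error != 0 ) return 1;
    printf(" The thread id is: %d\n", (int) id);
    thr_exit(NULL);   /* let the created threads run to completion */
}
void *function(void *arg) { return arg; /* placeholder body */ } |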
#include <thread.h>
size_t thr_min_stack(void); |
/* Rest of code */
thr_create(NULL, thr_min_stack() + 2048, function, NULL, 0, NULL);
/* Rest of code */ |
#include <thread.h>
thread_t thr_self(void); |
/* Rest of code */
printf(" The thread id of this thread is : %d\n", (int) thr_self());
/* Rest of code */ |
NOTE : Only threads created without the THR_DETACHED flag can be joined.
#include <thread.h>
int thr_join(
    thread_t wait_for,
    thread_t *departed_thread,
    void **status ); |
If departed_thread is not NULL, it points to a location that is set to the ID of the terminated thread if thr_join() returns successfully.
If status is not NULL, it points to a location that is set to the exit status of the terminated thread if thr_join() returns successfully.
Multiple threads cannot wait for the same thread to terminate. One thread will return successfully and the others will fail with an ESRCH return value.
ESRCH | wait_for is not a valid, undetached thread in the current process. |
EDEADLK | wait_for specifies the calling thread. |
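The join pattern can be tried on systems without <thread.h> by using the POSIX equivalents. The sketch below substitutes `pthread_create()` and `pthread_join()` for thr_create() and thr_join(), which is an assumption made for portability. The worker `square()` and the wrapper `join_demo()` are hypothetical names. The worker's return value becomes its exit status, and the joining thread collects it through the status pointer.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Start routine: the returned pointer-sized value is the exit status. */
static void *square(void *arg)
{
    intptr_t n = (intptr_t)arg;
    return (void *)(n * n);
}

/* Creates a joinable thread, waits for it, and returns its exit status. */
int join_demo(int n)
{
    pthread_t tid;
    void *status = NULL;
    int rc;

    rc = pthread_create(&tid, NULL, square, (void *)(intptr_t)n);
    assert(rc == 0);

    rc = pthread_join(tid, &status);   /* blocks until the worker ends */
    assert(rc == 0);

    return (int)(intptr_t)status;
}
```

Packing a small integer into the pointer-sized status avoids allocating memory for the result. Larger results would normally be returned through a pointer to storage that outlives the thread.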
/* Rest of code */
thread_t Tid;
void *status = NULL;
int error;
/* Create a thread and get its TID */
thr_create( NULL, 0, function, arg, 0, &Tid);
/* Wait for the thread to join and get status */
error = thr_join( Tid, NULL, &status);
/* Print message if successful return */
if( (error == 0) && (status != NULL) )
    printf("Thread %d has successfully joined\n", (int) Tid);
/* Rest of code */ |
NOTE : Although returning from the start routine and calling thr_exit() have similar effects, it is considered good programming practice to end your threads by calling the thr_exit() function.
Involuntary thread termination is the result of a UNIX process terminating. Whether the UNIX process terminates by returning from main(), by calling exit() explicitly, or as a result of an exception handling routine, the UNIX threads associated with the process are considered to be involuntarily terminated. If a thread spawns new threads and wants all of them to complete, it should exit with a thr_exit() function call. This ensures that the process does not terminate until all non-daemon threads have finished.
#include <thread.h>
void thr_exit( void *status ); |
All thread specific data bindings are deleted during the termination process. Default user level stacks are reclaimed by the UNIX threads library.
NOTE : If the exiting thread was created with a user defined stack, then it is the application's responsibility to free this memory or to reuse it.
In case of voluntary thread termination, the thr_exit() function transitions the exiting thread from the ON_PROCESSOR state to the ZOMBIE state. The UNIX thread library retains minimal information regarding the thread, such as its TID and termination status value. If the terminating thread has the DETACHED attribute, then all information regarding the thread is released.
When a non-detached thread calls the thr_exit() function, or terminates by returning from its start_routine and a sibling thread is waiting on its termination, then the sibling thread will complete its thr_join() call.
The exiting thread can specify an exit status by setting the parameter status. This status will then be returned to the thread calling the thr_join() function. A thread returning from the start routine without an explicit call to the thr_exit() function will have its exit status set to the value returned by the start_routine. The last non-detached thread to terminate causes the UNIX process to terminate. The exiting status of this thread is passed to the exit() function as the status value for the terminating UNIX process.
/* Rest of the code */
int *status; if( error == 0)
/* Rest of the code */ |
The thr_getconcurrency() function call returns the minimum number of kernel LWPs in the multiplexing LWP pool. If the concurrency has not been set before, then this function returns a zero value to denote that the concurrency is being controlled automatically by the UNIX threads library. If the concurrency level has been explicitly adjusted before, then the function returns the current concurrency value.
#include <thread.h>
int thr_getconcurrency( void ); |
/* Rest of the code */
printf("The current concurrency level is: %d\n", thr_getconcurrency() );
/* Rest of the code */ |
#include <thread.h>
int thr_setconcurrency( int new_level ); |
At the other extreme, if the requested size is below the number of user-level threads in the RUNNABLE state, the UNIX threads library lowers the number of kernel-level LWPs in the multiplexing LWP pool to that value. In this way the size of the multiplexing LWP pool can be reduced to the size requested by the application.
void CheckConcurrency( void )
{
    int concurrency_level = 0;
    concurrency_level = thr_getconcurrency();
    ++concurrency_level;
    thr_setconcurrency( concurrency_level );
} |
When the thread is first created, the THR_SUSPENDED flag may be specified to immediately transition the new thread to the suspended state.
The other method is to explicitly call the thr_suspend() function to suspend the execution of a target thread.
#include <thread.h>
int thr_suspend( thread_t target_thread ); |
Some implementations of the UNIX threads interface use a notification scheme to alert the target thread of a pending suspension. The requesting thread does not directly suspend the target thread. Instead, it notifies the target thread that its suspension is pending. The requesting thread will then wait for the target thread to execute. When the target thread transitions to the ON_PROCESSOR state, it will see the pending suspension request and will release any critical implementation dependent resources it is holding. The target thread will then suspend itself, transitioning from the ON_PROCESSOR state to the SUSPENDED state.
A multithreaded application can have several threads trying to suspend the same target thread. The implementation of the UNIX threads library permits only one of the requesting threads to send the suspension request to the target thread. The other threads will recognize the suspension request and will not issue any of their own. They will simply wait till the target thread goes to the SUSPENDED state. All requesting threads will be blocked till the suspension.
A bound thread is suspended through the kernel-level LWP function and not thr_suspend(). This is done since the LWP is not used for any other thread, and thus cannot perform any useful work when the thread is suspended. It transitions from the ON_PROCESSOR state to the SUSPENDED state.
If the target thread is a multiplexed thread, then the target thread must first transition to the ON_PROCESSOR state before it can be transitioned to the SUSPENDED state. Thus if the target thread is currently in the SLEEP state, it must first transition to the RUNNABLE state and then to the ON_PROCESSOR state before going to the SUSPENDED state. Unlike a bound thread, now the free LWP will be made available to other multiplexed threads.
ESRCH | target_thread cannot be found in the current process. |
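The state transitions a multiplexed thread goes through on its way to suspension can be written as a small state machine. This is a model of the description above, not library code: the `T_*` constants mirror the state names used in the text, and `next_toward_suspend()` and `steps_to_suspend()` are hypothetical helpers.

```c
#include <assert.h>

enum tstate { T_SLEEP, T_RUNNABLE, T_ON_PROCESSOR, T_SUSPENDED };

/* One transition toward suspension for a multiplexed thread. The thread
 * must reach ON_PROCESSOR before it can notice the pending request. */
enum tstate next_toward_suspend(enum tstate s)
{
    switch (s) {
    case T_SLEEP:        return T_RUNNABLE;
    case T_RUNNABLE:     return T_ON_PROCESSOR;
    case T_ON_PROCESSOR: return T_SUSPENDED;
    default:
        assert(s == T_SUSPENDED);
        return T_SUSPENDED;   /* already suspended: nothing to do */
    }
}

/* Number of transitions needed before the thread is SUSPENDED. */
int steps_to_suspend(enum tstate s)
{
    int steps = 0;
    while (s != T_SUSPENDED) {
        s = next_toward_suspend(s);
        steps++;
    }
    return steps;
}
```

A sleeping thread therefore needs three transitions, a runnable thread two, and a thread that is already on a processor only one.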
#include <thread.h>
int thr_continue( thread_t target_thread ); |
If the target_thread is a multiplexed thread then it will be transitioned from the SUSPENDED state to the RUNNABLE state.
If the target_thread is a bound thread then it will be transitioned from the SUSPENDED state directly to the ON_PROCESSOR state.
ESRCH | target_thread cannot be found in the current process. |
#include <thread.h>
void thr_yield( void ); |
If a bound thread calls this function then it yields the processor to another thread with equal or higher priority.