THREAD SCHEDULING IN KAAPI
The execution of a KAAPI program on a cluster is done by a dynamic set of processes, one per multi-processor, communicating through a network. Each process has several threads of control to execute tasks. In KAAPI a thread represents a computation.
A thread could be an active control flow executing some computation or are inactive. At deployment step, one of the processes is designed to be the main process which starts the thread that executes the main task. This thread is called the main thread of the distributed execution. Next sections present the thread interface and the non-preemptive scheduling algorithm by work stealing. It shows connection from the KAAPI thread to data flow computation.
THREAD
Thread interface in KAAPI is a
subset of POSIX threads except for the functionalities about the management of
signal and cleanup routines. We assume that readers are familiar with POSIX
Thread and its interface. The interface for thread management is closed to the
POSIX threads: A thread may be created, but not joined; mutex and condition are
the first class objects for thread synchronization. KAAPI extends the POSIX
interface in two ways.
An extension of POSIX condition with
additional integer value state is proposed. It allows threads to wait until a
specific value is signaled: The thread is woken-up when the signaled value of
the condition is equal or not equal to the signaled value. This functionality
is closed to one of those Futex provides.
The second extension concerns two new thread attributes. The first one is used to define functions called by the thread scheduler to extract work when one of the processors is idle, we named it the steal function attribute. The second attribute allows to specify two functions used to serialize thread through the network, it is named the serialization attribute. Threads with serialization attribute may be migrated2 between processes on some predefined points.
It illustrates the creation of a KAAPI thread with the use of thread attribute. The steal function attribute parameters are: The first parameter is the KAAPI thread identifier on which the function is called; the second parameter is a pointer to a KAAPI thread identifier used to return a newly created thread that executes theft work; the third parameter contains the identifier of the idle virtual processor (also named kernel thread) that initiated the call. And the last parameter is a pointer to a user defined data structure. The return value is true if and only if a new thread has been created. Next section presents the thread scheduling algorithm based on work-stealing. 3.2 Scheduling by work-stealing The scheduling of KAAPI threads on the physical processors is a two level thread scheduling.
The KAAPI runtime system schedules non-preemptively several user level threads on a fixed number of virtual processors (also called kernel threads). Virtual processors are scheduled by the kernel on physical processors. The KAAPI threads model is M : N, meaning that M user space threads are scheduled onto N kernel threads.
This two thread scheduling levels take advantage of fast user space context switch between KAAPI threads while multi-processors are exploited with kernel threads. Each KAAPI user level thread runs to completion, i.e. in a non-preemptive manner. If a thread executes a blocking instruction, such as locking an already locked mutex or waiting for a condition variable, then it suspends: The kernel thread (or the virtual processor) is said idle and it switches to the scheduler thread.
The scheduler thread is responsible to assign thread to idle virtual processor. The algorithm is simple. First, it tries to find a suspended thread ready for execution, like a standard thread scheduler. In case of success, the elected thread resumes its execution on the idle kernel thread. Else, the scheduler thread chooses at random a thread and call, if specified, the steal function attribute, as defined in previous section.
If the function returns true then the newly created thread is scheduled on the kernel thread which becomes active. On multi-cores / multi-processors, each core should has a kernel thread for scheduling work: the work-stealing scheduling algorithm is naturally distributed and the implementation has been carefully designed to avoid contention (for instance due to mutex). It shows good speedup even at fine grain.
The scheduler thread is extended when group of processes execute the application. In case of failure, the kernel thread chooses at random a process and sends it request to steal work from its threads. On reception, the process trigger the request to try to steal work by calling the steal function attribute to several victim threads that have a serialization attribute. The reply message contains the output values, i.e. return value and the newly created thread.
To improve locality of the work-stealing decision, several steal operations on the local process are performed before emitting request to remote processes. This thread scheduling principle by work-stealing is the kernel of implementation in, where the steal function attribute could be specialized to the work that is theft. If a KAAPI thread performs blocking I/O, the kernel thread that executes it is also blocked. In, kernel scheduler activation is introduced to improve performance of the user level application.
This section presents how KAAPI deals with communication between processes using non-blocking, one-side, active message communications. On Unix platform (mostly Linux and MacOSX), the implementation relies on user space context switch, either by using make context/set context/ get context/ swap context, or the more available interface set jmp/ long jmp. Kernel threads are implemented by the native thread library Using KAAPI thread for data flow computation.
The high level API creates KAAPI threads that execute tasks following the reference order. Created threads have both the steal function attribute and the serialization attribute enabled. The steal function attribute is the most interesting function. It implements most of the work first principle of KAAPI design.
The thread changes the closure state to execute from created to running, pops it and executes it. This transition in the closure state is atomic with respect to other transition, such as performed by the steal function call. It is implemented using a compare and swap instruction. Moreover, it does not require the computation of the readiness of the closure thank to the reference order semantics.
3 Comments
Scheduling algorithm are very important for managing resources
ReplyDeleteKAAPI and POSIX is very well described
ReplyDeleteVery well 👏
ReplyDelete