Implement a prototype of a plan-based scheduler in the current Linux kernel version
Analyse the conditions of the plan (how many idle times there are and how they are distributed)
Weekly Status
Week 1 (CW 45)
Activities
Research on the architecture of current linux versions
Create a development infrastructure
Start implementation of the scheduler
Results
Created a QEMU image and scripts that build and execute the kernel
Implemented a prototype scheduler that switches from the PB-Scheduler to CFS after a certain time
Implemented a Linux module that starts the plan execution
Next Steps
Enhance the scheduler so that the switch back from CFS to PB-Scheduler is in time
Create a first table of contents
Modify the kernel & module so that the plan can be set by the module
Problems
Week 2 (CW 46)
Activities
Enhance prototype
Enhance development infrastructure
Create first table of contents
Results
Working prototype scheduler that switches from PB-Sched. → CFS/Idle and Idle/CFS → PB-Sched.
The plan can be set by a module
It is possible to set a new plan after the first one has been executed
Created an infrastructure script that converts a plan in CSV format into a Linux module plus Makefile
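The CSV-to-module conversion could look roughly like the following minimal sketch. The CSV layout (one `exec_time,idle_time` pair per row), the `plan_entry` struct, and its field names are assumptions made for illustration; the actual plan format and the kernel interface of the prototype are not described in this log. The Makefile text follows the usual out-of-tree kbuild pattern.

```python
import csv
import io

# Standard out-of-tree kbuild Makefile; {name} is the module's base name.
MAKEFILE_TEMPLATE = (
    "obj-m += {name}.o\n"
    "\n"
    "all:\n"
    "\tmake -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules\n"
    "\n"
    "clean:\n"
    "\tmake -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean\n"
)


def parse_plan(csv_text):
    """Parse rows of 'exec_time,idle_time' (assumed layout) into int tuples.

    Empty rows and rows whose first field starts with '#' are skipped.
    """
    plan = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0].strip().startswith("#"):
            continue
        exec_time, idle_time = (int(field) for field in row[:2])
        plan.append((exec_time, idle_time))
    return plan


def render_plan_array(plan):
    """Render the plan as a C array initializer.

    'struct plan_entry' and its fields are hypothetical names.
    """
    lines = ["static struct plan_entry plan[] = {"]
    for exec_time, idle_time in plan:
        lines.append(
            "\t{{ .exec_time = {}, .idle_time = {} }},".format(exec_time, idle_time)
        )
    lines.append("};")
    return "\n".join(lines) + "\n"


def render_makefile(name):
    """Return the Makefile text for a module called <name>.ko."""
    return MAKEFILE_TEMPLATE.format(name=name)
```

The generated array would still have to be embedded in a module source file that hands the plan to the kernel; that part depends on the prototype's own interface and is left out here.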
Next Steps
Start with theoretical work (e.g. find a definition of an unstable system)
Create script that executes all modules one after another and aggregates the results
Problems
Found a race condition in the prototype (which led to sporadic kernel panics), caused by a missing put_prev_task call in the pick_next_task method of the PB-Scheduler
Week 3 (CW 47)
Activities
Enhance test infrastructure
Start writing first sections
Start with analysis of criteria of a plan guaranteeing a stable system
Results
Test script that executes a set of modules and aggregates the results
First sections: Requirements of the PB-Scheduler prototype, Explanation of the abstract design of the Linux scheduler core functionality, Explanation of the implementation of the Linux scheduler (started)
Divided the stability criteria of a plan into two sub-problems: (1) "What is the maximal task length that does not cause problems for other applications?", (2) "What is the minimal idle time that allows running the processes from (1) as well as the processes depending on the plan?"
Next Steps
Continue with the definition of criteria
Continue writing the Explanation of the implementation of the Linux scheduler
Start analysis of different Queue states (CFS-Q full/empty, PB-Q full/empty)
Start writing to explain how the PB-Scheduler prototype is implemented in Linux
Problems
Running all tests showed that the delta between switching from idle to the PB-Scheduler and switching from CFS to the PB-Scheduler is still large. This could be fixed.
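A comparison of the two switch paths can be sketched as below, assuming the aggregated test results are available as (transition, delay) pairs. The record format and the labels "idle->pb" and "cfs->pb" are assumptions for illustration, not the format of the actual test script.

```python
from collections import defaultdict
from statistics import mean


def aggregate_switch_delays(records):
    """Group (transition, delay) records and summarize each group.

    Returns a dict mapping each transition label to its sample count,
    mean delay and maximum delay.
    """
    grouped = defaultdict(list)
    for transition, delay in records:
        grouped[transition].append(delay)
    return {
        transition: {"n": len(delays), "mean": mean(delays), "max": max(delays)}
        for transition, delays in grouped.items()
    }


def delay_gap(summary, a="idle->pb", b="cfs->pb"):
    """Difference of the mean switch delays: mean(b) - mean(a)."""
    return summary[b]["mean"] - summary[a]["mean"]
```

A positive `delay_gap` would mean that the switch from CFS to the PB-Scheduler takes longer on average than the switch from idle.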
Week 4 (CW 48)
Activities
Write first draft of chapter "Plan Based Linux Scheduler", results in the "Evaluation" chapter, assumptions in the "Plan" chapter
Install modified kernel on real hardware and execute all tests
Results
First (!) draft of the "Plan Based Linux Scheduler" chapter (containing: Requirements, Why Linux, Architecture of the current Linux scheduler, Module handling)
Installed Lubuntu (16.10 with a 4.8.0-59-generic) on real hardware, installed modified kernel. The system ran stable with the new kernel (even the X environment).
Executed tests to check whether the OS itself imposes a limit on task execution time. The results suggest that no such limit exists. Tested execution times of 1 min, 20 min and 60 min.
Next Steps
Improve drafts
Create test to determine the minimal free time of a plan (Approach: measure kernel thread exec. time during free time)
Week 5 (CW 49)
Activities
Enhance kernel: Measure kernel thread execution time in a certain period
Start tests with real HPC application to measure the amount of kernel thread execution time
Write first draft about the enhancement and the measurement results
Results
Enhanced kernel: Implemented measurement in __schedule() and required variables in the PB-runqueue structure
Wrote first results and description of the enhancement
Measured times for matrix multiplication and prime number generation with MPI (100 test runs per input)
Next Steps
Find a more representative app, define a testset, measure it, document environment, document results
If possible: Derive abstract result from the measurement results.
Problems
Finding a representative example HPC application
Running such an application is not easy in QEMU. Therefore it was necessary to run it on real hardware.
The first measurement results were wrong, because the idle thread is also a kernel thread, but the idle time should not be part of the kernel thread execution time.
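The idle-thread pitfall can be illustrated with a small user-space model. This is only a sketch: the `Slice` record and its fields are hypothetical and do not mirror the variables added to the PB-runqueue structure. It relies on the fact that the Linux idle task has PID 0.

```python
from dataclasses import dataclass

IDLE_PID = 0  # on Linux the per-CPU idle task (swapper) has PID 0


@dataclass
class Slice:
    """One scheduling interval observed during the measurement period."""
    pid: int
    is_kthread: bool   # True for kernel threads, which includes the idle task
    duration_ns: int


def kernel_thread_time(slices):
    """Sum the run time of kernel threads, excluding the idle task.

    Counting every kernel thread would also add the idle time, which is
    the error described above.
    """
    return sum(
        s.duration_ns
        for s in slices
        if s.is_kthread and s.pid != IDLE_PID
    )
```

Dropping the `s.pid != IDLE_PID` condition reproduces the inflated values of the first measurements, since idle periods typically dominate the free time of a plan.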
Week 6 (CW 1)
Activities
Write first version of the evaluation
Research for the first chapter that describes the plan creation, grid computing and the VRM
Results
Wrote first version of the evaluation
First sketches of the grid computing explanation
Next Steps
Clean up the second chapter so that the first chapter will be the remaining main task
Week 7 (CW 2)
Activities
Complete the code: rename variables, change indentation so that the code fits into LaTeX code boxes, test the changed code
Create a state transition diagram and integrate it into the TeX document
Correct the result charts (error bars extend below 0)
Add explanation of the task_struct
Add short explanation of each Linux scheduler module