Parallelism In The SeqAn Library: Schedule
Tentative Schedule
Tentative total: 25 weeks.
Review of existing literature and other implementations (2–4 weeks)
Cleanup of module find
(2–3 weeks)
Write benchmark comparing existing SeqAn
find
implementation using iterators with straightforward array-based implementation.
- Review of code so far in folder
find2
.
- Minimal interface design for algorithms to test.
- Create unit tests.
- Implement algorithms:
- naive
- Horspool
- Shift-Or
- (BNDM/BOM?)
- Myers’ bit vector
- Generate benchmark data set (contigs).
- Create test bench.
- Gather data.
- Adapt to SeqAn interface.
- Benchmark, compare with above implementations
- Analysis of strengths and weaknesses of current SeqAn implementation (???)
1D load balancing (!)
- Conceptual draft.
- Implementation outside of SeqAn (
find
implementations) using existing libraries (e.g. MCSTL, TBB …).
- Benchmark.
- Implementation of a framework in SeqAn.
- Compare with above benchmark, work out strengths and weaknesses in existing libraries.
Divide and conquer techniques (!)
Pipeline techniques (nice to have)
- Conceptual draft.
- Exemplary implementation.
- Benchmark.
Master/worker schemas (nice to have)
- Conceptual draft.
- Exemplary implementation.
- Benchmark.
Various parallel find
implementations (2–3 weeks)
- Adapt the patterns for parallelization to the
find
algorithms.
- Find task/data decompositions.
- Divide and conquer
- (Master/worker)
- Find data dependencies.
- Plan general parallelization pattern in pseudocode.
- Specify parallel finder in pseudocode.
- Develop appropriate interfaces for the parallel
find
algorithms.
- Implement the different approaches.
- (Study trade-offs of different implementations and interfaces.)
- Explore different approaches for load balancing (see OpenMP)
- static
- dynamic
- work stealing
Collect ideas about generalizing code into a parallel framework (2 weeks)
Exploration of parallelization ideas for more parts of SeqAn and implementation (4–6 weeks)
Experimental study, comparison with other implementations (2–3 weeks)
Thesis write-up (4 weeks)