General info
In this part of the practical course on sequence analysis, you will prototype NGS analysis programs
with SeqAn, an existing C++ library of efficient data types and algorithms.
Date | Content | Lecturer
05.06 | Introduction to SeqAn | Knut Reinert, Jochen Singer
12.06 | Programming NGS tools in SeqAn | Jochen Singer
Day 1 (05.06)
- General introduction of goals of this unit
- Assignment 1: Install SeqAn on your computer (follow instructions here)
- Assignment 2: Work through the First Steps tutorial with guidance.
- Assignment 3: Program a first app that uses the command line parser
- Assignment 4: Work through the sequence IO Tutorial
- Assignment 5: Adapt your first app so that it can read FASTQ files.
- Assignment 6: Program a simple quality trimming (easy version: just cut a number of bases at the end)
- (optional): Make your functions template functions (so that they can be reused)
- (optional): Adapt the trimming function of the trimmer so that all bases at the end of the read whose quality is below a certain threshold are removed.
- (optional): Adapt the trimming function of the trimmer so that a window of a specified length is shifted from the beginning to the end of the read and the average quality within the window is used to trim the read.
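The two optional trimming strategies above can be sketched as follows. This is a minimal illustration in plain standard C++, not SeqAn code and not the course solution: the function names `phredScore`, `trimTailBelow` and `trimSlidingWindow` are hypothetical, and the sketch assumes that qualities are given as a Phred+33 encoded string, as in common FASTQ files. Both functions return the number of bases to keep, so the caller can shorten the sequence and the quality string together.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: converts one FASTQ quality character to a Phred
// score, ASSUMING the Phred+33 encoding.
inline int phredScore(char c)
{
    return static_cast<int>(c) - 33;
}

// Threshold trimming: returns how many bases to keep when all trailing
// bases with a quality below `threshold` are removed.
// TQual can be any container with size() and operator[] yielding chars.
template <typename TQual>
std::size_t trimTailBelow(TQual const & qual, int threshold)
{
    std::size_t keep = qual.size();
    while (keep > 0 && phredScore(qual[keep - 1]) < threshold)
        --keep;
    return keep;
}

// Sliding-window trimming: a window of length `window` moves from the
// beginning to the end of the read. The read is cut at the start of the
// first window whose mean quality is below `threshold`.
// Reads shorter than the window are left untouched in this sketch.
template <typename TQual>
std::size_t trimSlidingWindow(TQual const & qual, std::size_t window, int threshold)
{
    assert(window > 0);
    std::size_t const n = qual.size();
    if (n < window)
        return n;

    // Compare sums instead of means to avoid a division per window.
    long const limit = static_cast<long>(threshold) * static_cast<long>(window);
    long sum = 0;
    for (std::size_t i = 0; i < window; ++i)
        sum += phredScore(qual[i]);

    std::size_t start = 0;
    while (true)
    {
        if (sum < limit)
            return start;              // first low-quality window found
        if (start + window >= n)
            return n;                  // last window passed, keep everything
        // Slide by one base: add the entering base, drop the leaving one.
        sum += phredScore(qual[start + window]) - phredScore(qual[start]);
        ++start;
    }
}
```

For the quality string `IIIII###` (five bases of quality 40 followed by three of quality 2) and a threshold of 20, `trimTailBelow` keeps 5 bases and `trimSlidingWindow` with a window of 4 keeps 4. Because both functions are templates, they should also work on other container types that provide `size()` and `operator[]`.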
Day 2 (12.06)
- Introduction to adapter removal, read mapping
- Assignment 1: Program a de-multiplexer
- Write a simple de-multiplexer that retrieves all reads with a certain barcode
- The barcode has to be provided by a file
- (optional) Use a file with multiple barcodes (only select a read if it is the best match to the specified adapter)
- (optional) Use a file with multiple barcodes and create several output files, one for each adapter
- Assignment 2: Program an adapter removal tool
- Sometimes part of the adapter is contaminating a read and therefore has to be removed
- Write an app (similar to the quality trimmer) that reads a read file, removes adapters from the reads and writes the result to a file.
- The adapter sequence can either be read from file or taken from the command line.
- (optional) Allow errors in the adapter sequence.
- Assignment 3: Work through the index tutorial
- Assignment 4: Program simple read mapper
- The app should be based on a seed and extend approach
- Create the seeds (pigeonhole principle)
- Search for the seeds with the help of an index
- Verify the seeds - write a simple function which compares two sequences (or use the globalAlignment() function)
- (optional) Implement the verification with Myers verification
- (optional) Adjust the range of the verification to take edit distance into consideration
- (optional) Try different indices (the app should be based on templates)
- (optional) Implement a strategy to optimize the number of verifications.
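The seed-and-extend steps of Assignment 4 can be sketched as follows. This is a simplified illustration in plain standard C++ and not the course solution: `Seed`, `makeSeeds`, `verifyHamming` and `mapRead` are hypothetical names, `std::string::find` stands in for the index search that SeqAn would provide, and the verification only counts mismatches (Hamming distance), so indels are not handled. The pigeonhole principle is used in the usual way: if a read may contain at most k errors and is cut into k + 1 non-overlapping seeds, at least one seed must occur without error.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// A seed is a piece of the read, described by its offset and length.
struct Seed
{
    std::size_t offset;
    std::size_t length;
};

// Pigeonhole seeding: cut a read of length `readLength` into
// maxErrors + 1 non-overlapping pieces of (almost) equal length.
std::vector<Seed> makeSeeds(std::size_t readLength, std::size_t maxErrors)
{
    std::size_t const parts = maxErrors + 1;
    assert(readLength >= parts);  // every seed needs at least one base
    std::vector<Seed> seeds;
    std::size_t offset = 0;
    for (std::size_t i = 0; i < parts; ++i)
    {
        // The first (readLength % parts) seeds get one extra base.
        std::size_t len = readLength / parts + (i < readLength % parts ? 1 : 0);
        seeds.push_back({offset, len});
        offset += len;
    }
    return seeds;
}

// Simple verification: compare the read with the genome at `pos` and
// accept if at most `maxErrors` positions differ (mismatches only).
bool verifyHamming(std::string const & genome, std::size_t pos,
                   std::string const & read, std::size_t maxErrors)
{
    if (pos + read.size() > genome.size())
        return false;
    std::size_t errors = 0;
    for (std::size_t i = 0; i < read.size(); ++i)
        if (genome[pos + i] != read[i] && ++errors > maxErrors)
            return false;
    return true;
}

// Returns all genome positions where the read matches with at most
// `maxErrors` mismatches, in ascending order.
std::vector<std::size_t> mapRead(std::string const & genome,
                                 std::string const & read,
                                 std::size_t maxErrors)
{
    std::set<std::size_t> candidates;  // start positions already verified
    std::vector<std::size_t> hits;
    for (Seed const & s : makeSeeds(read.size(), maxErrors))
    {
        std::string const seed = read.substr(s.offset, s.length);
        // Stand-in for an index lookup: scan for all exact seed occurrences.
        for (std::size_t p = genome.find(seed); p != std::string::npos;
             p = genome.find(seed, p + 1))
        {
            if (p < s.offset)
                continue;  // the read would start before the genome
            std::size_t const start = p - s.offset;
            // Verify each candidate start position only once.
            if (candidates.insert(start).second &&
                verifyHamming(genome, start, read, maxErrors))
                hits.push_back(start);
        }
    }
    return std::vector<std::size_t>(std::set<std::size_t>(hits.begin(), hits.end()).begin(),
                                    std::set<std::size_t>(hits.begin(), hits.end()).end());
}
```

Remembering already verified start positions in `candidates` is one simple way to reduce the number of verifications. Replacing the `find` loop with a query against a real index and the Hamming check with an edit-distance verification would be the natural next steps towards the optional tasks.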
Resources and links
Solutions
Sources