
Page SnippetsAnneKatrin

This is where I write my weekly goals.

all time todos:

  • make splitRazers project page
  • add link in seqanswers forum srp thread
  • check sensitivity at 4x coverage and compare with dindel

6.12. - 10.12.

PLANNED
  • Monday: write split mapping + SNP/indel calling part in Mat&Meth section. go through comments regarding miscalls/missing indels. look at simulation pipeline results, complete runs
  • Tuesday: include supersplat, include a generic indel detector
  • Wednesday: write, write, plot, write
  • Thursday: real results! make plan! --> Stefan (show first draft to him)
  • Friday: send draft to Knut, should contain: methods, simresults, + a bit of intro

REALIZED:
  • Monday: JC, cafeteria, commented on the comments, read dindel paper! --> they observe homopolymer indels and otherwise very low indel rates --> adapted simulation: mismatch probs on default, pi=pd=0.0002 (which still seems relatively high), started on queue. wrote half of MM text while falling asleep
  • Tuesday: finished MM text. cluster down → restart simulation pipeline asap. took closer look at supersplat → can't do alignment errors + output is super raw, a lot of processing would have to be done --> out!

29.11. - 3.12.

PLANNED
  • Monday: adapt indelSimulator to simulate known indels within ranges (guaranteeing a certain number of indels per size range), set up smaller pipeline for testing purposes
  • Tuesday: replace readsim with mason, include gsnap, bwa into pipeline
  • Wednesday: include supersplat in pipeline, read up on other SE indel detection software
  • Thursday: produce some example results, indelPercThresh 0/0.5, include 454 in pipeline
  • Friday: meeting with Knut at 10.30, make new plan
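
A minimal sketch of the Monday item (simulating known indels with a guaranteed count per size range). This is not the actual indelSimulator code; the function name `simulate_indels`, its parameters, and the choice to keep indels non-overlapping are assumptions made for illustration.

```python
import random

def simulate_indels(ref_len, size_ranges, per_range, seed=0):
    """Place exactly `per_range` indels for every (min_len, max_len) size range.

    Hypothetical helper: returns a sorted list of (position, kind, length).
    Indels are kept non-overlapping on the reference (an assumption).
    """
    rng = random.Random(seed)
    occupied = []  # reference intervals (start, end) already reserved
    indels = []
    for lo, hi in size_ranges:
        placed = 0
        attempts = 0
        while placed < per_range:
            attempts += 1
            if attempts > 10000 * per_range:
                raise RuntimeError("reference too small for requested indels")
            length = rng.randint(lo, hi)  # inclusive on both ends
            kind = rng.choice(("insertion", "deletion"))
            pos = rng.randrange(0, ref_len - length)
            start, end = pos, pos + length
            # reject candidates that overlap an already reserved interval
            if any(s < end and start < e for s, e in occupied):
                continue
            occupied.append((start, end))
            indels.append((pos, kind, length))
            placed += 1
    return sorted(indels)

if __name__ == "__main__":
    for indel in simulate_indels(100_000, [(1, 5), (6, 20), (21, 50)], 3):
        print(indel)
```

Fixing the count per range (instead of drawing lengths from one distribution) ensures that long indels, which would otherwise be rare, are represented when sensitivity is evaluated per size class.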

REALIZED (or what I did instead..)
  • Monday:
    • resolved snpStore issues for Stefan → input file was corrupt
    • discovered another pitfall for indel calling and realignment: read clipping may lose indels when clipped read is pairwise aligned to ref → minNumIndels required for realignment may not be reached → no indel call (solution: make indelThreshold for realignment independent from indelThreshold for indel calling + keep indel info for clipped reads (todo: friday))
    • fixed addInterval
    • read about indel rates: SNPs should outnumber indels by about 8-10:1
    • experimented with mason
  • Tuesday:
    • adapted indelSimulator
    • replaced readsim with mason
  • Wednesday:
    • set up small test pipeline, included gsnap, bwa
    • tested.. can't find sensible bwa settings
  • Thursday:
    • added 454 simulation to pipeline
  • Friday:
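
The clipping pitfall noted on Monday can be sketched roughly as follows. This is an illustration only, not the snpStore implementation: the `Read` class, `indel_support`, `plan_window`, and both threshold parameters are hypothetical names. The idea is that the threshold that triggers realignment is separate from (and can be lower than) the threshold for calling, and that clipped reads keep their indel information.

```python
from dataclasses import dataclass

@dataclass
class Read:
    has_indel: bool  # indel seen in the read's original alignment
    clipped: bool    # read was clipped before pairwise alignment to the reference

def indel_support(reads, keep_clipped_indels):
    """Count indel-supporting reads; clipped reads only count if their indel info is kept."""
    return sum(1 for r in reads
               if r.has_indel and (keep_clipped_indels or not r.clipped))

def plan_window(reads, realign_min_indels, call_min_indels, keep_clipped_indels=True):
    """Return (realign, call_candidate) using two independent thresholds."""
    support = indel_support(reads, keep_clipped_indels)
    return support >= realign_min_indels, support >= call_min_indels

if __name__ == "__main__":
    reads = [Read(True, True), Read(True, True), Read(True, False), Read(False, False)]
    print(plan_window(reads, realign_min_indels=2, call_min_indels=3))
    print(plan_window(reads, 2, 3, keep_clipped_indels=False))
```

With the indel info dropped for clipped reads, the same window no longer reaches either threshold, which is the "no indel call" situation described above.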

See progress report for the last 6 months.. all big todos are done!

big todos:
  • indel calling on split-aligned reads in SeqAn
  • gapped prefix/suffix alignment in razersSpliced
  • realignment in snp calling

medium todos:
  • generalized gff-to-fragmentStore parsing

small todos:
  • open exchange calendar (IE from home)

25.5. - 28.5.

PLANNED
  • Tuesday: continued debugging
  • Wednesday: ---
  • Thursday: debugging/testing/code-cleaning
  • Friday: 10.30 Kerstin

do also:
  • make a more robust indel calling program --> collect positions in perl script or make c program!!!

17.5. - 21.5.

PLANNED
  • Monday: make presentation, write review, check index construction in uniqueReads.cpp (done, sent to David)
  • Tuesday: JC talk
  • Wednesday: check Cougar's indels, sent improved microRazerS to Ho-Ryun, make split mapping run again
  • Thursday: Kerstin 10.30 (look at CNV-HMM beforehand!), answer cougar
  • Friday: wrote tests for split alignment (still buggy obviously...)

3.5. - 7.5.

PLANNED
  • process 454 data (doing!)
  • debug split alignment for edit distance
  • make plan for split-mapping results!!!!!!!!
  • answer kerstin, sabrina, lena
  • continue writing paper

26.4. - 30.4.

PLANNED
  • process 454 data (doing!)
  • finish split alignment for edit distance (done!)
  • make plan for split-mapping results!!!!!!!!

small todo for Tuesday evening: start split mapping on A14 unmapped batches!!! answer Sabrina, answer Lena (A14 mappings, SNPs, indels, + readme! + mapped split?)

19.4. - 23.4.

PLANNED
  • process 454 data
  • finish split alignment for edit distance!!!
  • make plan for split-mapping results for ismb/paper?

REALIZED
  • split edit debugging....

12.4. - 16.4.

PLANNED
  • process 454 data
  • finish split alignment for edit distance!!!
  • make plan for split-mapping results for ismb/paper?
  • read and take notes about other split-mappers
  • look into realignment again...

monday: send travel fellowship application, put progress report online, answer lena, finish edit split mapping!

5.4. - 8.4.

PLANNED
  • progress report
  • process 454 data
  • split alignment for edit distance!

22.3. - 26.3.

PLANNED
  • split alignment for edit distance! --> not done yet
  • reprocess some patients with new split alignment (done)
  • meet Kerstin and maybe try simple CNV detection on our data (not ready for testing yet)
  • 25. & 26.: Lena from Tübingen

15.3. - 19.3.

PLANNED
  • write poster abstract
  • resolve pindel calling problems
  • finish SeqAn tutorial

REALIZED
  • sent abstract to ISMB
  • resolved split alignment pindel calling problems
  • integrated random match estimate into split alignment
  • tutorial as good as finished

8.3. - 12.3.

PLANNED
  • make plan for poster!
  • breakpoint statistics: adapt parameters and check influence of windowlength, make plots
  • continue SeqAn tutorial

REALIZED
  • expected number of random matches is extremely low for 76bp reads with 23bp minMatchLen, even with errors, and even on whole genome scale → iid model probably not good enough to approximate real-genome-situation
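
A back-of-the-envelope sketch of the iid estimate mentioned above. Everything except the 23bp minMatchLen is an assumption: uniform base frequencies, a genome length of 3 Gbp, both strands searched, and errors counted as mismatches only (Hamming distance). The function name `expected_random_matches` is made up for illustration.

```python
from math import comb

def expected_random_matches(genome_len, match_len, max_mismatches=0, strands=2):
    """Expected number of random hits of one fixed match_len-mer in an iid uniform genome.

    A window matches if it differs from the query in at most max_mismatches positions.
    """
    # number of sequences within the Hamming ball, divided by all 4^match_len sequences
    p_match = sum(comb(match_len, i) * 3 ** i
                  for i in range(max_mismatches + 1)) / 4 ** match_len
    positions = genome_len - match_len + 1
    return strands * positions * p_match

if __name__ == "__main__":
    genome_len = 3_000_000_000  # assumed human-sized genome
    for k in range(3):
        print(k, expected_random_matches(genome_len, 23, k))
```

Under these assumptions an exact 23bp match is expected far less than once per genome, which is consistent with the observation above; real genomes contain repeats and compositional bias, so the iid model underestimates random matches.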

1.3. - 5.3.

PLANNED
  • fix split-alignment (done)
    • missing: maskDuplicates for spliced matches (done for hamming distance)
  • fix edit-distance-indel-calling bug (XLMR_1996, see Stefan's mail) (done)
  • continue SeqAn tutorial (continuing)
  • get snpStore ready for Marcel (done)
  • check out Marcel's breakpoint statistics, adapt parameters and check influence of windowlength (halfway done)

REALIZED
  • finished snpStore for Marcel, met on Tuesday
  • fixed edit-distance-indel-calling bug: wrong indel position was calculated for reverse gapped reads
  • seemingly fixed split-alignment, instead of maskDuplicates: directly discard duplicate prefix and duplicate suffix matches (only works for hamming distance)

22.2. - 26.2.

PLANNED
  • test split-alignment
  • implement indel calling on split-aligned reads
  • continue reading and taking notes
  • integrate indel calling on realigned reads
  • meet with Paz
  • do realignment on windows of varying size? inspect influence of window size. debug realigner?
  • SeqAn tutorial

REALIZED
  • split-alignment --> model gap costs with exponential function? → talked to Marcel about problem of read placement, + probabilities of random matches, expected numbers of reads → variant prediction
  • somehow produced a weird bug in split-alignment when integrating into up-to-date SeqAn
  • SeqAn tutorial: finished motif finding + started with alignments
  • met with Kerstin --> seqan basics, structs, fragmentStore
  • seminars: reseq meeting, genereg meeting, group meeting, Illumina Casava 1.6 talk by Oliver Goldenberg

15.2. - 19.2.

PLANNED
  • test split-alignment --> weird insertion/deletion calls (doing)
  • implement indel calling on split-aligned reads (not done)
  • continue reading and taking notes (well, sort of)
  • integrate indel calling on realigned reads (not done)
  • meet with Paz (next week)
  • do realignment on windows of varying size? inspect influence of window size. debug realigner? (not done)
  • meet with Marcel Grunert (done)
  • meet with Bernd Timmermann --> casava 1.6, heterogeneous dataset (done)

REALIZED

  • test split-alignment --> weird insertion/deletion calls: problem in sorting of matches which was done like this 1) minimizing number of errors (not even minimizing of distanceError was implemented!!!). instead
  • met with Marcel Grunert: need to adapt snpStore to incorporate readcounts given as gff-tag. do snp calling on microRNAs.
  • met with Bernd Timmermann

1.2. - 5.2.

PLANNED

  • fix razers uniqueness bug (thought it was done, but it wasn't...)
  • correct maxPile correction for indel calls (done)
  • integrate indel calling on realigned reads (not done)
  • meet with Paz (not done)
  • do realignment on windows of varying size? inspect influence of window size. debug realigner? (not done)
  • continue reading and taking notes (doing)

REALIZED

  • Monday: meeting with Kerstin, discussed different CNV tools
  • Tuesday: fixed razers uniqueness bug in all three versions
  • Wednesday: checked distribution of call quality scores and presented in reseq meeting
  • Thursday: integrated indel calling, now insertion/deletion separate in gff
  • Friday: tested snpStore and razerS

25.1. - 29.1.

PLANNED

major:
  • test/debug snp calling on realigned reads (doing)
  • read SV detection papers and take notes in wiki (doing now)

minor:
  • answer Tübingen people (done)
  • read MA Vipul (done)

REALIZED

  • Realignment: problem is in including the reference sequence into the multi read alignment. If includeReference == true this only means that the original reference sequence is stored as an artificial read that is NOT realigned. if one includes the artificial reference-read in the realignment process, the reference is not given enough weight, ie many artificial snp calls may be the result. instead one needs to do a pairwise alignment of reference and consensus sequence, to get a better reference-to-reads alignment.
--> Did that. Results are worse. TODO: inspect cases to find cause. maybe a problem with static windowlength?
  • Inspected quality scores of SNP calls: the more FP calls, the more their distribution resembles a geometric distribution. GC bias is also detected in high-quality calls, even gets stronger with increasing quality for geometrically distributed quality scores. --> base call quality value recalibration needed? (to correct for nucleotide specific bias, downweight reads with many errors, ie assign mapping quality) more stringent clipping? (stefan)

Topic revision: r34 - 07 Dec 2010, AnneKatrinEmde
