C code available from paper (Suppl. Material):
→ download Phospho_PKA Training.zip for training a multilayer perceptron (MLP) yourself (done)
→ download AMS3_Consensus_distributable_v1.tar.gz for already trained MLPs to test expression patterns yourself (done)
- here you can find the phospho-PKA model as well as other post-translational modifications (PTMs)
→ original source code as an external project in OpenMS: (done)
→ you can use the attachment (https://svn.imp.fu-berlin.de/seqan/trunk/teaching/master/P4-2010/Proteomics/coding-projects/PTMs/Predictor.tar.gz) and then follow these instructions:
- change the path to OpenMS in line 19 of CMakeLists.txt, then type "cmake ." followed by "make" (ignore the warnings)
- follow the readme instructions (http://code.google.com/p/automotifserver/wiki/Predictor_1_3_ReadMe) to train a model using the example:
$ ./Predictor 0 2 20 2 A
- afterwards you can use the testing mode by changing the first parameter:
$ ./Predictor 1 2 20 2 A
C++ Class "PTMPredictor" with the following interface (suggestion): void predict(std::vector
Implement prediction of modification state for a given set of peptides and a trained PTM model.
The peptide set can be arbitrary, but should be realistic (e.g. digest a couple of human proteins and save as FASTA file). The trained PTM model can be obtained from the paper (or train yourself [ambitious]).
[22/06/2010] Proposed executable has the following interface:
PTMSimulator -in <my_peptides.FASTA> -out <my_peptides_with_PTMs.FASTA> -mod_p_threshold <float[0-1]> -mod_fraction <float[0-1]> -append
Arguments explained:
'in' - input FASTA file containing peptides/proteins (do not digest)
'out' - output FASTA file with the modified sequences containing PTMs
'mod_p_threshold' - a threshold (between 0 and 1) that a predicted modification must reach to be accepted
'mod_fraction' - the maximal fraction of the input set which receives a modification. If more sequences qualify, only the highest-scoring ones are modified.
'append' - concatenate the modified set with the original file before writing result to 'out'; be careful not to include unmodified sequences twice
If you have other ideas about how to control the state of modified sequences and/or the total fraction modified, let me know.
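The selection rule implied by 'mod_p_threshold', 'mod_fraction' and 'append' can be sketched in plain C++ as follows. This is only an illustration under stated assumptions, not part of OpenMS or of the Predictor code: the struct ScoredSequence, the functions selectModified() and buildOutput(), and the idea that each sequence carries a single best score are all hypothetical. It also assumes that the cap is rounded down (floor of mod_fraction times the number of input sequences) and that, without 'append', only the modified sequences are written.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: one input sequence plus the best modification
// score that a trained PTM model assigned to it (assumed to lie in [0, 1]).
struct ScoredSequence
{
  std::string id;
  std::string sequence;          // unmodified input sequence
  std::string modified_sequence; // same sequence with the predicted PTM annotated
  double score;                  // model output for the best site
};

// Returns the indices of all sequences that receive a modification:
// score >= mod_p_threshold, highest score first, and at most
// floor(mod_fraction * input.size()) entries (assumption: round down).
std::vector<std::size_t> selectModified(const std::vector<ScoredSequence>& input,
                                        double mod_p_threshold,
                                        double mod_fraction)
{
  assert(mod_p_threshold >= 0.0 && mod_p_threshold <= 1.0);
  assert(mod_fraction >= 0.0 && mod_fraction <= 1.0);

  std::vector<std::size_t> candidates;
  for (std::size_t i = 0; i < input.size(); ++i)
  {
    if (input[i].score >= mod_p_threshold) candidates.push_back(i);
  }

  // Highest-scoring candidates first; stable_sort keeps input order on ties.
  std::stable_sort(candidates.begin(), candidates.end(),
                   [&input](std::size_t a, std::size_t b)
                   { return input[a].score > input[b].score; });

  const std::size_t max_modified =
      static_cast<std::size_t>(std::floor(mod_fraction * input.size()));
  if (candidates.size() > max_modified) candidates.resize(max_modified);
  return candidates;
}

// Builds the list of sequences to write to 'out'. With append == true the
// original sequences are written once, followed by the modified ones, so
// no unmodified sequence appears twice.
std::vector<std::string> buildOutput(const std::vector<ScoredSequence>& input,
                                     const std::vector<std::size_t>& selected,
                                     bool append)
{
  std::vector<std::string> out;
  if (append)
  {
    for (const ScoredSequence& s : input) out.push_back(s.sequence);
  }
  for (std::size_t idx : selected) out.push_back(input[idx].modified_sequence);
  return out;
}
```

In a real tool, the scores would come from the trained model and the output entries would be written as FASTA records; both steps are left out here on purpose.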
In order to ease development, derive your executable from TOPPBase (see OpenMS/source/APPLICATIONS/TOPP/FileConverter.C for an easy example; just put your code into main_() and modify registerOptionsAndFlags_()).