OpenMS
MSSimulator

A highly configurable simulator for mass spectrometry experiments.

This implementation is described in

Bielow C, Aiche S, Andreotti S, Reinert K
MSSimulator: Simulation of Mass Spectrometry Data
Journal of Proteome Research (2011), DOI: 10.1021/pr200155f

The most important features are:

  • Simulation of capillary electrophoresis and HPLC as the separation step
  • Simulation of MS spectra
  • Simulation of MS/MS spectra with configurable precursor-selection strategy
  • Simulation of iTRAQ labels
  • Simulation of different noise models and instrument types (resolution, peak shape)

Look at the INI file (via "MSSimulator -write_ini myini.ini") to see the available parameters and more functionality.
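Once an INI file has been written, its parameters can be listed programmatically. The following is a minimal sketch, assuming the INI file is an XML document in which nested NODE elements group ITEM elements that carry "name" and "value" attributes. The sample document and its parameter names are made up for illustration and are not real MSSimulator parameters.

```python
import xml.etree.ElementTree as ET


def list_ini_parameters(xml_text):
    """Return {colon-separated path: value} for every ITEM element."""
    params = {}

    def walk(node, prefix):
        for child in node:
            name = child.get("name", "")
            if child.tag == "NODE":
                # Descend into the section, extending the parameter path.
                walk(child, prefix + [name])
            elif child.tag == "ITEM":
                params[":".join(prefix + [name])] = child.get("value", "")

    walk(ET.fromstring(xml_text), [])
    return params


# Hand-written stand-in for a real INI file (hypothetical parameter names).
SAMPLE_INI = """<PARAMETERS>
  <NODE name="MSSimulator">
    <NODE name="1">
      <ITEM name="example_flag" value="true" type="string"/>
      <NODE name="group">
        <ITEM name="example_number" value="42" type="int"/>
      </NODE>
    </NODE>
  </NODE>
</PARAMETERS>"""

for path, value in list_ini_parameters(SAMPLE_INI).items():
    print(path, "=", value)
```

To inspect a real file, read myini.ini from disk and pass its contents to list_ini_parameters. List-valued parameters are not handled by this sketch.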

Input: FASTA files

Protein sequences (including amino acid modifications) can be provided as a FASTA file. A special tag in the description of each entry specifies the protein abundance. If you want to create a complex FASTA file with a Gaussian protein abundance model in log space, see the Python script that ships with your OpenMS installation (e.g., <OpenMS-dir>/share/OpenMS/SIMULATION/FASTAProteinAbundanceSampling.py). It supports (random) sampling from a large FASTA file and protein weight filtering, and it adds an intensity tag to each entry.
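To illustrate what a Gaussian abundance model in log space means, here is a minimal sketch that draws log10 abundances from a normal distribution and appends an intensity tag to FASTA header lines. It is not the shipped FASTAProteinAbundanceSampling.py script; the function names and the mean_log10, sd_log10 and seed parameters are hypothetical.

```python
import random


def sample_abundances(n, mean_log10=3.0, sd_log10=1.0, seed=0):
    """Draw n abundances whose log10 values follow a normal distribution."""
    rng = random.Random(seed)  # fixed seed keeps the example reproducible
    return [10 ** rng.gauss(mean_log10, sd_log10) for _ in range(n)]


def tag_header(header, intensity):
    """Append an intensity tag to a FASTA '>' line."""
    return f"{header} [# intensity={intensity:.1f} #]"


headers = [">prot1", ">prot2", ">prot3"]
for header, abundance in zip(headers, sample_abundances(len(headers))):
    print(tag_header(header, abundance))
```

Because the values are sampled in log space, the resulting abundances are always positive and span several orders of magnitude.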

If multiplexed data (such as SILAC or iTRAQ) is simulated, you need to supply multiple FASTA input files. In the label-free setting, all FASTA input files are merged into one before simulation.

For MS/MS simulation, only a test model is shipped with OpenMS.
Please find trained models at: http://sourceforge.net/projects/open-ms/files/Supplementary/Simulation/.

To specify intensity values for certain proteins, add an abundance tag for the corresponding protein in the FASTA input file:

  • add '[# <key>=<value> #]' at the end of the > line, using one of the following keys:
    • intensity (protein abundance)
    • rt (retention time; subjected to small local error by randomization)
    • RT (retention time; used as is without local error)

The rt and RT keys only work if digestion is disabled.
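The tag syntax can be checked with a small parser before running a simulation. This is a sketch under stated assumptions, not the parser used by MSSimulator: it assumes the tag sits at the end of the > line, that multiple key=value pairs are separated by commas, and that all values are numeric. The name parse_abundance_tag is hypothetical.

```python
import re

# Matches a trailing "[# ... #]" block on a FASTA header line.
TAG_RE = re.compile(r"\[#(.*?)#\]\s*$")


def parse_abundance_tag(header):
    """Return {key: float value} from the tag, or {} if no tag is present."""
    match = TAG_RE.search(header)
    if match is None:
        return {}
    values = {}
    for pair in match.group(1).split(","):
        if not pair.strip():
            continue  # tolerate empty segments such as a trailing comma
        key, _, value = pair.partition("=")
        # float() raises ValueError for malformed or non-numeric values.
        values[key.strip()] = float(value)
    return values


print(parse_abundance_tag(">seq2 optional comment [# intensity=117.4, RT=405.3 #]"))
```

Keys are case-sensitive here, which matters because rt and RT have different meanings.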

For amino acid modifications, insert their names at the respective amino acid residues. All modifications are treated as fixed. If you need variable modifications, you have to add the desired combinatorial variants (presence/absence of one or all modifications) to the FASTA file yourself. Valid modification names are listed in many TOPP/UTILS tools, e.g., in the -fixed_modifications parameter of MSGFPlusAdapter.

For example:

>seq1 optional comment [# intensity=567.4 #]
(Acetyl).M(Oxidation)ASQYLATARHGC(Carbamidomethyl)FLPRHRDTGILP
>seq2 optional comment [# intensity=117.4, RT=405.3 #]
QKRPSQRHGLATAC(Carbamidomethyl)RHGTGGGDRAT.(Dehydrated)
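Because modifications are fixed, variable modifications have to be expanded into explicit sequence variants. The sketch below enumerates every presence/absence combination for a set of candidate residue sites and writes each variant in the notation shown above. The function name expand_variable_mods and its arguments are hypothetical, and terminal modifications such as (Acetyl). are not covered.

```python
from itertools import product


def expand_variable_mods(sequence, variable_sites):
    """Enumerate all presence/absence combinations of residue modifications.

    variable_sites maps a 0-based residue index to a modification name.
    """
    sites = sorted(variable_sites)
    variants = []
    for present in product([False, True], repeat=len(sites)):
        chosen = {i for i, flag in zip(sites, present) if flag}
        variants.append("".join(
            residue + (f"({variable_sites[i]})" if i in chosen else "")
            for i, residue in enumerate(sequence)
        ))
    return variants


for variant in expand_variable_mods("MAC", {0: "Oxidation", 2: "Carbamidomethyl"}):
    print(variant)
```

Each variant would then be written as its own FASTA entry. Note that n candidate sites yield 2^n variants, so the file grows quickly.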

The command line parameters of this tool are:

INI file documentation of this tool: