OpenMS
FidoAdapter

Runs the protein inference engine Fido.

Potential predecessor tools: PeptideIndexer; IDPosteriorErrorProbability (with prob_correct option); IDScoreSwitcher
→ FidoAdapter →
Potential successor tools: ProteinQuantifier (via protein_groups parameter)

This tool wraps the protein inference algorithm Fido (http://noble.gs.washington.edu/proj/fido/). Fido uses a Bayesian probabilistic model to group and score proteins based on peptide-spectrum matches. It was published in:

Serang et al.: Efficient marginalization to compute protein posterior probabilities from shotgun mass spectrometry data (J. Proteome Res., 2010).

By default, this adapter runs the Fido variant with parameter estimation (FidoChooseParameters), as recommended by the authors of Fido. However, it is also possible to run "pure" Fido by setting the prob:protein, prob:peptide and prob:spurious parameters, if appropriate values are known (e.g. from a previous Fido run). Other parameters, except for log2_states, are not applicable in this case.
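The choice between the two modes can be sketched as a small Python helper that only assembles a command line and does not run anything. This is an illustration under stated assumptions: the -in/-out flags and the file names follow the usual TOPP convention and are not taken from this page, the helper name build_fido_command is hypothetical, the numeric values are arbitrary placeholders, and the "all three or none" check is a simplification of the sketch rather than documented behavior of the tool.

```python
def build_fido_command(in_file, out_file, prob_protein=None,
                       prob_peptide=None, prob_spurious=None,
                       log2_states=None):
    """Assemble a FidoAdapter command line as a list of arguments.

    With no prob:* values, the default mode (parameter estimation) is
    requested. With all three values, "pure" Fido is requested.
    """
    # Assumption: -in/-out are the usual TOPP input/output flags.
    cmd = ["FidoAdapter", "-in", in_file, "-out", out_file]
    probs = {
        "prob:protein": prob_protein,
        "prob:peptide": prob_peptide,
        "prob:spurious": prob_spurious,
    }
    given = [name for name, value in probs.items() if value is not None]
    # Simplification of this sketch: require all three values or none.
    if given and len(given) != len(probs):
        raise ValueError("set all three prob:* parameters or none of them")
    for name in given:
        cmd += ["-" + name, str(probs[name])]
    # log2_states is the one other parameter that applies in both modes.
    if log2_states is not None:
        cmd += ["-log2_states", str(log2_states)]
    return cmd


# Default mode: parameters are estimated by FidoChooseParameters.
default_cmd = build_fido_command("input.idXML", "output.idXML")

# "Pure" Fido: placeholder values, e.g. taken from a previous Fido run.
pure_cmd = build_fido_command("input.idXML", "output.idXML",
                              prob_protein=0.9, prob_peptide=0.01,
                              prob_spurious=0.0)
```

The resulting list could be passed to subprocess.run once the flag names have been checked against the tool's own parameter listing.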

Depending on the separate_runs setting, data from input files containing multiple protein identification runs (e.g. several replicates or different search engines) will be merged (default) or annotated separately.

Input format:

Care has to be taken to provide suitable input data for this adapter. In the peptide/protein identification results (e.g. coming from a database search engine), the proteins have to be annotated with target/decoy meta data. To achieve this, run PeptideIndexer.
In addition, the scores for peptide hits in the input data have to be posterior probabilities - as produced e.g. by PeptideProphet in the TPP or by IDPosteriorErrorProbability (with the prob_correct option switched on) in OpenMS. If scores are found to be posterior error probabilities (PEPs, lower is better), they are converted to posterior probabilities (higher is better) using "1 - PEP".
If the posterior (error) probabilities are stored in user parameters ("UserParam") in the idXML instead of in the score fields, IDScoreSwitcher can be used to rewrite the scores. (This may be the case e.g. if FalseDiscoveryRate and IDFilter were applied for FDR filtering prior to protein inference.)
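The "1 - PEP" conversion described above can be illustrated with a minimal Python sketch. The function name to_posterior_probability and the higher_score_better argument are hypothetical names chosen for this example; they only mirror the "higher is better" / "lower is better" distinction made in the text and are not the adapter's actual code.

```python
def to_posterior_probability(score, higher_score_better):
    """Return a posterior probability (higher is better) for one peptide hit.

    higher_score_better is False when the score is a posterior error
    probability (PEP), in which case "1 - PEP" is returned.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score is not a probability: %r" % (score,))
    return score if higher_score_better else 1.0 - score


# Example PEPs (made-up values) converted to posterior probabilities.
peps = [0.01, 0.25, 1.0]
posteriors = [to_posterior_probability(p, higher_score_better=False)
              for p in peps]
```

Scores that are already posterior probabilities pass through unchanged, so the same function can be applied to either kind of input.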

Output format:

The output of this tool is an augmented version of the input: The protein groups and accompanying posterior probabilities inferred by Fido are stored as "indistinguishable protein groups", attached to the protein identification run(s) of the input data. Also attached are meta values recording the Fido parameters (Fido_prob_protein, Fido_prob_peptide, Fido_prob_spurious).
The result can be passed to ProteinQuantifier via its protein_groups parameter, to have the protein grouping taken into account during quantification.
Note that if the input contains multiple identification runs and separate_runs is not set (the default), the identification data from all runs will be pooled for the Fido analysis and the result will only contain one (merged) identification run. This is the desired behavior if the protein grouping should be used by ProteinQuantifier. When the greedy_group_resolution flag is set, "peptide to indistinguishable proteins" mappings will be unique in the output and the actual resolved groups are added as "protein groups", attached to the protein identification run(s) of the input data (in addition to the "indistinguishable protein groups").
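To make the idea of unique "peptide to indistinguishable proteins" mappings more concrete, here is a simplified Python sketch of one possible greedy strategy: groups are visited from highest to lowest probability, and each peptide is kept only for the first group that claims it. This is an assumption made for illustration and not necessarily the procedure that greedy_group_resolution implements; the function name, data layout, accessions, peptides and probabilities are all hypothetical.

```python
def resolve_greedily(groups, peptides_of):
    """Assign every peptide to exactly one indistinguishable protein group.

    groups:      list of (probability, tuple_of_accessions)
    peptides_of: dict mapping tuple_of_accessions -> set of peptide sequences
    Returns a dict mapping peptide sequence -> tuple_of_accessions.
    """
    assignment = {}
    # Best-scoring groups first; ties are broken by accession so that the
    # result is deterministic.
    for _prob, members in sorted(groups, key=lambda g: (-g[0], g[1])):
        for peptide in sorted(peptides_of[members]):
            # Keep the first (i.e. best-scoring) group for each peptide.
            assignment.setdefault(peptide, members)
    return assignment


# Made-up example: peptide "CCR" is shared between two groups.
groups = [(0.60, ("P2", "P3")), (0.95, ("P1",))]
peptides_of = {
    ("P1",): {"AAK", "CCR"},
    ("P2", "P3"): {"CCR", "DDK"},
}
resolved = resolve_greedily(groups, peptides_of)
```

In this example the shared peptide "CCR" ends up with the higher-scoring group ("P1",), while "DDK" remains with ("P2", "P3").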

Note
Currently mzIdentML (mzid) is not directly supported as an input/output format of this tool. Convert mzid files to/from idXML using IDFileConverter if necessary.

The command line parameters of this tool are:

INI file documentation of this tool: