Processing wizard¶

mzwizard_logo The processing wizard reduces the configuration of complex standardized workflows into few parameters. Those values are used to estimate or calculate all other parameters from spectral processing over feature detection and alignment to annotation and data export. The mzwizard organizes the different parts of the instrumental setup to define a workflow split up into: Sample introduction, IMS, MS instrument, and workflow. Ideas for a workflow? We are open for ideas and contributions. Feel free to reach out if your workflow might be a candidate for a wizard setup.

Processing wizard

Tip

This workflow produces Feature lists as their result. Feature lists are a central part in visualizing the results in the Statistics dashboard and in the Interactive network visualizer.

Tip

The mzwizard is split into sections, most importantly, the MS part and the sample introduction. Parameters here reflect the sensitivity, resolution, and accuracy of these parts of the hyphenation. Selecting one of the default presets actually populates the initial values.

Danger

The defaults are only suggestions and different acquisition methods and instruments produce different noise levels etc. The raw data overview and aligned feature lists are useful to optimize these parameters.

mzwizard parameter tabs¶

mzwizard

[1] Data import¶

Specify all data files that need to be processed. You can drag & drop the data files into the text field.

[2] Sample introduction system¶

Info

Depends on the selected sampling system, e.g. MALDI, HPLC, DESI

Chromatography-based¶

Smoothing¶

Apply smoothing to the chromatograms. Smoothing should not be used for data low MS1 acquisition rates (e.g. below 5 data points per chromatographic peak/feature).

Influences batch steps: Smoothing

Stable ionization across samples¶

Used during feature grouping of adducts and other ions of the same molecule. Only use if the matrix (e.g., salt content, same culture medium, only fresh OR sea water samples) and ionization efficiencies are comparable across the whole study.

Influences batch steps: Correlation grouping

Crop retention time¶

Crops the chromatograms at these retention time values. Useful to cut off the start and the end of the chromatograms. The start is often diverted into the waist and the end often contains the cleanup procedure.

Influences batch steps: Chromatogram building Local minimum feature resolver (Calculation of chromatographic threshold)

Max peaks in chromatogram¶

An estimate of the number of isomers and isobaric ions in the chromatograms. Used to estimate the percentage of data points that hold useful data for the feature resolving step. (Chromatographic threshold in local minimum resolver).

Influences batch steps: Local minimum feature resolver (Calculation of chromatographic threshold)

Minimum consecutive scans¶

Only keep chromatograms and features with at least X data points in retention time dimension.

Influences batch steps: Local minimum feature resolver, Smoothing, Gap filling

Approximate feature FWHM¶

The full-width at half maximum of peaks in retention time dimension. Best extracted from the feature tables of already processed test datasets or from the raw data overview.

Influences batch steps: Local minimum feature resolver, (Minimum search range)

RT tolerance (intra-sample)¶

Retention time tolerance to group adducts and isotopes of the same molecule. The comparison is performed within each individual sample, usually leading to small variance.

Influences batch steps: Isotope grouping, Correlation grouping

RT tolerance (sample-to-sample)¶

Retention time tolerance to align features across all samples. Dependent on retention time shifts.

Influences batch steps: Join Aligner, Gap filling, Local compound database search

Imaging based¶

Info

Supported techniques: MALDI, SIMS, DESI, LDI

Minimum number of pixels¶

The minimum number of pixels containing the same m/z. Influences batch steps: Image Builder

[3] Ion mobility instrument¶

Smoothing¶

Applies smoothing to the ion mobilograms. This should be enabled for long ramp times in TIMS instruments.

Influences batch steps: Smoothing

Minimum consecutive scans¶

The number of consecutive scans/datapoints in a valid ion mobilogram and feature. Increase this for longer ramps in TIMS instruments.

Influences batch steps: Smoothing (mobility dimension), Local minimum resolver (mobility dimension)

Approximate feature FWHM¶

The full-width at half maximum for IMS features with regard to the IMS dimension.

Influences batch steps: Local minimum resolver (mobility dimension)

[4] MS instrument, e.g., Orbitrap, QTOF, FTICR¶

Noise threshold (MS1/MS2)¶

Choose the mass detector from the drop down menu. Choose the Factor of lowest signal for both centroided or profile data where each spectrum contains noise signals, often represented by many signals at the same low intensity (spectral grass). This may correspond to static noise or single counts. If the spectra are free of noise, for example because they were already filtered, use an absolute intensity threshold.

Depending on the selected mass detector, separate absolute noise levels or factors are defined to threshold spectra of MS level 1 and 2 (or above). So the MS2 noise level is used for MSn data with level > 1. These parameters can be optimized by looking at the spectral raw data in the raw data overview.

Info

We recommend the Factor of lowest signal setting for Orbitrap instruments and the Absolute intensity setting for TOF instruments.

Influences batch steps: Mass detection (MS1, MSn) 1/5th of the value is used for ion mobility scans.

Minimum feature height¶

The minimum height of chromatograms and features.

Influences batch steps: Chromatogram builder, Local minimum feature resolver (RT dimension), Local minimum feature resolver (IMS dimension), DIA MS2 grouping (todo documentation), RT Calibration

Scan-to-scan m/z tolerance¶

Relative and absolute m/z tolerance. Always applies the maximum tolerance based on the m/z this means that smaller and higher values are stronger affected by the absolute and relative tolerance, respectively. Used to find the same signal in different scans when connecting chromatograms. Reflects on the mass accuracy between scans in the same raw data file.

Influences batch steps: Mobility scan merging, Chromatogram builder, Gap filling, Lipid annotation (MS2), Spectral library search, Spectral/ Molecular Networking, DIA MS2 grouping (todo documentation)

Feature-to-feature m/z tolerance¶

Relative and absolute m/z tolerance. Always applies the maximum tolerance based on the m/z this means that smaller and higher values are stronger affected by the absolute and relative tolerance, respectively. Used to group isotopes and adducts of the same molecule. Those m/z values are already averaged over their features and should have lower m/z differences that the scan to scan tolerance.

Influences batch steps: Isotope filter Isotope finder Ion identity networking Duplicate filter

Sample-to-sample m/z tolerance¶

Relative and absolute m/z tolerance. Always applies the maximum tolerance based on the m/z this means that smaller and higher values are stronger affected by the absolute and relative tolerance, respectively. Used to align features across samples. Those m/z values are already averaged over their features but originate from different samples.

Influences batch steps: Join Aligner Local compound database search Lipid annotation (MS1)

[5] Filters¶

Original feature list¶

Options to either keep or remove the original feature lists. Keep is valuable during workflow optimization whereas remove saves resources and allows for more performant processing or large datasets. (See section on performance).

Influences batch steps: all batch steps that create a new feature list.

Min samples per aligned feature¶

Only keep aligned features that were detected in at least n samples. This parameter should usually scale with the sampleset size and if samples are relatively similar from their compounds. Uses the maximum of an absolute and relative value.

Influences batch steps: Feature list rows filter

Only keep features with 13C¶

Detect isotope pattern and only keep feature with valid 13C isotope pattern.

Influences batch steps: Feature list rows filter

[6] Annotation¶

Local compound database¶

Specify the database file (csv or tsv format).
Select precursor m/z option to either
- calculate from given neutral mass (or formula/structure)
- use provided precursor m/z in column
Filename column is only used for the library generation workflow
Columns map the table column headers to the internal names in mzmine
- Select data types present in local compound database

Annotate lipids¶

Select if you want to annotate lipids. Lipid database scope can be defined in the Lipid annotation.

Spectral library files¶

Select all spectral libraries to import and use during spectral library matching to annotate compounds in the final aligned feature list.

Influences batch steps: Spectral library search

Influences batch steps: Local compound database

[7] Workflows¶

Most of the parameters in the workflow section define data output and some workflow specific parameters.

General parameters¶

Apply spectral networking (FBMN/IIMN): Will compare all MS2 spectra across features to form molecular networks by spectral similarity.
Define an export path and base file name, e.g., "D:\analysis\date_project" this will create a new folder and save all files from export modules there. Each module will add a specific suffix to the file name.
Export for molecular networking (e.g., GNPS, FBMN, IIMN, MetGem): Will export all files for molecular networking
Export for SIRIUS: Will export all files needed for SIRIUS
Export annotation graphics: Exports annotations like spectral library matches, lipid matches, etc. to graphical reports. Contains options to also export chromatographic/ion mobility shapes, images, and other plots.

Workflow parameters and descriptions¶

DDA¶

The data-dependent acquisition workflow is the default non target workflow. We recommend to also use this workflow for targeted analysis and combine it with the Local compound database search or Spectral library search annotation modules.

DIA¶

The data-independent acquisition workflow is can process any type of MSe, broad band CID, or all ion fragmentation experiments. Add a minimum DIA correlation coefficient and a minimum number correlated data points in of MS2 data points.

Library generation¶

More method and contributor metadata is required to build spectral libraries. This workflow produces reference libraries with options to filter and merge spectra.

Imaging¶

Info

This workflow is only available for the sampling techniques: MALDI, LDI, DESI, SIMS

Select if you want to co-localize images.

Target plate¶

Info

This workflow is only available for the sampling technique: MALDI (Bruker timsTOF only)

Select if you want to co-localize images.

Spectral deconvolution¶

Info

This workflow is only available for the sampling technique: GC-EI

Select the minimum number of signals in a deconvoluted spectrum

[8] Create Batch¶

Select on the "Create batch" button to open a batch queue. Each step in the batch queue can be modified via double click. Find more information how to modify a batch queue. Click on "OK" to start the processing. Processing wizard

Optimizing the wizard workflow¶

While the presets of the processing wizard will produce good results out of the box for many instrument types and combinations, it may be worth optimizing specific workflow steps to achieve even better results. There are two ways to optimize the wizard workflow.

By optimizing the parameters in the wizard steps.
By optimizing individual steps in the created batch.

A step-by-step approach is detailed in the workflow optimization guide.

Ansgar Korf, Robin Schmid, Steffen Heuckeroth, Steffen Heuckeroth, corinnabrungs