Raw data representation

David Teschner edited this page May 18, 2023 · 11 revisions

General

Liquid Chromatography - Ion Mobility Spectrometry - Tandem Mass Spectrometry (LC-IMS-MS/MS) datasets, including those produced from timsTOF instruments, comprise four-dimensional (4D) data points. Each raw data point is represented as a quadruple, <rt, im, mz, i>, where:

  • rt refers to the Retention Time, derived from the Liquid Chromatography (LC) separation phase.
  • im signifies the Ion Mobility, captured from the gas-phase ion mobility separation, often using a method such as Trapped Ion Mobility Spectrometry (TIMS).
  • mz represents the Mass-to-Charge ratio, provided by the Mass Spectrometer (for instance, a Time-of-Flight or TOF instrument).
  • i stands for the Intensity of the signal captured by the mass spectrometer.

The first three values locate a signal along three different separation axes, and the fourth gives the intensity measured at that point. The resolution of a given experiment is finite and depends on the type of device used and its hardware parameter settings.
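As a minimal sketch of the quadruple representation, the points can be held in a NumPy structured array with one field per dimension. The dtype name `POINT_DTYPE`, the field layout, and all numeric values are assumptions made for illustration; they are not the proteolizard-data or TDF schema.

```python
import numpy as np

# Hypothetical field layout for illustration only.
POINT_DTYPE = np.dtype([
    ("rt", np.float64),  # retention time
    ("im", np.float64),  # ion mobility
    ("mz", np.float64),  # mass-to-charge ratio
    ("i", np.float64),   # intensity
])

# Made-up data points <rt, im, mz, i>.
points = np.array(
    [
        (120.5, 0.95, 445.12, 1500.0),
        (120.5, 0.95, 445.62, 800.0),
        (120.5, 1.10, 622.03, 230.0),
        (121.0, 0.95, 445.12, 1710.0),
    ],
    dtype=POINT_DTYPE,
)

# All points recorded at one retention time.
same_rt = points[points["rt"] == 120.5]

# Summed intensity per distinct retention time.
rts, inverse = np.unique(points["rt"], return_inverse=True)
tic = np.bincount(inverse, weights=points["i"])
```

Keeping the four dimensions as named fields makes it straightforward to select or aggregate along any one axis without copying the others.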

It can also be beneficial to store these values as integer indices for more efficient storage and in-memory representation. TDF files use this technique: values are stored as indices and translated back to physically meaningful values upon data load. The frame index maps to retention time, the scan index to ion mobility, and the TOF index to m/z.
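The idea of storing compact integer indices and translating them back on load can be sketched with plain lookup tables. The tables `scan_to_im` and `tof_to_mz` and their values are invented for this example; the real conversion depends on instrument calibration, which this sketch does not reproduce.

```python
import numpy as np

# Hypothetical lookup tables: position k holds the physical value for index k.
# The numbers are invented and do not reflect any real calibration.
scan_to_im = np.linspace(1.6, 0.6, num=5)       # 5 scan indices
tof_to_mz = np.linspace(100.0, 1700.0, num=17)  # 17 TOF indices

# Stored form: small integers instead of floating-point values.
scan_idx = np.array([0, 0, 2, 4], dtype=np.uint16)
tof_idx = np.array([3, 4, 10, 16], dtype=np.uint32)

# Translate indices back to physical values on load.
im = scan_to_im[scan_idx]
mz = tof_to_mz[tof_idx]
```

Integer indices are smaller than floats and compare exactly, which is why an indexed form suits storage while the floating-point form suits analysis.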

The additional information attached to data points depends on the acquisition scheme used, such as ddaPASEF, diaPASEF, or midiaPASEF. It may include whether fragmentation was applied and which quadrupole m/z filter settings were in effect.

Basic usage

The proteolizard-data library offers Python classes that encapsulate and provide efficient access to the four-dimensional (4D) data generated by timsTOF experiments. These classes also include built-in functionality for frequently needed tasks, such as filtering and vectorization.
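To make the terms filtering and vectorization concrete, here is a minimal NumPy sketch of both operations on bare arrays. The functions `filter_points` and `vectorize` are hypothetical helpers written for this example and are not the library's API; the binning rule (flooring m/z to a fixed number of decimals) is likewise an assumption.

```python
import numpy as np

def filter_points(mz, intensity, mz_min, mz_max, intensity_min):
    """Keep points inside an m/z window and at or above an intensity threshold."""
    keep = (mz >= mz_min) & (mz <= mz_max) & (intensity >= intensity_min)
    return mz[keep], intensity[keep]

def vectorize(mz, intensity, resolution=1):
    """Bin m/z to `resolution` decimal places and sum the intensity per bin.

    Returns the occupied integer bin indices and their summed intensities.
    """
    bins = np.floor(mz * 10**resolution).astype(np.int64)
    unique_bins, inverse = np.unique(bins, return_inverse=True)
    summed = np.bincount(inverse, weights=intensity)
    return unique_bins, summed

# Made-up points of a single spectrum.
mz = np.array([445.12, 445.17, 445.62, 622.03, 1999.90])
intensity = np.array([1500.0, 300.0, 800.0, 20.0, 90.0])

mz_f, int_f = filter_points(mz, intensity, 100.0, 1700.0, 50.0)
bins, summed = vectorize(mz_f, int_f, resolution=1)
```

The result is a sparse vector: only occupied bins are stored, which keeps the representation small even at fine resolutions.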

The foundational class for working with timsTOF raw data is TimsFrame. This class represents the data recorded at a single retention time, which can be either precursor or fragment data. Each TimsFrame instance contains several scans separated by ion mobility.

A TimsFrame can be deconstructed into a collection of MzSpectrum objects. Each MzSpectrum object contains a single scan (or m/z-spectrum), which is generated for a particular ion mobility.
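The decomposition of a frame into per-scan spectra amounts to grouping the frame's points by scan index. The sketch below uses plain arrays and a hypothetical helper `split_by_scan`; it illustrates the relationship only and does not use or mirror the TimsFrame or MzSpectrum API.

```python
import numpy as np

def split_by_scan(scan, mz, intensity):
    """Group the points of one frame by scan index.

    Returns a dict mapping scan index -> (mz array, intensity array).
    """
    spectra = {}
    for s in np.unique(scan):
        mask = scan == s
        spectra[int(s)] = (mz[mask], intensity[mask])
    return spectra

# One frame (a single retention time) with made-up points.
scan = np.array([10, 10, 11, 13, 13, 13])
mz = np.array([445.12, 622.03, 445.13, 301.50, 445.12, 890.40])
intensity = np.array([1500.0, 230.0, 1200.0, 75.0, 640.0, 310.0])

spectra = split_by_scan(scan, mz, intensity)
```

Each dictionary entry plays the role of one m/z spectrum at a fixed ion mobility; scans without any signal simply do not appear.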

Conversely, multiple TimsFrame instances can be aggregated into a TimsSlice object. TimsSlice is a more complex structure that holds collections of both precursor and fragment TimsFrame instances, allowing for a more complete representation of the experimental data.
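The aggregation of frames into a slice with separate precursor and fragment collections can be sketched with two small dataclasses. `SimpleFrame` and `SimpleSlice` are stand-ins invented for this example, with assumed fields; they are not the library's TimsFrame and TimsSlice classes.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleFrame:
    """Stand-in for one frame: an id, a retention time, and a type flag."""
    frame_id: int
    rt: float
    is_precursor: bool

@dataclass
class SimpleSlice:
    """Stand-in for a slice: frames collected separately by type."""
    precursors: list = field(default_factory=list)
    fragments: list = field(default_factory=list)

    def add(self, frame):
        # Route the frame to the collection matching its type.
        target = self.precursors if frame.is_precursor else self.fragments
        target.append(frame)

# Made-up frames in acquisition order.
frames = [
    SimpleFrame(1, 0.10, True),
    SimpleFrame(2, 0.21, False),
    SimpleFrame(3, 0.32, False),
    SimpleFrame(4, 0.43, True),
]

tims_slice = SimpleSlice()
for frame in frames:
    tims_slice.add(frame)
```

Keeping the two collections apart lets downstream code process precursor and fragment data independently while still treating the slice as one unit of retention time.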

TimsFrame

Constructor

Properties

Methods

Usage example

MzSpectrum

Constructor

Properties

Methods

Usage example

TimsSlice

Constructor

Properties

Methods

Usage example

TimsFrameVectorized

Constructor

Properties

Methods

Usage example

MzVector

Constructor

Properties

Methods

Usage example

TimsSliceVectorized

Constructor

Properties

Methods

Usage example