OpenSWATH - GUI Models¶
The main models used by the GUI are the PeptideTree and the MSData.DataModel models. Internally, PeptideTree uses ChromatogramTransition to give access to single rows in the tree data structure, while MSData.DataModel uses SwathRunCollection to keep track of multiple SWATH-MS runs.
MSData Data model
Module¶
Contains classes that provide access to the raw data
MSData¶
-
class
openswathgui.models.MSData.
DataModel
(fdr_cutoff=0.01, only_quantified=True)¶ Bases:
object
The main data model, provides access to all raw data
It stores the references to individual
SwathRun
objects and can be initialized from a list of files. Each “load” method is responsible for setting the self.runs attribute.
-
self.
runs
¶ list of
SwathRun
or SqlSwathRun
The MS runs which are handled by this class
-
self.
fdr_cutoff
¶ float
Selected FDR cutoff
-
self.
only_show_quantified
¶ bool
Whether to only show peptides that are quantified
-
self.
draw_transitions_
¶ bool
Whether to draw individual transitions
-
getDrawTransitions
()¶ Returns: Whether to draw transitions or not Return type: bool
-
getStatus
()¶ Returns: Returns its own status (number of transitions etc.) for the status bar. Return type: str
-
get_precursor_tree
()¶ Returns the data model's precursor tree structure
Returns a list of
ChromatogramTransition
root elements (rows) to display in the left side tree view. Each element may contain nested ChromatogramTransition
elements (tree elements).
Returns: Root element(s) for the peptide tree Return type: list of ChromatogramTransition
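The nested structure described for get_precursor_tree can be pictured with a small stand-in class. This is a sketch, not the openswathgui implementation: `Row` is a hypothetical class that only mirrors the constructor arguments documented for ChromatogramTransition, `walk` is an illustrative helper, and the datatype values other than 'Precursor' are assumptions.

```python
class Row(object):
    """Hypothetical stand-in mirroring the documented ChromatogramTransition arguments."""

    def __init__(self, name, charge, subelements, peptideSequence=None,
                 fullName=None, datatype='Precursor'):
        self.name = name
        self.charge = charge
        self.subelements = subelements
        self.peptideSequence = peptideSequence
        self.fullName = fullName
        self.datatype = datatype


def walk(rows, depth=0):
    """Yield (depth, row) for every row in a nested list of rows, parents first."""
    for row in rows:
        yield depth, row
        for item in walk(row.subelements, depth + 1):
            yield item


# Illustrative data only: one peptide with one precursor and two transitions.
transitions = [
    Row("y4", 1, [], peptideSequence="PEPTIDEK", datatype="Transition"),
    Row("y5", 1, [], peptideSequence="PEPTIDEK", datatype="Transition"),
]
precursor = Row("PEPTIDEK/2", 2, transitions, peptideSequence="PEPTIDEK")
peptide = Row("PEPTIDEK", None, [precursor], peptideSequence="PEPTIDEK",
              datatype="Peptide")
root_elements = [peptide]

# Flatten the tree into (depth, name) pairs, as a tree view would traverse it.
flattened = [(depth, row.name) for depth, row in walk(root_elements)]
```

Only the root elements need to be handed to the tree model; every deeper level is reachable through the sub-elements of its parent.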
-
get_runs
()¶ Returns the list of
SwathRun
objects of the current data model.
Returns: The list of SwathRun objects held by this data model
Return type: list of SwathRun
-
loadFiles
(filenames)¶ Load a set of chromatogram files (no peakgroup information).
Parameters: filenames (list of str) – List of filepaths containing the chromatograms
-
loadMixedFiles
(rawdata_files, aligned_pg_files, fileType)¶ Load files that contain raw data files and aligned peakgroup files.
Since no explicit mapping is provided, it has to be inferred from the data: the column align_runid is matched against the filenames of the input .chrom.mzML files. This only works if the user did not change the filenames.
Parameters: - rawdata_files (list of str) – List of paths to chrom.mzML files
- aligned_pg_files (list of str) – List of paths to output files of the FeatureAligner
- fileType (str) – Type of the metadata file (valid: simple, traml, openswath)
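The filename-based inference described for loadMixedFiles might look roughly like the sketch below. It is not the code used by DataModel: `infer_run_mapping` is a hypothetical helper, and it assumes that each align_runid value appears as a substring of the basename of the matching chromatogram file.

```python
import os


def infer_run_mapping(rawdata_files, align_runids):
    """Map each align_runid value to the raw files whose basename contains it.

    Assumption: the run id is a substring of the file's basename.
    """
    mapping = {}
    for runid in align_runids:
        matches = [f for f in rawdata_files if runid in os.path.basename(f)]
        if matches:
            mapping[runid] = matches
    return mapping


# Illustrative paths and ids only.
rawdata_files = ["/data/run_A.chrom.mzML", "/data/run_B.chrom.mzML"]
align_runids = ["run_A", "run_B", "run_C"]

mapping = infer_run_mapping(rawdata_files, align_runids)
# Run ids without a matching file are the cases where renamed files break the inference.
unmatched = [r for r in align_runids if r not in mapping]
```

Any run id left in `unmatched` illustrates why renaming the .chrom.mzML files after alignment breaks this kind of lookup.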
-
loadSqMassFiles
(filenames)¶
-
load_from_yaml
(yamlfile)¶ Load a yaml file containing a mapping of chromatogram files and aligned peakgroup files.
Parameters: yamlfile (str) – Filepath to the yaml file for loading
-
setDrawTransitions
(draw_transitions)¶ Whether to draw individual transitions or not
-
TreeModels
Module¶
Contains classes that provide access to the hierarchical tree containing protein, precursor, peptide and transition level data.
While TreeNode
and TreeModel
are generic models for trees
and nodes, the derived classes PeptideTreeNode
and
PeptideTree
are implementations specific to TAPIR.
TreeNode¶
-
class
openswathgui.models.TreeModels.
TreeNode
(parent, row)¶ Bases:
object
Generic model of a tree node
Adapted from http://www.hardcoded.net/articles/using_qtreeview_with_qabstractitemmodel.htm
See
PeptideTreeNode
for implementation.
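A generic node of this kind can be sketched in plain Python without any Qt dependency. The classes below are hypothetical and are not taken from openswathgui: `Node` mirrors the documented (parent, row) constructor, and `ListNode` mirrors the (ref, parent, row) pattern of a derived node by wrapping a nested (label, children) tuple.

```python
class Node(object):
    """Hypothetical generic tree node that records its parent and its row index."""

    def __init__(self, parent, row):
        self.parent = parent
        self.row = row
        self.subnodes = self._get_children()

    def _get_children(self):
        raise NotImplementedError("subclasses decide where children come from")


class ListNode(Node):
    """Hypothetical subclass wrapping a nested (label, children) tuple."""

    def __init__(self, ref, parent, row):
        self.ref = ref  # must be set before the base class builds the children
        Node.__init__(self, parent, row)

    def _get_children(self):
        label, children = self.ref
        return [ListNode(child, self, index) for index, child in enumerate(children)]


# Illustrative data only.
data = ("protein", [("peptide_1", []), ("peptide_2", [("precursor", [])])])
root = ListNode(data, None, 0)
second = root.subnodes[1]
```

Storing the row index alongside the parent is what lets an item model answer "which row of its parent is this node" without searching the sibling list.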
TreeModel¶
-
class
openswathgui.models.TreeModels.
TreeModel
¶ Bases:
PyQt4.QtCore.QAbstractItemModel
Generic tree model
Adapted from http://www.hardcoded.net/articles/using_qtreeview_with_qabstractitemmodel.htm
See parent class http://qt-project.org/doc/qt-5/QAbstractItemModel.html
See
PeptideTree
for implementation.
PeptideTreeNode¶
-
class
openswathgui.models.PeptideTree.
PeptideTreeNode
(ref, parent, row)¶ Bases:
openswathgui.models.TreeModels.TreeNode
Data model of a node in the left-hand peptide tree in the GUI
PeptideTree¶
-
class
openswathgui.models.PeptideTree.
PeptideTree
(rootElements, firstColumnName='Peptide Sequence')¶ Bases:
openswathgui.models.TreeModels.TreeModel
Data model of the underlying hierarchical data model (proteins, precursors, peptides, transitions).
This class represents the data model, see
openswathgui.views.PeptideTree.PeptidesTreeView
for the view implementation.
-
columnCount
(parent)¶ Returns how many columns we have
-
data
(index, role)¶ Get data for a specific index (and role)
Currently supported role is only Qt.DisplayRole (for displaying the tree). The three columns are:
- Compound name (generally peptide sequence or compound sum formula)
- Charge
- Name
Parameters: - index (QModelIndex) – Index of the element to be accessed
- role (Qt::ItemDataRole) – Item role to be used (only Qt.DisplayRole supported)
-
headerData
(section, orientation, role)¶ Get header data (column header) for a specific index (and role)
- The three columns are:
- Peptide Sequence
- Charge
- Name
Note that the user can set the name of the first column manually in order to accommodate other data (e.g. metabolomics) where “Peptide Sequence” would not make sense.
-
set_precursor_tree_structure
(data, sortData=True)¶ Initialize tree structure with data from
DataModel.get_precursor_tree
The tree is initialized by giving it a pointer to the root element(s)
Parameters: - data (list of ChromatogramTransition) – Root element(s) for the peptide tree
- sortData (bool) – Whether to sort data
-
SWATH MS Run
Module¶
Raw chromatographic data is handled using the SwathRunCollection
class which can either hold references to mzML or to SqMass data.
SwathRunCollection¶
-
class
openswathgui.models.SwathRunCollection.
SwathRunCollection
¶ Bases:
object
A collection of SWATH files
Contains multiple SwathRun objects which each represent one single mass spectrometric injection. It can be initialized in three different ways: from a set of directories (assuming each directory is one run), from a set of files mapped to a run id (multiple files may be mapped to one run id), or from a simple flat list of chromatogram files.
-
getRunIds
()¶ Returns all available run ids
Returns: runlist – A list of all available runs Return type: list of str
-
getSwathFile
(key)¶ Parameters: key (str) – The requested run Returns: run – The run corresponding to the requested run Return type: SwathRun
-
getSwathFiles
()¶ Returns: runs – All runs found in this collection Return type: list of SwathRun
or SqlSwathRun
-
initialize_from_chromatograms
(runid_mapping, precursor_mapping=None, sequences_mapping=None, protein_mapping={})¶ Initialize from a set of mapped chromatogram files. There may be multiple chromatogram (chrom.mzML) files mapped to one run id.
Parameters: - runid_mapping (dict) – A mapping dictionary of form { run_id : [filename, filename, ...] }
- precursor_mapping (dict) – An optional mapping of the form { FullPrecursorName : [transition_id, transition_id, ...] }
- sequences_mapping (dict) – An optional mapping of the form { StrippedSequence : [FullPrecursorName, FullPrecursorName, ...]}
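The three dictionary shapes listed above can be built from flat transition records as in the following sketch. The records, the precursor naming scheme ("sequence/charge") and the filenames are invented for illustration; only the dictionary layouts come from the parameter descriptions.

```python
from collections import defaultdict

# Illustrative records: (transition_id, FullPrecursorName, StrippedSequence)
records = [
    ("tr_1", "PEPTIDEK/2", "PEPTIDEK"),
    ("tr_2", "PEPTIDEK/2", "PEPTIDEK"),
    ("tr_3", "PEPTIDEK/3", "PEPTIDEK"),
]

precursor_mapping = defaultdict(list)   # { FullPrecursorName : [transition_id, ...] }
sequences_mapping = defaultdict(list)   # { StrippedSequence : [FullPrecursorName, ...] }
for transition_id, precursor, sequence in records:
    precursor_mapping[precursor].append(transition_id)
    # Each precursor is listed once per sequence, in order of first appearance.
    if precursor not in sequences_mapping[sequence]:
        sequences_mapping[sequence].append(precursor)

precursor_mapping = dict(precursor_mapping)
sequences_mapping = dict(sequences_mapping)

# { run_id : [filename, filename, ...] }, several files may share one run id.
runid_mapping = {"run_0": ["run_0_part1.chrom.mzML", "run_0_part2.chrom.mzML"]}
```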
-
initialize_from_directories
(runid_mapping)¶ Initialize from a directory
This assumes that all .mzML files in the same directory are from the same run. There may be multiple chromatogram (chrom.mzML) files mapped to one run id.
Parameters: runid_mapping (dict) – A mapping dictionary of form { run_id : directory }
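The assumption that every .mzML file in a directory belongs to the same run can be illustrated with a short sketch. `files_per_run` is a hypothetical helper, not part of SwathRunCollection; it only shows how a { run_id : directory } mapping expands into a list of files per run.

```python
import glob
import os
import tempfile


def files_per_run(runid_mapping):
    """Return { run_id : sorted list of .mzML paths found in that run's directory }."""
    result = {}
    for runid, directory in runid_mapping.items():
        result[runid] = sorted(glob.glob(os.path.join(directory, "*.mzML")))
    return result


# Build a throwaway directory layout to demonstrate the grouping.
base = tempfile.mkdtemp()
runid_mapping = {}
for runid in ("run_0", "run_1"):
    directory = os.path.join(base, runid)
    os.mkdir(directory)
    for part in ("a", "b"):
        # Empty placeholder files; real input would be chromatogram mzML files.
        open(os.path.join(directory, part + ".chrom.mzML"), "w").close()
    runid_mapping[runid] = directory

grouped = files_per_run(runid_mapping)
```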
-
initialize_from_files
(filenames)¶ Initialize from individual files, setting the runid as increasing integers.
This assumes that each .mzML file is from a separate run.
Parameters: filenames (list of str) – A list of filenames
-
initialize_from_sql
(filenames, precursor_mapping=None, sequences_mapping=None, protein_mapping={})¶ Initialize from a set of sqMass chromatogram files.
Parameters: - filenames (list(str)) – A list of files
- precursor_mapping (dict) – An optional mapping of the form { FullPrecursorName : [transition_id, transition_id, ...] }
- sequences_mapping (dict) – An optional mapping of the form { StrippedSequence : [FullPrecursorName, FullPrecursorName, ...]}
-
initialize_from_sql_map
(runid_mapping, filenames, precursor_mapping=None, sequences_mapping=None, protein_mapping={})¶ Initialize from a set of sqMass chromatogram files.
Parameters: - filenames (list(str)) – A list of files
- precursor_mapping (dict) – An optional mapping of the form { FullPrecursorName : [transition_id, transition_id, ...] }
- sequences_mapping (dict) – An optional mapping of the form { StrippedSequence : [FullPrecursorName, FullPrecursorName, ...]}
-
SqMass¶
-
class
openswathgui.models.SqlSwathRun.
SqlSwathRun
(runid, filename, load_in_memory=False, precursor_mapping=None, sequences_mapping=None, protein_mapping={})¶ Data Model for a single sqMass file.
TODO: each file may contain multiple runs!
-
runid
¶ Current run id
- Private Attributes:
- _run: A SqlDataAccess object
- _filename: Original filename
- _basename: Original filename basename
- _precursor_mapping: Dictionary { FullPrecursorName : [transition_id, transition_id] }
- _sequences_mapping: Dictionary { StrippedSequence : [FullPrecursorName, FullPrecursorName]}
-
add_peakgroup_data
(precursor_id, leftWidth, rightWidth, fdrscore, intensity, assay_rt)¶
-
getTransitionCount
()¶ Get total number of transitions
-
get_all_peptide_sequences
()¶ Get all (stripped) sequences
-
get_all_precursor_ids
()¶ Get all precursor ids (full sequence + charge)
-
get_all_proteins
()¶
-
get_assay_data
(precursor)¶
-
get_data_for_precursor
(precursor)¶ Retrieve raw data for a specific precursor. The data is returned as a list of pairs (timearray, intensityarray)
-
get_data_for_transition
(transition_id)¶ Retrieve raw data for a specific transition
-
get_id
()¶
-
get_intensity_data
(precursor)¶
-
get_precursors_for_sequence
(sequence)¶ Get all precursors mapping to one stripped sequence
-
get_range_data
(precursor)¶
-
get_score_data
(precursor)¶
-
get_sequence_for_protein
(protein)¶
-
get_transitions_for_precursor
(precursor)¶ Return the transition names for a specific precursor
-
get_transitions_for_precursor_display
(precursor)¶
-
remove_precursors
(toremove)¶ Remove a set of precursors from the run (this can be done to filter down the list of precursors to display).
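One way a caller could decide which precursors to pass to remove_precursors is to filter on the FDR score. The sketch below is an assumption about such a caller, not code from the data model: `precursors_to_remove` is a hypothetical helper, the score dictionary is invented, and treating a missing score as "not quantified" is an illustrative choice.

```python
def precursors_to_remove(score_mapping, fdr_cutoff=0.01):
    """Return the set of precursor ids whose score is missing or above the cutoff."""
    return set(
        precursor for precursor, score in score_mapping.items()
        if score is None or score > fdr_cutoff
    )


# Illustrative scores only: { precursor_id : FDR_score }
score_mapping = {"PEPTIDEK/2": 0.001, "PEPTIDER/2": 0.2, "PEPTIDEA/3": None}
toremove = precursors_to_remove(score_mapping)
```

The resulting set has the shape that the `toremove` argument suggests, so it could be handed to a run object in one call.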
-
-
class
openswathgui.models.SqlDataAccess.
SqlDataAccess
(filename)¶ Bases:
object
-
getDataForChromatogram
(myid)¶ Get data from a single chromatogram
- compression is one of 0 = no, 1 = zlib, 2 = np-linear, 3 = np-slof, 4 = np-pic, 5 = np-linear + zlib, 6 = np-slof + zlib, 7 = np-pic + zlib
- data_type is one of 0 = mz, 1 = int, 2 = rt
- data contains the raw (blob) data for a single data array
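The numeric codes listed above translate naturally into lookup tables. The sketch below also decodes a blob for the two simplest cases (codes 0 and 1). It is illustrative only: `decode_blob` is a hypothetical helper, the payload is assumed to be packed little-endian 64-bit floats, and the numpress variants (codes 2 to 7) are deliberately left out.

```python
import struct
import zlib

COMPRESSION = {0: "no", 1: "zlib", 2: "np-linear", 3: "np-slof", 4: "np-pic",
               5: "np-linear + zlib", 6: "np-slof + zlib", 7: "np-pic + zlib"}
DATA_TYPE = {0: "mz", 1: "int", 2: "rt"}


def decode_blob(data, compression):
    """Decode a blob for compression codes 0 and 1 only.

    Assumption: the uncompressed payload is packed little-endian 64-bit floats.
    """
    if compression == 1:
        data = zlib.decompress(data)
    elif compression != 0:
        raise NotImplementedError(
            "numpress decoding (%s) is not covered by this sketch"
            % COMPRESSION[compression])
    count = len(data) // 8
    return list(struct.unpack("<%dd" % count, data))


# Round trip with invented values to show the zlib case.
values = [10.0, 10.5, 11.0]
blob = zlib.compress(struct.pack("<3d", *values))
decoded = decode_blob(blob, 1)
```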
-
getDataForChromatogramFromNativeId
(native_id)¶ Get data from a single chromatogram
- compression is one of 0 = no, 1 = zlib, 2 = np-linear, 3 = np-slof, 4 = np-pic, 5 = np-linear + zlib, 6 = np-slof + zlib, 7 = np-pic + zlib
- data_type is one of 0 = mz, 1 = int, 2 = rt
- data contains the raw (blob) data for a single data array
-
getDataForChromatograms
(ids)¶ Get data from multiple chromatograms
- compression is one of 0 = no, 1 = zlib, 2 = np-linear, 3 = np-slof, 4 = np-pic, 5 = np-linear + zlib, 6 = np-slof + zlib, 7 = np-pic + zlib
- data_type is one of 0 = mz, 1 = int, 2 = rt
- data contains the raw (blob) data for a single data array
-
MzML File¶
-
class
openswathgui.models.SwathRun.
SwathRun
(files, runid=None, precursor_mapping=None, sequences_mapping=None, protein_mapping={})¶ Bases:
object
Data model for an individual SWATH injection (may contain multiple mzML files).
This contains the model for all data from a single run (e.g. one panel in the viewer). In reality this could be multiple actual MS runs, since in SRM not all peptides can be measured in the same injection, or just multiple files generated by SWATH MS.
It abstracts all the interfaces of SingleChromatogramFile; usually all other classes communicate directly with this class.
-
runid
¶ Current run id
- Private Attributes:
_all_swathes: Dictionary of { mz : SingleChromatogramFile }
_files: List of files that contain data for this run
_in_memory: Whether data should be held in memory
_range_mapping: Dictionary of { precursor_id : [leftWidth, rightWidth] }
_score_mapping: Dictionary of { precursor_id : FDR_score }
_intensity_mapping: Dictionary of { precursor_id : Intensity }
-
add_peakgroup_data
(precursor_id, leftWidth, rightWidth, fdrscore, intensity, assay_rt)¶
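The private mappings listed above suggest how peakgroup values could be stored per precursor. The class below is a hypothetical stand-in, not the SwathRun implementation: the `_assay_mapping` name and the behaviour for unknown precursors (returning None) are assumptions made for the sketch.

```python
class PeakgroupStore(object):
    """Hypothetical holder for per-precursor peakgroup values."""

    def __init__(self):
        self._range_mapping = {}      # { precursor_id : [leftWidth, rightWidth] }
        self._score_mapping = {}      # { precursor_id : FDR_score }
        self._intensity_mapping = {}  # { precursor_id : Intensity }
        self._assay_mapping = {}      # hypothetical name: { precursor_id : assay_rt }

    def add_peakgroup_data(self, precursor_id, leftWidth, rightWidth,
                           fdrscore, intensity, assay_rt):
        """Record one peakgroup, overwriting earlier values for the same precursor."""
        self._range_mapping[precursor_id] = [leftWidth, rightWidth]
        self._score_mapping[precursor_id] = fdrscore
        self._intensity_mapping[precursor_id] = intensity
        self._assay_mapping[precursor_id] = assay_rt

    def get_range_data(self, precursor_id):
        """Return [leftWidth, rightWidth], or None if nothing was recorded."""
        return self._range_mapping.get(precursor_id)


# Illustrative values only.
store = PeakgroupStore()
store.add_peakgroup_data("PEPTIDEK/2", 1200.5, 1230.0, 0.002, 15000.0, 1215.0)
```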
-
getTransitionCount
()¶ Aggregate transition count over all files
-
get_all_peptide_sequences
()¶
-
get_all_precursor_ids
()¶
-
get_all_proteins
()¶
-
get_assay_data
(precursor)¶
-
get_data_for_precursor
(precursor)¶ Retrieve raw data for a specific precursor (using the correct run).
-
get_data_for_transition
(transition_id)¶ Retrieve raw data for a specific transition (using the correct run).
-
get_id
()¶
-
get_intensity_data
(precursor)¶
-
get_precursors_for_sequence
(sequence)¶
-
get_range_data
(precursor)¶
-
get_score_data
(precursor)¶
-
get_sequence_for_protein
(protein)¶
-
get_transitions_for_precursor
(precursor)¶
-
get_transitions_for_precursor_display
(precursor)¶
-
remove_precursors
(toremove)¶ Remove a set of precursors from the run (this can be done to filter down the list of precursors to display).
-
-
class
openswathgui.models.SingleChromatogramFile.
SingleChromatogramFile
(run, filename, load_in_memory=False, precursor_mapping=None, sequences_mapping=None, protein_mapping={})¶ Data Model for a single file from one run.
One run may contain multiple mzML files
-
runid
¶ Current run id
- Private Attributes:
- _run: A pymzml.run.Reader object
- _filename: Original filename
- _basename: Original filename basename
- _precursor_mapping: Dictionary { FullPrecursorName : [transition_id, transition_id] }
- _sequences_mapping: Dictionary { StrippedSequence : [FullPrecursorName, FullPrecursorName]}
-
getTransitionCount
()¶ Get total number of transitions
-
get_all_peptide_sequences
()¶ Get all (stripped) sequences
-
get_all_precursor_ids
()¶ Get all precursor ids (full sequence + charge)
-
get_data_for_precursor
(precursor)¶ Retrieve raw data for a specific precursor. The data is returned as a list of pairs (timearray, intensityarray)
-
get_data_for_transition
(transition_id)¶ Retrieve raw data for a specific transition
-
get_id
()¶
-
get_precursors_for_sequence
(sequence)¶ Get all precursors mapping to one stripped sequence
-
get_sequence_for_protein
(protein)¶
-
get_transitions_for_precursor
(precursor)¶ Return the transition names for a specific precursor
-
get_transitions_with_mass_for_precursor
(precursor)¶ Return the transition names prepended with the mass for a specific precursor
-
ChromatogramTransition
Module¶
ChromatogramTransition¶
-
class
openswathgui.models.ChromatogramTransition.
ChromatogramTransition
(name, charge, subelements, peptideSequence=None, fullName=None, datatype='Precursor')¶ Bases:
object
Internal tree structure object representing one row in the left side tree.
This is the bridge between the view and the data model
Pointers to objects of
ChromatogramTransition
are passed to callback functions when the selection of the left side tree changes. The object needs to store information about all the columns present in the rows (PeptideSequence, Charge, Name) which are requested by the PeptideTree
model. It also needs to know how to access the raw data as well as meta-data for a certain transition. This is done through getData, getLabel etc.
-
getAssayRT
(run)¶ Get the assay retention time for a specific run and current precursor
Parameters: run ( SwathRun
) – SwathRun object which will be used to retrieve data
Returns: The assay retention time for a specific run and current precursor Return type: float
-
getCharge
()¶ Get charge of precursor
Returns: Charge Return type: int
-
getData
(run)¶ Get raw data for a certain object
If we have a single precursor or a peptide with only one precursor, we show the same data as for the precursor itself. For a peptide with multiple precursors, we show all precursors as individual curves. For a single transition, we simply plot that transition.
Parameters: run ( SwathRun
or SqlSwathRun
) – SwathRun object which will be used to retrieve data
Returns: Returns the raw data of the chromatograms for a given run. The data format is a list of transitions and each transition is a pair of (timearray, intensityarray) Return type: list of pairs (timearray, intensityarray)
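The documented return format, a list of (timearray, intensityarray) pairs, can be consumed as in the sketch below. The curves and labels are invented, `apex_per_curve` is a hypothetical helper, and each curve is assumed to be non-empty.

```python
def apex_per_curve(curves):
    """Return (retention_time, intensity) at the maximum of each curve.

    Assumption: every curve holds at least one data point.
    """
    result = []
    for times, intensities in curves:
        best = max(range(len(intensities)), key=lambda i: intensities[i])
        result.append((times[best], intensities[best]))
    return result


# Illustrative data in the documented shape: one pair per transition.
curves = [
    ([100.0, 101.0, 102.0], [5.0, 50.0, 8.0]),
    ([100.0, 101.0, 102.0], [2.0, 20.0, 30.0]),
]
# One label per curve, in the same order as the curves.
labels = ["y4", "y5"]
apexes = dict(zip(labels, apex_per_curve(curves)))
```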
-
getIntensity
(run)¶ Get the intensity for a specific run and current precursor
Parameters: run ( SwathRun
) – SwathRun object which will be used to retrieve data
Returns: The intensity for a specific run and current precursor Return type: float
-
getLabel
(run)¶ Get the labels for a curve (corresponding to the raw data from getData call) for a certain object.
If we have a single precursor or a peptide with only one precursor, we show the same data as for the precursor itself. For a peptide with multiple precursors, we show all precursors as individual curves. For a single transition, we simply plot that transition.
Parameters: run ( SwathRun
) – SwathRun object which will be used to retrieve data
Returns: The labels to display for each line in the graph Return type: list of str
-
getName
()¶ Get name of precursor
Returns: Name of precursor Return type: str
-
getPeptideSequence
()¶
-
getProbScore
(run)¶ Get the probabilistic score for a specific run and current precursor
Parameters: run ( SwathRun
) – SwathRun object which will be used to retrieve data
Returns: The probabilistic score for a specific run and current precursor Return type: float
-
getRange
(run)¶ Get the data range (leftWidth/rightWidth) for a specific run
Parameters: run ( SwathRun
) – SwathRun object which will be used to retrieve data
Returns: A pair of floats representing the data range (leftWidth/rightWidth) for a specific run Return type: list of float
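A leftWidth/rightWidth pair is typically combined with the raw curve to look only at the points inside the peak boundaries. The sketch below integrates one curve between the two boundaries with the trapezoidal rule. It is only an illustration with invented numbers: `area_within` is a hypothetical helper, and the intensity returned by getIntensity is not claimed to be computed this way.

```python
def area_within(times, intensities, left, right):
    """Trapezoidal area of the points whose time lies in [left, right]."""
    points = [(t, i) for t, i in zip(times, intensities) if left <= t <= right]
    area = 0.0
    # Sum the trapezoid between each pair of neighbouring points.
    for (t0, i0), (t1, i1) in zip(points, points[1:]):
        area += 0.5 * (i0 + i1) * (t1 - t0)
    return area


# Illustrative curve and peak boundaries.
times = [100.0, 101.0, 102.0, 103.0, 104.0]
intensities = [0.0, 10.0, 20.0, 10.0, 0.0]
left_width, right_width = 101.0, 103.0

area = area_within(times, intensities, left_width, right_width)
```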
-
getSubelements
()¶
-
getType
()¶
-