Math¶
Smoothing Module¶
SmoothingNull¶
SmoothingR¶
-
class msproteomicstoolslib.math.Smoothing.SmoothingR¶
Class to smooth data using the smooth.spline function from R
This is equivalent to the following code:
    data1 = c(5,7,8,9,10,15,7.1,6)
    data2 = c(4,7,9,11,11,14,7.1,6.5)
    data1 = sort(data1)
    data2 = sort(data2)
    smooth.model = smooth.spline(data1,data2,cv=T)
    data2_pred = predict(smooth.model,data2)$y
    [1]  2.342662  6.615797  7.292613  7.441842 10.489440 11.858406 11.858406
    [8] 13.482255
    plot(data1, data2)
    lines(data1, data2_pred, col="blue")
Doing the same thing in Python:
    import numpy
    import rpy2.robjects as robjects  # uses python-rpy2

    data1 = [5,7,8,9,10,15,7.1,6]
    data2 = [4,7,9,11,11,14,7.1,6.5]
    rdata1 = robjects.FloatVector(data1)
    rdata2 = robjects.FloatVector(data2)
    spline = robjects.r["smooth.spline"]
    # Pass the R vectors (not the Python lists); Python's True maps to R's TRUE
    sm = spline(rdata1, rdata2, cv=True)
    predict = robjects.r["predict"]
    predicted_data = predict(sm, rdata2)
    # The second element of the returned list holds the predicted y values
    numpy.array(predicted_data[1])
    array([ 2.34266247, 7.2926131 , 10.48943975, 11.85840597, 11.85840597, 13.48225519, 7.44184246, 6.61579704])
-
initialize(data1, data2)¶
-
predict(xhat)¶
-
SmoothingRExtern¶
SmoothingPy¶
-
class msproteomicstoolslib.math.Smoothing.SmoothingPy¶
Smoothing of 2D data using generalized cross-validation.
Calls _smooth_spline_scikit internally, but only at a few selected points. The resulting smoothed values are then used to construct an interpolating spline, on which the xhat data is evaluated.
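The two-stage idea described here (run the expensive smoother at a limited number of points, then interpolate between them) can be sketched as follows. This is a minimal illustration and not the library's implementation: `running_mean_smoother` is a hypothetical stand-in for `_smooth_spline_scikit`, plain linear interpolation replaces the interpolating spline, and the function names are invented for this example. Only `Nhat`, `xmin`, `xmax` and `xhat` correspond to names in the documented signatures.

```python
from bisect import bisect_right


def running_mean_smoother(xs, ys, width):
    """Hypothetical stand-in smoother: mean of all y with |x_i - x| <= width."""
    def smooth(x):
        near = [y for xi, y in zip(xs, ys) if abs(xi - x) <= width]
        return sum(near) / len(near) if near else float("nan")
    return smooth


def build_interpolator(smooth, xmin, xmax, nhat=200):
    """Evaluate `smooth` on nhat grid points, then interpolate linearly."""
    step = (xmax - xmin) / (nhat - 1)
    grid = [xmin + i * step for i in range(nhat)]
    values = [smooth(g) for g in grid]  # the costly step, done only nhat times

    def predict(xhat):
        out = []
        for x in xhat:
            x = min(max(x, xmin), xmax)  # clamp queries to the grid range
            j = min(max(bisect_right(grid, x) - 1, 0), nhat - 2)
            t = (x - grid[j]) / (grid[j + 1] - grid[j])
            out.append(values[j] + t * (values[j + 1] - values[j]))
        return out

    return predict


# Usage: y = 2 * x sampled every 0.5 units, smoothed on an 11-point grid
xs = [i * 0.5 for i in range(21)]
ys = [2.0 * x for x in xs]
predict = build_interpolator(running_mean_smoother(xs, ys, 1.0), 0.0, 10.0, nhat=11)
result = predict([5.5])
```

The cost of the smoother then depends on the grid size rather than on the number of query points, which is presumably why the class evaluates it only at selected points.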
-
de_duplicate_array(arr)¶
-
initialize(data1, data2, Nhat=200, xmin=None, xmax=None)¶
-
predict(xhat)¶
-
re_duplicate_array(arr_fixed, duplications)¶
-
LowessSmoothingStatsmodels¶
-
class msproteomicstoolslib.math.Smoothing.LowessSmoothingStatsmodels¶
Bases: msproteomicstoolslib.math.Smoothing.LowessSmoothingBase
Smoothing using a Lowess smoother, followed by interpolation on the result.
statsmodels now also has a fast Cython lowess, see https://github.com/statsmodels/statsmodels/pull/856
This faster lowess should be available from version 0.5.0 of statsmodels (Anaconda currently ships version 0.6.0). However, Ubuntu only ships version 0.5.0 from 14.04 onwards, so be careful.
frac: float
    Between 0 and 1. The fraction of the data used when estimating each y-value.
it: int
    The number of residual-based reweightings to perform.
LowessSmoothingBiostats¶
-
class msproteomicstoolslib.math.Smoothing.LowessSmoothingBiostats¶
Bases: msproteomicstoolslib.math.Smoothing.LowessSmoothingBase
Smoothing using a Lowess smoother, followed by interpolation on the result.
LowessSmoothingCyLowess¶
-
class msproteomicstoolslib.math.Smoothing.LowessSmoothingCyLowess¶
Bases: msproteomicstoolslib.math.Smoothing.LowessSmoothingBase
Smoothing using a Lowess smoother, followed by interpolation on the result.
UnivarSplineNoCV¶
UnivarSplineCV¶
-
class msproteomicstoolslib.math.Smoothing.UnivarSplineCV¶
Smoothing of 2D data using a Python spline (using cross-validation to determine the smoothing parameters).
Uses UnivariateSpline internally and chooses the scipy smoothing parameter “s” by cross-validation on part of the data (usually a 75/25 training/test split). This prevents overfitting to the data.
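As a rough illustration of choosing “s” on a held-out split, the sketch below fits `scipy.interpolate.UnivariateSpline` on a random training subset and keeps the value of `s` with the lowest error on the remaining points. It is an assumption-laden sketch, not the class's actual code: the function name `fit_spline_cv`, the starting value of `s`, the geometric decrease schedule and the `seed` argument are all invented for this example; only the parameter names `frac_training_data`, `max_iter` and `s_iter_decrease` are taken from the `initialize` signature, and their exact meaning in the library is assumed.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def fit_spline_cv(x, y, frac_training_data=0.75, max_iter=100,
                  s_iter_decrease=0.75, seed=0):
    """Choose UnivariateSpline's smoothing factor s on a held-out split."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    x, y = x[order], y[order]

    # Random split; sorting the training indices keeps x_train increasing,
    # which UnivariateSpline requires.
    rng = np.random.default_rng(seed)
    n_train = int(round(frac_training_data * len(x)))
    train_idx = np.sort(rng.choice(len(x), size=n_train, replace=False))
    is_test = np.ones(len(x), dtype=bool)
    is_test[train_idx] = False
    x_train, y_train = x[train_idx], y[train_idx]
    x_test, y_test = x[is_test], y[is_test]

    # Assumed schedule: start with heavy smoothing, then shrink s geometrically.
    s = len(x_train) * float(np.var(y_train))
    best_s, best_err, best_spline = s, np.inf, None
    for _ in range(max_iter):
        spline = UnivariateSpline(x_train, y_train, k=3, s=s)
        err = float(np.mean((spline(x_test) - y_test) ** 2))
        if err < best_err:  # keep the s that generalises best to unseen points
            best_s, best_err, best_spline = s, err, spline
        s *= s_iter_decrease
    return best_spline, best_s


# Usage: noisy sine curve
data_rng = np.random.default_rng(1)
x_demo = np.linspace(0.0, 10.0, 80)
y_demo = np.sin(x_demo) + data_rng.normal(scale=0.1, size=x_demo.size)
spline_demo, s_demo = fit_spline_cv(x_demo, y_demo, max_iter=30)
```

Scoring each candidate on points the spline has not seen is what guards against overfitting: a very small `s` reproduces the training noise and is penalised on the test points.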
-
initialize(data1, data2, frac_training_data=0.75, max_iter=100, s_iter_decrease=0.75, verb=False)¶
-
predict(xhat)¶
-
SmoothingEarth¶
-
class msproteomicstoolslib.math.Smoothing.SmoothingEarth¶
Class for MARS type smoothing based on pyearth
Get it at https://github.com/jcrudy/py-earth/
-
initialize(data1, data2)¶
-
predict(xhat)¶
-
SmoothingLinear¶
SmoothingInterpolation¶
LocalKernel¶
WeightedNearestNeighbour¶
-
class msproteomicstoolslib.math.Smoothing.WeightedNearestNeighbour(topN, max_diff, min_diff, removeOutliers, exponent=1.0)¶
Bases: msproteomicstoolslib.math.Smoothing.LocalKernel
Class for weighted interpolation using local linear differences.
This function uses the weighted mean of the k nearest neighbors to calculate the transformation. This method may be affected by a single outlier close to the transformation point.
Each neighboring point is given a weight equal to
              1
    ----------------------
    abs( distance ) ** exp
up to a minimal distance min_diff after which the weight cannot increase any more.
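A minimal sketch of this weighting scheme is given below. It rests on assumptions that the text does not spell out: the "local linear difference" of a neighbor is taken to be `y - x`, the prediction is `xhat` plus the weighted mean of those differences, and `top_n` plays the role of `topN`. The function name is invented for illustration, and `max_diff` and `removeOutliers` are not modeled.

```python
def weighted_nn_predict(xhat, xs, ys, top_n=3, min_diff=0.1, exponent=1.0):
    """Shift xhat by the distance-weighted mean difference of its neighbors."""
    # Pick the top_n anchor points closest to xhat on the x axis
    neighbors = sorted(zip(xs, ys), key=lambda p: abs(p[0] - xhat))[:top_n]

    weights = []
    diffs = []
    for x, y in neighbors:
        # Clamp the distance at min_diff so the weight is capped at
        # 1 / min_diff ** exponent (and never divides by zero)
        dist = max(abs(x - xhat), min_diff)
        weights.append(1.0 / dist ** exponent)
        diffs.append(y - x)  # assumed meaning of "local linear difference"

    shift = sum(w * d for w, d in zip(weights, diffs)) / sum(weights)
    return xhat + shift


# Usage: every anchor is shifted by +0.5, so any query is shifted by +0.5 too
constant_shift = weighted_nn_predict(2.2, [1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])

# With min_diff=0.5 the exact match at x=0 gets weight 2 rather than infinity
clamped = weighted_nn_predict(0.0, [0.0, 1.0], [0.0, 2.0], top_n=2, min_diff=0.5)
```

Because close points receive large weights, one outlier near `xhat` can dominate the mean, which is the sensitivity the class description warns about.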
-
predict(xhat)¶
-
SmoothLLDMedian¶
-
class msproteomicstoolslib.math.Smoothing.SmoothLLDMedian(topN, max_diff, min_diff, removeOutliers)¶
Bases: msproteomicstoolslib.math.Smoothing.LocalKernel
Class for local median interpolation using local linear differences.
This function uses the median of the k nearest neighbors to calculate the transformation. This is a robust, unweighted method, as a single outlier will not substantially affect the result.
This method assumes that the data is locally smooth and linear.
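The robustness claim can be illustrated with a short sketch. As before, this is an interpretation rather than the library's code: the "local linear difference" of a neighbor is assumed to be `y - x`, the prediction is `xhat` plus the median of those differences over the `top_n` nearest points, and the function name is invented for this example. `max_diff`, `min_diff` and `removeOutliers` are not modeled.

```python
from statistics import median


def local_median_predict(xhat, xs, ys, top_n=3):
    """Shift xhat by the median difference (y - x) of its nearest neighbors."""
    # Pick the top_n anchor points closest to xhat on the x axis
    neighbors = sorted(zip(xs, ys), key=lambda p: abs(p[0] - xhat))[:top_n]
    return xhat + median(y - x for x, y in neighbors)


# Usage: the anchor at x=3 is an outlier (difference 10 instead of 0.5)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.5, 2.5, 13.0, 4.5, 5.5]
robust = local_median_predict(3.0, xs, ys, top_n=3)
```

The three nearest differences are 10, 0.5 and 0.5, so the median ignores the outlier and the query is shifted by 0.5.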
-
predict(xhat)¶
-