July 31, 2012, midnight by Sonya Alexandrova
Topics: Computational Mass Spectrometry
Comparing Spectra
Suppose you have two mass spectra and want to check whether both were obtained from the same protein; you will need some notion of spectral similarity. The simplest possible metric is the number of peaks that the two spectra share, called the shared peaks count; its analogue for simplified spectra is the number of masses that the two spectra have in common.
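As a rough illustration of the shared peaks count for simplified spectra, the sketch below counts the masses two lists have in common. The function name `shared_peaks_count`, the rounding to 5 decimal places (so that floating-point noise does not hide a match), and the choice to count a repeated mass by its smaller multiplicity are all assumptions made for this example, not part of the original text.

```python
from collections import Counter

def shared_peaks_count(spectrum_a, spectrum_b, ndigits=5):
    """Count the masses that two simplified spectra have in common."""
    # Round each mass so that values equal up to float noise compare equal.
    count_a = Counter(round(m, ndigits) for m in spectrum_a)
    count_b = Counter(round(m, ndigits) for m in spectrum_b)
    # Counter & Counter keeps, for each common mass, the smaller multiplicity.
    return sum((count_a & count_b).values())

a = [101.04768, 158.06914, 202.09536]
b = [101.04768, 202.09536, 318.09979]
print(shared_peaks_count(a, b))  # two masses are common to both lists
```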
The shared peaks count can be useful in the simplest cases, but it does not help if, for example, one spectrum corresponds to a peptide contained inside another peptide from which the second spectrum was obtained. In this case, the two spectra are very similar, but the shared peaks count will be very small. However, if we shift one spectrum to the right or left, then the shared peaks will align. In the case of simplified spectra, this means that there is some shift value $x$ such that adding $x$ to the weight of every element in one spectrum creates a large number of matches in the other spectrum.
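The effect of a shift can be checked directly. The following sketch adds a candidate shift `x` to every mass of one simplified spectrum and counts how many masses of the other spectrum are hit. The helper name `matches_after_shift` and the 5-decimal rounding are assumptions for illustration; the masses are a subset of the sample data given in this problem.

```python
def matches_after_shift(spectrum_a, spectrum_b, x, ndigits=5):
    """Count masses of spectrum_a matched after adding x to spectrum_b."""
    # Shift every mass of spectrum_b by x; round to absorb float noise.
    shifted = {round(m + x, ndigits) for m in spectrum_b}
    return sum(1 for m in spectrum_a if round(m, ndigits) in shifted)

a = [186.07931, 287.12699, 548.20532]
b = [101.04768, 202.09536, 463.17369]

print(matches_after_shift(a, b, 0.0))       # no shift: nothing aligns
print(matches_after_shift(a, b, 85.03163))  # this shift aligns all three
```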
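A multiset is a generalization of the notion of set to include a collection of objects in which each object may occur more than once (the order in which objects are given is still unimportant). For a multiset $S$, the multiplicity of an element $x$ is the number of times that $x$ occurs in $S$; we denote it $S(x)$.
The Minkowski sum of multisets $S_1$ and $S_2$ of real numbers is the multiset $S_1 \oplus S_2$ formed by taking all sums $s_1 + s_2$ of an element $s_1$ from $S_1$ and an element $s_2$ from $S_2$. The Minkowski difference $S_1 \ominus S_2$ is defined analogously from all differences $s_1 - s_2$.
If $S_1$ and $S_2$ are simplified spectra, then the multiplicity $(S_1 \ominus S_2)(x)$ counts the pairs of masses that align once $x$ is added to every element of $S_2$; in particular, $(S_1 \ominus S_2)(0)$ is the shared peaks count.
Given: Two multisets of positive real numbers $S_1$ and $S_2$.
Return: The largest multiplicity of $S_1 \ominus S_2$, as well as a shift value $x$ attaining it.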
Sample Dataset
186.07931 287.12699 548.20532 580.18077 681.22845 706.27446 782.27613 968.35544 968.35544
101.04768 158.06914 202.09536 318.09979 419.14747 463.17369
Sample Output
3
85.03163
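One possible way to reproduce the sample answer is to form every pairwise difference between a mass of the first multiset and a mass of the second, then report the most frequent difference. This is only a sketch under stated assumptions: the names `difference_multiset` and `best_shift` are hypothetical, differences are rounded to 5 decimal places so that equal shifts compare equal despite floating-point noise, and ties are broken arbitrarily.

```python
from collections import Counter

def difference_multiset(s1, s2, ndigits=5):
    """Multiset of all differences a - b with a from s1 and b from s2."""
    # Rounding is an assumption so that float noise does not split a shift.
    return Counter(round(a - b, ndigits) for a in s1 for b in s2)

def best_shift(s1, s2, ndigits=5):
    """Return (largest multiplicity, a shift value attaining it)."""
    differences = difference_multiset(s1, s2, ndigits)
    # most_common(1) yields one (value, count) pair with the highest count.
    shift, multiplicity = differences.most_common(1)[0]
    return multiplicity, shift

s1 = [186.07931, 287.12699, 548.20532, 580.18077, 681.22845,
      706.27446, 782.27613, 968.35544, 968.35544]
s2 = [101.04768, 158.06914, 202.09536, 318.09979, 419.14747, 463.17369]

multiplicity, shift = best_shift(s1, s2)
print(multiplicity)
print(shift)
```

With lists of a few hundred masses each, enumerating all pairs stays cheap, so no cleverer strategy is attempted here.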
Observe that