Description
Hi,
My goal is the following. When I use other software (such as SIRIUS) for downstream non-targeted screening of mass spectrometry data, I can only predict structures from the MS2 information, because the isotope abundance information from MS1 is missing (or I could not find it in the mgf file). This makes the prediction much more computationally expensive, and for predicted unknown structures I have to go back to the original data to check them, which is complex and time-consuming when processing large amounts of data. I therefore tried to export the MS1 and MS2 data separately, as shown below:
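library(xcms)          # featureSpectra()
library(Spectra)       # setBackend(), combineSpectra(), applyProcessing(), export()
library(MsBackendMgf)  # MsBackendMgf()
library(stringr)       # str_c()
app_spectra_feature_1 <- featureSpectra(app_data, msLevel = 1L, return.type = "Spectra")
app_spectra_feature_1 <- setBackend(app_spectra_feature_1, MsBackendDataFrame())
combined_app_spectra_feature_1 <- Spectra::combineSpectra(app_spectra_feature_1, f = app_spectra_feature_1$feature_id, FUN = maxTic)
# applyProcessing() returns a new object, so the result has to be assigned
combined_app_spectra_feature_1 <- applyProcessing(combined_app_spectra_feature_1)
export(combined_app_spectra_feature_1, backend = MsBackendMgf(), file = str_c(getwd(), "/export_ms1.mgf"))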
for the MS1 data, and
app_spectra_feature_2 <- featureSpectra(app_data, msLevel = 2L, return.type = "Spectra")
app_spectra_feature_2 <- setBackend(app_spectra_feature_2, MsBackendDataFrame())
combined_app_spectra_feature_2 <- Spectra::combineSpectra(app_spectra_feature_2, f = app_spectra_feature_2$feature_id, FUN = maxTic)
# applyProcessing() returns a new object, so the result has to be assigned
combined_app_spectra_feature_2 <- applyProcessing(combined_app_spectra_feature_2)
export(combined_app_spectra_feature_2, backend = MsBackendMgf(), file = str_c(getwd(), "/export_ms2.mgf"))
for the MS2 data,
where app_data is the data after XCMS alignment and the maxTic function is copied from the tutorial of the Spectra package (https://bioconductor.org/packages/devel/bioc/vignettes/Spectra/inst/doc/Spectra.html).
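For readers who do not have the tutorial at hand, here is a minimal sketch of what such a maxTic helper looks like. It assumes, as far as I understand combineSpectra, that FUN receives a list of peak matrices with "mz" and "intensity" columns and must return a single peak matrix. The toy matrices `a` and `b` are made up for illustration.

```r
# Sketch of a maxTic-style selector: from a list of peak matrices, return the
# one with the largest summed intensity (total ion current).
maxTic <- function(x, ...) {
    tic <- vapply(x, function(z) sum(z[, "intensity"], na.rm = TRUE), numeric(1))
    x[[which.max(tic)]]
}

# Toy peak matrices standing in for two spectra of the same feature
# (values are illustrative only).
a <- cbind(mz = c(100.98, 103.96), intensity = c(27.7, 193.8))
b <- cbind(mz = c(100.98, 103.96), intensity = c(10.0, 20.0))

# Selects the peak matrix with the higher summed intensity.
best <- maxTic(list(a, b))
```

With a selector like this, each feature keeps one original spectrum rather than a merged one, so the scan-level metadata in the export belongs to a single scan.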
But the two output files don't match: the MS1 and MS2 entries for the same feature come from scans at different retention times.
Taking feature_id=FT0006 as an example:
In the MS1 file:
BEGIN IONS
TITLE=F1.S03885
msLevel=1
RTINSECONDS=324.840585373391
SCANS=3885
scanIndex=3885
centroided=TRUE
polarity=0
CHARGE=
fromFile=1
spIdx=3885
originalPeaksCount=8424
totIonCurrent=2528588
basePeakMZ=271.065690943058
basePeakIntensity=21486
ionisationEnergy=0
lowMZ=0
highMZ=0
injectionTime=0
spectrumId=sample=1 period=1 cycle=435 experiment=1
scanWindowLowerLimit=100
scanWindowUpperLimit=2000
spectrum=3885
peak_index=1757
peak_id=CP1950
feature_id=FT0006
100.983695305041 27.7389870009022
101.94674481063 17.0476436590638
102.973269367472 36.0988944285392
103.963478572949 193.78539977588
......
and in the MS2 file:
BEGIN IONS
TITLE=F1.S04517
msLevel=2
RTINSECONDS=373.171342169065
SCANS=4517
scanIndex=4517
centroided=TRUE
polarity=0
PEPMASS=104.017951192131
PEPMASSINT=0
CHARGE=0-
collisionEnergy=0
isolationWindowLowerMz=103.517951192131
isolationWindowTargetMz=104.017951192131
isolationWindowUpperMz=104.517951192131
fromFile=1
spIdx=4517
originalPeaksCount=0
totIonCurrent=74
basePeakMZ=87.918949764033
basePeakIntensity=20
ionisationEnergy=0
lowMZ=0
highMZ=0
injectionTime=0
spectrumId=sample=1 period=1 cycle=475 experiment=2
scanWindowLowerLimit=50
scanWindowUpperLimit=1500
spectrum=4517
peak_index=1757
peak_id=CP1950
feature_id=FT0006
241.930035923613 21.8450776231407
242.145258260792 13.1129049304946
242.348550934539 13.1184082351865
242.557397242146 30.6224602596217
242.748926995843 13.1292669063043
244.634332081977 21.9668801553894
244.776038944817 8.78929660092354
245.003514817378 4.3966898489316
245.087059045271 4.39743940307835
245.478587394706 13.2028514158264
246.974983600631 4.41434381917816
END IONS
All I want is to obtain the matching MS1 information for each MS2 spectrum. Is there any way to do this?
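To make the goal concrete, this is roughly the pairing I am after, sketched on a plain data frame rather than on the xcms object. `pair_ms1_to_ms2` and the toy scan table are hypothetical and only illustrate the idea; `rtime` is an assumed column name for the retention time, and the other columns mirror the variables seen in the exported files.

```r
# Hypothetical helper (not part of xcms or Spectra): for each MS2 scan, find
# the closest preceding MS1 scan acquired in the same file.
pair_ms1_to_ms2 <- function(scans) {
    ms1 <- scans[scans$msLevel == 1L, ]
    ms2 <- scans[scans$msLevel == 2L, ]
    ms2$ms1_scan <- vapply(seq_len(nrow(ms2)), function(i) {
        # MS1 scans from the same file that were acquired at or before this MS2 scan
        cand <- ms1[ms1$fromFile == ms2$fromFile[i] & ms1$rtime <= ms2$rtime[i], ]
        if (nrow(cand) == 0L) return(NA_integer_)
        cand$scanIndex[which.max(cand$rtime)]
    }, integer(1))
    ms2
}

# Toy scan table; all values are made up for illustration.
scans <- data.frame(
    scanIndex = 1:6,
    msLevel   = c(1L, 2L, 2L, 1L, 2L, 1L),
    rtime     = c(1.0, 1.2, 1.4, 2.0, 2.2, 3.0),
    fromFile  = 1L
)
paired <- pair_ms1_to_ms2(scans)
```

In this sketch every MS2 scan ends up with the index of the survey scan that directly preceded it, which is the MS1 spectrum I would like to export alongside it.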
I have also copied the summary of 'app_data' here for your convenience:
MSn experiment data ("XCMSnExp")
Object size in memory: 10.83 Mb
- - - Spectra data - - -
MS level(s): 1 2
Number of spectra: 29007
MSn retention times: 0:00 - 24:60 minutes
- - - Processing information - - -
Data loaded [Tue Aug 27 10:00:41 2024]
MSnbase version: 2.28.1
- - - Meta data - - -
phenoData
rowNames: 1 2
varLabels: sample_name sample_group
varMetadata: labelDescription
Loaded from:
ch4-neg.mzML, o-neg.mzML
protocolData: none
featureData
featureNames: F1.S00001 F1.S00002 ... F2.S14327 (29007 total)
fvarLabels: fileIdx spIdx ... spectrum (35 total)
fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'
- - - xcms preprocessing - - -
Chromatographic peak detection:
method: centWave
11072 peaks identified in 2 samples.
On average 5536 chromatographic peaks per sample.
Alignment/retention time adjustment:
method: peak groups
Correspondence:
method: chromatographic peak density
5836 features identified.
Median mz range of features: 0
Median rt range of features: 0
3845 filled peaks (on average 1922.5 per sample).
This is my first time asking a question on GitHub. If there is any problem with my question, please tell me.
Thank you!