In addition to this tutorial, you can find video tutorials on TopPIC Suite (link) and the interpretation of TopPIC and TopMG identifications (link). We thank Dr. David Tabb for making these video tutorials.
In this tutorial, we use TopPIC Suite to analyze two top-down LC-MS/MS data files on a computer with a Windows Operating System. Annotated proteoform spectrum matches (PrSMs) identified by TopPIC from the data files can be browsed here.
Create the folders below for software packages and data sets used in this tutorial.
toppic_tutorial on the C: drive of your system.
          toppic in the folder C:\toppic_tutorial\ for
            the software TopPIC suite.
          tutorial_1 in the folder C:\toppic_tutorial\.
          tutorial_2 in the folder C:\toppic_tutorial\.
          tutorial_3 in the folder C:\toppic_tutorial\.
          tutorial_4 in the folder C:\toppic_tutorial\.
        The resulting folder structure is shown in the screenshot below.
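The folder layout above can also be created with a short script. The following is a minimal Python sketch, not part of TopPIC Suite; the function name create_tutorial_folders is hypothetical and chosen only for illustration.

```python
from pathlib import Path


def create_tutorial_folders(root):
    """Create toppic_tutorial and its five subfolders under the given root."""
    base = Path(root) / "toppic_tutorial"
    subfolders = ["toppic", "tutorial_1", "tutorial_2", "tutorial_3", "tutorial_4"]
    created = []
    for name in subfolders:
        folder = base / name
        # parents=True also creates toppic_tutorial; exist_ok avoids errors on reruns
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

On Windows, calling create_tutorial_folders("C:/") would produce the structure used in this tutorial.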
           
        
Msconvert is a software tool in ProteoWizard that converts raw files into various spectrum file formats. Follow the steps below to download ProteoWizard:
C:\toppic_tutorial\toppic\.C:\toppic_tutorial\toppic\.
           
          
In the MS experiment, the protein extract of S. typhimurium was reduced with dithiothreitol and alkylated with iodoacetamide. The protein mixture was first separated by gas-phase fractionation, resulting in 7 fractions. Each fraction was separated by an HPLC system coupled with an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific). MS and MS/MS spectra were collected at a resolution of 60,000 and 30,000, respectively. In this tutorial, we use only the data files of two fractions (st_1.raw and st_2.raw).
Click here
          to download the data set, save it in the folder C:\toppic_tutorial\tutorial_1\, and unzip it in
          the same folder.
A S. typhimurium proteome database of 4,533 proteins was downloaded from the UniProt database.
Click here
          to download the protein database and save it in the folder
          C:\toppic_tutorial\tutorial_1\.
        
The folder C:\toppic_tutorial\tutorial_1\ is shown in the screenshot below.
 
        
        We use TopIndex to generate index files from the protein database. The index files speed up the database search of TopPIC and TopMG. This step is optional: skipping index generation only slows the database search described in Section 4.5. While TopIndex supports multithreading, users with a spinning hard disk may get faster index generation with one thread than with multiple threads. TopIndex generates very large index files. For example, the index files generated for the target-decoy concatenated UniProt human proteome database are about 240 GB. For fast index generation, we suggest using a computer with at least a 1 TB SSD (solid-state drive).
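Because index files can be very large, it may help to check the free disk space before running TopIndex. The sketch below uses only the Python standard library; the helper name has_enough_free_space and the example threshold are illustrative assumptions, not TopIndex requirements.

```python
import shutil


def has_enough_free_space(folder, required_bytes):
    """Return True if the drive holding the folder has at least required_bytes free."""
    return shutil.disk_usage(folder).free >= required_bytes


# Example threshold (assumption): the 240 GB quoted above for the human proteome.
HUMAN_INDEX_BYTES = 240 * 10**9
```

For example, has_enough_free_space("C:/toppic_tutorial", HUMAN_INDEX_BYTES) reports whether the drive could hold an index of that size.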
topindex_gui.exe in the folder
            C:\toppic_tutorial\toppic.
          C:\toppic_tutorial\tutorial_1\uniprot-st.fasta.
          Carbamidomethylation on cysteine as the fixed modification.
          Decoy database.
          The screenshot of topindex_gui is shown below.
        
 
        
          TopIndex generates a folder
          C:\toppic_tutorial\tutorial_1\uniprot-st.fasta_idx
          containing index files.
        
In the analysis, carbamidomethylation is selected as the fixed modification because proteins were reduced with dithiothreitol and alkylated with iodoacetamide before the MS experiment. When proteins are not reduced, no fixed modification should be selected.
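To illustrate what a fixed modification means for proteoform masses, the sketch below adds the carbamidomethylation shift to every cysteine in a sequence. The monoisotopic shift of 57.021464 Da is the standard value for carbamidomethylation (C2H3NO); the function name and the example sequence are hypothetical, and the code is not taken from TopPIC.

```python
CARBAMIDOMETHYL_SHIFT = 57.021464  # monoisotopic mass shift in Da (C2H3NO)


def fixed_mod_shift(sequence, residue="C", shift=CARBAMIDOMETHYL_SHIFT):
    """Total mass shift contributed by a fixed modification on every given residue."""
    return sequence.upper().count(residue) * shift


# A made-up sequence with two cysteines gains 2 x 57.021464 Da.
example_shift = fixed_mod_shift("ACDEFCK")
```

When proteins are not reduced and alkylated, no such shift applies, which is why no fixed modification should be selected in that case.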
We use MSConvertGUI to convert the raw files st_1.raw and st_2.raw to mzML files.
C:\toppic_tutorial\tutorial_1\st_1.raw
            and C:\toppic_tutorial\tutorial_1\st_2.raw as input
            files.
The screenshot of MSConvertGUI is shown below.
 
        
In the above file format conversion, the peak picking filter (step 3) is used to generate centroid, not profile, mzML data files, which are required by the spectral deconvolution tool TopFD.
The resulting mzML files are
C:\toppic_tutorial\tutorial_1\st_1.mzML and
C:\toppic_tutorial\tutorial_1\st_2.mzML
The sizes of the two files are about 41 MB and 47 MB, respectively. They can be downloaded here. The running time for the file format conversion is less than one minute.
We use topfd_gui for top-down mass spectral deconvolution.
topfd_gui.exe in the folder
            C:\toppic_tutorial\toppic.
          C:\toppic_tutorial\tutorial_1\st_1.mzML
            and C:\toppic_tutorial\tutorial_1\st_2.mzML as input files.
          The screenshot of topfd_gui is shown below.
        
 
        
TopFD reports ten text files and two folders.
C:\toppic_tutorial\tutorial_1\st_1_ms1.msalign
C:\toppic_tutorial\tutorial_1\st_2_ms1.msalign
C:\toppic_tutorial\tutorial_1\st_1_ms2.msalign
C:\toppic_tutorial\tutorial_1\st_2_ms2.msalign
C:\toppic_tutorial\tutorial_1\st_1_ms1.feature
C:\toppic_tutorial\tutorial_1\st_1_ms2.feature
C:\toppic_tutorial\tutorial_1\st_2_ms1.feature
C:\toppic_tutorial\tutorial_1\st_2_ms2.feature
C:\toppic_tutorial\tutorial_1\st_1_feature.xml
C:\toppic_tutorial\tutorial_1\st_2_feature.xml
C:\toppic_tutorial\tutorial_1\st_1_html\topfd
C:\toppic_tutorial\tutorial_1\st_2_html\topfd
The output files and folders can be downloaded here.
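The output names listed above follow a regular pattern derived from the input file name. The sketch below reproduces that pattern so the results can be checked after a run. It is inferred only from the file list in this tutorial; the function name expected_topfd_outputs is hypothetical.

```python
from pathlib import PureWindowsPath


def expected_topfd_outputs(mzml_path):
    """Return (text files, HTML folder) that TopFD is expected to write for one mzML file."""
    path = PureWindowsPath(mzml_path)
    stem, folder = path.stem, path.parent
    suffixes = ["_ms1.msalign", "_ms2.msalign", "_ms1.feature",
                "_ms2.feature", "_feature.xml"]
    files = [str(folder / (stem + suffix)) for suffix in suffixes]
    html_folder = str(folder / (stem + "_html") / "topfd")
    return files, html_folder


files, html_folder = expected_topfd_outputs(r"C:\toppic_tutorial\tutorial_1\st_1.mzML")
```

Each input file yields five text files and one folder, which gives the ten text files and two folders reported for st_1.mzML and st_2.mzML.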
We use toppic_gui to search the MS/MS spectra in
            st_1_ms2.msalign and st_2_ms2.msalign
            against the protein database uniprot-st.fasta to
            identify PrSMs with a variable PTM file var_mods.txt,
            in which oxidation on methionine is set
            as a variable PTM. The variable PTM file can be downloaded
            here.
          
toppic_gui.exe in the folder
              C:\toppic_tutorial\toppic.
            C:\toppic_tutorial\tutorial_1\uniprot-st.fasta as the protein
              database file.
            C:\toppic_tutorial\tutorial_1\st_1_ms2.msalign
              and C:\toppic_tutorial\tutorial_1\st_2_ms2.msalign as
              mass spectrum data files.
            Carbamidomethylation on cysteine as the fixed modification.
            Decoy database.
            FDR as the spectrum level cutoff type.
            FDR as the proteoform level cutoff type.
            The screenshots of toppic_gui are shown below.
          
 
          
 
          
For each input msalign file, TopPIC reports four TSV files, two XML files, and collections of HTML files for identified proteoforms. For example, the output files for st_1_ms2.msalign are
C:\toppic_tutorial\tutorial_1\st_1_ms2_toppic_prsm.tsv
C:\toppic_tutorial\tutorial_1\st_1_ms2_toppic_prsm_single.tsv
C:\toppic_tutorial\tutorial_1\st_1_ms2_toppic_proteoform.tsv
C:\toppic_tutorial\tutorial_1\st_1_ms2_toppic_proteoform_single.tsv
C:\toppic_tutorial\tutorial_1\st_1_ms2_toppic_proteoform.xml
C:\toppic_tutorial\tutorial_1\st_1_ms2_toppic_prsm.xml
C:\toppic_tutorial\tutorial_1\st_1_html\toppic_prsm_cutoff
C:\toppic_tutorial\tutorial_1\st_1_html\toppic_proteoform_cutoff
C:\toppic_tutorial\tutorial_1\st_1_html\topmsv
In addition, the identifications reported for st_1_ms2.msalign and st_2_ms2.msalign are combined, and filtered by a 1% spectrum-level FDR and a 1% proteoform-level FDR. The combined results are reported in the following files.
C:\toppic_tutorial\tutorial_1\combined_ms2_toppic_prsm.tsv
C:\toppic_tutorial\tutorial_1\combined_ms2_toppic_prsm_single.tsv
C:\toppic_tutorial\tutorial_1\combined_ms2_toppic_proteoform.tsv
C:\toppic_tutorial\tutorial_1\combined_ms2_toppic_proteoform_single.tsv
C:\toppic_tutorial\tutorial_1\combined_ms2_toppic_proteoform.xml
C:\toppic_tutorial\tutorial_1\combined_ms2_toppic_prsm.xml
In the analysis, carbamidomethylation is selected as the fixed modification because proteins were reduced with dithiothreitol and alkylated with iodoacetamide before the MS experiment. When proteins are not reduced, no fixed modification should be selected.
            A shuffled decoy database is concatenated
            to the target database to estimate spectrum-level and proteoform-level
            FDRs. All identified PrSMs are first filtered by a
            1% spectrum-level FDR and the resulting PrSMs are reported in the
            file combined_ms2_toppic_prsm.tsv. The proteoforms corresponding to the PrSMs
            are further filtered using a 1% proteoform-level FDR and
            the resulting proteoforms and their corresponding best PrSMs are reported in the file
            combined_ms2_toppic_proteoform.tsv. Microsoft Excel can be used to open these two files.
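The idea behind target-decoy FDR filtering can be sketched in a few lines. The code below is a generic illustration and not TopPIC's implementation: it assumes each match is a (score, is_decoy) pair in which a lower score (for example, an E-value) is better, and it estimates the FDR as the number of decoy matches divided by the number of target matches.

```python
def filter_by_fdr(matches, fdr_cutoff=0.01):
    """Return target matches from the largest top-ranked set whose estimated FDR <= cutoff.

    matches: iterable of (score, is_decoy) pairs; a lower score is a better match.
    """
    ranked = sorted(matches, key=lambda match: match[0])
    targets = decoys = 0
    best_size = 0
    for size, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR of the top `size` matches (assumed formula: decoys / targets)
        if targets > 0 and decoys / targets <= fdr_cutoff:
            best_size = size
    return [match for match in ranked[:best_size] if not match[1]]
```

With a 1% cutoff, a decoy match ranked among the first few matches ends the accepted set early, while a large data set can tolerate a few decoys before the estimate exceeds the cutoff.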
	    
          
The output files can be downloaded here.
We use topindex to generate index files from the protein database uniprot-st.fasta
            to speed up database search of TopPIC and TopMG.
          
C:\toppic_tutorial\toppic\topindex.exe
C:\toppic_tutorial\tutorial_1\uniprot-st.fasta
cd C:\toppic_tutorial\tutorial_1
..\toppic\topindex -f C57 -d uniprot-st.fasta
We use topfd for top-down mass spectral deconvolution.
C:\toppic_tutorial\toppic\topfd.exe
C:\toppic_tutorial\tutorial_1\st_1.mzML
C:\toppic_tutorial\tutorial_1\st_2.mzML
cd C:\toppic_tutorial\tutorial_1
..\toppic\topfd st_*.mzML
We use toppic to search the MS/MS spectra in st_1_ms2.msalign
            and st_2_ms2.msalign
            against the protein database uniprot-st.fasta to identify PrSMs.
          
C:\toppic_tutorial\toppic\toppic.exe
C:\toppic_tutorial\tutorial_1\uniprot-st.fasta
C:\toppic_tutorial\tutorial_1\st_1_ms1.msalign
C:\toppic_tutorial\tutorial_1\st_2_ms1.msalign
C:\toppic_tutorial\tutorial_1\st_1_ms2.msalign
C:\toppic_tutorial\tutorial_1\st_2_ms2.msalign
C:\toppic_tutorial\tutorial_1\st_1_ms1.feature
C:\toppic_tutorial\tutorial_1\st_1_ms2.feature
C:\toppic_tutorial\tutorial_1\st_2_ms1.feature
C:\toppic_tutorial\tutorial_1\st_2_ms2.feature
C:\toppic_tutorial\tutorial_1\var_mods.txt
cd C:\toppic_tutorial\tutorial_1
..\toppic\toppic -f C57 -d -t FDR -T FDR -b var_mods.txt -c combined uniprot-st.fasta st_*_ms2.msalign
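For scripted analyses, the command above can be assembled and launched from Python. The sketch below reuses only the options that appear in the command above (-f, -d, -t, -T, -b, -c); the helper names build_toppic_command and run_in_folder are hypothetical, and the wildcard is expanded with glob rather than passed to the program.

```python
import glob
import subprocess


def build_toppic_command(toppic_exe, database, spectrum_files,
                         var_mod_file=None, combined_name=None):
    """Assemble the argument list for the TopPIC search used in this tutorial."""
    cmd = [toppic_exe, "-f", "C57", "-d", "-t", "FDR", "-T", "FDR"]
    if var_mod_file:
        cmd += ["-b", var_mod_file]   # variable PTM file
    if combined_name:
        cmd += ["-c", combined_name]  # name prefix for the combined results
    cmd.append(database)
    cmd.extend(spectrum_files)
    return cmd


def run_in_folder(cmd, folder):
    """Run the command with the given working folder; raise if it fails."""
    subprocess.run(cmd, cwd=folder, check=True)


def find_spectrum_files(folder, pattern="st_*_ms2.msalign"):
    """Expand the wildcard and return file names relative to the folder."""
    return sorted(glob.glob(pattern, root_dir=folder))
```

A possible call is run_in_folder(build_toppic_command(r"..\toppic\toppic", "uniprot-st.fasta", find_spectrum_files(folder), "var_mods.txt", "combined"), folder) with folder set to C:\toppic_tutorial\tutorial_1. Note that the root_dir argument of glob.glob requires Python 3.10 or later.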
We will use TopMG to analyze the data set st_1.raw described in Tutorial 1. TopMG is still in the development stage. Please let us know if you find any bugs in it.
Click here to download the data set, save it in the folder C:\toppic_tutorial\tutorial_2\, and
          unzip it. It includes the following files.
C:\toppic_tutorial\tutorial_2\uniprot-st.fasta
C:\toppic_tutorial\tutorial_2\st_1_ms1.msalign
C:\toppic_tutorial\tutorial_2\st_1_ms2.msalign
C:\toppic_tutorial\tutorial_2\st_1_ms1.feature
C:\toppic_tutorial\tutorial_2\st_1_ms2.feature
C:\toppic_tutorial\tutorial_2\var_mods.txt
C:\toppic_tutorial\tutorial_2\st_1_html\topfd
To speed up database search, follow the steps in Section 4.2.1 to generate index files for the database file uniprot-st.fasta. If index files have been generated, it is not necessary to regenerate index files. You can copy the index folder to the folder C:\toppic_tutorial\tutorial_2\.
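Whether index files already exist can be checked before running TopIndex again. The sketch below assumes the naming shown earlier in this tutorial, where TopIndex writes a folder named after the database file with the suffix _idx (for example, uniprot-st.fasta_idx); the function names are hypothetical.

```python
from pathlib import Path


def index_folder_for(fasta_path):
    """Return the index folder expected next to the database file."""
    fasta = Path(fasta_path)
    return fasta.with_name(fasta.name + "_idx")


def needs_indexing(fasta_path):
    """Return True when no index folder exists yet for the database file."""
    return not index_folder_for(fasta_path).is_dir()
```

If needs_indexing returns False for the database in C:\toppic_tutorial\tutorial_1\, the existing index folder can be copied rather than regenerated.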
topmg_gui.exe in the folder
              C:\toppic_tutorial\toppic.
            C:\toppic_tutorial\tutorial_2\uniprot-st.fasta as the protein
              database file.
            C:\toppic_tutorial\tutorial_2\st_1_ms2.msalign as a
              mass spectrum data file.
            C:\toppic_tutorial\tutorial_2\var_mods.txt as the file of variable PTMs.
            Carbamidomethylation on cysteine as the fixed modification.
            Decoy database.
            FDR as the spectrum level cutoff type.
            FDR as the proteoform level cutoff type.
            The screenshots of topmg_gui are shown below.
          
 
          
 
          
TopMG reports four TSV files, two XML files, and a collection of HTML files for identified proteoforms.
C:\toppic_tutorial\tutorial_2\st_1_ms2_topmg_prsm.tsv
C:\toppic_tutorial\tutorial_2\st_1_ms2_topmg_prsm_single.tsv
C:\toppic_tutorial\tutorial_2\st_1_ms2_topmg_proteoform.tsv
C:\toppic_tutorial\tutorial_2\st_1_ms2_topmg_proteoform_single.tsv
C:\toppic_tutorial\tutorial_2\st_1_ms2_topmg_proteoform.xml
C:\toppic_tutorial\tutorial_2\st_1_ms2_topmg_prsm.xml
C:\toppic_tutorial\tutorial_2\st_1_html\topmg_prsm_cutoff
C:\toppic_tutorial\tutorial_2\st_1_html\topmg_proteoform_cutoff
C:\toppic_tutorial\tutorial_2\st_1_html\topmsv
The output files can be downloaded here.
            To browse the PrSM identifications,
            go to the folder st_1_html\topmsv and use Google
              Chrome (Microsoft Edge and Firefox are not recommended)
            to open the file index.html.
          
C:\toppic_tutorial\toppic\topmg.exe
C:\toppic_tutorial\tutorial_2\uniprot-st.fasta
C:\toppic_tutorial\tutorial_2\st_1_ms1.msalign
C:\toppic_tutorial\tutorial_2\st_1_ms2.msalign
C:\toppic_tutorial\tutorial_2\st_1_ms1.feature
C:\toppic_tutorial\tutorial_2\st_1_ms2.feature
C:\toppic_tutorial\tutorial_2\var_mods.txt
cd C:\toppic_tutorial\tutorial_2
..\toppic\topindex -f C57 -d uniprot-st.fasta
..\toppic\topmg -f C57 -d -t FDR -v 0.05 -T FDR -V 0.05 -i var_mods.txt uniprot-st.fasta st_1_ms2.msalign
We will use TopPIC and TopDiff to compare the abundance of proteoforms and find differentially expressed proteoforms using two MS data files of Escherichia coli cells (ecoli_1.raw and ecoli_2.raw).
In the MS experiment, the protein extract of E. coli was reduced with dithiothreitol and alkylated with iodoacetamide. The protein mixture was separated by capillary zone electrophoresis and analyzed by an LTQ-Orbitrap mass spectrometer (Thermo Fisher Scientific). Technical duplicates were generated for testing proteoform quantification in two runs of the same sample.
Click here to download the data set, save it in the folder C:\toppic_tutorial\tutorial_3\, and
          unzip it. It includes the following files.
C:\toppic_tutorial\tutorial_3\uniprot-ecoli.fasta
C:\toppic_tutorial\tutorial_3\ecoli_1.mzML
C:\toppic_tutorial\tutorial_3\ecoli_2.mzML
C:\toppic_tutorial\tutorial_3\ecoli_1_ms1.msalign
C:\toppic_tutorial\tutorial_3\ecoli_2_ms1.msalign
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.msalign
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.msalign
C:\toppic_tutorial\tutorial_3\ecoli_1_ms1.feature
C:\toppic_tutorial\tutorial_3\ecoli_2_ms1.feature
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.feature
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.feature
C:\toppic_tutorial\tutorial_3\ecoli_1_feature.xml
C:\toppic_tutorial\tutorial_3\ecoli_2_feature.xml
C:\toppic_tutorial\tutorial_3\ecoli_1_html\topfd
C:\toppic_tutorial\tutorial_3\ecoli_2_html\topfd
To speed up database search, follow the steps in Section 4.2.1 to generate index files for the database file uniprot-ecoli.fasta. If index files have been generated, it is not necessary to regenerate index files.
We use toppic_gui to search the MS/MS spectra in
            ecoli_1_ms2.msalign and ecoli_2_ms2.msalign
            against the protein database uniprot-ecoli.fasta to
            identify PrSMs.
          
toppic_gui.exe in the folder
              C:\toppic_tutorial\toppic.
            C:\toppic_tutorial\tutorial_3\uniprot-ecoli.fasta as the protein
              database file.
            C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.msalign
              and C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.msalign as
              mass spectrum data files.
            Carbamidomethylation on cysteine as the fixed modification.
            Decoy database.
            FDR as the spectrum level cutoff type.
            FDR as the proteoform level cutoff type.
            The screenshots of toppic_gui are shown below.
          
 
          
 
          
For each input msalign file, TopPIC reports four TSV files, two XML files, and collections of HTML files for identified proteoforms. The output files for ecoli_1_ms2.msalign and ecoli_2_ms2.msalign are
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_prsm.tsv
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_prsm.tsv
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_prsm_single.tsv
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_prsm_single.tsv
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_proteoform.tsv
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_proteoform.tsv
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_proteoform_single.tsv
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_proteoform_single.tsv
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_proteoform.xml
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_proteoform.xml
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_prsm.xml
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_prsm.xml
C:\toppic_tutorial\tutorial_3\ecoli_1_html\toppic_prsm_cutoff
C:\toppic_tutorial\tutorial_3\ecoli_2_html\toppic_prsm_cutoff
C:\toppic_tutorial\tutorial_3\ecoli_1_html\toppic_proteoform_cutoff
C:\toppic_tutorial\tutorial_3\ecoli_2_html\toppic_proteoform_cutoff
C:\toppic_tutorial\tutorial_3\ecoli_1_html\topmsv
C:\toppic_tutorial\tutorial_3\ecoli_2_html\topmsv
The output files can be downloaded here.
topdiff_gui.exe in the folder
              C:\toppic_tutorial\toppic.
            C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.msalign and
              C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.msalign as
              mass spectrum data files.
            
            The screenshots of topdiff_gui are shown below.
          
 
          
TopDiff reports one TSV file for identified proteoforms with their abundances in the input mass spectrum data files:
C:\toppic_tutorial\tutorial_3\sample_diff.tsv
The output file can be downloaded here.
C:\toppic_tutorial\toppic\toppic.exe
C:\toppic_tutorial\tutorial_3\uniprot-ecoli.fasta
C:\toppic_tutorial\tutorial_3\ecoli_1_ms1.msalign
C:\toppic_tutorial\tutorial_3\ecoli_2_ms1.msalign
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.msalign
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.msalign
C:\toppic_tutorial\tutorial_3\ecoli_1_ms1.feature
C:\toppic_tutorial\tutorial_3\ecoli_2_ms1.feature
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.feature
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.feature
cd C:\toppic_tutorial\tutorial_3
..\toppic\topindex -f C57 -d uniprot-ecoli.fasta
..\toppic\toppic -f C57 -d -t FDR -T FDR uniprot-ecoli.fasta ecoli_*_ms2.msalign
C:\toppic_tutorial\toppic\topdiff.exe
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2.msalign
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2.msalign
C:\toppic_tutorial\tutorial_3\ecoli_1_ms2_toppic_proteoform.xml
C:\toppic_tutorial\tutorial_3\ecoli_2_ms2_toppic_proteoform.xml
cd C:\toppic_tutorial\tutorial_3
..\toppic\topdiff ecoli_1_ms2.msalign ecoli_2_ms2.msalign
In the MS experiment, the protein extract of E. coli was separated by an HPLC system and then analyzed with a Thermo Fusion Lumos mass spectrometer (Thermo Fisher Scientific) using the DIA mode. A total of six MS runs were carried out. In each run, quadrupole gas-phase fractionation was used to acquire MS1 scans for precursor ions in a fixed 80 m/z range. Specifically, the six runs covered the following m/z ranges: 720–800, 800–880, 880–960, 960–1040, 1040–1120, and 1120–1200. MS1 spectra were collected with a resolution of 240,000 (at 200 m/z), 4 micro scans, an automatic gain control (AGC) target value of 1 × 10^6, and a maximum injection time of 200 ms. A 4 m/z isolation window was used to generate MS/MS spectra, resulting in a total of 20 MS/MS spectra for each cycle. MS/MS spectra were obtained with a scan range of 400–2000 m/z, a resolution of 60,000 (at 200 m/z), 1 micro scan, an AGC target value of 1 × 10^6, and a maximum injection time of 500 ms. Fragmentation was performed using higher-energy collisional dissociation (HCD) with 30% normalized collision energy (NCE). The MS run with an m/z range of 800–880 was used in the tutorial (ecoli_800_880.raw).
The method in Section 4.3 can be used to convert the raw file to its corresponding centroided mzML file.
Click here
            to download the data set, save it in the folder C:\toppic_tutorial\tutorial_4\, and unzip it in
            the same folder. It includes the following files.
          
C:\toppic_tutorial\tutorial_4\uniprot-ecoli.fasta
C:\toppic_tutorial\tutorial_4\ecoli_800_880.raw
C:\toppic_tutorial\tutorial_4\ecoli_800_880.mzML
We use topdia_gui for top-down mass spectral deconvolution.
topdia_gui.exe in the folder
              C:\toppic_tutorial\toppic.
            C:\toppic_tutorial\tutorial_4\ecoli_800_880.mzML
              as the input file.
            The screenshot of topdia_gui is shown below.
          
 
          
TopDIA reports 2 output files for each isolation window. In the data file, a total of 20 isolation windows are used: [800, 804], [804, 808], ..., [876, 880]. For example, the output files for the isolation window [800, 804] are:
C:\toppic_tutorial\tutorial_4\ecoli_800_880_800.000000_ms2.csv
C:\toppic_tutorial\tutorial_4\ecoli_800_880_800.000000_frac_ms2.mzrt.csv
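The isolation windows and their per-window output names can be enumerated programmatically. The sketch below assumes that the number in each file name is the lower bound of the isolation window written with six decimal places, as in the example above; this naming rule is inferred from that single example, and the function names are hypothetical.

```python
def isolation_windows(start=800.0, end=880.0, width=4.0):
    """List the (lower, upper) m/z bounds of consecutive isolation windows."""
    count = round((end - start) / width)
    return [(start + i * width, start + (i + 1) * width) for i in range(count)]


def window_output_files(prefix, window_start):
    """Return the two per-window file names for one isolation window."""
    tag = f"{window_start:.6f}"
    return [f"{prefix}_{tag}_ms2.csv", f"{prefix}_{tag}_frac_ms2.mzrt.csv"]


windows = isolation_windows()
all_files = [name for lower, _ in windows
             for name in window_output_files("ecoli_800_880", lower)]
```

For the 800–880 m/z run, this gives 20 windows and 40 per-window files.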
In addition, TopDIA reports the following output files for the MS data file.
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms1.msalign
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms2.msalign
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms1.csv
C:\toppic_tutorial\tutorial_4\ecoli_800_880_frac_ms1.mzrt.feature
C:\toppic_tutorial\tutorial_4\ecoli_800_880_feature.xml
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms1.feature
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms2.feature
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms2_raw.msalign
C:\toppic_tutorial\tutorial_4\ecoli_800_880_html
The output files and folders can be downloaded here.
We use toppic_gui to search the MS/MS spectra in
            ecoli_800_880_ms2.msalign
            against the protein database uniprot-ecoli.fasta to
            identify PrSMs.
          
toppic_gui.exe in the folder
              C:\toppic_tutorial\toppic.
            C:\toppic_tutorial\tutorial_4\uniprot-ecoli.fasta as the protein
              database file.
            C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms2.msalign
              as the mass spectrum data file.
            Decoy database.
            MS1 feature is missing.
            FDR as the spectrum level cutoff type.
            FDR as the proteoform level cutoff type.
            The screenshots of toppic_gui are shown below.
          
 
          
 
          
The PrSM identifications are stored in the file
          ecoli_800_880_ms2_toppic_prsm_single.tsv, and
            the proteoform identifications are stored in the file
            ecoli_800_880_ms2_toppic_proteoform_single.tsv.
            Detailed information of the output files of TopPIC can be found in
            Section 4.5.
          
The output files and folders can be downloaded here.
We use topdia for the generation of pseudo MS/MS spectra.
C:\toppic_tutorial\toppic\topdia.exe
C:\toppic_tutorial\tutorial_4\ecoli_800_880.mzML
cd C:\toppic_tutorial\tutorial_4
..\toppic\topdia ecoli_800_880.mzML
We use toppic to search the MS/MS spectra in ecoli_800_880_ms2.msalign
                against the protein database uniprot-ecoli.fasta to identify PrSMs.
              
C:\toppic_tutorial\toppic\toppic.exe
C:\toppic_tutorial\tutorial_4\uniprot-ecoli.fasta
C:\toppic_tutorial\tutorial_4\ecoli_800_880_ms2.msalign
cd C:\toppic_tutorial\tutorial_4
..\toppic\toppic -x -d -t FDR -T FDR uniprot-ecoli.fasta ecoli_800_880_ms2.msalign