Output & Results

Command Line Output

When running the workflow you should see output similar to:

N E X T F L O W  ~  version 22.10.5
Launching `/home/mriffle/tmp/nf-skyline-dia-ms/main.nf` [stoic_feynman] DSL2 - revision: 34335bd586
executor >  slurm (10)
[16/ab4074] process > get_input_files:PANORAMA_GET_FASTA                   [100%] 1 of 1 ✔
[skipped  ] process > get_wide_mzmls:MSCONVERT (1)                         [100%] 1 of 1, stored: 1 ✔
[32/5df5b3] process > ENCYCLOPEDIA_BLIB_TO_DLIB                            [100%] 1 of 1 ✔
[skipped  ] process > get_narrow_mzmls:MSCONVERT (1)                       [100%] 2 of 2, stored: 2 ✔
[e9/12221d] process > encyclopeda_export_elib:ENCYCLOPEDIA_SEARCH_FILE (2) [100%] 2 of 2 ✔
[73/eb8a16] process > encyclopeda_export_elib:ENCYCLOPEDIA_CREATE_ELIB     [100%] 1 of 1 ✔
[e1/dcca0b] process > encyclopedia_quant:ENCYCLOPEDIA_SEARCH_FILE (1)      [100%] 1 of 1 ✔
[ec/89bf5e] process > encyclopedia_quant:ENCYCLOPEDIA_CREATE_ELIB          [100%] 1 of 1 ✔
[ca/247f6e] process > skyline_import:SKYLINE_ADD_LIB                       [100%] 1 of 1 ✔
[9c/56ec4e] process > skyline_import:SKYLINE_IMPORT_MZML (1)               [100%] 1 of 1 ✔
[5e/09549d] process > skyline_import:SKYLINE_MERGE_RESULTS                 [100%] 1 of 1 ✔
Completed at: 07-Jun-2023 13:24:28
Duration    : 40m 23s
CPU hours   : 45.2
Succeeded   : 10

The first line shows the version of Nextflow you are running. The second line shows the workflow being launched, along with the run name (in brackets) and the revision of the workflow. The third line shows the executor you are using and the number of tasks submitted to it. An executor in Nextflow describes the actual system the steps of the workflow are running on. In this case, the slurm cluster executor was used. The next several lines show the individual steps of the workflow as they run. If a particular step is run multiple times (e.g., converting many RAW files to mzML using msconvert), the percent complete shows the percentage of the RAW files that have been converted. The final four lines appear when the workflow completes, showing the completion time, the total duration, the CPU hours used, and the number of tasks that succeeded.
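If you capture the command line output to a file, the per-step progress lines can be turned into structured records for a quick summary. The following is a minimal sketch, assuming the lines keep the layout shown in the example output above; the function name parse_process_line and the regular expression are illustrative and are not part of the workflow or of Nextflow.

```python
import re

# Illustrative pattern for lines such as:
#   [16/ab4074] process > get_input_files:PANORAMA_GET_FASTA [100%] 1 of 1 ✔
#   [skipped  ] process > get_wide_mzmls:MSCONVERT (1) [100%] 1 of 1, stored: 1 ✔
# The layout is assumed from the example output, not from a formal specification.
PROCESS_LINE = re.compile(
    r"^\[(?P<task>[^\]]+)\]\s+process\s+>\s+(?P<name>\S+)"
    r"(?:\s+\(\d+\))?\s+\[\s*(?P<percent>\d+)%\]\s+"
    r"(?P<done>\d+)\s+of\s+(?P<total>\d+)"
)


def parse_process_line(line):
    """Return a dict describing one progress line, or None if it does not match."""
    match = PROCESS_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "task": match.group("task").strip(),   # task hash prefix, or "skipped"
        "name": match.group("name"),           # process name
        "percent": int(match.group("percent")),
        "done": int(match.group("done")),
        "total": int(match.group("total")),
    }
```

Lines that are not progress lines (for example, the "Completed at" line) return None, so the function can be applied to every line of a captured output file.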

Workflow Log

The log file called .nextflow.log will appear in the directory in which the workflow was run. It can be helpful for determining the cause of any problems. A log file will also be generated for each task executed by the workflow, which will be described below.
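When troubleshooting, it can help to pull only the warning and error lines out of .nextflow.log. The sketch below assumes each log entry carries its level (such as ERROR or WARN) as a separate whitespace-delimited token; find_problem_lines is a hypothetical helper name, not something provided by Nextflow or this workflow.

```python
def find_problem_lines(log_path, levels=("ERROR", "WARN")):
    """Return (line_number, text) pairs for lines containing one of the levels.

    Assumption: the level appears as its own token on the line. A message that
    happens to contain the bare word ERROR or WARN will also be reported.
    """
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            tokens = line.split()
            if any(level in tokens for level in levels):
                hits.append((number, line.rstrip("\n")))
    return hits
```

Calling find_problem_lines(".nextflow.log") from the directory in which the workflow was run returns the matching lines in file order.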

Workflow Results

All results will be output to the results/nf-skyline-dia-ms subdirectory in the directory in which the workflow was run. In this directory is a subdirectory for each program that was run as part of the workflow. By default, the final Skyline document called final.sky.zip will be present in the skyline/import-spectra directory (this file name can be customized). The final EncyclopeDIA results file called wide-combined-results.elib will be present in the encyclopedia/create-elib directory. A full description of output files can be found below.
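A quick way to confirm that a run produced its two main outputs is to check for them at the default locations named above. This is a minimal sketch, assuming the default Skyline document name (final.sky.zip) and an EncyclopeDIA-based run; check_final_outputs and the EXPECTED mapping are illustrative names only.

```python
from pathlib import Path

# Default locations, relative to results/nf-skyline-dia-ms, as described above.
# If the Skyline document name was customized, adjust the first entry.
EXPECTED = {
    "skyline_document": Path("skyline") / "import-spectra" / "final.sky.zip",
    "encyclopedia_elib": Path("encyclopedia") / "create-elib" / "wide-combined-results.elib",
}


def check_final_outputs(run_dir="."):
    """Report which of the expected final output files exist under run_dir."""
    results = Path(run_dir) / "results" / "nf-skyline-dia-ms"
    return {label: (results / rel).is_file() for label, rel in EXPECTED.items()}
```

The function returns a dictionary mapping each label to True or False, so a missing file is easy to spot.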

Output Files

Below is a list of the subdirectories created in results/nf-skyline-dia-ms, with a description of the files found in each.

panorama Subdirectory

This directory contains logs for file transfers from PanoramaWeb. There will be a *.stderr and a *.stdout file for each file that was transferred. Any errors encountered while transferring that file will be present in the stderr file, and the command line output of the transfer program can be found in the stdout file.
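Because errors are written to the *.stderr files, listing the ones that are not empty is a reasonable first check after a run. The sketch below is an assumption-laden helper (nonempty_stderr_files is a hypothetical name); note that a non-empty stderr file does not necessarily indicate a failure, since some programs write progress messages there.

```python
from pathlib import Path


def nonempty_stderr_files(directory):
    """Return sorted paths of all *.stderr files under directory that contain data."""
    return sorted(
        path
        for path in Path(directory).rglob("*.stderr")
        if path.is_file() and path.stat().st_size > 0
    )
```

The same helper can be pointed at any results subdirectory, or at results/nf-skyline-dia-ms as a whole, since the search is recursive.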

encyclopedia/convert-blib Subdirectory

If the user submits a blib spectral library, EncyclopeDIA will be used to convert it to a dlib, which is used in subsequent steps of the workflow. The files present in this directory will be:

  • file.dlib - The dlib that was generated by EncyclopeDIA. file will be the base filename of the blib file.

  • encyclopedia-convert-blib.stdout - The command line output of EncyclopeDIA for the conversion.

  • encyclopedia-convert-blib.stderr - Any error output generated for the conversion.

encyclopedia/search-file Subdirectory

This directory contains the output from EncyclopeDIA when searching individual scan files (mzML files). There will be a set of files for each scan file that was searched, where all files in that set will have the same base name as the scan file. E.g., if the scan file was named my_scan_file.raw, each file in the set would begin with my_scan_file.

Note

The encyclopedia.save_output configuration parameter (see Workflow Parameters) affects which files will be present in this directory. If encyclopedia.save_output is set to true, all files will be present. If encyclopedia.save_output is set to false, only the stdout and stderr files will be present.

The files present for each scan file will be:

  • my_scan_file.dia - EncyclopeDIA converts mzML files into the dia format, which is how it represents scan data. These files may be quite large.

  • my_scan_file.mzML.elib - The elib file generated by EncyclopeDIA for this scan file.

  • my_scan_file.mzML.encyclopedia.decoy.txt - The decoy peptides identified by EncyclopeDIA for this scan file.

  • my_scan_file.mzML.encyclopedia.txt - The target peptides identified by EncyclopeDIA for this scan file.

  • my_scan_file.mzML.features.txt - Features generated by EncyclopeDIA for this scan file as input for Percolator.

  • my_scan_file.stdout - The command line output of EncyclopeDIA for the search of this scan file.

  • my_scan_file.stderr - Any error output generated for the search of this scan file.
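To see at a glance which output files exist for each scan file, the directory contents can be grouped by base name. This is a minimal sketch, assuming scan file base names contain no "." characters (so everything before the first dot is the base name); group_by_scan_file is an illustrative name.

```python
from collections import defaultdict
from pathlib import Path


def group_by_scan_file(directory):
    """Group file names in directory by the text before their first dot.

    Assumption: scan file base names do not themselves contain dots.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).iterdir()):
        if path.is_file():
            base = path.name.split(".", 1)[0]
            groups[base].append(path.name)
    return dict(groups)
```

A scan file whose group contains only the stdout and stderr files is consistent with encyclopedia.save_output being set to false.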

encyclopedia/create-elib Subdirectory

When EncyclopeDIA is done searching individual scan files, the results are combined into a single elib file. This happens both for the narrow window chromatogram generation step (if it is being performed) and the quantification step (wide window). The files present in this directory will be:

  • narrow-combined-results.elib - If a narrow window chromatogram generation step is being performed, this is the resulting elib from that step.

  • narrow.stdout - If a narrow window chromatogram generation step is being performed, this is the command line output of EncyclopeDIA during elib generation.

  • narrow.stderr - If a narrow window chromatogram generation step is being performed, this is the error output of EncyclopeDIA during elib generation.

  • wide-combined-results.elib - This is the elib generated by merging and quantifying peptides and proteins from the individual scan files.

  • wide.stdout - This is the command line output of EncyclopeDIA during this step.

  • wide.stderr - This is the error output of EncyclopeDIA during this step.

diann Subdirectory

This directory contains the output of DIA-NN when search_engine = 'diann'. The exact set of files depends on whether a chromatogram-library / subset search was performed, but typically includes:

  • *.tsv and *.parquet - DIA-NN precursor and protein report files.

  • *.speclib - The spectral library used (or predicted) by DIA-NN.

  • *.quant - Per-file DIA-NN quantification artifacts.

  • *.stdout / *.stderr - Command-line and error output for each DIA-NN step.

  • diann_version.txt - The version of DIA-NN used.

When the Skyline branch is enabled, DIA-NN’s results are also packaged into a .blib for Skyline import.

msconvert Subdirectory

This directory holds converted mzML files when msconvert_only is true. In normal runs converted mzMLs are usually staged into the cache directory rather than published here, but stdout/stderr logs from msconvert may still appear.

cascadia Subdirectory

This directory contains the output from Cascadia. There will be a set of files for each scan file that was searched, where all files in that set will have the same base name as the scan file. E.g., if the scan file was named my_scan_file.raw, each file in the set would begin with my_scan_file.

The files present for each scan file will be:

  • my_scan_file.ssl - The ssl file containing the search results reported by Cascadia. More about the ssl format: https://skyline.ms/wiki/home/software/BiblioSpec/page.view?name=BiblioSpec%20input%20and%20output%20file%20formats

  • my_scan_file.fixed.ssl - A processed ssl file where scan numbers have been corrected to align with the input mzML.

  • my_scan_file.stderr - Any output to standard error generated by Cascadia when searching this file.

  • my_scan_file.out - Any output to standard out generated by Cascadia when searching this file.

  • output_file_stats_my_scan_file.txt - A text file containing the MD5 hash of the input mzML and output ssl file generated by Cascadia for this search.
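The MD5 hashes recorded in output_file_stats_my_scan_file.txt can be compared against hashes you compute yourself, for example after copying files to another system. The sketch below only computes the hash; the exact layout of the stats file is not described here, so the comparison is left to the reader. The helper name md5_of_file is hypothetical.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hexadecimal MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large mzML files are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Reading in chunks keeps memory use constant regardless of file size, which matters for large scan files.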

In addition, the following files will be present:

  • cascadia-utils_version.txt - The version of the cascadia-utils image used in the workflow. This Docker image contains utility scripts that transform Cascadia output.

  • cascadia_version.txt - The version of the cascadia image used in the workflow.

  • combined.ssl - The combined Cascadia results from searching all input raw or mzML files.

  • combined.fasta - A FASTA format file containing the peptides identified by Cascadia.

  • lib.blib - A spectral library containing the Cascadia search results.

skyline/add-lib Subdirectory

The first step to creating the final Skyline document is importing the results of EncyclopeDIA into the Skyline template document. This directory contains the results of this step. The files present in this directory will be:

  • results.sky.zip - The intermediate Skyline document, containing EncyclopeDIA results.

  • skyline_add_library.log - The log output generated by Skyline for this step.

skyline/import-spectra Subdirectory

Skyline imports scan data in parallel for each scan file and merges those results into a single, final Skyline document. For each scan file, these files will be present:

  • my_scan_file.mzML.skyd - The intermediate skyd file generated by Skyline when importing this scan file.

  • my_scan_file.log - The log output generated by Skyline for this step.

Then for the merge step, these files will be present:

  • final.sky.zip - The final Skyline document containing all scan data and all search results. This file name can be customized using the skyline.document_name parameter. In multi-batch mode, one document per batch is produced (e.g., final_BatchA.sky.zip, final_BatchB.sky.zip); see Workflow Parameters for batch-mode details.

  • skyline-merge.log - The log output generated by Skyline for the merge step.

skyline/minimize Subdirectory

Created when skyline.minimize is true. Contains the minimized Skyline document with chromatograms for unused isotopic peaks removed and a reduced spectral library.

skyline/reports Subdirectory

If .skyr files are specified in the parameters, all reports defined in those files will be run after the Skyline document is populated. The files generated are:

  • report_name.report.tsv - The output of the report_name report in TSV format. report_name is the name of the report defined in the .skyr file.

  • report_name.report-generation.log - The log generated by Skyline when generating the report for report_name.

  • skyline-import-skyr_file_name.log - The log generated by Skyline when importing the skyr_file_name .skyr file.

qc_report Subdirectory

The files generated are:

  • qc_report_data.db3 - The sqlite database with all the data used to generate the QC report.

  • qc_report.qmd - The quarto document which is rendered to generate the QC report.

  • qc_report.html - The rendered QC report in html format. Produced when 'html' is in qc_report.report_format (the default).

  • qc_report.pdf - The rendered QC report in pdf format. Produced only when 'pdf' is in qc_report.report_format.

In multi-batch mode, one set of QC files is generated per batch (per Skyline document).
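Since qc_report_data.db3 is a SQLite database, its contents can be explored with standard tools. The sketch below lists the tables it contains using Python's built-in sqlite3 module; the schema itself is not documented here, so no table names are assumed, and list_tables is an illustrative helper name.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite database, sorted by name.

    Note: sqlite3.connect creates an empty database if db_path does not exist,
    so check the path first when pointing this at a results directory.
    """
    connection = sqlite3.connect(db_path)
    try:
        rows = connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        connection.close()
    return [name for (name,) in rows]
```

In multi-batch mode the helper can be run once per batch database to compare their contents.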

qc_report/tables Subdirectory

Created when qc_report.export_tables is true. Contains TSV exports of the normalized precursor and protein quantity matrices used in the QC report.

batch_report Subdirectory

Created when batch_report.skip is false. Contains the rendered batch report (HTML / PDF) and the underlying Quarto document. Subdirectories batch_report/tables and batch_report/plots hold any TSV tables and standalone plots referenced by the report.

gene_reports Subdirectory

Created for PDC runs when pdc.gene_level_data is set and the QC step ran. Contains per-batch (and overall) gene-level quantification tables derived from the QC database, joined against the user-supplied gene metadata.

aws Subdirectory

Present only on AWS Batch runs that need authenticated Panorama access. Contains logs from the secret-setup step that publishes the Panorama API key into AWS Secrets Manager.