Outputs

This page describes the outputs of the cellmaps_utilscmd.py tools, which either create RO-Crate directories or generate a table file summarizing RO-Crate directories that have already been created.

RO-Crate directory summary table

Outputs described below are created by the invocation of cellmaps_utilscmd.py rocratetable

  • data.tsv:

    Contents of the tab-separated (TSV) data.tsv file:

    FAIRSCAPE ARK ID        Date    Version Type    Cell Line       Tissue  Treatment       Gene set        Generated By Software   Name    Description     Keyword Download RO-Crate Data Package   Download RO-Crate Data Package Size MB  Generated By Software   Output Dataset  Responsible Lab
    d4d80b1d-8d49-4204-8c0d-209c5b9ccdf2:cm4ai_chromatin_kolf2.1j_undifferentiated_untreated_crispr_4channel_0.1_alpha      2024-04-29      0.1 alpha       Data    KOLF2.1J        undifferentiated        untreated       chromatin               CRISPR  CM4AI 0.1 alpha KOLF2.1J untreated CRISPR undifferentiated 4channel chromatin   CM4AI,0.1 alpha,KOLF2.1J,untreated,CRISPR,undifferentiated,4channel,chromatin   https://cm4ai.org/Data/cm4ai_chromatin_kolf2.1j_undifferentiated_untreated_crispr_4channel_0.1_alpha.tar.gz     1       Mali Lab
    134e01c8-90ea-457d-9e6e-ca046ecc860f:cm4ai_chromatin_mda-mb-468_paclitaxel_ifimage_0.1_alpha    2024-04-29      0.1 alpha       Data    MDA-MB-468      breast; mammary gland   paclitaxel      chromatin               IF images       CM4AI 0.1 alpha MDA-MB-468 paclitaxel IF microscopy images breast; mammary gland chromatin      CM4AI,0.1 alpha,MDA-MB-468,paclitaxel,IF microscopy,images,breast; mammary gland,chromatin      https://cm4ai.org/Data/cm4ai_chromatin_mda-mb-468_paclitaxel_ifimage_0.1_alpha.tar.gz   1       Lundberg Lab
    7240c7d7-327c-423c-834d-1e99ab8a417b:cm4ai_chromatin_mda-mb-468_untreated_apms_0.1_alpha        2024-04-29      0.1 alpha       Data    MDA-MB-468      breast; mammary gland   untreated       chromatin               AP-MS   CM4AI 0.1 alpha MDA-MB-468 untreated breast; mammary gland AP-MS edgelist chromatin     CM4AI,0.1 alpha,MDA-MB-468,untreated,breast; mammary gland,AP-MS edgelist,chromatin     https://cm4ai.org/Data/cm4ai_chromatin_mda-mb-468_untreated_apms_0.1_alpha.tar.gz       1                       Krogan Lab
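
A table like data.tsv can be read with Python's standard csv module. The sketch below is only an illustration and is not part of cellmaps_utils: the helper name read_rocrate_table is hypothetical, the sample values are made up, and the code assumes a tab-delimited file with a single header row. Because the header shown above lists "Generated By Software" twice, rows are kept as lists rather than dictionaries keyed by column name, so no column is silently overwritten.

```python
import csv
import io


def read_rocrate_table(handle):
    """Return (header, rows) from a tab-delimited table.

    Hypothetical helper for illustration; not part of cellmaps_utils.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [row for row in reader if row]  # skip blank lines
    return header, rows


# Tiny in-memory stand-in for data.tsv (made-up values, three columns only)
sample = io.StringIO(
    "FAIRSCAPE ARK ID\tDate\tResponsible Lab\n"
    "example-id\t2024-04-29\tExample Lab\n"
)
header, rows = read_rocrate_table(sample)

# Look up a column by position so duplicate header names cause no ambiguity
lab_index = header.index("Responsible Lab")
labs = [row[lab_index] for row in rows]
```

To read a real file, pass an open file handle instead of the in-memory sample, for example open("data.tsv", newline="").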
    

Perturbation/CRISPR

Outputs described below are created by the invocation of cellmaps_utilscmd.py crisprconverter

Affinity Purification Mass Spectrometry (AP-MS)

Outputs described below are created by the invocation of cellmaps_utilscmd.py apmsconverter

  • apms.tsv:

    Columns:

    • Bait:

      Name of the pulled-down (bait) protein.

    • Prey:

      UniProt ID of the protein identified by MS in the pull-down (putative bait interactor).

    • PreyGene.x:

      UniProt protein name of the protein identified by MS in the pull-down (putative bait interactor).

    • Spec:

      Spectral counts for each test biological replicate (separated by | ).

    • SpecSum:

      Sum of spectral counts in test samples.

    • AvgSpec:

      Average spectral count across replicates in test samples.

    • NumReplicates.x:

      Number of replicates in test samples.

    • ctrlCounts:

      Spectral counts for each control replicate (separated by | ).

    • AvgP.x:

      Average probability that an interaction is true: a measure of the likelihood that a given interaction is a true positive rather than a random or non-specific interaction. A higher AvgP indicates higher confidence that the interaction is genuine.

    • MaxP.x:

      Maximum probability associated with a protein interaction in the context of its prey-bait pair. As with AvgP, a higher MaxP suggests a higher likelihood that the interaction is true.

    • TopoAvgP.x:

      Extension of the AvgP score that also takes the topology of the interaction network into account. It incorporates information about the hierarchical structure of the interaction data to provide a refined assessment of the interactions.

    • TopoMaxP.x:

      Topology-aware score that considers the maximum probability of an interaction in the context of the interaction network’s topology.

    • SaintScore.x:

      Composite score that integrates multiple aspects of the interaction data, including spectral counts and probability estimates. It is designed to prioritize interactions by their strength and reliability. Higher SaintScores indicate interactions that are more likely to be true.

    • logOddsScore:

      Logarithm of the odds ratio between test and control conditions for each prey, used as a measure of interaction significance. The odds ratio compares the likelihood of the interaction occurring to the likelihood of it not occurring; taking the logarithm makes the score more symmetric and easier to compare across interactions. Higher logOddsScore values typically indicate stronger evidence for the interaction.

    • FoldChange.x:

      Ratio of the abundance of a protein or interaction in one experimental condition (test) to its abundance in another (control). It helps assess whether the abundance of a protein changes significantly between conditions.

    • BFDR.x:

      Bayesian False Discovery Rate

  • dataset_info.json

  • readme.txt

  • ro-crate-metadata.json
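
The Spec and ctrlCounts columns of apms.tsv pack one value per replicate into a single field separated by | characters. The sketch below shows one way to unpack such a field in Python. It is an illustration under assumptions: the helper names are hypothetical, the sample value is made up, and it assumes that SpecSum and AvgSpec are the plain sum and arithmetic mean of the Spec values, as the column descriptions above suggest.

```python
def parse_counts(field):
    """Split a '|'-separated spectral count field into a list of numbers."""
    return [float(part) for part in field.split("|") if part.strip()]


def summarize_counts(field):
    """Return (sum, mean) of the counts in a '|'-separated field.

    Assumes SpecSum and AvgSpec are a plain sum and arithmetic mean;
    this is an interpretation of the column descriptions, not confirmed.
    """
    counts = parse_counts(field)
    total = sum(counts)
    mean = total / len(counts) if counts else 0.0
    return total, mean


# Made-up Spec value with three replicates
total, mean = summarize_counts("4|6|5")
```

Comparing the computed values with the SpecSum and AvgSpec columns of a real file is a quick way to check whether the assumption holds for a given data set.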

Size Exclusion Chromatography with Mass Spectrometry (SEC-MS)

Outputs described below are created by the invocation of cellmaps_utilscmd.py secmsconverter

TODO

Immunofluorescent Image (IFImage)

Outputs described below are created by the invocation of cellmaps_utilscmd.py ifconverter

  • antibody_gene_table.tsv:

    Describes each image in the data set, one row per image. The columns describe the staining from which the image was taken:

    • Antibody ID:

      ID of the antibody applied to stain the protein visible in the green channel. The ID can be looked up at proteinatlas.org to find more information about the antibody.

    • ENSEMBL ID:

      ENSEMBL ID(s) of the gene(s) encoding the protein(s) visualized in the green channel.

    • Treatment:

      How the cells depicted in the image were treated (Paclitaxel, Vorinostat, or untreated).

    • Well:

      Well coordinate on the 96-well plate.

    • Region:

      Unique identifier for the position in the well where the cells were acquired.

  • red, e.g. B2AI_1_Paclitaxel_C1_R1_z01_red.jpg

  • blue, e.g. B2AI_1_Paclitaxel_C1_R1_z01_blue.jpg

  • green, e.g. B2AI_1_Paclitaxel_C1_R1_z01_green.jpg

  • yellow, e.g. B2AI_1_Paclitaxel_C1_R1_z01_yellow.jpg

  • dataset_info.json

  • readme.txt

  • ro-crate-metadata.json
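
The example image names above appear to encode the treatment, well, region, z-slice, and color channel as fields separated by underscores. The sketch below splits a name into those parts. The field layout is an assumption read off the example names and the antibody_gene_table.tsv description, not a documented naming scheme, and the helper parse_image_name and its dictionary keys are hypothetical.

```python
import os


def parse_image_name(filename):
    """Split an image file name into its underscore-separated fields.

    Assumed layout (inferred from the example names, not documented):
    <prefix>_<treatment>_<well>_<region>_<z-slice>_<channel>.jpg
    """
    stem, _ext = os.path.splitext(os.path.basename(filename))
    # Split from the right so underscores inside the prefix are preserved
    parts = stem.rsplit("_", 5)
    if len(parts) != 6:
        raise ValueError(f"Unexpected image file name: {filename}")
    prefix, treatment, well, region, z_slice, channel = parts
    return {
        "prefix": prefix,
        "treatment": treatment,
        "well": well,
        "region": region,
        "z_slice": z_slice,
        "channel": channel,
    }


info = parse_image_name("B2AI_1_Paclitaxel_C1_R1_z01_red.jpg")
```

Splitting from the right keeps the sketch tolerant of prefixes that themselves contain underscores, at the cost of assuming that the treatment name contains none.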