This workflow allows the user to download subsets of data from the CRAPome repository. Each control experiment (CRAPome Control, or CC) is assigned a unique identifier (CC1-CCx), linked to a protocol, and annotated with standard vocabulary (such as the epitope tag type, cell line, affinity matrix, etc.). These attributes can be used to filter the list of available CRAPome controls. These filters are available on the left as shown in Fig. 5.
Step 1: Use the filters on the left (Fig. 5) to narrow down the list of negative controls.
Step 2: Click the “Add” button in the table for each CRAPome control of interest. If desired, select the “Add All” button instead. Added controls will appear in the “Selected controls” box on the right. At most 30 controls can be selected at the same time. A link at the top of the home page provides an option to download the entire database content as a tab-delimited text file.
Step 3 (optional): Give this list of selected controls a name and save it for future use (note that this option is restricted to registered users). If you wish to reload a previously saved list, you can do so by clicking the “load” link.
Step 4: Click the “Next” button at the top of the page to view and/or download the data matrix (Fig. 6). The file will be downloaded as an Excel-compatible table.
Step 5 (optional): Specific proteins can be queried in the data matrix by typing a partial or complete gene name (wild cards are automatically added at the beginning and end). The data matrix can be downloaded as a tab-delimited file using the “download data matrix” option.
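Once the data matrix has been downloaded, the same kind of partial-name query can be reproduced offline. The following is a minimal sketch, not the server's implementation: it assumes the tab-delimited file has a gene-name column called `GENE` (a hypothetical header; check the actual file) and treats the implicit wild cards at both ends as a case-insensitive substring match. The rows in the example are made up for illustration.

```python
import csv
import io


def query_matrix(tsv_text, term, gene_column="GENE"):
    """Return rows whose gene name contains `term` (case-insensitive).

    `gene_column` is a hypothetical header name; adjust it to match
    the header of the downloaded data matrix.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    needle = term.strip().lower()
    # Wild cards at both ends are equivalent to a substring match.
    return [row for row in reader if needle in row[gene_column].lower()]


# Made-up example matrix: one row per prey, one column per selected control.
example = "GENE\tCC1\tCC2\nHSPA8\t12\t30\nHSPA1A\t5\t0\nTUBB\t7\t9\n"

hits = query_matrix(example, "hspa")
print([row["GENE"] for row in hits])
```

For a real download, read the file contents with `open(path, newline="").read()` and pass the text to `query_matrix`.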
Workflow 3: Use the CRAPome to analyze your data.
This workflow allows the user to process his/her data online using the CRAPome controls and the scoring tools implemented within the system. This workflow is only available to registered users. The minimum requirement is for the user to submit information regarding one bait (one sample), though we strongly advocate the use of biological replicates for the bait, and recommend that the user also uploads his/her own negative control runs.
Step 1: Upload user data (see Fig. 7). The data should be formatted as per the instructions on the webpage (also see Fig. 7). Once uploaded, the data appear in the ‘user data’ section below.
Step 2: Select the CRAPome database controls that are most similar to the user data using controlled vocabularies and detailed protocols as shown for workflow 2 above (see Fig. 5). Selected controls can be saved as a list and reloaded as needed as in workflow 2. Press the blue “Next” button to navigate to the next page.
Step 3 (optional): If the user would like to exclude some of his/her data from the analysis, this can be done at this stage by clicking on the ‘remove’ button. Similarly, one can go back and add/remove CRAPome controls. For a quick preview of the data matrix, click on ‘Preview Data Matrix’. After the analysis is complete, the data can be deleted by clicking on ‘clear uploaded user data’ (see Fig. 7).
Step 4: Proceed to the analysis section by clicking on ‘Next’. Here, Fold Change calculations and SAINT probability scoring can be used to generate ranked lists of bait-prey interactions.
Step 5: Select the desired scoring options for the Fold Change calculations (Fig. 8). Two different Fold Change calculations are generated by default. The first one (FC-A; standard) estimates the background by averaging the spectral counts across the selected controls, while the second one (FC-B; stringent) estimates the background by combining the top 3 values for each prey. Scores from biological replicates of a bait purification are combined by simple averaging in FC-A and by a more stringent geometric mean calculation in FC-B. These parameters are preselected by default, but may be modified by the user as required. The user can also specify what set of controls to use (user controls alone or in combination with selected CRAPome controls). A more detailed explanation of the scores is available here.
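To make the difference between the two default scores concrete, the sketch below computes both for a single prey. It is an illustration under stated assumptions, not the CRAPome implementation: “combining the top 3 values” is interpreted here as averaging them, a fold change is taken per replicate before replicates are combined, and a pseudocount of 1 (a hypothetical choice) is added to avoid division by zero. The spectral counts are made up.

```python
from statistics import fmean, geometric_mean

# Assumption: a small pseudocount keeps ratios finite when the background is 0.
PSEUDOCOUNT = 1.0


def background_fc_a(control_counts):
    """FC-A style background: average count across all selected controls."""
    return fmean(control_counts)


def background_fc_b(control_counts, top_n=3):
    """FC-B style background: here assumed to be the mean of the top 3 counts."""
    top = sorted(control_counts, reverse=True)[:top_n]
    return fmean(top)


def fold_change(bait_replicates, background, combine):
    """Per-replicate fold change over background, combined across replicates."""
    ratios = [(c + PSEUDOCOUNT) / (background + PSEUDOCOUNT)
              for c in bait_replicates]
    return combine(ratios)


# Made-up spectral counts for one prey.
controls = [0, 2, 4, 10, 1]   # counts in five negative controls
bait = [20, 5]                # counts in two bait replicates

fc_a = fold_change(bait, background_fc_a(controls), fmean)
fc_b = fold_change(bait, background_fc_b(controls), geometric_mean)
print(f"FC-A = {fc_a:.2f}, FC-B = {fc_b:.2f}")
```

With these inputs FC-B comes out lower than FC-A for two reasons: the top-3 background is higher than the overall average, and the geometric mean penalizes the replicate with the low count more than the arithmetic mean does.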
Step 6 (optional): The user can specify whether to run SAINT or not, and which SAINT options (‘lowMode’, ‘minFold’, ‘norm’) should be employed. For details regarding SAINT and its options, please refer to Choi et al., Current Protocols in Bioinformatics (PMID 22948730). As with the Fold Change calculations, the user may select which controls to use, and how replicates should be combined. Note that if the number of controls is greater than 10, SAINT generates 10 “virtual controls” by selecting the 10 highest counts for each protein.
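The “virtual controls” compression described in Step 6 can be pictured as follows. This is a minimal sketch of the idea only, with a hypothetical data layout (a dictionary mapping each prey to its list of counts across controls) and made-up counts; it is not code from SAINT or the CRAPome server.

```python
def virtual_controls(control_counts, n_virtual=10):
    """Keep the `n_virtual` highest control counts for each prey.

    `control_counts` maps a prey name to its spectral counts across all
    selected controls. If there are `n_virtual` or fewer controls, the
    counts are returned unchanged.
    """
    compressed = {}
    for prey, counts in control_counts.items():
        if len(counts) > n_virtual:
            # The i-th virtual control holds the i-th highest count.
            compressed[prey] = sorted(counts, reverse=True)[:n_virtual]
        else:
            compressed[prey] = list(counts)
    return compressed


# Made-up counts for two preys across 12 controls.
counts = {
    "HSPA8": [3, 0, 7, 1, 9, 2, 8, 4, 6, 5, 10, 11],
    "TUBB": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4],
}
for prey, values in virtual_controls(counts).items():
    print(prey, values)
```

Because the highest counts are selected independently for each protein, a virtual control does not correspond to any single real control run.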
Step 7: Once the desired options are selected, press “Run Analysis”. The new entry will appear at the top of the ‘Analysis Results’ list (the list includes all previous jobs run by the user). The status of the job (last column) will be displayed as submitted, queued, running or complete. If the user chooses to generate empirical scores alone (by not selecting the SAINT option), the status will turn to complete immediately. Analysis with SAINT takes more time, since SAINT is computationally intensive. If the user chooses to run SAINT (by selecting the SAINT option), the initial status will be ‘queued’. It will change to ‘running’ and then to ‘complete’ in 3-5 minutes, depending on the size of the data set, the number of iterations (niter), and the current load on the system. The column “Score Options” lists the selected options for the Fold Change calculations for both the primary (FC-A; here labeled S1) and the secondary, more stringent (FC-B; here labeled S2) scores. SAINT options (when applicable) are listed in the next column.
Step 8: Refresh the web page periodically by clicking on ‘Refresh’ to check the current status of a submitted job. When the job is finished and the results are ready to be viewed, the Status will change to ‘complete’. A link called ‘view results’ will appear. When the submitted job includes SAINT, the user will also receive an email notification with a link to the results page.
Step 9: Click on the ‘view results’ link to view the results. At the top of the page, you will see graphical views of the data that summarize the results for each of the baits analyzed, or for all baits at once. The left panel compares SAINT (when run) to FC-A; when SAINT is not used, this panel displays a comparison between FC-A and FC-B (see Fig. 9 and 10). In both cases, the left panel shows a Receiver Operating Characteristic (ROC) analysis of the scoring (benchmarked to the interactions reported in iRefIndex). This visualization can assist in deciding which scoring function to use on the data. The middle panel displays a histogram of the interactions reported in iRefIndex versus those not reported, at different bins of SAINT probability, or of FC-A score when SAINT was not run (see Fig. 9 and 10). Finally, the panel on the right compares two different scores (by default, SAINT and FC-B if SAINT is used; FC-A vs. FC-B otherwise) at the level of individual proteins. Mousing over any of the graphs will display relevant information (e.g. gene names).
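For readers unfamiliar with ROC analysis, the sketch below shows how such a curve is generally built from a list of scored interactions and a benchmark. It is a generic construction, not the plotting code used by the CRAPome: each bait-prey pair has a score (e.g. SAINT probability or FC-A) and a flag stating whether the interaction is reported in the benchmark, and the threshold is swept from the highest score to the lowest. The scores and flags are made up.

```python
def roc_points(scores, known):
    """Return (false positive rate, true positive rate) points.

    `scores` are interaction scores; `known[i]` is True when interaction i
    is reported in the benchmark set. Tied scores are processed together.
    """
    positives = sum(known)
    negatives = len(known) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one known and one unknown interaction")

    ranked = sorted(zip(scores, known), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Accept every interaction scoring at this threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under the ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Made-up scores and benchmark flags for five bait-prey pairs.
scores = [0.99, 0.90, 0.80, 0.40, 0.10]
known = [True, True, False, True, False]

curve = roc_points(scores, known)
print(curve)
print(round(area_under_curve(curve), 3))
```

A curve that rises quickly toward the top-left corner (larger area) indicates a score that ranks benchmark interactions ahead of the rest, which is the basis for comparing two scoring functions in the left panel.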
Step 10: The results can be viewed online in matrix form or downloaded in a tabular format (see Fig. 9 and 10).
Results Matrix
Results Visualizations
Bait-Prey Heatmap
Pathway Heatmap
Network