The deployment of a portal relies on two data files (an expression matrix and a table of observed measures, e.g. clinical measures) and a configuration file that defines which modules are displayed in the portal. In the simplest case, where the data has only one sample per subject, an interactive command-line wizard can guide you step by step through the creation of the configuration file. If your data contains more than one sample per subject, for example from different time points, the best approach is to run a function that creates a configuration file template with placeholders for the fields that must be completed. This file can then be edited in RStudio or any other text editor. In that case, you should also check the Data Preparation Guide vignette, which describes the expected format for each data file.
Using the wizard
The use of the interactive wizard requires placing files in the correct folders before starting. The following steps guide you through this.
- Create a folder where the app will be located
To facilitate the organization and deployment of the portal, it is better to create a folder that contains only the files that are related to the project. If you use RStudio, you may prefer to create a project.
- Copy the expression matrix file to the project folder
The matrix can be a CSV, TSV (tab-separated columns) or .rds file with a matrix object (not a data.frame). The matrix should have HGNC or similar gene names in rows and sample identifiers in columns.
If your matrix has the following format, you can move on to the next step:
| | S1_01 | S1_02 | S2_01 | S2_02 | S3_01 | S3_02 |
---|---|---|---|---|---|---|
ABC | -1.4000435 | 0.2553171 | -2.4372636 | -0.0055713 | 0.6215527 | 1.1484116 |
BCD | -1.8218177 | -0.2473253 | -0.2441996 | -0.2827054 | -0.5536994 | 0.6289820 |
CDE | 2.0650249 | -1.6309894 | 0.5124269 | -1.8630115 | -0.5220125 | -0.0526019 |
DEF | 0.5429963 | -0.9140748 | 0.4681544 | 0.3629513 | -1.3045435 | 0.7377763 |
EFG | 1.8885049 | -0.0974451 | -0.9358474 | -0.0159503 | -0.8267890 | -1.5123997 |
FGH | 0.9353632 | 0.1764886 | 0.2436855 | 1.6235489 | 0.1120381 | -0.1339970 |
GHI | -1.9100875 | -0.2792372 | -0.3134460 | 1.0673079 | 0.0700349 | -0.6391233 |
HIJ | -0.0499649 | -0.2514834 | 0.4447971 | 2.7554176 | 0.0465314 | 0.5777091 |
IJK | 0.1181949 | -1.9117205 | 0.8620865 | -0.2432367 | -0.2060872 | 0.0191776 |
JKL | 0.0295608 | 0.5498275 | -2.2741149 | 2.6825572 | -0.3612213 | 0.2133557 |
- Copy a measures table file to the project folder
This table can be a CSV, TSV or .rds file with a data.frame object. In this file, each row corresponds to a different subject and the order must match the order of samples in the expression matrix (if your data contains more than one sample per subject, you should follow the Data Preparation Guide and not follow these steps). The first column of this table should be named and contain subject or sample identifiers.
If your measures table has the following format, you can move on to the next step:
Sample_ID | Platelets_m01 | Platelets_m02 | Age | drugNaive |
---|---|---|---|---|
S1 | 235.8666 | 241.8803 | 54 | Yes |
S2 | 206.6894 | 236.7350 | 76 | Yes |
S3 | 175.2997 | 174.8539 | 37 | No |
- Optionally copy a metadata table file to the folder
This table should also be in any of the formats above and should not have any sample or subject identifier columns. The columns of this table will be used to populate the interface with radio buttons that allow the selection of sample subsets. It should also follow a one row per subject/sample format.
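The input files described above can also be prepared from R. The following is a minimal sketch, assuming one sample per subject. The file names (expression_matrix.rds, measures.csv, metadata.csv), the simulated expression values, the group column and the temporary project folder are illustrative assumptions, not names required by the package.

```r
# Illustrative only: file names and values are assumptions, not package requirements.
set.seed(1)
project_dir <- file.path(tempdir(), "portal_project")  # hypothetical project folder
dir.create(project_dir, showWarnings = FALSE)

samples <- c("S1", "S2", "S3")
genes <- c("ABC", "BCD", "CDE")

# Expression matrix: gene names in rows, sample identifiers in columns.
expr <- matrix(rnorm(length(genes) * length(samples)),
               nrow = length(genes),
               dimnames = list(genes, samples))
stopifnot(is.matrix(expr), !is.data.frame(expr))
saveRDS(expr, file.path(project_dir, "expression_matrix.rds"))

# Measures table: one row per subject, identifiers in the first column.
measures <- data.frame(
  Sample_ID = samples,
  Platelets_m01 = c(235.8666, 206.6894, 175.2997),
  Age = c(54, 76, 37),
  drugNaive = c("Yes", "Yes", "No")
)
write.csv(measures, file.path(project_dir, "measures.csv"), row.names = FALSE)

# Optional metadata table: one row per subject, no identifier column.
metadata <- data.frame(group = c("control", "treatment", "treatment"))
write.csv(metadata, file.path(project_dir, "metadata.csv"), row.names = FALSE)

# The row order of the measures table must match the column order of the matrix.
order_matches <- identical(colnames(expr), as.character(measures$Sample_ID))
```

If order_matches is FALSE, reorder the rows of the measures table (or the columns of the matrix) before saving the files.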
Finally,
- In R, load the package and run create_config_wizard(getwd())
If you are not using an RStudio project, ensure that the folder with the files is the current working directory. You can check the current working directory with getwd() and use setwd("path/to/folder") to modify it. You should then run create_config_wizard(getwd()).
The wizard will inform you about what each step is doing and will ask you questions about names of files and other details to create the configuration file. It will also wait when you are required to do additional actions such as creating folders and copying files. Depending on your choices, at least two files will have been created when you finish it: app.R and config.yaml.
- Open and execute the code of app.R to test the portal
Still using R (or RStudio), you can source the app.R file to run the code and open the portal in your browser. You can also copy the project folder to a Shiny server or use the rsconnect package to deploy it to shinyapps.io.
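The last two steps can be sketched as a short script. This is a sketch under assumptions: the helper missing_inputs() and the file names passed to it are hypothetical, and the package calls are only reached in an interactive session because the wizard asks questions at the console.

```r
# Hypothetical helper: returns the names of expected files not found in `dir`.
missing_inputs <- function(dir, files) {
  files[!file.exists(file.path(dir, files))]
}

# Assumed file names; replace them with the files you copied to the project folder.
inputs <- c("expression_matrix.rds", "measures.csv")
not_found <- missing_inputs(getwd(), inputs)

if (length(not_found) > 0) {
  message("Missing from ", getwd(), ": ", paste(not_found, collapse = ", "))
} else if (interactive()) {
  library(shinyExprPortal)
  create_config_wizard(getwd())  # guides you through creating the configuration
  source("app.R")                # test the portal, as described above
}
```

Checking for the input files first avoids starting the wizard from the wrong working directory.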
Creating a config template
If your data is more complex than the case outlined above, you can run create_config_template(getwd()) to create a config.yaml file that will contain placeholder names to be replaced. If you decide to use this method, you will have to create a lookup table file, by default named lookup_table.csv, which matches samples with subjects in the measures table and looks like the following:
#> source sample_id subject_id group
#> 1 microarray sample_1 subject_1 control
#> 2 microarray sample_2 subject_2 treatment
#> 3 microarray sample_3 subject_3 treatment
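A lookup table with this layout can be written from R. The sketch below reproduces the rows shown above; the temporary output folder is chosen only for illustration, and in a real project the file would be saved in the project folder.

```r
# Build the example lookup table: one row per sample, linked to its subject.
lookup <- data.frame(
  source = "microarray",
  sample_id = paste0("sample_", 1:3),
  subject_id = paste0("subject_", 1:3),
  group = c("control", "treatment", "treatment")
)

# Illustrative location; use your project folder instead of tempdir().
out_file <- file.path(tempdir(), "lookup_table.csv")
write.csv(lookup, out_file, row.names = FALSE)

# Read the file back to confirm its columns.
print(read.csv(out_file))
```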
As you can see above, the lookup table also includes sample metadata information (group). Any metadata that you want to use to create subsets in the interface (e.g. to compute correlations only for a control group) should be included in this table and then defined in the configuration file under sample_categories, as follows:
In the modules of the portal that allow the selection of a subset of samples, the configuration above will appear as the following control:
Including new modules in the configuration
After the config.yaml file has been created, you can edit it to modify the setup of modules that have already been defined or to include new ones. The modules available in the package vary in their requirements and aims: some are more exploratory and only require changes to the configuration file, while others were designed to help showcase and explore analysis results. For example, if you have computed sets of genes using a package such as WGCNA, you can create a table to load them into the geneModulesHeatmap module. In the current version, the following modules are available:
shinyExprPortal::show_available_modules()
#> [1] "cohortOverview" "degModules" "degSummary"
#> [4] "degDetails" "corrModules" "singleGeneCorr"
#> [7] "singleMeasureCorr" "geneModulesHeatmap" "multiMeasureCorr"
#> [10] "compareTrajGroups" "geneProjectionOverlay"
The modules are grouped by their requirements as follows:
No additional files needed
- Single gene/multiple measures correlations
- Single measure/all genes correlation
- Multi-measure/all genes correlation
- Expression/measures changes over time
Additional files needed
- Differential expression models summary page: table of models, DE package outputs (e.g. limma, DESeq2, edgeR)
- Differential expression models visualization: table of models, DE package outputs (e.g. limma, DESeq2, edgeR)
- Gene modules heatmap: data frame with gene lists (WGCNA, genes of interest, etc.)
- 2D gene projection (e.g. MDS, UMAP): data frame with 2D coordinates for all genes
Check the Full Configuration Guide for details about each module and how to set up the additional files required by each of them.
Deploying the portal remotely
You can deploy the app to your Posit/RStudio Connect server or to the public shinyapps.io website (note that you cannot password-protect the portal under the free plan). You can follow the guide to set up your account and install the required packages. The only other requirement is to edit the app.R file so that it loads the optional dependencies of each module you use (as listed in the configuration guide).
For example, the original app.R would look like this:
library(shinyExprPortal)
run_app("config.yaml")
If you want to use the geneModulesHeatmap module to visualize heatmaps of lists of genes, you must also have the RColorBrewer package installed. To deploy the portal to shinyapps.io, you must then load the package in app.R, as in the example below:
```