Welcome to the IntLIM shiny app!

The goal of this app is to provide a user-friendly platform for integrating multi-omics data. Specifically, the software finds analyte relationships that are specific to a given phenotype (e.g. cancer vs. non-cancer). For example, a given analyte pair could show a strong correlation in one phenotype (e.g. cancer) and no correlation in the other (e.g. non-cancer).

More details can be found in our publication “IntLIM: integration using linear models of metabolomics and gene expression data”.

Getting started (loading in data)

Please be sure that the input CSV file (described below) and all data files listed in it are in the same folder. List file names only, without path names.

Users will need to input files for analyte levels for analyte type 1 (e.g. metabolite abundance data), analyte levels for analyte type 2 (e.g. gene expression data), sample meta-data, analyte type 1 meta-data (optional) and analyte type 2 meta-data (optional).

Users also need to input a CSV file named 'input.csv' with two required columns: 'type' and 'filenames'.

The CSV file is expected to have the following 2 columns and 6 rows:

  1. type,filenames
  2. analyteType1,myfilename
  3. analyteType2,myfilename
  4. analyteType1MetaData,myfilename (optional)
  5. analyteType2MetaData,myfilename (optional)
  6. sampleMetaData,myfilename

NOTE: For the ShinyApp, the meta-file must be named 'input.csv'
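
As a minimal sketch, the meta-file could be written from R as shown here. The data file names (metabolites.csv, genes.csv, and so on) and the use of a temporary folder are placeholders chosen for illustration, not names the app requires.

```r
# Hypothetical file names; replace them with your own data files.
input <- data.frame(
  type = c("analyteType1", "analyteType2", "analyteType1MetaData",
           "analyteType2MetaData", "sampleMetaData"),
  filenames = c("metabolites.csv", "genes.csv", "metabolite_info.csv",
                "gene_info.csv", "samples.csv"),
  stringsAsFactors = FALSE
)

# Write the meta-file into the folder that holds the data files
# (a temporary folder is used here only for illustration).
data_folder <- tempdir()
input_path <- file.path(data_folder, "input.csv")
write.csv(input, input_path, row.names = FALSE, quote = FALSE)

# Read the file back to confirm the header row and the five data rows.
check <- read.csv(input_path, stringsAsFactors = FALSE)
print(check)
```

The two optional rows can simply be left out of the data frame if no analyte meta-data files are available.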

Note also that the input data files should be in a specific format:

  • analyteType1: rows are analytes, columns are samples; the first row is assumed to have sample ids and these ids should be unique; the first column is assumed to have feature ids and those should be unique.
  • analyteType2: rows are analytes, columns are samples; the first row is assumed to have sample ids and these ids should be unique; the first column is assumed to have feature ids and those should be unique.
  • analyteType1MetaData: rows are analytes, features are columns
  • analyteType2MetaData: rows are analytes, features are columns
  • sampleMetaData: rows are samples, features are columns

NOTE: The first column of the sampleMetaData file is assumed to be the sample id, and those sample ids should match the first row of analyteType1 and analyteType2 (i.e. all sample ids in the analyteType1 and analyteType2 files must also be in the sampleMetaData file).
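
The format requirements above can be checked before uploading. The sketch below uses small made-up tables in place of real files; the object names (analyte1, analyte2, sample_meta) are hypothetical and the check is not part of the app itself.

```r
# Toy stand-ins for the three required tables (all values are made up).
analyte1 <- matrix(1:6, nrow = 2,
                   dimnames = list(c("met1", "met2"), c("S1", "S2", "S3")))
analyte2 <- matrix(6:1, nrow = 2,
                   dimnames = list(c("geneA", "geneB"), c("S1", "S2", "S3")))
sample_meta <- data.frame(
  id = c("S1", "S2", "S3", "S4"),
  phenotype = c("cancer", "cancer", "non-cancer", "non-cancer")
)

# Sample ids (column names) and feature ids (row names) must be unique.
ids_unique <- anyDuplicated(colnames(analyte1)) == 0 &&
  anyDuplicated(colnames(analyte2)) == 0 &&
  anyDuplicated(rownames(analyte1)) == 0 &&
  anyDuplicated(rownames(analyte2)) == 0

# Every sample id in the analyte tables must appear in the sample meta-data.
missing1 <- setdiff(colnames(analyte1), sample_meta$id)
missing2 <- setdiff(colnames(analyte2), sample_meta$id)

if (!ids_unique) stop("Sample or feature ids are not unique.")
if (length(missing1) > 0 || length(missing2) > 0) {
  stop("Some sample ids are missing from the sample meta-data.")
}
message("Input tables look consistent.")
```

Note that the sample meta-data may contain extra samples (S4 here); only the reverse direction is required.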

Test data

The package includes a reduced set of the original NCI-60 dataset. The CSV input file location for this test dataset can be located by typing the following in the R console:

     dir <- system.file("extdata", package="IntLIM", mustWork=TRUE)
     csvfile <- file.path(dir, "NCItestinput.csv")
     csvfile

Please see the vignette at https://github.com/ncats/IntLIM/tree/liz_dev/vignettes/IntLimVignette.Rmd for additional information.

Additional NCI-60 and breast cancer demo datasets can be found at https://github.com/ncats/IntLIM2.0ExtraDataVignettes.

Contact

If you have any questions, comments, or concerns on how to use IntLIM, please contact Ewy Mathe at ewy.mathe@nih.gov or Tara Eicher at tara.eicher@nih.gov.

Load Data

This step takes all the relevant CSV files as input, including the following (see About for more details):
  • input.csv (required): contains the names of all input files (see About)
  • analyteType1Data (required): rows are analytes of the first type, columns are samples; the first row is assumed to have sample ids and these ids should be unique; the first column is assumed to have feature ids and those should be unique.
  • sampleMetaData (required): rows are samples, features are columns
  • analyteType2Data (required): rows are analytes of the second type, columns are samples; the first row is assumed to have sample ids and these ids should be unique; the first column is assumed to have feature ids and those should be unique.
  • analyteType1MetaData (optional): rows are analytes of the first type, features are columns
  • analyteType2MetaData (optional): rows are analytes of the second type, features are columns

                        

Summary Statistics
                          
Distribution of Input Data
                          

Filter Data (optional)

This step allows you to filter the data by a user-defined percentile cutoff.
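
To illustrate what a percentile cutoff does, the sketch below assumes the filter is applied to each analyte's mean level across samples. That rule, the simulated data, and the variable names are assumptions made for illustration; the app's exact filtering rule may differ.

```r
set.seed(1)

# Simulated analyte table: 10 analytes (rows) by 4 samples (columns).
levels_mat <- matrix(rexp(40), nrow = 10,
                     dimnames = list(paste0("analyte", 1:10),
                                     paste0("S", 1:4)))

# User-defined percentile cutoff (here the 25th percentile).
cutoff_percentile <- 0.25

# Assumed rule: keep analytes whose mean level is at or above the cutoff.
analyte_means <- rowMeans(levels_mat)
threshold <- quantile(analyte_means, probs = cutoff_percentile)
filtered <- levels_mat[analyte_means >= threshold, , drop = FALSE]

# Compare the summary statistics before and after filtering.
summary(analyte_means)
summary(rowMeans(filtered))
```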

Summary statistics of the original data
Summary statistics of the filtered data
Distribution of the original data
Distribution of the filtered data

Run IntLIM

This step fits a linear model for every analyte pair and then plots the distribution of p-values.
The linear model is 'a_i ~ a_j + p + a_j:p', where
  • a_i is the outcome analyte level (may be of type 1 or 2)
  • a_j is the independent analyte level (may be of type 1 or 2)
  • p is the phenotype (e.g. tumor vs non-tumor)
  • a_j:p is the interaction between phenotype and independent analyte level
A statistically significant p-value for the interaction term a_j:p indicates that the analyte pair relationship is phenotype-specific. Please see the manuscript for more details.
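
For a single analyte pair, a model of this form can be fit with base R's lm(). The sketch below uses simulated data in which the two analytes are related only in the tumor group; it shows the model formula for one pair and is not the package's internal implementation.

```r
set.seed(42)

# Simulated data for one analyte pair: 20 samples per phenotype.
n <- 40
p <- factor(rep(c("non-tumor", "tumor"), each = n / 2))
a_j <- rnorm(n)

# a_i depends on a_j only in the tumor group (slope 2 vs. slope 0).
a_i <- ifelse(p == "tumor", 2 * a_j, 0) + rnorm(n, sd = 0.5)

# Fit a_i ~ a_j + p + a_j:p and extract the interaction term.
fit <- lm(a_i ~ a_j + p + a_j:p)
coefs <- summary(fit)$coefficients
interaction_p <- coefs["a_j:ptumor", "Pr(>|t|)"]

print(coefs)
```

A small p-value in the 'a_j:ptumor' row reflects the different slopes in the two phenotype groups.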


Loading... (This may take several minutes depending on the size of the dataset; please be patient.)

Process the result

This step processes the results and filters analyte pairs based on cutoffs for the adjusted p-value, the R^2 value, and the interaction coefficient between the two groups being compared.
Then plot the beta graph of the significant analyte pairs by filling out the parameters below and clicking 'Run'.
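
The filtering logic can be sketched on a small made-up results table. The column names, the cutoff values, and the choice of Benjamini-Hochberg adjustment are assumptions for illustration only, not the app's defaults.

```r
# Hypothetical results for four analyte pairs (all values are made up).
results <- data.frame(
  analyte_i = c("geneA", "geneB", "geneC", "geneD"),
  analyte_j = c("met1", "met2", "met3", "met4"),
  p_value = c(0.0001, 0.004, 0.03, 0.6),
  r_squared = c(0.80, 0.35, 0.70, 0.10),
  interaction_coef = c(1.5, -0.9, 0.2, 0.05),
  stringsAsFactors = FALSE
)

# Adjust the interaction p-values for multiple testing (assumed: BH/FDR).
results$p_adj <- p.adjust(results$p_value, method = "BH")

# Illustrative cutoffs; in the app these are set by the user.
p_adj_cutoff <- 0.05
r_squared_cutoff <- 0.5
coef_cutoff <- 0.5

# Keep pairs that pass all three cutoffs.
significant <- results[results$p_adj <= p_adj_cutoff &
                         results$r_squared >= r_squared_cutoff &
                         abs(results$interaction_coef) >= coef_cutoff, ]
print(significant)
```

In this toy table only the geneA/met1 pair passes all three cutoffs; geneB fails the R^2 cutoff and geneC fails the coefficient cutoff.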

Scatter plot

This step presents the table of significant analyte pairs.
You can draw a scatter plot for an analyte pair of interest by clicking its row in the table.
Significant pairs
