This function creates a 'volc3d' object of S4 class for downstream plots, containing the p-values from a 2x3 factor analysis, expression data, sample data and polar coordinates. For RNA-Seq count data, the two functions deseq_2x3() followed by deseq_2x3_polar() can be used instead.

polar_coords_2x3(
  data,
  metadata = NULL,
  outcome,
  group,
  pvals = NULL,
  padj = pvals,
  pcutoff = 0.05,
  fc_cutoff = NULL,
  padj.method = "BH",
  process = c("positive", "negative", "two.sided"),
  scheme = c("grey60", "red", "gold2", "green3", "cyan", "blue", "purple", "black"),
  labs = NULL,
  ...
)

Arguments

data

Dataframe or matrix with variables in columns and samples in rows

metadata

Dataframe of sample information with samples in rows

outcome

Either the name of the column in metadata containing the binary outcome data, or a vector with 2 groups, ideally a factor. If it is not a factor, it will be coerced to a factor. This must have exactly 2 levels.

group

Either the name of the column in metadata containing the 3-way grouping data, or a vector with 3 groups, ideally a factor. If it is not a factor, it will be coerced to a factor. This must have exactly 3 levels. NOTE: if pvals is given, the order of the levels in group must correspond to the order of columns in pvals.

pvals

Optional matrix or dataframe with p-values in 3 columns. If pvals is not given, the p-values are calculated using the function calc_stats_2x3(). Each of the 3 columns holds the p-values for the binary outcome comparison within one of the 3 groups specified in group.

padj

Matrix or dataframe with adjusted p-values. If not supplied, defaults to the nominal p-values from pvals.

pcutoff

Cut-off for p-value significance

fc_cutoff

Cut-off for fold change on radial axis

padj.method

Can be "qvalue" or any method available in p.adjust. The option "none" is a pass-through.

process

Character value specifying the colour process for statistically significant genes: "positive" specifies that genes are coloured if fold change is >0; "negative" colours genes with fold change <0 (note that for clarity the polar position is altered so that genes along each axis have the most strongly negative fold change values); "two.sided" is a compromise in which positive genes are labelled as before but genes with negative fold changes and significant p-values have an inverted colour scheme.

scheme

Vector of colours starting with non-significant variables

labs

Optional character vector for labelling groups. Default NULL leads to abbreviated labels based on levels in outcome using abbreviate(). A vector of length 3 with custom abbreviated names for the outcome levels can be supplied. Otherwise a vector of length 8 is expected, of the form "ns", "A+", "A+B+", "B+", "B+C+", "C+", "A+C+", "A+B+C+", where "ns" means non-significant and A, B, C refer to levels 1, 2, 3 in outcome. The labels must be in this order.

...

Optional arguments passed to calc_stats_2x3
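The effect of padj.method can be illustrated with base R alone. The following is a minimal sketch, assuming the adjustment is applied to each of the 3 p-value columns separately (an assumption made for illustration, not confirmed here); the p-values are made up.

```r
# Made-up nominal p-values: 4 variables (rows) x 3 groups (columns)
pvals <- cbind(
  A = c(0.001, 0.02, 0.04, 0.30),
  B = c(0.50, 0.01, 0.20, 0.03),
  C = c(0.04, 0.04, 0.60, 0.90)
)

# Benjamini-Hochberg adjustment, column by column (assumed behaviour)
padj <- apply(pvals, 2, p.adjust, method = "BH")

# method = "none" is a pass-through: values are returned unchanged
padj_none <- apply(pvals, 2, p.adjust, method = "none")

print(round(padj, 4))
```

Adjusted p-values are never smaller than the nominal ones, so fewer variables pass pcutoff after adjustment.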

Value

Returns an S4 'volc3d' object containing:

  • 'df' A list of 2 dataframes. Each dataframe contains x, y, z coordinates as well as polar coordinates r and angle. The first dataframe has coordinates on scaled data. The second dataframe has unscaled data (e.g. log2 fold change for gene expression). The type argument in volcano3D, radial_plotly and radial_ggplot corresponds to these dataframes.

  • 'outcome' The three-group contrast factor used for comparisons, linked to the group column

  • 'data' Dataframe or matrix containing the expression data

  • 'pvals' A dataframe containing p-values in 3 columns representing the binary comparison for the outcome for each of the 3 groups.

  • 'padj' A dataframe containing p-values adjusted for multiple testing

  • 'pcutoff' Numeric value for cut-off for p-value significance

  • 'scheme' Character vector with colour scheme for plotting

  • 'labs' Character vector with labels for colour groups

Details

This function is designed for manually generating a 'volc3d' class object for visualising a 2x3 way analysis comparing a large number of attributes such as genes. For RNA-Seq data we suggest using deseq_2x3() and deseq_2x3_polar() functions in sequence instead.
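A call might look like the following sketch. The simulated matrix, the metadata column names response and drug, and the object name obj are all hypothetical; only the argument names come from the usage shown above. The call itself is guarded so the snippet still runs when volcano3D is not installed.

```r
set.seed(42)
n_samples <- 60
n_genes <- 100

# Simulated expression matrix: samples in rows, variables (genes) in columns
expr <- matrix(
  rnorm(n_samples * n_genes),
  nrow = n_samples,
  dimnames = list(paste0("sample", seq_len(n_samples)),
                  paste0("gene", seq_len(n_genes)))
)

# Hypothetical sample metadata: a 2-level outcome and a 3-level group
meta <- data.frame(
  response = factor(rep(c("non-responder", "responder"), times = n_samples / 2)),
  drug = factor(rep(c("A", "B", "C"), each = n_samples / 3)),
  row.names = rownames(expr)
)

# Only attempt the call if the package is available
if (requireNamespace("volcano3D", quietly = TRUE)) {
  obj <- volcano3D::polar_coords_2x3(
    expr,
    metadata = meta,
    outcome = "response",
    group = "drug",
    pcutoff = 0.05,
    padj.method = "BH",
    process = "positive"
  )
}
```

Because pvals is not supplied here, the p-values would be computed internally, as described in the arguments above.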

Scaled polar coordinates are generated using the t-score for each group comparison. Unscaled polar coordinates are generated as the difference between means for each group comparison. If p-values are not supplied they are calculated by calc_stats_2x3() using either t-tests or Wilcoxon tests.
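The per-group statistics described above can be sketched in base R for a single simulated gene. This is an illustration only: the helper project() is hypothetical and assumes the three group axes are spaced 120 degrees apart, which may differ from the projection the package actually uses.

```r
set.seed(1)
outcome <- factor(rep(c("non-responder", "responder"), times = 15))
group <- factor(rep(c("A", "B", "C"), each = 10))

# One simulated gene, upregulated in responders of group A only
gene <- rnorm(30) + ifelse(group == "A" & outcome == "responder", 2, 0)

# Within each group, compare responders against non-responders
stats <- sapply(levels(group), function(g) {
  idx <- group == g
  tt <- t.test(gene[idx & outcome == "responder"],
               gene[idx & outcome == "non-responder"])
  c(t = unname(tt$statistic),                          # basis of scaled coordinates
    diff = unname(tt$estimate[1] - tt$estimate[2]),    # basis of unscaled coordinates
    p = tt$p.value)
})

# Hypothetical projection of 3 per-group values onto a plane
# (assumption: axes at 0, 120 and 240 degrees)
project <- function(v) {
  ang <- c(0, 2 * pi / 3, 4 * pi / 3)
  x <- sum(v * cos(ang))
  y <- sum(v * sin(ang))
  c(x = x, y = y, r = sqrt(x^2 + y^2),
    angle = (atan2(y, x) * 180 / pi) %% 360)
}

scaled <- project(stats["t", ])
unscaled <- project(stats["diff", ])
print(stats)
```

Under this assumed projection, a gene with equal values in all three groups lands at the origin, while a gene that differs in only one group points along that group's axis.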

The z axis for 3D volcano plots does not have as clear a counterpart in 2x3 analysis as in the standard 3-way analysis (which uses the likelihood ratio test for the 3 groups). For 2x3 polar analysis, the smallest p-value from the 3 per-group comparisons for each gene is used to generate a z coordinate as -log10(p-value).
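As a small worked example of the z coordinate, the sketch below takes the row-wise minimum of a made-up p-value matrix and transforms it with -log10. The gene names and values are invented for illustration.

```r
# Made-up p-values for 2 genes across the 3 group comparisons
pvals <- matrix(
  c(0.01, 0.5, 0.2,
    0.04, 0.9, 0.001),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("gene1", "gene2"), c("A", "B", "C"))
)

# Smallest p-value per gene, then -log10 transform
z <- -log10(apply(pvals, 1, min))
print(z)
```

Here gene1 has a smallest p-value of 0.01 and gene2 of 0.001, so their z coordinates are 2 and 3 respectively.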

The colour scheme is not as straightforward as for the standard polar plot and volcano3D plot since genes (or attributes) can be significantly up- or downregulated in the response comparison for each of the 3 groups. process = "positive" means that genes are labelled with colours if a gene is significantly upregulated in the response for that group. This uses the primary colours (RGB), with secondary colours for combinations, so that a gene upregulated in both the red and blue groups becomes purple, and so on. If the gene is upregulated in all 3 groups it is labelled black. Non-significant genes are in grey.
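The colour assignment for process = "positive" can be sketched as a lookup into the default scheme and the 8 label categories. The helper classify_positive() is hypothetical, and the strict comparison padj < pcutoff is an assumption; fc_cutoff is ignored in this sketch.

```r
scheme <- c("grey60", "red", "gold2", "green3", "cyan", "blue", "purple", "black")
labs <- c("ns", "A+", "A+B+", "B+", "B+C+", "C+", "A+C+", "A+B+C+")

# Hypothetical helper: classify one gene from its 3 adjusted p-values
# and 3 fold changes (one value per group, in the order A, B, C)
classify_positive <- function(padj, fc, pcutoff = 0.05) {
  up <- padj < pcutoff & fc > 0          # significantly upregulated per group
  key <- paste0(ifelse(up[1], "A+", ""),
                ifelse(up[2], "B+", ""),
                ifelse(up[3], "C+", ""))
  if (key == "") key <- "ns"
  match(key, labs)                       # index into labs and scheme
}

# Upregulated in groups A and C: red + blue gives purple
i <- classify_positive(padj = c(0.01, 0.50, 0.02), fc = c(1.2, 0.3, 2.0))
labs[i]
scheme[i]
```

A gene that is significant but downregulated in every group falls into the "ns" category in this sketch, which matches the description of process = "positive".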

With process = "negative" genes are coloured when they are significantly downregulated. With process = "two.sided" the colour scheme means that both significantly up- and down-regulated genes are coloured with downregulated genes labelled with inverted colours (i.e. cyan is the inverse of red etc). However, significant upregulation in a group takes precedence.

See also