---
title: "CitFuns"
output: rmarkdown::html_vignette
description: |
  How to use CitFuns functions.
vignette: >
  %\VignetteIndexEntry{CitFuns Package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, message = F, warning = F, error = F)
```
## Installation
To install the CitFuns package from its Git repository, you must first install the `devtools` package:
```{r, eval=F}
install.packages('devtools')
```
On Windows, building packages from source also requires [Rtools](https://cran.r-project.org/bin/windows/Rtools/rtools40.html); install it if you have not already done so.
You will also need [Git](https://git-scm.com/downloads) installed on your computer.
Now we are ready to install the CitFuns package:
```{r, eval=F}
devtools::install_git("https://git.ratg.cat/marcelcosta/CitFuns.git", build_vignettes = TRUE)
```
## Update
Any time you want to update the package, you must *reinstall* it:
```{r, eval=F}
detach("package:CitFuns", unload = TRUE) # Only required if you have loaded the package in this session.
devtools::install_git("https://git.ratg.cat/marcelcosta/CitFuns.git")
```
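Calling `detach()` on a package that is not attached raises an error, so the detach step can be guarded. The following is a minimal sketch using only base R; it is a convenience pattern, not something CitFuns itself provides.
```{r, eval=F}
# Detach CitFuns only if it is currently attached to the search path
if ("package:CitFuns" %in% search()) {
  detach("package:CitFuns", unload = TRUE)
}
```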
## ggheatmap
**ggheatmap** generates ggplot heatmaps easily.
We start by loading the required packages, including *CitFuns*:
```{r}
library(tidyverse)
library(CitFuns)
```
Now we will create an example dataframe:
```{r}
df <- data.frame("pats" = paste0("PAT", 1:20), "CytA" = rnorm(20, 5), "CytB" = rnorm(20, 5),
                 "CytC" = c(rnorm(5, 10), rnorm(5, 5), rnorm(5, 10), rnorm(5, 5)), "CytD" = rnorm(20, 5),
                 "CytE" = c(rnorm(5, 10), rnorm(5, 5), rnorm(5, 10), rnorm(5, 5)), "CytF" = rnorm(20, 5),
                 "CytG" = c(rnorm(5, 10), rnorm(5, 5), rnorm(5, 10), rnorm(5, 5)))
# Reshape from wide to long format: one row per patient-cytokine pair
df <- gather(df, Cyt, Value, -pats)
head(df)
```
This package generally expects data frames in **long** format, as is usual in a ggplot workflow.
Now we can generate the heatmap:
```{r, fig.width=8}
ggheatmap(df)
```
As we can see, the X and Y axes are ordered by hierarchical clustering (using *hclust*).
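To illustrate what ordering axes by clustering means, here is a rough sketch in base R. It is not taken from the CitFuns source: the object names (`toy`, `toy_mat`, `row_order`, `col_order`) are hypothetical, and the default Euclidean distance and complete linkage are assumptions.
```{r}
# Toy long-format data (hypothetical, independent of the example above)
set.seed(1)
toy <- expand.grid(pats = paste0("PAT", 1:6), Cyt = c("CytA", "CytB", "CytC"),
                   stringsAsFactors = FALSE)
toy$Value <- rnorm(nrow(toy), mean = 5)
# Reshape to a wide matrix: one row per patient, one column per cytokine
toy_mat <- tapply(toy$Value, list(toy$pats, toy$Cyt), mean)
# Cluster rows and columns, then read off the leaf order of each dendrogram
row_order <- rownames(toy_mat)[hclust(dist(toy_mat))$order]
col_order <- colnames(toy_mat)[hclust(dist(t(toy_mat)))$order]
# Re-level the factors so that a plot draws both axes in clustered order
toy$pats <- factor(toy$pats, levels = row_order)
toy$Cyt <- factor(toy$Cyt, levels = col_order)
levels(toy$pats)
```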
*ggheatmap* aggregates the values when there is more than one observation for an X-Y coordinate. By default it takes the mean, but the median can be used instead. To show this, we create another variable, "Met", and plot cytokine expression *versus* Met status.
```{r, fig.width=8}
clinics<-data.frame("pats"=df$pats %>% unique, "Met"=rep(c("0","1"), 10))
df<-merge(df, clinics)
ggheatmap(df, x="Cyt",y="Met", grouping = "median")
```
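The grouping step described above can be pictured as collapsing all observations that share an X-Y coordinate into one value. The sketch below uses base R `aggregate()` on a small hypothetical data frame (`obs`); it shows the idea only and is not necessarily how ggheatmap does it internally.
```{r}
# Three observations for CytA/Met 0 and two for CytB/Met 1 (hypothetical values)
obs <- data.frame(Cyt = c("CytA", "CytA", "CytA", "CytB", "CytB"),
                  Met = c("0", "0", "0", "1", "1"),
                  Value = c(1, 2, 9, 4, 6))
# One value per Cyt-Met coordinate, summarised by mean or by median
obs_mean <- aggregate(Value ~ Cyt + Met, data = obs, FUN = mean)
obs_median <- aggregate(Value ~ Cyt + Met, data = obs, FUN = median)
obs_mean
obs_median
```
With skewed values such as 1, 2 and 9, the mean and the median differ noticeably, which is why the choice of grouping function can change the look of the heatmap.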
By default, *variables that are not used are dropped* during the grouping process. However, if you want to use them afterwards (for faceting, for example), you can pass them to the *exclude_group* parameter to keep them in the data.frame.
```{r, fig.width=8}
ggheatmap(df, exclude_group = "Met")+facet_grid(.~Met, scales = "free")
```
You can set a color for the tile borders, which are transparent by default:
```{r, fig.width=8}
ggheatmap(df, color="black")
```
Finally, you can scale the heatmap either by rows or by columns:
```{r, fig.width=8}
ggheatmap(df, scale="rows")
```
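As a rough illustration of scaling by rows versus by columns, the sketch below standardises a small hypothetical matrix (`sc_mat`) with base R `scale()`. Using z-scores is an assumption here; the exact transformation that ggheatmap applies is not stated in this vignette.
```{r}
# Two patients measured on very different ranges (hypothetical values)
sc_mat <- matrix(c(1, 2, 3,
                   10, 20, 30), nrow = 2, byrow = TRUE,
                 dimnames = list(c("PAT1", "PAT2"), c("CytA", "CytB", "CytC")))
# scale() standardises columns, so transpose to work on rows and transpose back
sc_rows <- t(scale(t(sc_mat)))
# Column-wise standardisation needs no transposition
sc_cols <- scale(sc_mat)
sc_rows
```
After row scaling, both patients show the same pattern even though their raw ranges differ tenfold, which is the usual reason to scale a heatmap.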
It is worth noting that ggheatmap returns a ggplot object, so you can modify it further in the usual way:
```{r, fig.width=8, fig.height=5}
ggheatmap(df) +
  scale_fill_gradient(low = "black", high = "yellow") +
  scale_y_discrete(position = "right") +
  theme(legend.position = "bottom")
```
## ggcorrplot
ggcorrplot generates a correlation matrix. Using the same example dataframe, you have to specify which variable and value columns will be used to test correlation.
```{r, fig.width=5, fig.height=4}
ggcorrplot(df, var = "Cyt", value = "Value")
```
You can specify a color for the tile lines, transparent by default:
```{r, fig.width=5, fig.height=4}
ggcorrplot(df, var = "Cyt", value = "Value", color = "white")
```
By default, ggcorrplot converts the p-value into its *star* significance equivalent. You can instead show the p-value itself ("pval") or nothing ("none").
```{r, fig.width=5, fig.height=4}
ggcorrplot(df, var = "Cyt", value = "Value", color = "white", stat="pval")
```
ggcorrplot uses "pearson" by default to obtain the correlation coefficient, although "spearman" is also available.
```{r, fig.width=5, fig.height=4}
ggcorrplot(df, var = "Cyt", value = "Value", color = "white", stat="pval", method="spearman")
```
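To make the relation between method, p-value and stars concrete, here is a small sketch using base R `cor.test()`. The helper `p_to_stars` and its cut-offs (0.001, 0.01, 0.05) are hypothetical and follow a common convention; the thresholds ggcorrplot actually uses may differ.
```{r}
set.seed(42)  # reproducible toy data
cyt_a <- rnorm(30, mean = 5)
cyt_b <- cyt_a + rnorm(30, sd = 0.5)
# Same pair of variables tested with both correlation methods
pearson_test <- cor.test(cyt_a, cyt_b, method = "pearson")
spearman_test <- cor.test(cyt_a, cyt_b, method = "spearman")
# Hypothetical helper: map p-values to star labels (assumed thresholds)
p_to_stars <- function(p) {
  as.character(cut(p, breaks = c(-Inf, 0.001, 0.01, 0.05, Inf),
                   labels = c("***", "**", "*", "")))
}
c(pearson = unname(pearson_test$estimate), spearman = unname(spearman_test$estimate))
p_to_stars(c(pearson_test$p.value, spearman_test$p.value))
```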
Finally, you can show only the upper or the lower triangle of the symmetric matrix.
```{r, fig.width=5, fig.height=4}
ggcorrplot(df, var = "Cyt", value = "Value", color = "black", tri="lower")
```
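Because a correlation matrix is symmetric, one triangle carries all the information. The sketch below masks the upper triangle of a hypothetical matrix (`tri_cor`) with base R `upper.tri()`; it only illustrates the idea and is not the ggcorrplot implementation.
```{r}
set.seed(42)
tri_data <- matrix(rnorm(40), ncol = 4,
                   dimnames = list(NULL, paste0("Cyt", LETTERS[1:4])))
tri_cor <- cor(tri_data)
# Keep the lower triangle and the diagonal, blank out the mirrored upper half
tri_lower <- tri_cor
tri_lower[upper.tri(tri_lower)] <- NA
tri_lower
```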
## gglegend
gglegend generates a geom_label placed at the median or mean coordinate of each group in a scatter plot with two numerical axes. It is specially designed for results from tSNE+Phenograph cytometry analysis or scRNAseq analysis.
Let's use some coordinates generated by the tSNE and Phenograph algorithms.
```{r, collapse=TRUE}
v1<-c("12.2003375696557","9.61179669498037","9.71523107560819","9.36085913975228","12.4250267807441",
      "14.3028296270378","10.1352873281603","11.5029598315655","8.56811495574298","10.0823452299921",
      "13.6266459325397","10.9933175551642","9.33699055226273","8.4753653954153","14.8718666962033",
      "15.2360998395043","14.2924029287808","12.3444533761381","14.2993507234786","13.4864775273824",
      "12.7460734690735","13.161500730014","10.1359520632972","12.2185983825899","11.2616934436752",
      "9.90490138562409","11.9609343515576","12.5884534548523","12.0364418615964","10.1870261767374",
      "10.0422315951637","12.7783195658305","14.5992551781486","14.9493239728631","10.158835739269",
      "10.6054386317344","13.2848626188882","11.936894948252","8.7128504500476","11.7737895359261",
      "11.8983735877432","8.02689567143797","9.12243498461827","12.6527770922061","13.9105401986929",
      "9.35741259585257","13.5911333739262","9.58527361989105","13.4456024538875","11.0895916895462",
      "-5.85077686923467","-4.52715357511742","-6.29563605082642","0.575879987115163","-5.00038311800316",
      "-1.78476165367942","-4.78291504182624","2.14145872648519","-4.95470848078674","0.527964928988151",
      "2.0407874996449","-2.01707703338718","0.295902773532714","-4.00219087667774","0.646856668064106",
      "-4.88234548321471","-2.03349635088723","-0.49193567263671","-2.79335406976441","0.448763877743354",
      "-4.75643812997927","-1.42563453004395","-6.23450075793563","-3.62615912431353","-3.76809728669077",
      "-4.60208220139584","-5.85501795453493","-7.05239199090951","-3.20043054503014","1.15876766287925",
      "0.895292755360344","1.14444064679356","-0.176884624041849","-6.35832840326582","-1.65398473342688",
      "-3.79355073906942","-5.5958811065813","-2.73187337265253","-0.874422976991474","0.261500351579398",
      "-0.561805573681923","-3.75892543868909","-0.236866399218848","2.09008746058036","-0.548227337920367",
      "-1.21516952408529","-1.17389146089581","-2.97942173316255","2.22364389707379","-0.740849099358617",
      "-11.53883548612","-6.10528935020874","-12.060454035126","-9.30052482902171","-10.8000103031182",
      "-13.5180286021874","-0.350552508950501","-12.8614336490877","-10.5151593701859","-12.7997253357382",
      "-8.9616405672664","-8.4882832838348","-10.5542279802031","-6.07143684739709","-6.38549255444235",
      "-10.0449160880572","-9.44413278886994","-13.777705761209","-13.7964826297576","-6.8363880475111",
      "-11.5388213442119","-5.6282354747176","-13.6138636099046","-6.381104642862","-11.1812519418261",
      "-12.3831679874129","-5.77325698424297","-5.48616485259332","-9.96244295723046","-12.114107326341",
      "-12.7616465926232","-13.7599167738886","-10.1159812826476","-6.62864900674468","-8.06745259542743",
      "-13.1987304489436","-10.8870376481335","-9.36038405695135","-5.13930391264302","-10.5111656232542",
      "-10.9829759447952","-9.82553470541781","-11.7458951813431","-11.4450763813922","-9.87343759006578",
      "-6.95726085424341","-9.03829069857169","-10.4246847556504","-5.70838830660222")
v2<-c("21.0246799882543","21.1604668637415","22.3065974726199","22.167058907011","21.485793337356",
      "21.034495141115","22.8527942337251","21.4484506937578","22.2141672875484","21.345153106709",
      "20.4880451876175","22.2111273969088","21.4727323576141","22.8241608452394","20.0044199930559",
      "20.4988095357755","20.5599765528411","21.1969434679012","20.0020746075624","21.4337624510195",
      "19.9736219509137","21.5709896392616","23.6577471746758","22.5535814744334","22.8384868736052",
      "20.8650810158446","22.1613883712501","20.7983446457222","20.5655252200496","22.2090780986936",
      "21.7806687207636","20.1194553082707","21.3069459537366","20.662407531373","21.4251709330106",
      "20.765690762975","19.8859038873819","21.7056631952668","22.5041194163155","20.9642232055426",
      "21.3422306162038","21.6367978620734","22.9262096626072","22.4646597147181","22.1141613861558",
      "21.5054873462004","21.4889032804239","22.5008054544748","20.76786026091","21.2016450580895",
      "-8.72858183844766","-9.95487383065406","-9.0344478298343","-10.2479477494554","-9.75424269139439",
      "-11.3104503130975","-10.6530974583924","-10.84576176497","-9.3121096499565","-11.3954821057958",
      "-10.459830426721","-10.7685510257121","-9.6028748910031","-10.8250931981046","-11.4576048366608",
      "-8.99623858309278","-11.6991335601257","-10.3000992136515","-8.80499949660716","-10.5287181221244",
      "-12.1008107052572","-10.0265692008998","-10.9708704570088","-10.5302978352576","-9.54786020396403",
      "-9.23397400708739","-9.24386152038901","-9.65261439968974","-10.9281368746817","-10.9473794707607",
      "-10.4850238433817","-10.4882084917208","-10.4990514736718","-12.0128979043322","-11.941610711928",
      "-11.6458333185284","-9.24844012046275","-8.88646414055994","-11.3227097840251","-10.8268151538455",
      "-11.5279408155446","-10.7575990470602","-10.360283532249","-10.7913818861931","-11.091926796243",
      "-11.0159937813687","-10.982258258046","-10.0920126917865","-10.9583452116076","-10.8108899751332",
      "-9.56368334300102","-12.8367178840931","-11.3709591119047","-11.652506805406","-11.1169982547012",
      "-11.6968611726564","-8.86077529188209","-11.9662163203006","-12.2158702454966","-10.5422338216268",
      "-10.1439305299441","-11.3349056863203","-10.8133385736638","-13.1632816376455","-13.5059352283161",
      "-9.88618491633494","-11.2477309742765","-10.9615658268355","-12.0131549369708","-12.187081729914",
      "-10.6769895449992","-13.1713800408861","-11.9147981603324","-11.3127268327162","-10.6744434631015",
      "-11.6870047899695","-11.4368856559989","-11.7991750334205","-11.4332997824959","-12.0583096122799",
      "-11.9469443936353","-10.9598799506617","-11.3275484766061","-11.0525813106846","-12.3015921365232",
      "-11.4579022030789","-9.5065760211426","-11.203396001218","-11.9253283065003","-10.5294239353879",
      "-10.3400935752971","-9.87954083672559","-10.6401156456368","-10.0740874914124","-10.1578866019443",
      "-11.5501223255682","-10.5308167593782","-9.35047777292819","-12.4071042980126")
clust<-c("1","2","2","2","1","1","2","1","2","2","1","2","2","2","1","1","1","1","1","1","1","1","2","1","2","2","1","1","1","2","2","1","1","1","2","2","1","1","2","1","1","2","2","1","1","2","1","2","1","2","3","3","3","4","3","4","3","4","3","4","4","4","4","3","4","3","4","4","4","4","3","4","3","3","3","3","3","3","3","4","4","4","4","3","4","3","3","4","4","4","4","3","4","4","4","4","4","4","4","4","5","3","5","5","5","5","4","5","5","5","5","5","5","3","3","5","5","5","5","3","5","3","5","3","5","5","3","3","5","5","5","5","5","3","3","5","5","5","3","5","5","5","5","5","5","3","5","5","3")
```
```{r}
df<-data.frame("V1"=as.numeric(v1),"V2"=as.numeric(v2),"Clust"=clust)
head(df)
```
Usually, you would do:
```{r}
ggplot(df, aes(V1, V2, color = Clust)) +
  geom_point()
```
However, the colors can become confusing with more than 8 categories. It is easier to label each group directly on the plot:
```{r}
ggplot(df, aes(V1, V2, color = Clust)) +
  geom_point() +
  gglegend(df, V1, V2, Clust) +
  guides(color = "none")
```
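For readers curious about what such a label layer involves, the following sketch reproduces the idea by hand with ggplot2 and base R: compute the median coordinate of each group, then draw one label per group at that position. The data frame `pts` and the object `centers` are hypothetical, and this is not the gglegend source code.
```{r}
library(ggplot2)
# Two well-separated hypothetical clusters
pts <- data.frame(V1 = c(1, 2, 3, 10, 11, 12),
                  V2 = c(5, 6, 7, -1, -2, -3),
                  Clust = rep(c("1", "2"), each = 3))
# Median coordinate of each cluster: one row per group
centers <- aggregate(cbind(V1, V2) ~ Clust, data = pts, FUN = median)
# Points plus one label per cluster, placed at its median coordinate
p_lab <- ggplot(pts, aes(V1, V2, color = Clust)) +
  geom_point() +
  geom_label(data = centers, aes(label = Clust)) +
  guides(color = "none")
p_lab
```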
## sem
sem is a small function that calculates the standard error of the mean. It is useful for plotting error bars.
```{r}
v<-rnorm(10)
sem(v)
```
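The standard error of the mean is the sample standard deviation divided by the square root of the number of observations. The sketch below writes that formula out in base R; `sem_sketch` is a hypothetical name, and whether the packaged `sem` treats missing values differently is not covered here.
```{r}
# Standard error of the mean: sd / sqrt(n)
sem_sketch <- function(x) {
  sd(x) / sqrt(length(x))
}
sem_sketch(c(2, 4, 4, 4, 5, 5, 7, 9))
```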
## perc
perc transforms a numeric vector into percentages. It is useful with the dplyr summarise function.
```{r}
v<-c(2,5,10,3)
perc(v)
# It can be used with dplyr tables
library(tidyverse)
df<-data.frame("X"=c("A","A","B","B"), "Y"=v)
df %>% group_by(X) %>% summarise(Y=perc(Y))
# Or as proportions between 0 and 1
df %>% group_by(X) %>% summarise(Y=perc(Y, per100=FALSE))
```
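A plausible reading of this behaviour is that each element is divided by the total of the vector. The sketch below implements that reading in base R; `perc_sketch` is a hypothetical stand-in, and the assumption that `per100` defaults to `TRUE` is inferred from the calls above rather than from the package source.
```{r}
# Share of each element in the vector total, as percentages or proportions
perc_sketch <- function(x, per100 = TRUE) {
  share <- x / sum(x)
  if (per100) share * 100 else share
}
perc_sketch(c(2, 5, 10, 3))
perc_sketch(c(2, 5, 10, 3), per100 = FALSE)
```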