class: center, middle, inverse, title-slide

.title[
# Data Visualization
]
.subtitle[
## Introduction to the unhcrthemes package
]

---

<style type="text/css">
.primary { color: var(--unhcr-blue); }
.w-75 { max-width: 75%; }
.w-50 { max-width: 50%; }
.w-25 { max-width: 25%; }
.table { font-size: 18px; }
.remark-code { font-size: 14px; }
</style>

## Objectives

.pull-left[

**About today:**

- Basics of the grammar of graphics and [**`ggplot2`**](https://ggplot2.tidyverse.org/index.html)
- Introduction to the [**`unhcrthemes`**](https://vidonne.github.io/unhcrthemes/) package
- Best practices to create **UNHCR branded** visuals in R, with examples

]

--

.pull-right[

**Not today:**

- Data import: [`readr`](https://readr.tidyverse.org/), [`readxl`](https://readxl.tidyverse.org/), etc.
- Data manipulation: [`dplyr`](https://dplyr.tidyverse.org/), [`tidyr`](https://tidyr.tidyverse.org/), etc.
- R programming: [R for Data Science](https://r4ds.had.co.nz/), [Advanced R Programming](https://adv-r.hadley.nz/), etc.

]

---

class: inverse, center, middle

# ggplot2 and unhcrthemes

### Introduction

---

## The `ggplot2` package

.pull-left[

- **`ggplot2`** is an R package for declaratively creating graphics
- **`ggplot2`** is an implementation of [The Grammar of Graphics](https://link.springer.com/chapter/10.1007/978-3-642-21551-3_13) by Leland Wilkinson
- **The idea**: don't start with the final form of the graphic (Excel approach) but **decompose the graphic** into its constituents

]

.pull-right[

.center[![ggplot2 HEX](data:image/png;base64,#https://raw.githubusercontent.com/tidyverse/ggplot2/main/man/figures/logo.png)]

]

???

You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

What does it take to create a graphic? Data, axes, geometric objects, etc.
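
A minimal sketch of that decomposition, assuming the `mpg` dataset bundled with `ggplot2` rather than the workshop data (the object name `p` is only illustrative):

```r
library(ggplot2)

# Data and aesthetic mapping: city vs. highway mileage from the bundled mpg data
p <- ggplot(data = mpg, mapping = aes(x = cty, y = hwy)) +
  # Graphical primitive: one point per row of the data
  geom_point()

# ggplot2 works out scales, axes and the default theme when the plot is printed
print(p)
```

Each piece (data, mapping, geometry) is stated separately, which is the point of the grammar: swapping `geom_point()` for another geom changes the chart type without touching the rest.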
---

## Structure of `ggplot2`

How a **`ggplot2`** graph is built from the elements of the grammar of graphics:

<table class="table" style="font-size: 16px; margin-left: auto; margin-right: auto;">
<caption style="font-size: initial !important;">Credit: Cedric Scherer</caption>
 <thead>
  <tr>
   <th style="text-align:left;"> Layer </th>
   <th style="text-align:left;"> Function </th>
   <th style="text-align:left;"> Explanation </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Data </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> ggplot(data) </td>
   <td style="text-align:left;"> The raw data that you want to plot. </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Aesthetics </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> aes() </td>
   <td style="text-align:left;"> Aesthetic mappings of the geometric and statistical objects, such as position, color, size, shape, and transparency. </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Geometries </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> geom_*() </td>
   <td style="text-align:left;"> The geometric shapes that will represent the data. </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Statistical transformations </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> stat_*() </td>
   <td style="text-align:left;"> Statistical summaries of the data, such as quantiles, fitted curves, and sums. </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Scales </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> scale_*() </td>
   <td style="text-align:left;"> Maps between the data and the aesthetic dimensions, such as data range to plot width or factor values to colors.
</td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Coordinate System </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> coord_*() </td>
   <td style="text-align:left;"> The transformation used for mapping data coordinates into the plane of the data rectangle. </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Facets </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> facet_*() </td>
   <td style="text-align:left;"> The arrangement of the data into a grid of plots. </td>
  </tr>
  <tr>
   <td style="text-align:left;font-weight: bold;"> Visual Themes </td>
   <td style="text-align:left;font-weight: bold;color: #0072bc !important;"> theme_*() </td>
   <td style="text-align:left;"> The overall visual defaults of a plot, such as background, grids, axes, default typeface, sizes and colors. </td>
  </tr>
</tbody>
</table>

???

1. Data - without data, you don't have a plot!
2. Mapping - linking variables to graphical properties.
3. Geometries - interpret aesthetics as graphical representations.
4. Statistics - compute/transform numbers for us.
5. Scales - interpret values in data to graphical properties.
6. Coordinates - define physical mapping.
7. Facets - split plot into panels.
8. Theme - what does your plot look like?

---

## `unhcrthemes` package

.pull-left[

1. **Branded** `ggplot2` theme

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-3-1.png" width="2100" />
]

---

## `unhcrthemes` package

.pull-left[

1. **Branded** `ggplot2` theme
2.
   A series of color palettes:
   - A **categorical palette** for UNHCR main data visualization colors
   - A **categorical palette** for people of concern to UNHCR categories
   - A **categorical palette** for geographical regional divisions of UNHCR
   - Six **sequential color palettes** for all the main data visualization colors
   - Two recommended **diverging color palettes**

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-4-1.png" width="2100" />
]

---

## `unhcrthemes` package

.pull-left[

1. **Branded** `ggplot2` theme
2. A series of color palettes:
   - A **categorical palette** for UNHCR main data visualization colors
   - A **categorical palette** for people of concern to UNHCR categories
   - A **categorical palette** for geographical regional divisions of UNHCR
   - Six **sequential color palettes** for all the main data visualization colors
   - Two recommended **diverging color palettes**
3. Available on [GitHub](https://github.com/vidonne/unhcrthemes/), with a dedicated [documentation page](https://vidonne.github.io/unhcrthemes/index.html), and used throughout the [examples of the data visualization platform](https://dataviz.unhcr.org/tools/r/).

]

.pull-right[

.center[<img src="https://raw.githubusercontent.com/vidonne/unhcrthemes/master/man/figures/unhcrthemes_sticker.png" alt="unhcrthemes HEX" style="max-width:60%">]

]

---

class: inverse, center, middle

# ggplot2 and unhcrthemes

### In action

---

## Aim

.pull-left[

Replicate a chart example from the [Global Trends 2021](https://www.unhcr.org/globaltrends.html) webpage using `ggplot2`, and make it brand compliant with the `unhcrthemes` package.
]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-5-1.png" width="2100" />
]

---

## Setup

.pull-left[

```r
# Install packages
# if needed uncomment lines below
# install.packages('tidyverse')
# remotes::install_github("vidonne/unhcrthemes")

# Load packages
library(tidyverse)
library(unhcrthemes)

# Load data
displ <- read_csv("data/displaced_pop.csv")

# Check data structure
View(displ)
```

]

.pull-right[

<table>
 <thead>
  <tr>
   <th style="text-align:right;"> Year </th>
   <th style="text-align:left;"> Population type </th>
   <th style="text-align:right;"> # of people </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:right;"> 2012 </td>
   <td style="text-align:left;"> IDPs </td>
   <td style="text-align:right;"> 26387120 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2013 </td>
   <td style="text-align:left;"> IDPs </td>
   <td style="text-align:right;"> 33340830 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2014 </td>
   <td style="text-align:left;"> IDPs </td>
   <td style="text-align:right;"> 37877320 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2015 </td>
   <td style="text-align:left;"> IDPs </td>
   <td style="text-align:right;"> 40451900 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2016 </td>
   <td style="text-align:left;"> IDPs </td>
   <td style="text-align:right;"> 40220850 </td>
  </tr>
  <tr>
   <td style="text-align:right;"> 2017 </td>
   <td style="text-align:left;"> IDPs </td>
   <td style="text-align:right;"> 39934042 </td>
  </tr>
</tbody>
</table>

]

---

## Data

.pull-left[

```r
*ggplot(data = displ)
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-9-1.png" width="2100" />
]

???

1. Data - without data, you don't have a plot!

But nothing happens here because we haven't mapped the raw data to anything, so we just get an empty canvas.
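
The empty canvas can be checked directly. A small sketch, assuming a hand-made data frame `toy` with invented values standing in for `data/displaced_pop.csv`:

```r
library(ggplot2)

# Hypothetical stand-in for the workshop data (values are invented)
toy <- data.frame(year = 2012:2014, num = c(10, 20, 30))

# Data only: no aesthetics and no geoms, so the plot has no layers to draw
canvas <- ggplot(data = toy)
length(canvas$layers)

# Adding a geom with a mapping creates the first layer
bars <- canvas + geom_col(aes(x = year, y = num))
length(bars$layers)
```

Comparing the two layer counts shows why the first call draws nothing: a plot needs at least one geom before any data appears.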
---

## Aesthetics

.pull-left[

```r
ggplot(
  data = displ,
* aes(x = year, y = num)
)
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-11-1.png" width="2100" />
]

???

2. Mapping - linking variables to graphical properties.

We have now mapped the year to the x axis and the number of displaced people to the y axis, but we still don't see anything except the axis values.

---

## Geoms

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
* geom_col()
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-13-1.png" width="2100" />
]

---

## Scale

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col() +
* scale_x_continuous(
*   breaks = scales::pretty_breaks(n = 10)
* )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-15-1.png" width="2100" />
]

---

## Scale

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
* scale_y_continuous(
*   labels = scales::label_number_si(),
*   expand = expansion(c(0, 0.1))
* )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-17-1.png" width="2100" />
]

---

## Context

.pull-left[

Before playing with `unhcrthemes`, let's add some information to the chart.
```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  labs(
*   title = "People forced to flee worldwide | 2012-2022",
*   caption = "Source: UNHCR Refugee Data Finder"
  )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-19-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
* theme_unhcr()
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-21-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
*   grid = "Y",
*   axis_title = FALSE
  )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-23-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col(
*   color = unhcr_pal(n = 1, name = "pal_blue")
  ) +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
    grid = "Y",
    axis_title = FALSE
  )
```
]

--

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-25-1.png" width="2100" />
]

???

Is color the right property? Also notice that we haven't mapped the color to anything but we're just setting it.

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(x = year, y = num)
) +
  geom_col(
*   fill = unhcr_pal(n = 1, name = "pal_blue")
  ) +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
    grid = "Y",
    axis_title = FALSE
  )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-27-1.png" width="2100" />
]

???

Is color the right property? Also notice that we haven't mapped the color to anything but we're just setting it.

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(
    x = year,
    y = num,
*   fill = pop
  )
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
    grid = "Y",
    axis_title = FALSE
  )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-29-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(
    x = year,
    y = num,
    fill = pop
  )
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
* scale_fill_unhcr_d() +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
    grid = "Y",
    axis_title = FALSE
  )
```

]

.pull-right[
<img
src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-31-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

**Recommended colours from the dataviz guidelines**

![Colour palette for UNHCR's people of concern](data:image/png;base64,#https://raw.githubusercontent.com/vidonne/unhcrpractice/main/workshop/2022-07-unhcrthemes-stats/img/poc_palette.png)

]

.pull-right[

**Check available colours in the package**

```r
display_unhcr_all()
```

<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-32-1.png" width="2100" />

]

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(
    x = year,
    y = num,
    fill = pop
  )
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  scale_fill_unhcr_d(
*   palette = "pal_unhcr_poc"
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
    grid = "Y",
    axis_title = FALSE
  )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-34-1.png" width="2100" />
]

---

## unhcrthemes

.pull-left[

```r
ggplot(
  data = displ,
  aes(
    x = year,
    y = num,
    fill = pop
  )
) +
  geom_col() +
  scale_x_continuous(
    breaks = scales::pretty_breaks(n = 10)
  ) +
  scale_y_continuous(
    labels = scales::label_number_si(),
    expand = expansion(c(0, 0.1))
  ) +
  scale_fill_unhcr_d(
    palette = "pal_unhcr_poc",
*   nmax = 9, order = c(4, 1:3, 9, 8)
  ) +
  labs(
    title = "People forced to flee worldwide | 2012-2022",
    caption = "Source: UNHCR Refugee Data Finder"
  ) +
  theme_unhcr(
    grid = "Y",
    axis_title = FALSE
  )
```

]

.pull-right[
<img src="data:image/png;base64,#dataviz_with_r_unhcrthemes_files/figure-html/unnamed-chunk-36-1.png" width="2100" />
]

---

## UNHCR R packages

* [unhcrdown](https://github.com/vidonne/unhcrdown): Set of UNHCR branded report templates in various formats
*
  [unhcrdatapackage](https://github.com/Edouard-Legoupil/unhcrdatapackage): Use UNHCR open data
* [hcrdata](https://github.com/UNHCR-WEB/hcrdata/): API to connect to internal data sources
* [HighFrequencyChecks](https://github.com/unhcr/HighFrequencyChecks/): Perform high frequency checks
* [koboloadeR](https://github.com/unhcr/koboloadeR/): Process data crunching for survey datasets
* [popdata](https://gitlab.com/dickoa/popdata): Download data from UNHCR POPDATA
* [ridl](https://gitlab.com/dickoa/ridl): R client to UNHCR Raw Internal Data Library

---

class: inverse, center, middle

# Thank you

### Questions?