class: center, middle

# Data Visualization — Summary and Further Topics

## Data Analysis with R and Python

### Deepayan Sarkar

---

layout: true

# Traditional graphics

---

* Implemented in the __graphics__ package

* Add-on packages provide further functionality

* Mainly two types

    - High-level functions (produce a complete plot)

    - Low-level functions (add to an existing plot)

$$ \newcommand{\sub}{_} $$
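
---

A minimal sketch of how the two types combine, assuming the built-in `cars` dataset (an illustration, not part of the original slides): one high-level call draws a complete plot, and low-level calls then add to it.

``` r
# High-level call: sets up the coordinate system, axes, and points
plot(dist ~ speed, data = cars, main = "Stopping distance vs speed")

# Low-level calls: each one adds to the plot that is already there
grid()                                  # reference grid
fit <- lm(dist ~ speed, data = cars)    # least-squares fit
abline(fit, col = "red")                # fitted straight line
legend("topleft", legend = "Least-squares line", col = "red", lty = 1)
```

Each low-level call draws on top of the currently active plot, so later calls can cover earlier ones.
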
---

layout: false

# Common High-Level functions in __graphics__

| Function          | Default Display                                   |
|:------------------|:--------------------------------------------------|
| `stripchart()`    | Strip Chart (Comparative 1-D Scatter Plots)       |
| `boxplot()`       | Comparative Box-and-Whisker Plots                 |
| `hist()`          | Histogram                                         |
| `plot(density())` | Kernel Density Plot                               |
| `plot()`          | Scatter Plot, Time-series Plot (with `type="l"`)  |
| `barplot()`       | Bar Plot                                          |
| `dotchart()`      | Cleveland Dot Plot                                |
| `qqnorm()`        | Normal Quantile-Quantile Plot                     |
| `qqplot()`        | Two-sample Quantile-Quantile Plot                 |
| `pairs()`         | Scatter-Plot Matrix                               |

---

# Common High-Level functions in __lattice__

| Function          | Default Display                                   |
|:------------------|:--------------------------------------------------|
| `stripplot()`     | Strip Chart (Comparative 1-D Scatter Plots)       |
| `bwplot()`        | Comparative Box-and-Whisker Plots                 |
| `histogram()`     | Histogram                                         |
| `densityplot()`   | Kernel Density Plot                               |
| `xyplot()`        | Scatter Plot, Time-series Plot (with `type="l"`)  |
| `barchart()`      | Bar Plot                                          |
| `dotplot()`       | Cleveland Dot Plot                                |
| `qqmath()`        | Normal Quantile-Quantile Plot                     |
| `qq()`            | Two-sample Quantile-Quantile Plot                 |
| `splom()`         | Scatter-Plot Matrix                               |

---

# Some useful low-level functions in __graphics__

| Function      | Purpose                               |
|:--------------|:--------------------------------------|
| `text()`      | Add Text to a Plot                    |
| `lines()`     | Add Connected Line Segments to a Plot |
| `points()`    | Add Points to a Plot                  |
| `polygon()`   | Add Polygons to a Plot                |
| `rect()`      | Add Rectangles to a Plot              |
| `segments()`  | Add Line Segments to a Plot           |

---

# Some other low-level utility functions in __graphics__

| Function     | Purpose                             |
|:-------------|:------------------------------------|
| `abline()`   | Add Straight Lines to a Plot        |
| `arrows()`   | Add Arrows to a Plot                |
| `axis()`     | Add an Axis to a Plot               |
| `box()`      | Draw a Box around a Plot            |
| `grid()`     | Add Grid to a Plot                  |
| `legend()`   | Add Legends to Plots                |
| `title()`    | Add Annotation (Titles and Labels)  |

---

# Common low-level functions in __lattice__

| Function           | Purpose                                |
|:-------------------|:---------------------------------------|
| `panel.text()`     | Add Text to a Panel                    |
| `panel.lines()`    | Add Connected Line Segments to a Panel |
| `panel.points()`   | Add Points to a Panel                  |
| `panel.polygon()`  | Add Polygons to a Panel                |
| `panel.rect()`     | Add Rectangles to a Panel              |
| `panel.segments()` | Add Line Segments to a Panel           |
| `panel.abline()`   | Add Straight Lines to a Panel          |
| `panel.arrows()`   | Add Arrows to a Panel                  |
| `panel.grid()`     | Add Grid to a Panel                    |

---

# The ggplot2 approach

* Specify a plot using a __grammar__ consisting of

    * Geometric constructs

    * Aesthetic Mappings

    * Statistical Summaries

---

# ggplot2: List of geoms

``` r
library(ggplot2)
apropos("^geom_")
```

```
 [1] "geom_abline"            "geom_area"              "geom_bar"
 [4] "geom_bin_2d"            "geom_bin2d"             "geom_blank"
 [7] "geom_boxplot"           "geom_col"               "geom_contour"
[10] "geom_contour_filled"    "geom_count"             "geom_crossbar"
[13] "geom_curve"             "geom_density"           "geom_density_2d"
[16] "geom_density_2d_filled" "geom_density2d"         "geom_density2d_filled"
[19] "geom_dotplot"           "geom_errorbar"          "geom_errorbarh"
[22] "geom_freqpoly"          "geom_function"          "geom_hex"
[25] "geom_histogram"         "geom_hline"             "geom_jitter"
[28] "geom_label"             "geom_line"              "geom_linerange"
[31] "geom_map"               "geom_path"              "geom_point"
[34] "geom_pointrange"        "geom_polygon"           "geom_qq"
[37] "geom_qq_line"           "geom_quantile"          "geom_raster"
[40] "geom_rect"              "geom_ribbon"            "geom_rug"
[43] "geom_segment"           "geom_sf"                "geom_sf_label"
[46] "geom_sf_text"           "geom_smooth"            "geom_spoke"
[49] "geom_step"              "geom_text"              "geom_tile"
[52] "geom_violin"            "geom_vline"
```

???

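
---

# ggplot2: Example

A small sketch of the three ingredients of the grammar working together, assuming the built-in `mtcars` dataset (an illustration, not taken from the original slides). The aesthetic mapping is declared once, and each layer pairs a geom with a stat.

``` r
library(ggplot2)

p <- ggplot(data = mtcars,
            # Aesthetic mappings: variables -> visual properties
            mapping = aes(x = wt, y = mpg, color = factor(cyl))) +
    # Geometric construct: one point per row (uses stat_identity)
    geom_point() +
    # Statistical summary: a least-squares line per color group
    geom_smooth(method = "lm", se = FALSE)
p
```

Most geoms come with a default stat (for example, `geom_bar()` uses `stat_count()`), which is why the two lists overlap so heavily.
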
---

# ggplot2: List of stats

``` r
apropos("^stat_")
```

```
 [1] "stat_align"             "stat_bin"               "stat_bin_2d"
 [4] "stat_bin_hex"           "stat_bin2d"             "stat_binhex"
 [7] "stat_boxplot"           "stat_contour"           "stat_contour_filled"
[10] "stat_count"             "stat_density"           "stat_density_2d"
[13] "stat_density_2d_filled" "stat_density2d"         "stat_density2d_filled"
[16] "stat_ecdf"              "stat_ellipse"           "stat_function"
[19] "stat_identity"          "stat_qq"                "stat_qq_line"
[22] "stat_quantile"          "stat_sf"                "stat_sf_coordinates"
[25] "stat_smooth"            "stat_spoke"             "stat_sum"
[28] "stat_summary"           "stat_summary_2d"        "stat_summary_bin"
[31] "stat_summary_hex"       "stat_summary2d"         "stat_unique"
[34] "stat_ydensity"
```

---

# Interactive Graphics

* R works on many platforms: Windows, Mac, Linux, UNIX

--

* Static graphics achieves portability through _graphics devices_

--

* Cross-platform interactive graphics is even more difficult

---

# Interactive Graphics: Cross-platform technology

- OpenGL (R package __rgl__)

- Browser-based visualization technology (R packages __plotly__, __rbokeh__)

---

layout: true

# Plotly: Example

---

``` r
library(package = "ggplot2")
library(package = "plotly")
library(package = "gapminder")
p1 <- ggplot(data = gapminder,
             mapping = aes(x = log(gdpPercap), y = lifeExp,
                           size = pop, color = continent)) +
    facet_wrap(~ year) + geom_point()
```

---

``` r
p1
```

---

```r
ggplotly(p1)
```

---

* Want to add the country name to the tooltip

* Easiest way is to add a fake aesthetic

``` r
p2 <- ggplot(data = gapminder,
             mapping = aes(x = log(gdpPercap), y = lifeExp,
                           size = pop, color = continent,
                           tooltip = country)) +
    facet_wrap(~ year) + geom_point()
```

---

```r
ggplotly(p2)
```
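
---

An alternative sketch, assuming plotly's unofficial `text` aesthetic and the `tooltip` argument of `ggplotly()`, which selects the mappings shown on hover. The names `p3` and `w` are made up for this illustration and do not appear in the original slides.

``` r
library(package = "ggplot2")
library(package = "plotly")
library(package = "gapminder")

# Map the country name to the "text" aesthetic, which ggplot2 itself
# does not draw but ggplotly() can display on hover
p3 <- ggplot(data = gapminder,
             mapping = aes(x = log(gdpPercap), y = lifeExp,
                           size = pop, color = continent,
                           text = country)) +
    facet_wrap(~ year) + geom_point()

# Show only the country name and the two coordinates in the tooltip
w <- ggplotly(p3, tooltip = c("text", "x", "y"))
w
```

Restricting `tooltip` keeps the hover box short; the default, `"all"`, lists every mapped aesthetic.
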