Data Visualization with R

TRAINING COURSE

Details

To interpret your data and draw insights from it, you need to be able to visualise it. R is a programming language popular with data scientists for its ability to process and analyse data. With R, you can turn your data into compelling visualisations that tell insightful stories, convince stakeholders, and build trust.

In this course, you will learn:

  • Data visualization principles

  • How to communicate data-driven findings

  • How to use ggplot2 to create custom plots

  • The weaknesses of several widely used plots and why you should avoid them

Delivery Methods

Delivery Method          Duration
Classroom                4 Days
Live Virtual Training    4 Days

Discounts Available

Information may change without notice.

Audience

  • Data scientists
  • Programmers
  • Business analysts 
  • Engineers
  • Scientists

Pre-Requisites

Leading Training's R Programming Course or equivalent knowledge

Course Outline / Curriculum

Introduction to data visualization 

ggplot2 

  •  The components of a graph
  •  ggplot objects
  •  Geometries
  •  Aesthetic mappings
  •  Layers
    • Tinkering with arguments
  •  Global versus local aesthetic mappings
  •  Scales
  •  Labels and titles
  •  Categories as colors
  •  Annotation, shapes, and adjustments
  •  Add-on packages
  •  Putting it all together
  •  Quick plots with qplot
  •  Grids of plots
  •  Exercises

Visualizing data distributions 

  •  Variable types
  •  Case study: describing student heights
  •  Distribution function
  •  Cumulative distribution functions
  •  Histograms
  •  Smoothed density
    • Interpreting the y-axis
    • Densities permit stratification
  •  Exercises
  •  The normal distribution
  •  Standard units
  •  Quantile-quantile plots
  •  Percentiles
  •  Boxplots
  •  Stratification
  •  Case study: describing student heights (continued)
  •  Exercises
  •  ggplot2 geometries
    • Barplots
    • Histograms
    • Density plots
    • Boxplots
    • QQ-plots
    • Images
    • Quick plots
  •  Exercises

Data visualization in practice 

  •  Case study: new insights on poverty
    • Hans Rosling’s quiz
  •  Scatterplots
  •  Faceting
    • facet_wrap
    • Fixed scales for better comparisons
  •  Time series plots
    • Labels instead of legends
  •  Data transformations
    • Log transformation
    • Which base?
    • Transform the values or the scale?
  •  Visualizing multimodal distributions
  •  Comparing multiple distributions with boxplots and ridge plots
    • Boxplots
    • Ridge plots
    • Example: 1970 versus 2010 income distributions
    • Accessing computed variables
    • Weighted densities
  •  The ecological fallacy and importance of showing the data
    • Logistic

Data visualization principles 

  •  Encoding data using visual cues
  •  Know when to include 0
  •  Do not distort quantities
  •  Order categories by a meaningful value
  •  Show the data
  •  Ease comparisons
    • Use common axes
    • Align plots vertically to see horizontal changes and horizontally to see vertical changes
    • Consider transformations
    • Visual cues to be compared should be adjacent
    • Use color
  •  Think of the color blind
  •  Plots for two variables
    • Slope charts
    • Bland-Altman plot
  •  Encoding a third variable
  •  Avoid pseudo-three-dimensional plots
  •  Avoid too many significant digits
  •  Know your audience
  •  Exercises
  •  Case study: vaccines and infectious diseases
  •  Exercises

Robust summaries 

  •  Outliers
  •  Median
  •  The interquartile range (IQR)
  •  Tukey’s definition of an outlier
  •  Median absolute deviation
  •  Exercises
  •  Case study: self-reported student heights

Schedule Dates and Booking

There are currently no scheduled dates.
