Practical 2: Exploratory data analysis
This is the second practical lesson, in which we learn how to perform exploratory data analysis.
Objectives
- Explore the data
- Summaries
- Tabulation (One- and two-way tables)
- Measures of central tendency, dispersion, distribution
- Graphics
- Categorical
- Bar chart
- Numeric
- Histogram, density plot, boxplot, Q-Q plot, violin plot
- Categorical vs Categorical
- Bar plot
- Numeric vs numeric
- Categorical vs Numeric
- Categorical
To do
Categorical variable
- We use titanic.xlsx
- Statistics
- Look out for missing values
- One-way tabulation of the class of the passengers
- Two-way tabulation of the class and outcome of the passengers
- Multiple-way tabulation of the class, sex and outcome of the passengers
- Graphics
- One-way bar plot of
- Two-way bar plot
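The tabulation steps listed for the categorical variable can be sketched in Python as follows. This is only an illustration under stated assumptions: the records are made up and stand in for rows of titanic.xlsx, the field names class, sex and outcome are taken from the list above, and the helper tabulate is hypothetical. In practice the spreadsheet would first be loaded with a data library rather than typed in by hand. The last lines print a crude text version of a one-way bar plot.

```python
from collections import Counter

# Made-up passenger records standing in for rows of titanic.xlsx.
# Field names follow the text above; the values are invented.
passengers = [
    {"class": "1st", "sex": "female", "outcome": "survived"},
    {"class": "1st", "sex": "male", "outcome": "died"},
    {"class": "2nd", "sex": "female", "outcome": "survived"},
    {"class": "3rd", "sex": "male", "outcome": "died"},
    {"class": "3rd", "sex": "male", "outcome": "survived"},
    {"class": "3rd", "sex": "female", "outcome": None},  # missing outcome
]


def tabulate(rows, *fields):
    """Count rows for each combination of the given fields.

    Hypothetical helper: keys are tuples, and a missing value (None)
    is kept as its own level so that it stays visible in the table.
    """
    return Counter(tuple(row[f] for f in fields) for row in rows)


# Look out for missing values: count None per field.
missing = {
    field: sum(row[field] is None for row in passengers)
    for field in ("class", "sex", "outcome")
}

one_way = tabulate(passengers, "class")
two_way = tabulate(passengers, "class", "outcome")
three_way = tabulate(passengers, "class", "sex", "outcome")

print("Missing values per field:", missing)

# Crude text bar plot of the one-way table: one '#' per passenger.
for (level,), count in sorted(one_way.items()):
    print(f"{level:<4} {'#' * count} ({count})")
```

Keeping None as a level in the tables is a design choice of this sketch: dropping missing rows silently would make the row totals disagree between the one-way and two-way tables.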
Numeric variable
- We use blood.xlsx
- Statistics – measures of central tendency
- Mean
- Median
- Mode
- Statistics – measures of dispersion
- Standard deviation
- Variance
- Range
- IQR
- Minimum, maximum
- Outliers
- Distribution
- Skewness
- Kurtosis
- Normality
- Graphics
- Histogram
- Density plot
- Single
- Comparative
- Boxplot
- Single
- Comparative
- Violin plot (+ data points)
- Single
- Comparative
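The statistics listed for the numeric variable (central tendency, dispersion, outliers, skewness and kurtosis) can be sketched with Python's standard statistics module. Several things here are assumptions rather than part of the practical: the values are invented and stand in for one numeric column of blood.xlsx, outliers are flagged with the common 1.5 x IQR rule, and skewness and kurtosis use the simple moment-based formulas. Other conventions exist for quartiles and for both shape measures, so the numbers may differ slightly from those reported by other software. The plots and a formal normality test are not covered by this sketch.

```python
import statistics

# Invented readings standing in for one numeric column of blood.xlsx.
values = sorted([11.8, 12.4, 12.9, 13.1, 13.1, 13.5, 13.8, 14.2, 14.6, 19.5])
n = len(values)

# Measures of central tendency
mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

# Measures of dispersion (sample versions: n - 1 in the denominator)
sd = statistics.stdev(values)
variance = statistics.variance(values)
minimum, maximum = values[0], values[-1]
value_range = maximum - minimum
q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
iqr = q3 - q1

# Outliers, assuming the 1.5 x IQR rule
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in values if v < lower_fence or v > upper_fence]

# Shape of the distribution from central moments (population moments)
m2 = sum((v - mean) ** 2 for v in values) / n
m3 = sum((v - mean) ** 3 for v in values) / n
m4 = sum((v - mean) ** 4 for v in values) / n
skewness = m3 / m2 ** 1.5          # > 0 means a longer right tail
excess_kurtosis = m4 / m2 ** 2 - 3  # 0 for a normal distribution

print(f"mean={mean:.2f} median={median:.2f} mode={mode}")
print(f"sd={sd:.2f} variance={variance:.2f} range={value_range:.2f} IQR={iqr:.3f}")
print(f"min={minimum} max={maximum} outliers={outliers}")
print(f"skewness={skewness:.2f} excess kurtosis={excess_kurtosis:.2f}")
```

In this made-up sample the single large reading pulls the mean above the median and gives a positive skewness, which is the kind of pattern a histogram or boxplot would also make visible.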