Introduction

Descriptive Statistics (Exploratory Data Analysis)

  • Summary Statistics
  • Used to understand and describe the sample data
  • Measures of Central Tendency
    • Mean - average, sensitive to outliers
    • Median - middle value of the sorted data, robust to outliers
    • Mode - most frequent value
  • Measures of Dispersion
    • Variance - average of the squared distance of each point from the mean
    • Standard Deviation - square root of the variance, so it measures spread around the mean in the same units as the data; most commonly used
    • Range - difference between largest and smallest observation in the data, sensitive to outliers
    • Interquartile Range - difference between the 75th and 25th percentiles of the data, robust to outliers
  • Descriptive statistics alone don’t allow us to draw conclusions beyond the data we’ve analyzed or to test any hypothesis we might have made
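
The measures listed above can be computed with Python's standard `statistics` module. The following is a minimal sketch, not a prescribed workflow: the `data` list is a made-up sample with one deliberate outlier (100), chosen to show how the mean and range react to it while the median and interquartile range barely move. The variance here follows the definition in the list (divide by n, i.e. `pvariance`); the sample variance, `statistics.variance`, divides by n - 1 instead.

```python
import statistics

# Hypothetical sample for illustration; 100 is a deliberate outlier.
data = [2, 3, 3, 4, 5, 6, 7, 100]

# Measures of central tendency
mean = statistics.mean(data)      # pulled upward by the outlier
median = statistics.median(data)  # middle of the sorted data
mode = statistics.mode(data)      # most frequent value

# Measures of dispersion
variance = statistics.pvariance(data)  # average squared distance from the mean
std_dev = statistics.pstdev(data)      # square root of the variance
data_range = max(data) - min(data)     # largest minus smallest observation

# Quartile cut points; the exact values depend on the interpolation
# method (the default here is method="exclusive").
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                          # 75th percentile minus 25th percentile

print("mean:", mean)          # mean: 16.25
print("median:", median)      # median: 4.5
print("mode:", mode)          # mode: 3
print("range:", data_range)   # range: 98
print("variance:", variance)
print("std dev:", std_dev)
print("IQR:", iqr)
```

Dropping the outlier from `data` and rerunning shows the mean and range changing sharply while the median and IQR stay close to their previous values.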

Inferential Statistics

  • Used to make generalizations about a population
  • Key considerations
    • Make sure samples are representative
    • Selecting a good sample is critical for making inferences about the population
  • Sampling error
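
To illustrate why a representative sample matters, the sketch below compares a simple random sample with a deliberately non-representative one. Everything in it is an assumption made for illustration: the simulated `population`, the sample size of 100, and the helper `sampling_error`, which here means the difference between a sample mean and the population mean. In practice the population mean is unknown, which is exactly why sample selection is critical.

```python
import random
import statistics

random.seed(0)  # fixed seed so repeated runs draw the same values

# Hypothetical population: 10,000 simulated measurements.
population = [random.gauss(50, 10) for _ in range(10_000)]
population_mean = statistics.mean(population)

def sampling_error(sample):
    """Difference between the sample mean and the population mean."""
    return statistics.mean(sample) - population_mean

# Simple random sample: every member is equally likely to be chosen.
random_sample = random.sample(population, 100)

# Non-representative sample: only the 100 largest values are chosen.
biased_sample = sorted(population)[-100:]

print("population mean:", round(population_mean, 2))
print("random sample error:", round(sampling_error(random_sample), 2))
print("biased sample error:", round(sampling_error(biased_sample), 2))
```

The random sample's mean lands close to the population mean, while the biased sample overshoots by a wide margin, so any generalization drawn from it would be misleading.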