Descriptive Statistics

Title slide
Descriptive Statistics
Graphical Displays of Quantitative Information: Dispersion
Graphical Displays of Quantitative Information: Common Pitfalls
Paul Krugman on Fiscal Austerity

Title slide slide

(org-show-animate '("Quantitative Methods" "Descriptive Statistics" "Vikas Rawal" "Prachi Bansal" "" "" ""))

Descriptive Statistics slide

Frequency
Measures of central tendency
Summary positions
Measures of dispersion

Frequency slide

      library(data.table)
      data.table(names=c("Anil","Neeraj","Savita","Srimati",
                         "Rekha","Pooja","Alex","Shahina",
                         "Ghazal","Lakshmi","Rahul","Shahrukh",
                         "Naman","Deepak","Shreya","Rukhsana"
                         ),
                 salary=c(71,50,65,40,
                          45,42,46,43,
                          45,43,45,45,
                          850,100,46,48
                          )*1000,
                 sex=c("M","M","F","F",
                       "F","F","M","F",
                       "F","F","M","M",
                       "M","M","F","F"
                       ))->workers
      workers$sno<-c(1:nrow(workers))
      workers[,.(sno,names,sex,salary)]

sno	names	sex	salary
1	Anil	M	71000
2	Neeraj	M	50000
3	Savita	F	65000
4	Srimati	F	40000
5	Rekha	F	45000
6	Pooja	F	42000
7	Alex	M	46000
8	Shahina	F	43000
9	Ghazal	F	45000
10	Lakshmi	F	43000
11	Rahul	M	45000
12	Shahrukh	M	45000
13	Naman	M	850000
14	Deepak	M	100000
15	Shreya	F	46000
16	Rukhsana	F	48000

  workers[,.(frequency=length(sno)),.(sex)]

sex	frequency
M	7
F	9
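As a small sketch of the same idea in base R (so that it runs without data.table), the counts can also be expressed as relative frequencies. The variable names `sex`, `freq` and `rel_freq` are illustrative and not part of the original slides.

```r
# Sex of the 16 workers, in the same order as the table above.
sex <- c("M", "M", "F", "F", "F", "F", "M", "F",
         "F", "F", "M", "M", "M", "M", "F", "F")

freq <- table(sex)            # absolute frequencies per category
rel_freq <- prop.table(freq)  # shares of the total; they sum to 1

print(freq)
print(100 * rel_freq)         # relative frequencies as percentages
```

Relative frequencies are often easier to compare across groups of different sizes than raw counts.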


Measures of Central Tendency slide

  workers[,.(mean_salary=round(mean(salary),1),
             median_salary=quantile(salary,probs=0.5))]

mean_salary	median_salary
101500	45500

  workers[,.(mean_salary=round(mean(salary),1),
             median_salary=quantile(salary,probs=0.5)),.(sex)]

sex	mean_salary	median_salary
M	172428.6	50000
F	46333.3	45000
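The gap between the mean and the median above is driven largely by one very high salary. The following sketch, in base R with illustrative variable names, recomputes both measures after dropping the single highest value, to show how differently the two respond.

```r
# Salaries of the 16 workers, as in the table above.
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

# Drop the single highest salary (850000).
without_top <- salary[salary != max(salary)]

print(c(mean = mean(salary), median = median(salary)))
print(c(mean = mean(without_top), median = median(without_top)))
```

With all 16 workers the mean is 101500 and the median 45500; without the top earner the mean falls to 51600 while the median only moves to 45000.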

Measures of Position slide

First quartile
Second quartile (median)
Third quartile
Deciles
Quintiles
Percentiles
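Each of the positions listed above can be obtained with `quantile()` by choosing suitable `probs`. This is a minimal sketch in base R on the salary data; the object names are illustrative, and the values rely on R's default interpolation rule (`type = 7`), which is only one of several conventions.

```r
# Salaries of the 16 workers, as in the table above.
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

quartiles   <- quantile(salary, probs = c(0.25, 0.50, 0.75))        # Q1, median, Q3
quintiles   <- quantile(salary, probs = seq(0.2, 0.8, by = 0.2))    # 4 cut points
deciles     <- quantile(salary, probs = seq(0.1, 0.9, by = 0.1))    # 9 cut points
percentiles <- quantile(salary, probs = seq(0.01, 0.99, by = 0.01)) # 99 cut points

print(quartiles)
print(deciles)
```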

Measures of Dispersion slide

Range and other measures based on positions slide

$range=max-min$

min_salary	max_salary	range
40000	850000	810000

    workers[,.(min_salary=min(salary),
                max_salary=max(salary),
                range=max(salary)-min(salary))]

Range and other measures based on positions slide

Distance between any two positions (Deciles, Quintiles, Percentiles) can be used as a measure of dispersion.

$inter.quartile.range=Q3-Q1$

  25%   75% 
44500 53750
  10%   90% 
42500 85500
   10%    95% 
 42500 287500
   25%    95% 
 44500 287500
   0%   75% 
40000 53750

##  summary(workers$salary)
  quantile(workers$salary,probs=c(0.25,0.75))
  quantile(workers$salary,probs=c(0.1,0.9))
  quantile(workers$salary,probs=c(0.1,0.95))
  quantile(workers$salary,probs=c(0.25,0.95))
  quantile(workers$salary,probs=c(0,0.75))
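The quantile pairs above become measures of dispersion once the lower position is subtracted from the upper one. A short sketch in base R, with illustrative object names, computes the inter-quartile range by hand, checks it against the built-in `IQR()`, and computes the distance between the first and ninth deciles.

```r
# Salaries of the 16 workers, as in the table above.
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

q <- quantile(salary, probs = c(0.25, 0.75))
iqr_manual  <- unname(q["75%"] - q["25%"])   # Q3 - Q1
iqr_builtin <- IQR(salary)                   # same quantity via base R

# Distance between the 10th and 90th percentiles.
p90_p10 <- unname(diff(quantile(salary, probs = c(0.1, 0.9))))

print(c(iqr = iqr_manual, p90_p10 = p90_p10, range = diff(range(salary))))
```

Here the inter-quartile range is 9250 and the 10%-90% distance is 43000, against a range of 810000, which shows how strongly the range depends on the single highest salary.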

Variance, Standard Deviation and Coefficient of Variation

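$variance=\frac{1}{n} \times \sum(x_{i}-\bar{x})^{2}$, where $\bar{x}$ is the mean.

Note that R's var() computes the sample variance, dividing by $n-1$ rather than $n$; the values computed below follow that convention.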

$standard.deviation = \sqrt{variance}$

$cov=\frac{standard.deviation}{mean}$

  workers[,.(var_salary=round(var(salary),1),
             sd_salary=round(sqrt(var(salary)),1),
             cov_salary=round(sqrt(var(salary))/mean(salary),2))
          ]

var_salary	sd_salary	cov_salary
40075200000	200187.9	1.97

    workers[,.(var_salary=round(var(salary),1),
               sd_salary=round(sqrt(var(salary)),1),
               cov_salary=round(sqrt(var(salary))/mean(salary),2)),.(sex)]

sex	var_salary	sd_salary	cov_salary
M	89680952381	299467.8	1.74
F	54500000	7382.4	0.16
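The variance formula with $\frac{1}{n}$ and R's `var()` differ by a factor of $\frac{n-1}{n}$. The sketch below, in base R with illustrative object names, computes both on the full salary data so the difference is visible.

```r
# Salaries of the 16 workers, as in the table above.
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000
n <- length(salary)

pop_var  <- sum((salary - mean(salary))^2) / n  # divides by n
samp_var <- var(salary)                         # R's var() divides by n - 1

# Coefficient of variation, based on R's sd() (which also uses n - 1).
cv <- sd(salary) / mean(salary)

print(c(pop_var = pop_var, samp_var = samp_var, cv = round(cv, 2)))
```

On this data the $\frac{1}{n}$ version gives 37570500000, while `var()` gives 40075200000, the figure reported in the table above.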

Graphical Displays of Quantitative Information: Dispersion slide

Histogram slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/productionhist1.png

Histogram with relative densities slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/productionhist2.png
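The production data behind the two histograms is not included in these slides, so the following sketch uses the salary data from earlier instead. It illustrates, under that substitution, what "relative densities" means: bar heights are scaled so that the total area of the bars equals 1.

```r
# Salaries of the 16 workers (stand-in for the production data in the figures).
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

h <- hist(salary, plot = FALSE)     # compute the bins without drawing
bin_width <- diff(h$breaks)
area <- sum(h$density * bin_width)  # total area of a density histogram

print(h$counts)  # frequency version: number of observations per bin
print(area)      # density version: the areas add up to 1

# To draw the density version: hist(salary, freq = FALSE)
```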

Boxplot slide

Invented by John Tukey in 1970
Many variations have been proposed since then, though the essential form and idea have remained intact.

Boxplot of wheat yields slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/boxplotyield1.png

Violin plots slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/vioplotyield1.png

Boxplots: Useful to identify extreme values slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/boxplotyield2.png
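The wheat yield data in the figure is not part of these slides, so this sketch applies the same idea to the salary data from earlier. It assumes the usual boxplot convention in R (`boxplot.stats()` with `coef = 1.5`), under which points lying more than 1.5 times the box length beyond either hinge are flagged as extreme values.

```r
# Salaries of the 16 workers (stand-in for the wheat yield data in the figure).
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

bp <- boxplot.stats(salary)  # default coef = 1.5

print(bp$stats)  # lower whisker, lower hinge, median, upper hinge, upper whisker
print(bp$out)    # values drawn as individual points beyond the whiskers

# To draw the plot itself: boxplot(salary)
```

On this data the two salaries flagged as extreme are 850000 and 100000; the upper whisker ends at 71000.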

Boxplots: Useful for comparisons across categories slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/boxplotyield3.png

Violin plots slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/vioplotyield3.png

Graphical Displays of Quantitative Information: Common Pitfalls

Common uses of statistical graphics slide

To show trends over time
To show mid-point variations across categories
To show composition
(less commonly, though more usefully) to show/analyse dispersion

Mis-representation slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/tufte-insanity.png — "and sometimes the fact that numbers have a magnitude as well as an order is simply forgotten"

Mis-representation slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/tufte-fuel.png — Another example borrowed from Tufte

Mis-representation slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/tufte-fuel2.png — Tufte's graph on fuel economy of cars

Mis-representation slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/nobel-wrong.png — Nobel prizes awarded in science (National Science Foundation, 1974)

Mis-representation slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/nobel-right.png — Nobel prizes awarded in science (corrected by Tufte)

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright) slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/piketty1_o.png

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright) slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/piketty1_c.png

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright) slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/piketty2_o.png

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright) slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/piketty2_c.png

The problem multiplied with the advent of spreadsheets slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/chart1.png

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/chart2.png

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/graphics/chart3.png

Paul Krugman on Fiscal Austerity

What does this graph show? slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/krugman1.png Source: https://www.nytimes.com/2018/11/02/opinion/the-perversion-of-fiscal-policy-slightly-wonkish.html

What did Paul Krugman say? slide

"Here’s what fiscal policy should do: it should support demand when the economy is weak, and it should pull that support back when the economy is strong. As John Maynard Keynes said, “The boom, not the slump, is the right time for austerity.” And up until 2010 the U.S. more or less followed that prescription. Since then, however, fiscal policy has become perverse: first austerity despite high unemployment, now expansion despite low unemployment."

How could we better show the relationship between unemployment and fiscal austerity? slide

/Courseware/quantitative-methods/src/commit/5ebc8c1d276f287435b47c91639058c2d9e6eaf2/krugman2.png
