Quantitative Methods

Title slide   slide

(org-show-animate '("Quantitative Methods, Part-II" "Descriptive Statistics" "Vikas Rawal" "Prachi Bansal" "" "" ""))

Title slide

(org-show-animate '("Why do financial journalists need to know quantitative methods?" "" "" ""))

What do we aim to achieve in this course?   slide

Make friends with numbers

Learn how to read numbers, how to present them, and how to write about them

Learn how to use computers to work with numbers

Two Types of Statistics   slide

Descriptive Statistics

Use summaries of data for the entire population to describe a population

Use summaries of sample data to describe a sample

Inferential Statistics

Use sample data to describe a population

Descriptive Statistics   slide

  • Frequency
  • Measures of central tendency
  • Measures of position
  • Measures of dispersion

Frequency   slide

      library(data.table)
      data.table(names=c("Anil","Neeraj","Savita","Srimati",
                         "Rekha","Pooja","Alex","Shahina",
                         "Ghazal","Lakshmi","Rahul","Shahrukh",
                         "Naman","Deepak","Shreya","Rukhsana"
                         ),
                 salary=c(71,50,65,40,
                          45,42,46,43,
                          45,43,45,45,
                          850,100,46,48
                          )*1000,
                 sex=c("M","M","F","F",
                       "F","F","M","F",
                       "F","F","M","M",
                       "M","M","F","F"
                       ))->workers
      workers$sno<-c(1:nrow(workers))
      workers[,.(sno,names,sex,salary)]
sno names sex salary
1 Anil M 71000
2 Neeraj M 50000
3 Savita F 65000
4 Srimati F 40000
5 Rekha F 45000
6 Pooja F 42000
7 Alex M 46000
8 Shahina F 43000
9 Ghazal F 45000
10 Lakshmi F 43000
11 Rahul M 45000
12 Shahrukh M 45000
13 Naman M 850000
14 Deepak M 100000
15 Shreya F 46000
16 Rukhsana F 48000
  workers[,.(frequency=length(sno)),.(sex)]
sex frequency
M 7
F 9
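
Frequencies are often easier to read as shares of the total. The following is a minimal sketch in base R (not the data.table syntax used on the slides), assuming the same 16 workers; the variable names `freq` and `rel_freq` are illustrative only.

```r
# Sex of the 16 workers, in the same order as the table above
sex <- c("M", "M", "F", "F",
         "F", "F", "M", "F",
         "F", "F", "M", "M",
         "M", "M", "F", "F")

# Absolute frequencies: number of workers in each category
freq <- table(sex)

# Relative frequencies: each category as a percentage of all workers
rel_freq <- 100 * prop.table(freq)

print(freq)
print(rel_freq)
```

Here 9 of the 16 workers are women, so the relative frequency for F is 56.25 per cent and for M it is 43.75 per cent.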

Measures of Central Tendency   slide

  workers[,.(mean_salary=round(mean(salary),1),
             median_salary=quantile(salary,probs=0.5))]
mean_salary median_salary
101500 45500
  workers[,.(mean_salary=round(mean(salary),1),
             median_salary=quantile(salary,probs=0.5)),.(sex)]
sex mean_salary median_salary
M 172428.6 50000
F 46333.3 45000
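
The gap between the mean and the median comes almost entirely from one very large salary (850000). A small sketch in base R, assuming the same salary figures, shows how each measure reacts when that single value is dropped; `trimmed` is an illustrative name, not part of the original slides.

```r
# Monthly salaries of the 16 workers, as in the table above
salary <- c(71, 50, 65, 40,
            45, 42, 46, 43,
            45, 43, 45, 45,
            850, 100, 46, 48) * 1000

# Mean and median for all workers
mean_all   <- mean(salary)    # 101500
median_all <- median(salary)  # 45500

# Drop the single largest salary and recompute
trimmed        <- salary[salary < max(salary)]
mean_trimmed   <- mean(trimmed)    # 51600
median_trimmed <- median(trimmed)  # 45000

print(c(mean_all = mean_all, mean_trimmed = mean_trimmed))
print(c(median_all = median_all, median_trimmed = median_trimmed))
```

Removing one observation almost halves the mean but moves the median by only 500, which is why the median is usually the safer summary for skewed data such as incomes.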

Measures of Position   slide

  • First quartile
  • Second quartile (median)
  • Third quartile
  • Deciles
  • Quintiles
  • Percentiles
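
All of these positions can be computed with the same function. The sketch below uses base R's `quantile()` with its default interpolation method on the salary data from the earlier slides; the object names are illustrative.

```r
# Monthly salaries of the 16 workers, as in the earlier table
salary <- c(71, 50, 65, 40,
            45, 42, 46, 43,
            45, 43, 45, 45,
            850, 100, 46, 48) * 1000

# Quartiles split the sorted data into 4 equal parts
quartiles <- quantile(salary, probs = c(0.25, 0.50, 0.75))

# Quintiles split it into 5 parts, deciles into 10, percentiles into 100
quintiles   <- quantile(salary, probs = seq(0.2, 0.8, by = 0.2))
deciles     <- quantile(salary, probs = seq(0.1, 0.9, by = 0.1))
percentiles <- quantile(salary, probs = seq(0.01, 0.99, by = 0.01))

print(quartiles)
print(quintiles)
print(deciles)
```

The second quartile, the fifth decile and the fiftieth percentile are all the same number: the median.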

Measures of Dispersion   slide

Range and other measures based on positions   slide

$range=max-min$

  workers[,.(min_salary=min(salary),
             max_salary=max(salary),
             range=max(salary)-min(salary))]
min_salary max_salary range
40000 850000 810000

Range and other measures based on positions   slide

  • Distance between any two positions (Deciles, Quintiles, Percentiles) can be used as a measure of dispersion.

$inter.quartile.range=Q3-Q1$

##  summary(workers$salary)
  quantile(workers$salary,probs=c(0.25,0.75))
  25%   75% 
44500 53750
  quantile(workers$salary,probs=c(0.1,0.9))
  10%   90% 
42500 85500
  quantile(workers$salary,probs=c(0.1,0.95))
   10%    95% 
 42500 287500
  quantile(workers$salary,probs=c(0.25,0.95))
   25%    95% 
 44500 287500
  quantile(workers$salary,probs=c(0,0.75))
   0%   75% 
40000 53750
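
To turn a pair of positions into a single measure of dispersion, subtract the lower one from the higher one. A minimal sketch in base R, assuming the same salary data; `iqr` and `decile_gap` are illustrative names.

```r
# Monthly salaries of the 16 workers, as in the earlier table
salary <- c(71, 50, 65, 40,
            45, 42, 46, 43,
            45, 43, 45, 45,
            850, 100, 46, 48) * 1000

q <- quantile(salary, probs = c(0.10, 0.25, 0.75, 0.90))

# Inter-quartile range: Q3 - Q1
iqr <- unname(q["75%"] - q["25%"])         # 53750 - 44500 = 9250

# Distance between the ninth and the first decile
decile_gap <- unname(q["90%"] - q["10%"])  # 85500 - 42500 = 43000

# Base R also provides IQR(), which gives the same answer as Q3 - Q1
print(c(iqr = iqr, builtin_iqr = IQR(salary), decile_gap = decile_gap))
```

Compare these with the range of 810000: measures that leave out the extremes are far less affected by the one very large salary.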

Variance, Standard Deviation and Coefficient of Variation

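$variance=\frac{1}{n-1} \times \sum(x_{i}-\bar{x})^{2}$

Here $\bar{x}$ is the mean of the data. R's var() divides by $n-1$ (the sample variance), which is what the results on this slide report.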

$standard.deviation = \sqrt{variance}$

$cov=\frac{standard.deviation}{mean}$

  workers[,.(var_salary=round(var(salary),1),
             sd_salary=round(sqrt(var(salary)),1),
             cov_salary=round(sqrt(var(salary))/mean(salary),2))
          ]
var_salary sd_salary cov_salary
40075200000 200187.9 1.97
  workers[,.(var_salary=round(var(salary),1),
             sd_salary=round(sqrt(var(salary)),1),
             cov_salary=round(sqrt(var(salary))/mean(salary),2)),.(sex)]
sex var_salary sd_salary cov_salary
M 89680952381 299467.8 1.74
F 54500000 7382.4 0.16
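
It can help to compute these measures step by step once. The sketch below, in base R with illustrative variable names, builds the variance from the squared deviations. Dividing by n gives the population variance, while R's `var()` divides by n-1 (the sample variance), so the two differ slightly.

```r
# Monthly salaries of the 16 workers, as in the earlier table
salary <- c(71, 50, 65, 40,
            45, 42, 46, 43,
            45, 43, 45, 45,
            850, 100, 46, 48) * 1000

n      <- length(salary)
dev_sq <- (salary - mean(salary))^2   # squared deviations from the mean

var_population <- sum(dev_sq) / n         # 37570500000
var_sample     <- sum(dev_sq) / (n - 1)   # 40075200000, same as var(salary)

sd_sample <- sqrt(var_sample)             # same as sd(salary)
cv        <- sd_sample / mean(salary)     # coefficient of variation

print(c(var_population = var_population, var_sample = var_sample))
print(c(sd_sample = sd_sample, cv = round(cv, 2)))
```

The coefficient of variation has no units, so it can be used to compare dispersion across groups with very different means, as in the comparison between men and women above.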

Graphical Displays of Quantitative Information: Common Pitfalls

Common uses of statistical graphics   slide

  • To show trends over time
  • To show mid-point variations across categories
  • To show composition
  • (less commonly, though more usefully) to show/analyse dispersion

Mis-representation   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/tufte-insanity.png
"and sometimes the fact that numbers have a magnitude as well as an order is simply forgotten"

Mis-representation   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/tufte-fuel.png
Another example borrowed from Tufte

Mis-representation   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/tufte-fuel2.png
Tufte's graph on fuel economy of cars

Mis-representation   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/nobel-wrong.png
Nobel prizes awarded in science (National Science Foundation, 1974)

Mis-representation   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/nobel-right.png
Nobel prizes awarded in science (corrected by Tufte)

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright)   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/piketty1_o.png

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright)   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/piketty1_c.png

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright)   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/piketty2_o.png

Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright)   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/piketty2_c.png

The problem multiplied with the arrival of spreadsheets   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/chart1.png
/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/chart2.png
/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/graphics/chart3.png

Graphical Displays of Quantitative Information: Dispersion   slide

Histogram   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/productionhist1.png

Histogram with relative densities   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/productionhist2.png
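
The data behind the two histograms above are not reproduced in these slides, so the following sketch uses the salary data from the earlier slides as a stand-in. It shows the difference between a histogram of counts and one of relative densities; the bin width of 50000 is an arbitrary choice made for illustration.

```r
# Monthly salaries of the 16 workers, as in the earlier table
salary <- c(71, 50, 65, 40,
            45, 42, 46, 43,
            45, 43, 45, 45,
            850, 100, 46, 48) * 1000

# Bins of width 50000 covering the whole range (an assumed choice)
breaks <- seq(0, 900000, by = 50000)

# Compute the histogram without drawing it
h <- hist(salary, breaks = breaks, plot = FALSE)

print(h$counts)    # number of workers in each bin
print(h$density)   # counts / (n * bin width); the bar areas sum to 1

# Draw both versions when running interactively
if (interactive()) {
  hist(salary, breaks = breaks, freq = TRUE,  main = "Counts")
  hist(salary, breaks = breaks, freq = FALSE, main = "Relative densities")
}
```

The shape of the two plots is identical when all bins have the same width; only the vertical axis changes.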

Boxplot   slide

  • Invented by John Tukey in 1970
  • Many variations have been proposed since then, though the essential form and idea have remained intact.

Boxplot of wheat yields   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/boxplotyield1.png

Violin plots   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/vioplotyield1.png

Boxplots: Useful to identify extreme values   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/boxplotyield2.png
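
The wheat yield data are not included in these slides, so this sketch applies the same idea to the salary data from the earlier slides. Base R's `boxplot.stats()` returns the numbers a boxplot is drawn from, including the points it would mark as extreme: by default, those lying more than 1.5 times the box length beyond the box.

```r
# Monthly salaries of the 16 workers, as in the earlier table
salary <- c(71, 50, 65, 40,
            45, 42, 46, 43,
            45, 43, 45, 45,
            850, 100, 46, 48) * 1000

bs <- boxplot.stats(salary)

# Lower whisker, lower hinge, median, upper hinge, upper whisker
print(bs$stats)

# Observations flagged as extreme values
print(sort(bs$out))   # 100000 and 850000

# Draw the boxplot when running interactively
if (interactive()) boxplot(salary, horizontal = TRUE)
```

Note that the box is built from Tukey's hinges, which can differ slightly from the quartiles returned by `quantile()`.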

Boxplots: Useful for comparisons across categories   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/boxplotyield3.png
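
A comparison across categories can be sketched with the salary data from the earlier slides, grouped by sex (the yield data shown in the figure are not part of these slides). The five-number summary computed here for each group is the basis for that group's box; the data frame is rebuilt in base R so that the example runs on its own.

```r
# Salary and sex of the 16 workers, as in the earlier table
workers <- data.frame(
  salary = c(71, 50, 65, 40,
             45, 42, 46, 43,
             45, 43, 45, 45,
             850, 100, 46, 48) * 1000,
  sex = c("M", "M", "F", "F",
          "F", "F", "M", "F",
          "F", "F", "M", "M",
          "M", "M", "F", "F")
)

# Five-number summary (min, lower hinge, median, upper hinge, max) by sex
by_sex <- tapply(workers$salary, workers$sex, fivenum)
print(by_sex)

# One box per category; a log scale keeps the large salary from
# squeezing the boxes flat (drawn only when running interactively)
if (interactive()) boxplot(salary ~ sex, data = workers, log = "y")
```

The median salary of men (50000) is higher than that of women (45000), and the box for men is much longer, which matches the coefficients of variation computed earlier.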

Violin plots   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/vioplotyield3.png

Paul Krugman on Fiscal Austerity

What did Paul Krugman say?   slide

"Here's what fiscal policy should do: it should support demand when the economy is weak, and it should pull that support back when the economy is strong. As John Maynard Keynes said, “The boom, not the slump, is the right time for austerity.” And up until 2010 the U.S. more or less followed that prescription. Since then, however, fiscal policy has become perverse: first austerity despite high unemployment, now expansion despite low unemployment."

How could we better show the relationship between unemployment and fiscal austerity?   slide

/Courseware/quantitative-methods/src/commit/5be831d0e9cd8d8b22d8d48a1b6fbca112cc9da0/krugman2.png