Quantitative Methods
- Title slide
- Title slide
- What do we aim to achieve in this course?
- Two Types of Statistics
- Descriptive Statistics
- Graphical Displays of Quantitative Information: Common Pitfalls
- Common uses of statistical graphics
- Mis-representation
- Mis-representation: illustrations from Thomas Piketty's work (source Noah Wright)
- The problem multiplied with the coming in of spreadsheets
- Graphical Displays of Quantitative Information: Dispersion
- Paul Krugman on Fiscal Austerity
Title slide slide
(org-show-animate '("Quantitative Methods, Part-II" "Descriptive Statistics" "Vikas Rawal" "Prachi Bansal" "" "" ""))
Title slide
(org-show-animate '("Why do financial journalists need to know quantitative methods?" "" "" ""))
What do we aim to achieve in this course? slide
Make friends with numbers
Learn how to read numbers, how to present them, and how to write about them
Learn how to use computers to work with numbers
Two Types of Statistics slide
Descriptive Statistics
Use summaries of data for the entire population to describe a population
Use summaries of sample data to describe a sample
Inferential Statistics
Use sample data to describe a population
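A minimal sketch of the distinction, assuming a small vector of salaries stands in for a whole population. The names `population` and `sampled`, the sample size of 8, and the seed are illustrative choices, not part of the course material.

```r
# Hypothetical population of 16 monthly salaries (illustrative values)
population <- c(71, 50, 65, 40, 45, 42, 46, 43,
                45, 43, 45, 45, 850, 100, 46, 48) * 1000

# Descriptive: summarise all the data we hold
population_mean <- mean(population)

# Inferential: observe only a random sample and use it to
# estimate the population mean
set.seed(1)                               # assumed seed, for repeatability only
sampled <- sample(population, size = 8)   # draw 8 values without replacement
sample_mean <- mean(sampled)

c(population = population_mean, sample_estimate = sample_mean)
```

The sample estimate will generally differ from the population mean, and the gap changes from one draw to the next; quantifying that uncertainty is the business of inferential statistics.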
Descriptive Statistics slide
- Frequency
- Measures of central tendency
- Summary positions
- Measures of dispersion
Frequency slide
library(data.table)
data.table(names=c("Anil","Neeraj","Savita","Srimati",
"Rekha","Pooja","Alex","Shahina",
"Ghazal","Lakshmi","Rahul","Shahrukh",
"Naman","Deepak","Shreya","Rukhsana"
),
salary=c(71,50,65,40,
45,42,46,43,
45,43,45,45,
850,100,46,48
)*1000,
sex=c("M","M","F","F",
"F","F","M","F",
"F","F","M","M",
"M","M","F","F"
))->workers
workers$sno<-c(1:nrow(workers))
workers[,.(sno,names,sex,salary)]
| sno | names | sex | salary |
|---|---|---|---|
| 1 | Anil | M | 71000 |
| 2 | Neeraj | M | 50000 |
| 3 | Savita | F | 65000 |
| 4 | Srimati | F | 40000 |
| 5 | Rekha | F | 45000 |
| 6 | Pooja | F | 42000 |
| 7 | Alex | M | 46000 |
| 8 | Shahina | F | 43000 |
| 9 | Ghazal | F | 45000 |
| 10 | Lakshmi | F | 43000 |
| 11 | Rahul | M | 45000 |
| 12 | Shahrukh | M | 45000 |
| 13 | Naman | M | 850000 |
| 14 | Deepak | M | 100000 |
| 15 | Shreya | F | 46000 |
| 16 | Rukhsana | F | 48000 |
workers[,.(frequency=length(sno)),.(sex)]
| sex | frequency |
|---|---|
| M | 7 |
| F | 9 |
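The same counts can also be expressed as shares of the total. The sketch below uses base R's `table()` and `prop.table()` on the `sex` column of the example data; the variable names `freq` and `rel_freq` are illustrative.

```r
# Sex of the 16 workers, in the same order as the table above
sex <- c("M", "M", "F", "F", "F", "F", "M", "F",
         "F", "F", "M", "M", "M", "M", "F", "F")

freq <- table(sex)                  # absolute frequencies
rel_freq <- prop.table(freq) * 100  # relative frequencies, in per cent

data.frame(sex       = names(freq),
           frequency = as.vector(freq),
           percent   = as.vector(rel_freq))
```

Relative frequencies are often easier to compare across groups of different sizes than raw counts.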
Measures of Central Tendency slide
workers[,.(mean_salary=round(mean(salary),1),
median_salary=quantile(salary,probs=0.5))]
| mean_salary | median_salary |
|---|---|
| 101500 | 45500 |
workers[,.(mean_salary=round(mean(salary),1),
median_salary=quantile(salary,probs=0.5)),.(sex)]
| sex | mean_salary | median_salary |
|---|---|---|
| M | 172428.6 | 50000 |
| F | 46333.3 | 45000 |
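The gap between the mean and the median here comes largely from one very high salary. A minimal sketch, using the salary figures from the example data, recomputes both measures after dropping the largest value; the names `with_outlier` and `without_outlier` are illustrative.

```r
# Salaries of the 16 workers from the example data
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

with_outlier <- c(mean = mean(salary), median = median(salary))

# Drop the single largest salary and recompute
trimmed <- salary[salary != max(salary)]
without_outlier <- c(mean = mean(trimmed), median = median(trimmed))

rbind(with_outlier, without_outlier)
# with_outlier:    mean 101500, median 45500
# without_outlier: mean  51600, median 45000
```

Removing one observation roughly halves the mean but barely moves the median, which is why the median is usually the safer summary for skewed data such as salaries.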
Measures of Position slide
- First quartile
- Second quartile (median)
- Third quartile
- Deciles
- Quintiles
- Percentiles
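All of these positions can be obtained with `quantile()` by varying `probs`. The sketch below applies it to the salary figures from the example data, relying on R's default interpolation rule (type 7); other rules can give slightly different values for small samples.

```r
# Salaries of the 16 workers from the example data
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

quartiles <- quantile(salary, probs = c(0.25, 0.5, 0.75))       # Q1, median, Q3
quintiles <- quantile(salary, probs = seq(0.2, 0.8, by = 0.2))  # 4 cut points
deciles   <- quantile(salary, probs = seq(0.1, 0.9, by = 0.1))  # 9 cut points

quartiles
# 25%: 44500, 50%: 45500, 75%: 53750
```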
Measures of Dispersion slide
Range and other measures based on positions slide
$range=max-min$
workers[,.(min_salary=min(salary),
max_salary=max(salary),
range=max(salary)-min(salary))]
| min_salary | max_salary | range |
|---|---|---|
| 40000 | 850000 | 810000 |
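Because the range depends only on the two extreme values, a single observation can dominate it. A small sketch with the same salary figures, where `range_without_top` is an illustrative name:

```r
# Salaries of the 16 workers from the example data
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

salary_range <- diff(range(salary))   # max - min = 810000

# Range after dropping the single highest salary
range_without_top <- diff(range(salary[salary < max(salary)]))   # 60000

c(all_workers = salary_range, without_top = range_without_top)
```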
Range and other measures based on positions slide
- Distance between any two positions (Deciles, Quintiles, Percentiles) can be used as a measure of dispersion.
$inter.quartile.range=Q3-Q1$
## summary(workers$salary)
quantile(workers$salary,probs=c(0.25,0.75))
quantile(workers$salary,probs=c(0.1,0.9))
quantile(workers$salary,probs=c(0.1,0.95))
quantile(workers$salary,probs=c(0.25,0.95))
quantile(workers$salary,probs=c(0,0.75))
| positions | lower | upper |
|---|---|---|
| 25%, 75% | 44500 | 53750 |
| 10%, 90% | 42500 | 85500 |
| 10%, 95% | 42500 | 287500 |
| 25%, 95% | 44500 | 287500 |
| 0%, 75% | 40000 | 53750 |
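The distances themselves follow by subtraction. A minimal sketch, using the same salary figures, computes the inter-quartile range by hand, checks it against base R's `IQR()`, and adds the distance between the first and ninth deciles for comparison; `interdecile` is an illustrative name.

```r
# Salaries of the 16 workers from the example data
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

q <- quantile(salary, probs = c(0.10, 0.25, 0.75, 0.90), names = FALSE)

iqr         <- q[3] - q[2]   # Q3 - Q1 = 53750 - 44500 = 9250
interdecile <- q[4] - q[1]   # D9 - D1 = 85500 - 42500 = 43000

IQR(salary)                  # built-in equivalent of iqr
```

Unlike the range of 810000, the inter-quartile range of 9250 is unaffected by the one very high salary, because it ignores the top and bottom quarters of the data.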
Variance, Standard Deviation and Coefficient of Variation
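$variance=\frac{1}{n} \times \sum(x_{i}-\bar{x})^{2}$, where $\bar{x}$ is the mean
$standard.deviation = \sqrt{variance}$
$cov=\frac{standard.deviation}{mean}$
Note: R's var() divides by $n-1$ rather than $n$, so the figures below are sample variances.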
workers[,.(var_salary=round(var(salary),1),
sd_salary=round(sqrt(var(salary)),1),
cov_salary=round(sqrt(var(salary))/mean(salary),2))
]
| var_salary | sd_salary | cov_salary |
|---|---|---|
| 40075200000 | 200187.9 | 1.97 |
workers[,.(var_salary=round(var(salary),1),
sd_salary=round(sqrt(var(salary)),1),
cov_salary=round(sqrt(var(salary))/mean(salary),2)),.(sex)]
| sex | var_salary | sd_salary | cov_salary |
|---|---|---|---|
| M | 89680952381 | 299467.8 | 1.74 |
| F | 54500000 | 7382.4 | 0.16 |
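As a check on the formulas, the sketch below computes the variance with a divisor of $n$ by hand and compares it with R's `var()`, which uses $n-1$. The names `pop_var`, `samp_var` and `cv` are illustrative.

```r
# Salaries of the 16 workers from the example data
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000
n <- length(salary)

pop_var  <- sum((salary - mean(salary))^2) / n   # divisor n:   37570500000
samp_var <- var(salary)                          # divisor n-1: 40075200000

# The two differ only by the factor (n - 1) / n
all.equal(pop_var, samp_var * (n - 1) / n)       # TRUE

# Coefficient of variation, using R's sd() (which also divides by n-1)
cv <- sd(salary) / mean(salary)
round(cv, 2)                                     # 1.97
```

With 16 observations the two variances differ by about 6 per cent; the difference shrinks as the number of observations grows.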
Graphical Displays of Quantitative Information: Common Pitfalls
Common uses of statistical graphics slide
- To show trends over time
- To show mid-point variations across categories
- To show composition
- (less commonly, though more usefully) to show/analyse dispersion
Mis-representation slide
Graphical Displays of Quantitative Information: Dispersion slide
Boxplot slide
- Invented by John Tukey in 1970
- Many variations have been proposed since then, though the essential form and idea have remained intact.
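The numbers behind a boxplot can be inspected without drawing it. The sketch below applies base R's `boxplot.stats()` to the salary figures from the example data, assuming the default whisker rule of 1.5 times the box length; the variable name `b` is illustrative.

```r
# Salaries of the 16 workers from the example data
salary <- c(71, 50, 65, 40, 45, 42, 46, 43,
            45, 43, 45, 45, 850, 100, 46, 48) * 1000

b <- boxplot.stats(salary)   # default coef = 1.5

b$stats   # lower whisker, lower hinge, median, upper hinge, upper whisker
# 40000 44000 45500 57500 71000

b$out     # points beyond the whiskers, drawn individually
# 850000 100000
```

The hinges (44000 and 57500) are Tukey's version of the quartiles and differ slightly from the `quantile()` values of 44500 and 53750. Calling `boxplot(salary)` draws the plot from the same statistics.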
Paul Krugman on Fiscal Austerity
What does this graph show? slide
What did Paul Krugman say? slide
"Here’s what fiscal policy should do: it should support demand when the economy is weak, and it should pull that support back when the economy is strong. As John Maynard Keynes said, “The boom, not the slump, is the right time for austerity.” And up until 2010 the U.S. more or less followed that prescription. Since then, however, fiscal policy has become perverse: first austerity despite high unemployment, now expansion despite low unemployment."