R Statistical Language for Analysis – Summary of Data

October 26, 2013 Billy Aung Myint

I have no time. I think most people don't either, in today's economic climate. So I will keep this short and sweet.

You have data. You want the summary. No more, no less.

Here is one such dataset, of cars. It's built into R, so you only need one line before using it:

> data(cars)

Here it is.

> cars
   speed dist
1      4    2
2      4   10
3      7    4
4      7   22
5      8   16
6      9   10
7     10   18
8     10   26
9     10   34
10    11   17
11    11   28
12    12   14
13    12   20
14    12   24
15    12   28
16    13   26
17    13   34
18    13   34
19    13   46
20    14   26
21    14   36
22    14   60
23    14   80
24    15   20
25    15   26
26    15   54
27    16   32
28    16   40
29    17   32
30    17   40
31    17   50
32    18   42
33    18   56
34    18   76
35    18   84
36    19   36
37    19   46
38    19   68
39    20   32
40    20   48
41    20   52
42    20   56
43    20   64
44    22   66
45    23   54
46    24   70
47    24   92
48    24   93
49    24  120
50    25   85
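
Printing all 50 rows is fine for a small dataset like this one, but for anything bigger you probably only want a peek. A minimal sketch using base R's head() and tail(); the variable name first_rows is just mine for illustration:

```r
data(cars)

# head() returns the first six rows by default
first_rows <- head(cars)
print(first_rows)

# Pass a second argument to pick how many rows you want
print(head(cars, 3))   # first three rows
print(tail(cars, 3))   # last three rows
```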

To see the structure, use str() with the dataset name:

> str(cars)
'data.frame':	50 obs. of  2 variables:
 $ speed: num  4 4 7 7 8 9 10 10 10 11 ...
 $ dist : num  2 10 4 22 16 10 18 26 34 17 ...

Basically, it has 2 columns, or 2 variables: speed (in mph) and dist (the stopping distance, in ft).

To see the summary of the data:
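
If you want those same facts as values you can use in code, rather than reading them off the str() printout, something like this should do. It is only a sketch with base R functions; the names dims, col_names and col_types are my own:

```r
data(cars)

dims <- dim(cars)                  # number of rows, then number of columns
col_names <- names(cars)           # the variable names
col_types <- sapply(cars, class)   # the class of each column

print(dims)
print(col_names)
print(col_types)
```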

> summary(cars)
     speed           dist
 Min.   : 4.0   Min.   :  2.00
 1st Qu.:12.0   1st Qu.: 26.00
 Median :15.0   Median : 36.00
 Mean   :15.4   Mean   : 42.98
 3rd Qu.:19.0   3rd Qu.: 56.00
 Max.   :25.0   Max.   :120.00
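
There is no magic in those six numbers. As a rough sketch, here is the speed column worked out by hand with base R functions, assuming quantile()'s default method; speed_stats is just a name I picked:

```r
data(cars)

# Build the six summary statistics for speed one by one
speed_stats <- c(
  min    = min(cars$speed),
  q1     = unname(quantile(cars$speed, 0.25)),  # 1st quartile
  median = median(cars$speed),
  mean   = mean(cars$speed),
  q3     = unname(quantile(cars$speed, 0.75)),  # 3rd quartile
  max    = max(cars$speed)
)

print(speed_stats)
```

The values should match the speed column of summary(cars).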

What if you just want to know about one variable, speed for example?

> summary(cars$speed)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    4.0    12.0    15.0    15.4    19.0    25.0
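
The result of summary() on a single column is a named set of numbers, so you can pull out just the one you need by its label. A small sketch, assuming the labels are spelled exactly as in the printout above; speed_summary, speed_mean and dist_max are names I made up:

```r
data(cars)

speed_summary <- summary(cars$speed)

# Pick one statistic by its label
speed_mean <- speed_summary[["Mean"]]
print(speed_mean)

# The same works for the other column
dist_max <- summary(cars$dist)[["Max."]]
print(dist_max)
```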

I think it's sweet. With a few commands, you have almost 90% of what you need to know for most tasks: min, max, mean, quartiles and so on.

Thanks for reading
Billy AM
