Calculate a summary table which has the mean/SD of the horse power variable organized by number of gears. (Bonus: export it to .html or Word.)
Make a new dataframe called my_cars which contains only the mpg and hp columns, renamed to miles_per_gallon and horse_power respectively.
Create a new variable in the dataframe called km_per_litre using the mutate function. Note: 1 mpg ≈ 0.425 km/l.
Look at the sample_frac() function. Use it to make a new dataframe with a random selection of half the data.
Look at the slice function. From the original dataframe select rows 10 to 35.
Look at the tibble package and the rownames_to_column function. Make a dataset with just the “Lotus Europa” model. What would be an alternative way of reaching the same goal?
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## filter, lag
## The following objects are masked from 'package:base':
## intersect, setdiff, setequal, union
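# cars is assumed to be a copy of mtcars created earlier (it has the gear and hp columns)
grouped <- group_by(cars, gear)
# One row per number of gears, with the mean and standard deviation of horsepower
table <- summarise(grouped, Mean = mean(hp), Sd = sd(hp))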
## Loading required package: stargazer
## Please cite as:
## Hlavac, Marek (2022). stargazer: Well-Formatted Regression and Summary Statistics Tables.
## R package version 5.2.3.
stargazer(table, summary = FALSE, type = "html", out = "horsepower.html", header = FALSE)
## <table style="text-align:center"><tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left"></td><td>gear</td><td>Mean</td><td>Sd</td></tr>
## <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr><tr><td style="text-align:left">1</td><td>3</td><td>176.133333333333</td><td>47.6892720291122</td></tr>
## <tr><td style="text-align:left">2</td><td>4</td><td>89.5</td><td>25.8931370338657</td></tr>
## <tr><td style="text-align:left">3</td><td>5</td><td>195.6</td><td>102.833846568141</td></tr>
## <tr><td colspan="4" style="border-bottom: 1px solid black"></td></tr></table>
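The exported table above carries long decimals such as 176.133333333333. As a minimal sketch, the same summary can be written as one pipe with the values rounded before export. It uses mtcars directly so that it runs on its own. The name hp_by_gear, the one-decimal rounding, and the conversion to a plain data frame are illustrative choices, not part of the original solution.

```r
library(dplyr)

# Mean and SD of horsepower by number of gears, rounded to one decimal.
# hp_by_gear is a hypothetical name used only for this sketch.
hp_by_gear <- mtcars %>%
  group_by(gear) %>%
  summarise(Mean = round(mean(hp), 1), Sd = round(sd(hp), 1)) %>%
  as.data.frame()  # plain data frame; a conservative choice before exporting

hp_by_gear
```

The resulting object can be passed to stargazer() in place of table.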
Note that dplyr is already loaded, so I have not loaded it again. mutate() would also get you there, but you would then have to remove the surplus columns.
my_cars <- cars %>% select(miles_per_gallon = mpg, horse_power = hp)
I have added km_per_litre to the my_cars dataframe.
my_cars <- my_cars %>% mutate(km_per_litre = 0.425*miles_per_gallon)
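The factor 0.425 is a rounded value. The sketch below derives it from the exact definitions of the mile and the US gallon, then compares the rounded and unrounded conversions. It assumes that mpg in mtcars refers to US gallons. The names mpg_to_kml and check are made up for this example.

```r
library(dplyr)

# Exact definitions: 1 mile = 1.609344 km, 1 US gallon = 3.785411784 litres
km_per_mile <- 1.609344
litres_per_us_gallon <- 3.785411784
mpg_to_kml <- km_per_mile / litres_per_us_gallon  # roughly 0.4251

# Compare the rounded factor from the exercise with the unrounded one
check <- mtcars %>%
  transmute(miles_per_gallon = mpg,
            km_per_litre = 0.425 * miles_per_gallon,
            km_per_litre_exact = mpg_to_kml * miles_per_gallon)

head(check, 3)
```

The two columns differ only in the third decimal place or beyond, so the rounded factor is fine for this exercise.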
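Sliced rows 10 to 35. Note that this slices my_cars rather than the original dataframe; swap in cars to follow the exercise literally. The data has only 32 rows, and slice() silently ignores indices beyond the last row, so the result contains rows 10 to 32.
my_cars_slice <- my_cars %>% slice(10:35)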
my_cars_sample <- my_cars %>% sample_frac(size = 0.5, replace = FALSE)
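A random sample changes on every run unless the random seed is fixed. As a hedged sketch, the example below fixes the seed with set.seed() and uses slice_sample(prop = ...), which recent dplyr versions (1.0.0 and later) offer as the successor to sample_frac(). It uses mtcars directly so that it runs on its own. The seed value 42 and the names half_a and half_b are arbitrary.

```r
library(dplyr)

set.seed(42)  # arbitrary seed, chosen only for illustration
half_a <- mtcars %>% slice_sample(prop = 0.5)

set.seed(42)  # resetting the seed reproduces the same draw
half_b <- mtcars %>% slice_sample(prop = 0.5)

nrow(half_a)               # half of the 32 rows
identical(half_a, half_b)  # same seed, same sample
```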
This requires the tibble package. If you have loaded the tidyverse, tibble is already attached.
## Loading required package: tidyverse
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.7 ✔ stringr 1.4.0
## ✔ tidyr 1.2.0 ✔ forcats 0.5.1
## ✔ readr 2.1.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
mycars_final <- rownames_to_column(mtcars, var = "model")
Lotus_europa <- mycars_final %>% filter(model == "Lotus Europa")
Lotus_europa2 <- mtcars %>% filter(rownames(mtcars) %in% "Lotus Europa")
Lotus_europa3 <- mtcars %>% filter(rownames(mtcars) == "Lotus Europa")
You could also create a new column holding the row names with mutate() and then filter on that column.
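A minimal sketch of the mutate() route follows, with base R row indexing as a further alternative. The object names lotus_europa4 and lotus_europa5 are made up for this example.

```r
library(dplyr)

# Copy the row names into a regular column, then filter on it
lotus_europa4 <- mtcars %>%
  mutate(model = rownames(mtcars)) %>%
  filter(model == "Lotus Europa")

# Base R alternative: index the row by its name
lotus_europa5 <- mtcars["Lotus Europa", ]

lotus_europa4
```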