Student survey

Introduction

In this code-along we’ll work with a small but fairly messy survey dataset on the favorite foods of school-aged children, along with some other information about them.

Packages

We will use the tidyverse for our analysis.

library(tidyverse)

Data

The data are synthetic, so we can make a few important points quickly.

Analysis

Read the data in and inspect it.

students_raw <- read_csv("https://data-science-with-r.github.io/data/students-raw.csv")

Rows: 6 Columns: 5
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (4): Full Name, favourite.food, mealPlan, AGE
dbl (1): Student ID

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

students_raw

# A tibble: 6 × 5
  `Student ID` `Full Name`      favourite.food     mealPlan            AGE  
         <dbl> <chr>            <chr>              <chr>               <chr>
1            1 Sunil Huffmann   Strawberry yoghurt Lunch only          4    
2            2 Barclay Lynn     French fries       Lunch only          5    
3            3 Jayendra Lyne    N/A                Breakfast and lunch 7    
4            4 Leon Rossini     Anchovies          Lunch only          <NA> 
5            5 Chidiegwu Dunkel Pizza              Breakfast and lunch five 
6            6 Güvenç Attila    Ice cream          Lunch only          6

Fix the variable names.

# add code here
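
One possible approach, sketched with dplyr's rename(): give every column a lowercase snake_case name. The two-row tibble is a hand-typed stand-in for students_raw so the snippet runs on its own, and the name students_renamed is made up for illustration.

```r
library(tidyverse)

# Stand-in for students_raw: two rows typed in by hand from the printout above
students_raw <- tribble(
  ~`Student ID`, ~`Full Name`,     ~favourite.food,      ~mealPlan,    ~AGE,
  1,             "Sunil Huffmann", "Strawberry yoghurt", "Lunch only", "4",
  2,             "Barclay Lynn",   "French fries",       "Lunch only", "5"
)

# rename() takes new_name = old_name; names with spaces need backticks
students_renamed <- students_raw |>
  rename(
    student_id     = `Student ID`,
    full_name      = `Full Name`,
    favourite_food = favourite.food,
    meal_plan      = mealPlan,
    age            = AGE
  )

names(students_renamed)
```

Consistent names without spaces or mixed case can be typed without backticks in every later step.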

Handle NAs.

# add code here
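
A minimal sketch of one way to do this, assuming the columns already carry snake_case names: the string "N/A" in the favorite food column is a missing value in disguise, and dplyr's na_if() turns it into a real NA. The tibble and the names students_named and students_na are illustrative stand-ins.

```r
library(tidyverse)

# Stand-in rows typed in by hand from the printout above
students_named <- tribble(
  ~student_id, ~favourite_food, ~age,
  3,           "N/A",           "7",
  4,           "Anchovies",     NA,
  5,           "Pizza",         "five"
)

# Replace the literal string "N/A" with a real missing value
students_na <- students_named |>
  mutate(favourite_food = na_if(favourite_food, "N/A"))

# Count missing values per column
colSums(is.na(students_na))
```

An alternative is to treat "N/A" as missing at import time by passing na = c("", "NA", "N/A") to read_csv().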

Inspect variable types and apply fixes where appropriate.

# add code here
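
In the printout above, AGE is a character column because one entry is spelled out as "five". A sketch of one possible fix, assuming snake_case names: recode that single entry, then parse the column as a number with readr's parse_number(). The names students_chr and students_num are made up for this example.

```r
library(tidyverse)

# Stand-in rows typed in by hand from the printout above
students_chr <- tribble(
  ~student_id, ~age,
  1,           "4",
  4,           NA,
  5,           "five"
)

# Recode the spelled-out value, then convert character to double;
# the missing age stays missing
students_num <- students_chr |>
  mutate(age = parse_number(if_else(age == "five", "5", age)))

glimpse(students_num)
```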

Inspect variable classes and apply fixes where appropriate. Save the resulting data frame as students.

# add code here
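
One reading of this step, offered as a sketch: the meal plan column holds a small fixed set of categories, so it may be better stored as a factor than as free text. The input tibble students_num is a hand-typed stand-in that assumes the earlier fixes have been made.

```r
library(tidyverse)

# Stand-in for the data after the earlier fixes
students_num <- tribble(
  ~student_id, ~meal_plan,            ~age,
  1,           "Lunch only",          4,
  3,           "Breakfast and lunch", 7,
  4,           "Lunch only",          NA
)

# Convert the categorical column to a factor and save the result
students <- students_num |>
  mutate(meal_plan = factor(meal_plan))

class(students$meal_plan)
levels(students$meal_plan)
```

factor() orders the levels alphabetically by default; pass levels = to set a different order.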

Write out the students object to a CSV file in the data folder of your working directory.

# add code here
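
A minimal sketch using readr's write_csv(). The small students tibble is a stand-in so the snippet runs on its own, and the dir.create() call is an added precaution in case the data folder does not exist yet.

```r
library(tidyverse)

# Stand-in for the cleaned students data frame
students <- tibble(
  student_id = c(1, 3),
  meal_plan  = factor(c("Lunch only", "Breakfast and lunch")),
  age        = c(4, 7)
)

# Create data/ if needed (no warning if it is already there), then write
dir.create("data", showWarnings = FALSE)
write_csv(students, "data/students.csv")

file.exists("data/students.csv")
```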

Read in the newly created students.csv and inspect the variable types and classes. Do you observe anything unexpected?

# add code here
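
A sketch of the comparison, with a stand-in students tibble that is written out first so the snippet runs on its own. The name students_csv is made up for this example.

```r
library(tidyverse)

# Stand-in for the cleaned students data frame, with a factor column
students <- tibble(
  student_id = c(1, 3),
  meal_plan  = factor(c("Lunch only", "Breakfast and lunch")),
  age        = c(4, 7)
)

# Recreate the file so this snippet does not depend on an earlier step
dir.create("data", showWarnings = FALSE)
write_csv(students, "data/students.csv")

# Read it back and compare the class of meal_plan before and after
students_csv <- read_csv("data/students.csv", show_col_types = FALSE)

class(students$meal_plan)      # "factor"
class(students_csv$meal_plan)  # "character"
```

A CSV file is plain text, so it has no way to record that a column was a factor; read_csv() guesses the column types again from the values it sees. Supplying col_types, for example cols(meal_plan = col_factor()), is one way to restore the class on import.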

Write out the students object to an RDS file in the data folder of your working directory.

# add code here
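
A minimal sketch using readr's write_rds(), again with a stand-in students tibble and a dir.create() call added as a precaution.

```r
library(tidyverse)

# Stand-in for the cleaned students data frame
students <- tibble(
  student_id = c(1, 3),
  meal_plan  = factor(c("Lunch only", "Breakfast and lunch")),
  age        = c(4, 7)
)

# RDS is R's own binary format for storing a single object
dir.create("data", showWarnings = FALSE)
write_rds(students, "data/students.rds")

file.exists("data/students.rds")
```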

Read in the newly created students.rds and inspect the variable types and classes. How is this result different from the CSV file you read in?

# add code here
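
A sketch of the RDS round trip, with a stand-in students tibble written out first so the snippet runs on its own. The name students_rds is made up for this example.

```r
library(tidyverse)

# Stand-in for the cleaned students data frame, with a factor column
students <- tibble(
  student_id = c(1, 3),
  meal_plan  = factor(c("Lunch only", "Breakfast and lunch")),
  age        = c(4, 7)
)

# Recreate the file so this snippet does not depend on an earlier step
dir.create("data", showWarnings = FALSE)
write_rds(students, "data/students.rds")

# Read it back: the factor class and its levels are preserved
students_rds <- read_rds("data/students.rds")

class(students_rds$meal_plan)
levels(students_rds$meal_plan)
```

Because RDS stores the R object itself rather than a text rendering of it, column classes come back as they were saved. The trade-off is that the file is readable only from R.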