Author: Rafael A. Irizarry
If you have no experience with R, this book is an excellent place to start. Rafael’s explanations of R are friendly and thought-provoking, which keeps the reader engaged, and the book covers the core topics and skills a data scientist needs.
“This book is meant to be a textbook for a first course in Data Science. No previous knowledge of R is necessary, although some experience with programming may be helpful. The statistical concepts used to answer the case study questions are only briefly introduced, so a Probability and Statistics textbook is highly recommended for in-depth understanding of these concepts. If you read and understand all the chapters and complete all the exercises, you will be well-positioned to perform basic data analysis tasks and you will be prepared to learn the more advanced concepts and skills needed to become an expert.”
| Lectures |
|---|
| Part 1: Basics of R and the tidyverse |
| Learn R throughout the book |
| Building blocks needed to keep learning |
| Part 2: Data visualization with ggplot2 |
| Use ggplot2 to generate graphs |
| Describe important data visualization principles |
| Part 3: Statistics with R |
| Answer case study questions using probability, inference, and regression |
| Demonstrate the importance of statistics in data analysis |
| Part 4: Data wrangling with tidyverse |
| Familiarize the reader with data wrangling |
| Specific skills include web scraping, using regular expressions, and joining and reshaping data tables |
| Part 5: Machine learning with caret |
| Introduce machine learning through challenges |
| Use the caret package to build prediction algorithms including K-nearest neighbors and random forests |
| Part 6: Productivity tools for data science |
| Brief introduction to productivity tools used in data science projects |
| Tools include RStudio, UNIX/Linux shell, Git and GitHub, and knitr and R Markdown |
Irizarry, R. A. (2019). Introduction to data science: Data analysis and prediction algorithms with R. CRC Press.
