R for machine learning and other stuff
Where to find good books on R to learn all kinds of skills
One of the best things about R is that there is a never-ending stream of new short textbooks being written. Most of these are now written using the neat bookdown approach, meaning that the entire book is generated from markdown code. In fact, one can go to the website of this project and get a listing of all (?) such books:
OK, granted, a lot of these are random lecture notes. But still, there's a lot of good stuff. Too bad they don't have a proper search interface, or some way to rank the listing by quality. Few people are interested in some random Chinese lecture notes.
What prompted me to write this advertisement blogpost was that I just finished reading one of these books, namely the new Tidymodels introduction:
It's a really nice hands-on book. Suppose you don't have a particular background in statistics, but you know how to code a bit. What do you need to get a nice machine learning model or 10 going? This book is for you. The chapters are:
Introduction
1 Software for modeling
2 A Tidyverse Primer
3 A Review of R Modeling Fundamentals
Modeling Basics
4 The Ames Housing Data
5 Spending our Data
6 Fitting Models with parsnip
7 A Model Workflow
8 Feature Engineering with recipes
9 Judging Model Effectiveness
Tools for Creating Effective Models
10 Resampling for Evaluating Performance
11 Comparing Models with Resampling
12 Model Tuning and the Dangers of Overfitting
13 Grid Search
14 Iterative Search
15 Screening Many Models
Beyond the Basics
16 Dimensionality Reduction
17 Encoding Categorical Data
18 Explaining Models and Predictions
19 When Should You Trust Your Predictions?
20 Ensembles of Models
21 Inferential Analysis
I like everything except that there was no proper conclusion chapter -- odd! -- and the last chapter on inferential statistics is weak. Fortunately, inferential statistics is pretty much all that people learn in the rest of their statistics training, so a book on applied machine learning doesn't need to spend much time on it.
For those looking for more, here are some other books in the same format that I enjoyed:
Text Mining with R, by Julia Silge and David Robinson, 2017
Forecasting: Principles & Practice, by Rob J Hyndman and George Athanasopoulos, 2018
R Packages: Organize, Test, Document, and Share Your Code, by Hadley Wickham, 2022 (2nd ed.)
And here are some that look promising but that I haven't read yet:
Eh. Python is way better.
Why don't you do a paid stats course for Udemy? Explain to people the techniques used in intelligence or genetics research. I'd buy such a course.