I'd look at applying to S2DS if you can get in. C++ and ROOT have very little relevance nowadays in data science (or ever, really). Knowing the fundamentals of some of the ML algorithms in ROOT won't hurt you, but you need to learn scikit-learn, numpy, scipy, etc. in Python as a bare minimum.
Plenty of (free) courses on Coursera, too.
It's an extremely competitive market, and whilst some of the stuff she's done will be useful, she'll be (in many employers' eyes, at least) behind the people graduating with Masters in Data Science/Comp Sci.
Recommend this, too:
https://www.amazon.co.uk/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370/ref=asc_df_1461471370/?tag=googshopuk-21&linkCode=df0&hvadid=310848077451&hvpos=&hvnetw=g&hvrand=3789835037830509153&hvpone=...
Data Science seems to have the more robust career path. I recommend supplementing your coursework with as much of both as you can. Strongly recommend the following: Statistical applications, applied math, programming in R and/or Python, PowerBI, and this book.
I went the Biostatistics route for grad school and am now working as a data scientist/biostatistician. Very happy with where I am professionally.
I'm a statistician (well, epidemiologist) and I approve this message. This book is great, but probably more advanced than you need. Even just any intro to stats book should really help teach you the probability and statistical fundamentals you need to analyze your data.
The exam is kinda its own thing, just because you need lots of practice problems. For that, I think everyone will agree CA is worth it. That being said, learning probability is a great thing, and I recommend this textbook, which my actuary-turned-probability-PhD professor said was the best textbook. Plus you learn R, which is relevant for the job :)
Anyone interested in getting a fund going in the subreddit to help pay for people's first CA subscription? Christmas Coaching Actuaries?
There is so much overlap. That's why some authors use the term Statistical Learning (highly recommend that book BTW). In practice, undergraduates studying "machine learning" are probably expected to be more fluent at programming than people studying statistics (although this may be changing, and it might depend on your university).
For instance, for my senior machine learning class I needed access to a powerful GPU. So, I had to create and SSH into an AWS instance, clone git repositories, run the programs from the cloned repositories (pretrained GANs), and upload files and download results from my PC using SCP. I think this would've been beyond a statistics undergrad.
A major in statistics, with a minor in computer science is an awesome combination.
Thank you very much again for your reply.
"However, statistics can help in two other ways. The first is technical. It can give an understanding of what the algorithms are doing behind the scenes, knowing how they are related (or differ), knowing what exactly the criteria for accuracy (e.g. MAD) and hyperparameters are. "
I think I have an idea of what you mean. After finishing an ML course on Udemy, I started to study this book. (That's the reason I started to study more statistics, btw, because there were some concepts I didn't understand completely.)
From there I've seen a completely different approach to linear regression examples, showing what the parameters do, what a line of best fit is, what the different error functions are and how they differ, etc.
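To make the point about lines of best fit and error functions concrete, here is a minimal sketch in plain Python. The data points are made up for illustration (four points on y = 2x plus one outlier), and the function names are my own, not taken from the book. It fits an ordinary least squares line and then scores it, alongside a hand-picked line, under mean squared error and mean absolute error (one reading of the "MAD" mentioned in the quote above).

```python
from statistics import mean


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mse(xs, ys, slope, intercept):
    """Mean squared error: penalises large misses heavily."""
    return mean((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


def mae(xs, ys, slope, intercept):
    """Mean absolute error: every unit of miss costs the same."""
    return mean(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))


# Invented toy data: four points on y = 2x and one outlier at x = 5.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 30]

ols_slope, ols_intercept = fit_line(xs, ys)

# Compare the least squares line with a hand-picked line, y = 2x.
for label, (m, b) in {"least squares": (ols_slope, ols_intercept),
                      "y = 2x": (2.0, 0.0)}.items():
    print(f"{label:>13}: slope={m:.1f}, intercept={b:.1f}, "
          f"MSE={mse(xs, ys, m, b):.1f}, MAE={mae(xs, ys, m, b):.1f}")
```

On this toy data the least squares line has the lower MSE, while y = 2x has the lower MAE, so which line counts as "best" depends on the error function you pick.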
Thank you very much again, your answers helped me a lot!
Because, based on your initial comment and this one as well, the learning curve in front of you is ... steeper than you might think.
I think you are jumping into the real deep end without starting with some fundamentals. Given where these questions are at, I would just recommend grabbing a book on linear regression. If you already have a strong math background, then you could jump to something like https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370/ref=pd_sim_14_1?ie=UTF8&psc=1&refRID=086FTQPDGGERBQ7ZR2C5
But I often see people walk away from that book misunderstanding some of the assumptions behind the models they are building, and then making very poor predictions. Inference is a whole other story...
I'm not sure how advanced your statistical background is yet, but the best purchase I ever made was An Introduction to Statistical Learning: with Applications in R by James, Witten, Hastie, and Tibshirani.
It gives you a basic, intuitive background on various machine learning methods without getting into nitty-gritty probability or statistical theory. And it has really helpful problem sets at the end of each chapter that show you how to apply each method in R, and which packages you'll need.
Seriously, that thing is like my bible. The authors have made a pdf available on the internet as well, but I'd highly suggest springing for a hard copy. It's pretty cheap as far as textbooks go.
Other than that, I've never been one to learn through online courses or books. I'd second /u/veeeerain and just do a bunch of projects using datasets from sources like Kaggle. Maybe start a blog to keep a portfolio of all the cool things you do. ;)
An Introduction to Statistical Learning: with Applications in R:
https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370
The Elements of Statistical Learning: Data Mining, Inference, and Prediction:
https://www.amazon.com/Elements-Statistical-Learning-Prediction-Statistics/dp/0387848576
> Introduction to Statistical Learning
Who's the author? Is it this book?
If you read my comment, I gave you two books I recommend:
https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370
https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1492032646
For me, the best way to get started is to find a question that you want answered. A lot of people on this subreddit say things that they feel are true; investigate those claims and find a way to prove or disprove them.
Once you have a question to answer, you need to find the data for it. One of the better resources for this, if you don't have some sort of web scraper, is pbpstats. Tons of data, easy to sort, and there's a button to download a CSV file of the data you're looking at. For smaller projects, Basketball Reference is fine too.
Then, if you know Python, the rest is easy. If you're new to statistics and statistical ideas, I'd recommend An Introduction to Statistical Learning: with Applications in R. Easy to read, not too technical, and they give examples in R that are easy enough to follow. They give good rules of thumb, so you don't need to dig deep to understand an idea if you're just trying to do a fun project and need some guidance.
The rest is practice! Fail spectacularly, post your results, get feedback, and do it all over again :)
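As a rough sketch of the "download a CSV, then check the claim in Python" step, something like the following would do for a first pass. Everything specific here is an assumption: the column names (`three_pt_attempts`, `points`), the team labels, and the numbers are invented toy values standing in for whatever file you actually download, and the Pearson correlation is just one simple way to look at whether two columns move together.

```python
import csv
import io
from math import sqrt

# Stand-in for a downloaded CSV file; columns and numbers are invented.
TOY_CSV = """team,three_pt_attempts,points
A,30,105
B,35,110
C,40,112
D,25,101
E,45,118
"""


def load_columns(csv_file, x_col, y_col):
    """Read two numeric columns from an open CSV file object."""
    xs, ys = [], []
    for row in csv.DictReader(csv_file):
        xs.append(float(row[x_col]))
        ys.append(float(row[y_col]))
    return xs, ys


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical claim to check: "teams that shoot more threes score more".
xs, ys = load_columns(io.StringIO(TOY_CSV), "three_pt_attempts", "points")
r = pearson_r(xs, ys)
print(f"n = {len(xs)}, Pearson r = {r:.3f}")
```

With a real download you would pass `open("your_file.csv", newline="")` (a placeholder file name) instead of the `io.StringIO` object. A high correlation on its own doesn't prove the claim, but it tells you whether the question is worth digging into further.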
If you have a practical goal of finding a relevant job, I think you'll be better off learning descriptive data analysis and basic machine learning.
Learn to use the tidyverse packages in R, https://r4ds.had.co.nz/, and maybe try to read https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370
An Introduction to Statistical Learning: with Applications in R (Springer Texts in Statistics) https://www.amazon.com/dp/1461471370/ref=cm_sw_r_cp_apa_i_4BRVCb76G28M3
http://www.urbandictionary.com/define.php?term=laughing&defid=1568845 :))
Now, seriously, if you want to get started, I'd recommend this for R (http://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370/) and this for Python (http://www.amazon.com/Python-Machine-Learning-Sebastian-Raschka/dp/1783555130//).
Also, head out to /r/datascience and /r/MachineLearning!
EDIT: Wrong link.
This. The book that accompanies these videos is one of my main go-tos. Very well put together. Great examples.
Another real good book is Practical Data Science with R.
I'm not sure what language the Johns Hopkins Coursera Data Science courses are taught in, but I'd imagine either R or Python.
I have had this one recommended to me, and it seems to fit your requirements: https://www.amazon.com/Introduction-Statistical-Learning-Applications-Statistics/dp/1461471370