{"product":{"product_id":"1466590734","title":"Using R for Introductory Statistics (Chapman \u0026 Hall/CRC The R Series)","price":"45.25","image_url":"https://m.media-amazon.com/images/I/41p+G4Skl1L._SL500_.jpg","url":"https://www.amazon.com/dp/1466590734"},"comments":[{"body":"Super minor nitpick:\r\n\r\n**R Studio** is the *development environment*.\r\n\r\n**`R`** is the *language*.\r\n\r\nPresumably you want to become well versed in the latter rather than the former. It's an easy mistake to make though, since the two are so intertwined for most people as to become almost indistinguishable.\r\n\r\nMore to your point though:\r\n\r\nBefore learning anything, it's a good idea to ask yourself why you want to learn it, and what you hope to be able to do with it. Now, you mentioned two things:\r\n\r\n* Hypothesis testing.\r\n\r\n* Graphing 4 variables.\r\n\r\nBoth of these are relatively simple, and if you have even the most rudimentary understanding of `R`, you could learn to do them in a couple of minutes.\r\n\r\nSo, my question to you would be: in using `R`, is your goal to get quick, simple answers to straightforward questions, **OR** are you ultimately looking to be able to do much more complicated tasks? This isn't a judgmental question; not everyone needs to aspire to become an `R` god, and just needing something quick and dirty is perfectly okay.\r\n\r\nIf the things you mentioned are more or less the extent of your needs, I'd suggest just googling what you need to do at the time and picking up what you need, more or less, through osmosis.\r\n\r\nHowever, if you have designs on being able to do amazingly complicated things, if you want to push `R` to its fullest, you'll need a more structured approach.\r\n\r\nOne thing you *absolutely* must understand is that `R` is a package-based language. 
What this means for you is that beyond the numerous ways you can do any task in any language, people have written countless\\* packages which contain all sorts of handy functions to do just about anything you could conceivably want to do.\r\n\r\n\u0026gt;\\* Okay, it's not really *countless*, there are (as of this writing) 12,620 packages on CRAN and 1,560 additional packages on Bioconductor. There are bunches more unofficial ones scattered about GitHub and others privately maintained, but you get the point: there are lots of them.\r\n\r\nSo, for anything you want to do, you can approach it in one of two very broad ways:\r\n\r\n* Base `R`.\r\n\r\n* Using packages.\r\n\r\nWhen you are starting out, I think it's very important to get a good handle on Base `R`. \r\n\r\nI would start out with basically *any* introductory `R` book. Search on Amazon and just find one you like.\r\n\r\nPersonally, I can recommend [Using R for Introductory Statistics](https://www.amazon.com/dp/1466590734) by John Verzani. It isn't for everyone, but if you're truly a beginner to both `R` and statistics more generally, it's a good reference text.\r\n\r\nAfter that, it's up to you where you want to take it. For me, the pantheon of `R` gods\\* I would pay tribute to are these four:\r\n\r\n* The god of tidiness - [Hadley Wickham](http://hadley.nz) [GitHub](https://github.com/hadley/) /u/hadley\r\n\r\n* The god of speed - [Dirk Eddelbuettel](http://dirk.eddelbuettel.com/) [GitHub](https://github.com/eddelbuettel)\r\n\r\n* The god of art - [Winston Chang](https://www.rdocumentation.org/collaborators/name/Winston%20Chang) [GitHub](https://github.com/wch)\r\n\r\n* The god of sharing - [Yihui Xie](https://yihui.name/en/) [GitHub](https://github.com/yihui)\r\n\r\n\u0026gt;\\*I'm sure every single person on that list would balk at being called a \"god,\" but they'd be lying.\r\n\r\nIt's no mistake that 3/4 of them work for R Studio. 
\r\n\r\n### The god of tidiness.\r\n\r\nHadley must be a complete neat-freak because he's the driving force behind the [`tidyverse`](https://www.tidyverse.org),\r\n\r\n\u0026gt;The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.\r\n\r\nOnce you branch out of base `R`, the `tidyverse` *should be* your **first** destination. It's not quite a new language unto itself, more like a *very* sophisticated dialect of the language you already know. Once you can speak \"tidy,\" you can still communicate with the \"base\" speaking plebs, you just won't be able to imagine ever wanting to.\\*\r\n\r\n\u0026gt;\\* This is not *exactly* true, and might come across as gross and elitist, but the `tidy` paradigm really is ***substantially*** better. If you were designing a completely new language to do statistical computing, from scratch, today, the language would probably feel a *lot* like the `tidyverse`.\r\n\r\nAnyway, *any* book by [Hadley Wickham](http://hadley.nz) is gold, and they're all available online for free. But [R for Data Science](http://r4ds.had.co.nz) is a good first step into a larger world.\r\n\r\n### The god of speed.\r\n\r\nI imagine Dirk is not a patient man. He's *very* active on forums, basically every *meaningful* response on stackexchange for an Rcpp-related question is his (or his collaborator, lesser-god [Romain Francois](https://purrple.cat)), but sometimes his responses can seem a little... terse?\r\n\r\nNow, `R` is notoriously slow. It's much maligned for this, usually fairly, sometimes not.\r\n\r\nMuch of the perceived slowness can be mitigated in base `R` by learning the suite of `apply` functions which are *vectorized*. That is, they take a multivalued variable (a vector, matrix, or list) and they *apply* the same function to each element. It's typically much, much faster than using a for-loop. 
However, you can't always get away from needing a for-loop, and sometimes your loop will need to run thousands (or millions) of times. That's where the `Rcpp` package, which Dirk maintains, comes into play.\r\n\r\nIt is an interface between `R` and `C++`; there's not *much* to say about the package itself. You'll need to learn at least some rudimentary `C++` to make use of it, but simply breaking out a computationally intensive for-loop into an `Rcpp` function can yield a huge improvement in run times: 10x-100x (or more), depending on how well (or poorly) optimized your `R` and `C++` code is. There's some weirdness involved (like you can't call an `Rcpp` function in a parallel apply function (separate package) unless your `Rcpp` function is loaded as part of a package, so for *maximum* benefit you'll need to learn how to [write your own packages](http://r-pkgs.had.co.nz) - praise be to Hadley).\r\n\r\n`Rcpp` includes some semantic [\"sugar\"](http://gallery.rcpp.org/articles/sugar-for-high-level-vector-operations/) which allows you to write some things in `C++` more like you would in `R`, but that's yet a third thing to learn.\r\n\r\nAlso, `Rcpp`, much like the `tidyverse`, is more an ecosystem of interconnected packages than a single package.\r\n\r\n### The god of art.\r\n\r\nBase `R` plots are ugly as sin. They just are; no one should use them ever, for any reason.\\*\r\n\r\n\u0026gt;\\*Exaggeration.\r\n\r\nThat said, Winston's\\* `ggplot2` is a revelation and a revolution in how graphics are created and presented.\r\n\r\n\u0026gt;\\* Yes, *technically* `ggplot2` is also Hadley's and is part of the `tidyverse`, but Winston *literally* [wrote the book on it](http://www.cookbook-r.com/Graphs/). 
Okay, okay, Hadley *technically* created the package *and* has written [books about it](https://www.amazon.com/gp/aw/d/331924275X), I just find Chang's book more fitting to my needs.\r\n\r\nThe \"gg\" in `ggplot2` stands for [\"grammar of graphics\"](https://www.amazon.com/dp/0387245448/), a common structure for describing the components of a visualization in a concise way.\r\n\r\nLearning `ggplot2` will take you a long way toward being able to make beautiful graphical visualizations.\r\n\r\n### The god of sharing.\r\n\r\nAfter you've learned all of the above, you can wrangle your messy data into something tidy and manageable, you can work on it cleanly and power through massive computations, and you can create stunning images from your data. But it all means nothing if you're the only one who sees it.\r\n\r\nThis is where Yihui shines. He is the maintainer of the `knitr` package, and the author of [Dynamic Documents with R and knitr](https://github.com/yihui/knitr-book). This will allow you to turn all of your work into PDFs or web pages to share with the world.\r\n\r\nIt's super easy to get started with, much more complicated to master, but definitely worth it.\r\n\r\nTo use it *effectively*, you'll need to learn `rmarkdown`, also by Yihui. You'll also want to start dabbling with [`LaTeX`](https://www.latex-project.org) (if you're not proficient already), and to truly bend documents to your whim you'll need to learn to tinker with [`YAML`](http://pandoc.org/MANUAL.html#extension-yaml_metadata_block).\r\n\r\n### Closing remarks.\r\n\r\nIt's a *lot* to master. Few ever will. Not everyone will agree on everything I've said, but I think the path to true mastery looks something like that.\r\n\r\nBest of luck!","subreddit_name":"RStudio","author":"CapaneusPrime"},{"body":"Are you learning R or learning statistics? R is not a general-purpose language like Java or Python. It does one thing and one thing well: statistics. 
Learning it without understanding stats would be very difficult. That said, some of its syntax is very odd indeed.\r\n\r\nYou might find [Kaggle](http://www.kaggle.com) helpful. There are tons of R workbooks you can look at to see how things are done.\r\n\r\nI know you are asking for on-line resources, but you might find [Using R for Introductory Statistics](https://www.amazon.com/Using-Introductory-Statistics-Chapman-Hall/dp/1466590734/ref=sr_1_1?dchild=1\u0026amp;keywords=john+verzani\u0026amp;qid=1610805394\u0026amp;sr=8-1) helpful.","subreddit_name":"learnprogramming","author":"chicken_system"}]}