I have 2 recommendations:
Use RStudio. It is a front end to R that is, IMO, easier to use in general. It can be downloaded here
When you download R, use swirl. It is a tutorial that runs directly in R and teaches the basics. It can be run by typing the commands
> install.packages("swirl")
> library("swirl")
> swirl()
Good luck with your adventures!
Don’t have time to Skype unfortunately, but I can recommend the package dplyr. It’s very straightforward compared to other methods. The cheat sheet found here is also very helpful. Good luck
I would also suggest taking a look at the IDE for R called "RStudio". It makes installing packages, viewing results, searching the help for arguments, and managing scripts much, much easier. RStudio also has great cheatsheets for different packages.
In your second example
strsplit(x,",? (- )? ?!")
This is splitting on (maybe) a comma, a space, (maybe) a dash followed by a space, (maybe) a space, and an exclamation point.
So, the smallest thing it could split on is (removing the maybes) a space followed by an exclamation point (" !").
That doesn't exist in your string, so no splitting happens.
The easiest fix is to split on anything that isn't a word character (a-z, A-Z, 0-9, _) by doing:
strsplit(x, "\\W+")
A good site to learn and practice regex is http://regex101.com
Just be aware that in your patterns in R you need to "escape the escape" character, so write "\\W+" instead of "\W+".
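To make the difference concrete, here is a minimal sketch comparing the two patterns. The string `x` is a made-up example, since the original string from the question isn't shown here.

```r
# Hypothetical input; the string from the original question is not shown
x <- "Hello, world - this is R!"

# The original pattern needs at least a literal " !" to match.
# That never occurs in x, so the string comes back in one piece.
unsplit <- strsplit(x, ",? (- )? ?!")[[1]]

# Splitting on runs of non-word characters instead.
# Note the doubled backslash inside the R string.
words <- strsplit(x, "\\W+")[[1]]

print(unsplit)
print(words)
```

strsplit() returns a list with one element per input string, which is why the `[[1]]` is there.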
Have you already created the repository? You can use RStudio's version control features, which are great. I recommend getting used to using git through the terminal though.
Here is a good cheat-sheet for git commands
If you can provide some more specific questions I will help you as much as possible. I use git quite a lot, and I feel it is very important for everyone to be a good user. It will make your teammates very happy!
I'll preface this with the fact that I haven't ever really used a Mac with R.
You can either copy the contents of the CD to your working directory, or you can give your script the location of the CD drive, e.g.
list.files("/path/to/CD/drive")
This may help you in terms of figuring out where your CD is mounted according to the operating system.
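As a rough sketch of the second option: on macOS, removable media normally show up under /Volumes, so listing that folder should reveal the disc's name. The name "MY_CD" below is made up; substitute whatever your disc is actually called.

```r
# On macOS, mounted discs and drives normally appear under /Volumes
mount_root <- "/Volumes"

# List everything that is currently mounted (empty if the folder doesn't exist)
mounted <- if (dir.exists(mount_root)) list.files(mount_root) else character(0)
print(mounted)

# "MY_CD" is a hypothetical disc name; replace it with one from the list above
cd_path <- file.path(mount_root, "MY_CD")

# Full paths to the files on the disc, or an empty vector if it isn't mounted
cd_files <- if (dir.exists(cd_path)) list.files(cd_path, full.names = TRUE) else character(0)
print(cd_files)
```

You can then pass entries of `cd_files` straight to whatever function reads your data.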
💡 Here's a tip on markdown and reddit: reddit uses markdown and you can write up hyperlinks like so:
[Link description](link address)
i.e.,
[github markdown syntax](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet#links)
You're welcome 🗿
You could try using the free version of RStudio Server to host dashboards and reports. This would still require IT buy-in, but it doesn't require a Linux system to use.
It looks like you have consistently repeated words in those character strings: I would recommend reading up on using regular expressions to target those words and chop up the strings in a systematic way.
Here's a link to a useful RStudio cheat sheet on working with strings.
And the rebus package by Richie Cotton is also a great tool for working with strings: rebus documentation
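To give a flavour of the regex approach in base R, here is a small sketch. The strings and the repeated words "ID" and "SITE" are invented for illustration, since I don't know what your actual strings look like.

```r
# Hypothetical strings with consistently repeated words ("ID" and "SITE")
records <- c("ID 12 SITE north", "ID 7 SITE south")

# One pattern describing the whole string, with a capture group per piece
pattern <- "^ID (\\d+) SITE (\\w+)$"

# sub() swaps each string for the contents of one capture group
id   <- sub(pattern, "\\1", records)
site <- sub(pattern, "\\2", records)

parsed <- data.frame(id = as.integer(id), site = site)
print(parsed)
```

The same idea carries over to stringr or rebus; only the way you build the pattern changes.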
Depends on what direction you want to take with your R skills along with your current skill level.
About a year ago I decided I needed to learn more about ML, since the company I worked for had been duped multiple times into hiring AI consultants who (intentionally or not) wasted the company's resources, thanks to the still-mystifying nature of ML for nontechnical users (especially the ones who get duped by tech hustlers with the next "game-changing AI program").
I bought the book Hands-On Machine Learning with R, which has a whole bunch of little projects related to the subject of each chapter. I highly recommend this book for anyone dipping their toes into ML with R, and especially for you (OP) if you're trying to wrangle together some projects.
Hands-On Machine Learning with R (Chapman & Hall/CRC The R Series) https://www.amazon.com/dp/1138495689/ref=cm_sw_r_apan_glt_i_29ZKTTPX6C6MHHZ1Q2XT
If ML isn't what you wanna do right now, figure out where you want to take your R skills and someone here can probably make a recommendation.
Install a new operating system on your computer. https://www.ubuntu.com Based on how new you sound (basically 100%), this will take you between 1 and 3 months, given you put in a good hour or four every evening. If you don't have a workstation that is conducive to "install your own operating system", then you'll have to buy one. Shit's not going to work at all if you're here with a smartphone, netbook, iMac or MacBook Pro; these walled gardens are exactly that: delightful and hedged-in prisons. Job number 1 for a programmer is to "Break out of Alcatraz Island" by installing your own operating system. There are 20 thousand questions you must ask and answer yourself before being able to do this. It's step 1 of your 10 year journey to get where we are.
After you're able to get Ubuntu or something similar up and running, and you're able to login to reddit, then you'll be ready for step 2.
If you are younger than 23, then it may behoove you to go into debt and go to college, majoring in Computer Science at a respectable university renowned for shitting out good programmers. It will cost you, but they will also hold your fucking face to the fire, which is what you need on your 10 year journey. It's not easy, and that's why it can pay up to $150k/yr and, in rare cases, up to $300k/yr if you're better than 1 in 10 thousand (pay packages for backend pipeline river-of-money developers and ML developers at LinkedIn).
I’ve used the free version of Shiny Server as the sole developer in my organization and it’s worked great for me. You can easily add authentication by running both shiny-server and a Node.js app that implements Auth0 behind a reverse proxy with nginx. It took a couple of days to set up, but I’ve been very happy with it.
I didn’t follow this tutorial exactly so I can’t vouch for it 100%, but it lays out the general process well: https://auth0.com/blog/adding-authentication-to-shiny-server/
OpenRefine is almost exactly what you want. It's not specific to R, but there is an R package for its API: rrefine
Without knowing exactly how your data are formatted (e.g., a column for each species, or one column containing all of the species data, etc.), there are several ways you could solve this problem. You might find the tidyverse helpful for this (here is a cheatsheet).
If you have length, weight, and sampling date data organized by species, ultimately what you're trying to do is group records (data) for a given variable by sampling date and species. The group_by() function from the dplyr package (included in the tidyverse) will help you with this. summarise() will allow you to summarize the data for a given variable by the groups you defined with group_by(). Depending on how your data are organized and the desired output format, you may need to reformat your data. The two functions pivot_longer() and pivot_wider() may help you.
Take a look at these functions on the cheatsheet I linked above and see if you can put these pieces together.
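If it helps to see the pieces side by side, here is a minimal sketch. The data frame `fish` and its column names are invented, since I don't know how your data are laid out, so treat it as a pattern rather than a solution.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: one row per fish, one column per variable
fish <- data.frame(
  sample_date = as.Date(c("2021-05-01", "2021-05-01", "2021-05-01",
                          "2021-06-01", "2021-06-01")),
  species     = c("trout", "trout", "perch", "trout", "perch"),
  length_cm   = c(30, 34, 18, 32, 20),
  weight_g    = c(400, 500, 90, 450, 110)
)

# Group by sampling date and species, then summarise each group
summary_long <- fish %>%
  group_by(sample_date, species) %>%
  summarise(mean_length = mean(length_cm),
            mean_weight = mean(weight_g),
            n = n(),
            .groups = "drop")

# Optional reshape: one row per date, one column of mean length per species
summary_wide <- summary_long %>%
  select(sample_date, species, mean_length) %>%
  pivot_wider(names_from = species, values_from = mean_length)

print(summary_long)
print(summary_wide)
```

If your data start out with one column per species, pivot_longer() would be the first step to get them into the shape `fish` has here.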
Sounds like tidyr and dplyr would be what you want. There's a good cheat sheet for data wrangling here.
If your left table is t1 and the right table is t2, the following code will produce t3:
library(tidyr)
library(dplyr)

# Create original tables for this demonstration
t1 <- data.frame(State = c("UT", "AZ", "NY"),
                 Prediction = c("Product A", "Product B", "Product C"))
t2 <- data.frame(Product = c("Product A", "Product B", "Product C"),
                 AZ = c("A-AZ", "B-AZ", "C-AZ"),
                 NY = c("A-NY", "B-NY", "C-NY"),
                 UT = c("A-UT", "B-UT", "C-UT"))

# Transform t2 to have a matching State column
t2 <- t2 %>% gather(key = State, value = Instruction, -Product)

# Produce t3
t3 <- left_join(t1, t2, by = "State") %>% select(-Product)

This returns the following for t3:

> t3
  State Prediction Instruction
1    UT  Product A        A-UT
2    UT  Product A        B-UT
3    UT  Product A        C-UT
4    AZ  Product B        A-AZ
5    AZ  Product B        B-AZ
6    AZ  Product B        C-AZ
7    NY  Product C        A-NY
8    NY  Product C        B-NY
9    NY  Product C        C-NY
Hope this helps.
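One follow-up, in case what you actually want is a single instruction per state (the one for the predicted product): a variant that joins on both columns might look like the sketch below. It also uses pivot_longer(), which has superseded gather() in recent tidyr versions. The names `t2_long` and `t3_matched` are just ones I made up for this example.

```r
library(dplyr)
library(tidyr)

# Same demonstration tables as above
t1 <- data.frame(State = c("UT", "AZ", "NY"),
                 Prediction = c("Product A", "Product B", "Product C"))
t2 <- data.frame(Product = c("Product A", "Product B", "Product C"),
                 AZ = c("A-AZ", "B-AZ", "C-AZ"),
                 NY = c("A-NY", "B-NY", "C-NY"),
                 UT = c("A-UT", "B-UT", "C-UT"))

# Reshape t2 to one row per Product/State pair
t2_long <- t2 %>%
  pivot_longer(cols = -Product, names_to = "State", values_to = "Instruction")

# Join on State and on Prediction == Product, so each state keeps one row
t3_matched <- left_join(t1, t2_long,
                        by = c("State" = "State", "Prediction" = "Product"))

print(t3_matched)
```

This gives three rows instead of nine, so pick whichever shape matches what you need.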
Heads up: here is code to create cards:
https://www.freecodecamp.org/news/build-your-first-web-app-dashboard-using-shiny-and-r-ec433c9f3f6c/
Credit Risk Analytics -- this book is for credit risk analytics with SAS, but the authors also wrote an R companion for this book so if you get both, you can read the main book for the conceptual details and use the R workbook to apply what the main book teaches.
I got these when I was in banking, but changed careers before I even got past the first chapter.
there's always the book by the author of ggplot: ggplot2: Elegant Graphics for Data Analysis. i haven't actually read it, but i'd imagine it's pretty thorough. there's going to be a new edition coming out, coinciding with a newer version of ggplot, iirc. there's also the official documentation here (looks like they just revamped the site).
but i learned it just by figuring out what i wanted to plot, and then using google/stackoverflow