Take a look at the Data Wrangling Cheatsheet section on reshaping data with tidyr. gather() should get you more or less the result you’re looking for.
>Shiny by RStudio
>A web application framework for R
>Turn your analyses into interactive web applications
>No HTML, CSS, or JavaScript knowledge required
That pretty much sums it up.
I think what you're seeing is a very common problem everybody eventually runs into. You can read about it on stackoverflow https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal or read this blurb that I wrote 3 years ago about the issue:
Floating point numbers
This is somewhat technical, so don’t worry if you don’t understand it, but the main take home message is this: R (and any other programming language) can be a little inaccurate when representing certain numbers. Explanation: Because of the way computers work and store information, only integers and fractions whose denominator is a power of 2 can be represented exactly. Any other number has to be approximated and rounded off, though you generally don’t see this behavior because the rounding is performed at a very insignificant digit. As a result, two fractions that should be equal might not be equal in R, simply because different algorithms are used to compute them, so they may be rounded a little bit differently.
For example, anyone can tell you that 1 - 0.9 = 0.1, right? Let’s see what R says
1 - 0.9
## [1] 0.1
No problem, what was I making such a big fuss about?? Let’s look again…
1 - 0.9 == 0.1
## [1] FALSE
That’s the problem I was talking about. Another way to see this more clearly:
1 - 0.9 - 0.1
## [1] -2.775558e-17
As you can see, the difference is tiny. Most people can live their lives perfectly happy never knowing about this seemingly horrible shortcoming of computers, but it’s good to keep in mind.
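If you ever do need to compare floating point numbers in your own code, the standard trick is to compare with a tolerance instead of ==. Base R's all.equal() does this for you:

```r
# Compare floating point values with a tolerance, not with ==
x <- 1 - 0.9
x == 0.1                   # FALSE, because of floating point rounding
isTRUE(all.equal(x, 0.1))  # TRUE: all.equal() allows a tiny numerical tolerance
abs(x - 0.1) < 1e-9        # TRUE: or roll your own tolerance check
```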
You can combine plumber and future to make an asynchronous web API. You just wrap the logic of your function with future()
library(plumber)
library(future)

#' @get /my/route
function() {
  future::future({
    my_route_function()
  })
}
Here's a little resource on that as well: https://www.rstudio.com/resources/rstudioglobal-2021/plumber-and-future-async-web-apis/
The chart you're showing is called a Gantt chart. The best way I've discovered to create Gantt charts in R is to use mermaid.js, which is available via the DiagrammeR package.
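In case it helps, here's a rough sketch of what that looks like. The task names and dates are invented, and you'll need DiagrammeR installed:

```r
# A minimal mermaid.js Gantt spec (task names and dates are made up)
spec <- "
gantt
  dateFormat YYYY-MM-DD
  title Example project
  section Analysis
  Collect data :a1, 2024-01-01, 7d
  Clean data   :a2, after a1, 5d
  section Reporting
  Draft report :b1, after a2, 10d
"
# Renders the chart as an htmlwidget in the RStudio viewer or a browser
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  DiagrammeR::mermaid(spec)
}
```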
This has nothing to do with R and everything to do with learning how to reconcile conflicts from merges in Git before you ever push. For example, https://githowto.com/resolving_conflicts or https://git-scm.com/docs/git-merge.
Try it. You probably can’t run the latest RStudio version on Windows 7 — the RStudio download page has a list of older versions with annotations. You might want to try version 1.1.463 for 32 bit.
R itself should work, according to their FAQ.
That being said, you should upgrade your OS. Windows 7 is obsolete and unmaintained. You do not want to use it, except to maintain legacy systems, and even then only until 2023, which is when Microsoft will stop providing yearly security fixes for it.
Specifically, the cheat sheets on the RStudio website are a great resource for beginners for visualizing the functionality of some packages, including the tidyverse. (https://www.rstudio.com/resources/cheatsheets/)
I haven't had a chance to verify this, but if I were you, I'd put all the data frames into a list (if you have them in a single directory, you can get the paths from a list.files() call), and then do an lapply to read the CSVs into a list. Then you can use the Reduce function. See the top answer here: https://stackoverflow.com/questions/22644780/merging-multiple-csv-files-in-r-using-do-call
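Roughly like this (a self-contained sketch; the file names and columns here are invented so you can run it anywhere):

```r
# Write two toy CSVs to a temp dir, then read them all and Reduce-merge
dir <- tempfile("csvs")
dir.create(dir)
write.csv(data.frame(id = 1:3, x = c(10, 20, 30)),
          file.path(dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 2:4, y = c("b", "c", "d")),
          file.path(dir, "b.csv"), row.names = FALSE)

files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
dfs <- lapply(files, read.csv)                           # list of data frames
merged <- Reduce(function(a, b) merge(a, b, all = TRUE), dfs)
merged  # ids 1-4, with NA where a file had no matching row
```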
> Is there a better language I can learn before diving into this?
Yes^(1) but if your end goal is to learn R anyway (even if just for a course) just go for it, no point in making a detour.
All things told, R isn’t a bad language to start. There are several courses, including the one already posted, as well as the (very basic) Try-R and Roger Peng’s R programming course on Coursera.
One thing to look out for in all these courses and tutorials is that R is primarily a language by statisticians for statisticians. Unfortunately this means that some core insights of software engineering are treated with a cavalier attitude (i.e. ignored), to the detriment of code quality.
^1 The reason being that R does several peculiar things that (a) make it harder to reason about the correctness of a program, and (b) involve a lot of “magic” whose inner workings are not obvious until much later. Those make it especially hard to pick up other, less magical languages subsequently.
Here is a Photopea file if you would like to change the background. Simply click File > Open & Place... and select your favorite background of the same dimensions (2560x1440). You might need to change the text color (or size if you want); that would be in the "Text" folder. Unfortunately, it's a mess since I didn't think anyone would use it. To save the file: File > Export as > JPEG, then select max settings. If you need help with something let me know.
If you're interested in maps, check out the plotly library. If you're just starting, I recommend finding a dataset you're interested in and trying to make some decent graphs (ggplot2 is great).
Going from a dataset to a basic visualization can be more work than you expect. 😉
If you need to convert all your R code to JS, then you will have to do it by hand.
If it's fine to have an R backend powering the charts, then this is certainly doable - have a Shiny server serve up charts or snippets that the frontend can use. This would be my preferred way.
You can also take a look at the htmlwidgets package, which allows you to generate HTML + JS code from R. There are bindings for JS visualization libraries such as D3.js already - see the "reverse imports" section.
Alternatively, you can have R output the data as e.g. JSON, and then the frontend guys can choose a JS charting framework of their liking to make the charts, calling your API for the data.
Some quick googling turned up this:
https://www.djangoproject.com/
Python is a 'real' programming language with a MUCH larger user base, so anything that's been developed for R probably already exists in triplicate for Python.
I'm not trying to dog on R here (it's definitely the language I use 98% of the time), but part of what makes it so good is it has a well-defined niche. It's the de facto analysis software for my entire field, and is currently replacing SPSS, SAS, etc. Part of that niche, though, is that it's really accessible to non-programmers, like you mentioned with the tidyverse example.
Are you always extracting numbers out? If so, this code works.
library(tidyverse)
df <- df %>%
  # Extract number into new column
  mutate(replicate = str_extract(identifier, "\\d")) %>%
  # Replace numbers with empty string, then trim extra spaces at end of string
  mutate(identifier = str_trim(str_replace(identifier, "\\d", ""))) %>%
  # Rename identifier column
  rename(name = identifier)
But to answer your question specifically, you're looking for the function ifelse().
library(tidyverse)
df <- df %>%
  # If type == "unkn" then extract the number, otherwise make the value NA
  mutate(replicate = ifelse(type == "unkn",
                            yes = str_extract(identifier, "\\d"),
                            no = NA)) %>%
  # Replace numbers with empty string, then trim extra spaces at end of string
  mutate(identifier = str_trim(str_replace(identifier, "\\d", ""))) %>%
  # Rename identifier column
  rename(name = identifier)
The second one works, but technically ifelse() isn't necessary if you're always extracting numbers. Up to you which one you need. In either case str_extract uses things called "regular expressions". There are cheat sheets for the package stringr and regular expressions in general here.
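If you'd rather see the idea without stringr, the same extract-and-strip trick works with base R's regex functions (the example strings here are made up):

```r
x <- c("sample 1", "sample 2")
replicate <- regmatches(x, regexpr("\\d", x))  # first digit in each string
name      <- trimws(sub("\\d", "", x))         # drop the digit, trim trailing space
replicate  # "1" "2"
name       # "sample" "sample"
```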
If you have any questions let me know, happy to help.
If you don't want to learn how to set up your own Linux server and install/configure Shiny Server yourself, you might want to consider hosting it at www.shinyapps.io. Free for limited bandwidth across up to 5 sites, but if your site is going to see lots of traffic or you've more than 5 sites to host, you'll likely hit the limits and want to upgrade.
https://orgmode.org/worg/org-contrib/babel/languages/ob-doc-R.html
Create a snippet for each type of chunk you are likely to use. I have one for plots and one for everything else.
Don't edit chunks directly in org-mode, use C-c ' (org-edit-special) to switch to a mini ESS mode
Test run your code from within ESS mode and make sure it works before you execute the block from within org-mode, or it can hang and be hard to debug
Even when you know your code works and you are ready to run your chunk in org-mode, it helps to have your session open in another window so you can watch your code run
Always open your R session in the same directory as your org file if you want in-line plots
I always run all my chunks in the same session, e.g. :session *R*
so you aren't starting from scratch each time
If you want your chunk to output a data table, add :colnames yes
If you want a plot, add :file 1.svg :results output graphics file, then change the filename for each chunk
I just bought R in Action on Amazon. It seems to be well-regarded!
Edit: Also ordered R for Spatial Analysis and Mapping.
I'll semi-endorse Coursera for R. I assume you are referring to their Data Science specialization offered through Johns Hopkins, since they promote it a fair bit and it is more of a curriculum than a one-off course.
I took "Computing for Data Analysis" a while back, which as far as I know is essentially the same course as "R Programming", one of the first courses in the above specialization. I liked the course, but several people have differing opinions. The main criticisms tend to state that if you know absolutely nothing about R, this course is not for you. I tend to agree with that, as I had already used R for a course in university, but I do find the reviews overly harsh overall.
tl;dr. Coursera might be a decent option, but might not be the best if you are an absolute beginner.
I don't know how much this would help with job prospects, but I've got a CompSci background (including lots of work with SQL like you), and I'm taking and thoroughly enjoying Coursera's Data Science Specialization: https://www.coursera.org/specialization/jhudatascience/1?utm_medium=courseDescripTop
I don't live in the Bay Area but you may want to check out meetup.com for R Users groups in your area. That might be a way to meet others and find out about local workshops and classes.
So your best bet is to describe your data and the desired outcome a bit so we can get a better sense of what you're trying to accomplish. My gut instinct is that you're attempting to do something that would be better served in dplyr by using merge. There's a short explanation of something quite similar (I think...) here on StackOverflow. If that's not quite what you're looking for, perhaps you could clarify your starting state and desired output a bit better.
For data.table and ggplot2, basically. Being able to run a query, do some basic data.table analysis, and have a full-blown reactive website with visualizations is something I think is easy to take for granted, especially if someone's never written that stuff from scratch.
Coming from d3js, yeah d3 is more low-level and you have total control, but for LOC-to-quality ratio absolutely nothing can beat shiny + ggplot2. One line of R code that would amount to 200+ lines of d3 plus REST endpoint definitions and coding web service layers. No other language has anything even close to that and the R libraries are designed with extreme quality.
I wrote a DataTables wrapper in PHP back in Feb that would accept either an SQL query or associative array in the constructor and lost a lot of steam after doing the same with just c3js. Shiny does DataTables.js out of the box (not a big deal since it's just a table wrapper and one line of JavaScript, but it's still out-of-box functionality).
HTML wrappers are caveman simple to write, they just take so much time. 99% of the data structures I work with every day are tables, doesn't matter if it's statistical matrices or a storefront product catalog; having a native table data structure to work with is something I cannot give up now that I've gotten used to it. Pandas does it well, in my opinion R does it better. Python handles lists, maps, comprehensions, and generators better, but it's easier to get used to thinking in terms of R lists, vectors, and function application than it is to apply the converse principle to Python.
As for PHP, sure it works, but I'll die happy never seeing another foreach($x as $k => $v)
loop for the rest of my life. I get depressed just looking at it.
I'm happy to see a recommendation for Shiny as the first comment. All other factors being equal, I think Shiny is the best way to go. However, if the users need Tableau like functionality and there will be nobody to maintain an instance of Shiny after the OP leaves, Plotly is another option to consider.
plotly? This may not exactly be what you're looking for. Out of curiosity, why Shiny instead of d3.js? Isn't the purpose of Shiny to make quick-and-dirty user interface apps for R users?
That's fine. Just use RStudio. Almost everyone uses it to run R.
https://www.rstudio.com/products/rstudio/download/
Get familiar with it and it will save you lots of headache in the future.
st_within in the sf package. If you Ctrl+f "Simple Features" on this page, you'll find a cheat sheet for the sf package.
You want to convert the lat/lng coordinates to a POINT geometry column in one data frame, and the region polygons into a POLYGON column in another data frame. Then do st_within(coords_dat, polygon_dat) and it should return the answer to "for each of these points, which polygon is it in?"
Do you know what projection the region polygons are in? If they're WGS84 or NAD83 (if they come from the government they're probably NAD83), then you'll have an approximation within a few meters to true 3D shape of the earth if you assume a flat plane.
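Here's a toy sketch of the whole flow with made-up coordinates and a single unit-square polygon (the object and column names are mine, not yours):

```r
library(sf)

# Points as an sf data frame with a POINT geometry column
coords_dat <- st_as_sf(data.frame(lng = c(0.5, 5), lat = c(0.5, 5)),
                       coords = c("lng", "lat"), crs = 4326)

# One region polygon (a unit square) as an sf data frame
polygon_dat <- st_sf(
  region = "A",
  geometry = st_sfc(st_polygon(list(rbind(
    c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0)))), crs = 4326))

# For each point, which polygon is it in? Returns a sparse index list:
# here the first point falls in polygon 1, the second in none
st_within(coords_dat, polygon_dat)
```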
This definitely sounds like a Hugo question. I'd start by looking at the docs over on their page.
Off the top of my head though, I'd personally have a random/ folder in my public/ directory, and an index.html file with a javascript redirect there. The tricky part is getting the list of pages on your site from which you'll pick one at random. Maybe check into reading in your sitemap.xml and pulling out all <loc> tags without an associated <priority>0</priority> value?
Haven’t used the package, but the output is a LaTeX table, i.e. it will render if copied and pasted into a LaTeX document.
If this is gobbledegook to you, maybe the simplest way to render the table would be through ShareLaTeX. Note - you will also need to edit out all of the ‘##’s.
Here's a relevant stack overflow answer.
The code I used to solve your problem:
# subset the data to get just columns 4-7
df <- df_original[, 4:7]
na_df <- which(is.na(df), arr.ind = TRUE)
df[na_df] <- rowMeans(df, na.rm = TRUE)[na_df[, 1]]
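And a reproducible version of the same idea on made-up data, in case you want to see it work end to end:

```r
# Fill each NA with the mean of the non-missing values in its row
dat <- data.frame(a = c(1, NA, 3), b = c(4, 5, NA), c = c(7, 8, 9))
idx <- which(is.na(dat), arr.ind = TRUE)
dat[idx] <- rowMeans(dat, na.rm = TRUE)[idx[, 1]]
dat$a[2]  # 6.5, the mean of 5 and 8
dat$b[3]  # 6, the mean of 3 and 9
```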
Awesome, I'm doing a beginner's course on Coursera, R Programming, and supplementing it with lots of Google queries. These links look like great parallel tools. Something I notice that really motivates me is trying to use practical things for my practice. For instance, I was able to turn some arbitrary lesson into making a chart of my pace per km for a 10km run. I'm hoping there will be really practical examples in the course, but if you have any suggestions that relate to fitness, marketing etc. (I have yet to start the marketing course) it would be great. I haven't examined all your links in depth yet, so forgive me if there are examples in there.
Thanks!
Hello, I uploaded all the data I have: source files, a short presentation of the assignment, and a script of my code. Here is the link: http://www.filedropper.com/assignment_1 Can we use another way of communication?
I haven't actually taken the specialization. I took Computing for Data Analysis in 2013 and started taking, but dropped Data Analysis.
How do you find the specialization?
Lastly: as you could hopefully figure out on your own, dplyr did not install successfully. As to why that might be, I don't know; I don't use Windows for such things.
library(stringr)
library(dplyr)
df <- df %>%
mutate(
    mycol = ifelse(str_detect(mycol, "\\)$"), mycol, str_c(mycol, "(1a)"))
)
Check out the Rstudio stringr cheat sheet at https://www.rstudio.com/resources/cheatsheets/
If you're interested in computer science, make sure you take Harvard's Intro to Computer Science (CS50). It's free. https://www.edx.org/course/cs50s-introduction-to-computer-science
Start with HTML and CSS. Not computer programming languages, but they're good for beginners and will introduce you to concepts that will help you with other languages.
Once you learn the basics of those two, move to Python. https://www.freecodecamp.org
I made an attempt, OP.
The resulting plot: https://i.imgur.com/f9EctxH.png
And the code (based on this):
library(magrittr) # For the pipe operator: %>%
library(ggplot2)

# bunch of tweets obtained using library(twitteR): data
df <- data.frame()

# Collect all created dates into a data frame:
for (row in data) {
  df <- rbind(df, data.frame(created = row$created))
}

# Create new columns for date and time:
df$TweetDate <- as.Date(df$created)
df$TweetTime <- format(df$created, format = "%H:%M:%S") %>%
  as.POSIXct(format = "%H:%M:%S")

df %>%
  ggplot(aes(x = TweetDate, y = TweetTime)) +
  geom_point() +
  scale_y_datetime(date_labels = "%H:%M")
As someone new to R, these cheatsheets have been helping me a ton for remembering what functions exist and how to use them.
https://www.rstudio.com/resources/cheatsheets/
For table manipulation check out the ones on dplyr and tidyr.
Try adding layout(xaxis = list(autorange = "reversed"))
to your plot_ly call.
https://plot.ly/r/axes/#reversed-axes
plot_ly(topNsales, x = ~town, y = ~avg_resaleprice, type = 'bar') %>%
  layout(autosize = F, margin = m) %>%
  layout(xaxis = list(autorange = "reversed"))
The above is not ideal as only a few lines are pulled in. The below gets you the full chat as of the page load and saves it locally so it can be pulled in and parsed with rvest. Unfortunately, R has some character issues with the formatting into a string and fucks it up. Python probably excels at handling that kind of thing.
rD <- rsDriver(browser = "phantomjs")
remDr <- rD[["client"]]
remDr$navigate("https://poloniex.com/trollbox")
source <- remDr$getPageSource()

sink("source.html")
source[1][[1]]
sink()
file.show("source.html")
I do not think rvest will be able to natively grab the content of the chat since it renders via javascript. You will need the javascript to render the HTML and then grab it, which can be a pain in the ass. RSelenium + PhantomJS should be able to get you there...
library(RSelenium)
rD <- rsDriver(browser = "phantomjs")
remDr <- rD[["client"]]
remDr$navigate("https://poloniex.com/trollbox")
remDr$findElement("id", "trollboxTable")$getElementText()
Have an rvest question... say I want to grab just one of those IDs - http://imgur.com/a/Jo1Nc. I tried copying the XPath and plugging it in as follows:
link = "https://poloniex.com/trollbox"
#Get the html source of the URL
hlink = read_html(link)
# Grab text from a specific ID
hlink %>% html_nodes(xpath = '//*[@id="14946443"]') %>% html_text()
But that's not working... if you're familiar with Rvest, should the xpath call be different?
Whether installing the binary or cask version of R, OP will need to install XQuartz if they want to make use of the X libraries, since X11 isn't included with macOS.
The cask version of R comes with full support--it's the same as if OP went to CRAN and installed the .pkg file. The binary R package from Homebrew, yes, is relatively limited. However, the cask installs R.app, which is the same as the one from the R for Mac OS X CRAN page, here: https://cran.r-project.org/bin/macosx/.
If you go to homebrew-cask/r.rb, you can see that the cask installs from the same url as above.
Hi! Here is a code template to insert the icon next to tab names.
navbarPage("App title",
  tabPanel("Buyer Analysis", icon = icon("bar-chart-o")),
  ...
)
There are a couple of websites with all the available icons you can use. Just insert the name of the icon you want in icon = icon("....")
The website where I got these icons: https://fontawesome.com/icons?from=io
This is an HTML feature and you can easily use it in Shiny without any additional packages
Maybe I'm misunderstanding what you're trying to do. Can you elaborate a bit more on the purpose of those scripts?
A package is just a set of functions that make the usage of R easier for common data analysis tasks. They will need to be called via line commands. The only way to use UI with R is Shiny.
I would suggest going through this webinar to get acquainted with the concept of Shiny: http://shiny.rstudio.com/tutorial/video/
That sounds interesting, I am going to look into a method of saving the graphs. I could actually just save the locations the user has selected, and use them as inputs to generate the graphs on the fly.
One issue with data storage on shinyapps.io (which is what I am using) is found here: http://shiny.rstudio.com/articles/share-data.html
> set.seed(i)
> af_shuffled <- af[,sample(ncol(af))]
> cov_shuffled <- cov[,sample(ncol(cov))]
Just a heads up, the following two blocks will produce different results (learnt it the hard way):
set.seed(i)
af_shuffled <- af[,sample(ncol(af))]
cov_shuffled <- cov[,sample(ncol(cov))]
versus,
set.seed(i)
af_shuffled <- af[,sample(ncol(af))]
set.seed(i)
cov_shuffled <- cov[,sample(ncol(cov))]
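A quick self-contained demo of why (the seed value is arbitrary): resetting the seed makes the next draw repeat, while not resetting lets the RNG stream move on.

```r
set.seed(1)
a1 <- sample(10)
b1 <- sample(10)   # second draw: the stream has moved past a1

set.seed(1)
a2 <- sample(10)   # repeats a1 exactly
set.seed(1)
b2 <- sample(10)   # repeats a1/a2, NOT b1

identical(a1, a2)  # TRUE
identical(a2, b2)  # TRUE
```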
https://tio.run/##K/r/vzi1RK84NTVFw9DAQJOrODG3ICdVIye1pCS1qFjHFC7i4xoS4hoUDBLhIqgFmzzCgP//AQ
A switch case (or dplyr case_when) can do it with less typing.
I'm also a big fan of named vectors. The whole thing can also be easily vectorised, like this:
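For instance, a named-vector lookup (the mapping here is invented) recodes a whole vector in one subsetting operation:

```r
lookup <- c(lo = "low", md = "medium", hi = "high")
x <- c("hi", "lo", "hi", "md")
unname(lookup[x])  # "high" "low" "high" "medium"
```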
Check out Orange Data Mining. While it isn't necessarily an R language tool, it is certainly worth having in your data science toolkit. Specifically relevant is the interactive decision tree visualization widget.
Here's a link for making one with the plotly package in R.
https://plot.ly/r/choropleth-maps/
Here's one with leaflet. It seems more straightforward to me.
I just wanted to add on here that as you learn more R you're going to want to learn more packages. The stringr package above is fantastic for string manipulation. I recommend going through a few examples when you have time.
stringr is really just a wrapper for what programmers call regular expressions. Regular expressions, or regexps, are a syntax for parsing through strings and finding complex matches. As you get more technical with R I highly recommend learning about them for when stringr doesn't do the trick. RegExr is a good resource.
I had an issue installing packages for python using anaconda after updating to Catalina as well, drove me up the wall. Anaconda has an article about this type of issue and how to fix it, at least it worked for me. I recommend giving it a shot if you haven’t already.
Kinda new to it, but I think dplyr or data.table have functions to do it (maybe merge and spread)
https://www.rstudio.com/resources/cheatsheets/
There are also dcast and melt, but I forget their package, and dcast is harder to use.
Hey, don’t worry, I’ve been where you are. Try R using RStudio and saving the workspace as a project. Essentially, it just makes sure that all the files you create while working on an analysis live in the same spot and reopen when you close RStudio.
Also, you may find R Markdown useful for your project. It allows you to write code and text side by side. I used it all the time in graduate school for homework. This cheat sheet has everything you need: https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf
Maybe this webinar will prove to be useful? https://www.rstudio.com/resources/webinars/thinking-inside-the-box-you-can-do-that-inside-a-data-frame/
If you want to apply the same factor levels across all the variables it should be fairly straightforward using the tools shown in that webinar. If the levels are different for each of the variables, it will be a lot harder.
Okay, I grabbed the data you were working with and I would do something like this to make things easy. First, create a new dataframe to just have a state factor with a deaths numeric variable:
library(tidyverse)
newdata <- clean_data %>%
  group_by(state) %>%
  summarise(deaths = n())
Then, use filter() to filter it however you want (see here for ideas if you want something more intense than just < 100):
newdata %>%
  filter(deaths > 100) %>%
  # reorder to sort by ascending deaths
  ggplot(aes(reorder(state, deaths), deaths)) +
  geom_bar(stat = "identity")
Happy to walk you through any of the code if it doesn't make sense.
A little late to the party, but if you're still interested, you can load your igraph object (g) into Cytoscape (a fantastic, free, Java-based network visualisation GUI) using the CytoscapeRPC plug-in. You can then manipulate the object in there quite easily using the in-built Vizmapper. Can provide examples if needed.
okay, see how slackr is sending text messages:
https://github.com/hrbrmstr/slackr/blob/0bf0e3c7cac50669a41ae18b9c5c94044ed5240a/R/text_slackr.r
resp <- POST(url = "https://slack.com/api/chat.postMessage",
             body = list(token = api_token, channel = channel,
                         username = username, icon_emoji = icon_emoji,
                         text = text, link_names = 1, ...))
> The way ggplot2 introduced a consistent and beautiful grammar of graphics
Just to be clear, Leland Wilkinson introduced the grammar of graphics in his book, The Grammar of Graphics. Hadley turned it into an R package.
I also urge you to consider the tinyverse perspective in contrast to the tidyverse. Lightweight is the right weight.
I used this book and loved it, but it comes with a price tag
I was advised to start with base R and then move to the tidyverse, and I think it was good advice. For base R I recommend https://www.amazon.com/Art-Programming-Statistical-Software-Design/dp/1593273843 ; the book is a little old, but I really like it.
I have this: Machine Learning with R - Second Edition https://www.amazon.com/dp/1784393908/ref=cm_sw_r_cp_api_7TMEybJSEQZED
I reference it often. Basic explanations plus use cases. Includes example code and data sources to get you going.
Not in depth from a math/stat perspective but a great starting point.
This is the one I ended up going with https://www.amazon.com/Machine-Learning-Using-Karthik-Ramasubramanian/dp/1484223330/ref=sr_1_40?ie=UTF8&qid=1484434011&sr=8-40&keywords=Learning+R
Unfortunately I still need a book that teaches R as the Language.
www.r-bloggers.com is my favourite site, if I'm ever looking for a how-to guide I will click their links first.
The only book I have bought on R is R In A Nutshell, but I didn't find it very useful.