I think your reorientation is OK, but you would not use an area chart, because that is meant more for a continuous quantity. The time doesn't necessarily change linearly between each sample. Instead, for discrete samples, your original, reoriented bars, or just distinct scatter/dot plots work best.
For me, the main selling points of Python are:
But the important thing for me is the spirit of open source. That means that the libraries that you are using were made by people like me and you. Your set of custom functions can one day become a universally known library.
Just the thought of that makes me happy and eager to help.
Btw, have you tried http://bokeh.pydata.org/ or http://ipython.org/? There are plenty of nice projects waiting for your contribution, whatever your interest is.
Check out bokeh. It's a nice library that outputs plots in an HTML file and you don't even need to know JS. I think they also offer some support for sliders, buttons and widgets. I think the creators are aiming to build something pretty close to Shiny for R, if not even better.
> I tried to love NumPy for a long time. It has a lot going for it.
> But R. Oh R.
Well, numpy is not really the equivalent of R. If you want to single out just one Python package to put up against R, it would probably be pandas. But if you really want to build an R-equivalent system in Python, you're probably looking at SciPy+numpy+pandas+matplotlib+IPython notebooks at least; arguably you'll need to look at the full spectrum of PyData tools.
Staying in the Python milieu, matplotlib has come a long way. And if ggplot is your thing, then you should look at Bokeh.
Granted, and as I said in another comment here, it's only as of very recently that I'd regard the Python ecosystem as a viable full-spectrum replacement for R, even though the scientific Python community have been working towards that goal for a long time.
Bokeh has Jupyter Notebook integration and has fairly easy methods for getting your data into a plot, either by rolling your own line/circle/etc. glyphs or by using one of their canned charts. You can link multiple plots together so that they all respond in kind to an input on another (e.g. zoom, pan, etc.), assuming they share a common data source. Keyboard interaction might be the only thing I'm not sure it supports.
Seriously? Very few of these deal with large data in any meaningful way. Bokeh is a plotting library for Python and R that has a custom JS runtime library that can handle realtime updates and very large data.
This example, using the Canvas 2D backend, has a few thousand points, but easily goes up to tens of thousands of points: http://bokeh.pydata.org/en/latest/docs/gallery/color_scatter.html
Here's interactively visualizing 100k points on a webGL canvas on top of Google Maps, using 15 lines of Python. The webGL backend scales up to 500k points right now. http://nbviewer.ipython.org/github/bokeh/bokeh-notebooks/blob/master/tutorial/00%20-%20intro.ipynb#Interact-with-a-100k-points
For even larger data, there is a new server backend pipeline that makes it easy to scale to 1 billion points, interactively, through the browser (without using webGL).
Bokeh accepts an AJAX data source, which updates the plot through AJAX requests. It is easy to build Bokeh + microframework apps (and it has been designed with that in mind).
But to be honest, it doesn't hurt to learn some JS basics, because bokeh is built on JS charting libs. And BokehJS is a thing.
Just piggybacking off of this comment since it brings up a lot of fantastic points, and is a fantastic post overall!
Check out Bokeh for some more Python visualization! It even prides itself on trying to be D3.JS for Python!
There's also /r/datasets if you want some pre-made data to work with!
Bokeh. Look at the Server Apps on the gallery: http://bokeh.pydata.org/en/latest/docs/gallery.html#gallery
At the conference, someone pointed out that the Dash github page indicated that the project had been deprecated and was being re-written. Chris answered that a release will be forthcoming sometime early this year. There was no mention as to whether or not the server for this stuff will be open source, or dual-licensed etc.
With Bokeh, everything (front-end and backend server) is BSD licensed open source, with a community of contributors.
To echo what PhaethonPrime says, you'll be okay for most stats as long as you don't need to do any exotic (and also non-bayesian) models.
In terms of the graphics, definitely check out learning some of the "grammar" based plotting libs. This is one area where R still crushes it, but Python's Bokeh is getting interesting these days.
Have you looked at Bokeh? http://bokeh.pydata.org
It's exactly what you're talking about. Python (or Scala or R) based construction of interactive web-based plots and dashboards. No need to have a server, even - everything can embed in a static HTML file. (Although server mode works great for large and streaming datasets.)
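For what it's worth, here is a minimal sketch of the static-HTML route, assuming a reasonably recent Bokeh install. The data and the title are made up for illustration; `file_html` and the `CDN` resources object are part of Bokeh's embed API.

```python
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# Illustrative data only.
p = figure(title="Static embed sketch")
p.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5])

# file_html returns a complete HTML document as a string. BokehJS is
# referenced from the CDN, so no Bokeh server is involved.
html = file_html(p, CDN, "Static embed sketch")
```

Writing that string to a `.html` file gives you something you can open in a browser or send to someone; pan and zoom still work because they run client-side.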
That is way too complicated for the end result you get. Which is why the same thing would only be a handful of lines of code using ipywidgets, holoviews, or bokeh.
It seems like you are basically describing the Bokeh project. You can either host it using their built in bokeh server or embed it with a flask app, which also requires minimal work.
We use plotly offline all the time with python to generate HTML docs we send to people.
There is also http://bokeh.pydata.org/en/latest/
(Not python aside) If you want free(ish) online check out R shiny dashboards and if you want to host your own shiny
Regarding JS, a few comments:
Depending on what you are doing, it might be possible to avoid CustomJS callbacks in favor of real python callbacks by using the Bokeh server. You can see some example Bokeh apps running here: https://demo.bokehplots.com We are also currently working on integrating Bokeh apps and the Notebook better; there should be some preliminary guidance in the upcoming 0.12.4 release.
Even without using the Bokeh server, depending on what you are doing, it might be possible to "transpile" python code into JS automatically if you install Flexx. See these docs: http://bokeh.pydata.org/en/latest/docs/user_guide/interaction/callbacks.html#customjs-with-a-python-function
In general, the availability of CustomJS callbacks is somewhat of a compromise that reflects the limited human resources of the core team and the desire to not let the size of the library spiral out of control into unmaintainability. It's a relief valve, to afford users a path to sophisticated capabilities that may be too narrowly defined to warrant inclusion in the core library. The ability to extend Bokeh with custom extensions is also intended along these lines, but it is my hope that we can soon develop a robust mechanism for discovering and sharing extensions, so that their value can be magnified across the entire community (and so that only a few people have to do the work to write them).
The standard library csv module is fine for reading files - it can give you a streaming iterator so you don't have to load the whole thing into memory. From there, just use matplotlib or Bokeh to build the actual plots.
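A minimal sketch of that streaming approach, assuming a file with a header row and one numeric column. The column name `tempF`, the sample rows, and the helper `stream_column` are made up for illustration.

```python
import csv
import io


def stream_column(lines, column):
    """Yield one column as floats, reading a row at a time."""
    # csv.DictReader is a lazy iterator: it pulls one line from the
    # underlying file object per row instead of loading the whole file.
    for row in csv.DictReader(lines):
        yield float(row[column])


# Stand-in for open("readings.csv", newline=""); any iterable of lines works.
sample = io.StringIO("time,tempF\n0,70.5\n1,71.0\n2,69.5\n")

total = 0.0
count = 0
for value in stream_column(sample, "tempF"):
    total += value
    count += 1

mean_temp = total / count
```

Only the running total and count are held in memory here, so the same loop works on a file far larger than RAM. For plotting you would usually collect (or downsample) the values into a list first.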
Start here: http://bokeh.pydata.org/en/latest/ I use Jupyter and usually output to my notebook when it is something for my analysis or exploration. It does output to HTML really nicely, but the learning curve is stupid. Also look into seaborn for quick visualization. If you want to go fancy, have a look at Highcharts. Combine that with a couple of interesting data sets from Quandl and now you have a project. As for handling it, I work in either Anaconda or Jupyter.
try this to replace rstudio: https://github.com/yhat/rodeo
and this for dplyr: https://github.com/blaze/blaze/pull/484
for ggplot replacement, bokeh has a high-level charts API and a medium-level API that can do cool stuff: http://bokeh.pydata.org/en/latest/docs/gallery/burtin.html
Plotting real-time streaming data with Bokeh is very simple.
The Bokeh library ships with a standalone executable bokeh-server that you can easily run to try out server examples, for prototyping, etc. Streaming data to automatically update plots is very straightforward using bokeh-server. The session.store_objects method can be used to update objects on the server (and consequently on the browser) from your python code.
What you need to do is:
1. Write the Python code

example.py

    import time
    from random import shuffle
    from bokeh.plotting import figure, output_server, cursession, show

    # prepare output to server
    output_server("animated_line")

    p = figure(plot_width=400, plot_height=400)
    p.line([1, 2, 3, 4, 5], [6, 7, 2, 4, 5], name='ex_line')
    show(p)

    # create some simple animation..
    # first get our figure example data source
    renderer = p.select(dict(name="ex_line"))
    ds = renderer[0].data_source

    while True:
        # Update y data of the source object
        shuffle(ds.data["y"])

        # store the updated source on the server
        cursession().store_objects(ds)
        time.sleep(0.5)

2. Run the bokeh-server
$ bokeh-server
3. Run the Python code
$ python example.py
That's all.
http://bokeh.pydata.org/en/latest/docs/user_guide/server.html
First, /r/learnpython might be a better place for this. Or the matplotlib mailing list. Or another stackoverflow question. I also highly recommend just playing with things until you have a specific question.
Second, is the java (javascript?) web interface required? If it's python doing the updating of the web interface would that work too? Also, there are probably other/better ways to communicate data from a python process to a java process.
For the number-of-points question, tick labels are separate from the data points, so you can have whatever labels you want, but I can't remember the specific method for doing this in mpl. Also, I suggest only storing/showing the last X records unless you need to keep all of them for some reason (see numpy.roll when performance isn't an issue). You could easily store all the data on disk, but only show the last 200 in a real-time dashboard. If you do look into storing this information long term, I suggest using comma-separated files (google CSV), as they are used more often than JSON for storing data like this, and numpy and pandas have functions for loading entire CSVs into a numpy array. As for your TypeError exception, your json is a list of dictionaries, so first you need to index the record you want: y = data[0]["tempF"].
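To make those two points concrete, here is a small sketch. The payload is hypothetical (only the `tempF` key comes from the question), and it uses `collections.deque` with `maxlen` for the "last X records" window instead of `numpy.roll`, since that needs nothing outside the standard library.

```python
import json
from collections import deque

# Hypothetical payload shaped like the one described: a list of dictionaries.
raw = '[{"time": 0, "tempF": 70.5}, {"time": 1, "tempF": 71.0}, {"time": 2, "tempF": 69.5}]'
data = json.loads(raw)

# data["tempF"] raises TypeError because data is a list; index a record first.
y = data[0]["tempF"]

# Keep only the most recent readings; the deque drops the oldest automatically.
window = deque(maxlen=2)
for record in data:
    window.append(record["tempF"])

latest = list(window)
```

For a dashboard you would set `maxlen=200` (or whatever X is) and plot `latest` each time a new record arrives.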
If you're confused about how to update the plots with new data, maybe look at the animation operations of matplotlib. You could also easily update a figure that you have by using the set_ydata(...) and set_xdata(...) methods on the line object that plot(...) returns (it's the Line2D object, not the axis). Lastly, if you want to play with python-driven plots in a browser check out bokeh. It's something I've been meaning to get into, but am not yet an expert.
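A minimal sketch of the in-place update, assuming matplotlib is installed. Note that `set_xdata` and `set_ydata` live on the `Line2D` object returned by `plot`, and the numbers here are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# plot() returns a list of Line2D objects; keep a handle to the one we drew.
(line,) = ax.plot([0, 1, 2], [70.5, 71.0, 69.5])

# New readings arrive: swap the data in place instead of re-plotting.
line.set_xdata([1, 2, 3])
line.set_ydata([71.0, 69.5, 72.0])

# The axes limits do not follow the new data on their own, so recompute them.
ax.relim()
ax.autoscale_view()
fig.canvas.draw()

current_y = list(line.get_ydata())
```

In a live window you would drop the `Agg` line and call `fig.canvas.draw_idle()` or `plt.pause(...)` inside your update loop instead.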
Wow this got long.
Others will offer more conventional solutions, but I'd throw out that you can make a web based GUI, and have a local server serve it. I only mention it because I haven't heard much great about python GUIs and html/css/javascript is a very widely used GUI solution that will be more generally useful. It works for local apps and easily extends into the other paradigm of remotely hosted apps.
For instance, compare matplotlib to a python/javascript solution
I haven't tried a plot server like this before, but I was thinking about using this for an analytics dashboard. I like the idea of server-side downsampling of data that happens with the Bokeh Server. I imagine this could be a future feature of Lightning as well?
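To illustrate the general idea of downsampling on the server before anything is sent to the browser, here is a rough sketch. It is a plain min/max bucket decimation written for illustration, not the algorithm the Bokeh server actually uses, and `downsample_minmax` is a made-up helper name.

```python
def downsample_minmax(values, buckets):
    """Reduce values to at most 2 * buckets points, keeping each bucket's extremes."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(values)
    if n <= 2 * buckets:
        # Already small enough; nothing to gain from decimating.
        return list(values)
    out = []
    for b in range(buckets):
        start = b * n // buckets
        end = (b + 1) * n // buckets
        chunk = values[start:end]
        lo, hi = min(chunk), max(chunk)
        # Emit the extremes in the order they occur so spikes keep their shape.
        if chunk.index(lo) <= chunk.index(hi):
            out.extend([lo, hi])
        else:
            out.extend([hi, lo])
    return out


series = [0, 5, 1, 9, 2, 3, 8, 4, 7, 6, 10, 1]
small = downsample_minmax(series, 2)
```

Keeping the min and max of every bucket means peaks survive the reduction, which a plain "every Nth point" stride would not guarantee.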
Our interactive web plotting project, Bokeh, can definitely use contributors at a variety of levels. We could use help from folks just playing with it and contributing examples and tutorials, to more experienced folks helping with fleshing out the matplotlib compatibility layer, to web/JS/design folks to help us make our default templates look prettier.
It's a fun and capable project and we are always eager to get new contributors. If you've ever wished that there was a real "d3 for Python" that didn't treat Python or JS as second-class citizens, you should definitely check us out.
This map was generated using data on Reddit comments on Google BigQuery using Python. A description of how the map was generated is available here.
The data was extracted from Reddit by u/Stuck_in_the_Matrix and eventually put on BigQuery. The data used in this plot can be extracted via [these queries](). Processing the data was handled in Python using Pandas and Scipy, with dimension reduction performed with Scikit-learn and LargeVis. Visualization was done using Bokeh.
Take a look at Bokeh: http://bokeh.pydata.org/en/0.11.1/docs/user_guide/geo.html
and GeoViews (which can use Bokeh): https://www.continuum.io/blog/developer-blog/introducing-geoviews
Neither requires Javascript, and both probably do what you need.
Are you already committed to using D3? If you are not already familiar with both Javascript and graphics concepts of SVG, then D3 is a second hill you must climb, and it's a much bigger one than the "Python CSV parsing" hill.
I would recommend you look at the Bokeh library for creating interactive web graphics from the comfort of Python. For many chart types, it's all you need. You can even build UIs that let the viewer customize how they want to slice and dice the data, like this little example: https://demo.bokehplots.com/apps/crossfilter
The source code for that is entirely in Python, and can be seen here: https://github.com/bokeh/bokeh/blob/master/examples/app/crossfilter/main.py
(Disclaimer: I am one of the creators of Bokeh)
I haven't tried it myself, but bokeh seems to have a large chunk of functionality devoted to streaming data. You could ask this question on the bokeh mailing list - They're quick and very helpful!
OK, thank you for the information. We've refactored the core quite a bit for the 0.11 release, so that things like embedding are much easier. Additionally, there is a new high-level charts API, that is much easier to use for generating many "standard" infographics: http://bokeh.pydata.org/en/latest/docs/user_guide/charts.html
Sorry, can't help you with that. On the other hand, I've done this before with Bokeh: check out the Categorical Heatmap example or the more in-depth HoverTool section.
> (say more than 2000x2000)
Missed that one, sorry.
Matplotlib is probably too slow for 10000x10000 arrays. Bokeh or vispy might be able to handle really big images.
Hmm, that is not something I have tried before (I tend to do just JS-based visualizations) but the Bokeh user guide hints at the ability to hook a wide range of callbacks for interactivity: http://bokeh.pydata.org/en/latest/docs/user_guide/interaction.html#defining-actions
As well as a wide range of configurable visual attributes: http://bokeh.pydata.org/en/latest/docs/user_guide/styling.html
But overall, it does seem to be intended to be used primarily from Python, with the ability to configure/interact with the JS from the Python side. If you are looking for something you can tweak by writing JS code, I would start with the D3 example gallery: https://github.com/mbostock/d3/wiki/Gallery
As others have said, using PyPlot/Matplotlib for this is not that great, unless you get into the bowels of it. You can look at e.g. Chaco if you want a rich-client GUI, or Bokeh, if you want a web browser interface. (See e.g. https://github.com/bokeh/bokeh/blob/master/examples/plotting/server/animated.py for an example of animation with Bokeh)
Click any of the example thumbnails here for interactive web plots. (The code is displayed below the plots.)
http://bokeh.pydata.org/en/latest/docs/gallery.html
This kind of stuff is available for free, on Python, and it's getting better all the time.
your stuff is miles ahead of where I'm at (and at a much larger scale) but very interesting, thanks. I was going to have a play around with Bokeh (http://bokeh.pydata.org/en/latest/docs/user_guide.html) once I get the basics out of the way
Basically, I am trying to create an x-axis for my plot that looks similar and displays "Jan 14". I tried passing a list of strings generated from strftime in that format, but the bokeh line class/function doesn't like that: I get a blank plot. If I try to pass a list of datetime elements for my x-axis, I get an incomprehensible axis, as seen in the top graph here: link
Thanks for the Chaco mention. If you want to do Chaco-like things with Python 3, and don't mind it being a web interface, you should look at Bokeh: http://bokeh.pydata.org
It's still a work in progress but many of the same core ideas are there, with some improvements (e.g. incorporating graphing ideas from d3/protovis).
I am also keeping an eye on Vispy, and would like to see a future when it's the primary non-web backend for Bokeh...
Check out bokeh. From their webpage:
>Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, but also deliver this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.
It is being built by Continuum Analytics. They're a couple of years into it and are on a 0.5 version, so the API may be subject to changes. But it's starting to come together very nicely. They have samples on their site.
I'm kinda tired of the Matplotlib API and have been meaning to check out bokeh for a long time.
Is there a good tutorial somewhere (for those coming from Matplotlib)? The one on their homepage is just a bunch of exercises.
Bokeh: Python library for novel, interactive, animated plots, in the browser. If you wanted to do custom, interesting d3-like plots but don't want to learn Javascript, and if you want to do it on really large datasets, then you are the target user Bokeh was designed for.
(There is also the beginnings of a Scala interface, and there is a Matplotlib compatibility layer that is actively being improved. See http://continuum.io/blog/bokeh-0.4.1)
Bokeh server doesn't run in Windows because it requires redis. The docs suggest that it works out-of-the-box with Anaconda, but apparently that's not the case. You said in a Google Groups thread that "in the next version of Bokeh we will be releasing a plot server that does not rely on redis". Which version would it be and when can we expect it to arrive?