RStudio if you want to do data analysis and visualization.
Also, look into Python, pandas, NumPy and IPython. Additional tools would be Matplotlib and Jupyter Notebook.
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
I took a class in Python a couple of years back. There were no pre-reqs IIRC; it's a very beginner-friendly language. Installing the Anaconda distribution with Jupyter Notebook (it's free) and using this book (https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793) as a guide will give you a good start.
I read Python for Data Analysis cover to cover to get started - but if I could do it all over again I'd do something like Datacamp or Dataquest in parallel with a project.
Andrew Ng's Coursera lectures on machine learning are a good introduction (though that class uses Octave). Introduction to Statistical Learning and Elements of Statistical Learning are the canonical textbooks for that material.
This book is pretty good: https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793
I like Functional Programming in Scala for learning the language. There are a ton of online resources for learning Spark if you want to go the Big Data route. But that's not an absolute necessity for getting into data science.
Khan Academy has some very good courses for Statistics that I'd recommend.
These books are pretty good:
http://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793
Check out /r/pystats for other tutorials and links.
Item | Current | Lowest | Reviews |
---|---|---|---|
Python for Data Analysis: Data Wrangling with Pan… | - | - | 4.1/5.0 |
This book is by Wes McKinney, the author of Pandas. It's a great resource. https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793
http://mathesaurus.sourceforge.net/r-numpy.html
https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793
https://www.amazon.com/Python-R-Users-Ajay-Ohri/dp/1119126762
These are just the first few hits, not a personal endorsement.
This might work for you.
This might be useful https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1449319793
Sorry for the misunderstanding --
>Factset, Bloomberg, Dimensional, AQR
Are not so much resources for dealing with data as employers of data wranglers. I mean Factset and Bloomberg are data providers, but...again, I was suggesting you look for employment with them, not have them teach you.
As for learning:
Sounds like you're still in school. Take as many stats and econometrics (especially "time series" anything) classes as you can, if you want to do data stuff in finance. Or...data stuff at all, really.
Python for Data Analysis is a guide to using a particular programming language (Python) to analyze data. The author developed the main library he showcases (`pandas`) while he was working for AQR, one of the biggest quant hedge fund managers, and open-sourced it when he left. Some of the examples in the book have to do with finance because of this.
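To give a feel for the kind of finance-flavored time series work pandas is used for, here is a minimal sketch. The prices are made-up numbers for illustration only, and the variable names (`prices`, `returns`, `rolling_mean`) are my own, not taken from the book.

```python
import pandas as pd

# Made-up daily closing prices, for illustration only (not real market data).
prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0, 104.0],
    index=pd.date_range("2024-01-01", periods=5, freq="D"),
    name="close",
)

# Simple daily returns: p_t / p_{t-1} - 1. The first value is NaN
# because there is no previous price to compare against.
returns = prices.pct_change()

# 3-day rolling mean of the closing price (NaN until 3 values are available).
rolling_mean = prices.rolling(window=3).mean()

# Collect everything in one DataFrame indexed by date.
summary = pd.DataFrame({"close": prices, "return": returns, "ma3": rolling_mean})
print(summary)
```

Swapping the made-up series for a real price history read with `pd.read_csv` would be the natural next step.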
You might like Quantopian especially if you like Python.
I was ISYE so I'm not sure how much you are allowed to cross over being CS but I would absolutely recommend taking a regression course. ISYE also has some data analysis electives, but to me learning and mastering regression is a must.
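If you want to poke at regression before taking a course, a simple linear fit can be sketched in a few lines with NumPy. This is only a toy ordinary least squares example on made-up data, not anything from a particular class; the numbers and variable names are assumptions for illustration.

```python
import numpy as np

# Made-up data for illustration: roughly y = 2x + 1 with small deviations.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])

# Design matrix with a column of ones so the model includes an intercept.
X = np.column_stack([np.ones_like(x), x])

# Ordinary least squares: find coefficients minimizing ||X @ coef - y||^2.
coef, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = coef

# Fitted values and residuals, useful for checking how well the line fits.
fitted = X @ coef
residuals = y - fitted

print(f"intercept={intercept:.3f}, slope={slope:.3f}")
```

For anything beyond a toy fit (standard errors, p-values, diagnostics), a library such as statsmodels is probably the better tool.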
BBUUTT my biggest recommendation is to start playing with data yourself. I am a "Data Scientist" and graduated from the MS Analytics program at Tech and still to this day I learn the most just from playing around with data sets and trying new techniques or learning new coding tools. Don't wait to take classes to jump in, just go.
Here are some great books to get started doing "data science" in R and Python.
R: Introduction to Statistical Learning (free!!)
Python: Python for Data Analysis
My recommendations, from someone who had to dive right into the deep end for using Python in data science:
Check out the Anaconda Python distribution: it's a Python interpreter, a package manager, and something like 300 of the most commonly used Python packages preinstalled. It comes with the full 'scientific stack' pre-installed, so you'll have access to the modules you'll really need for visualization and data analytics, i.e. NumPy, SciPy, pandas, Matplotlib, Bokeh, scikit-learn, statsmodels, etc.
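A quick way to see which of those packages a given Python environment actually has is to ask the import system. This is just a sketch using the standard library; the `STACK` list and the `available_packages` helper are names I made up for illustration, and note that scikit-learn is imported as `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages mentioned above
# (scikit-learn's import name is "sklearn").
STACK = ["numpy", "scipy", "pandas", "matplotlib", "bokeh", "sklearn", "statsmodels"]


def available_packages(names):
    """Map each top-level import name to True if it can be found in this environment."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in available_packages(STACK).items():
        print(f"{name:12s} {'installed' if found else 'missing'}")
```

In a fresh Anaconda install you would expect every line to say "installed"; in a bare Python install most will likely be missing.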
If you're into making GUI apps for your data analysis routines then Anaconda comes pre-installed with the Qt4 framework and the PyQt4 bindings - so you can write a GUI in Qt with embedded plots/graphs from matplotlib with ease.
Also check out the feature called IPython Notebooks. It's a really cool web-based (local web server) interface for running Python in a 'Matlab'-esque style. You can interweave Markdown text with your Python code, so it can be a great instructive tool where you display your code and analysis routines along with inline plots/graphs as well as text to explain what you're doing and why.