I found this one really useful. I'm a long-time programmer with a solid background in SQL, so it helped to see how pandas mirrors a lot of the functionality that I would ordinarily push to SQL. I was taking these massive text files and inserting them into a database... then using SQL to get the transformations and merges I needed... I was able to remove a fairly heavyweight dependency by dropping the make-a-database step.
I have leaned on SQL for a lot of my previous work but as I've learned more about Pandas I've realized they can both be made to do almost the same things. Now I'm not sure where I come down and I realize that it'll be project-by-project.
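To make the SQL comparison concrete, here is a minimal sketch of a join plus aggregate done in pandas straight from text data, with no database step. The tables, column names, and values are made up for illustration; they are not from anyone's actual project.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the large text files.
orders_txt = io.StringIO(
    "order_id,customer_id,amount\n"
    "1,10,25.0\n"
    "2,11,40.0\n"
    "3,10,15.0\n"
)
customers_txt = io.StringIO(
    "customer_id,name\n"
    "10,Ada\n"
    "11,Grace\n"
)

# Load the text straight into DataFrames instead of inserting into a database.
orders = pd.read_csv(orders_txt)
customers = pd.read_csv(customers_txt)

# Roughly: SELECT * FROM orders INNER JOIN customers USING (customer_id)
joined = orders.merge(customers, on="customer_id", how="inner")

# Roughly: SELECT name, SUM(amount) AS total FROM joined GROUP BY name ORDER BY name
totals = (
    joined.groupby("name", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total"})
    .sort_values("name")
    .reset_index(drop=True)
)

print(totals)
```

Whether this beats a real database probably depends on data size and how often the queries are rerun, which fits the project-by-project point above.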
Take a look at this book. I found it to be really useful.
https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython-dp-1491957662/dp/1491957662/
Econ 490 ml is switching to Python this semester. Prof said he'll likely be using these textbooks:
I picked up this book, written by the guy behind Pandas - Python for Data Analysis
Plus Mode has an awesome Python tutorial on their site that is tailored towards using Pandas within notebooks.
You don't install packages from within IDLE. You do it either in the shell environment with conda install pandas or by using Anaconda's Navigator GUI to manage your conda environment.
Wait, you are using Anaconda, right? If not, you should be: it's essentially the standard distribution for scientific Python and it gives you a single-file "batteries included" installation and package management system (called conda). I'm pretty sure it comes with pandas by default, but even if it doesn't it's very simple to install it (or anything else!) using conda commands in the shell or with Navigator. Go here and click the download button.
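If you're not sure whether pandas actually made it into your environment, a quick check from any Python prompt (IDLE included) could look like the sketch below. The helper name is_installed is just something made up for this example, not part of conda or pandas.

```python
import importlib.util


def is_installed(package_name: str) -> bool:
    """Return True if the named top-level package can be found for import."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("pandas"):
    import pandas as pd

    print("pandas version:", pd.__version__)
else:
    # Installation still happens in the shell or Navigator, not from here.
    print("pandas not found; try running: conda install pandas")
```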
There are lots of learning resources out there, but to learn pandas you could do a lot worse than reading the book written by Wes McKinney, who originally wrote the library. But honestly, the only real way to learn is to have a project and go do it, rinse and repeat; forget bootcamps, tutorials, everything else.
(I'm a computational neuroscientist who's been using Python for ephys analysis, stats, and single-neuron and microcircuit modeling since 2006.)
An excellent book for working with data and manipulating it. It mostly goes through pandas and numpy, and probably scipy too. Any option that teaches you to use those libraries is worthwhile. They will be an unavoidable part of the job if you want to work in DS and ML.
https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1492032646
This book is excellent. A good balance of the theory behind various models, code, and work with the frameworks and libraries most common in industry.
This is also a good resource; the examples are in the R programming language.
And once you know everything above like the back of your hand and the math has become very familiar, you can move on to the hardcore literature. :)
https://www.amazon.com/Elements-Statistical-Learning-Prediction-Statistics/dp/0387848576
Andrew Ng's courses on Coursera are already legendary, and Lazy Programmer on Udemy is quite good.
Kaggle has plenty of datasets you can practice on and compete with if you like. It isn't quite like working on a real project, since the datasets are cleaned up and it's immediately clear what you have to do, but it's certainly good practice if you find it interesting.
I could go on until tomorrow about everything you could try, but this is more than enough to get started. Feel free to DM me if you're interested in something more specific.
The best thing you can do is make time to work on projects and actively learn. Since a lot of my job is related to data, I use pandas on a regular basis. I would say that I study about 3-5 hours a week and script/write code for about 1 to 2 hours a week. I've been studying for about 4 months and it's been time really well spent. Below is what I have spent money on in the last 2 years
I'll reply more in a bit, but I would highly recommend this book to get started: https://www.amazon.com/Python-Data-Analysis-Wrangling-IPython/dp/1491957662
It's by the creator of pandas. I still reference this book often -- it's quite useful.
Python for Data Analysis is a good book for exactly that.
I'd go with something more focused on numpy/pandas/matplotlib. "Python for Data Analysis" is excellent - https://www.amazon.co.uk/Python-Data-Analysis-Wes-Mckinney/dp/1491957662/ref=sr_1_1?keywords=python+data+science&qid=1560804719&s=gateway&sr=8-1 and is how I started with Python (after years of using IDL for my PhD/post-doc).
Python for Data Analysis is worth a read if you are interested in data science at all.
READ THE "PYTHON FOR DATA ANALYSIS" BOOK.
Yes, I shouted. It's that important.