Explanation in this article: https://365datascience.com/data-science-vs-ml-vs-data-analytics/
But if you need an entire article to explain a dataviz (or a flow chart venn diagram?), your viz has failed.
I started delving into ML near the end of senior year in high school. As for my Maths courses, to be honest, they were "maxed out": basically everything my school had to offer for a student of my level. Not to mention, I was already getting my hands dirty with Linear Algebra, Multi-variable Calculus and Number Theory over the high school years. As a result, fortunately, the "holes" in my Maths for ML/DS/DL were not that big. I filled them in as I moved along.
The resources I used were:
#1 and #2 obviously have great courses on any related topic. I used courses (or should I say playlists, since I watched them on YouTube) in #2 as "foundational courses" in Linear Algebra, Multi-variable Calculus, Probability and Statistics. If anything difficult to grasp came my way, I would go to #1. In case #1 was still not enough, I would ultimately go to my Maths teachers. I frequently used #3 to solidify my newfound knowledge. I didn't give links mainly because you're just a search away from the courses.
As for ML/DS/DL, I used 365datascience. They have courses from level 1 to max, I suppose. This one is a paid resource. There are plenty of free resources, no doubt. Unfortunately, I do not have enough insight into them. The best thing I can suggest is to go to the respective subreddits' wikis. If you don't find anything useful, just ask the subreddit and you will definitely be guided by some resourceful folks.
To conclude, this has been my journey so far. It took me years of consistent study and hard work to get where I am right now. So, I would say consistency and patience are the keys. If you have these, the Internet will help you take yourself to the next level. Good luck on your journey!
Edit : removed the unintentional formatting
The information is collected from public LinkedIn profiles, and the data is analyzed with Tableau. You can read more about our research at https://365datascience.com/career-advice/career-guides/data-scientist-2021/ .
It looks like you are using Python. Do you have a git repo? If not, no worries. I can still help.
> I didn't check the assumptions because this is the first project on regression I've ever made, so I do not have experience at all.
OLS regression relies on five primary assumptions. You can find specific detail in this article. You should test each of these assumptions at least visually using plots similar to the ones you have above. Taking the log of `Price` to account for the skew and then looking at a histogram of the resulting vector might yield some interesting insights with regard to its distribution.
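To make the log-transform suggestion concrete, here is a minimal sketch using only the Python standard library. The `price` list is synthetic, right-skewed data standing in for the real `Price` column (an assumption, since the original dataset is not shown), and `skewness` and `text_histogram` are hypothetical helper names.

```python
import math
import random


def skewness(values):
    """Sample skewness: mean cubed deviation over the cubed standard deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def text_histogram(values, bins=10, width=40):
    """Return a crude text histogram, one line per equal-width bin."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would land in bin index `bins`, so clamp it.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    return [
        f"{lo + i * step:12.2f} | " + "#" * round(width * c / peak)
        for i, c in enumerate(counts)
    ]


# Assumption: synthetic right-skewed prices, NOT the poster's data.
rng = random.Random(0)
price = [rng.lognormvariate(12.0, 0.8) for _ in range(5000)]
log_price = [math.log(p) for p in price]

skew_raw = skewness(price)
skew_log = skewness(log_price)

print(f"skew of Price:      {skew_raw:.2f}")
print(f"skew of log(Price): {skew_log:.2f}")
print("\n".join(text_histogram(log_price)))
```

A skewness near zero and a roughly symmetric histogram after the transform would suggest the log scale is the more natural one for modelling.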
Right from the start, it does not look like your predictors have a linear relationship with `Price`. This is assumption #1 in the article.
Likely the normality and homoskedasticity assumptions (#3 in the article) will not hold either, because your response variable skews upwards. You can see this by creating a histogram of `Price`. This will result in your model fitting the data poorly. You likely need to consider a GLM to fit this data. I think a log link function will probably be worth looking into to start.
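As a rough illustration of why the log scale matters, the sketch below compares a straight-line fit to `Price` with a straight-line fit to `log(Price)`. Note that this is ordinary least squares on a log-transformed response, a simpler stand-in and not a log-link GLM itself (a GLM models the log of the mean rather than the mean of the log, and would normally be fitted with a statistics library). The data is synthetic and `pearson_r` is a hypothetical helper.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumption: synthetic data where price grows exponentially with one predictor.
rng = random.Random(1)
x = [rng.uniform(0.0, 4.0) for _ in range(1000)]
price = [math.exp(1.0 * xi + rng.gauss(0.0, 0.4)) for xi in x]

# With a single predictor, R^2 of the OLS fit equals the squared correlation.
r2_raw = pearson_r(x, price) ** 2
r2_log = pearson_r(x, [math.log(p) for p in price]) ** 2

print(f"R^2, Price ~ x:      {r2_raw:.3f}")
print(f"R^2, log(Price) ~ x: {r2_log:.3f}")
```

If the log-scale fit is clearly better on the real data too, that supports trying a model with a log link.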
>I've tried linear regression with a k-fold cv (k=5) on both the original dataset (standardized) and the dataset after the PCA, on both of them I've achieved very poor results..
This is because k-fold CV guards against over-fitting; it does nothing to help a model that under-fits, which is your problem here.
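The under-fitting point can be sketched with a small standard-library example. It fits a straight line to synthetic, clearly curved data (an assumption, not the poster's dataset) and runs 5-fold CV by hand; `fit_line`, `mse`, and `k_fold_mse` are hypothetical helpers. With an under-fit model, training and validation error come out about equally bad, both far above the noise level.

```python
import random


def fit_line(xs, ys):
    """Least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(xs, ys, slope, intercept):
    """Mean squared error of the line on the given points."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_mse(xs, ys, k=5):
    """Return (mean training MSE, mean validation MSE) over k folds."""
    idx = list(range(len(xs)))
    folds = [idx[i::k] for i in range(k)]  # data is already in random order
    train_scores, val_scores = [], []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        vx, vy = [xs[i] for i in fold], [ys[i] for i in fold]
        slope, intercept = fit_line(tx, ty)
        train_scores.append(mse(tx, ty, slope, intercept))
        val_scores.append(mse(vx, vy, slope, intercept))
    return sum(train_scores) / k, sum(val_scores) / k


# Assumption: synthetic quadratic data that a straight line cannot capture.
NOISE_SD = 0.5
rng = random.Random(2)
x = [rng.uniform(-3.0, 3.0) for _ in range(300)]
y = [xi ** 2 + rng.gauss(0.0, NOISE_SD) for xi in x]

train_mse, val_mse = k_fold_mse(x, y, k=5)
print(f"noise variance:  {NOISE_SD ** 2:.2f}")
print(f"mean train MSE:  {train_mse:.2f}")
print(f"mean val MSE:    {val_mse:.2f}")
```

A large gap between the two errors would point to over-fitting; two similar but poor numbers point to a model that is too simple for the data.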
nicceeee that's what i like to hear, i pretty much do the same thing, and do my morning runs to lean myself out.
Udemy has worked well for me if I want to learn something new, and I've always been going back to web/iOS dev, so I think that's the field I'm going to go into. But last year I was trying to force myself to like Python and focus on machine learning/data science. If that's your thing, https://365datascience.com/courses/ is a good place to learn, but it's a tiny bit pricy. It was free during June when covid was pretty bad.
https://365datascience.com/null-hypothesis/
This site says a similar thing. "The concept of the null hypothesis is similar to: innocent until proven guilty."
I'm happy to be corrected about any of this if I got something wrong, I want to be sure I understand this stuff so I don't come off as pretentious, which I apologise for if you felt I was doing that in my first comment. Looking back, I could've worded it better.
https://365datascience.com/career-advice/career-guides/best-degrees-data-scientist/
It's surprisingly difficult to find a good breakdown of the backgrounds of professional data scientists (at least from a 1 minute Google). This doesn't give a full breakdown but does say that Computer Science is the most represented undergraduate degree at 18.3%, with Maths or Statistics behind that at 16.3%.
But the big takeaway there should be that there is a wide range of fields that are good for getting into Data Science (seeing as only 34.6% come from the top 2 areas).
Anything quantitative is generally a good place to start - Physics, Chemistry, Engineering, Economics, etc.
But going by the numbers alone, I'd say Computer Science is probably the one to pick if being a Data Scientist is the goal.
https://365datascience.com is where I picked up the basics after working in a non-technical BA role for a year. The Google DA certificate doesn't go into detail and only scratches the surface very thinly. But unlike 365, Google DA can be finished in 5 days if done full time.
I recommend checking out 365 Data Science. I'm pretty happy with how thorough the program is, and it's quite cheap compared to university while giving almost the same value.
This blog post about research we did a while ago could be useful to you: Can I Become a Data Scientist: Research into 1,001 Data Scientists
365 Data Science has a pretty legit course on SQL and relational database theory. That site is a joke overall, but this course is fairly in-depth, and possibly doable within your timeframe if you buckle down and focus. I completed it myself and can vouch. It isn't free, but FWIW 365 DS constantly offers discounts, so it shouldn't break the bank.
Hello coders!
I am fairly new to the data science field and I was recommended to enroll in the course from 365 Data Science website to learn more about it. I have taken the Programming for Everybody course from Coursera and I LOVED it. I'm looking for a more in-depth and well rounded introduction into data science as a career. Does anybody have experience with 365 Data Science? via https://365datascience.com/
They are having a spring sale so I was thinking it would be a good time to enroll but I want to be sure that the course is worth it first.
Thanks! :)
Ashley
Never taken statistics I see. Even on a normal distribution a 7 is pretty good (depending on how big your standard deviations are).
https://365datascience.com/explainer-video/distribution-in-statistics/
365 Data Science's courses are free until April 15th and they have a short one on Tableau. May be too basic but maybe look into that one and then decide if it's worth it? Will probably only take a few hours and at least give you an idea. Haven't done it though so I can't say anything about quality.
Hi!
Did the same. I started with Coursera Machine learning by Andrew Ng, and it's a great foundation for a beginner.
Let me suggest 365DataScience. Their courses are great for beginners, and I like the way they explain things and give practical tips. They made their full course program free until 15th April. Try it, you will love it.
Happy learning
I'm trying to add this link:
https://365datascience.com/free-access-covid-19/
but it keeps flipping to a request for my WP login.
Also, can I suggest you add an 'education' category?
This is free till the 15th of April because of covid. I really enjoyed their "Git and Github" course. It's short, well explained, and has lots of examples. (I tried some 365 courses on Udemy and wasn't a fan of them, but this is gold).
That's not what distribution means in this context btw. It's a statistical term.
> A distribution in statistics is a function that shows the possible values for a variable and how often they occur.
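As a small illustration of that definition, the sketch below builds the distribution of the sum of two fair six-sided dice (an example chosen here, not one from the thread): each possible value is paired with how often it occurs.

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 36 equally likely outcomes of two fair six-sided dice.
counts = Counter(a + b for a in range(1, 7) for b in range(1, 7))
total = sum(counts.values())

# The distribution maps each possible sum to its probability.
distribution = {value: Fraction(n, total) for value, n in sorted(counts.items())}

for value, prob in distribution.items():
    print(f"{value:2d}  {str(prob):>5}  {'#' * counts[value]}")
```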
Okay fair enough. I'm confused about the difference between correlation and regression though in your first response.
You said regression doesn't tell you if X causes Y or Y causes X etc... but also that "a high t stat helps us distinguish between coincidence and cause". Is that contradictory or am I reading it wrong?
I assumed the point of regression is to show how X affects Y (specifically that order)?
Here's what I found online:
"correlation doesn’t capture causality but the degree of interrelation between the two variables. Regression is based on causality. It shows no degree of connection, but cause and effect... Correlation between x and y is the same as the one between y and x. Contrary, a regression of x and y, and y and x, yields completely different results.
Short answer is no. I imagine a stats or CS degree, supplemented with coursework from the other discipline plus some networking, hard work, and determination, will get you there. It definitely helps to have an advanced degree because DS requires a decent amount of breadth and depth. Plus, recruiters might not even consider people with less than a master's. Here’s some quick DS stats:
19% have a bachelor's, 28% a PhD, 46% a master's