In my last post, I pointed to the Road to Data Science imagined by Swami. I think this road is too long, and it makes no distinction between the basics (which we have to know) and the advanced topics (worth knowing, but not essential).
So I've imagined my own, more efficient way, from fundamentals to advanced skills, to become autonomous in the long quest to be a good data scientist.
Here's the result.
The point I want to make: the fundamentals are the orange and blue lines.
The orange line covers the fundamental theory in mathematics, probability and statistics. The stops on it can be taken in any order.
The blue line covers the tools that turn the orange line into practice. There are more and more tools out there to help data scientists in their work, but we should be careful with this profusion. I want to underline that R (the most powerful environment), Spyder (the integrated environment for Python), a Hadoop distribution (Cloudera, the simplest and the most popular) and visualization tools like Tableau Software and rCharts are the things to know.
The roof, which I called "Advanced machine learning", covers the new standards and the new directions of research; it can be useful to be aware of them for the future.
Enjoy your road to Data Science :)