Data Science Misconceptions

Below is a series of statements that I have heard over time! They might make sense initially, but let's analyze them more deeply and come to a logical conclusion. If the statements are in fact untrue, let's see why!

  • “Once I learn Python / R / Java / MATLAB or some other sophisticated programming language, I can easily get a role in data science.”

This is like expecting to become a chef just by knowing the recipes by heart. Cooking is an art: you are not meant to simply replicate recipes, but to create your own, guided by your taste, your knowledge, your audience and your experience.

Data scientists have to stay practical in their approach. Just knowing the skills is not enough to guarantee a job; you must also bring a logical and practical approach to a problem. For example: first analyzing the problem at hand, then studying and brainstorming all the possible factors affecting it, verifying that measurable data can be obtained for each variable, and finally applying the right analysis technique for YOUR DATA.

  • Data science is just about Predictive Model Fitting!

Absolutely not true! You have to spend a larger proportion of your time formatting and handling the data. By this I mean bringing it into a consistent format, making it readable, addressing missing values, and so on.
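
To make this concrete, here is a minimal sketch of that kind of clean-up. The rows, column names, date formats and missing-value markers are all made up for illustration, and filling gaps with the median is just one possible choice. It uses only the Python standard library so it runs anywhere; in practice you would probably reach for a library like pandas.

```python
from datetime import datetime
from statistics import median

# Hypothetical raw records: mixed date formats, untidy text, missing incomes.
raw_rows = [
    {"date": "2021-03-01", "city": " Delhi ", "income": "52000"},
    {"date": "01/04/2021", "city": "delhi", "income": ""},
    {"date": "2021-05-15", "city": "MUMBAI", "income": "61000"},
    {"date": "15/06/2021", "city": "Mumbai ", "income": "N/A"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")   # assumed formats in this toy data
MISSING = {"", "N/A", "NA", "null"}       # assumed markers for "no value"


def parse_date(text):
    """Try each known format and return an ISO date string, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean(rows):
    """Bring dates and city names to one format and fill missing incomes."""
    cleaned = []
    for row in rows:
        income = row["income"].strip()
        cleaned.append({
            "date": parse_date(row["date"].strip()),
            "city": row["city"].strip().title(),
            "income": None if income in MISSING else float(income),
        })
    # Fill missing incomes with the median of the known values.
    known = [r["income"] for r in cleaned if r["income"] is not None]
    fill = median(known)
    for r in cleaned:
        if r["income"] is None:
            r["income"] = fill
    return cleaned


cleaned_rows = clean(raw_rows)
for r in cleaned_rows:
    print(r)
```

Even for four toy rows, the cleaning code is longer than a typical model-fitting call would be, which is roughly the point being made here.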

Feature selection takes up a lot of your time. Feature selection is when the data scientist analyzes the variables to decide which ones to let go and which ones to include in the final model. (It is not the same thing as feature scaling, which means rescaling variables to a comparable range.)
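
Deciding which variables to keep and which to let go, as described above, could be sketched like this. It is only a rough illustration: the data, the variable names and the 0.3 cut-off are invented, and ranking variables by their correlation with the target is just one simple filter among many possible techniques.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


# Hypothetical candidate variables and the target we want to explain.
features = {
    "hours_studied": [1, 2, 3, 4, 5, 6],
    "shoe_size": [8, 6, 9, 7, 8, 7],
    "school_id": [3, 3, 3, 3, 3, 3],   # constant, carries no information
}
exam_score = [52, 55, 61, 64, 70, 74]

THRESHOLD = 0.3  # assumed cut-off, chosen only for this example

scores = {name: pearson(values, exam_score) for name, values in features.items()}
selected = [name for name, r in scores.items() if abs(r) >= THRESHOLD]

for name, r in scores.items():
    print(f"{name}: r = {r:.3f}")
print("keep:", selected)
```

A filter like this only catches straight-line relationships one variable at a time, so treat it as a starting point for the analysis rather than the final word on which variables matter.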

You also need to know visualization techniques, so that you can explore the data and convey your point easily.

  • More Data relates to increased Accuracy :)

This is not true! The basic idea is that the data has to be analyzed properly in order to make sense of it. Increasing the quantity will not by itself deliver quality unless we use techniques suitably tailored to the data we are using. Therefore, more importance has to be given to the tools and tests we apply to the kind of data we have than to the quantity of the data.
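
Here is a small experiment that illustrates the point, under made-up assumptions: the data is simulated from a curve (y = x² plus a little noise), and the "wrong" technique is a plain straight-line fit. The straight line is given 20,000 points, while a fit that uses the right feature (x²) gets only 20. The error is measured on the same data used for fitting, which is good enough for this toy comparison.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def mse(xs, ys, a, b):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Simulated data: y = x^2 plus small Gaussian noise (an assumption)."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    ys = [x ** 2 + rng.gauss(0, 0.05) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so the run is repeatable

# Lots of data, unsuitable technique: straight line through a curve.
xs_big, ys_big = make_data(20000, rng)
a, b = fit_line(xs_big, ys_big)
mse_big_wrong = mse(xs_big, ys_big, a, b)

# Very little data, suitable technique: regress y on the feature x^2.
xs_small, ys_small = make_data(20, rng)
zs_small = [x ** 2 for x in xs_small]
a, b = fit_line(zs_small, ys_small)
mse_small_right = mse(zs_small, ys_small, a, b)

print(f"20000 points, straight line: MSE = {mse_big_wrong:.4f}")
print(f"   20 points, x^2 feature:   MSE = {mse_small_right:.4f}")
```

In this setup the 20-point fit ends up with a far smaller error than the 20,000-point one, because no amount of extra data lets a straight line follow a curve.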

  • Data Collection is Easy!

In no way is this the case. There are primarily two types of data: Primary and Secondary data.

Now, for primary data, there are several ways of collecting data from people: telephone interviews, digital forms, digital questionnaires, etc. Each method has its own pros and cons. As an analytical mind (brainstorming on the solution to the problem), you have to make a wise decision about which data collection technique suits your purpose. (P.S. You don’t have to follow what others have been doing; go with what suits your analysis and situation.)

Even for secondary data, a lot of time goes into finding the right, reliable source for the data. And once you have it, there is the harder choice of which variables to keep and which to eliminate (again, we have some sophisticated tools for that, but this is one major decision you may face).

  • You need a PhD/a highly Sophisticated educational Qualification to get a Data science job!

No, this depends on the role. There are a lot of entry-level opportunities, and these do not require a PhD.

Hope this helped you look at the Data Science picture more clearly.

All the best!