Myths about Data Science: A must-know for all Data Science enthusiasts


1. Only coders/programmers can become data scientists

No, that is not correct. Anyone who has basic programming skills in Python or R, or who is at least willing to learn basic programming, can enter this field. My suggestion is that people with an engineering or software background choose Python, while people who want to move into data science from a non-engineering background such as arts, commerce, or science may prefer R. I am not saying that people from a non-technical background cannot learn Python; the basics and the algorithms may be a bit harder to grasp at first, but if they are ready to learn, there is no issue, and they can pick either Python or R. I have explained in another article which of the two I consider the better choice (Python); you can refer to that article for a better understanding.

2. Data scientists are masters of all technologies

No. The fact is that you need a working knowledge of the basic data science skills; you are not required to be an expert in all of them. As the famous line goes, be a "jack of all trades": you should have a basic idea of the terminology, the algorithms and how to apply them to data, and basic statistics. No one can master everything in this era, especially when technology keeps growing. So I would say you must be familiar with the terminology, the logic, and their uses, and above all with how to apply them to your own data.

3. Data Science is all about tools and technology.

No, it is not. Data science is not just about tools and technology. Writing a few lines of code, running an algorithm, and getting good accuracy is not data science. You should understand how the algorithm works, know how to interpret its results, and be able to choose the algorithm that best fits the particular problem and judge whether that choice is correct.
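To make the point about accuracy concrete, here is a minimal Python sketch. The labels are made-up toy data (95 negatives and 5 positives), and the "model" is a hypothetical one that always predicts the majority class. It is only an illustration, not a recommended way to evaluate a real model.

```python
# Hypothetical toy labels: 95 negatives (0) and 5 positives (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def recall(true_labels, predicted_labels, positive=1):
    """Fraction of the actual positives that the model found."""
    actual_positives = sum(t == positive for t in true_labels)
    found = sum(
        t == positive and p == positive
        for t, p in zip(true_labels, predicted_labels)
    )
    return found / actual_positives if actual_positives else 0.0


acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
print(f"accuracy = {acc:.2f}")
print(f"recall   = {rec:.2f}")
```

The accuracy looks excellent, yet the model never finds a single positive case. Reading a result like this correctly, and choosing a metric that suits the problem, is the kind of understanding that goes beyond running the tool.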

4. Data analysts and data scientists do the same work


“A data scientist is someone who can predict the future based on past patterns whereas a data analyst is someone who merely curates meaningful insights from data.”

“A data scientist job roles involves estimating the unknown whilst a data analyst job roles involves looking at the known from new perspectives.”

“A data scientist is expected to generate their own questions while a data analyst finds answers to a given set of questions from data.”

“A data analyst addresses business problems but a data scientist not just addresses business problems but picks up those problems that will have the most business value once solved.”

“Data analysts are the one who do the day-to-day analysis stuff but data scientists have the what ifs.”

This is what Abraham Cabangbang, Senior Data Scientist at LinkedIn, said about the difference between a data analyst and a data scientist:

“It’s definitely a gray area. At my previous company I did both analyst and Data scientist jobs and as an analyst we were more customer facing; the tasks we did were directly related to the tangible business needs—what the customers wanted/requested. It was very directed. The scientist role is a little more free form. The first thing I did as a data scientist is work on building out internal dashboards, basically surfacing information that we were tracking on the back end, but weren’t being used by the data analysts for any reasons; for example, we might have lacked the infrastructure to display it, or the data was just not very well processed. It really wasn’t anything tailored out from a customer need, but came from what I noticed the analyst team needed in order to do their job.”
