
Myths About Data Science - A Must-Know for All Data Science Enthusiasts


1. Only coders/programmers can become data scientists

No, that's not correct. Anyone who has basic programming skills in Python or R, or who is at least able to learn basic programming, can enter this field. My suggestion is that people with an engineering or software background choose Python, while people moving into data science from a non-engineering background such as arts, commerce, or science may prefer R. I am not saying that people from a non-technical background cannot learn Python. The basics and the algorithms may be a bit harder to grasp, but if they are ready to learn, there is no issue, and they can pick either Python or R. I have explained which of the two I consider the better choice (Python) in another article, which you can refer to for a better understanding.

2. Data scientists are masters of all technologies

No. The fact is that you need a working knowledge of the basic data science skills; you are not required to be an expert in every one of them. As the famous line goes, "Jack of all trades": we should have a basic idea of the terminology, of the algorithms and how to apply them to data, and of basic statistics. No one can master everything in this era, especially when technology keeps growing. So you must be familiar with the terminology, the logic, and their uses, and, most importantly, know how to apply them to your own data.

3. Data science is all about tools and technology

No, it is not. Data science is not just about tools and technology. Running a few lines of code, executing an algorithm, and getting good accuracy is not data science. You should understand the algorithm, know how to interpret its results, and be able to choose the algorithm that best suits the particular problem and judge whether that choice is correct.
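
To see why good accuracy alone can mislead, here is a minimal Python sketch. The data is a made-up toy set (95 negative cases and 5 positive cases), and the "model" is a hypothetical one that always predicts the majority class; neither comes from a real project. It only illustrates that a high accuracy number still has to be interpreted.

```python
# Hypothetical toy data (an assumption for illustration only):
# 95 negative cases (0) and 5 positive cases (1).
y_true = [0] * 95 + [1] * 5

# A "model" that ignores its input and always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of the actual positive cases that the model found."""
    actual_pos = sum(t == positive for t in y_true)
    true_pos = sum(
        t == positive and p == positive for t, p in zip(y_true, y_pred)
    )
    return true_pos / actual_pos if actual_pos else 0.0


print(f"accuracy: {accuracy(y_true, y_pred):.2f}")  # accuracy: 0.95
print(f"recall:   {recall(y_true, y_pred):.2f}")    # recall:   0.00
```

The model scores 95% accuracy, yet it never finds a single positive case. If the positives were fraud cases or sick patients, this model would be useless, and only someone who looks beyond the accuracy number would notice.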

4. Data analysts and data scientists do the same work


“A data scientist is someone who can predict the future based on past patterns whereas a data analyst is someone who merely curates meaningful insights from data.”

“A data scientist job roles involves estimating the unknown whilst a data analyst job roles involves looking at the known from new perspectives.”

“A data scientist is expected to generate their own questions while a data analyst finds answers to a given set of questions from data.”

“A data analyst addresses business problems but a data scientist not just addresses business problems but picks up those problems that will have the most business value once solved.”

“Data analysts are the one who do the day-to-day analysis stuff but data scientists have the what ifs.”

Here is what Abraham Cabangbang, Senior Data Scientist at LinkedIn, said about the difference between a data analyst and a data scientist:

“It’s definitely a gray area. At my previous company I did both analyst and Data scientist jobs and as an analyst we were more customer facing; the tasks we did were directly related to the tangible business needs—what the customers wanted/requested. It was very directed. The scientist role is a little more free form. The first thing I did as a data scientist is work on building out internal dashboards, basically surfacing information that we were tracking on the back end, but weren’t being used by the data analysts for any reasons; for example, we might have lacked the infrastructure to display it, or the data was just not very well processed. It really wasn’t anything tailored out from a customer need, but came from what I noticed the analyst team needed in order to do their job.”.
