
Posts

Math Skills Required for Data Science Aspirants

Knowledge of this essential math is particularly important for newcomers arriving at data science from other professions, especially those who want to transition their career into the data science field (aspirants). Mathematics is the backbone of data science: you need it to work with data, and it plays an important role behind every algorithm. Here I am going to include some of the topics that are important if you don't have a math background.

1. Statistics and Probability
2. Calculus (Multivariable)
3. Linear Algebra
4. Methods for Optimization
5. Numerical Analysis

1. Statistics and Probability

Statistics and probability are used for visualization of features, data preprocessing, feature transformation, data imputation, dimensionality reduction, feature engineering, model evaluation, etc. Here are the topics you need to be familiar with: mean, median, mode, standard deviation/variance, correlation coefficient and the covariance matrix, probability distribution...
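
As a rough illustration of the summary statistics named in this excerpt, here is a minimal sketch using only Python's standard `statistics` module. The two data lists (`hours_studied`, `exam_scores`) and the helper `sample_covariance` are hypothetical, invented for this example; they are not from the original post.

```python
import statistics

# Hypothetical data, invented purely for illustration.
hours_studied = [2, 4, 4, 5, 7, 8]
exam_scores = [50, 60, 60, 65, 80, 85]

# Measures of central tendency.
mean_hours = statistics.mean(hours_studied)
median_hours = statistics.median(hours_studied)
mode_hours = statistics.mode(hours_studied)

# Measures of spread (sample versions, dividing by n - 1).
variance_hours = statistics.variance(hours_studied)
stdev_hours = statistics.stdev(hours_studied)


def sample_covariance(xs, ys):
    """Sample covariance of two equally long sequences (hypothetical helper)."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    products = ((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum(products) / (len(xs) - 1)


# Covariance and the Pearson correlation coefficient derived from it.
cov_xy = sample_covariance(hours_studied, exam_scores)
correlation = cov_xy / (stdev_hours * statistics.stdev(exam_scores))

# 2x2 covariance matrix: variances on the diagonal, covariance off it.
covariance_matrix = [
    [variance_hours, cov_xy],
    [cov_xy, statistics.variance(exam_scores)],
]

print("mean:", mean_hours, "median:", median_hours, "mode:", mode_hours)
print("variance:", variance_hours, "std dev:", stdev_hours)
print("correlation:", correlation)
print("covariance matrix:", covariance_matrix)
```

In practice a library such as NumPy or pandas would usually do this work; the standard library is used here only to keep the sketch self-contained.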

What Data Scientists Spend the Most Time Doing

We generally think of data scientists as building algorithms, exploring data, and doing predictive analysis. However, that is not actually what they spend most of their time doing. As we can see in the graph, data scientists spend most of their time on data cleaning, because the data we get in real-world scenarios is mostly messy. Data can be fed to a model only after it has been cleaned, and an ML model will not work well on messy data. Data cleaning is therefore very important, and data analysts and data scientists are mostly occupied with this task.

60 percent: Cleaning and organising data

According to a study that surveyed 16,000 data professionals across the world, the challenge of dirty data is the biggest roadblock for a data scientist. Data scientists often spend considerable time formatting, cleaning, and sometimes sampling the data, which consumes the majority of their time. Hence, as a data scientist, ensuring that you have access to clean and structured data can save y...

Scope of Artificial Intelligence

Artificial Intelligence has grown exponentially in the past decade, and so have the career opportunities as an AI expert/specialist. But what exactly does an AI expert do? Also, is becoming an expert the only option while pursuing a career in artificial intelligence? I don’t have any programming/coding background. Can I still work as an AI expert? And, what specialization or skill set do I need to acquire to get into this field?

Skills Required to Build a Career in Artificial Intelligence

1. Sound Mathematical and Algorithmic Understanding

To be an ideal candidate in AI, you need to have solid knowledge of applied mathematics and a set of algorithms. Having proficiency in problem-solving and analytical abilities will help you in performing tasks in a more efficient way. You must also have reasonable knowledge of statistics and probability. This helps in understanding various models of AI, like Naive Bayes, Gaussian Mixture Model, etc.

2. Basic Know-How of Programmin...

What is a P-Value?

In data science interviews, one of the frequently asked questions is “What is a p-value?”. According to the American Statistical Association, “A p-value is the probability under a specified statistical model that a statistical summary of the data (e.g., the sample mean difference between two compared groups) would be equal to or more extreme than its observed value.” That’s hard to grasp, yes? Alright, let’s understand what a p-value really is in small, meaningful pieces to make it very clear.

When and how is a p-value used?

To understand the p-value, you need to understand some background and context behind it. So, let’s start with the basics. p-values are often reported whenever you perform a statistical significance test (like a t-test, chi-square test, etc.). These tests typically return a computed test statistic and the associated p-value. This reported value is used to establish the statistical significance of the relationships being tested. So, whenever you see a p-valu...
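
To make the quoted definition concrete, here is a minimal sketch that estimates a p-value for a difference in group means. Note the assumptions: it uses a permutation test rather than the t-test or chi-square test mentioned in the excerpt, because a permutation test needs only the standard library and follows the definition directly (how often is the statistic at least as extreme as the observed one when group labels are arbitrary?). The function name `permutation_p_value` and the `control` and `treatment` data are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate a two-sided p-value for the difference in group means.

    Under the null hypothesis the group labels are interchangeable, so we
    reshuffle the pooled data and count how often the reshuffled difference
    is at least as extreme as the observed one.
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    n_b = len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        # Small tolerance guards against floating-point rounding on ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_permutations


# Hypothetical measurements, invented purely for illustration.
control = [12, 15, 11, 14, 13, 16, 12, 15]
treatment = [14, 17, 15, 18, 16, 19, 15, 17]

p_value = permutation_p_value(control, treatment)
print("estimated p-value:", p_value)
```

A small estimated p-value means a difference this large would rarely arise if the two groups came from the same distribution. It is not the probability that the null hypothesis is true.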

Myths about Data Science - A Must-Know for All Data Science Enthusiasts

1. Only coders/programmers can become data scientists

No, that is not correct. People who have basic programming skills in Python or R, or who can at least learn basic programming, can enter this field. My suggestion is that people with an engineering or software background choose Python as their programming language, while people who want to transition their career into data science from a non-engineering background such as arts, commerce, or science may prefer R. I am not saying that people from a non-technical background cannot learn Python. It is a bit more difficult to understand the basics and the algorithms, but if they are ready to learn, there is no issue, and they can take up either Python or R. I have explained in another article which of the two I consider the better choice (Python); you can refer to that article for a better understanding.

2. Data Scientist are ma...

20 Must-Know Data Science Interview Questions by KDnuggets

The most important questions generally asked by the technical panel:

1. Explain what regularization is and why it is useful.
2. Which data scientists do you admire most? Which startups?
3. How would you validate a model you created to generate a predictive model of a quantitative outcome variable using multiple regression?
4. Explain what precision and recall are. How do they relate to the ROC curve?
5. How can you prove that one improvement you've brought to an algorithm is really an improvement over not doing anything?
6. What is root cause analysis?
7. Are you familiar with pricing optimization, price elasticity, inventory management, competitive intelligence? Give examples.
8. What is statistical power?
9. Explain what resampling methods are and why they are useful. Also explain their limitations.
10. Is it better to have too many false positives, or too many false negatives? Explain.
11. What is selection bias, why is it important and how can you avoid i...

Why Is the Central Limit Theorem Important for Every Data Scientist?

The Central Limit Theorem is at the core of what every data scientist does daily: make statistical inferences about data. The theorem gives us the ability to quantify the likelihood that our sample will deviate from the population without having to take any new sample to compare it with. We don’t need the characteristics of the whole population to understand the likelihood of our sample being representative of it. The concepts of confidence interval and hypothesis testing are based on the CLT. By knowing that our sample mean follows an approximately normal distribution, we know that about 68 percent of sample means lie within one standard deviation of that distribution (the standard error) from the population mean, about 95 percent lie within two, and so on. In other words, we can say: "It all has to do with the distribution of our population. This theorem allows you to simplify problems in statistics by allowing you to work with a distribution that is approximately normal." The CLT is...
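
The 68 and 95 percent figures can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the original post: it draws repeated samples from an exponential distribution (chosen because it is clearly skewed, not normal) and measures how many sample means fall within one and two standard errors of the population mean. The sample size, number of samples, and seed are arbitrary choices.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumed population: Exponential(rate=1), whose mean and standard
# deviation are both exactly 1. It is strongly right-skewed.
POPULATION_MEAN = 1.0
POPULATION_SD = 1.0
SAMPLE_SIZE = 50    # observations per sample (arbitrary choice)
N_SAMPLES = 5000    # number of repeated samples (arbitrary choice)


def sample_mean(n):
    """Mean of n independent draws from the assumed population."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n


sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(N_SAMPLES)]

# Standard deviation of the sampling distribution of the mean.
standard_error = POPULATION_SD / math.sqrt(SAMPLE_SIZE)


def fraction_within(k):
    """Share of sample means within k standard errors of the population mean."""
    hits = sum(
        1 for m in sample_means
        if abs(m - POPULATION_MEAN) <= k * standard_error
    )
    return hits / len(sample_means)


within_one = fraction_within(1)
within_two = fraction_within(2)
print("within 1 standard error:", within_one)
print("within 2 standard errors:", within_two)
```

Even though the individual observations are skewed, the two fractions should come out close to 0.68 and 0.95, which is the approximate normality of the sample mean that the theorem describes.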