
Differentiate between univariate, bivariate and multivariate analysis.

Univariate analysis is a descriptive statistical technique that examines only one variable at a time. For example, a pie chart of sales by territory involves only one variable, so the analysis can be referred to as univariate analysis.
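As a rough illustration of the pie-chart example, the sketch below computes the share of sales falling in each territory, which is the number each slice of such a chart would represent. The territory names and the list of sales are made up for the example.

```python
from collections import Counter

# Hypothetical data: the territory recorded for each individual sale.
sales_territory = ["North", "South", "North", "East",
                   "North", "South", "West", "East"]

# Univariate summary: only the territory variable is involved.
counts = Counter(sales_territory)
total = sum(counts.values())
shares = {territory: n / total for territory, n in counts.items()}

# Print the territories from the largest share to the smallest.
for territory, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{territory}: {share:.1%}")
```

Nothing here relates territory to any other variable; the summary describes the distribution of a single column.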

Bivariate analysis attempts to understand the relationship between two variables at a time, as in a scatterplot. For example, analyzing the volume of sales against spending can be considered bivariate analysis.
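One common way to put a number on what a scatterplot of two variables shows is the Pearson correlation coefficient. The sketch below computes it by hand for sales volume and spending; the figures are invented, and the helper name `pearson_r` is just a label chosen for this example.

```python
from math import sqrt

# Hypothetical paired observations, one pair per period.
spending = [10.0, 20.0, 30.0, 40.0, 50.0]
sales_volume = [25.0, 45.0, 50.0, 80.0, 100.0]


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


r = pearson_r(spending, sales_volume)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 indicates that the two variables tend to rise together; a value close to 0 indicates little linear relationship.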

Multivariate analysis deals with the study of more than two variables at once, to understand the effect of those variables on the response.
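Multiple regression is one example of a multivariate technique. The sketch below fits a least-squares model of sales on two predictors using the closed-form solution for the two-predictor case. The variable names (`spending`, `discount`, `sales`) and the data are assumptions made for illustration, and the data are deliberately constructed as sales = 5 + 2 * spending + 3 * discount so that the fitted coefficients can be checked by eye.

```python
# Hypothetical data, built so that sales = 5 + 2*spending + 3*discount exactly.
spending = [10.0, 20.0, 30.0, 40.0, 50.0]
discount = [1.0, 3.0, 2.0, 5.0, 4.0]
sales = [28.0, 54.0, 71.0, 100.0, 117.0]


def fit_two_predictors(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2 (closed form)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    # Solve the 2x2 normal equations; det is zero if the predictors are collinear.
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


b0, b1, b2 = fit_two_predictors(spending, discount, sales)
print(f"sales = {b0:.2f} + {b1:.2f}*spending + {b2:.2f}*discount")
```

Each coefficient estimates the effect of one variable on the response while the other is held fixed, which is the kind of question a single scatterplot of two variables cannot answer.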

