
Math Skills required for Data Science Aspirants

This essential math is particularly important for newcomers arriving at data science from other professions, especially those who want to move their career into the data science field (aspirants). Mathematics is the backbone of data science: you need it to work with data, and it plays an important role behind every algorithm. Here I am going to cover some of the topics that are important if you don't have a maths background.

1. Statistics and Probability
2. Multivariable Calculus
3. Linear Algebra
4. Optimization Methods
5. Numerical Analysis

1. Statistics and Probability

Statistics and probability are used for feature visualization, data preprocessing, feature transformation, data imputation, dimensionality reduction, feature engineering, model evaluation, etc.

Here are the topics you need to be familiar with: Mean, Median, Mode, Standard deviation/variance, Correlation coefficient and the covariance matrix, Probability distributions (Binomial, Poisson, Normal), p-value, Bayes' Theorem, Classification metrics (Precision, Recall, Positive Predictive Value, Negative Predictive Value, Confusion Matrix, ROC Curve), Central Limit Theorem, R² score, Mean Squared Error (MSE), A/B Testing, Monte Carlo Simulation
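To make a few of these topics concrete, here is a minimal sketch using only Python's standard library. The data values and the helper names `pearson_correlation` and `precision_recall` are made up for illustration; in practice you would more likely reach for a library such as NumPy or scikit-learn.

```python
import statistics

# Hypothetical sample data, invented purely for illustration.
hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0]
exam_scores = [52.0, 55.0, 61.0, 70.0, 72.0]

mean_score = statistics.mean(exam_scores)      # arithmetic mean
median_score = statistics.median(exam_scores)  # middle value of the sorted data
std_score = statistics.stdev(exam_scores)      # sample standard deviation


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def precision_recall(tp, fp, fn):
    """Precision and recall from confusion-matrix counts.

    tp = true positives, fp = false positives, fn = false negatives.
    Precision is the same quantity as positive predictive value.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


correlation = pearson_correlation(hours_studied, exam_scores)
precision, recall = precision_recall(tp=8, fp=2, fn=4)
```

A correlation close to 1 here would suggest that the two made-up columns move together almost linearly.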

2. Multivariable Calculus

Most machine learning models are built with a dataset having several features or predictors. Hence, familiarity with multivariable calculus is extremely important for building a machine learning model.

Here are the topics you need to be familiar with: Functions of several variables; Derivatives and gradients; Step function, Sigmoid function, Logit function, ReLU (Rectified Linear Unit) function; Cost function; Plotting of functions; Minimum and Maximum values of a function
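As a rough illustration of some of these ideas, the sketch below defines the sigmoid and ReLU functions and approximates the gradient of a function of two variables by central differences. The function `f` and the helper `numerical_gradient` are hypothetical examples, not part of any library, and the sigmoid here has no guard against overflow for very large negative inputs.

```python
import math


def sigmoid(z):
    """Sigmoid: squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """ReLU: passes positive values through and clips negatives to zero."""
    return max(0.0, z)


def numerical_gradient(func, point, h=1e-6):
    """Approximate the gradient of func at `point` using central differences."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad


def f(v):
    """Hypothetical function of two variables: f(x, y) = x^2 + 3y."""
    x, y = v
    return x ** 2 + 3 * y


# The exact gradient of f is (2x, 3), so at (2, 1) this should be close to [4, 3].
gradient_at_point = numerical_gradient(f, [2.0, 1.0])
```

Comparing a numerical gradient like this against a hand-derived one is a common sanity check when implementing a cost function.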

3. Linear Algebra

Linear algebra is the most important math skill in machine learning. A data set is represented as a matrix. Linear algebra is used in data preprocessing, data transformation, dimensionality reduction, and model evaluation.

Here are the topics you need to be familiar with: Vectors; Norm of a vector; Matrices; Transpose of a matrix; The inverse of a matrix; The determinant of a matrix; Trace of a Matrix; Dot product; Eigenvalues; Eigenvectors
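The sketch below works through several of these operations in plain Python on small lists, just to show what each one computes. The helper names are illustrative, and `eigenvalues_2x2` only handles 2x2 matrices with real eigenvalues; real projects would normally use a library such as NumPy.

```python
import math


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean (L2) norm of a vector."""
    return math.sqrt(dot(u, u))


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*m)]


def trace(m):
    """Sum of the diagonal entries of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))


def det_2x2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c


def eigenvalues_2x2(m):
    """Eigenvalues of a 2x2 matrix, assuming they are real.

    They are the roots of lambda^2 - trace * lambda + det = 0.
    """
    t, d = trace(m), det_2x2(m)
    root = math.sqrt(t * t - 4 * d)
    return (t + root) / 2, (t - root) / 2


# Hypothetical symmetric matrix used only for illustration.
A = [[2.0, 1.0], [1.0, 2.0]]
eigs = eigenvalues_2x2(A)
```

For a 2x2 matrix the eigenvalues always sum to the trace and multiply to the determinant, which is a quick way to check the result by hand.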

4. Optimization Methods

Most machine learning algorithms perform predictive modeling by minimizing an objective function on the training data, thereby learning the weights that are then applied to the testing data in order to obtain the predicted labels.

Here are the topics you need to be familiar with: Cost function/Objective function; Likelihood function; Error function; Gradient Descent Algorithm and its variants (e.g. Stochastic Gradient Descent Algorithm)
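A minimal sketch of batch gradient descent follows, assuming a one-feature linear model y = w*x + b and mean squared error as the cost function. The data, learning rate and step count are hypothetical values chosen so this toy problem converges; they are not general recommendations.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Fit w and b by repeatedly stepping against the gradient of the MSE."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1 with no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = gradient_descent(xs, ys)
final_cost = mse(w, b, xs, ys)
```

Stochastic gradient descent follows the same update rule but estimates the gradient from one example (or a small batch) per step instead of the full dataset.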

5. Numerical Analysis

It is very good to have numerical analysis knowledge, such as time series analysis and forecasting.
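As a very simple illustration of forecasting, the sketch below predicts the next value of a series as the average of its most recent observations. The function name `moving_average_forecast` and the sales figures are hypothetical, and a moving average is only one basic baseline among many forecasting methods.

```python
def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    recent = series[-window:]
    return sum(recent) / window


# Hypothetical monthly sales figures, invented for illustration.
monthly_sales = [100.0, 110.0, 120.0, 130.0, 140.0]

# Averages the last three values: 120, 130 and 140.
next_month = moving_average_forecast(monthly_sales, window=3)
```

Because it averages past values, this baseline lags behind a series that is trending steadily upward or downward.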

Best YouTube channel to learn from:


Best blog to read:

https://towardsdatascience.com/

 
