Data Science Skills

Below are some of the data science skills that every data scientist must know:

1. Change is the only constant

It’s not about “learning Data Science”; it’s about “improving your Data Science skills”!

The subjects you are learning in grad school right now are important, because no learning goes to waste, but real-world practice is very different from the textbook theory that has been taught for decades. Don’t cram information; understand the big picture instead.

One report claims that 50% of what you learn about IT today will be outdated in 4 years. Technology can become obsolete, but learning can’t. Keep the attitude of a learner: update your knowledge and focus on your skills (get your basics clear), not on the information you memorize!

This will help you survive in this tough and competitive world. (I am not scaring you; I am just asking you to prepare your best!) To become a data scientist, start focusing on the skills below:

Business Skills
Practical Skills like math and statistics
Coding Skills (as I said, learn and don’t cram)
Soft Skills like people skills, social skills, data visualisation, presentation and communication (emotional intelligence: I feel this is the most important one)

2. Essential Data Science Ingredients – Tools for Data Science

Companies employ data scientists to help them gain insights about the market and improve their products. Data science draws on several tools and skills, such as Big Data technologies, UNIX, machine learning, Python, R and SQL. You can start with the last three (Python, R, SQL) right now; it will pay off in the future. These three are among the most widely used today.
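To give you a feel for how SQL and Python can work together, here is a minimal sketch using Python’s built-in `sqlite3` module. The `sales` table, its columns and every number in it are made up for illustration; a real project would connect to an existing database instead.

```python
import sqlite3
from statistics import mean

# Hypothetical sales figures, invented purely for this example.
rows = [
    ("north", 120.0),
    ("north", 80.0),
    ("south", 200.0),
    ("south", 100.0),
    ("south", 150.0),
]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# SQL handles the retrieval and aggregation...
query = "SELECT region, AVG(amount) FROM sales GROUP BY region ORDER BY region"
avg_by_region = dict(conn.execute(query).fetchall())
conn.close()

# ...and Python takes over for any further analysis.
overall_mean = mean(amount for _, amount in rows)

print(avg_by_region)
print(overall_mean)
```

SQL does the grouping inside the database, and Python picks up the result as an ordinary dictionary that you can clean, plot or model further.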

3. Become a jack of all trades and master of none, for the points BELOW only

a. Develop your mind, discover new things and enhance your imagination: in short, start reading
These benefits are reason enough to start reading NOW! Also, make sure you read outside your discipline. It will expose you to a wide range of techniques and problems and make you comfortable jumping feet-first into a new topic.

b. Try to analyze new types of Data
Data can come in any form. Text and images are usually the easiest to interpret, so challenge yourself with video or audio data. Data can also come as a pre-trained model, a relational database or a time series. These last three might be tough for now, but you can start with video and audio.
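If you are curious what working with a time series looks like, here is a small sketch in plain Python. It smooths a series with a moving average, one of the simplest time-series techniques. The `moving_average` helper and the temperature readings are hypothetical, made up just for this example.

```python
def moving_average(values, window):
    """Return the mean of every full window of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical daily temperature readings, invented for illustration.
daily_temps = [1, 2, 3, 4, 5]

smoothed = moving_average(daily_temps, window=3)
print(smoothed)  # [2.0, 3.0, 4.0]
```

The smoothed series is shorter than the original because only full windows are averaged.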

c. Get Inspired with New Ideas and Be Impacted: Talk to new people
Build a network of people you can learn from. A mentor who is further along the road can add direction to your career. Talk to people with a technical background who are outside your field. You will pick up new information, new ideas and gaps where you can create value.

Also, talking to people with a non-technical background will help you enhance your soft skills. You will get a chance to explain the technical details of your own field to them.

d. No need to cry over spilt milk: use Version Control
Imagine having control over your actions, so that with a single click you could undo any of them. Wouldn’t life be perfect then?

I don’t know about real life, but you can do exactly that with your code. If a mistake is made, you can turn back the clock and compare earlier versions of the code to help fix it, while minimizing disruption to the rest of the team.

Version control lets you see exactly what you changed and what you broke. It works well for individual projects too, and practicing it regularly will help you master it.
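To show what “comparing earlier versions” means in practice, here is a rough sketch using Python’s standard `difflib` module. This is not a version control system (a tool like Git also stores history, branches and much more); it only illustrates the comparison step. The file name `stats.py` and both versions of the code are hypothetical.

```python
import difflib

# Two hypothetical versions of the same small script.
old_version = """def mean(values):
    return sum(values) / len(values)
""".splitlines(keepends=True)

new_version = """def mean(values):
    return sum(values) / (len(values) - 1)
""".splitlines(keepends=True)

# Build a unified diff: lines starting with "-" were removed,
# lines starting with "+" were added.
diff = list(
    difflib.unified_diff(
        old_version,
        new_version,
        fromfile="stats.py (earlier)",
        tofile="stats.py (current)",
    )
)

print("".join(diff))
```

The diff points straight at the line that changed, which is usually the first place to look when something that used to work suddenly breaks.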

4. Never settle

I am not talking about the beautifully designed phone with premium build quality and the best technology. I am asking you not to stop at “good enough”.

This means that if a model is not accurate and needs additional tuning, it should not be left at the “good enough” stage. This will be a major factor that differentiates you from other data scientists. Aim for excellence in the task and answer every question you can with the data. Above all, try to add value. If someone else finds your work valuable, I bet it will be the most valuable work for you too.
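Here is a toy sketch of what going past “good enough” can look like. It takes a very simple classifier (predict 1 when a score reaches a threshold) and, instead of accepting the default threshold of 0.5, searches a few candidates for a better one. The data, the `accuracy` helper and the candidate grid are all made up for illustration; a real project would use a proper model and confirm the result on a separate held-out test set.

```python
# Hypothetical validation data as (score, true_label) pairs, invented for illustration.
validation = [
    (0.1, 0), (0.2, 0), (0.3, 0), (0.35, 1),
    (0.45, 1), (0.55, 1), (0.7, 1), (0.9, 1),
]

def accuracy(threshold, data):
    """Share of examples where 'score >= threshold' matches the true label."""
    correct = sum((score >= threshold) == bool(label) for score, label in data)
    return correct / len(data)

# The "good enough" choice: keep the default threshold.
default_threshold = 0.5
default_accuracy = accuracy(default_threshold, validation)

# The extra tuning step: try thresholds 0.05, 0.10, ..., 0.95 and keep the best.
candidates = [i / 20 for i in range(1, 20)]
best_threshold = max(candidates, key=lambda t: accuracy(t, validation))
best_accuracy = accuracy(best_threshold, validation)

print(default_threshold, default_accuracy)
print(best_threshold, best_accuracy)
```

On this toy data the tuned threshold classifies more examples correctly than the default, which is exactly the kind of gain you leave on the table when you stop early.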

