So you want to become a data scientist? Want to dip your toes into the so-called “sexiest profession of the 21st century”? I’d say, by all means, go for it! There is definitely a huge demand for data scientists everywhere. This is because institutions such as businesses, governments and academia, among others, are all realizing the power of data.
Artificial Intelligence, which many now describe as the fourth industrial revolution, has its foundation in Data Science. So if you are curious about this course, I’d like to share with you a few tips I’ve gathered from my journey into Data Science.
But first, what is Data Science all about?
Data science is, at its core, a combination of programming and statistics. So you will have to learn a programming language and build a strong understanding of mathematics and statistics. The most widely used programming language in the field is Python.
You can find documentation on installing and getting started with Python with a simple Google search. Learning the basics of the language, such as lists, dictionaries, strings, functions and object-oriented programming, has also been made easier by the material available all over the internet.
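To give a feel for those basics, here is a minimal sketch that touches each of them at once. The `Student` class, the `summarize` function and the scores are all made up for illustration; they are not from any particular tutorial.

```python
class Student:
    """A tiny class holding a name (string) and exam scores (list)."""

    def __init__(self, name, scores):
        self.name = name
        self.scores = scores

    def average(self):
        """Return the mean score, or 0.0 when there are no scores."""
        if not self.scores:
            return 0.0
        return sum(self.scores) / len(self.scores)


def summarize(students):
    """Build a dictionary mapping each student's name to their average."""
    return {student.name: student.average() for student in students}


# Hypothetical data, just to exercise the code above.
students = [Student("Amina", [80, 90, 100]), Student("Brian", [60, 70])]
summary = summarize(students)
print(summary)
```

If you can read this snippet comfortably, you probably have enough of the language to start exploring packages.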
Beyond the basics, you will be introduced to packages, which you will spend most of your time importing and using. Some of the Python packages you are most likely to use are NumPy, Pandas, Matplotlib, Seaborn and Scikit-learn. These packages help you manipulate data, visualize data, train Machine Learning models and make predictions with them, among many other uses.
Applying Data Science in Your Field
You will often find yourself using more than one package at a time. Depending on your field, you will also pick up more specialized packages. In Geographic Information Systems (GIS), for example, you will end up using tools suited to satellite imagery, such as Google Earth Engine and GDAL.
Why do you need a foundation in statistics, you might ask? Because the whole point of data science is to gain insights from data. You will be handling lots of data, sometimes complicated or even incomplete data, and you will need to discern patterns and summary statistics from it in order to make sound decisions. Some of the simple statistical operations you will perform include finding the mean, median and mode of a given dataset.
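As a rough illustration, the mean, median and mode can be computed with Python's built-in `statistics` module. The sales figures below are invented, and dropping the missing entries is only one simple way of dealing with incomplete data, not a recommendation for every dataset.

```python
import statistics

# Made-up daily sales; None marks days where no figure was recorded.
raw_sales = [12, 15, None, 12, 18, 20, None, 12, 15]

# Simplest possible handling of incomplete data: drop the missing entries.
daily_sales = [value for value in raw_sales if value is not None]

mean_sales = statistics.mean(daily_sales)      # arithmetic average
median_sales = statistics.median(daily_sales)  # middle value once sorted
mode_sales = statistics.mode(daily_sales)      # most frequent value

print(f"mean={mean_sales:.2f}, median={median_sales}, mode={mode_sales}")
```

In practice you would more likely do this on a Pandas DataFrame, but the ideas are the same.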
You will also draw graphs you have come across before, such as line graphs, histograms and bar graphs. You will also work out the probabilities of certain events, and you will be required to make sense of all these outputs and make intelligent decisions.
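To make the probability part concrete, here is a small sketch that counts outcomes to find the chance of rolling a total of 7 with two fair dice. The dice scenario is my own example, chosen only because it is easy to check by hand.

```python
from fractions import Fraction
from itertools import product

# Every possible (first die, second die) pair for two fair six-sided dice.
outcomes = list(product(range(1, 7), repeat=2))

# Keep only the pairs whose faces add up to 7.
favourable = [pair for pair in outcomes if sum(pair) == 7]

# Probability = favourable outcomes / all equally likely outcomes.
p_seven = Fraction(len(favourable), len(outcomes))
print(p_seven)  # 1/6
```

Counting outcomes like this only works when every outcome is equally likely, which is an assumption worth checking on real data.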
Mathematics in Data Science
Over time, the statistical operations, visualizations and probabilities become more difficult as the problems you are solving become more complex. This is where a strong foundation in mathematics comes in handy. (It is also where taking the Foundations in Data Science course at JENGA School can help.)
Mathematical concepts such as advanced calculus, linear algebra and probability are at the core of Machine Learning and Data Science. I would advise you to review these concepts so that you are ready to take on challenging data science projects.
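To show where the calculus actually turns up, here is a minimal sketch of gradient descent fitting a straight line to a handful of points. The `fit_line` function, the learning rate, the number of steps and the data are all assumptions picked for illustration; a library such as Scikit-learn would normally do this for you.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every point with the current w and b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error (this is the calculus).
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Step a little way downhill, against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical points lying exactly on the line y = 2x + 1.
xs = [0, 1, 2, 3, 4]
ys = [1, 3, 5, 7, 9]
w, b = fit_line(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

The two gradient lines come straight from differentiating the squared error, and with more input features the same update is usually written with vectors and matrices, which is where linear algebra comes in.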
Data Science is not all about math!
Don’t get me wrong. As a data scientist, your work involves making sense of all the math and programming. However, you will also need strong communication skills, both verbal and written, as you will be documenting and presenting your findings all the time. These are the skills I would advise anyone to strengthen as they get into data science. Do that and you should be good to go!