Putting the Brakes on Data Science
It’s almost a distant memory. When did all the Data Science hype begin? For me, it started with “Big Data.” This term was coined before we started talking about Data Science as a discipline. The world was starting to produce massive amounts of data and the prospect of being able to do something useful with all this data was interesting. I was curious but just an observer. Where was all this going? Would it grow legs? Then we started talking about Machine Learning and Data Science. And I grew more and more curious. After all, how can you not be curious about something that will be at the core of everything we do and affect so many aspects of our lives, from AI-driven personal assistants to self-driving cars?
I could bear it no more. I could no longer be a passive observer. Might as well jump on the bandwagon, right? Well, short of a personal mentor or an institution of higher education, it has been said that learning through video is a good way to ramp up your skills, a shortcut of sorts. Time to find some. Ok Google.
As a seasoned software developer, I found a programming language designed specifically for statistical computing far more interesting than Python, which I already had some familiarity with. So I took the R route and after some searching, I decided to start with “Data Science and Machine Learning Bootcamp with R.”
I was excited! I was learning R, loading data sets into matrices, plotting histograms, scatterplots, barplots; even interpreting the data! And now it was time to move on to Machine Learning.
Machine Learning. Step 1. Read “An Introduction to Statistical Learning.” Whoa! It was time to go old school. Hit the books. RTFM! And I tried. Not an easy read. Do I really need to know all of this? I skimmed through it, trying to pick up what I could, but would eventually give in, give up, and continue with my online course, putting all the theory that I hadn’t really learned too well into practice. Linear Regression. Logistic Regression. K Nearest Neighbours. Decision Trees and Random Forests.
I got through the course, yay! But it left me with a sinking feeling. I was modeling data with Machine Learning algorithms that I only superficially understood. Seeing examples of how they are used is one thing, but applying this knowledge in the field didn’t seem like a realistic expectation at all.
Then it hit me. I did not have the foundation I needed to move forward successfully. My statistics knowledge was next to nil and I had avoided all other forms of mathematics while pursuing my bachelor’s degree, thinking that I would never really need them for developing software. And I was right. I knew enough math to get by. I did not do much game programming, and had I gone down that path, the laws of physics were already built into game engines. But alas, it had come back to haunt me.
So the obvious question was, would I enjoy my path to being a Data Scientist, knowing that I had to “go back to school” and catch up on all the Linear Algebra and Statistics that I never quite got under my belt? There was only one way to find out. I hit the brakes on Data Science. I decided to enrol at UBC. It was time to be a student again! If I could make it through a Matrix Algebra and Statistics course, I’d know whether I was barking up the right tree. And so the journey began in earnest. I’ve already completed Matrix Algebra. Thumbs up! Perhaps I’ll write about that soon.