Data Sci Adventures - part 1, current state

Published on 2020-11-20 12:15

Learning natural languages (English, Spanish, Swahili, etc.) is tough. Learning two or more at a time is tougher, and learning two from the same group from scratch, like Spanish and Romanian, is strongly discouraged. With formal languages, on the other hand, there's some leniency, as they are a bit simpler.

Disclaimer: I've been working with Python and Flask for almost two years, and I encountered Scala and R in graduate school.

As I wrote in the Changing Directions post, my path takes me across languages and technologies in Data Science, so the choice is partially driven by what others use. R is a language designed for dealing with data, while Python and Scala are multipurpose languages with libraries for Data Science.

Scala is not as prevalent, and I stumbled upon it again because I had a use case in mind in a different field. I say "again" because Scala and I have gone through cycles over the years: I remember how much I used to like it and try to learn it once more. Unfortunately, my learning materials failed me every time. This time is different, though. There are two courses on Udemy (links in Resources) that will help me finally learn it. Pattern matching in particular is a feature I find fascinating.

My R skills aren't where they used to be, and where they used to be was not very good. The only times I used R were for homework in a statistics class. So my current goal is to progress through the R course and get past the R-specific lectures, so that I can start the Machine Learning parts of the R and Python courses at the same time.

As for Python, my skills now include NumPy and pandas, so they are ready for the ML part of the course. In the meantime, there's a side project in Python that needs to be worked on.

Resources

  1. R course
  2. Python course
  3. Advanced Scala
  4. Apache Spark with Scala