Thursday, May 18, 2017

" drag-and-drops for deep learning" and "Why R is Bad for You according to Bill Vorhies via @DataScienceCtrl

A most interesting post by William Vorhies at Data Science Central poses the argument: "Why R is Bad for You". We are led to believe that knowledge of the R programming language is an essential skill for data analysts/scientists - you can use it to do almost anything with data, such as clean, manipulate, visualize, transform, perform statistical tests, and in general look for links/trends/patterns in data. Vorhies says that "R is not the best way to learn data science and not the best way to practice it either".
Image source: The R Project for Statistical Computing

The trouble is that you have to learn R before you can use it. Several of my colleagues and I use R for data analysis in class - in my case to perform statistical tests and analyses such as ANOVA, time series analysis, and Principal Component Analysis. In the new Data Visualization module introduced this past academic year we also used R to plot charts such as boxplots, interactive charts, and flight paths. It is a very powerful language, but none of my classes are programming classes. Students learn how to perform basic programming in R before they come to my class. Usually I give students code in the notes and ask them to use and modify code already written. However, much time in lab work is spent fixing syntax problems - a missing comma can be difficult and frustrating to find and fix for someone who is not comfortable with programming.

Bill Vorhies writes that the "largest employers, those with the most data scientists are rapidly reconsolidating on packages like SAS and SPSS with drag-and-drop". These tools, and the likes of Tableau, are very powerful and much easier to learn and use. Excel is probably the most used data analysis tool - and is getting more powerful. So why learn R?

R is free. Many employers list it as an essential skill in job adverts. Having the ability to program in any language shows that you have a logical mind and that you are good at problem-solving - probably good at deep learning too. If you have already learned how to use R, then keep on using it - but as Vorhies says: "in the commercial world the need to actually code models in R is diminishing". Something for us educators to think about!

Please note: Opinions and comments expressed in this post (and all posts in this blog) are mine alone and do not in any way represent those of anyone else or any institution.
