Thursday, August 29, 2019

"We want to be learners - not students" via @emasie

Elliott Masie.
Image source: The Masie Center.
I love it when Elliott Masie's "Learning TRENDS" newsletter arrives in my Inbox - I have been subscribed for over 20 years (you can subscribe here - he has been doing this since 1997!). There is always something new and thought-provoking in his letters to his subscribers, and today's is no different. Today he tells us about his end of summer learning thoughts, one of which is: "Brand Change - From Student to Learner". He recalls recently dropping out of a University on-line course as the "learning method was non-motivating" despite the content being "pretty good".  He added: "We want to be learners - not students". So - which term should we use for the people in our classes?

Thinking about this, I am reminded of an occasion many years ago when we decided to stop calling students "students"; instead we were to use the word "learner" here in the College. This caused some hilarity in the Students' Union where the abbreviation "SU" was commonly used. Instead of "Students' Union", "Learners' Union" was to be used - cue some jokes about "LU" (pronounced "loo"): are you going to the LU for a game of pool? It didn't catch on. Nevertheless, in some official documentation, such as module or course descriptors, we do use the term "learner". In fact, I do a search in any of my documents to ensure that I have not accidentally inserted the word "student".

My preference is for the word "student" - after a lifetime of using this term, I'll not change now! While Elliott Masie's focus is in the main directed at corporate learning, we have a lot to learn from him in academia. As someone who now teaches on-line, I do find it hard to be motivating in my methods from beginning to end of class - it's hard work! Some of my students do drop out, like Elliott did. I don't have an insight as to why this is, but I suspect non-motivating learning methods are amongst a variety of reasons. Maybe if students were motivated to learn more, they would not drop out?

Learning is best achieved with motivated teachers and motivating learning methods - not just on-line, but in classes too. Here are some research-based strategies for motivating students to learn, provided by the Vanderbilt University Center for Teaching:
  • Become a role model for student interest
  • Get to know your students
  • Use examples freely
  • Use a variety of student-active teaching activities
  • Set realistic performance goals 
  • Place appropriate emphasis on testing and grading
  • Be free with praise and constructive in criticism
  • Give students as much control over their own education as possible.
Source: Vanderbilt University Center for Teaching

The full list and explanations are well worth reading. I can't personally say that I do all this, but Elliott Masie certainly motivates me to adopt better strategies to improve student learning - thanks Elliott! And the Vanderbilt Center for Teaching resource is a good place to start.

Wednesday, August 28, 2019

Data Scepticism & Curiosity

A few days ago I wrote about some advice for Data Analysts given by Brent Dykes in his article Why Companies Must Close The Data Literacy Divide. Amongst other things he advises us all to be both sceptical and curious about any data that we analyse. Critical thinking is a very important skill for students to develop, so I often tell my class to challenge any number that they see. My most basic example is "8 out of 10 cats" prefer Whiskas (according to their owners). How did they test this? Did they set up an experiment with 10 cats or more? What was the sample size? Was there a control group? Did the owners conduct a test? Such a claim would not pass a scientific test today!

Brent Dykes offers up some questions that you can ask to challenge the source and value of data. As he states: "it is important to be able to step back and weigh other less obvious factors that may be influencing the results and its interpretation". Here are his questions, and I recommend them to students analysing any data set:
  • Collection method: Could the method or way in which the data was collected influence the results?
  • Credibility: How credible or reliable is the source of the data?
  • Bias: Is there potential bias from either the data producer or you as the consumer?
  • Truthful: Is the data being manipulated in a way—intentionally or inadvertently—that misrepresents its true meaning?
  • Assumptions: Are there any implied assumptions that could be affecting how the numbers are interpreted?
  • Context: Is there additional context or background information that is missing and needed to properly understand the data?
  • Comparisons: If supplemental data is included for comparison purposes (e.g., period-over-period data), does it offer a fair and relevant comparison? Alternatively, is an obvious comparison missing?
  • Causation: Are you potentially confusing correlation with causation, which represents a direct pattern of cause and effect?
  • Significance: If the data is statistically significant, is it also practically significant?
  • Outliers: Is an outlier important or is it unnecessarily skewing the overall results?
  • Quality: Are you able to distinguish between data that is unusable or that which is still directionally helpful?
Source: Dykes (2017)

Tuesday, August 27, 2019

"Eugene O'Loughlin, your content violated YouTube's Community Guidelines and has been removed"

Oh dear - I am in the bad books at YouTube! Today I received an email from YouTube telling me that "Our team has reviewed your content, and, unfortunately, we think that it violates our harmful and dangerous policy. We've removed the following content from YouTube:" - followed by the details of one of my videos.

I think I can guess that they don't want anyone downloading videos, or perhaps the example video I used on President Obama's Inauguration was in breach of copyright. I published this video on 23rd October, 2013 - so it has taken nearly six years for YouTube to rap me on the knuckles. In this time, the video has amassed 89,918 views, 177 Likes, and 15 Dislikes. Further analytics are not now available, and I have 7 days to appeal this "decision" - no doubt made by an algorithm, not a person (or a "team") - before it disappears completely.

I will not be appealing this decision, even though you can still use Mozilla Firefox (with the Easy Youtube Video Downloader Express add-on) to download videos from YouTube. The screenshots I used six years ago are now dated, as is the choice of PowerPoint 2010. It feels funny to have a video like this classified as "harmful and dangerous", and I am sure I am joining thousands of other content developers who have had videos removed.

What does concern me is if this is the start of something new. By using screenshots from the likes of Microsoft PowerPoint - am I breaching copyright rules? Or am I just being "harmful and dangerous"? I'm also concerned that this may affect the search/recommendation algorithms for my other videos - will a bad boy get less? I'll be watching out for evidence of this over the next few weeks.

I hope that lots of the 89,917 views were useful for the folks that watched, and that many were able to show videos in their presentations. For a couple of years now it has not been possible to embed YouTube videos in a PowerPoint presentation - so my video was a handy workaround. 

A final BIG THANK YOU to all my viewers for watching this video!

Monday, August 26, 2019

Data Interpretation - What Questions to ask

My most common advice to students engaged in interpreting data for an assignment/project is to look for four things:
  1. Find links
  2. Observe trends
  3. Identify emerging patterns (e.g. clusters)
  4. Make predictions
This is basic advice, but it can be a useful starting point when you first open a data file and wonder what to do with it. Brent Dykes, in his article Why Companies Must Close The Data Literacy Divide, offers loads of advice to improve data literacy for all. This includes some advice on Data Interpretation, where he suggests making the following types of observations:
  • Trends: What direction is a trended metric heading (up, down, flat)?
  • Patterns: What repeatable patterns or cycles are you seeing in the data (e.g., seasonality)?
  • Gaps: Are there any obvious gaps or omissions in the dataset?
  • Clusters: Are some values bunched closely together in certain areas?
  • Skewness: Are values noticeably concentrated or skewed more to one side than another?
  • Outliers: Is there a data point that is detached or far removed from the rest of the data points?
  • Focus: Has something in the chart or table been emphasized to draw attention to it? Is it obvious why part of the data was highlighted?
  • Noise: Is there any extraneous data included that detracts from the main message of the chart?
  • Logical: Does the data help to answer a specific business question? Does the data support a proposed conclusion or argument?
Source: Dykes (2017)

These are really simple and great suggestions that I will now add to my shorter list. Students will naturally be curious about any dataset that they use, and many won't need a list such as the one above to get going. Nevertheless, Dykes' list will make a great starting point and will form the basis of a good assignment or project. Data Analysts/Scientists need to have the skills and tools to make these different types of observations almost immediately upon opening a data file. While the list is aimed specifically at charts, I feel that it can be applied to any type of data.
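To show how a few of these observations (trends, skewness, outliers) might be made straight after opening a data file, here is a minimal Python sketch. It is only an illustration under stated assumptions: the `daily_views` values are invented, the variable names are hypothetical, and the rules of thumb used (comparing the medians of the two halves, comparing mean with median, and the 1.5 x IQR rule for outliers) are my own choice of common heuristics, not something taken from Dykes' article.

```python
from statistics import mean, median, quantiles

# Hypothetical daily values, invented purely for illustration.
daily_views = [120, 125, 130, 128, 135, 140, 138, 145, 150, 900]

# Trend: compare the median of the first half with the median of the second half.
half = len(daily_views) // 2
first_half, second_half = daily_views[:half], daily_views[half:]
trend = "up" if median(second_half) > median(first_half) else "down or flat"

# Skewness: a mean well above the median hints at a right skew (rough heuristic).
skew_hint = "right" if mean(daily_views) > median(daily_views) else "left or none"

# Outliers: flag values more than 1.5 * IQR beyond the quartiles.
q1, _, q3 = quantiles(daily_views, n=4)
iqr = q3 - q1
outliers = [v for v in daily_views if v < q1 - 1.5 * iqr or v > q3 + 1.5 * iqr]

print("Trend:", trend)
print("Skew hint:", skew_hint)
print("Outliers:", outliers)
```

Checks like these are no substitute for looking at a chart, but they give students something concrete to try in the first few minutes with a new dataset.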

Think Data!

Friday, August 23, 2019

Interesting New Data Visualizations from YouTube #Analytics

YouTube/Google have stepped up their offerings to video content developers with their new Studio. This makes analysing data trends a little more interesting. Back in May 2015 (centre of chart below) there was an almost sudden Fall in the number of views - at the time I put this down to "advice" given to me by my YouTube Content Manager. This resulted in me making changes to tags and thumbnails which I blamed for the Fall in views. Today, while reviewing YouTube's new data visualizations I noted a huge contrast in the before and after May 2015 period. For several years my How To...Create a Basic Gantt Chart in Excel 2010 was by far my top performing video - it was my first million views video. This is illustrated by the purple line dominating the left side of the chart below. After the Fall, it never recovered the number of views. In the six months just before the Fall, my How To... Plot Multiple Data Sets on the Same Chart in Excel 2010 (in blue) increased dramatically before falling back along with all other videos. But it recovered and went on to become my second million views video, and is still my top performing video today. Strange!

The chart below from YouTube Studio is an excellent representation of seven and a half years of data on my five top performing videos. I like the choice of colours to clearly separate the five lines. On each line, each data point is the number of views per day. Although this means that each line shows nearly 2,800 data points, the weekly and seasonal trends are clear to see. The chart is also very interactive with rollovers at any point giving more details. I, as the user, can also select from 23 different data sources and many different filters, and I can zoom in on any time period for more detailed analysis. Thank you Google!


Tuesday, August 20, 2019

New Video: How To... Determine the Median Value in a Dataset

My latest education video is a short one showing how to determine the Median value in a data set. This plugs a small gap in my How To... Statistics by Hand playlist. The Median value in a data set is the middle value when the data are ranked from highest to lowest (or vice versa). This is easy to determine when there is an odd number of values, but not as easy when there is an even number of values. Here's how it works:
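For readers who prefer code to pen and paper, the same steps can be sketched in Python. This is only an illustration: the function name `median_by_hand` and the sample values are made up, and it assumes the usual convention of averaging the two middle values when there is an even number of values.

```python
def median_by_hand(values):
    """Return the median of a non-empty list of numbers."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ranked = sorted(values)  # rank the data from lowest to highest
    n = len(ranked)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: the median is the single middle value.
        return ranked[middle]
    # Even number of values: average the two middle values.
    return (ranked[middle - 1] + ranked[middle]) / 2


odd_data = [7, 3, 9, 1, 5]        # hypothetical values
even_data = [7, 3, 9, 1, 5, 11]   # the same values plus one more

print(median_by_hand(odd_data))   # 5
print(median_by_hand(even_data))  # 6.0
```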

The Median is a useful measure in that it is much less affected by skewed data than the mean (average). If you think about the average salary in a large company versus the median value - you may get a very different picture. Felim O'Rourke writing in "64% of workers in Ireland earn less than the 'average' salary", tells us that "the average earnings in 2017 were €37,646". He compares the mean value and the Median value, concluding that the Median gives a better insight. The average value includes high earners whose salaries in the hundreds of thousands skew this average value upwards. For example, if there are 10 people working in a company and 9 of them earn €30,000 each while the 10th earns €60,000 - the average salary will be €33,000, but the Median value will be €30,000. Statisticians and data analysts should always consider the Median when describing data, as it can give a different and maybe more important insight into the data.
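The 10-person company example can be checked with a few lines of Python using the standard `statistics` module. This is a minimal sketch; the variable names are my own, and the salaries are the hypothetical ones from the example, not real data.

```python
from statistics import mean, median

# Nine people on 30,000 each and one person on 60,000 (the example above).
salaries = [30_000] * 9 + [60_000]

average_salary = mean(salaries)
median_salary = median(salaries)

# Count how many people earn less than the "average" salary.
below_average = sum(s < average_salary for s in salaries)

print(f"Mean:   €{average_salary:,.0f}")
print(f"Median: €{median_salary:,.0f}")
print(f"{below_average} out of {len(salaries)} earn less than the mean")
```

One higher salary is enough to pull the mean up to €33,000, so 9 of the 10 people earn less than the "average", while the Median stays at €30,000.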

Friday, August 16, 2019

Python vs R Debate #analytics

One of the modules I teach is called "Programming for Big Data". It is normal when creating module descriptors not to be specific on what technologies will be used. So the module descriptor for this module does not specify what language will be used - it just states in general that "...programming languages such as R, Python, Java, etc" will be used. Last year I took over this module and decided to create a brand new set of course resources (I always do this when taking over a module - I just cannot use other lecturers' notes). As R is the preferred language in most other modules on our data analytics courses, and I was far more familiar with it - I decided to switch from Python, which had been used previously, to R.

Last evening at an Information Session for incoming students I was asked about this again, and whether the students would be learning Python. The answer is "No": we are continuing with R (students can choose in their Project module to use any language they wish). Today I decided to take a quick look at the 20th annual KDnuggets Software Poll (which had over 1,800 participants - so a good sample size), and Python stands out as the leading language. In 2017 Python and R were neck-and-neck (59% and 57% respectively), but this has changed in 2019 (66% for Python, and 47% for R). I am very glad to see that other technologies that we use in the Data Analytics programmes (RapidMiner, Excel/SQL/Tableau) feature strongly in the poll.

Image source: KDnuggets.
In this coming academic year, our Higher Diploma in Data Analytics (which I teach on) is due for its five-yearly programmatic review. This means that the academic year 2019/2020 will be the last for the "old" version of our programme, and from September 2020 we will be starting with a "new" version. One of the biggest decisions we will have to make is whether we continue with R, or switch to Python as the main programming language. I am already favouring such a switch, mainly for two reasons: first, the KDnuggets survey results show a dramatic change, and second, Python is regarded as an easier language to learn than R. This is an important consideration in a one-year conversion course like a Higher Diploma where a huge number of students will not have done programming before.

Much debate ahead!

Wednesday, August 14, 2019

3 Weeks with Google Pixel 3a #loyalty

Three weeks ago I started using Google's Pixel 3a - it is also of course 3 weeks since I dumped the iPhone series which I had been using since 2008. 

The first thing I can say about the Pixel is that it is far superior to my previous iPhone 6. For a budget smartphone (it was $399), it is the best phone I have ever had. Switching from iOS to Android was not as complicated as I expected, and so far I do not miss the iPhone. I'm still getting used to where things are on the Pixel.

Without realising it, it seems that I am part of a global trend of people switching from iPhone to an Android phone like the Pixel 3a. The excellent Statista website reports that Loyalty Is Waning Among iPhone Users. You can see below that it has dropped from a high of 92% to 73% - quite a decline and not good news for Apple's share price. 

Infographic: Loyalty Is Waning Among iPhone Users. Source: Statista.

Price for me was by far the biggest reason for switching to the Pixel 3a. I simply was not prepared to shell out nearly a thousand euro for an iPhone X, nor was I prepared to wait for the Pixel 4 - rumoured to also cost around a thousand euro. Budget phones like the Pixel 3a provide everything that I need, so no need to spend a fortune just to keep up with the latest gadgets.

Friday, August 09, 2019

Back to work

Today I am back to work after a long summer break - memories of Route 66 and beaches in Wexford seem so far away while going through hundreds of emails that arrived in my Inbox over the summer. I always make a point of not checking work email while I am on annual leave. The main reason is that I believe that everyone should take a break from work email - I often think of this as the longest dog leash in the world. Also - since I am not teaching during the summer, not much happens for me during this time anyway. Despite there being hundreds of emails, I have got through them all - only a few dozen need a response as most are subscription and automated mail.

End of Route 66 with a new academic year to look forward to.
I will be teaching the same modules as last year, so there will be a lot less work in preparing for classes in September. This time last year I took on an R Programming module and had to prepare a completely new set of module resources - this involved a lot of work. Thankfully I can reap the benefits of doing this next semester (it was also my first ever online module). There will be some updates, but these can be handled on a weekly basis.

I am looking forward to the new academic year - I will not have too many left as I see retirement on the horizon. My subjects are Statistics, R, and Project Management - nice modules to work with. There is a great buzz in meeting new students each year, lots of new people to get to know. I also look forward to meeting students from countries that I have not had students from before. Last year I had students from Uruguay and Paraguay for the first time - diverse classes make for a great learning and teaching experience.

Back to work!