Friday, January 31, 2020

The Land of Milk and Honey - Brexitland

So - the UK leaves the European Union this evening, and presumably the milk and honey will start to flow tomorrow. Good luck to them. I do feel sorry for the 16 million who voted to remain, including the majorities in Scotland and Northern Ireland. I doubt that my life will change one bit as a result of this, but I do worry about the Border and about what unintended consequences leaving the EU might throw up.

Image source: The Courier.

All through the Brexit debate I have followed events with a mixture of bewilderment and disbelief - I still don't understand the decision to leave. In the 2014 Scottish Independence Referendum, people fought it on the basis of "Better Together". These same folks are now telling us that we are "Better Apart". At least the bitching and moaning about Europe that has been going on for years in Britain will stop. They want a trade deal with us despite the fact that they already had a deal that was far better than anything they can get in a new one.

I wish Britain well for the future, but I will not miss you.

Wednesday, January 29, 2020

New YouTube Video: How To... Embed a YouTube Video into a Google Slides Presentation

It is just over five months since I last published a "How To..." video on YouTube - I feel it is time to do another. Last week I attended a presentation where the presenter played a video embedded in a slide. I know that it is not possible to do this in Microsoft PowerPoint, so I asked the presenter how he did it. It turns out that he used Google Slides - no surprise that Google allows embedded video in its own presentation software, and not Microsoft's. So if you are a PowerPoint user, it is easy to switch to Google Slides for a presentation in order to embed a YouTube video (Google account required).

Embedding a YouTube video in Google Slides turns out to be very easy to do. So I made a short video showing the step-by-step procedure. Here it is, enjoy:

Friday, January 24, 2020

A Simple Post Value Add to a Class

This week, I did something that I had never done before in my 18 years of teaching in the College! I sent an email (via our Learning Management System: Moodle) to my on-line class summarising all that we had covered in the previous evening's class. Even though it was the first week of the new semester, and also the first week for my students who were just starting their course this January, I was surprised at how much we had covered.

The module is a programming module, and for many students it was their first time writing any code. Here is (part of) what I sent to my class the next day:

Following last evening's class, you should be able to do the following (which is a lot for your first day!):
  1. Install R
  2. Install and run RStudio
  3. Explore the RStudio interface (4 quadrants)
  4. Create and save a new R script
  5. Use hashtags to insert comments/notes in an R script
  6. Display a simple message in the console ("Hello World")
  7. Navigate and Set your Working Directory every time you start RStudio
  8. Watch out for syntax errors (the typos of programming) - as you have already found out, a misplaced comma can cause havoc
  9. Try to make sense of error messages so that you can fix code that does not work
  10. Use functions - we used print(), read.csv(), head(), tail(), plot(), and ggplot()
  11. Install an R library (we installed ggplot2)
  12. Run/load an R library
  13. Open a file (.CSV) in R and display its contents
  14. Read the contents of a file into a data frame (diamondData in our example)
  15. Use R to refer to individual variables (eg, "carat" in the diamondData file)
  16. Be a programmer!
This is not an innovative thing to do - it's very simple, and I'm sure many other educators already do this. I chose to do this the day after to try and motivate students who were subjected to a four-hour class in which many had frustrating technical difficulties, plus of course plenty of the errors that first-time programmers always get. I felt that a "look what you have done already..." message might be useful (as well as motivating) for them.

But when I was compiling the list, I think I was probably the most surprised person - it is only when I see a list like this that I realise that, far from being just an introductory class, we did actually cover a lot of material. Obviously, I could put this on a slide for review at the end of a class, but I think that a separate communication works better than a simple slide - especially for on-line students. Hopefully I will do a few more of these in future classes.

Monday, January 20, 2020

A New Semester

Later this evening I will be delivering my first class of the new semester - a class on Advanced Business Data Analysis to students on the Higher Diploma in Data Analytics. Most of my teaching now is with H Dip students - I also have final year students for Statistics. The previous semester is not yet over in that there is still some exam processing to complete. I always look forward to semester II - the days get longer and we come out of the dreary winter, and of course there is the end of the semester to look forward to ahead of the summer holidays. The semester is also stretched out a little bit with reading weeks around St Patrick's Day and Easter.

At this stage in their course, students should be well settled into their studies. On a one-year course like the Higher Diploma in Data Analytics, students are half-way through, and are hopefully motivated to stick it out to completion. During semester I, we lost some students who, for a variety of reasons, dropped out. There seems to be no single clear reason why students drop out, but only this morning I got another email from a student announcing that while he liked the course, he was dropping out due to work commitments and a change of job. We also have new students starting out, some of whom I will have for my Programming for Big Data on-line class next Wednesday. It is always nice to meet new students - on day one everybody is very motivated and keen to get started. Many have dreams of changing careers and getting into the Big Data world, starting something new, getting a new job, earning more money, and learning lots of new skills. It can be a daunting prospect for many to be back at College after many years. Attending lectures, completing assignments, and studying and preparing for exams takes serious commitment - especially for those who are also working during the day.

So - to both continuing and new students, welcome to semester II. I do hope it will be an enjoyable learning experience for you. It's likely to be my last semester, so I hope to also enjoy it as both a learning and teaching experience.

Monday, January 06, 2020

0.3% - UK Statistic of the Decade via @guardian #Analytics

Just before Christmas, in an article entitled "Don’t glaze over. This statistic holds the key to UK prosperity" by Hetan Shah in the Guardian newspaper, it was reported that "Productivity growth has fallen to 0.3%", and that The Guardian had "named it the statistic of the decade". Shah writes that productivity in the past 10 years has been "truly terrible". Before the financial crisis productivity in the UK was growing at around 2% each year, but in the last decade that has slumped to an average growth of 0.3% a year. As Shah states - the end of decade report should be that the UK "must try harder".
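To get a feel for what the gap between 2% and 0.3% a year means over a decade, here is a minimal Python sketch that compounds each rate over 10 years. The function name is my own, and I am assuming both rates are simple averages that compound annually, which is an approximation. On that basis, 2% a year adds up to roughly 22% over the decade, while 0.3% a year adds up to only about 3%.

```python
def cumulative_growth(annual_rate, years):
    """Total growth after compounding a constant annual rate for `years` years."""
    return (1 + annual_rate) ** years - 1


# Rates quoted in the article: ~2% a year before the crisis, 0.3% a year since.
pre_crisis = cumulative_growth(0.02, 10)
last_decade = cumulative_growth(0.003, 10)

print(f"10 years at 2.0% a year: {pre_crisis:.1%} total growth")
print(f"10 years at 0.3% a year: {last_decade:.1%} total growth")
```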

When compared to Ireland, the UK's productivity is lower, as measured by "Nominal labour productivity per person" (Source: eurostat). Here's a plot of figures for 2010 - 2018 for both countries:

This shows that actual productivity in the UK has been static, while Ireland has recovered considerably since the aftermath of the financial crisis of 2008. Figures for 2019 are not available on the eurostat site - the impact of all the political uncertainty of 2019 is not included.

What the above tells us is that productivity in the UK has hardly changed at all over the past decade - it was the same in 2017 and 2018, after the 2016 Brexit referendum. Indeed it is not much changed since 2010 - long before a Brexit referendum was even proposed. What's all this Brexit fuss about?

Friday, January 03, 2020

Correlation is not Causation #Analytics

A mantra that data analysts/scientists learn very early on is "Correlation is not causation". Measuring the strength of a correlation is usually done using Pearson's or Spearman's Correlation Coefficient (values between -1 and +1). These measures simply tell us whether two variables are related to each other or not. Even if we get a value as high as 0.9 (a strong positive correlation), we still cannot say that a change in one variable is dependent on a change in the other. Causation is not established.
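As an illustration, here is a minimal Python sketch of both coefficients using only the standard library. The function names and the small coffee/hours-awake data set are made up for this example. Pearson's coefficient measures how close the points are to a straight line, while Spearman's is simply Pearson's coefficient applied to the ranks of the values, so it measures whether one variable consistently rises (or falls) with the other.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r: the covariance of x and y scaled by both spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """Ranks from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the ranks."""
    return pearson(ranks(x), ranks(y))


# Made-up illustrative data: cups of coffee and hours spent awake.
cups = [0, 1, 2, 3, 4, 5]
hours_awake = [14, 15, 15.5, 16, 18, 22]

r = pearson(cups, hours_awake)
rho = spearman(cups, hours_awake)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

Both values come out high for this data, but neither number on its own says anything about which variable (if either) is driving the other.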

For any two correlated events A and B, the following four relationships are possible:

  1. A causes B
  2. B causes A
  3. A and B are consequences of a common cause, but do not cause each other
  4. There is no connection between A and B, the correlation is coincidental
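Relationship 3 is the one that catches people out most often, and it can be sketched with a small simulation in Python. Everything here is invented for illustration: a hidden variable C drives both A and B, which never influence each other. A and B still end up strongly correlated, and the correlation disappears once the contribution of C is removed (possible here only because, in a simulation, we know exactly how C feeds into each variable).

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient for two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


rng = random.Random(2020)  # fixed seed so the run is repeatable
n = 2000

# C is the common cause; A and B each depend on C plus their own noise.
c = [rng.gauss(0, 1) for _ in range(n)]
a = [2 * ci + rng.gauss(0, 0.5) for ci in c]
b = [3 * ci + rng.gauss(0, 0.5) for ci in c]

r_ab = pearson(a, b)

# Remove the known contribution of C from each variable.
a_without_c = [ai - 2 * ci for ai, ci in zip(a, c)]
b_without_c = [bi - 3 * ci for bi, ci in zip(b, c)]
r_ab_without_c = pearson(a_without_c, b_without_c)

print(f"Correlation of A and B:             {r_ab:.3f}")
print(f"Correlation after removing C:       {r_ab_without_c:.3f}")
```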

So what should we do? If a correlation is established, then further investigation is needed to see if there is also a causal relationship. To do this we need a controlled study in the form of an experiment. For example, as you drink more coffee, the number of hours you stay awake increases (see a great list of Common Correlations here). An experiment to test if there is a causal relationship would be easy to set up, for example - get volunteers to drink different amounts of coffee (measured by the same cup size) and time how long they stay awake. It would be important here to have a control group who do not drink any coffee. This experiment should provide strong evidence that there is a causal relationship between drinking coffee and staying awake. 
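The key design step in an experiment like this is that volunteers are assigned to groups at random, rather than choosing their own coffee intake. Here is a minimal Python sketch of that step only. The function name, the volunteer labels, and the choice of four dose levels are assumptions made for illustration; the group with 0 cups is the control group.

```python
import random


def assign_groups(volunteers, doses, seed=None):
    """Shuffle the volunteers, then deal them out evenly across the doses."""
    rng = random.Random(seed)
    shuffled = list(volunteers)
    rng.shuffle(shuffled)
    groups = {dose: [] for dose in doses}
    for position, person in enumerate(shuffled):
        groups[doses[position % len(doses)]].append(person)
    return groups


# Hypothetical volunteers and doses (cups of coffee, same cup size); 0 = control.
volunteers = [f"volunteer_{i:02d}" for i in range(1, 21)]
doses = [0, 1, 2, 3]

groups = assign_groups(volunteers, doses, seed=42)
for dose, members in groups.items():
    print(f"{dose} cups: {len(members)} volunteers")
```

Random assignment is what allows differences between the groups to be attributed to the coffee rather than to some other factor the volunteers have in common.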


Statistics is not an exact science, mostly because we are dealing with samples instead of populations. While we can be 95% or 99% confident of a correct result, we cannot say 100% - there is always uncertainty. Comparing two variables also involves uncertainty as we are usually also dealing with samples. Be careful with experimental design, as any bias or non-random sampling will compromise your research work.