In this project, you will use data from the Premier League (2020–2021) to find out if there is any evidence that the referees were biased in favour of the home teams.
This guide walks you through the problem-solving process, with hints and questions to direct you.
Each of the 20 teams in the English Premier League plays every other team twice during the season, once at home and once away, making a total of 20 × 19 = 380 matches. There is a popular belief amongst fans and players that playing at your home ground is an advantage. Is this true?
(The codes used in the first entry of the data are explained here.)
A version of this data with a reduced number of fields is available in a variable called premierLeagueData2021.
Use this data to determine whether there is any evidence that football referees in the Premier League make decisions that are biased towards home or away teams.
Tips:
• Extract the relevant parts of the data (i.e. the parts of the game that are ultimately decided by a referee).
• Perform a
You may find the following code useful:
What is your precise question to tackle? Can you write a formal hypothesis test? What is a suitable level of significance for such a test? What assumptions are you making?
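One possible way to write the question as a formal test is sketched below. It is only an illustration: the choice of yellow cards as the measure, the symbols d_i, \mu_d and s_d, the two-sided alternative and the 5% significance level are all assumptions made here, not requirements of the project.

```latex
% Assumed measure: d_i = away-team yellow cards minus home-team yellow cards in match i
\begin{align*}
  d_i &= y_i^{\text{away}} - y_i^{\text{home}}, \qquad i = 1, \dots, n, \quad n = 380 \\
  H_0 &: \mu_d = 0 \quad \text{(no difference on average between home and away teams)} \\
  H_1 &: \mu_d \neq 0 \quad \text{(two-sided: a difference in either direction)} \\
  t &= \frac{\bar{d}}{s_d / \sqrt{n}}, \qquad \text{reject } H_0 \text{ if the } p\text{-value is below } \alpha = 0.05
\end{align*}
```

Pairing the two teams within each match treats the 380 differences as independent observations, which is itself an assumption worth questioning. A non-zero mean difference would also not prove referee bias on its own, since away teams might simply commit more offences.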
ABSTRACT TO COMPUTABLE FORM
What sort of data are you using? How reliable is the source? Are there any outliers in the data? Which tools will you use and why?
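As a rough sketch of how the referee-decided parts of the data might be pulled out, the Python below keeps only counts of fouls, yellow cards and red cards. The field codes (HF, AF, HY, AY, HR, AR), the record layout and the three sample matches are assumptions invented for illustration; the actual structure of premierLeagueData2021 may differ.

```python
# Hypothetical records for illustration only; these are NOT real results.
# Assumed field codes: HF/AF = home/away fouls, HY/AY = yellow cards, HR/AR = red cards.
sample_matches = [
    {"HomeTeam": "Team A", "AwayTeam": "Team B",
     "HF": 12, "AF": 14, "HY": 1, "AY": 3, "HR": 0, "AR": 0},
    {"HomeTeam": "Team C", "AwayTeam": "Team D",
     "HF": 9, "AF": 9, "HY": 2, "AY": 2, "HR": 0, "AR": 1},
    {"HomeTeam": "Team E", "AwayTeam": "Team F",
     "HF": 11, "AF": None, "HY": 0, "AY": 1, "HR": 0, "AR": 0},
]

REFEREE_FIELDS = ("HF", "AF", "HY", "AY", "HR", "AR")


def extract_referee_fields(matches, fields=REFEREE_FIELDS):
    """Keep only the referee-decided counts, skipping incomplete records."""
    cleaned = []
    for match in matches:
        # A record is usable only if every required field holds an integer.
        if all(isinstance(match.get(field), int) for field in fields):
            cleaned.append({field: match[field] for field in fields})
    return cleaned


def away_minus_home(matches, home_field, away_field):
    """Per-match difference: away count minus home count."""
    return [match[away_field] - match[home_field] for match in matches]


cleaned = extract_referee_fields(sample_matches)
yellow_differences = away_minus_home(cleaned, "HY", "AY")
foul_differences = away_minus_home(cleaned, "HF", "AF")
```

Dropping incomplete records is one design choice among several. With real data it is worth counting how many matches are discarded and checking whether they share anything in common.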
What statistics have you found? How did you compute them? How did you overcome any problems?
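One way the statistics could be computed is sketched below, using only the Python standard library. It assumes you already have a list of per-match differences (for example, away yellow cards minus home yellow cards) and uses a normal approximation to the paired t-test, which is reasonable only for large samples such as 380 matches. The function name and the toy list are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def paired_z_test(differences):
    """Return (mean difference, z statistic, two-sided p-value).

    Uses a normal approximation, so it suits large samples only.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    d_bar = mean(differences)
    standard_error = stdev(differences) / sqrt(n)
    if standard_error == 0:
        raise ValueError("all differences are identical")
    z = d_bar / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return d_bar, z, p_value


# Made-up differences, far too few for the approximation to be trusted;
# they only show how the function is called.
toy_differences = [2, 0, 1, 1]
d_bar, z, p_value = paired_z_test(toy_differences)
```

With the real data you would pass in all 380 differences and compare the resulting p-value with the significance level you chose.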
Should you accept or reject your null hypothesis? What do your statistics mean? Do your results seem correct? How can you check that it is a sensible answer? How much does the answer depend upon the assumptions you made? What could you do to improve your answer? Are there further questions you could ask to develop your analysis?
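To see how much a conclusion depends on distributional assumptions, one option is to repeat the analysis with a test that assumes less. The sketch below is an exact two-sided sign test: it looks only at whether each per-match difference is positive or negative, ignoring its size. The function name is hypothetical, and dropping tied matches is a common convention rather than the only choice.

```python
from math import comb


def sign_test_p_value(differences):
    """Exact two-sided sign test p-value; zero differences (ties) are dropped."""
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives
    if n == 0:
        return 1.0  # no non-tied matches, so no evidence either way
    k = min(positives, negatives)
    # Probability of k or fewer of the rarer sign under a fair 50/50 split.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

If this test and a mean-based test lead to the same decision, the conclusion is less likely to be an artefact of one particular assumption. If they disagree, that is a prompt to look more closely at the shape of the data.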