I’m starting a project to build a model that forecasts attendance at sporting events. Historical data will go in and future forecasts will pop out. The target is total attendance at each game.
Here are some variables that I think might affect the attendance at a given game:
-Sport
-Home Team
-Away Team
-Stadium/Location
-Month
-Weekday
-Time of game
-Home Record
-Home Playoff Chance
-Away Record
-Away Playoff Chance
-Home game number in the season (home openers are obviously well attended)
-Weather
-Temperature
-Indoor/outdoor stadium
-Attendance Capacity
-Promotions
-TV Coverage of game
-Is it a holiday or holiday weekend?
-In-division game? Rivalry game?
-Is away team defending divisional/conference/league champion?
-Do teams rarely play? Inter-league game?
-Current injuries to key starters
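Several of these (month, weekday, time of game, holiday flag) could be derived from a single game start time rather than collected separately. Here is a minimal Python sketch of that idea. The `calendar_features` helper, the field names, and the holiday set are all hypothetical, and a real project would need a proper holiday calendar:

```python
from datetime import date, datetime

# Fixed English names so the result does not depend on the system locale.
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday")

def calendar_features(start, holidays):
    """Derive calendar variables from a game's local start time.

    `holidays` is a caller-supplied set of dates (an assumption here).
    """
    return {
        "month": start.month,                    # 1-12
        "weekday": DAY_NAMES[start.weekday()],   # Monday is index 0
        "hour": start.hour,                      # start hour, 0-23
        "is_weekend": start.weekday() >= 5,      # Saturday or Sunday
        "is_holiday": start.date() in holidays,
    }

# Illustrative values only, not real schedule data.
example_holidays = {date(2024, 7, 4)}
features = calendar_features(datetime(2024, 7, 4, 19, 5), example_holidays)
print(features)
```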
I’m trying to be general, so that similar variables work across sports. Some of them (indoor/outdoor, for example) obviously don’t make sense in every sport. These variables will probably feed into some form of regression model, so the variables “Home Team” and “Stadium/location” will incorporate a lot of fixed effects of the team: size of fan base, interest in that sport in that city, average cost of tickets, etc.
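To make the fixed-effects idea concrete, here is a rough sketch of how a categorical variable like "Home Team" could be turned into dummy (0/1) columns before it goes into a regression. The `one_hot` helper, the column names, and the data are made up for illustration, and this is only one common way to encode such a variable:

```python
def one_hot(rows, column):
    """Return dummy-variable rows for one categorical column.

    The first level (in sorted order) is dropped as the reference level,
    so the dummies are not collinear with the model's intercept.
    """
    levels = sorted({row[column] for row in rows})
    kept = levels[1:]
    return [
        {f"{column}={level}": int(row[column] == level) for level in kept}
        for row in rows
    ]

# Made-up games for illustration; team names are placeholders.
games = [
    {"home_team": "A", "attendance": 30000},
    {"home_team": "B", "attendance": 18000},
    {"home_team": "C", "attendance": 42000},
    {"home_team": "B", "attendance": 20000},
]
dummies = one_hot(games, "home_team")
print(dummies[1])
```

With this encoding, each team's coefficient would be an offset relative to the reference team, which is where stable team-level factors such as fan base size would end up.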
So, am I missing any variables that would help predict the attendance at a game? Help me out.