Monthly Archives: September 2014

Theory Thursday- Conditional Probability

Now that we know the Three Axioms of Probability, we can understand conditional probability.

First, let’s think about a normal (unconditioned) event. What is the probability of rolling an 8 with 2 normal dice (equally likely outcomes from 1 to 6)? Well, we can roll a 2-6, 3-5, 4-4, 5-3, and 6-2 with the two dice to get a sum of 8. That’s 5 possible outcomes that sum to 8. There are 36 possible outcomes, so the probability is 5/36.
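If you would rather have the computer do the counting, here is a minimal Python sketch that lists all 36 outcomes and picks out the ones that sum to 8. The variable names (outcomes, sum_is_8, p_sum_is_8) are my own, and it assumes two fair six-sided dice as described above.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Keep only the outcomes whose two dice sum to 8.
sum_is_8 = [roll for roll in outcomes if sum(roll) == 8]

# Probability = favorable outcomes / total outcomes.
p_sum_is_8 = Fraction(len(sum_is_8), len(outcomes))

print(sum_is_8)    # the five rolls listed above
print(p_sum_is_8)  # 5/36
```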

The conditional probability of an event is the probability that the event happens, given that another event has happened or will happen. So, for example, what is the probability that I roll an 8 with 2 dice, given that the first die is a 2? Well, I would need the second die to be a 6 for them to add to 8, and there are 6 options for the second die. So I have a conditional probability of 1/6 of rolling an 8, given that the first die was a 2. The conditional probability of rolling an 8 given the first die is a 2 (1/6) is higher than the unconditioned probability of rolling an 8 with 2 dice (5/36).

What is the conditional probability of rolling an 8 with 2 dice, given that the first die is a 1? Well, we would need the second die to be 7. But the die only has options from 1 to 6. So we cannot roll an 8 with 2 dice if one of the dice is a 1. The conditional probability is 0.

We have an easy way to calculate the probability of a conditional event happening. Let E be the event we want to happen, conditional on the event F happening. Let P(E|F) be the conditional probability of E given F. Then P(E|F)=\frac{P(EF)}{P(F)}, where P(EF) is the probability of both E and F happening. The formula only makes sense when P(F)>0, since we cannot condition on something that never happens.

In our example with the dice, E=roll an 8 with 2 dice and F=roll a 2 with the first die. There is one way to roll an 8 with two dice and roll a 2 with the first die: 2-6. There are 36 possible outcomes, so P(EF)=1/36. P(F)=1/6 because there is a 1 in 6 chance of rolling a 2 with the first die. P(E|F)=\frac{1/36}{1/6}=1/6, as we found above.
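The formula can be checked by brute force. This is just a sketch under the same fair-dice assumption; the helper names (prob, cond_prob, total_is_8, first_is_2, first_is_1) are made up for illustration. Events are written as true/false tests on a single (first die, second die) outcome.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a true/false test on one roll."""
    favorable = sum(1 for roll in rolls if event(roll))
    return Fraction(favorable, len(rolls))

def cond_prob(e, f):
    """P(E|F) = P(EF) / P(F). Assumes P(F) > 0."""
    p_ef = prob(lambda roll: e(roll) and f(roll))
    return p_ef / prob(f)

def total_is_8(roll):
    return roll[0] + roll[1] == 8

def first_is_2(roll):
    return roll[0] == 2

def first_is_1(roll):
    return roll[0] == 1

print(cond_prob(total_is_8, first_is_2))  # 1/6
print(cond_prob(total_is_8, first_is_1))  # 0
```

Both answers match the counting arguments above: 1/6 when the first die is a 2, and 0 when the first die is a 1.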

Are Sports Broken?

Scott Adams thinks so.

My thoughts on his thoughts:
-Get rid of football? No, just play flag football. Way more interesting.
-Get rid of tennis serves and funky scoring? Yes please.
-Get rid of head balls in soccer? No, because that’s not what causes most concussions. Getting drilled from short range, falling awkwardly, or getting physically hit/headed/kicked by another player cause concussions.
-Make the soccer goal bigger? Maybe. Would be interesting.
-Get rid of offsides in soccer? No, because the defense would have to hang back all the time to guard cherry-pickers. There would be no break-aways. I hate the closeness of most onside/offside calls, but they seem relatively necessary to keep the game interesting.
-Add TV timeouts to soccer play? Why must soccer be TV-friendly like everything else? Why ruin uninterrupted play? Dumb suggestion.
-Add walls so that all soccer is indoor soccer? Yes, but only in the last 15 minutes of each half. I hate how they take so long to do throw-ins and goal kicks when time is running out and they have the ball and lead. Also, awesome moveable walls.
-All his baseball suggestions: No.
-Volleyball and golf suggestions: Don’t care.

My suggestions:
-In baseball, add an “acceptable lead-off line” past each base. The runner on that base can lead-off up to the line but can’t go past the line. The pitcher is allowed to try to pick off the runner, but each time he does, it counts as a ball to the batter. This will eliminate most pick-offs.
-In baseball, stop allowing the batter to leave the batter’s box. If the ball wasn’t hit foul, the batter must remain in the batter’s box, and the pitcher must pitch or try a pick-off within 10 seconds of receiving the ball. Any violation by the batter is an automatic strike; any violation by the pitcher is an automatic ball.
-In soccer, give teams a 1 minute, 30 second shot clock. If you don’t score, kick the ball out of bounds, or hit the goalie or the goal post within 90 seconds, it’s an automatic turnover. Shots in the air after 90 seconds count, the same way they do in basketball.
-In soccer, get rid of the referee timing. Keep a clock that counts down like in every other sport. It’s fine to have stoppage time still, just add it to the clock at the end of each half.
-In basketball, call fouls whenever the defender uses his arms to touch an opponent. Then eject players after 4 fouls. This would lower the incentive to foul and get rid of the physicality on defense. The game would be smoother.

We can fix this.

Theory Tuesday- 3 Axioms of Probability

Everyone has an intuitive concept of chance/probability. When we say that there is a certain probability of an event happening, what do we mean?

First, you need to understand the concept of an “event”. An event is one possible outcome of some probabilistic scenario. Say you are flipping a coin. The two possible events are “heads” and “tails” (though I guess you could argue that there is a third event: “coin lands on its edge”). As you will see in Axiom #3, since a coin cannot land on both “heads” and “tails”, “heads” and “tails” are mutually exclusive.


There are three axioms upon which probability theory is built:
1. Let P(E) be the probability of an event. Then 0 \leq P(E) \leq 1. The probability of any event is between 0 and 1.
2. Let S be the set of all possible events. Anything that could possibly happen is contained in S. P(S) = 1. The probability of some event happening is 1.
3. Let E_1, E_2, …, E_n be a sequence of mutually exclusive events. Two events are mutually exclusive if only one of the two can happen at any time. P(\cup_{i=1}^n E_i)=\sum_{i=1}^n P(E_i). This says that you can add the probabilities of two or more mutually exclusive events together to get the probability of any one of them happening.

With these three axiomatic building blocks, all of probability theory can be built.
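To make the axioms concrete, here is a small sketch that checks each one for a single fair die. The setup is my own illustration, not part of the axioms themselves: events are written as sets of faces, and the names (p_face, prob, low, six) are made up for the example.

```python
from fractions import Fraction

# One fair die: each face has probability 1/6.
faces = range(1, 7)
p_face = {face: Fraction(1, 6) for face in faces}

def prob(event):
    """Probability of an event, given as a set of faces."""
    return sum((p_face[face] for face in event), Fraction(0))

# Axiom 1: every probability is between 0 and 1.
axiom_1 = all(0 <= prob({face}) <= 1 for face in faces)

# Axiom 2: the probability that something in S happens is 1.
axiom_2 = prob(set(faces)) == 1

# Axiom 3: "roll a 1 or 2" and "roll a 6" cannot both happen,
# so their probabilities add.
low, six = {1, 2}, {6}
axiom_3 = prob(low | six) == prob(low) + prob(six)

print(axiom_1, axiom_2, axiom_3)  # True True True
```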

Two Links Tuesday- September 9, 2014- Model Building Edition

In Defense of Model Simplicity: Examples from Laura McLay about problems in data science and optimization that respond better to a simple model than a complex one.

Learning from the Best: Good write-up on Kaggle’s blog about suggestions from past data science competition winners. Spend the most time focusing on extracting the right features to solve the problem. Without the right features, it doesn’t matter how complex your model is, it won’t work.

Albert Einstein: “Everything should be made as simple as possible, but not simpler.”

Code Monkey Monday- Notepad++

If you like to read data in from text files or save lots of data to text files, you’ve probably discovered that Windows’ default text reading program, Notepad, sucks. It can’t open large (MB or larger) text files, sucks at formatting, and seems slow. I prefer the free program Notepad++. It can handle large files with ease, allows multiple text files to be open in the same program, highlights all equivalent words if a word is highlighted, gives line numbers, and probably has a ton of other capabilities that I’m not familiar with. You should upgrade to Notepad++.

Book Review- Think Like a Freak

Think Like A Freak
Steven Levitt and Stephen Dubner


I read this book over the span of perhaps 3-4 hours one afternoon. It was short. From following the Freakonomics blog, I had already read or heard (via podcast) the material in 3 of the book’s 8 chapters, so I don’t think there were many surprises for me in the book. I also happened upon Stephen Dubner giving a public lecture in the atrium of IU’s business school in March/April, where he talked about the content of one of the chapters of the book.

I think the best advice from the book is to look for small problems to solve. Big problems are typically hard to solve, and you won’t be the first person to try to conquer a big problem. But a little problem may have been overlooked and may offer significant opportunity.

Do you think Steven Levitt and Stephen Dubner get into recurring fights about the proper way to spell their first names?

Fantasy Draft Results- Maria

Maria is two-time reigning champion of her male-dominated league. Her league is evolving from a 10-team no-keeper auction league (2 years ago) to a 10-team 2-keeper auction league (last year) to a 12-team 3-keeper auction league (this year). She plays with friends from high school.

Maria has a $200 draft budget. Her league does QB, 2 RB, 2 WR, flex, TE, D/ST, K and 5 bench spots. Here’s how her team filled out, in order of when they were added to her team:
Peyton Manning, QB for Denver, $37 keeper
Antonio Brown, WR for Pittsburgh, $21 keeper
Michael Crabtree, WR for San Francisco, $2 keeper
Arian Foster, RB for Houston, $33
Doug Martin, RB for Tampa Bay, $48
Ben Tate, RB for Cleveland, $20
Robert Griffin III, QB for Washington, $5
Rueben Randle, WR for New York Giants, $2
Blair Walsh, K for Minnesota, $1
Kelvin Benjamin, WR for Carolina, $6
Shonn Greene, RB for Tennessee, $1
Dennis Pitta, TE for Baltimore, $3
New Orleans D/ST, $1
DeAndre Hopkins, WR for Houston, $3

The money she didn’t use in the draft ($17) she keeps and can use for auction-style waiver wire pickups.
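A quick sanity check on that leftover number, as a minimal sketch: the prices are copied from the roster above in the same order, and the variable names are my own.

```python
# Auction prices from the roster above, in the order listed.
prices = [37, 21, 2, 33, 48, 20, 5, 2, 1, 6, 1, 3, 1, 3]

budget = 200
spent = sum(prices)
leftover = budget - spent

print(len(prices), spent, leftover)  # 14 183 17
```

Fourteen players matches the 9 starting spots plus 5 bench spots.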

Hopefully she holds on and wins her league again. Bring on the fantasy football!

Fantasy Draft Results- Eric

This will be my third year playing fantasy football. In the first year, I didn’t understand a lot of the rules early on, but re-built my roster and rallied in the second half of the season. Missed out on the playoffs via a tie-breaker. In my second year, I had the second best record entering the playoffs but lost when Jamaal Charles scored about a million points for my opponent.

I am a pathological trader. Last season, I was involved in EVERY trade that happened in my league (around 6-8 of them). So I don’t think the draft is the last chance to build a winning team, but my draft was Sunday night, so I’ll share my initial roster. My league, which includes fraternity brothers from Phi Kappa Tau at Case Western, uses normal scoring and a snake draft. We start QB, 2 RB, 2 WR, TE, flex, D/ST, and K. There is no dynasty aspect or keepers. We get a fresh slate each year. This year, I got shafted with the last pick in a 12-team league. So I pick last in the first round, first in the second, last in the third, first in the fourth, etc. I pick twice in a row each time. Here are my selections and starting team for 2014 (if past actions are any clue, my ending team won’t resemble this team much):

1st round: Doug Martin, RB for Tampa Bay
2nd round: Montee Ball, RB for Denver
3rd round: Vincent Jackson, WR for Tampa Bay
4th round: Keenan Allen, WR for San Diego
5th round: Trent Richardson, RB for Indianapolis
6th round: Emmanuel Sanders, WR for Denver
7th round: Golden Tate, WR for Detroit
8th round: Colin Kaepernick, QB for San Francisco
9th round: Darren Sproles, RB for Philadelphia
10th round: Jeremy Hill, RB for Cincinnati
11th round: Andy Dalton, QB for Cincinnati
12th round: Doug Baldwin, WR for Seattle
13th round: Zach Ertz, TE for Philadelphia
14th round: Lance Dunbar, RB for Dallas
15th round: Matt Bryant, K for Atlanta
16th round: Bills D/ST

I grabbed a lot of running backs in the draft because they are always the hardest commodity to find during the season. I’m really high on Montee Ball, mostly because of the high-powered Denver offense. Everyone is down on Trent Richardson, but I think Indy’s offense will be even better this year and he is a feature back. He’ll make enough points to justify a 5th round pick. Sanders, Tate, and Baldwin are all #2 WR in high-powered offenses. I wanted Matt Ryan as my QB, but someone took him in the 7th round, so I took Kaepernick. His receivers are vastly improved over last year and he gets some running yards, so that should be fine. I’ve heard good things about Zach Ertz, so I’m hoping he’ll be good as my only TE for now. Hill and Dunbar are #2 RB in systems that might feature two backs. We’ll see if they get significant touches or not. I always stream kickers and defenses week to week, so I put zero thought into those picks. I’ll typically just grab the best available for the coming week off the waiver wire and do as well as most people. Some crazies started taking defenses in the 7th round, way too early for me.

I’ll have details about Maria’s picks in her league up in the afternoon.

Two Links Tuesday- September 2, 2014- Sports Watching Edition

MLB Blackout areas

MLBAM may be close to revising its asinine blackout policy: I live in Bloomington, IN, which is about 2.5 hours from Cincinnati. Cincinnati Reds games are broadcast locally in Cincinnati on Fox Sports Ohio. I don’t get Fox Sports Ohio. I get Fox Sports Indiana. Because I’m in Indiana. Fox Sports Indiana carries SOME Reds games, but not all. Fox Sports Indiana will broadcast 95 Reds games in 2014; mostly they show Reds games when no Pacers games, live wrestling, or live racing are in line for broadcast on Fox Sports Indiana. About 10 other games are on national TV. All of which makes sense. Here’s where it gets stupid: Major League Baseball classifies Bloomington as Cincinnati Reds territory. And it thus feels justified in blacking out Reds games on MLB.TV in Bloomington, so that I have to support my local broadcasters and watch the games on TV. Except I don’t get about 60 of the Reds games in Bloomington on TV. There is no way, outside of satellite TV options, for me to watch about 60 Reds games in 2014 IN REDS TERRITORY! I could watch all 162 games involving the Houston Astros, though, if I bought MLB.TV. Thanks, MLB.TV. Good idea. At least, maybe, possibly, it will be fixed in the future. Like in 2018, once I’ve left Bloomington.

NFL broadcast maps from 506 Sports: Ever wonder what NFL games your local CBS and FOX stations are going to carry this week? Bookmark this site and check back a few days before the game. Then you can see if your stations are going to show an interesting game. If not, you can go ahead and make other plans for that Sunday time slot.