With all of the talk over the last few years about robo umps, one of the least discussed repercussions of an automated strike zone is the elimination of catcher framing. In my opinion, framing is one of the most interesting yet underappreciated aspects of baseball. Not only can framing have a huge impact on a pitching staff, but it also gives catchers who struggle at the plate another way to add value.
So, as a college student studying Data Science, when I was thinking of baseball-related projects to work on in my free time, catcher framing was one of the first things that came to mind. My goal for this project was to create a simple metric to determine which catchers “stole” the most strikes throughout the 2020 season and the first half of this year (and improve my coding skills). Let me tell you a little about my method.
The first step was creating a model to determine whether a pitch will be called a ball or strike. I took pitch-by-pitch data from the first game played in 2020 until today, and eliminated every pitch that involved a swing, whether it was a swinging strike, foul, or in play. Now that I was down to only balls and called strikes, I used each pitch’s coordinates to train the model to learn an umpire’s tendencies. After testing additional variables and fine-tuning the model, it ended up with an accuracy rating of ~91%. In layman’s terms, this enabled me to correctly label a pitch as a ball or strike approximately 91% of the time.
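The filtering and training steps above can be sketched roughly as follows. The article does not say which model was used, so this swaps in a plain k-nearest-neighbors vote purely as a stand-in. The field names (`description`, `x`, `z`), the label strings, and the rectangular toy zone are all assumptions made for illustration, not the real data or the real model.

```python
import math

# Assumed labels for pitches the batter took; everything else involved a swing.
CALLED = {"ball", "called_strike"}

def keep_called_pitches(pitches):
    """Drop every pitch that involved a swing (whiff, foul, or ball in play)."""
    return [p for p in pitches if p["description"] in CALLED]

def predict_strike(train, x, z, k=9):
    """Return 1 if most of the k nearest training pitches were called strikes, else 0."""
    nearest = sorted(train, key=lambda p: math.hypot(p["x"] - x, p["z"] - z))[:k]
    strikes = sum(p["description"] == "called_strike" for p in nearest)
    return 1 if 2 * strikes > len(nearest) else 0

def accuracy(train, test, k=9):
    """Share of test pitches whose predicted call matches the actual call."""
    correct = sum(
        predict_strike(train, p["x"], p["z"], k) == int(p["description"] == "called_strike")
        for p in test
    )
    return correct / len(test)

# Toy data: a grid of taken pitches labelled by a made-up rectangular zone (feet).
def toy_call(x, z):
    return "called_strike" if abs(x) <= 0.83 and 1.5 <= z <= 3.5 else "ball"

pitches = [
    {"x": i / 4, "z": j / 4, "description": toy_call(i / 4, j / 4)}
    for i in range(-8, 9)
    for j in range(0, 21)
]
pitches.append({"x": 0.0, "z": 2.5, "description": "foul"})             # swing: removed
pitches.append({"x": 1.5, "z": 1.0, "description": "swinging_strike"})  # swing: removed

train = keep_called_pitches(pitches)
middle = predict_strike(train, 0.0, 2.5)   # heart of the toy zone
outside = predict_strike(train, 1.8, 0.3)  # low and well off the plate
```

On real data the accuracy should be measured on pitches held out from training, since scoring a model on the pitches it learned from would overstate how well it labels new ones.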
The next step was using each catcher’s pitch-by-pitch data as test data for the model. Every catcher included in the testing had to have caught at least 2,000 pitches since the start of the 2020 season. After finding the number of strikes the model expected each catcher to have, I compared that expected number to how many strikes they actually caught. This is how I created what I called ‘Stolen Strike %’, which measures the share of a catcher’s called pitches that were “stolen” from being a ball. The exact formula is:
Stolen Strike % = (Actual Called Strikes - Expected Called Strikes) / Total Called Pitches x 100
Let’s review an example using Christian Vázquez. Say Vázquez catches 12 pitches that the batter takes (called strikes or balls), and the model expects 2 of these pitches to be called strikes. If 3 of them were actually called strikes, that results in a Stolen Strike % of 1 out of 12, or 8.33%. Vázquez “stole” 1 pitch that was expected to be a ball.
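For readers who prefer code, here is a minimal sketch of that calculation and of the per-catcher roll-up. The record keys (`catcher`, `called_strike`, `predicted_strike`) are hypothetical names chosen for this example, and applying the 2,000-pitch cutoff to called pitches only is an assumption; the article does not spell out which pitches count toward it.

```python
from collections import defaultdict

def stolen_strike_pct(actual_strikes, expected_strikes, called_pitches):
    """Extra called strikes beyond the model's expectation, per 100 called pitches."""
    return 100 * (actual_strikes - expected_strikes) / called_pitches

def stolen_strike_table(pitches, min_pitches=2000):
    """Aggregate per catcher; skip catchers below the pitch threshold."""
    totals = defaultdict(lambda: [0, 0, 0])  # [actual, expected, called pitches]
    for p in pitches:
        row = totals[p["catcher"]]
        row[0] += p["called_strike"]     # 1 if the umpire called a strike
        row[1] += p["predicted_strike"]  # 1 if the model expected a strike
        row[2] += 1
    return {
        name: stolen_strike_pct(actual, expected, n)
        for name, (actual, expected, n) in totals.items()
        if n >= min_pitches
    }

# The Vázquez example from the text: 12 called pitches, 2 expected strikes, 3 actual.
example = stolen_strike_pct(3, 2, 12)

# Made-up toy records that reproduce the same example, plus a catcher under the cutoff.
toy = (
    [{"catcher": "Vázquez", "called_strike": 1, "predicted_strike": 1}] * 2
    + [{"catcher": "Vázquez", "called_strike": 1, "predicted_strike": 0}] * 1
    + [{"catcher": "Vázquez", "called_strike": 0, "predicted_strike": 0}] * 9
    + [{"catcher": "Someone Else", "called_strike": 0, "predicted_strike": 0}] * 5
)
table = stolen_strike_table(toy, min_pitches=12)  # tiny threshold for the toy data
```

A negative result simply means the catcher received fewer called strikes than the model expected.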
That covers the basics of my model. It is very simple, but it gives a nice overview of which catchers helped out their pitching staff the most. Now, let’s look at some of my results. For context, I expected the average Stolen Strike % to be close to 0%, because an average catcher should finish with exactly the number of strikes they were expected to get. Supporting this belief, the sample’s mean is 0.03%, and a 95% confidence interval places the average Stolen Strike % for all catchers between -0.15% and 0.21%. Below are the top 5 catchers in Stolen Strike %.
The most noticeable thing here is that the Texas Rangers really know what they’re doing behind the dish – Jose Trevino and Jonah Heim have been exceptional framers. This is backed up by Baseball Prospectus’ Called Strikes Above Average (CSAA) metric, which has Trevino and Heim ranked 1st and 7th, respectively. Kyle Higashioka and Max Stassi are both expected names as well – Higashioka was ranked 1st in CSAA last year and Stassi is ranked 3rd this year. The big surprise here was Ryan Jeffers, who, while still a well-above-average framer, is ranked 14th in CSAA this year.
How have the Red Sox’ catchers fared in the framing department?
To no one’s surprise, Christian Vázquez has been very solid. He may be inconsistent at the plate, but he’s as consistent as they come defensively. The shocker for me was how poorly Kevin Plawecki’s framing graded. With his bat not providing much spark, I assumed the majority of his value came from his defense. Once again, CSAA backs up my rankings for the Sox’ catchers, with Christian Vázquez matching exactly at 16th and Kevin Plawecki way down at 59th.
Being ranked in the mid-to-low 50s places Plawecki in the bottom tier among backup catchers. As mentioned above, his offense is subpar as well, so it may be time for the Red Sox to look for a new backup catcher this upcoming offseason. With Plawecki on the IL for the time being, I’m definitely looking forward to seeing what Connor Wong brings to the table.
To conclude, this project was meant to introduce an extremely simple form of catcher framing data and highlight a potentially underrated area of improvement for the Red Sox. Working on and completing this project has been a ton of fun, and I hope you enjoyed reading about my method and results.