Pitcher Precision

Sasank Vishnubhatla

4/17/2019

Last Update: 2019-05-12 16:30:20

Libraries

Let’s load some libraries first.

library(baseballr)
library(pitchRx)
library(tidyverse)

Let’s also clean out the environment.

rm(list = ls())

With these libraries, we can get our data as well as visualize it. Let’s pick some pitchers and see what their data shows.

Data Loading

Here is the list of players I will be looking at: Noah Syndergaard, Patrick Corbin, Felipe Vazquez, Marcus Stroman, and Justin Verlander.

Let’s now scrape the data for each player.

# Pull every pitch thrown by pitcher `id` from `start` through today.
scrape.data = function(start, id) {
    data = scrape_statcast_savant(start_date = start,
                                  end_date = format(Sys.time(), "%Y-%m-%d"),
                                  playerid = id,
                                  player_type = 'pitcher')
    data
}

start = "2019-01-01"

syndergaard.data = scrape.data(start, 592789)
corbin.data = scrape.data(start, 571578)
vazquez.data = scrape.data(start, 553878)
stroman.data = scrape.data(start, 573186)
verlander.data = scrape.data(start, 434378)
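Each call above goes out to the network, so re-running the notebook repeats every scrape. Here is a minimal sketch of one way to cache a pull on disk. The names `cached.data`, `fake.scrape`, and `cache.path` are hypothetical and not part of baseballr, and the scrape is replaced by a stand-in so the example runs offline.

```r
# Hypothetical helper (not part of baseballr): call `fetch` only when no
# cached copy exists at `path`; otherwise read the saved copy back.
cached.data = function(path, fetch) {
    if (file.exists(path)) {
        return(readRDS(path))
    }
    data = fetch()
    saveRDS(data, path)
    data
}

# Stand-in for a real scrape so the example runs offline. In the notebook
# this would be something like function() scrape.data(start, 571578).
fake.scrape = function() {
    data.frame(pitch_type = c("FF", "SL"), plate_x = c(0.1, -0.4))
}

cache.path = tempfile(fileext = ".rds")
first = cached.data(cache.path, fake.scrape)   # runs fake.scrape and saves
second = cached.data(cache.path, fake.scrape)  # reads the saved copy
```

Since the end date of the scrape is always today, a cached file goes stale as new games are played, so it would need to be deleted to pick up new pitches.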

Now with our data, let’s get the information we want out of it.

filter.data = function(data) {
    filtered = data.frame(name = data %>% pull(player_name),
                          pitch = data %>% pull(pitch_type),
                          outcome = data %>% pull(type),
                          date = data %>% pull(game_date),
                          event = data %>% pull(events),
                          descrip = data %>% pull(description),
                          xcoord = data %>% pull(plate_x),
                          ycoord = data %>% pull(plate_z),
                          xmove = data %>% pull(pfx_x),
                          ymove = data %>% pull(pfx_z),
                          velo = data %>% pull(effective_speed),
                          spin = data %>% pull(release_spin_rate),
                          exvelo = data %>% pull(launch_speed),
                          exang = data %>% pull(launch_angle),
                          contact = data %>% pull(launch_speed_angle),
                          year = substring(data %>% pull(game_date), 1, 4))
    filtered$exvelo[is.na(filtered$exvelo)] = 0
    filtered$exang[is.na(filtered$exang)] = 0
    filtered$contact[is.na(filtered$contact)] = 0
    filtered
}

syndergaard = filter.data(syndergaard.data)
corbin = filter.data(corbin.data)
stroman = filter.data(stroman.data)
vazquez = filter.data(vazquez.data)
verlander = filter.data(verlander.data)
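To see what the renaming and the NA replacement do, here is a small sketch on made-up rows. It uses `transmute` and `coalesce` from dplyr as an alternative way to write the same idea. The `toy` data and the `filter.toy` function are hypothetical, and only a few of the columns are included.

```r
library(dplyr)

# Toy stand-in for a Statcast pull; the values are made up for illustration.
toy = tibble(
    player_name = c("Pitcher, Example", "Pitcher, Example", "Pitcher, Example"),
    pitch_type = c("FF", "SL", "CH"),
    type = c("S", "B", "X"),
    game_date = c("2019-04-01", "2019-04-01", "2019-04-06"),
    launch_speed = c(NA, NA, 98.4),
    launch_angle = c(NA, NA, 12)
)

# Hypothetical, shortened variant of filter.data: rename the columns and
# replace missing batted-ball values with 0 in one step.
filter.toy = function(data) {
    data %>%
        transmute(name = player_name,
                  pitch = pitch_type,
                  outcome = type,
                  date = game_date,
                  exvelo = coalesce(launch_speed, 0),
                  exang = coalesce(launch_angle, 0),
                  year = substring(game_date, 1, 4))
}

toy.filtered = filter.toy(toy)
toy.filtered
```

Only the pitch that was put in play has a launch speed and angle. The other rows get 0, so a 0 in these columns means "no batted ball" and is not a measured value.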

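The filtered data keeps the following columns: name (pitcher), pitch (pitch type), outcome (B, S, or X), date, event (plate appearance result), descrip (pitch description), xcoord and ycoord (location at the plate), xmove and ymove (pitch movement), velo (effective speed), spin (release spin rate), exvelo (exit velocity), exang (launch angle), contact (launch speed/angle classification), and year. Missing values in exvelo, exang, and contact are replaced with 0.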

Visualization

Let’s start visualizing some of this data. Before that, let me define a strike zone. This strike zone was taken from the website Baseball with R.

topKzone = 3.5
botKzone = 1.6
inKzone = -.95
outKzone = 0.95
kZone = data.frame(x = c(inKzone, inKzone, outKzone, outKzone, inKzone),
                   y = c(botKzone, topKzone, topKzone, botKzone, botKzone))
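The same four bounds can also be used to test whether a pitch location falls inside the zone, not just to draw it. This is a minimal sketch. The helper `in.zone` and the toy coordinates are hypothetical, and the bounds are repeated so the example runs on its own.

```r
# Strike zone bounds as defined above (in feet).
topKzone = 3.5
botKzone = 1.6
inKzone = -0.95
outKzone = 0.95

# Hypothetical helper: TRUE when a pitch location is inside the rectangle.
in.zone = function(x, y) {
    x >= inKzone & x <= outKzone & y >= botKzone & y <= topKzone
}

# Made-up pitch locations for illustration.
toy.x = c(0.0, 1.2, -0.9, 0.3)
toy.y = c(2.5, 2.5, 1.7, 4.0)
in.zone(toy.x, toy.y)
```

This is a fixed rectangle, so it does not adjust for the height of individual batters.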

Location via Outcome

Let’s look at pitch location by whether the pitch was a ball or a strike. In the outcome column, X is a ball hit into play, B is a ball, and S is any type of strike.

graph.pitch.heatmap.out = function(player) {
    graph = ggplot(player) +
        geom_jitter(aes(x = xcoord,
                        y = ycoord,
                        color = outcome)) +
        xlab("Horizontal Position") +
        ylab("Vertical Position") +
        ggtitle(paste(player$name[1], player$year[1], "Outcome", sep = " ")) +
        labs(color = "Pitch Outcome") +
        theme_minimal() + geom_path(aes(x, y), data = kZone)
    graph
}
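The legend in this plot shows the raw codes B, S, and X. One possible way to make them readable is to map the codes to labels before plotting. This is a sketch on made-up outcomes, and the names `outcome.labels`, `toy.outcome`, and `toy.labeled` are hypothetical.

```r
# Lookup from the one-letter outcome code to a readable label.
outcome.labels = c(B = "Ball", S = "Strike", X = "In play")

# Made-up outcome codes for illustration.
toy.outcome = c("S", "B", "X", "S", "S", "B")

# Index the lookup by code, then fix the level order for the legend.
toy.labeled = factor(unname(outcome.labels[toy.outcome]),
                     levels = unname(outcome.labels))
table(toy.labeled)
```

A factor built this way could be stored as an extra column and used as the color aesthetic in place of the raw outcome column.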

Patrick Corbin

corbin.heatmap.out = graph.pitch.heatmap.out(corbin)
corbin.heatmap.out

Marcus Stroman

stroman.heatmap.out = graph.pitch.heatmap.out(stroman)
stroman.heatmap.out