vignettes/extract-fbref-data.Rmd
This package is designed to allow users to extract various world football results and player statistics from popular football (soccer) data sites such as FBref and Transfermarkt.
You can install the CRAN version of worldfootballR with:
install.packages("worldfootballR")
You can install the development version of worldfootballR from GitHub with:
# install.packages("devtools")
devtools::install_github("JaseZiv/worldfootballR")
Package vignettes have been built to help you get started with the package.
This vignette will cover the functions to extract data from FBref.com.
NOTE: As of version 0.5.2, all FBref functions come with a user-defined pause between page loads to address FBref’s new rate limiting. See this document for more information.
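The idea of pausing between page loads can be sketched generically as follows. This is only an illustration of the pattern, not the package's internal code: fetch_with_pause(), fetch_fun and pause_secs are hypothetical names, and the dummy fetcher does no network access.

```r
# Hypothetical helper: call `fetch_fun` on each URL, sleeping between calls.
# `fetch_fun` and `pause_secs` are illustrative names, not package arguments.
fetch_with_pause <- function(urls, fetch_fun, pause_secs = 3) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    results[[i]] <- fetch_fun(urls[i])
    # Wait before the next page load, but not after the last one
    if (i < length(urls)) Sys.sleep(pause_secs)
  }
  results
}

# Usage with a dummy fetcher that only counts characters (no scraping)
dummy_fetch <- function(u) nchar(u)
out <- fetch_with_pause(c("a", "bb", "ccc"), dummy_fetch, pause_secs = 0)
```

A longer pause makes a large scrape slower but reduces the chance of hitting the site's rate limit.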
To be able to join player data between FBref and Transfermarkt, player_dictionary_mapping() has been created. There are over 6,100 players who have been listed for teams in the Big 5 Euro leagues on FBref since the start of the 2017-18 season, with all of these mapped together. This is expected to be updated and grow over time. The raw data is stored here.
mapped_players <- player_dictionary_mapping()
dplyr::glimpse(mapped_players)
#> Rows: 6,495
#> Columns: 4
#> $ PlayerFBref <chr> "Aaron Connolly", "Aaron Cresswell", "Aarón Escandell", "A…
#> $ UrlFBref <chr> "https://fbref.com/en/players/27c01749/Aaron-Connolly", "h…
#> $ UrlTmarkt <chr> "https://www.transfermarkt.com/aaron-connolly/profil/spiel…
#> $ TmPos <chr> "Centre-Forward", "Left-Back", "Goalkeeper", "Attacking Mi…
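One way the mapping might be used is to attach Transfermarkt URLs to an FBref player table by joining on the FBref player URL. The sketch below uses invented rows so it runs without scraping: the column names of mapped_players follow the output above, while fbref_players and its Url column are assumptions standing in for any FBref player output.

```r
# Toy stand-in for the output of player_dictionary_mapping()
# (same column names as above; the rows are invented for illustration)
mapped_players <- data.frame(
  PlayerFBref = c("Player A", "Player B"),
  UrlFBref    = c("https://fbref.com/en/players/aaaa1111/Player-A",
                  "https://fbref.com/en/players/bbbb2222/Player-B"),
  UrlTmarkt   = c("https://www.transfermarkt.com/player-a/profil/spieler/1",
                  "https://www.transfermarkt.com/player-b/profil/spieler/2"),
  TmPos       = c("Centre-Forward", "Left-Back"),
  stringsAsFactors = FALSE
)

# Hypothetical FBref player table; 'Url' is assumed to hold the FBref player URL
fbref_players <- data.frame(
  Player = c("Player A", "Player B", "Player C"),
  Url    = c("https://fbref.com/en/players/aaaa1111/Player-A",
             "https://fbref.com/en/players/bbbb2222/Player-B",
             "https://fbref.com/en/players/cccc3333/Player-C"),
  Gls    = c(5, 0, 2),
  stringsAsFactors = FALSE
)

# Left join: keep every FBref row, add Transfermarkt columns where a mapping exists
joined <- merge(fbref_players, mapped_players,
                by.x = "Url", by.y = "UrlFBref", all.x = TRUE)
```

Players without a mapping (here "Player C") keep their FBref columns and get NA in the Transfermarkt columns, which makes gaps in the dictionary easy to spot.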
The following section will outline the various functions available to find different URLs to be able to pass through the FBref suite of functions outlined in this vignette.
To extract the URL of any country’s league(s) (provided FBref has data for the league), use the fb_league_urls() function.
This function also accepts a tier argument: for first-tier leagues, select ‘1st’, for second-tier leagues select ‘2nd’, and so on.
A full list of available countries can be found in the worldfootballR_data repository, here.
fb_league_urls(country = "ENG", gender = "M", season_end_year = 2021, tier = '2nd')
To get a list of URLs for each team in a particular season, the fb_teams_urls() function can be used:
fb_teams_urls("https://fbref.com/en/comps/9/Premier-League-Stats")
To get a list of player URLs for a particular team, the fb_player_urls() function can be used. The results of this output can be passed through to the player season stat functions fb_player_season_stats() and fb_player_scouting_report().
fb_player_urls("https://fbref.com/en/squads/fd962109/Fulham-Stats")
To get the match URLs needed to pass into some of the match-level functions below, get_match_urls() can be used:
epl_2021_urls <- get_match_urls(country = "ENG", gender = "M", season_end_year = 2021, tier="1st")
This section will cover the functions to aid in the extraction of season team statistics.
The get_season_team_stats() function returns a data frame of different stat types for all teams in the domestic leagues listed here.
Note that some stats may not be available for all leagues. The big five European leagues should have all of these stats, but for those leagues it’s more efficient to use fb_big5_advanced_season_stats().
The following stat types can be selected:
#----- function to extract season teams stats -----#
prem_2020_shooting <- get_season_team_stats(country = "ENG", gender = "M", season_end_year = "2020", tier = "1st", stat_type = "shooting")
dplyr::glimpse(prem_2020_shooting)
#> Rows: 40
#> Columns: 25
#> $ Competition_Name <chr> "Premier League", "Premier League", "Premier …
#> $ Gender <chr> "M", "M", "M", "M", "M", "M", "M", "M", "M", …
#> $ Country <chr> "ENG", "ENG", "ENG", "ENG", "ENG", "ENG", "EN…
#> $ Season_End_Year <int> 2020, 2020, 2020, 2020, 2020, 2020, 2020, 202…
#> $ Squad <chr> "Arsenal", "Aston Villa", "Bournemouth", "Bri…
#> $ Team_or_Opponent <chr> "team", "team", "team", "team", "team", "team…
#> $ Num_Players <dbl> 29, 28, 27, 25, 22, 27, 25, 24, 24, 24, 24, 2…
#> $ Mins_Per_90 <dbl> 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 3…
#> $ Gls_Standard <dbl> 56, 40, 38, 35, 41, 69, 29, 42, 65, 83, 100, …
#> $ Sh_Standard <dbl> 401, 453, 384, 456, 384, 619, 372, 465, 533, …
#> $ SoT_Standard <dbl> 144, 146, 116, 137, 124, 210, 116, 155, 181, …
#> $ SoT_percent_Standard <dbl> 35.9, 32.2, 30.2, 30.0, 32.3, 33.9, 31.2, 33.…
#> $ Sh_per_90_Standard <dbl> 10.55, 11.92, 10.11, 12.00, 10.11, 16.29, 9.7…
#> $ SoT_per_90_Standard <dbl> 3.79, 3.84, 3.05, 3.61, 3.26, 5.53, 3.05, 4.0…
#> $ G_per_Sh_Standard <dbl> 0.13, 0.09, 0.09, 0.07, 0.10, 0.10, 0.07, 0.0…
#> $ G_per_SoT_Standard <dbl> 0.37, 0.27, 0.29, 0.25, 0.31, 0.30, 0.22, 0.2…
#> $ Dist_Standard <dbl> 15.9, 16.9, 16.4, 17.1, 15.7, 16.2, 16.5, 15.…
#> $ FK_Standard <dbl> 19, 17, 21, 11, 17, 27, 15, 20, 20, 18, 27, 3…
#> $ PK_Standard <dbl> 3, 1, 4, 1, 3, 7, 3, 1, 5, 5, 6, 10, 0, 2, 1,…
#> $ PKatt_Standard <dbl> 3, 3, 4, 2, 3, 7, 3, 1, 7, 5, 11, 14, 1, 2, 1…
#> $ xG_Expected <dbl> 49.2, 40.1, 42.7, 41.2, 43.9, 66.6, 34.0, 49.…
#> $ npxG_Expected <dbl> 46.9, 37.7, 39.7, 39.7, 41.6, 61.7, 31.9, 48.…
#> $ npxG_per_Sh_Expected <dbl> 0.12, 0.08, 0.10, 0.09, 0.11, 0.10, 0.09, 0.1…
#> $ G_minus_xG_Expected <dbl> 6.8, -0.1, -4.7, -6.2, -2.9, 2.4, -5.0, -7.3,…
#> $ `np:G_minus_xG_Expected` <dbl> 6.1, 1.3, -5.7, -5.7, -3.6, 0.3, -5.9, -7.5, …
#----- to get shooting stats for the English Championship: -----#
# championship_2020_shooting <- get_season_team_stats(country = "ENG", gender = "M", season_end_year = "2020", tier = "2nd", stat_type = "shooting")
#----- Can also run this for multiple leagues at a time: -----#
# multiple_2020_shooting <- get_season_team_stats(country = c("USA", "NED"),
# gender = "M", season_end_year = 2020,
# tier = "1st", stat_type = "shooting")
The fb_big5_advanced_season_stats() function allows users to extract data for any of the below listed stat types for all teams of the big five European leagues (EPL, La Liga, Ligue 1, Serie A, Bundesliga).
The stat types available for this function are below:
The function also accepts one or more seasons and whether you want data for players or teams.
Note that when selecting team_or_player = "team", results are returned for both the team’s for and against stats. To filter on this, use the Team_or_Opponent column in the resulting data frame, selecting ‘team’ if you want the team’s for stats, or ‘opponent’ if you want the team’s against stats.
#----- Get data for big five leagues for TEAMS -----#
big5_team_shooting <- fb_big5_advanced_season_stats(season_end_year = 2019:2021, stat_type = "shooting", team_or_player = "team")
dplyr::glimpse(big5_team_shooting)
#> Rows: 588
#> Columns: 24
#> $ Season_End_Year <int> 2019, 2019, 2019, 2019, 2019, 2019, 2019, 201…
#> $ Squad <chr> "Alavés", "Alavés", "Amiens", "Amiens", "Ange…
#> $ Comp <chr> "La Liga", "La Liga", "Ligue 1", "Ligue 1", "…
#> $ Team_or_Opponent <chr> "team", "opponent", "team", "opponent", "team…
#> $ Num_Players <dbl> 26, 26, 26, 26, 24, 24, 28, 28, 27, 27, 27, 2…
#> $ Mins_Per_90 <dbl> 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 38, 3…
#> $ Gls_Standard <dbl> 39, 48, 30, 50, 42, 47, 69, 49, 75, 43, 38, 4…
#> $ Sh_Standard <dbl> 419, 492, 393, 498, 453, 463, 467, 488, 642, …
#> $ SoT_Standard <dbl> 118, 159, 115, 155, 151, 160, 159, 173, 211, …
#> $ SoT_percent_Standard <dbl> 28.2, 32.3, 29.3, 31.1, 33.3, 34.6, 34.0, 35.…
#> $ Sh_per_90_Standard <dbl> 11.03, 12.95, 10.34, 13.11, 11.92, 12.18, 12.…
#> $ SoT_per_90_Standard <dbl> 3.11, 4.18, 3.03, 4.08, 3.97, 4.21, 4.18, 4.5…
#> $ G_per_Sh_Standard <dbl> 0.09, 0.09, 0.07, 0.09, 0.08, 0.10, 0.14, 0.0…
#> $ G_per_SoT_Standard <dbl> 0.31, 0.26, 0.24, 0.28, 0.25, 0.28, 0.41, 0.2…
#> $ Dist_Standard <dbl> 17.7, 17.0, 19.2, 18.3, 17.9, 18.1, 16.3, 17.…
#> $ FK_Standard <dbl> 15, 26, 21, 18, 19, 30, 11, 18, 16, 17, 6, 23…
#> $ PK_Standard <dbl> 2, 6, 2, 6, 5, 3, 4, 7, 2, 4, 5, 5, 3, 5, 5, …
#> $ PKatt_Standard <dbl> 3, 6, 3, 9, 10, 5, 5, 7, 4, 5, 7, 5, 4, 5, 5,…
#> $ xG_Expected <dbl> 37.3, 51.0, 34.6, 46.4, 47.7, 46.9, 59.1, 53.…
#> $ npxG_Expected <dbl> 35.0, 46.8, 32.3, 39.7, 39.8, 43.2, 55.1, 48.…
#> $ npxG_per_Sh_Expected <dbl> 0.08, 0.10, 0.08, 0.08, 0.09, 0.09, 0.12, 0.1…
#> $ G_minus_xG_Expected <dbl> 1.7, -3.0, -4.6, 3.6, -5.7, 0.1, 9.9, -4.8, 8…
#> $ `np:G_minus_xG_Expected` <dbl> 2.0, -4.8, -4.3, 4.3, -2.8, 0.8, 9.9, -6.5, 9…
#> $ Url <chr> "https://fbref.com/en/squads/8d6fd021/2018-20…
#----- Get data for big five leagues for PLAYERS -----#
big5_player_shooting <- fb_big5_advanced_season_stats(season_end_year = 2019:2021, stat_type = "shooting", team_or_player = "player")
dplyr::glimpse(big5_player_shooting)
#> Rows: 8,210
#> Columns: 27
#> $ Season_End_Year <int> 2019, 2019, 2019, 2019, 2019, 2019, 2019, 201…
#> $ Squad <chr> "Alavés", "Alavés", "Alavés", "Alavés", "Alav…
#> $ Comp <chr> "La Liga", "La Liga", "La Liga", "La Liga", "…
#> $ Player <chr> "Martin Agirregabiria", "Borja Bastón", "Alej…
#> $ Nation <chr> "ESP", "ESP", "ESP", "SRB", "ESP", "ARG", "ES…
#> $ Pos <chr> "DF", "FW", "FW,MF", "MF", "MF,FW", "FW", "DF…
#> $ Age <chr> "22", "25", "19", "26", "24", "24", "22", "24…
#> $ Born <dbl> 1996, 1992, 1998, 1992, 1993, 1993, 1995, 199…
#> $ Mins_Per_90 <dbl> 22.6, 16.0, 0.2, 17.6, 6.2, 31.9, 30.8, 4.8, …
#> $ Gls_Standard <dbl> 0, 5, 0, 0, 1, 9, 0, 0, 2, 3, 2, 2, 4, 1, 0, …
#> $ Sh_Standard <dbl> 2, 34, 0, 8, 8, 73, 22, 3, 28, 43, 10, 16, 50…
#> $ SoT_Standard <dbl> 0, 10, 0, 1, 2, 17, 7, 1, 5, 10, 5, 5, 23, 5,…
#> $ SoT_percent_Standard <dbl> 0.0, 29.4, NA, 12.5, 25.0, 23.3, 31.8, 33.3, …
#> $ Sh_per_90_Standard <dbl> 0.09, 2.13, 0.00, 0.46, 1.30, 2.29, 0.71, 0.6…
#> $ SoT_per_90_Standard <dbl> 0.00, 0.63, 0.00, 0.06, 0.32, 0.53, 0.23, 0.2…
#> $ G_per_Sh_Standard <dbl> 0.00, 0.15, NA, 0.00, 0.13, 0.10, 0.00, 0.00,…
#> $ G_per_SoT_Standard <dbl> NA, 0.50, NA, 0.00, 0.50, 0.41, 0.00, 0.00, 0…
#> $ Dist_Standard <dbl> 18.5, 10.2, NA, 15.0, 22.1, 14.5, 26.7, 14.9,…
#> $ FK_Standard <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 8, 1, 0, …
#> $ PK_Standard <dbl> 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ PKatt_Standard <dbl> 0, 0, 0, 0, 0, 2, 0, 0, 0, 1, 0, 0, 0, 0, 0, …
#> $ xG_Expected <dbl> 0.1, 5.7, 0.0, 0.5, 0.9, 7.2, 0.6, 0.1, 2.1, …
#> $ npxG_Expected <dbl> 0.1, 5.7, 0.0, 0.5, 0.9, 5.7, 0.6, 0.1, 2.1, …
#> $ npxG_per_Sh_Expected <dbl> 0.07, 0.17, NA, 0.06, 0.12, 0.08, 0.03, 0.02,…
#> $ G_minus_xG_Expected <dbl> -0.1, -0.7, 0.0, -0.5, 0.1, 1.8, -0.6, -0.1, …
#> $ `np:G_minus_xG_Expected` <dbl> -0.1, -0.7, 0.0, -0.5, 0.1, 1.3, -0.6, -0.1, …
#> $ Url <chr> "https://fbref.com/en/players/355c883a/Martin…
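The Team_or_Opponent filter described for team-level results can be sketched as follows. The rows are invented and only mimic the shape of the team output; the column names Squad, Team_or_Opponent and Gls_Standard are taken from that output, everything else is illustrative.

```r
# Toy rows shaped like the team-level output (values invented for illustration)
big5_team_toy <- data.frame(
  Squad            = c("Team A", "Team A", "Team B", "Team B"),
  Team_or_Opponent = c("team", "opponent", "team", "opponent"),
  Gls_Standard     = c(39, 48, 30, 50),
  stringsAsFactors = FALSE
)

# "For" stats: what each squad produced
team_for <- subset(big5_team_toy, Team_or_Opponent == "team")

# "Against" stats: what each squad's opponents produced
team_against <- subset(big5_team_toy, Team_or_Opponent == "opponent")

# Goal difference per squad, matching the two subsets on Squad
goal_diff <- merge(team_for, team_against, by = "Squad",
                   suffixes = c("_for", "_against"))
goal_diff$GD <- goal_diff$Gls_Standard_for - goal_diff$Gls_Standard_against
```

With several seasons in the data, the merge would also need to match on Season_End_Year so that rows from different seasons are not paired.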
The following sections outline the functions available to extract data at the per-match level.
To get the match results (and additional metadata) for all leagues and comps listed here, the following function can be used:
# function to extract Serie A match results data
serieA_2020 <- get_match_results(country = "ITA", gender = "M", season_end_year = 2020, tier = "1st")
dplyr::glimpse(serieA_2020)
#> Rows: 380
#> Columns: 20
#> $ Competition_Name <chr> "Serie A", "Serie A", "Serie A", "Serie A", "Serie A"…
#> $ Gender <chr> "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M"…
#> $ Country <chr> "ITA", "ITA", "ITA", "ITA", "ITA", "ITA", "ITA", "ITA…
#> $ Season_End_Year <int> 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020, 2020,…
#> $ Round <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ Wk <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "10…
#> $ Day <chr> "Sat", "Sat", "Sun", "Sun", "Sun", "Sun", "Sun", "Sun…
#> $ Date <date> 2019-08-24, 2019-08-24, 2019-08-25, 2019-08-25, 2019…
#> $ Time <chr> "18:00", "20:45", "18:00", "20:45", "20:45", "20:45",…
#> $ Home <chr> "Parma", "Fiorentina", "Udinese", "SPAL", "Roma", "To…
#> $ HomeGoals <dbl> 0, 3, 1, 2, 3, 2, 1, 0, 0, 4, 0, 1, 2, 2, 4, 1, 0, 1,…
#> $ Home_xG <dbl> 0.4, 1.7, 0.9, 1.6, 1.9, 1.2, 0.2, 0.8, 1.0, 1.7, 1.3…
#> $ Away <chr> "Juventus", "Napoli", "Milan", "Atalanta", "Genoa", "…
#> $ AwayGoals <dbl> 1, 4, 0, 3, 3, 1, 1, 3, 1, 0, 1, 2, 2, 1, 0, 2, 4, 1,…
#> $ Away_xG <dbl> 1.3, 2.0, 0.4, 1.7, 1.3, 1.5, 1.6, 2.3, 1.5, 0.7, 0.2…
#> $ Attendance <dbl> 20073, 33614, 24584, 11706, 38779, 16536, 16324, 1950…
#> $ Venue <chr> "Stadio Ennio Tardini", "Stadio Artemio Franchi", "Da…
#> $ Referee <chr> "Fabio Maresca", "Davide Massa", "Fabrizio Pasqua", "…
#> $ Notes <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ MatchURL <chr> "https://fbref.com/en/matches/2b01c68a/Parma-Juventus…
The function can also be used to return match results for a non-domestic league season. To use this functionality, simply set country = '' and pass the non-domestic league URL (found at https://fbref.com/en/comps/) to non_dom_league_url:
# for international friendlies:
get_match_results(country = "", gender = "M", season_end_year = 2018, tier = "", non_dom_league_url = "https://fbref.com/en/comps/218/history/Friendlies-M-Seasons")
The get_match_results() function can also be used to get data for multiple seasons, leagues, genders, etc. at once:
big_5_2020_results <- get_match_results(country = c("ENG", "ESP", "ITA", "GER", "FRA"),
gender = "M", season_end_year = 2020, tier = "1st")
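Match results in this shape can be turned into a simple points table. The sketch below is one possible approach using invented teams and scores; only the column names Home, HomeGoals, Away and AwayGoals come from the output shown earlier, and the usual 3/1/0 points scheme is an assumption.

```r
# Toy rows using the Home/HomeGoals/Away/AwayGoals columns shown above
# (teams and scores are invented for illustration)
results_toy <- data.frame(
  Home      = c("Team A", "Team B", "Team C"),
  HomeGoals = c(2, 1, 0),
  Away      = c("Team B", "Team C", "Team A"),
  AwayGoals = c(0, 1, 3),
  stringsAsFactors = FALSE
)

# Points per match: 3 for a win, 1 for a draw, 0 for a loss
home_pts <- ifelse(results_toy$HomeGoals > results_toy$AwayGoals, 3,
            ifelse(results_toy$HomeGoals == results_toy$AwayGoals, 1, 0))
away_pts <- ifelse(results_toy$AwayGoals > results_toy$HomeGoals, 3,
            ifelse(results_toy$AwayGoals == results_toy$HomeGoals, 1, 0))

# Stack home and away rows, then total the points per team
long <- data.frame(Team = c(results_toy$Home, results_toy$Away),
                   Pts  = c(home_pts, away_pts),
                   stringsAsFactors = FALSE)
league_table <- aggregate(Pts ~ Team, data = long, FUN = sum)
league_table <- league_table[order(-league_table$Pts), ]
```

When several leagues or seasons are returned together, the aggregation would also need to group by columns such as Competition_Name and Season_End_Year.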
The get_match_report() function returns similar results to get_match_results(), but provides some additional information. It does so for a single match only, not the whole season:
# function to extract match report data
liv_mci_2020 <- get_match_report(match_url = "https://fbref.com/en/matches/47880eb7/Liverpool-Manchester-City-November-10-2019-Premier-League")
dplyr::glimpse(liv_mci_2020)
#> Rows: 1
#> Columns: 21
#> $ League <chr> "Premier League"
#> $ Gender <chr> "M"
#> $ Country <chr> "ENG"
#> $ Season <chr> "2019-2020"
#> $ Match_Date <chr> NA
#> $ Matchweek <chr> "Premier League (Matchweek 12)"
#> $ Home_Team <chr> "Liverpool"
#> $ Home_Formation <chr> "4-3-3"
#> $ Home_Score <dbl> 3
#> $ Home_xG <dbl> 1
#> $ Home_Goals <chr> "\n\t\t\n\t\t\tFabinho · 6’ \n\t\t\n\t\t\tMoh…
#> $ Home_Yellow_Cards <chr> "0"
#> $ Home_Red_Cards <chr> "0"
#> $ Away_Team <chr> NA
#> $ Away_Formation <chr> "4-2-3-1"
#> $ Away_Score <dbl> 1
#> $ Away_xG <dbl> 1.3
#> $ Away_Goals <chr> "\n\t\t\n\t\t\t Bernardo Silva · 78’\n\t\t\n\…
#> $ Away_Yellow_Cards <chr> "2"
#> $ Away_Red_Cards <chr> "0"
#> $ Game_URL <chr> "https://fbref.com/en/matches/47880eb7/Liverpool-Man…
The get_match_summary() function returns the main events that occur during a match, including goals, substitutions and red/yellow cards:
# function to extract match summary data
liv_mci_2020_summary <- get_match_summary(match_url = "https://fbref.com/en/matches/47880eb7/Liverpool-Manchester-City-November-10-2019-Premier-League")
dplyr::glimpse(liv_mci_2020_summary)
#> Rows: 10
#> Columns: 30
#> $ League <chr> "Premier League", "Premier League", "Premier League"…
#> $ Gender <chr> "M", "M", "M", "M", "M", "M", "M", "M", "M", "M"
#> $ Country <chr> "ENG", "ENG", "ENG", "ENG", "ENG", "ENG", "ENG", "EN…
#> $ Season <chr> "2019-2020", "2019-2020", "2019-2020", "2019-2020", …
#> $ Match_Date <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ Matchweek <chr> "Premier League (Matchweek 12)", "Premier League (Ma…
#> $ Home_Team <chr> "Liverpool", "Liverpool", "Liverpool", "Liverpool", …
#> $ Home_Formation <chr> "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3"…
#> $ Home_Score <dbl> 3, 3, 3, 3, 3, 3, 3, 3, 3, 3
#> $ Home_xG <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ Home_Goals <chr> "\n\t\t\n\t\t\tFabinho · 6’ \n\t\t\n\t\t\tMoh…
#> $ Home_Yellow_Cards <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"
#> $ Home_Red_Cards <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"
#> $ Away_Team <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
#> $ Away_Formation <chr> "4-2-3-1", "4-2-3-1", "4-2-3-1", "4-2-3-1", "4-2-3-1…
#> $ Away_Score <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
#> $ Away_xG <dbl> 1.3, 1.3, 1.3, 1.3, 1.3, 1.3, 1.3, 1.3, 1.3, 1.3
#> $ Away_Goals <chr> "\n\t\t\n\t\t\t Bernardo Silva · 78’\n\t\t\n\…
#> $ Away_Yellow_Cards <chr> "2", "2", "2", "2", "2", "2", "2", "2", "2", "2"
#> $ Away_Red_Cards <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"
#> $ Game_URL <chr> "https://fbref.com/en/matches/47880eb7/Liverpool-Man…
#> $ Team <chr> "Liverpool", "Liverpool", "Liverpool", "Liverpool", …
#> $ Home_Away <chr> "Home", "Home", "Home", "Home", "Away", "Away", "Awa…
#> $ Event_Time <dbl> 6, 13, 51, 61, 65, 71, 78, 79, 87, 95
#> $ Is_Pens <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
#> $ Event_Half <dbl> 1, 1, 2, 2, 2, 2, 2, 2, 2, 2
#> $ Event_Type <chr> "Goal", "Goal", "Goal", "Substitute", "Yellow Card",…
#> $ Event_Players <chr> "Fabinho", "Mohamed Salah Assist: Andrew Robertson",…
#> $ Score_Progression <chr> "1:0", "2:0", "3:0", "3:0", "3:0", "3:0", "3:1", "3:…
#> $ Penalty_Number <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA
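For goals, the Event_Players column combines the scorer and the assisting player in a single string. A minimal sketch of splitting them is shown below, using the two goal values visible in the output above. It assumes the "Assist:" marker always separates the two names; other event types, such as substitutions, may be formatted differently and are not handled here.

```r
# Values shaped like the Event_Players column above
event_players <- c("Fabinho",
                   "Mohamed Salah Assist: Andrew Robertson")

# Everything before "Assist:" is treated as the main player of the event
main_player <- trimws(sub("\\s*Assist:.*$", "", event_players))

# Text after "Assist:" is the assisting player; NA when no assist is recorded
assist <- ifelse(grepl("Assist:", event_players, fixed = TRUE),
                 trimws(sub("^.*Assist:\\s*", "", event_players)),
                 NA_character_)
```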
The get_match_lineups() function returns a data frame of all players listed for that match, including whether they started on the pitch or on the bench.
From version 0.2.7, this function also returns some summary performance data for each player who played, including their position, minutes played, goals, cards, etc.
# function to extract match lineups
liv_mci_2020_lineups <- get_match_lineups(match_url = "https://fbref.com/en/matches/47880eb7/Liverpool-Manchester-City-November-10-2019-Premier-League")
dplyr::glimpse(liv_mci_2020_lineups)
#> Rows: 36
#> Columns: 17
#> $ Matchday <date> 2019-11-10, 2019-11-10, 2019-11-10, 2019-11-10, 2019-11-1…
#> $ Team <chr> "Liverpool", "Liverpool", "Liverpool", "Liverpool", "Liver…
#> $ Home_Away <chr> "Home", "Home", "Home", "Home", "Home", "Home", "Home", "H…
#> $ Formation <chr> "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3…
#> $ Player_Num <chr> "1", "3", "4", "5", "6", "9", "10", "11", "14", "26", "66"…
#> $ Player_Name <chr> "Alisson", "Fabinho", "Virgil van Dijk", "Georginio Wijnal…
#> $ Starting <chr> "Pitch", "Pitch", "Pitch", "Pitch", "Pitch", "Pitch", "Pit…
#> $ PlayerURL <chr> "https://fbref.com/en/players/7a2e46a8/Alisson", "https://…
#> $ Nation <chr> "BRA", "BRA", "NED", "NED", "CRO", "BRA", "SEN", "EGY", "E…
#> $ Pos <chr> "GK", "DM", "CB", "CM,DM", "CB", "FW,AM", "LW,RM,FW", "RW,…
#> $ Age <chr> "27-039", "26-018", "28-125", "28-364", "30-128", "28-039"…
#> $ Min <dbl> 90, 90, 90, 90, 90, 78, 90, 86, 60, 90, 90, NA, 30, NA, 4,…
#> $ Gls <dbl> 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, NA, 0, NA, 0, 0, NA, NA, …
#> $ Ast <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, NA, 0, NA, 0, 0, NA, NA, …
#> $ CrdY <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, NA, 0, 0, NA, NA, …
#> $ CrdR <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA, 0, NA, 0, 0, NA, NA, …
#> $ MatchURL <chr> "https://fbref.com/en/matches/47880eb7/Liverpool-Mancheste…
The get_match_shooting() function allows users to extract shooting and shot creation event data for one or more matches. The data returned includes who took the shot, when, with which body part and from how far away. It also includes the player who created the chance and the creation that preceded it.
#----- Get shots data for a single match played: -----#
shot_one_match <- get_match_shooting(match_url = "https://fbref.com/en/matches/a3eb7a37/Sheffield-United-Wolverhampton-Wanderers-September-14-2020-Premier-League")
#----- Can also extract for multiple matches at a time: -----#
# test_urls_multiple <- c("https://fbref.com/en/matches/c0996cac/Bordeaux-Nantes-August-21-2020-Ligue-1",
# "https://fbref.com/en/matches/9cbccb37/Dijon-Angers-August-22-2020-Ligue-1",
# "https://fbref.com/en/matches/f96cd5a0/Lorient-Strasbourg-August-23-2020-Ligue-1")
# shot_multiple_matches <- get_match_shooting(test_urls_multiple)
The get_advanced_match_stats() function allows the user to return a data frame of different stat types for matches played.
Note, some stats may not be available for all leagues. The big five European leagues should have all of these stats.
The following stat types can be selected:
The function can return stats for each player individually:
test_urls_multiple <- c("https://fbref.com/en/matches/c0996cac/Bordeaux-Nantes-August-21-2020-Ligue-1",
"https://fbref.com/en/matches/9cbccb37/Dijon-Angers-August-22-2020-Ligue-1")
advanced_match_stats <- get_advanced_match_stats(match_url = test_urls_multiple, stat_type = "possession", team_or_player = "player")
dplyr::glimpse(advanced_match_stats)
#> Rows: 61
#> Columns: 53
#> $ League <chr> "Ligue 1", "Ligue 1", "Ligue 1", "Ligue 1", "Lig…
#> $ Gender <chr> "M", "M", "M", "M", "M", "M", "M", "M", "M", "M"…
#> $ Country <chr> "FRA", "FRA", "FRA", "FRA", "FRA", "FRA", "FRA",…
#> $ Season <chr> "2020-2021", "2020-2021", "2020-2021", "2020-202…
#> $ Match_Date <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ Matchweek <chr> "Ligue 1 (Matchweek 1)", "Ligue 1 (Matchweek 1)"…
#> $ Home_Team <chr> "Bordeaux", "Bordeaux", "Bordeaux", "Bordeaux", …
#> $ Home_Formation <chr> "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-…
#> $ Home_Score <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Home_xG <dbl> 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4…
#> $ Home_Goals <chr> "\n\t\t\n\t\t\tMehdi Zerkane · 20’ \n\t\t…
#> $ Home_Yellow_Cards <chr> "2", "2", "2", "2", "2", "2", "2", "2", "2", "2"…
#> $ Home_Red_Cards <chr> "1", "1", "1", "1", "1", "1", "1", "1", "1", "1"…
#> $ Away_Team <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ Away_Formation <chr> "4-4-2", "4-4-2", "4-4-2", "4-4-2", "4-4-2", "4-…
#> $ Away_Score <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Away_xG <dbl> 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3…
#> $ Away_Goals <chr> "\n\t\t\n\t", "\n\t\t\n\t", "\n\t\t\n\t", "\n\t\…
#> $ Away_Yellow_Cards <chr> "3", "3", "3", "3", "3", "3", "3", "3", "3", "3"…
#> $ Away_Red_Cards <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0"…
#> $ Game_URL <chr> "https://fbref.com/en/matches/c0996cac/Bordeaux-…
#> $ Team <chr> "Bordeaux", "Bordeaux", "Bordeaux", "Bordeaux", …
#> $ Home_Away <chr> "Home", "Home", "Home", "Home", "Home", "Home", …
#> $ Player <chr> "Josh Maja", "Remi Oudin", "Hwang Ui-jo", "Samue…
#> $ Player_Num <dbl> 9, 28, 18, 10, 12, 7, 26, 17, 5, 29, 25, 6, 24, …
#> $ Nation <chr> "NGA", "FRA", "KOR", "NGA", "FRA", "FRA", "CRO",…
#> $ Pos <chr> "FW", "FW", "LW", "LW", "RW", "RW", "CM", "CM", …
#> $ Age <chr> "21-238", "23-277", "27-359", "22-361", "29-226"…
#> $ Min <dbl> 45, 45, 74, 16, 63, 27, 90, 19, 90, 74, 16, 90, …
#> $ Touches_Touches <dbl> 14, 22, 34, 13, 42, 13, 48, 18, 82, 47, 10, 69, …
#> $ `Def Pen_Touches` <dbl> 0, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 12, 11, 5, 28, …
#> $ `Def 3rd_Touches` <dbl> 0, 9, 6, 2, 3, 3, 14, 0, 21, 14, 7, 52, 49, 21, …
#> $ `Mid 3rd_Touches` <dbl> 9, 7, 17, 5, 27, 7, 29, 17, 50, 27, 4, 21, 36, 4…
#> $ `Att 3rd_Touches` <dbl> 5, 6, 12, 7, 14, 4, 8, 2, 16, 7, 0, 0, 1, 19, 0,…
#> $ `Att Pen_Touches` <dbl> 1, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Live_Touches <dbl> 14, 21, 33, 13, 36, 13, 47, 17, 82, 43, 9, 69, 8…
#> $ Succ_Dribbles <dbl> 0, 0, 1, 0, 2, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 0, …
#> $ Att_Dribbles <dbl> 0, 1, 3, 1, 4, 0, 2, 3, 1, 2, 0, 0, 0, 2, 0, 0, …
#> $ Succ_percent_Dribbles <dbl> NA, 0.0, 33.3, 0.0, 50.0, NA, 0.0, 66.7, 100.0, …
#> $ Player_NumPl_Dribbles <dbl> 0, 0, 2, 0, 2, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 0, …
#> $ Megs_Dribbles <dbl> 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Carries_Carries <dbl> 10, 9, 24, 10, 28, 6, 36, 19, 62, 35, 6, 51, 55,…
#> $ TotDist_Carries <dbl> 67, 39, 94, 31, 130, 48, 179, 99, 185, 71, 25, 2…
#> $ PrgDist_Carries <dbl> 15, 7, 49, 15, 67, 32, 72, 54, 106, 32, 23, 82, …
#> $ Prog_Carries <dbl> 0, 0, 5, 0, 2, 2, 2, 3, 3, 2, 0, 0, 2, 2, 0, 5, …
#> $ Final_Third_Carries <dbl> 0, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 2, …
#> $ CPA_Carries <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Mis_Carries <dbl> 1, 1, 4, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 3, …
#> $ Dis_Carries <dbl> 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, …
#> $ Targ_Receiving <dbl> 18, 22, 33, 16, 37, 16, 41, 18, 67, 35, 3, 47, 6…
#> $ Rec_Receiving <dbl> 12, 13, 22, 11, 28, 8, 37, 17, 64, 34, 3, 47, 60…
#> $ Rec_percent_Receiving <dbl> 66.7, 59.1, 66.7, 68.8, 75.7, 50.0, 90.2, 94.4, …
#> $ Prog_Receiving <dbl> 4, 0, 3, 2, 6, 2, 1, 1, 0, 0, 0, 0, 0, 0, 0, 6, …
Or used for the team totals for each match:
test_urls_multiple <- c("https://fbref.com/en/matches/c0996cac/Bordeaux-Nantes-August-21-2020-Ligue-1",
"https://fbref.com/en/matches/9cbccb37/Dijon-Angers-August-22-2020-Ligue-1")
advanced_match_stats_team <- get_advanced_match_stats(match_url = test_urls_multiple, stat_type = "passing_types", team_or_player = "team")
dplyr::glimpse(advanced_match_stats_team)
#> Rows: 4
#> Columns: 49
#> $ League <chr> "Ligue 1", "Ligue 1", "Ligue 1", "Ligue 1"
#> $ Gender <chr> "M", "M", "M", "M"
#> $ Country <chr> "FRA", "FRA", "FRA", "FRA"
#> $ Season <chr> "2020-2021", "2020-2021", "2020-2021", "2020-2021"
#> $ Match_Date <chr> NA, NA, NA, NA
#> $ Matchweek <chr> "Ligue 1 (Matchweek 1)", "Ligue 1 (Matchweek 1)", "L…
#> $ Home_Team <chr> "Bordeaux", "Bordeaux", "Dijon", "Dijon"
#> $ Home_Formation <chr> "4-3-3", "4-3-3", "4-3-3", "4-3-3"
#> $ Home_Score <dbl> 0, 0, 0, 0
#> $ Home_xG <dbl> 0.4, 0.4, 0.5, 0.5
#> $ Home_Goals <chr> "\n\t\t\n\t\t\tMehdi Zerkane · 20’ \n\t\t\n\t…
#> $ Home_Yellow_Cards <chr> "2", "2", "0", "0"
#> $ Home_Red_Cards <chr> "1", "1", "0", "0"
#> $ Away_Team <chr> NA, NA, NA, NA
#> $ Away_Formation <chr> "4-4-2", "4-4-2", "4-1-4-1", "4-1-4-1"
#> $ Away_Score <dbl> 0, 0, 1, 1
#> $ Away_xG <dbl> 0.3, 0.3, 2.1, 2.1
#> $ Away_Goals <chr> "\n\t\t\n\t", "\n\t\t\n\t", "\n\t\t\n\t\t\t Ismaël T…
#> $ Away_Yellow_Cards <chr> "3", "3", "0", "0"
#> $ Away_Red_Cards <chr> "0", "0", "0", "0"
#> $ Game_URL <chr> "https://fbref.com/en/matches/c0996cac/Bordeaux-Nant…
#> $ Team <chr> "Bordeaux", NA, "Dijon", NA
#> $ Home_Away <chr> "Home", "Away", "Home", "Away"
#> $ Min <dbl> 919, 990, 990, 990
#> $ Att <dbl> 525, 662, 578, 495
#> $ Live_Pass_Types <dbl> 476, 623, 530, 443
#> $ Dead_Pass_Types <dbl> 49, 39, 48, 52
#> $ FK_Pass_Types <dbl> 18, 9, 8, 18
#> $ TB_Pass_Types <dbl> 0, 1, 2, 3
#> $ Press_Pass_Types <dbl> 81, 40, 124, 28
#> $ Sw_Pass_Types <dbl> 10, 12, 8, 15
#> $ Crs_Pass_Types <dbl> 5, 11, 7, 15
#> $ CK_Pass_Types <dbl> 2, 3, 3, 9
#> $ In_Corner_Kicks <dbl> 2, 0, 2, 0
#> $ Out_Corner_Kicks <dbl> 0, 3, 1, 3
#> $ Str_Corner_Kicks <dbl> 0, 0, 0, 5
#> $ Ground_Height <dbl> 356, 529, 458, 369
#> $ Low_Height <dbl> 64, 64, 62, 46
#> $ High_Height <dbl> 105, 69, 58, 80
#> $ Left_Body_Parts <dbl> 147, 277, 146, 198
#> $ Right_Body_Parts <dbl> 326, 331, 377, 253
#> $ Head_Body_Parts <dbl> 13, 24, 11, 9
#> $ TI_Body_Parts <dbl> 24, 19, 25, 19
#> $ Other_Body_Parts <dbl> 1, 4, 10, 10
#> $ Cmp_Outcomes <dbl> 431, 577, 499, 416
#> $ Off_Outcomes <dbl> 0, 3, 1, 0
#> $ Out_Outcomes <dbl> 10, 7, 7, 9
#> $ Int_Outcomes <dbl> 1, 2, 6, 3
#> $ Blocks_Outcomes <dbl> 9, 18, 9, 11
This section will cover the functions to get team-level data from FBref.
To get the results of every match a team (or teams) has played in a season, use get_team_match_results(). The resulting data frame includes all game results, including any cup games played, and the function accepts either one or many team URLs.
#----- for single teams: -----#
man_city_2021_url <- "https://fbref.com/en/squads/b8fd03ef/Manchester-City-Stats"
man_city_2021_results <- get_team_match_results(man_city_2021_url)
dplyr::glimpse(man_city_2021_results)
#> Rows: 58
#> Columns: 20
#> $ Team_Url <chr> "https://fbref.com/en/squads/b8fd03ef/Manchester-City-Stats…
#> $ Team <chr> "Manchester City", "Manchester City", "Manchester City", "M…
#> $ Date <chr> "2021-08-07", "2021-08-15", "2021-08-21", "2021-08-28", "20…
#> $ Time <chr> "17:15", "16:30", "15:00", "12:30", "15:00", "20:00", "15:0…
#> $ Comp <chr> "Community Shield", "Premier League", "Premier League", "Pr…
#> $ Round <chr> "FA Community Shield", "Matchweek 1", "Matchweek 2", "Match…
#> $ Day <chr> "Sat", "Sun", "Sat", "Sat", "Sat", "Wed", "Sat", "Tue", "Sa…
#> $ Venue <chr> "Neutral", "Away", "Home", "Home", "Away", "Home", "Home", …
#> $ Result <chr> "L", "L", "W", "W", "W", "W", "D", "W", "W", "L", "D", "W",…
#> $ GF <chr> "0", "0", "5", "5", "1", "6", "0", "6", "1", "0", "2", "2",…
#> $ GA <chr> "1", "1", "0", "0", "0", "3", "0", "1", "0", "2", "2", "0",…
#> $ Opponent <chr> "Leicester City", "Tottenham", "Norwich City", "Arsenal", "…
#> $ xG <dbl> NA, 1.9, 2.7, 3.8, 2.9, 2.1, 1.1, NA, 1.7, 1.9, 1.2, 1.9, 3…
#> $ xGA <dbl> NA, 1.3, 0.1, 0.1, 0.8, 0.6, 0.4, NA, 0.3, 0.8, 1.0, 1.0, 0…
#> $ Poss <int> 57, 64, 67, 80, 61, 51, 63, 79, 60, 54, 51, 69, 63, 52, 64,…
#> $ Attendance <dbl> NA, 58262, 51437, 52276, 32087, 38062, 52698, 30959, 40036,…
#> $ Captain <chr> "Fernandinho", "Fernandinho", "İlkay Gündoğan", "İlkay Günd…
#> $ Formation <chr> "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-3", "4-3-…
#> $ Referee <chr> "Paul Tierney", "Anthony Taylor", "Graham Scott", "Martin A…
#> $ Notes <chr> "", "", "", "", "", "", "", "", "", "", "", "", "", "", "We…
#----- get all team URLs for a league, then pass them all in: -----#
# epl_2021_team_urls <- fb_teams_urls("https://fbref.com/en/comps/9/Premier-League-Stats")
# epl_2021_team_results <- get_team_match_results(team_url = epl_2021_team_urls)
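In the output above, GF and GA are returned as character columns, so they need converting before any arithmetic. The sketch below shows one way to do that and summarise league games only. The rows, including the competition names, are invented for illustration; real values may contain non-numeric text, which as.numeric() would turn into NA with a warning.

```r
# Toy rows shaped like the team results output (values invented for illustration)
team_results_toy <- data.frame(
  Comp   = c("Premier League", "Premier League", "Champions Lg"),
  Result = c("W", "D", "W"),
  GF     = c("5", "0", "6"),
  GA     = c("0", "0", "3"),
  stringsAsFactors = FALSE
)

# Convert goal columns from character to numeric before doing arithmetic
team_results_toy$GF <- as.numeric(team_results_toy$GF)
team_results_toy$GA <- as.numeric(team_results_toy$GA)

# Keep league games only, then count results and compute goal difference
league_only <- subset(team_results_toy, Comp == "Premier League")
result_counts <- table(league_only$Result)
goal_difference <- sum(league_only$GF) - sum(league_only$GA)
```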
To get full match logs for all available stat types for one or more teams’ seasons, the fb_team_match_log_stats() function can be used.
Available stat types are below. Note that not all of the listed stat types are available for all teams, and even where they are, there may still be gaps in the resulting data frame, as not every competition a team participated in will have the stat available:
# can do it for one team:
man_city_url <- "https://fbref.com/en/squads/b8fd03ef/Manchester-City-Stats"
man_city_logs <- fb_team_match_log_stats(team_urls = man_city_url, stat_type = "passing")
dplyr::glimpse(man_city_logs)
# or multiple teams:
urls <- c("https://fbref.com/en/squads/822bd0ba/Liverpool-Stats",
"https://fbref.com/en/squads/b8fd03ef/Manchester-City-Stats")
shooting_logs <- fb_team_match_log_stats(team_urls = urls, stat_type = "shooting")
This section will cover the functions available to aid in the extraction of player season data.
Many of the examples below have the actual URL (player or team) passed to them directly, but the suite of FBref helper functions outlined in this helpers vignette could also be used to obtain those URLs.
The fb_player_scouting_report() function takes in two inputs:
- player_url: the URL of the player’s main page
- pos_versus: which can return the player’s comparison against players in their “primary” OR “secondary” position
It returns the full scouting report for the player selected.
As of version 0.3.6, the function returns the scouting report of ALL available periods, not just the “Last 365 Days”. As a result, there is now an additional column called scouting_period. This column should be used to filter out the period/season you want the scouting report for:
# TO GET THE LAST 365 DAYS REPORT:
library(dplyr) # attach dplyr so the %>% pipe is available
scout <- fb_player_scouting_report(player_url = "https://fbref.com/en/players/d70ce98e/Lionel-Messi",
pos_versus = "primary") %>%
dplyr::filter(scouting_period == "Last 365 Days")
#----- Get scouting report for the player's primary position (first position listed in FBref): -----#
messi_primary <- fb_player_scouting_report(player_url = "https://fbref.com/en/players/d70ce98e/Lionel-Messi", pos_versus = "primary")
dplyr::glimpse(messi_primary)
#> Rows: 894
#> Columns: 8
#> $ Player <chr> "Lionel Messi", "Lionel Messi", "Lionel Messi", "Lione…
#> $ Versus <chr> "Att Mid / Wingers", "Att Mid / Wingers", "Att Mid / W…
#> $ StatGroup <chr> "Standard", "Standard", "Standard", "Standard", "Stand…
#> $ Statistic <chr> "Goals", "Assists", "Non-Penalty Goals", "Penalty Kick…
#> $ Per90 <dbl> 0.36, 0.45, 0.29, 0.06, 0.10, 0.03, 0.00, 0.48, 0.40, …
#> $ Percentile <dbl> 75, 97, 70, 83, 86, 91, 56, 93, 93, 96, 98, 75, 99, 87…
#> $ BasedOnMinutes <dbl> 2783, 2783, 2783, 2783, 2783, 2783, 2783, 2783, 2783, …
#> $ scouting_period <chr> "Last 365 Days", "Last 365 Days", "Last 365 Days", "La…
#----- Get scouting report for the player's secondary position (second position listed in fbref): -----#
messi_secondary <- fb_player_scouting_report(player_url = "https://fbref.com/en/players/d70ce98e/Lionel-Messi", pos_versus = "secondary")
dplyr::glimpse(messi_secondary)
#> Rows: 894
#> Columns: 8
#> $ Player <chr> "Lionel Messi", "Lionel Messi", "Lionel Messi", "Lione…
#> $ Versus <chr> "Forwards", "Forwards", "Forwards", "Forwards", "Forwa…
#> $ StatGroup <chr> "Standard", "Standard", "Standard", "Standard", "Stand…
#> $ Statistic <chr> "Goals", "Assists", "Non-Penalty Goals", "Penalty Kick…
#> $ Per90 <dbl> 0.36, 0.45, 0.29, 0.06, 0.10, 0.03, 0.00, 0.48, 0.40, …
#> $ Percentile <dbl> 42, 99, 40, 57, 62, 92, 55, 74, 73, 99, 93, 42, 91, 72…
#> $ BasedOnMinutes <dbl> 2783, 2783, 2783, 2783, 2783, 2783, 2783, 2783, 2783, …
#> $ scouting_period <chr> "Last 365 Days", "Last 365 Days", "Last 365 Days", "La…
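Because the primary and secondary reports share the same statistics but differ in the comparison group, it can be useful to place their percentiles side by side. The following is a minimal base R sketch, not part of the package: the two small data frames are toy stand-ins for `messi_primary` and `messi_secondary`, built from the first three rows printed above, and the `compare` and `diff` names are purely illustrative.

```r
# Toy stand-ins for messi_primary / messi_secondary; values are copied
# from the first three rows of the glimpse() output above.
primary <- data.frame(
  StatGroup = "Standard",
  Statistic = c("Goals", "Assists", "Non-Penalty Goals"),
  Versus = "Att Mid / Wingers",
  Percentile = c(75, 97, 70),
  scouting_period = "Last 365 Days",
  stringsAsFactors = FALSE
)
secondary <- data.frame(
  StatGroup = "Standard",
  Statistic = c("Goals", "Assists", "Non-Penalty Goals"),
  Versus = "Forwards",
  Percentile = c(42, 99, 40),
  scouting_period = "Last 365 Days",
  stringsAsFactors = FALSE
)

# Keep a single scouting period, then join the two reports on the
# stat group and statistic name.
keep <- c("StatGroup", "Statistic", "Percentile")
compare <- merge(
  primary[primary$scouting_period == "Last 365 Days", keep],
  secondary[secondary$scouting_period == "Last 365 Days", keep],
  by = c("StatGroup", "Statistic"),
  suffixes = c("_primary", "_secondary")
)

# Positive values: the player ranks higher against the primary group
compare$diff <- compare$Percentile_primary - compare$Percentile_secondary
compare
```

With real output, it is worth checking that each StatGroup and Statistic pair is unique within a period before joining, since the sketch assumes it is.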
The fb_player_season_stats() function extracts historical season totals for the selected player URLs and stat_type.
The stat_types available for use in this function are below:
#----- can use for a single player: -----#
mo_shooting <- fb_player_season_stats("https://fbref.com/en/players/e342ad68/Mohamed-Salah", stat_type = 'shooting')
dplyr::glimpse(mo_shooting)
#> Rows: 39
#> Columns: 25
#> $ player_name <chr> "Mohamed Salah", "Mohamed Salah", "Mohamed Sa…
#> $ player_url <chr> "https://fbref.com/en/players/e342ad68/Mohame…
#> $ Season <chr> "2012-2013", "2012-2013", "2013-2014", "2013-…
#> $ Age <dbl> 20, 20, 21, 21, 21, 22, 22, 22, 22, 22, 22, 2…
#> $ Squad <chr> "Basel", "Basel", "Basel", "Basel", "Chelsea"…
#> $ Country <chr> "", "SUI", "", "SUI", "ENG", "", "ENG", "ENG"…
#> $ Comp <chr> "2. Europa Lg", "1. Swiss Super League", "1. …
#> $ Mins_Per_90 <dbl> 10.5, 16.0, 5.9, 12.8, 5.6, 0.8, 0.8, 1.7, 0.…
#> $ Gls_Standard <dbl> 2, 5, 2, 4, 2, 0, 0, 0, 0, 2, 1, 6, 1, 0, 14,…
#> $ Sh_Standard <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ SoT_Standard <dbl> 7, 14, 6, 14, 6, 2, NA, NA, 0, NA, 5, 13, 3, …
#> $ SoT_percent_Standard <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ Sh_per_90_Standard <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ SoT_per_90_Standard <dbl> 0.66, 0.87, 1.02, 1.10, 1.08, 2.37, NA, NA, 0…
#> $ G_per_Sh_Standard <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ G_per_SoT_Standard <dbl> 0.29, NA, 0.33, 0.29, 0.33, 0.00, NA, NA, NA,…
#> $ Dist_Standard <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ FK_Standard <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ PK_Standard <dbl> 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ PKatt_Standard <dbl> 0, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
#> $ xG_Expected <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ npxG_Expected <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ npxG_per_Sh_Expected <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ G_minus_xG_Expected <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#> $ `np:G_minus_xG_Expected` <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
#----- Or for multiple players at a time: -----#
# multiple_playing_time <- fb_player_season_stats(player_url = c("https://fbref.com/en/players/d70ce98e/Lionel-Messi",
# "https://fbref.com/en/players/dea698d9/Cristiano-Ronaldo"),
# stat_type = "playing_time")
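The season output above contains one row per season and competition, and in the rows shown the Europa League entries carry an empty `Country` while the Swiss Super League entries carry "SUI". Assuming that pattern holds for the rest of the data, a sketch like the one below (base R, on a toy data frame built from the first four rows printed above) separates all-competition totals from domestic-league totals. The `season_stats`, `all_comps` and `domestic` names are illustrative only.

```r
# Toy stand-in for mo_shooting, using the first four rows shown above
season_stats <- data.frame(
  Season = c("2012-2013", "2012-2013", "2013-2014", "2013-2014"),
  Squad = "Basel",
  Country = c("", "SUI", "", "SUI"),
  Gls_Standard = c(2, 5, 2, 4),
  stringsAsFactors = FALSE
)

# Goals per season across every competition
all_comps <- aggregate(Gls_Standard ~ Season, data = season_stats, FUN = sum)

# Goals per season for rows with a non-empty Country
# (assumed here to mark domestic competitions)
domestic_rows <- season_stats[season_stats$Country != "", ]
domestic <- aggregate(Gls_Standard ~ Season, data = domestic_rows, FUN = sum)

all_comps
domestic
```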
The fb_big5_advanced_season_stats() function allows users to extract data for any of the stat types listed below for all players of the big five European leagues (EPL, La Liga, Ligue 1, Serie A, Bundesliga).
The stat types available for this function are below:
The function also accepts one or more seasons (season_end_year) and whether data should be returned for players or for teams (team_or_player).
big5_player_possession <- fb_big5_advanced_season_stats(season_end_year = 2021, stat_type = "possession", team_or_player = "player")
dplyr::glimpse(big5_player_possession)
#> Rows: 2,822
#> Columns: 34
#> $ Season_End_Year <int> 2021, 2021, 2021, 2021, 2021, 2021, 2021, 2021, …
#> $ Squad <chr> "Alavés", "Alavés", "Alavés", "Alavés", "Alavés"…
#> $ Comp <chr> "La Liga", "La Liga", "La Liga", "La Liga", "La …
#> $ Player <chr> "Martin Agirregabiria", "Rodrigo Battaglia", "Bu…
#> $ Nation <chr> "ESP", "ARG", "ESP", "ESP", "BRA", "ESP", "BRA",…
#> $ Pos <chr> "DF,MF", "MF", "MF", "MF", "FW", "DF", "DF", "MF…
#> $ Age <chr> "24", "29", "26", "23", "29", "24", "26", "34", …
#> $ Born <dbl> 1996, 1991, 1993, 1997, 1991, 1995, 1993, 1986, …
#> $ Mins_Per_90 <dbl> 17.3, 27.7, 0.2, 2.4, 9.7, 28.3, 5.2, 12.5, 0.2,…
#> $ Touches_Touches <dbl> 883, 1266, 12, 115, 443, 1656, 209, 521, 11, 87,…
#> $ `Def Pen_Touches` <dbl> 48, 86, 1, 3, 13, 105, 37, 25, 0, 1, 57, 15, 0, …
#> $ `Def 3rd_Touches` <dbl> 306, 380, 1, 26, 58, 521, 126, 123, 0, 4, 155, 1…
#> $ `Mid 3rd_Touches` <dbl> 454, 813, 7, 51, 226, 830, 78, 341, 7, 47, 800, …
#> $ `Att 3rd_Touches` <dbl> 155, 145, 6, 47, 175, 369, 10, 77, 4, 41, 565, 2…
#> $ `Att Pen_Touches` <dbl> 8, 21, 1, 3, 31, 12, 6, 13, 1, 10, 155, 15, 0, 1…
#> $ Live_Touches <dbl> 706, 1227, 12, 110, 433, 1344, 196, 510, 8, 79, …
#> $ Succ_Dribbles <dbl> 4, 18, 0, 3, 6, 5, 1, 2, 0, 1, 30, 22, 0, 0, 10,…
#> $ Att_Dribbles <dbl> 8, 23, 0, 3, 15, 11, 1, 6, 0, 2, 65, 45, 0, 0, 1…
#> $ Succ_percent_Dribbles <dbl> 50.0, 78.3, NA, 100.0, 40.0, 45.5, 100.0, 33.3, …
#> $ `#Pl_Dribbles` <dbl> 4, 19, 0, 3, 6, 6, 1, 4, 0, 1, 35, 27, 0, 0, 10,…
#> $ Megs_Dribbles <dbl> 0, 0, 0, 0, 2, 1, 0, 1, 0, 0, 3, 3, 0, 0, 1, 0, …
#> $ Carries_Carries <dbl> 412, 804, 8, 62, 202, 713, 88, 281, 4, 44, 711, …
#> $ TotDist_Carries <dbl> 1633, 4594, 43, 225, 618, 3260, 252, 1079, 49, 1…
#> $ PrgDist_Carries <dbl> 843, 2104, 35, 120, 258, 1688, 98, 467, 26, 39, …
#> $ Prog_Carries <dbl> 33, 78, 3, 6, 14, 82, 2, 18, 2, 2, 42, 60, 1, 29…
#> $ Final_Third_Carries <dbl> 8, 23, 2, 2, 2, 25, 1, 6, 0, 2, 12, 25, 0, 1, 2,…
#> $ CPA_Carries <dbl> 2, 0, 1, 1, 2, 1, 0, 0, 0, 0, 11, 3, 0, 0, 0, 2,…
#> $ Mis_Carries <dbl> 8, 15, 0, 8, 28, 8, 1, 8, 1, 5, 77, 23, 0, 3, 5,…
#> $ Dis_Carries <dbl> 9, 18, 0, 6, 13, 7, 0, 11, 0, 1, 50, 21, 0, 3, 6…
#> $ Targ_Receiving <dbl> 501, 840, 12, 93, 530, 808, 101, 370, 7, 124, 17…
#> $ Rec_Receiving <dbl> 425, 778, 7, 69, 301, 755, 98, 307, 5, 67, 1129,…
#> $ Rec_percent_Receiving <dbl> 84.8, 92.6, 58.3, 74.2, 56.8, 93.4, 97.0, 83.0, …
#> $ Prog_Receiving <dbl> 19, 16, 1, 8, 75, 15, 4, 36, 1, 12, 211, 36, 0, …
#> $ Url <chr> "https://fbref.com/en/players/355c883a/Martin-Ag…
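The possession columns above are season totals, so comparing players with very different playing time usually calls for a per-90 rate. The sketch below assumes `Mins_Per_90` holds minutes played divided by 90, and uses a toy data frame with the first two players printed above. The `Touches_per_90` column is a hypothetical name introduced here, not something the package returns.

```r
# Toy stand-in for big5_player_possession (first two rows shown above)
big5 <- data.frame(
  Player = c("Martin Agirregabiria", "Rodrigo Battaglia"),
  Squad = "Alavés",
  Mins_Per_90 = c(17.3, 27.7),
  Touches_Touches = c(883, 1266),
  stringsAsFactors = FALSE
)

# Season total divided by the number of 90-minute blocks played
big5$Touches_per_90 <- round(big5$Touches_Touches / big5$Mins_Per_90, 1)

# Highest rate first
big5[order(-big5$Touches_per_90), c("Player", "Touches_per_90")]
```

On the full data, filtering out players with a very small `Mins_Per_90` before ranking avoids rates inflated by a handful of minutes.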
The fb_team_player_stats() function allows users to extract data for any of the stat types listed below for all players of the selected team season(s).
The stat types available for this function are below:
#----- to get stats for just a single team: -----#
fleetwood_standard_stats <- fb_team_player_stats(team_urls = "https://fbref.com/en/squads/d6a369a2/Fleetwood-Town-Stats", stat_type = "standard")
dplyr::glimpse(fleetwood_standard_stats)
#> Rows: 41
#> Columns: 24
#> $ Season <chr> "2021-2022", "2021-2022", "2021-2022", "202…
#> $ Squad <chr> "Fleetwood Town", "Fleetwood Town", "Fleetw…
#> $ Comp <chr> "League One", "League One", "League One", "…
#> $ Player <chr> "Alex Cairns", "Danny Andrew", "Tom Clarke"…
#> $ Nation <chr> "ENG", "ENG", "ENG", "ENG", "NIR", "ENG", "…
#> $ Pos <chr> "GK", "DF,MF", "DF", "DF,MF", "MF", "MF", "…
#> $ Age <chr> "28", "30", "33", "24", "25", "25", "23", "…
#> $ MP_Playing_Time <dbl> 42, 39, 35, 35, 31, 32, 30, 28, 26, 20, 20,…
#> $ Starts_Playing_Time <dbl> 42, 39, 35, 32, 29, 26, 23, 23, 19, 19, 18,…
#> $ Min_Playing_Time <dbl> 3774, 3408, 3106, 2923, 2544, 2309, 2004, 1…
#> $ Mins_Per_90_Playing_Time <dbl> 41.9, 37.9, 34.5, 32.5, 28.3, 25.7, 22.3, 2…
#> $ Gls <dbl> 0, 6, 2, 3, 3, 5, 2, 7, 4, 0, 0, 1, 1, 6, 0…
#> $ Ast <dbl> 0, 7, 0, 1, 5, 2, 5, 1, 8, 0, 1, 0, 1, 2, 2…
#> $ G_minus_PK <dbl> 0, 6, 2, 3, 3, 5, 2, 6, 4, 0, 0, 1, 1, 5, 0…
#> $ PK <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0…
#> $ PKatt <dbl> 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0…
#> $ CrdY <dbl> 1, 5, 4, 3, 2, 4, 4, 2, 2, 3, 2, 6, 4, 2, 0…
#> $ CrdR <dbl> 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0…
#> $ Gls_Per_Minutes <dbl> 0.00, 0.16, 0.06, 0.09, 0.11, 0.19, 0.09, 0…
#> $ Ast_Per_Minutes <dbl> 0.00, 0.18, 0.00, 0.03, 0.18, 0.08, 0.22, 0…
#> $ `G+A_Per_Minutes` <dbl> 0.00, 0.34, 0.06, 0.12, 0.28, 0.27, 0.31, 0…
#> $ G_minus_PK_Per_Minutes <dbl> 0.00, 0.16, 0.06, 0.09, 0.11, 0.19, 0.09, 0…
#> $ `G+A_minus_PK_Per_Minutes` <dbl> 0.00, 0.34, 0.06, 0.12, 0.28, 0.27, 0.31, 0…
#> $ PlayerURL <chr> "https://fbref.com/en/players/1a5e36e7/Alex…
#----- Can even get stats for a series of teams: -----#
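# NOTE: season_end_year and tier below are assumed values (League One, 2021-2022),
# chosen to match the Fleetwood Town example above
# league_url <- fb_league_urls(country = "ENG", gender = "M",
# season_end_year = 2022, tier = "3rd")
# teams <- fb_teams_urls(league_url)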
#
# multiple_playing_time <- fb_team_player_stats(team_urls= teams, stat_type= "playing_time")
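The column names in the standard output can be checked against one another. The sketch below, on a toy data frame with the first three Fleetwood Town rows printed above, suggests that `Mins_Per_90_Playing_Time` is `Min_Playing_Time / 90` and that the `_Per_Minutes` columns behave like per-90 rates. This is an observation from the printed values only, not a documented guarantee.

```r
# Toy stand-in for fleetwood_standard_stats (first three rows shown above)
fleetwood <- data.frame(
  Player = c("Alex Cairns", "Danny Andrew", "Tom Clarke"),
  Min_Playing_Time = c(3774, 3408, 3106),
  Mins_Per_90_Playing_Time = c(41.9, 37.9, 34.5),
  Gls = c(0, 6, 2),
  Gls_Per_Minutes = c(0.00, 0.16, 0.06),
  stringsAsFactors = FALSE
)

# Recompute both derived columns from the raw totals
nineties <- fleetwood$Min_Playing_Time / 90
gls_per_90 <- fleetwood$Gls / nineties

# Both comparisons return TRUE for these three rows
all.equal(round(nineties, 1), fleetwood$Mins_Per_90_Playing_Time)
all.equal(round(gls_per_90, 2), fleetwood$Gls_Per_Minutes)
```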
The fb_player_match_logs() function returns a data frame of the match logs of different stat types for a player's matches played in a season.
The following stat types can be selected, depending on the player's position (e.g. a striker probably won't have "keepers" stats):
ederson_summary <- fb_player_match_logs("https://fbref.com/en/players/3bb7b8b4/Ederson", season_end_year = 2021, stat_type = 'summary')
dplyr::glimpse(ederson_summary)
#> Rows: 66
#> Columns: 39
#> $ Player <chr> "Ederson", "Ederson", "Ederson", "Ederson", "Ederso…
#> $ Season <chr> "2020-2021", "2020-2021", "2020-2021", "2020-2021",…
#> $ Date <chr> "2020-09-21", "2020-09-27", "2020-09-30", "2020-10-…
#> $ Day <chr> "Mon", "Sun", "Wed", "Sat", "Fri", "Tue", "Sat", "W…
#> $ Comp <chr> "Premier League", "Premier League", "EFL Cup", "Pre…
#> $ Round <chr> "Matchweek 2", "Matchweek 3", "Fourth round", "Matc…
#> $ Venue <chr> "Away", "Home", "Away", "Away", "Home", "Away", "Ho…
#> $ Result <chr> "W 3–1", "L 2–5", "W 3–0", "D 1–1", "W 5–0", "W 4–2…
#> $ Squad <chr> "Manchester City", "Manchester City", "Manchester C…
#> $ Opponent <chr> "Wolves", "Leicester City", "Burnley", "Leeds Unite…
#> $ Start <chr> "Y", "Y", "N", "Y", "N", "N", "Y", "Y", "Y", "Y", "…
#> $ Pos <chr> "GK", "GK", "On matchday squad, but did not play", …
#> $ Min <dbl> 90, 90, NA, 90, NA, NA, 90, 90, 90, 90, 90, 90, 90,…
#> $ Gls <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Ast <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ PK <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ PKatt <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Sh <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ SoT <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ CrdY <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ CrdR <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Touches <dbl> 36, 19, NA, 36, NA, NA, 50, 39, 26, 22, 36, 30, 40,…
#> $ Press <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 1…
#> $ Tkl <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ Int <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Blocks <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ xG_Expected <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ npxG_Expected <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ xA_Expected <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ SCA_SCA <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ GCA_SCA <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ Cmp_Passes <dbl> 30, 15, NA, 24, NA, NA, 29, 31, 24, 19, 28, 24, 33,…
#> $ Att_Passes <dbl> 36, 15, NA, 30, NA, NA, 46, 38, 26, 21, 35, 29, 37,…
#> $ Cmp_percent_Passes <dbl> 83.3, 100.0, NA, 80.0, NA, NA, 63.0, 81.6, 92.3, 90…
#> $ Prog_Passes <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ Carries_Carries <dbl> 27, 5, NA, 17, NA, NA, 29, 24, 13, 13, 27, 20, 25, …
#> $ Prog_Carries <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ Succ_Dribbles <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
#> $ Att_Dribbles <dbl> 0, 0, NA, 0, NA, NA, 0, 0, 0, 0, 0, 0, 0, NA, NA, 0…
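The match log above includes matches where the player was in the squad but did not play. Those rows have `NA` in `Min` and in the stat columns, so they usually need to be dropped before aggregating. A minimal base R sketch, using a toy data frame with the first three rows printed above (the `logs`, `played` and `pass_pct` names are illustrative):

```r
# Toy stand-in for ederson_summary (first three rows shown above)
logs <- data.frame(
  Opponent = c("Wolves", "Leicester City", "Burnley"),
  Pos = c("GK", "GK", "On matchday squad, but did not play"),
  Min = c(90, 90, NA),
  Cmp_Passes = c(30, 15, NA),
  Att_Passes = c(36, 15, NA),
  stringsAsFactors = FALSE
)

# Keep only matches in which the player actually featured
played <- logs[!is.na(logs$Min), ]

# Overall pass completion: sum the counts first, then divide,
# rather than averaging the per-match percentages
pass_pct <- round(100 * sum(played$Cmp_Passes) / sum(played$Att_Passes), 1)
pass_pct
```

Summing completed and attempted passes before dividing weights each match by its volume, which a plain mean of `Cmp_percent_Passes` would not do.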