Formats of Competition Results

Evgeni Chasnovski

2023-02-28

comperes offers a pipe (%>%) friendly set of tools for storing and managing competition results (hereafter “results”). This vignette discusses the two formats for storing results, long and wide, and conversion between them.

The notion of competition used here is quite general: it is a set of games (abstract events) in which players (abstract entities) gain some abstract scores (typically numeric). Sports results are the most natural example, but not the only one. For example, product rating can be considered a competition between products as “players”. Here a “game” is a customer who reviews a set of products by rating them with a numerical “score” (stars, points, etc.).
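The product rating example can be written down directly as a table of results. The sketch below uses made-up customers, products and star ratings (all values are hypothetical) and relies on as_longcr(), which is described in the “Long format” section:

```r
library(comperes)
library(tibble)

# Hypothetical data: each customer is a "game", each reviewed product is a
# "player", and the number of stars is the "score"
ratings <- tibble(
  game   = c("customer_1", "customer_1", "customer_2", "customer_2", "customer_2"),
  player = c("tea", "coffee", "tea", "coffee", "juice"),
  score  = c(5, 3, 4, 4, 2)
)

# Store the ratings as competition results in long format
ratings_cr <- as_longcr(ratings)
ratings_cr
```

Note that games do not need to have the same number of players: the second customer rated three products, the first only two.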

We will need the following packages:

library(comperes)
library(tibble)

Storage

Long format

Results in long format are stored in an object of class longcr. It is considered to be a tibble with one row per game-player pair. It should have at least the columns “game”, “player” and “score”. For example:

cr_long_raw <- tibble(
  game   = c(1,  1,  1, 2, 2, 3, 3, 4),
  player = c(1, NA, NA, 1, 2, 2, 1, 2),
  score  = 1:8
)

To convert cr_long_raw into a longcr object, use as_longcr():

cr_long <- as_longcr(cr_long_raw)
cr_long
#> # A longcr object:
#> # A tibble: 8 × 3
#>    game player score
#>   <dbl>  <dbl> <int>
#> 1     1      1     1
#> 2     1     NA     2
#> 3     1     NA     3
#> 4     2      1     4
#> 5     2      2     5
#> 6     3      2     6
#> 7     3      1     7
#> 8     4      2     8
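Whether an object is already in long format can be checked with is_longcr(). The following sketch repeats the data from above so that it runs on its own; the raw tibble is expected to fail the check because it lacks the longcr class, while the converted object should pass:

```r
library(comperes)
library(tibble)

cr_long_raw <- tibble(
  game   = c(1,  1,  1, 2, 2, 3, 3, 4),
  player = c(1, NA, NA, 1, 2, 2, 1, 2),
  score  = 1:8
)
cr_long <- as_longcr(cr_long_raw)

# The raw tibble has the required columns but not the class
is_longcr(cr_long_raw)

# The converted object carries the longcr class on top of the tibble classes
is_longcr(cr_long)
class(cr_long)
```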

By default, as_longcr() repairs its input by applying a set of heuristics to extract relevant data:

tibble(
  PlayerRS = "a",
  gameSS = "b",
  extra = -1,
  score_game = 10,
  player = 1
) %>%
  as_longcr()
#> as_longcr: Some matched names are not perfectly matched:
#>   gameSS -> game
#>   score_game -> score
#> # A longcr object:
#> # A tibble: 1 × 5
#>   game  player score PlayerRS extra
#>   <chr>  <dbl> <dbl> <chr>    <dbl>
#> 1 b          1    10 a           -1
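Repairing can be switched off with the repair argument. The sketch below assumes, based on the package documentation, that with repair = FALSE the input is only converted to a tibble and given the longcr class, so column names are left as supplied:

```r
library(comperes)
library(tibble)

messy <- tibble(
  PlayerRS = "a",
  gameSS = "b",
  extra = -1,
  score_game = 10,
  player = 1
)

# No name matching is attempted here: the class is added,
# but the columns are not renamed or reordered
not_repaired <- as_longcr(messy, repair = FALSE)
colnames(not_repaired)

# Check whether the unrepaired object qualifies as proper long format
is_longcr(not_repaired)
```

This is mostly useful when the input is already known to be well formed and the repairing messages are unwanted.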

Wide format

Results in wide format are stored in an object of class widecr. It is considered to be a tibble with one row per game and a fixed number of players. Data should be organized in pairs of columns “player”-“score”. The identifier of a pair should follow the respective keyword and consist only of digits, for example: player1, score1, player2, score2. Column order doesn’t matter.

Extra columns are allowed. A game column holding the game identifier is optional.

An example of a correct wide format:

cr_wide_raw <- tibble(
  player1 = c(1, 1, 2),
  score1  = -(1:3),
  player2 = c(2, 3, 3),
  score2  = -(4:6)
)

To convert cr_wide_raw into a widecr object, use as_widecr():

cr_wide <- cr_wide_raw %>% as_widecr()
cr_wide
#> # A widecr object:
#> # A tibble: 3 × 4
#>   player1 score1 player2 score2
#>     <dbl>  <int>   <dbl>  <int>
#> 1       1     -1       2     -4
#> 2       1     -2       3     -5
#> 3       2     -3       3     -6
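The optional game column and extra columns can be supplied alongside the player-score pairs. A minimal sketch with hypothetical data (the venue column is made up for illustration), checked with is_widecr():

```r
library(comperes)
library(tibble)

cr_wide_extra <- as_widecr(tibble(
  game    = c("a", "b"),        # optional game identifier
  player1 = c(1, 1),
  score1  = c(10, 7),
  player2 = c(2, 3),
  score2  = c(8, 9),
  venue   = c("home", "away")   # hypothetical extra column
))

# The object should qualify as wide format and keep all supplied columns
is_widecr(cr_wide_extra)
colnames(cr_wide_extra)
```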

By default, as_widecr() also repairs its input:

tibble(
  score = 2,
  PlayerRS = "a",
  scoreRS = 1,
  player = "b",
  player1 = "c",
  extra = -1,
  game = "game"
) %>%
  as_widecr()
#> as_widecr: Some matched names are not perfectly matched:
#>   player -> player1
#>   score -> score1
#>   player1 -> player2
#>   PlayerRS -> player3
#>   scoreRS -> score3
#> as_widecr: Next columns are not found. Creating with NAs.
#>   score2
#> # A widecr object:
#> # A tibble: 1 × 8
#>   game  player1 score1 player2 score2 player3 score3 extra
#>   <chr> <chr>    <dbl> <chr>    <int> <chr>    <dbl> <dbl>
#> 1 game  b            2 c           NA a            1    -1

Conversion

When applied to widecr and longcr objects respectively, as_longcr() and as_widecr() perform an actual conversion between formats:

as_longcr(cr_wide)
#> # A longcr object:
#> # A tibble: 6 × 3
#>    game player score
#>   <int>  <dbl> <int>
#> 1     1      1    -1
#> 2     1      2    -4
#> 3     2      1    -2
#> 4     2      3    -5
#> 5     3      2    -3
#> 6     3      3    -6

# The number of player-score pairs is set to the maximum number
# of players observed in a single game; missing pairs are filled with NA
as_widecr(cr_long)
#> # A widecr object:
#> # A tibble: 4 × 7
#>    game player1 score1 player2 score2 player3 score3
#>   <dbl>   <dbl>  <int>   <dbl>  <int>   <dbl>  <int>
#> 1     1       1      1      NA      2      NA      3
#> 2     2       1      4       2      5      NA     NA
#> 3     3       2      6       1      7      NA     NA
#> 4     4       2      8      NA     NA      NA     NA
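The two conversions can be chained into a round trip. The sketch below rebuilds the wide results from above, converts them to long format and back, and compares the player and score columns. Judging by the outputs shown above, the conversion to long format creates a game column, so the round trip is expected to add that column while leaving the player-score pairs unchanged:

```r
library(comperes)
library(tibble)

cr_wide <- as_widecr(tibble(
  player1 = c(1, 1, 2),
  score1  = -(1:3),
  player2 = c(2, 3, 3),
  score2  = -(4:6)
))

# wide -> long -> wide
cr_round <- as_widecr(as_longcr(cr_wide))

# Compare the result with the starting point
colnames(cr_round)
all.equal(cr_round$player1, cr_wide$player1)
all.equal(cr_round$score1, cr_wide$score1)
all.equal(cr_round$player2, cr_wide$player2)
all.equal(cr_round$score2, cr_wide$score2)
```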

Notes