# Calculating Proportion with N

One of the more typical workflows I need to perform is calculating counts within groups. The dplyr `group_by() %>% summarise()` pipeline (or just `count()`) can easily get aggregated tallies. But occassionally it’s helpful to get count within group along with the proportion overall.

Here’s one way to do that:

``````library(dplyr)

# use the starwars dataset
# one row per character
# can be grouped by character species
starwars``````
``````## # A tibble: 87 x 13
##    name  height  mass hair_color skin_color eye_color birth_year gender
##    <chr>  <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr>
##  1 Luke…    172    77 blond      fair       blue            19   male
##  2 C-3PO    167    75 <NA>       gold       yellow         112   <NA>
##  3 R2-D2     96    32 <NA>       white, bl… red             33   <NA>
##  4 Dart…    202   136 none       white      yellow          41.9 male
##  5 Leia…    150    49 brown      light      brown           19   female
##  6 Owen…    178   120 brown, gr… light      blue            52   male
##  7 Beru…    165    75 brown      light      blue            47   female
##  8 R5-D4     97    32 <NA>       white, red red             NA   <NA>
##  9 Bigg…    183    84 black      light      brown           24   male
## 10 Obi-…    182    77 auburn, w… fair       blue-gray       57   male
## # … with 77 more rows, and 5 more variables: homeworld <chr>,
## #   species <chr>, films <list>, vehicles <list>, starships <list>``````
``````starwars %>%
# get the count in category with count()
count(species, name = "species_n", sort = TRUE) %>%
# add a column that gives you total of all ...
# calculate proportion ...
# format n (%)
# keep in mind you won't be able to sort after formatting as character
mutate(n = sum(species_n),
prop = round(species_n/n * 100, digits = 0),
prop_with_n = paste0(species_n, " (", prop, "%)")) %>%
# return only columns of interest
select(species, prop_with_n)``````
``````## # A tibble: 38 x 2
##    species  prop_with_n
##    <chr>    <chr>
##  1 Human    35 (40%)
##  2 Droid    5 (6%)
##  3 <NA>     5 (6%)
##  4 Gungan   3 (3%)
##  5 Kaminoan 2 (2%)
##  6 Mirialan 2 (2%)
##  7 Twi'lek  2 (2%)
##  8 Wookiee  2 (2%)
##  9 Zabrak   2 (2%)
## 10 Aleena   1 (1%)
## # … with 28 more rows``````