Measuring and analyzing the most frequently used words in BoardGameGeek game descriptions.
Posted on 5/24/2017 by Tim Rice
Language is a fascinating aspect of the human experience. Each of the hundreds of thousands of words in the English language has its own meaning, history, and connotations, and combined they can represent everything humanity has ever known. Pretty amazing.
You can learn a lot about a culture by studying its language. As a member of the hobby gaming community, I thought it would be interesting to measure words that are common in the board gaming subculture and compare them to a general list in order to identify the most game-centric terms. This project is the result of that curiosity.
When it comes to board game data, there's no better source than BoardGameGeek. The site is a giant community-driven database home to a wealth of information about thousands of board games, which makes it the perfect place to collect game-centric words.
To create a list of the most common "board game words", I wrote a program that uses the BGG XML API to retrieve descriptions from a large sample of board games in the database. The program splits each description into individual words, removes punctuation, converts all letters to lowercase, and counts how many times each word occurs. The result is a list of the words used to describe board games, sorted from most frequent to least frequent.
Source code: bggwords.py
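For readers who just want the gist of the counting step, here is a minimal sketch. It is not the contents of bggwords.py: the function names `count_words` and `ranked_words` are made up for illustration, the download from the BGG XML API is left out entirely, and the exact tokenizing rule (strip ASCII punctuation, lowercase, split on whitespace) is an assumption about how the steps above might be done.

```python
import string
from collections import Counter

# Translation table that deletes ASCII punctuation (assumed tokenizing rule).
_STRIP_PUNCTUATION = str.maketrans("", "", string.punctuation)

def count_words(descriptions):
    """Tally word occurrences across an iterable of description strings."""
    counts = Counter()
    for text in descriptions:
        # Remove punctuation, lowercase, then split on whitespace.
        words = text.translate(_STRIP_PUNCTUATION).lower().split()
        counts.update(words)
    return counts

def ranked_words(counts):
    """Return words from most to least frequent (ties broken alphabetically)."""
    return [word for word, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]

# Two invented descriptions standing in for text fetched from BGG.
sample = [
    "Players roll the dice. The player with the most points wins!",
    "Each player draws cards; the cards score points.",
]
counts = count_words(sample)
ranking = ranked_words(counts)
```

With real data, `ranking` would hold the full BGG word list, and a word's BGG rank is simply its position in that list, counting from 1.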
The BGG column of the table below shows the 100 most frequent words from BoardGameGeek descriptions. Here is the list up to 1000: bgg-top-1000-words.txt. For some perspective, the General column shows the 100 most frequent English words overall. That list comes from Peter Norvig’s website; he gathered the data with an n-gram frequency analysis of Google’s Trillion Word Corpus. See the full list of ⅓ million words here.
Rank | BGG | General |
---|---|---|
1 | the | the |
2 | of | of |
3 | a | and |
4 | and | to |
5 | to | a |
6 | game | in |
7 | in | for |
8 | is | is |
9 | on | on |
10 | you | that |
11 | player | by |
12 | with | this |
13 | for | with |
14 | players | i |
15 | are | you |
16 | cards | it |
17 | or | not |
18 | each | or |
19 | that | be |
20 | from | are |
21 | as | from |
22 | one | at |
23 | your | as |
24 | by | your |
25 | it | all |
26 | card | have |
27 | be | new |
28 | can | more |
29 | their | an |
30 | this | was |
31 | all | we |
32 | board | will |
33 | an | home |
34 | play | can |
35 | at | us |
36 | have | about |
37 | if | if |
38 | they | page |
39 | which | my |
40 | has | has |
41 | will | search |
42 | two | free |
43 | first | but |
44 | other | our |
45 | but | one |
46 | up | other |
47 | points | do |
48 | turn | no |
49 | move | information |
50 | his | time |
51 | rules | they |
52 | there | site |
53 | when | he |
54 | dice | up |
55 | more | may |
56 | out | what |
57 | who | which |
58 | most | their |
59 | must | news |
60 | then | out |
61 | not | use |
62 | them | any |
63 | games | there |
64 | also | see |
65 | number | only |
66 | get | so |
67 | may | his |
68 | time | when |
69 | three | contact |
70 | different | here |
71 | pieces | business |
72 | into | who |
73 | played | web |
74 | wins | also |
75 | only | now |
76 | was | help |
77 | he | get |
78 | take | pm |
79 | new | view |
80 | four | online |
81 | set | c |
82 | any | e |
83 | playing | first |
84 | so | am |
85 | die | been |
86 | where | would |
87 | war | how |
88 | over | were |
89 | roll | me |
90 | same | s |
91 | no | services |
92 | use | some |
93 | some | these |
94 | end | click |
95 | battle | its |
96 | deck | like |
97 | win | service |
98 | color | x |
99 | make | than |
100 | based | find |
As you can see, most of the highest ranked words in both columns are prepositions, conjunctions, and determiners. This is expected; these words are needed to connect ideas logically, and that's just how English works. Interestingly, the BGG list also contains words (game, player, card, board, play, points, turn, move, rules, dice, etc.) which don’t appear in the general top 100 at all.
This table is interesting, but I would not call "the" the most game-centric word as a result. To develop that list, I gave each of the 1000 words collected from BGG a score. A word’s score is equal to its general rank (from Norvig’s list) minus its BGG rank (determined by my code). For example, "dice" has a general rank of 8892 and a BGG rank of 54, so its score is 8892 - 54 = 8838. A positive score therefore means the word ranks higher in BGG descriptions than in general written English, and higher scores represent larger differences. The 200 words with the highest scores are shown in the table below.
Rank | Word | BGG Rank | General Rank | Score |
---|---|---|---|---|
1 | gameboard | 684 | 136349 | 135665 |
2 | cannot | 493 | 106595 | 106102 |
3 | wargame | 485 | 69356 | 68871 |
4 | rulebook | 563 | 55369 | 54806 |
5 | boardgame | 979 | 54810 | 53831 |
6 | pawns | 324 | 52151 | 51827 |
7 | marbles | 907 | 25259 | 24352 |
8 | pawn | 442 | 23745 | 23303 |
9 | spinner | 398 | 21782 | 21384 |
10 | checkers | 987 | 22256 | 21269 |
11 | cubes | 919 | 18162 | 17243 |
12 | discard | 506 | 17665 | 17159 |
13 | secretly | 964 | 18092 | 17128 |
14 | cavalry | 821 | 17666 | 16845 |
15 | armies | 417 | 16537 | 16120 |
16 | tokens | 200 | 14979 | 14779 |
17 | artillery | 726 | 15082 | 14356 |
18 | cardboard | 727 | 14773 | 14046 |
19 | dungeon | 915 | 14799 | 13884 |
20 | paced | 814 | 14510 | 13696 |
21 | themed | 873 | 14280 | 13407 |
22 | clues | 751 | 13852 | 13101 |
23 | miniatures | 336 | 13410 | 13074 |
24 | chooses | 743 | 13220 | 12477 |
25 | opposing | 588 | 12575 | 11987 |
26 | solitaire | 560 | 12519 | 11959 |
27 | monopoly | 333 | 11934 | 11601 |
28 | randomly | 766 | 11860 | 11094 |
29 | gameplay | 737 | 11769 | 11032 |
30 | whoever | 658 | 11660 | 11002 |
31 | hex | 344 | 11232 | 10888 |
32 | loses | 888 | 11685 | 10797 |
33 | decks | 574 | 11367 | 10793 |
34 | squares | 272 | 10868 | 10596 |
35 | variant | 526 | 10860 | 10334 |
36 | pirate | 900 | 11071 | 10171 |
37 | infantry | 505 | 10664 | 10159 |
38 | clue | 961 | 11057 | 10096 |
39 | pile | 345 | 10302 | 9957 |
40 | tactical | 358 | 10297 | 9939 |
41 | battles | 243 | 10075 | 9832 |
42 | sided | 271 | 10097 | 9826 |
43 | numbered | 595 | 10357 | 9762 |
44 | tiles | 112 | 9738 | 9626 |
45 | booklet | 655 | 10183 | 9528 |
46 | monsters | 559 | 9988 | 9429 |
47 | allies | 790 | 10121 | 9331 |
48 | opponents | 185 | 9499 | 9314 |
49 | warfare | 770 | 10012 | 9242 |
50 | opponent | 118 | 9071 | 8953 |
51 | draws | 627 | 9521 | 8894 |
52 | rounds | 353 | 9206 | 8853 |
53 | dice | 54 | 8892 | 8838 |
54 | token | 486 | 9277 | 8791 |
55 | terrain | 599 | 9146 | 8547 |
56 | fought | 942 | 9434 | 8492 |
57 | scenarios | 284 | 8598 | 8314 |
58 | dealt | 579 | 8747 | 8168 |
59 | rolled | 586 | 8695 | 8109 |
60 | tactics | 647 | 8665 | 8018 |
61 | markers | 490 | 8453 | 7963 |
62 | creatures | 868 | 8808 | 7940 |
63 | steal | 773 | 8683 | 7910 |
64 | reaches | 783 | 8362 | 7579 |
65 | combinations | 928 | 8498 | 7570 |
66 | answering | 925 | 8476 | 7551 |
67 | miniature | 960 | 8410 | 7450 |
68 | defeat | 700 | 8135 | 7435 |
69 | marker | 827 | 7866 | 7039 |
70 | abilities | 545 | 7577 | 7032 |
71 | trivia | 315 | 7290 | 6975 |
72 | timer | 863 | 7651 | 6788 |
73 | invasion | 947 | 7722 | 6775 |
74 | tile | 213 | 6983 | 6770 |
75 | allied | 876 | 7637 | 6761 |
76 | determines | 999 | 7760 | 6761 |
77 | rows | 982 | 7725 | 6743 |
78 | simultaneously | 891 | 7598 | 6707 |
79 | chess | 446 | 7113 | 6667 |
80 | trick | 552 | 7167 | 6615 |
81 | treasure | 355 | 6968 | 6613 |
82 | naval | 661 | 7246 | 6585 |
83 | captured | 712 | 7253 | 6541 |
84 | rid | 878 | 7381 | 6503 |
85 | mechanics | 714 | 7215 | 6501 |
86 | tries | 663 | 7153 | 6490 |
87 | flip | 970 | 7356 | 6386 |
88 | suits | 957 | 7318 | 6361 |
89 | reveal | 872 | 7124 | 6252 |
90 | landing | 555 | 6770 | 6215 |
91 | rolls | 370 | 6566 | 6196 |
92 | placing | 483 | 6632 | 6149 |
93 | spell | 975 | 6986 | 6011 |
94 | tanks | 966 | 6973 | 6007 |
95 | compete | 577 | 6554 | 5977 |
96 | collecting | 659 | 6583 | 5924 |
97 | soviet | 735 | 6647 | 5912 |
98 | scoring | 298 | 6116 | 5818 |
99 | scored | 781 | 6533 | 5752 |
100 | challenging | 976 | 6722 | 5746 |
101 | counters | 156 | 5868 | 5712 |
102 | scenario | 491 | 6193 | 5702 |
103 | colored | 386 | 6087 | 5701 |
104 | adjacent | 641 | 6247 | 5606 |
105 | bull | 512 | 6069 | 5557 |
106 | pairs | 927 | 6457 | 5530 |
107 | destroy | 774 | 6266 | 5492 |
108 | offensive | 850 | 6274 | 5424 |
109 | heroes | 541 | 5885 | 5344 |
110 | fleet | 817 | 6013 | 5196 |
111 | tricks | 968 | 6163 | 5195 |
112 | stack | 517 | 5702 | 5185 |
113 | complexity | 896 | 5977 | 5081 |
114 | holes | 618 | 5554 | 4936 |
115 | representing | 500 | 5335 | 4835 |
116 | stones | 499 | 5206 | 4707 |
117 | receives | 759 | 5444 | 4685 |
118 | opposite | 786 | 5425 | 4639 |
119 | spin | 566 | 5184 | 4618 |
120 | wins | 74 | 4673 | 4599 |
121 | correctly | 439 | 4996 | 4557 |
122 | wooden | 388 | 4922 | 4534 |
123 | chips | 328 | 4815 | 4487 |
124 | enemy | 312 | 4784 | 4472 |
125 | combat | 124 | 4575 | 4451 |
126 | hero | 859 | 5294 | 4435 |
127 | divided | 587 | 4987 | 4400 |
128 | lands | 503 | 4843 | 4340 |
129 | spaces | 148 | 4427 | 4279 |
130 | symbols | 902 | 5163 | 4261 |
131 | deck | 96 | 4329 | 4233 |
132 | revealed | 832 | 5060 | 4228 |
133 | troops | 643 | 4845 | 4202 |
134 | collect | 170 | 4363 | 4193 |
135 | moves | 216 | 4398 | 4182 |
136 | balls | 715 | 4891 | 4176 |
137 | rolling | 375 | 4537 | 4162 |
138 | drawn | 488 | 4609 | 4121 |
139 | victory | 165 | 4277 | 4112 |
140 | puzzle | 756 | 4785 | 4029 |
141 | consists | 317 | 4306 | 3989 |
142 | turns | 130 | 4117 | 3987 |
143 | simulation | 502 | 4317 | 3815 |
144 | luck | 405 | 4206 | 3801 |
145 | quest | 941 | 4737 | 3796 |
146 | skill | 597 | 4391 | 3794 |
147 | adds | 870 | 4591 | 3721 |
148 | capture | 297 | 4006 | 3709 |
149 | throw | 739 | 4404 | 3665 |
150 | exciting | 516 | 4180 | 3664 |
151 | sides | 472 | 4115 | 3643 |
152 | escape | 801 | 4430 | 3629 |
153 | covering | 952 | 4454 | 3502 |
154 | represented | 717 | 4209 | 3492 |
155 | coins | 767 | 4251 | 3484 |
156 | monster | 581 | 4049 | 3468 |
157 | grid | 325 | 3766 | 3441 |
158 | soldiers | 874 | 4313 | 3439 |
159 | dragon | 709 | 4086 | 3377 |
160 | winner | 110 | 3462 | 3352 |
161 | represents | 402 | 3730 | 3328 |
162 | marked | 745 | 4026 | 3281 |
163 | corresponding | 848 | 4118 | 3270 |
164 | castle | 664 | 3905 | 3241 |
165 | draw | 162 | 3350 | 3188 |
166 | plays | 287 | 3470 | 3183 |
167 | suit | 553 | 3709 | 3156 |
168 | earn | 528 | 3644 | 3116 |
169 | depending | 448 | 3555 | 3107 |
170 | drawing | 575 | 3671 | 3096 |
171 | fighting | 603 | 3676 | 3073 |
172 | sheets | 719 | 3767 | 3048 |
173 | powers | 648 | 3686 | 3038 |
174 | empire | 532 | 3547 | 3015 |
175 | weapons | 461 | 3444 | 2983 |
176 | remaining | 569 | 3546 | 2977 |
177 | decide | 613 | 3568 | 2955 |
178 | starts | 335 | 3286 | 2951 |
179 | objective | 465 | 3381 | 2916 |
180 | pieces | 71 | 2983 | 2912 |
181 | attacks | 741 | 3651 | 2910 |
182 | tower | 692 | 3599 | 2907 |
183 | blocks | 796 | 3703 | 2907 |
184 | ages | 253 | 3135 | 2882 |
185 | scores | 460 | 3309 | 2849 |
186 | catch | 792 | 3635 | 2843 |
187 | chip | 885 | 3712 | 2827 |
188 | expansion | 758 | 3557 | 2799 |
189 | collected | 789 | 3576 | 2787 |
190 | lose | 407 | 3169 | 2762 |
191 | chosen | 778 | 3519 | 2741 |
192 | begins | 614 | 3343 | 2729 |
193 | attempt | 360 | 3056 | 2696 |
194 | don | 322 | 2982 | 2660 |
195 | ancient | 633 | 3283 | 2650 |
196 | challenges | 836 | 3419 | 2583 |
197 | represent | 399 | 2976 | 2577 |
198 | colors | 289 | 2836 | 2547 |
199 | continues | 681 | 3193 | 2512 |
200 | roll | 89 | 2595 | 2506 |
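
The scoring behind this table can be sketched in a few lines. Again, this is an illustration rather than my actual code: `score_words` is a hypothetical name, the ranks are passed in as plain dictionaries, and skipping any word that is missing from the general list is an assumption. The example ranks are copied from the tables above.

```python
def score_words(bgg_rank, general_rank):
    """Return (word, bgg_rank, general_rank, score) rows, highest score first.

    score = general rank - BGG rank, so a large positive score means the
    word ranks much higher in BGG descriptions than in general English.
    """
    rows = []
    for word, b_rank in bgg_rank.items():
        if word not in general_rank:
            continue  # assumption: words absent from the general list are skipped
        g_rank = general_rank[word]
        rows.append((word, b_rank, g_rank, g_rank - b_rank))
    # Sort by score, largest first.
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows

# Ranks taken from the tables in this post.
bgg_rank = {"the": 1, "dice": 54, "roll": 89, "deck": 96}
general_rank = {"the": 1, "dice": 8892, "roll": 2595, "deck": 4329}

scored = score_words(bgg_rank, general_rank)
for word, b_rank, g_rank, score in scored:
    print(f"{word} | {b_rank} | {g_rank} | {score}")
```

Note that "the" scores 0 here because it is ranked first in both lists, which is exactly why raw frequency alone does not reveal the game-centric words.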
Overall, I’m pretty happy with these results. Board gaming is an extremely open-ended hobby, so it’s interesting to see which words stand out and become universal "gaming terms", and to think about why.
Did you make any observations that I missed? Any surprises? I’d love to hear your thoughts.
Thanks for reading!