{"id":5265,"date":"2024-05-13T02:15:00","date_gmt":"2024-05-13T06:15:00","guid":{"rendered":"https:\/\/www.both.org\/?p=5265"},"modified":"2024-05-09T15:10:42","modified_gmt":"2024-05-09T19:10:42","slug":"using-grep-to-play-a-word-game","status":"publish","type":"post","link":"https:\/\/www.both.org\/?p=5265","title":{"rendered":"Using \u2018grep\u2019 to play a word game"},"content":{"rendered":"<p>Sometimes I need to take a break from what I\u2019m doing and let my mind relax. And a fun way to do that is to play a simple puzzle game. You might be familiar with <a href=\"https:\/\/www.nytimes.com\/games\/wordle\/index.html\">Wordle<\/a>, the word puzzle game where you make successive attempts to guess a secret five-letter word that changes every day. For each guess, the game tells you which letters are correct and in the correct location (green), which letters are correct but in the wrong position (yellow), and which letters don\u2019t actually appear in the secret word (gray).<\/p>\n\n\n\n<p>I find that this can be a relaxing game to play when I need a quick break. And when I play the game, I like to use the <code>grep<\/code> command to exercise <em>regular expressions<\/em>. Using <code>grep<\/code> isn\u2019t really cheating; it\u2019s just a way to help narrow down my options.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"start-with-a-list-of-words\">Start with a list of words<\/h2>\n\n\n\n<p>To get started, you\u2019ll need to have a list of five-letter words. 
Linux provides a general word list in the <code>\/usr\/share\/dict\/words<\/code> file, but this file contains all kinds of words, including names and other proper nouns (like <strong>Linus<\/strong>), some number-based words (such as <strong>12-point<\/strong> and <strong>1st<\/strong>), and acronyms (like <strong>SPARC<\/strong>). Wordle doesn\u2019t allow these kinds of words; it only uses all-lowercase words. To get a list of all-lowercase five-letter words, we can use the character pattern <code>[a-z]<\/code>, which matches a single lowercase letter from <code>a<\/code> to <code>z<\/code>. If we use this five times, and combine it with <code>^<\/code> to match the start of a line, and <code>$<\/code> for the end of a line, we\u2019ll have a list of words that are all-lowercase and exactly five letters long:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep '^&#91;a-z]&#91;a-z]&#91;a-z]&#91;a-z]&#91;a-z]$' \/usr\/share\/dict\/words &gt; wordlist<\/code><\/pre>\n\n\n\n<p>This looks for words in <code>\/usr\/share\/dict\/words<\/code> that are composed of exactly five lowercase letters, and saves the output in a new file called <code>wordlist<\/code> in the current directory. On my system, that list is over 15,000 words long!<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ wc -l wordlist\n15034 wordlist<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"narrow-down-the-options\">Narrow down the options<\/h2>\n\n\n\n<p>Start the game by guessing a word that has five letters. To help narrow down the options, I like to pick a word that has five unique letters, rather than a word with repeated letters, like <strong>boots<\/strong>. 
Some of the most commonly used letters in English include E, S, T, and R, so I\u2019ll start by guessing the word <strong>stare<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"350\" height=\"420\" src=\"https:\/\/www.both.org\/wp-content\/uploads\/2024\/05\/guess1.png\" alt=\"\" class=\"wp-image-5266\"\/><\/figure>\n\n\n\n<p>Let\u2019s use <code>grep<\/code> to help narrow down my possible next guesses. The gray and yellow letter tiles tell me that today\u2019s secret word doesn\u2019t contain the letters S, T, or A. The secret word <em>does<\/em> contain R and E, but not where I guessed them: R is not the fourth letter, and E is not the fifth.<\/p>\n\n\n\n<p>First, let\u2019s narrow down the options to eliminate words that <em>do<\/em> contain S, T, or A. The <code>-v<\/code> option for <code>grep<\/code> is very handy here to \u201cinvert\u201d a search. For example, if we \u201cinvert\u201d the search for any words with S, T, or A, <code>grep<\/code> will return only the words that <em>do not<\/em> contain those letters. This already reduces our options from about 15,000 possible words for the first guess to about 3,600 possible words for our second guess:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -v '&#91;sta]' wordlist &gt; guess2a\n$ wc -l guess2a \n3640 guess2a<\/code><\/pre>\n\n\n\n<p>But this list also includes words like <strong>chide<\/strong>, which has the letter E in the last position, or the word <strong>berry<\/strong> which has an R in the next-to-last position. Wordle colored those letter tiles yellow after our first guess, to indicate that the secret word had both R and E in it, but not in those positions. So to narrow down our possible list of guesses, we need to keep only words that contain both E and R, and eliminate any words with an E as the last letter or an R as the second-to-last letter. 
This brings the list down to about 550 possible words:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep e guess2a | grep -v 'e$' | grep r | grep -v 'r.$' &gt; guess2b\n$ wc -l guess2b\n553 guess2b<\/code><\/pre>\n\n\n\n<p>The period in <code>r.$<\/code> matches any single character. Anchored by <code>$<\/code>, this regular expression means \u201cthe letter R as the next-to-last letter,\u201d which in our list of five-letter words is the fourth letter.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"make-another-guess\">Make another guess<\/h2>\n\n\n\n<p>As I look through my list of words to make my next guess, I want to pick an \u201ceveryday\u201d word that has five unique letters. For example, the word <strong>creek<\/strong> is good, but it has two E\u2019s. Instead, I decided to guess the word <strong>biker<\/strong>.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"350\" height=\"420\" src=\"https:\/\/www.both.org\/wp-content\/uploads\/2024\/05\/guess2.png\" alt=\"\" class=\"wp-image-5267\"\/><\/figure>\n\n\n\n<p>Guessing a word that has five unique letters gives me more information about what letters might appear in the word. For example, the gray and yellow tiles tell me that the secret word does not contain the letters B or I. It does have a K in it, but not as the middle letter.<\/p>\n\n\n\n<p>We can use <code>grep<\/code> again to further narrow down the options. As before, the first step is to eliminate any words that have B or I. This narrows the list to just over 300 possible words:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -v '&#91;bi]' guess2b &gt; guess3a\n$ wc -l guess3a\n325 guess3a<\/code><\/pre>\n\n\n\n<p>Then, filter the list to only find words with K, E, and R, but not in the positions where I guessed them: K as the third letter, E as the fourth, or R as the fifth. 
Since we already filtered the word list to contain only words with R and E, we don\u2019t need to search for those letters again, but we do need to <code>grep<\/code> for words with K:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep k guess3a | grep -v '^..k' | grep -v 'e.$' | grep -v 'r$' &gt; guess3b\n$ wc -l guess3b\n9 guess3b<\/code><\/pre>\n\n\n\n<p>This brings the list down to only nine possible words:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ cat guess3b\ndreck\nfreck\njerky\nkerch\nkreng\nperky\nreeky\nrenky\nwreck<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"guess-the-word\">Guess the word<\/h2>\n\n\n\n<p>From here, guessing the secret word within six total attempts should be pretty easy. Wordle tends to use \u201ceveryday\u201d words, so we can pick a word like <strong>wreck<\/strong> for the next guess.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"350\" height=\"420\" src=\"https:\/\/www.both.org\/wp-content\/uploads\/2024\/05\/guess3.png\" alt=\"\" class=\"wp-image-5268\"\/><\/figure>\n\n\n\n<p>This is getting close! 
We now know the word doesn\u2019t contain W or C, and that R, E, and K are still in the wrong positions (second, third, and fifth):<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ grep -v w guess3b | grep -v c | grep -v '^.r' | grep -v '^..e' | grep -v 'k$' &gt; guess4a\n$ wc -l guess4a\n3 guess4a<\/code><\/pre>\n\n\n\n<p>This narrows down the list of possible words to just three:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$ cat guess4a\njerky\nperky\nrenky<\/code><\/pre>\n\n\n\n<p>I\u2019ll guess the word <strong>jerky<\/strong>, which happens to be correct!<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"350\" height=\"420\" src=\"https:\/\/www.both.org\/wp-content\/uploads\/2024\/05\/guess4.png\" alt=\"\" class=\"wp-image-5269\"\/><\/figure>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"regular-expressions-for-the-win\">Regular expressions for the win<\/h2>\n\n\n\n<p>The <code>grep<\/code> command is a powerful tool that lets you find lines of text that match regular expressions. This example shows how to use <code>grep<\/code> to help narrow down the options in a word puzzle game, but you can use <code>grep<\/code> in the same way to match other things. For example, system administrators might use <code>grep<\/code> to find errors in a log file, such as the <code>\/var\/log\/messages<\/code> file, but only for a particular day. With <code>grep<\/code>, you can match text at the beginning of a line, the end of a line, or anywhere in between &#8211; or find lines that <em>do not<\/em> contain the text pattern. 
Add <code>grep<\/code> to your systems administrator \u201ctoolkit\u201d to make your work easier.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>You can use regular expressions to match letters and patterns, to help you play a letter game.<\/p>\n","protected":false},"author":33,"featured_media":4314,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_lmt_disableupdate":"","_lmt_disable":"","footnotes":""},"categories":[100,69,5],"tags":[104,388,91,383],"class_list":["post-5265","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-command-line","category-fun","category-linux","tag-command-line","tag-grep","tag-linux","tag-regular-expressions"],"modified_by":"David Both","_links":{"self":[{"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/posts\/5265","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/users\/33"}],"replies":[{"embeddable":true,"href":"https:\/\/www.both.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=5265"}],"version-history":[{"count":1,"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/posts\/5265\/revisions"}],"predecessor-version":[{"id":5270,"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/posts\/5265\/revisions\/5270"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.both.org\/index.php?rest_route=\/wp\/v2\/media\/4314"}],"wp:attachment":[{"href":"https:\/\/www.both.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=5265"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.both.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=5265"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.both.org\/index.p
hp?rest_route=%2Fwp%2Fv2%2Ftags&post=5265"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}