Work the Shell

A Number-Guessing Game

Dave Taylor

Issue #283, November 2017

Guess a number—Dave writes a simple guessing game as a demonstration of how to produce clear, readable shell scripts and solve mathematical equations.

There are some basic computer algorithms that suggest games, weird as that may sound. One example leaps right to mind: search as a strategy for a process of elimination for a guessing game. The game Mastermind is based on that, with its colored pegs and oft-confusing feedback mechanism for you to (hopefully) zero in on the secret sequence.

Let's go simpler than that though. Let's implement a number-guessing game as a way to learn about binary search. The concept is easy. If you have to guess a number between 1 and “n”, each guess should focus on dividing the remaining pool of possible values in half.

Your first guess might be 50. If it's too low, you just chopped out 51 of the 100 possible values (1..49 and the value 50 itself, since it wasn't a match). If it's too high, same again, but now you know it must be in the sequence 1..49.

There's math behind this actually, and it revolves around logarithms. Yes, I can't believe we're talking about logs, but the formula to calculate the worst-case number of guesses is log2(LISTSIZE).

For example, with a list of 100 entries: log2(100) = 6.644. Since you can't have a fractional guess, of course, that means the computer never should guess more than seven times to identify any randomly chosen number 1..100. And, who knows, it might guess it a lot faster than that. Imagine if 50 was the randomly chosen value, for example. That should be guessed in, well, one guess.

To calculate this value in Linux, the bc binary calculator is the tool for the job. Unfortunately, it doesn't know how to calculate base-2 logarithms, but math to the rescue! log2(N) is equal to log(N) / log(2). You knew that, right?

That's a formula you can feed to bc, although as one of the oldest Linux programs, bc is famously un-user-friendly. In fact, here's how I use this formula along with the usual bc rigmarole to get a solution for “size”:

echo "scale=4;(l($LISTSIZE)/l(2))" | bc -l

bc is weird in that if you specify a scale of zero, it won't calculate any values after the decimal point for any interim calculations either. The result: log(2) = 0, and the equation fails with a divide-by-zero error. D'oh.

To turn this formula into a usable calculation for a script, you need to keep in mind that any value > 0.000, even 0.0001, must round upward for maximum/worst-case guess. That's a ceiling function, as they say in mathematics, but bc doesn't have that either. Instead, here's a hack: simply add 0.999 to the resultant value and chop off everything after the decimal point.

It works. Try it.

Here's my resultant formula, all ready for a shell script:

steps="$(echo "scale=4;(l($LISTSIZE)/l(2)+0.99)" | bc -l \
       | cut -d. -f1)"

That's actually probably the hardest part of this program. The other piece is to choose a value randomly between 1..n for some value of “n”, but that's a breeze:

value=$(( ( $RANDOM % $LISTSIZE ) + 1 ))

The main loop consists of prompting the user for a value, then indicating whether it's a match (well done!), too high or too low. Then looping and prompting them again. It's pretty simple, actually, and you hopefully should be able to code it all by yourself without ever reading further into this column.

Still here?

Okay, let's proceed.

You'll recall that echo -n omits the carriage return at the end of a line, so the sequence of:

echo -n "Enter something: "
read userinput

is quite a common one in interactive shell scripts. This will be no different, prompting like this:

echo -n "Your guess: "
read playerguess

You'll notice that one of the other things I'm demonstrating in this particular shell script is the use of long, mnemonic variable names. Too many scripts have “i” and “j” and “k” as variables without ever explaining what they do or what they represent. That's just bad coding.

There's a fun, classic way to create an infinite loop in a shell script, and that's what I'll do with this number-guessing game too:

while [ /bin/true ]
do
   statements 
done

Within the loop, end conditions must be checked, and when met, the exit command easily can be used to quit the script. Just want to quit the loop? That's what “break” can do for you in this sort of situation. Simply specify how many levels of loop you want to jump out of as part of the break statement if one isn't enough.

That's now enough that you can figure out what I'm doing, I expect:

pick a number
while looping
   ask for guess
   if guess = number 
      you got it. 
   if guess < number
      too low, guess higher
   else guess > number
      too high, guess lower
loop

What does that look like as a shell script? Hey, I thought you'd never ask:

while [ /bin/true ] ; do
  /bin/echo -n "Your guess: "
  read playerguess
  if [ $playerguess -eq $value ] ; then
    echo -n "Got it! Nice. That took you $guess guesses."
    steps="$(echo "scale=4;(l($max)/l(2)+0.99)" | bc -l \
           | cut -d. -f1)"
    echo "I can solve it in less than $steps steps."
    exit 0
  elif [ $playerguess -lt $value ] ; then
    echo "Nope. Too low."
  else
    echo "Nah. Too high."
  fi
  guess=$(( $guess + 1 ))          # another guess...
done

You can see that I've added a guess counter, ingeniously called guess, and that the player guess is, well, playerguess. This makes the code nice and readable, although the mathematical equation tucked into the middle is a bit gnarly looking by comparison. It definitely could do with a comment to make it more clear.

The game gets a bit more interesting if the user can specify a list size when invoking the program, which easily can be done by having LISTSIZE (okay, I call it “max”) as a variable instead of hard-coded:

max=100         # maxvalue

if [ $# -gt 0 ] ; then
  max=$1
fi

I probably should check that the value is indeed an integer, but I'll leave that task as an exercise for you, the reader.

A few tweaks to the prompts and output make it a bit friendlier and more grammatically correct. Here's a test:

$ guess-number.sh 50

Guess my number between 1 and 50. Ready? Go!
Guess #1 is: 25
Nah. Too high.
Guess #2 is: 20
Nah. Too high.
Guess #3 is: 10
Nah. Too high.
Guess #4 is: 5
Nope. Too low.
Guess #5 is: 6
Nope. Too low.
Guess #6 is: 7
Got it! Nice. That took you 6 guesses.
For the record, I would have solved it no more than 6 steps.

I deliberately did a pretty inefficient search. In fact, each time, you should divide the remaining values by two and make that your guess, so when you learned 25 was too high, the next guess should have been 12.5 (or 12) and so on. Still, this did no worse than the worst-case scenario of six guesses.

If you really want to do something interesting, note how by using this strategy, you could guess from a list of MAXINT values (32,767) and still have no more than 16 guesses, worst case, to figure out a randomly chosen number in the range 1..MAXINT.

Ah, the power of math and the joy of binary search algorithms.