Issue #283, November 2017

Guess a number—Dave writes a simple guessing game as a demonstration of how to produce clear, readable shell scripts and solve mathematical equations.

There are some basic computer algorithms that suggest games, weird as that
may sound. One example leaps right to mind: search as a strategy for a
process of elimination for a guessing game. The game
*Mastermind* is based on
that, with its colored pegs and oft-confusing feedback mechanism for you to
(hopefully) zero in on the secret sequence.

Let's go simpler than that though. Let's implement a number-guessing game as a way to learn about binary search. The concept is easy. If you have to guess a number between 1 and “n”, each guess should focus on dividing the remaining pool of possible values in half.

Your first guess might be 50. If it's too low, you just chopped out 51 of the 100 possible values (1..49 and the value 50 itself, since it wasn't a match). If it's too high, same again, but now you know it must be in the sequence 1..49.

There's math behind this actually, and it revolves around logarithms. Yes, I can't believe we're talking about logs, but the formula to calculate the worst-case number of guesses is log2(LISTSIZE).

For example, with a list of 100 entries: log2(100) = 6.644. Since you can't have a fractional guess, of course, that means the computer never should guess more than seven times to identify any randomly chosen number 1..100. And, who knows, it might guess it a lot faster than that. Imagine if 50 was the randomly chosen value, for example. That should be guessed in, well, one guess.

To calculate this value in Linux, the `bc` binary calculator is the tool for
the job. Unfortunately, it doesn't know how to calculate base-2
logarithms, but math to the rescue! log2(N) is equal to
log(N) / log(2).
You knew that, right?

That's a formula you can feed to `bc`, although as one of the oldest Linux
programs, `bc` is famously un-user-friendly. In fact, here's how I use
this formula along with the usual `bc` rigmarole to get a solution for
“size”:

echo "scale=4;(l($LISTSIZE)/l(2))" | bc -l

`bc` is weird in that if you specify a scale of zero, it won't calculate
any values after the decimal point for any interim calculations either. The
result: log(2) = 0, and the equation fails with a divide-by-zero error.
D'oh.

To turn this formula into a usable calculation for a script, you need to
keep in mind that any value > 0.000, even 0.0001, must round upward for
maximum/worst-case guess. That's a ceiling function, as they say in
mathematics, but `bc` doesn't have that either. Instead,
here's a hack: simply
add 0.999 to the resultant value and chop off everything after the decimal
point.

It works. Try it.

Here's my resultant formula, all ready for a shell script:

steps="$(echo "scale=4;(l($LISTSIZE)/l(2)+0.99)" | bc -l \ | cut -d. -f1)"

That's actually probably the hardest part of this program. The other piece is to choose a value randomly between 1..n for some value of “n”, but that's a breeze:

value=$(( ( $RANDOM % $LISTSIZE ) + 1 ))

The main loop consists of prompting the user for a value, then indicating whether it's a match (well done!), too high or too low. Then looping and prompting them again. It's pretty simple, actually, and you hopefully should be able to code it all by yourself without ever reading further into this column.

Still here?

Okay, let's proceed.

You'll recall that `echo -n` omits the carriage return at the
end of a line, so the sequence of:

echo -n "Enter something: " read userinput

is quite a common one in interactive shell scripts. This will be no different, prompting like this:

echo -n "Your guess: " read playerguess

You'll notice that one of the other things I'm demonstrating in this particular shell script is the use of long, mnemonic variable names. Too many scripts have “i” and “j” and “k” as variables without ever explaining what they do or what they represent. That's just bad coding.

There's a fun, classic way to create an infinite loop in a shell script, and that's what I'll do with this number-guessing game too:

while [ /bin/true ] do statements done

Within the loop, end conditions must be checked, and when met, the exit command easily can be used to quit the script. Just want to quit the loop? That's what “break” can do for you in this sort of situation. Simply specify how many levels of loop you want to jump out of as part of the break statement if one isn't enough.

That's now enough that you can figure out what I'm doing, I expect:

pick a number while looping ask for guess if guess = number you got it. if guess < number too low, guess higher else guess > number too high, guess lower loop

What does that look like as a shell script? Hey, I thought you'd never ask:

while [ /bin/true ] ; do /bin/echo -n "Your guess: " read playerguess if [ $playerguess -eq $value ] ; then echo -n "Got it! Nice. That took you $guess guesses." steps="$(echo "scale=4;(l($max)/l(2)+0.99)" | bc -l \ | cut -d. -f1)" echo "I can solve it in less than $steps steps." exit 0 elif [ $playerguess -lt $value ] ; then echo "Nope. Too low." else echo "Nah. Too high." fi guess=$(( $guess + 1 )) # another guess... done

You can see that I've added a guess counter, ingeniously called
`guess`,
and that the player guess is, well, `playerguess`. This makes the code nice
and readable, although the mathematical equation tucked into the middle is a
bit gnarly looking by comparison. It definitely could do with a comment to
make it more clear.

The game gets a bit more interesting if the user can specify a list size
when invoking the program, which easily can be done by having
`LISTSIZE`
(okay, I call it “max”) as a variable instead of hard-coded:

max=100 # maxvalue if [ $# -gt 0 ] ; then max=$1 fi

I probably should check that the value is indeed an integer, but I'll leave that task as an exercise for you, the reader.

A few tweaks to the prompts and output make it a bit friendlier and more grammatically correct. Here's a test:

$ guess-number.sh 50 Guess my number between 1 and 50. Ready? Go! Guess #1 is: 25 Nah. Too high. Guess #2 is: 20 Nah. Too high. Guess #3 is: 10 Nah. Too high. Guess #4 is: 5 Nope. Too low. Guess #5 is: 6 Nope. Too low. Guess #6 is: 7 Got it! Nice. That took you 6 guesses. For the record, I would have solved it no more than 6 steps.

I deliberately did a pretty inefficient search. In fact, each time, you should divide the remaining values by two and make that your guess, so when you learned 25 was too high, the next guess should have been 12.5 (or 12) and so on. Still, this did no worse than the worst-case scenario of six guesses.

If you really want to do something interesting, note how by using this
strategy, you could guess from a list of `MAXINT` values (32,767) and still
have no more than 16 guesses, worst case, to figure out a randomly chosen
number in the range 1..MAXINT.

Ah, the power of math and the joy of binary search algorithms.

Copyright © 1994 - 2018 Linux Journal. All rights reserved.