LJ Archive

Work the Shell

Calculating Word Point Values

Dave Taylor

Issue #217, May 2012

Dave continues improving the Scrabble and Words With Friends script.

My last article ended by wrapping up the word finder utility for Scrabble and Words With Friends, a fun and complicated project that ended up requiring about 65 lines of Bourne Shell. That's not too long, but for a shell script, actually, it is rather long, and it was one of the scripts we've written that most begged to be coded in Perl or another more string-friendly language. Still, we persevered, right?

In this article, let's wrap things up by looking at word values based on the individual letter values in both Scrabble and Words With Friends.

To start, here's how the point chart looks for Words With Friends:

A=1   D=2   G=3   J=10   M=4   P=4   S=1   V=5   Y=3
B=4   E=1   H=3   K=5    N=2   Q=10  T=1   W=4   Z=10
C=4   F=4   I=1   L=2    O=1   R=1   U=2   X=8

This contrasts with the somewhat lower point value of letters in the game Scrabble:

 
A=1   D=2   G=2   J=8   M=3   P=3   S=1   V=4   Y=4
B=3   E=1   H=4   K=5   N=1   Q=10  T=1   W=4   Z=10
C=3   F=4   I=1   L=1   O=1   R=1   U=1   X=8

If you look closely, you'll see that just about every word is going to be worth less in Scrabble than in Words With Friends—interesting. I always assumed that the two used the same basic letter-for-letter point values, so that calculating point values in one game taught you how to calculate points for the other—not so!

Calculating Point Values

There are a lot of ways to translate a sequence of letters into their individual point values and calculate the sum of those values. Indeed, an array comes to mind immediately, but the problem is that you still have to write the code to step through the individual letters, and if you're doing that, why not use sed for the letter → numeric value substitution instead?

It turns out that's a fast and elegant solution, and I've coded it so that it's also easy to compare which letters have a given point value in wwf (aka Words With Friends) and s (aka Scrabble).

Here's how I coded the one-point letters in each:

wwf1="s/[asetior]/ 1 /g"
s1="s/[asentiloru]/ 1 /g"

As you can clearly see, Scrabble has a lot more one-point letters than Words With Friends does. Again, I had no idea until I started this analysis.

With sed expressions coded up this way, it's easy to turn a word like “cat” into “4 1 1”. Here's the full set of substitutions:

wwf1="s/[asetior]/ 1 /g"
wwf2="s/[dlnu]/ 2 /g"
wwf3="s/[ghy]/ 3 /g"
wwf4="s/[bcfmpw]/ 4 /g"
wwf5="s/[kv]/ 5 /g"
wwf8="s/[x]/ 8 /g"
wwf10="s/[jqz]/ 10 /g"

s1="s/[asentiloru]/ 1 /g"
s2="s/[dg]/ 2 /g"
s3="s/[mpbc]/ 3 /g"
s4="s/[vyhwf]/ 4 /g"
s5="s/[k]/ 5 /g"
s8="s/[jx]/ 8 /g"
s10="s/[qz]/ 10 /g"

Quite honestly, entering all that data is the hardest part of creating the word-point-value script. It's tedious work, so you'll be smart to copy and paste.
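To see what those substitutions actually produce, here's a minimal sketch that runs the Words With Friends set against a single word. The helper name letters_to_values is my own invention for illustration, and the final sed only swaps spaces for underscores so the padding is visible:

```shell
#!/bin/sh
# Words With Friends substitutions, copied from the listing above.
wwf1="s/[asetior]/ 1 /g"
wwf2="s/[dlnu]/ 2 /g"
wwf3="s/[ghy]/ 3 /g"
wwf4="s/[bcfmpw]/ 4 /g"
wwf5="s/[kv]/ 5 /g"
wwf8="s/[x]/ 8 /g"
wwf10="s/[jqz]/ 10 /g"

# letters_to_values (hypothetical name): replace each lowercase letter
# in $1 with its point value, padded by one space on each side.
letters_to_values() {
  printf '%s\n' "$1" | sed "$wwf1;$wwf2;$wwf3;$wwf4;$wwf5;$wwf8;$wwf10"
}

# Show the spaces as underscores so the single and double gaps stand out.
letters_to_values cat | sed 's/ /_/g'    # prints _4__1__1_
```

Note that the sketch assumes lowercase input, just as the character classes do.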

Now that you have a bunch of numeric values, however, what do you do with them? It turns out that's easy too:

sed 's/  / + /g'

Think about the original substitution, and you'll see what I'm doing: the first letter is replaced by <space> number <space>, as is every subsequent letter. Therefore, single spaces appear only at the very beginning and end of the word, while double spaces appear only between numbers. Replace those double spaces with a +, and that sequence of numbers translates into a simple mathematical formula:

4 + 1 + 1

To solve simple math, you can use the shell's $(( )) arithmetic notation, or you can call the external expr command, with the same basic result.
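As a quick illustration of those two options, here's a small sketch that evaluates the formula both ways. The variable name formula is just for this example:

```shell
#!/bin/sh
# The formula produced for "cat" in Words With Friends.
formula="4 + 1 + 1"

# Option 1: shell arithmetic expansion. $formula is expanded first,
# then the resulting expression is evaluated.
echo "$(( $formula ))"    # prints 6

# Option 2: expr. The variable is deliberately left unquoted so that
# expr receives 4, + and 1 as separate arguments.
expr $formula             # prints 6
```

Arithmetic expansion avoids starting another process, while expr is the more traditional Bourne shell approach.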

Before we get there, however, here's how I'll use that sequence of sed substitutions in the script:

wwfexpr=$(echo $1 | sed "$wwf1;$wwf2;$wwf3;$wwf4;$wwf5;$wwf8;$wwf10" |
  sed 's/  / + /g')

sexpr=$(echo $1 | sed "$s1;$s2;$s3;$s4;$s5;$s8;$s10" |
  sed 's/  / + /g')

You can see I've tucked the double-space-to-plus-sign substitution into the same subshell invocation too—short and neat. In fact, those two lines are the heart of the script. There's only one line left actually, and it both calculates the actual values and shows the result:

echo "\"$1\" has a base point value of $(expr $wwfexpr) in WwF and $(expr $sexpr) in Scrabble"

This is what I really like about programming with the power of the entire Linux shell at your fingertips: this script that calculates point values for a given word in both Scrabble and Words With Friends is actually only three lines long, if you don't count the variables we set up at the beginning.
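Put together, the pieces might look something like the following sketch. The word_value wrapper function is my own addition for illustration (the script described here works on $1 directly), and it assumes lowercase input:

```shell
#!/bin/sh
# Sketch of wordvalue.sh assembled from the fragments in this article.
# word_value is a hypothetical wrapper, not part of the original script.

# Words With Friends letter values
wwf1="s/[asetior]/ 1 /g"
wwf2="s/[dlnu]/ 2 /g"
wwf3="s/[ghy]/ 3 /g"
wwf4="s/[bcfmpw]/ 4 /g"
wwf5="s/[kv]/ 5 /g"
wwf8="s/[x]/ 8 /g"
wwf10="s/[jqz]/ 10 /g"

# Scrabble letter values (g sits in the 2-point set, matching G=2 in the chart)
s1="s/[asentiloru]/ 1 /g"
s2="s/[dg]/ 2 /g"
s3="s/[mpbc]/ 3 /g"
s4="s/[vyhwf]/ 4 /g"
s5="s/[k]/ 5 /g"
s8="s/[jx]/ 8 /g"
s10="s/[qz]/ 10 /g"

word_value() {
  # Letters become padded numbers, then double spaces become " + ".
  wwfexpr=$(echo "$1" | sed "$wwf1;$wwf2;$wwf3;$wwf4;$wwf5;$wwf8;$wwf10" |
    sed 's/  / + /g')
  sexpr=$(echo "$1" | sed "$s1;$s2;$s3;$s4;$s5;$s8;$s10" |
    sed 's/  / + /g')
  # Unquoted on purpose: expr needs each number and + as its own argument.
  echo "\"$1\" has a base point value of $(expr $wwfexpr) in WwF and $(expr $sexpr) in Scrabble"
}

word_value calculate
```

To use it as a standalone script, replace the last line with word_value "$1".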

Now, let's run it to see what kind of values we see:

$ sh wordvalue.sh calculate
"calculate" has a base point value of 18 in WwF and 13 in Scrabble

Let me explain the sh script.sh notation briefly, in case you haven't seen this approach before. A classic way that hackers Trojan Horse a Linux or UNIX system is to drop a shell script like vi or ls into somewhere like the /tmp directory. It's not a problem, unless your PATH looks like this:

.:/bin:/usr/bin:/usr/local/bin

in which case you can unwittingly run the invasive script and possibly create a setuid root copy of the shell for the bad guys to exploit at their later convenience. Security's a bit far afield for this particular column, but suffice it to say that for security reasons, I never have “.” in my PATH.

Therefore, I could use ./script to invoke the script in my current directory if it's marked as executable, but since I have so many scripts lying around, I find it even safer to not mark them as executable until I'm 100% sure they're done and tested. Instead, sh script works just as well, although it spawns a subshell for execution.
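If you're not sure whether your own PATH has this problem, a rough check along these lines can help. The function name path_has_cwd is hypothetical, and the sketch only looks for a literal "." entry or an empty entry (which the shell also treats as the current directory). It won't catch other relative entries:

```shell
#!/bin/sh
# path_has_cwd (hypothetical helper): succeed if the colon-separated
# list in $1 contains "." or an empty entry.
path_has_cwd() {
  case ":$1:" in
    *:.:*|*::*) return 0 ;;   # "." entry, or an empty entry
    *)          return 1 ;;
  esac
}

if path_has_cwd "$PATH"; then
  echo "PATH includes the current directory"
else
  echo "PATH does not include the current directory"
fi
```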

Now you know. And, here are more examples:

$ sh wordvalue.sh word
"word" has a base point value of 8 in WwF and 8 in Scrabble

$ sh wordvalue.sh linux
"linux" has a base point value of 15 in WwF and 12 in Scrabble

$ sh wordvalue.sh journal
"journal" has a base point value of 19 in WwF and 14 in Scrabble

At this point, I'll leave it as an exercise for you, the reader, to figure out how to graft this functionality onto the script we wrote in the previous few articles that calculated possible words from a set of letters.
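As a nudge in that direction, here's one possible sketch (certainly not the only approach) that scores a list of candidate words and ranks them by Words With Friends value. Both wwf_score and rank_words are hypothetical helper names, and the sketch assumes one lowercase word per line on standard input:

```shell
#!/bin/sh
# Words With Friends substitutions from the article.
wwf1="s/[asetior]/ 1 /g"
wwf2="s/[dlnu]/ 2 /g"
wwf3="s/[ghy]/ 3 /g"
wwf4="s/[bcfmpw]/ 4 /g"
wwf5="s/[kv]/ 5 /g"
wwf8="s/[x]/ 8 /g"
wwf10="s/[jqz]/ 10 /g"

# wwf_score (hypothetical): print the base WwF value of one word.
wwf_score() {
  expr $(printf '%s\n' "$1" |
    sed "$wwf1;$wwf2;$wwf3;$wwf4;$wwf5;$wwf8;$wwf10" |
    sed 's/  / + /g')
}

# rank_words (hypothetical): read one word per line and print
# "value word" pairs, highest value first.
rank_words() {
  while read -r word; do
    [ -n "$word" ] || continue          # skip blank lines
    printf '%s %s\n' "$(wwf_score "$word")" "$word"
  done | sort -rn
}

printf '%s\n' word linux journal | rank_words
```

Piping the word finder's output into something like rank_words would put the most valuable plays at the top of the list.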

The additional bonus task is to be able to analyze the board so you can figure out how to cover DL, TL, DW and TW squares, as available (that stands for double letter, triple letter, double word and triple word, in case you're not a hard-core word-gamer). Beware though, it's considerably more difficult, because now you have to figure out how to enter the current state of the board. Definitely extra credit!

Dave Taylor has been hacking shell scripts for more than 30 years. Really. He's the author of the popular Wicked Cool Shell Scripts and can be found on Twitter as @DaveTaylor and more generally at www.DaveTaylorOnline.com.
