Google invents a new programming language

Ready, Set, Go!


Google is good for more than just search results. The search engine giant is now in the blocks and set to launch a new programming language. Will this advance be successful?

By Marcus Nutzinger and Rainer Poisel

Sergey Kandakov, 123RF

At a 2007 meeting of Google developers, former Bell Labs veterans Rob Pike and Ken Thompson wondered whether the art of the programmer had gotten to be too much about waiting. According to Pike, "Things were taking too long. Too long to code; too long to compile ..." [1].

One problem, according to Pike, was that programming languages haven't changed much during the past several years, but the requirements and expectations have continued to evolve.

Contemporary programs must accommodate client-server networking, massive compute clusters, and multicore processors, and at the same time, developers must pay additional attention to security and stability. Also, the systems for testing and controlling data types continue to become increasingly complicated.

The Google developers wanted a language that was as efficient as statically typed languages like C and as easy to use as dynamic languages like Python. They also wanted good support for concurrency, and they wanted to include a garbage collection system (as in Java or C#) for automatically cleaning up memory.

After several months of planning and several more months of coding, the Google team unveiled a new "expressive, concurrent, and garbage-collected" programming language called Go [2]. The tools you need for getting started with the Go language are available right now through the project website.

If you feel like testing a new programming language, you can try your luck with this experimental language designed for the next generation of programming.

Environment Design

The source code for the Go programming tools is available from a repository on the project homepage [3]. After compiling the various Go components (see the "Installation" box), the next task is to build the libraries. This step should give you an indication of how well Google's new language actually performs because the libraries are written in Go. You should notice that an entire library builds in just a couple of seconds.

Currently, two compilers are available: the gc compiler and gccgo. The gccgo tool uses the GNU C compiler as its back end, but it is not as highly developed as the gc tools, whose naming scheme derives from Plan 9 [5]. The number designates the platform: 5 stands for ARM, 6 for 64-bit x86 systems, and 8 for 32-bit x86 systems. The letter designates the tool itself (see Table 1); for example, 6g is the compiler and 6l the linker for 64-bit x86 systems.

Figure 1 shows the build and linking process, and Listing 1 contains the obligatory "Hello World" program (but in German here). Line 3 contains an import, which you will be familiar with from Python and Java. Go requires that programs have a main() function as the entry point, just like C or C++. The Println() function in the fmt library finally outputs the text. The developers deliberately used a naming scheme with an uppercase first letter for functions. One important feature of Go is that variables and functions are automatically visible outside their package if their names begin with a capital letter.

Figure 1: The commands for compiling and linking a Go program are similar to the initially unfamiliar Plan 9 notation.
Listing 1: hello.go
01 package main
02
03 import "fmt"
04
05 func main () {
06   fmt.Println("Hallo, Welt!")
07 }
Installation

If a firewall blocks your Internet access during the installation, you will need a workaround. Disable the http and net subsystem tests in the Makefile so that the results of their execution are not preconditions for the success of the overall installation [4]. To do so, add http and net entries to the value of the NOTEST variable in the $GOROOT/src/pkg/Makefile file.

Easy Entry

In Go, the semicolon doesn't terminate an instruction; instead, it is a separator, as in a list. If the program is only one instruction, a semicolon is not needed. Go inherits from various languages (e.g., C, C++, Python), but the syntax differs at various points. Google offers both a general introduction to the language syntax [6] and an overview of more advanced topics [7].

In this article, we describe a small programming project that demonstrates Go's client-server capabilities and lays the foundation for a minimal chat program. The server process listens for TCP connections on a specific port; when the connection is established, clients send messages to the server using the defined formats and then terminate.

Server Tasks

The server, as implemented in Listing 2, starts by importing packages in line 3, which it later uses for library calls. Line 20 shows that the developers have inverted the usual order of type and name in variable declarations: the name comes first, followed by the type. According to the project documentation, this improves clarity of syntax and saves typing. Go also automatically identifies types if declaration and initialization occur in one step, making additional type specifications redundant. The code starts by defining two integer constants and a global variable of type *int. The listenPort variable is thus a pointer, a concept that Go takes from C/C++.
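The declaration forms described here can be tried in isolation. The following is a minimal sketch written against a current Go release rather than the toolchain used for Listing 2; the typeName() and parsePort() helpers are hypothetical names introduced only for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

const defPort = 7777

// typeName reports the type Go has assigned to a value.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

// parsePort (hypothetical helper) parses a -p option from args.
// As with flag.Int in Listing 2, fs.Int returns a pointer (*int).
func parsePort(args []string) (int, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	port := fs.Int("p", defPort, "port to listen for connections")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *port, nil // dereference the pointer to obtain the value
}

func main() {
	var a int = 1 // name first, then type
	var b = 2.5   // type inferred from the initializer
	c := "text"   // short declaration with initialization

	fmt.Println(typeName(a), typeName(b), typeName(c))

	port, err := parsePort([]string{"-p", "8080"})
	fmt.Println(port, err)
}
```

Using a separate FlagSet instead of the global flag functions is a choice made here only so that the option parsing can be exercised without real command-line arguments.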

Listing 2: server.go
001 package main
002
003 import (
004   "bytes";
005   "encoding/binary";
006   "flag";
007   "fmt";
008   "net";
009   "os";
010   "os/signal";
011   "syscall";
012   "./build/msg/msg"
013 )
014
015 const (
016   defPort = 7777;
017   bufSize = 1024
018 )
019
020 var listenPort *int = flag.Int("p", defPort, "port to listen for connections")
021
022 // wait for incoming tcp connections
023 func acceptor(listener *net.TCPListener, quit chan bool) {
024   var buf [bufSize]byte
025
026   for {
027     conn, e := listener.AcceptTCP()
028     if e != nil {
029       fmt.Fprintf(os.Stderr, "Error: %v\n", e)
030       continue
031     }
032
033     num, e := conn.Read(&buf)
034     if num < 0 {
035       fmt.Fprintf(os.Stderr, "Error: %v\n", e)
036       conn.Close()
037       continue
038     }
039
040     go handleClient(conn, buf[0:num])
041   }
042 }
043
044 // handle a single client connection
045 func handleClient(conn *net.TCPConn, bytebuf []byte) {
046   message := new(msg.Message)
047   buf := bytes.NewBuffer(bytebuf)
048
049   binary.Read(buf, binary.LittleEndian, &message.SenderLen)
050   s := make([]byte, message.SenderLen)
051   buf.Read(s)
052   message.SetSender(string(s))
053   binary.Read(buf, binary.LittleEndian, &message.DataLen)
054   d := make([]byte, message.DataLen)
055   buf.Read(d)
056   message.SetData(string(d))
057
058   fmt.Printf("%s connected\n > %s\n\n", message.GetSender(),
059      message.GetData())
060
061   conn.Close()
062 }
063
064 // read from signal.Incoming channel and
065 // return if SIGINT is received
066 func signalHandler(quit chan bool) {
067   for {
068     select {
069       case sig := <-signal.Incoming:
070       fmt.Printf("Received signal %d\n", sig)
071       if sig.(signal.UnixSignal) != syscall.SIGINT {
072         continue
073       }
074       quit<- true
075       return
076     }
077   }
078 }
079
080 func main() {
081   flag.Parse()
082   address := fmt.Sprintf("%s:%d", "127.0.0.1", *listenPort)
083   quit := make(chan bool)
084
085   socket, e := net.ResolveTCPAddr(address)
086   if e != nil {
087     fmt.Fprintf(os.Stderr, "Error: %v\n", e)
088     os.Exit(1)
089   }
090   listener, e := net.ListenTCP("tcp4", socket)
091   if e != nil {
092     fmt.Fprintf(os.Stderr, "Error: %v\n", e)
093     os.Exit(1)
094   }
095
096   go signalHandler(quit)
097
098   fmt.Printf("Listening on %s:%d\n\n", socket.IP.String(), socket.Port)
099   go acceptor(listener, quit)
100
101   for {
102     select {
103       case <-quit:
104       fmt.Printf("Shutting down\n")
105       listener.Close()
106       return
107     }
108   }
109 }

Useful Modules

The main() function in line 80 parses the command-line parameters. To do this, it relies on the functions provided by the flag package.

One of Go's special features is the system of channels used for communication between Goroutines, the counterpart to threads (see Figure 2). Line 83 creates a new channel, which the individual Goroutines can use to exchange bool-type data. The box titled "Make vs. New" describes the built-in make function, which is used to create new instances, and how it differs from new.

Figure 2: Goroutines communicate via a channel. At time t1, A is reading from the channel and thus blocks until time t2, when B writes data to the channel.
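The blocking behavior shown in Figure 2 can be reproduced with a few lines. This is a minimal sketch for a current Go release; worker() and waitForWorker() are hypothetical names that stand for Goroutines B and A in the figure.

```go
package main

import "fmt"

// worker plays the role of Goroutine B: it writes one value to the channel.
func worker(ch chan<- bool) {
	ch <- true
}

// waitForWorker plays the role of Goroutine A: it starts B and then
// blocks on the read until B has written to the channel.
func waitForWorker() bool {
	ch := make(chan bool) // unbuffered channel for bool values
	go worker(ch)
	return <-ch
}

func main() {
	fmt.Println("received:", waitForWorker())
}
```

Because the channel is unbuffered, the read and the write must meet: whichever side arrives first waits for the other.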

The := syntax, which Go borrows from Pascal, offers a shorter variant of variable declaration with simultaneous initialization. The program uses this idiom in lines 85 and 90 to create a socket and a TCP handler with the help of the net library.

Functions can return multiple values in Go; this explains the list of variables that you see on the left side of the assignment in line 85.
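A short sketch illustrates multiple return values together with the := syntax. The divmod() function is a hypothetical example, not part of the chat program, and the code targets a current Go release.

```go
package main

import (
	"errors"
	"fmt"
)

// divmod (hypothetical example) returns the quotient, the remainder,
// and an error value in a single call.
func divmod(a, b int) (int, int, error) {
	if b == 0 {
		return 0, 0, errors.New("division by zero")
	}
	return a / b, a % b, nil
}

func main() {
	// := declares and initializes all three variables at once
	q, r, err := divmod(7, 2)
	if err != nil {
		fmt.Println("Error:", err)
		return
	}
	fmt.Println(q, r)

	// values that are not needed can be discarded with _
	_, _, err = divmod(1, 0)
	fmt.Println(err)
}
```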

The go keyword starts a Goroutine. The main program creates a parallel thread in line 96 with the signalHandler() function while continuing to run normally itself. The program thus runs two Goroutines.

One of them, starting in line 66, is responsible for signal handling; the other starts in line 23 and listens on the TCP port. An infinite loop at the end of main() uses select to wait for an incoming message on the quit channel.

Make vs. New

Go includes two built-in functions for a similar purpose: programmers can use both make and new to allocate memory. The new function, whose name is familiar as a keyword from object-oriented languages like C++ and Java, allocates memory for a new object and returns a pointer to the created instance of the requested type. The argument passed to new defines the type for which Go is to allocate memory space.

In contrast, developers use make to create slices, maps, or channels. It doesn't support other types. The return value here is not a pointer to a new instance of the data type passed in, but the value itself. Additionally, the resulting object is initialized internally and thus is available for use immediately. If new is used for these three types, Go returns a pointer to a nil value because the underlying data structures are not initialized [7].

The make function also supports additional arguments. For slices, these are the length and the capacity. The former defines the current length of the slice when it is created; the latter is the length of the underlying array (i.e., the length to which the slice can grow).

As an example, line 50 in Listing 2 creates a byte slice that is based on an array of the length message.SenderLen.
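The difference is easiest to see side by side. The following sketch, written for a current Go release, uses two hypothetical helpers, newSlice() and makeSlice(): the first hands back a pointer to a slice that has not been initialized, and the second a slice that is ready for use.

```go
package main

import "fmt"

// newSlice uses new: the result is a pointer to a slice value
// that is still nil.
func newSlice() *[]byte {
	return new([]byte)
}

// makeSlice uses make: the result is the slice itself,
// initialized with length 3 and capacity 10.
func makeSlice() []byte {
	return make([]byte, 3, 10)
}

func main() {
	p := newSlice()
	fmt.Println(p != nil, *p == nil, len(*p))

	s := makeSlice()
	fmt.Println(len(s), cap(s))

	s = append(s, 'x') // grows within the existing capacity
	fmt.Println(len(s), cap(s))

	m := make(map[string]int) // maps must be made before writing
	m["go"] = 1
	fmt.Println(m["go"])
}
```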

Even though the differences make sense, the question is why the Google developers decided to implement two functions for such similar tasks. Newcomers in particular will find it hard to appreciate the difference.

Wait and Listen

If the loop receives a message, the program terminates. Developers can use the select instruction to wait on multiple channels. If messages arrive on several channels at the same time, the Go run-time environment randomly selects one of them for processing.
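A select that waits on two channels might look as follows. This is a sketch for a current Go release; collect() and the msgs channel are hypothetical additions, whereas the quit channel mirrors the one in Listing 2.

```go
package main

import "fmt"

// collect (hypothetical helper) waits on two channels at once:
// it gathers messages until something arrives on quit.
func collect(msgs <-chan string, quit <-chan bool) []string {
	var got []string
	for {
		select {
		case m := <-msgs:
			got = append(got, m)
		case <-quit:
			return got
		}
	}
}

func main() {
	msgs := make(chan string)
	quit := make(chan bool)

	// the sender runs as a separate Goroutine
	go func() {
		msgs <- "hello"
		msgs <- "world"
		quit <- true
	}()

	fmt.Println(collect(msgs, quit))
}
```

The order is predictable here only because a single sender writes to the unbuffered channels one after the other; with several senders, select could pick either ready case.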

Go uses the func keyword to introduce functions. The acceptor() function, which starts in line 23, waits for TCP connections in an infinite loop and reads a maximum of 1,024 bytes from each connection. The function then passes the bytes it actually read to the handleClient() Goroutine.

The buf[0:num] argument in line 40 is another special feature. This syntax creates a slice; slices play an important role when working with arrays in the Go language. Slices are similar to the arrays used in other programming languages, such as C and C++; in other words, they point to a memory area, but they also include some meta-information, such as the length.

To create a slice, either an array has to exist already (e.g., buf, which is created by line 24) or Go automatically generates it, as in line 50, when a new slice is created using make.
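The relationship between a slice and the memory behind it can be checked with a small experiment. The sketch below assumes a current Go release; firstN() is a hypothetical helper, and the array size of 8 is arbitrary.

```go
package main

import "fmt"

// firstN (hypothetical helper) returns a slice that covers the
// first n bytes of buf. No data is copied.
func firstN(buf *[8]byte, n int) []byte {
	return buf[0:n]
}

func main() {
	var buf [8]byte

	s := firstN(&buf, 3)
	s[0] = 42 // writes through to the array behind the slice

	// length of the slice, capacity up to the end of the array,
	// and the modified array element
	fmt.Println(len(s), cap(s), buf[0]) // prints "3 8 42"
}
```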

The handleClient() function, which is called for each client connection, is then handed a byte slice - that is, the bytes that it read. Lines 49 through 56 detour via the bytes and binary packages to pour the bytestream into a structure defined for the message transmission, which is specified in the code in Listing 3, msg.go.
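To experiment with the wire format without a network connection, the decoding steps can be paired with a matching encoder. The sketch below targets a current Go release and assumes the layout that Listing 2 reads: a little-endian 32-bit length followed by the bytes, once for the sender and once for the data. The Message type here is a simplified stand-in for the one in msg.go, and encode(), decode(), and readField() are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// Message is a simplified stand-in for the structure in msg.go.
type Message struct {
	Sender, Data string
}

// encode writes each field as a little-endian uint32 length
// followed by the field's bytes.
func encode(m Message) []byte {
	var buf bytes.Buffer
	for _, field := range []string{m.Sender, m.Data} {
		// writing a fixed-size value to a bytes.Buffer does not fail
		binary.Write(&buf, binary.LittleEndian, uint32(len(field)))
		buf.WriteString(field)
	}
	return buf.Bytes()
}

// readField reads one length-prefixed field from r.
func readField(r *bytes.Reader) (string, error) {
	var n uint32
	if err := binary.Read(r, binary.LittleEndian, &n); err != nil {
		return "", err
	}
	// refuse lengths that exceed the data actually received
	if int64(n) > int64(r.Len()) {
		return "", errors.New("length prefix exceeds remaining data")
	}
	b := make([]byte, n)
	if _, err := io.ReadFull(r, b); err != nil {
		return "", err
	}
	return string(b), nil
}

// decode reverses encode.
func decode(raw []byte) (Message, error) {
	r := bytes.NewReader(raw)
	sender, err := readField(r)
	if err != nil {
		return Message{}, err
	}
	data, err := readField(r)
	if err != nil {
		return Message{}, err
	}
	return Message{Sender: sender, Data: data}, nil
}

func main() {
	raw := encode(Message{Sender: "alice", Data: "Hallo, Welt!"})
	m, err := decode(raw)
	fmt.Println(len(raw), m, err)
}
```

Unlike Listing 2, this variant checks every error and rejects length values larger than the remaining input, which a server exposed to untrusted clients would probably want to do as well.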

Listing 3: msg.go
01 package msg
02
03 type Message struct {
04   SenderLen uint32;
05   sender []byte;
06   DataLen uint32;
07   data []byte
08 }
09
10 func (m *Message) GetSender() string {
11   return string(m.sender)
12 }
13
14 func (m *Message) SetSender(s string) {
15   m.sender = stringToBytes(s)
16   m.SenderLen = uint32(len(s))
17 }
18
19 func (m *Message) GetData() string {
20   return string(m.data)
21 }
22
23 func (m *Message) SetData(s string) {
24   m.data = stringToBytes(s)
25   m.DataLen = uint32(len(s))
26 }
27
28 // helper function to convert a given
29 // string to a byte slice
30 func stringToBytes(s string) []byte {
31   slice := make([]byte, len(s))
32
33   for i := 0; i < len(s); i++ {
34     slice[i] = s[i]
35   }
36   return slice
37 }

Message Format

Line 1 of the msg.go file declares a package called msg, and line 3 defines the Message structure with four elements. The SenderLen and DataLen elements start with uppercase letters, which demonstrates how Go makes elements public. Programmers can use access methods to manipulate the other two attributes - that is, sender and data.

These methods are introduced by the func keyword (like functions); however, they expect a target, known as the receiver, after the keyword (m in this case) for which they can be called. The class model thus differs from the models in C++ and Java, which list methods inside type definitions.
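The following sketch shows methods declared outside the type definition, once with a value and once with a pointer as the target. The Point type and its methods are hypothetical examples for a current Go release and are not part of the chat program.

```go
package main

import "fmt"

// Point is a hypothetical example type; the methods are declared
// outside the type definition.
type Point struct {
	X, Y int
}

// Sum has a value as its target: it works on a copy of the Point.
func (p Point) Sum() int {
	return p.X + p.Y
}

// Scale has a pointer as its target: it modifies the original Point.
func (p *Point) Scale(k int) {
	p.X *= k
	p.Y *= k
}

func main() {
	p := Point{1, 2}
	p.Scale(3) // Go takes the address of p automatically
	fmt.Println(p.X, p.Y, p.Sum()) // prints "3 6 9"
}
```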

Basically, any type defined in the same package can be a target; in other words, types based on primitive data types and non-pointer types are allowed. This means the String() method can be defined for arbitrary data types. Developers can thus modify the output from Println() to suit their own purposes. The method

func (m *Message) String() string {
   return fmt.Sprintf(
      "Sender=%s, Data=%s",
      m.sender, m.data)
}

defines custom output for the Message structure, which is used when the function fmt.Println(m) is called.
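A complete program makes it easy to verify this behavior. The sketch assumes a current Go release and uses a cut-down Message type with only the sender and data elements; the render() helper is a hypothetical addition that returns the formatted text instead of printing it.

```go
package main

import "fmt"

// Message is a cut-down version of the structure in msg.go.
type Message struct {
	sender, data []byte
}

// String defines the custom output for *Message.
func (m *Message) String() string {
	return fmt.Sprintf("Sender=%s, Data=%s", m.sender, m.data)
}

// render (hypothetical helper) returns the text that the print
// functions of the fmt package produce for m.
func render(m *Message) string {
	return fmt.Sprint(m)
}

func main() {
	m := &Message{sender: []byte("alice"), data: []byte("hi")}
	fmt.Println(m) // prints "Sender=alice, Data=hi"
}
```

Note that the method is defined for the pointer type, so the custom output applies when a *Message is passed to the print functions.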

Vital Signs

The client program acts in a way similar to the server. To download the program, as well as the other examples, go to the Linux Magazine website for archived code [8].

Once the client has initialized a socket, it opens a TCP connection to the server. It creates a client-side Message structure with information from the command-line parameters and transmits it byte-wise via the socket. After doing so, the client terminates.

If you enjoy experimenting, you could develop a full-fledged chat tool that manages a list of the connected clients. A good method is to use a map, which is the Go equivalent of a hash in Perl. The server then sends incoming messages to all the clients listed in the map.
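As a starting point for such an experiment, the bookkeeping could look like the following sketch. It assumes a current Go release; the Hub type, its methods, and the queue size of 8 are hypothetical design choices, and channels stand in for the TCP connections of real clients.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub (hypothetical) keeps the connected clients in a map that
// assigns each client name a channel for outgoing messages.
type Hub struct {
	mu      sync.Mutex
	clients map[string]chan string
}

// NewHub creates an empty hub; the map has to be made before use.
func NewHub() *Hub {
	return &Hub{clients: make(map[string]chan string)}
}

// Join registers a client and returns its message queue.
func (h *Hub) Join(name string) <-chan string {
	h.mu.Lock()
	defer h.mu.Unlock()
	ch := make(chan string, 8) // small buffer per client
	h.clients[name] = ch
	return ch
}

// Leave removes a client from the map.
func (h *Hub) Leave(name string) {
	h.mu.Lock()
	defer h.mu.Unlock()
	delete(h.clients, name)
}

// Broadcast queues text for every client in the map and returns
// the number of clients that accepted it.
func (h *Hub) Broadcast(text string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	sent := 0
	for _, ch := range h.clients {
		select {
		case ch <- text:
			sent++
		default: // queue full: skip this client instead of blocking
		}
	}
	return sent
}

func main() {
	hub := NewHub()
	alice := hub.Join("alice")
	bob := hub.Join("bob")

	fmt.Println(hub.Broadcast("Hallo, Welt!"))
	fmt.Println(<-alice, <-bob)
}
```

In a real server, one Goroutine per client would presumably read from the client's channel and write each message to that client's connection; the mutex protects the map because several Goroutines would access it.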

The examples show some interesting constructs in Go. The programming language also comes with a comprehensive library of packages for various purposes. Although you can't expect the scope of the Java Enterprise Edition (JEE) just yet, for its age, Go's package list is pretty impressive, especially considering that Google will make sure it grows. For an overview, take a look in the src/pkg directory, which is below the Go installation directory.

Other Interfaces

Interfaces, a language feature that Java programmers will be familiar with, need separate investigation. Although the interfaces concept has significance in Go, the implementation is different. Go lacks keywords like Java's implements. The code supports an interface once it implements all the methods defined by the interface.

One well-known example of this is the Reader interface in the io [9] package. It defines a public Read() method, which accepts a byte slice and returns two values. Each type that has a Read() method with precisely this signature automatically implements the Reader interface. This means that developers can pass their own types to functions that expect an io.Reader, which saves additional typing and facilitates the flexible exchange of implementations.
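The following sketch shows a self-defined type that satisfies io.Reader without declaring it anywhere. It assumes a current Go release (io.ReadAll requires Go 1.16 or later) and plain ASCII input; upperReader and shout() are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// upperReader (hypothetical) wraps another reader and converts
// everything it reads to uppercase. Nothing declares that it
// implements io.Reader; the matching Read() method is sufficient.
type upperReader struct {
	src io.Reader
}

// Read has exactly the signature that io.Reader requires.
func (u upperReader) Read(p []byte) (int, error) {
	n, err := u.src.Read(p)
	copy(p[:n], bytes.ToUpper(p[:n])) // assumes ASCII text
	return n, err
}

// shout passes an upperReader to io.ReadAll, a library function
// that expects an io.Reader.
func shout(s string) (string, error) {
	b, err := io.ReadAll(upperReader{strings.NewReader(s)})
	return string(b), err
}

func main() {
	out, err := shout("hallo, welt!")
	fmt.Println(out, err)
}
```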

The chat example uses this interface technique. Because the String() method shown previously is defined for the Message type, the type implements the Stringer interface [10]. From now on, output functions will use this self-defined method to format these objects.

GCC Go!

The compiler, with the GCC back end, provides an alternative to the native Go compiler, gc. The Go project provides installation notes [11]. It doesn't pose any impossible tasks for experienced programmers, but it does download more than 65,000 files from the source repository, which take up no less than 1.3GB of disk space.

After you look at the size of the code you have generated, you might discover it is worth going to the trouble of installing the alternative compiler. Whereas binaries that are generated with the native compiler will always be several hundreds of kilobytes, the size of executables created with gccgo and dynamically linked libraries more likely will be the size of normal C programs. On the downside, the programs depend on the Go run-time library and other tools (see Figure 3).

Figure 3: A shock for developers: Go offers you either a small binary of 33KB with an enormous number of run-time libraries or a statically linked program that weighs in at 3MB.

Lift Off!

The execution speed of the binaries is vastly different, but it also depends heavily on the optimizations and settings you use. To provide objective and comparable speed data, the test program in Listing 4 avoids incalculable waits caused by input and output operations.

This simple benchmark program fills an array with a reproducible sequence of 32-bit pseudo-random numbers and then sorts the values in ascending order with a bubble sort algorithm [12]. This example avoids language-specific features as far as possible.

The C example implements the arrays with pointers (see Listing 5), whereas the Go example uses slices. In both cases, the testers created static binaries that do not contain debugging symbols. The source code for the programs referred to by this article is available on the Linux Magazine website under Article Code [8]. Figure 4 shows that pure C is still far quicker.

Figure 4: Fast, but with room for improvement: The bubble sort implementation in Go still takes more than twice as long to execute as a comparable C program for the same task.
Listing 4: perf.go
package main

// parameters of the pseudo-random sequence
const (
  SIZE = 200000
  MULT = 1103515245
  MASK = 4096
  INC  = 12345
)

// bubbleSort sorts the slice in ascending order, in place
func bubbleSort(numbers []uint32) {
  for i := len(numbers) - 1; i >= 0; i-- {
    for j := 1; j <= i; j++ {
      if numbers[j-1] > numbers[j] {
        numbers[j-1], numbers[j] =
          numbers[j], numbers[j-1]
      }
    }
  }
}

func main() {
  var arrayOfInt [SIZE]uint32
  var lNext uint32 = 1

  // fill the array with a reproducible pseudo-random sequence
  for i := 0; i < len(arrayOfInt); i++ {
    lNext = lNext*MULT + INC
    arrayOfInt[i] = (lNext / MASK) % SIZE
  }

  // pass a slice that covers the whole array
  bubbleSort(arrayOfInt[:])
}
Listing 5: perf.c
#include <stdint.h>
#include <stdlib.h>

#define SIZE 200000
#define SEED 1
#define MULT 1103515245L
#define MASK 4096
#define INCR 12345

/* sort the array in ascending order, in place */
void bubbleSort(uint32_t numbers[], int array_size)
{
  int i, j;
  uint32_t temp;

  for (i = (array_size - 1); i >= 0; i--)
  {
    for (j = 1; j <= i; j++)
    {
      if (numbers[j-1] > numbers[j])
      {
        temp = numbers[j - 1];
        numbers[j - 1] = numbers[j];
        numbers[j] = temp;
      }
    }
  }
}

int main(void)
{
  uint32_t *lArray = NULL;
  uint32_t lCnt = 0;
  uint32_t lNext = SEED;

  lArray = (uint32_t*)malloc(sizeof(uint32_t) * SIZE);
  if (lArray == NULL)
    return 1;

  /* fill the array with a reproducible pseudo-random sequence */
  for (lCnt = 0; lCnt < SIZE; lCnt++)
  {
    lNext = lNext * MULT + INCR;
    lArray[lCnt] = (uint32_t)(lNext / MASK) % SIZE;
  }
  bubbleSort(lArray, SIZE);
  free(lArray);
  return 0;
}

Moving Targets

In the future, the Go utilities will include a debugger. Google is also looking to facilitate cooperation between Go and C. The language designers are still philosophizing about whether to adopt concepts from other languages, such as exceptions and generics. They are considering writing future versions of the Go compiler itself in Go, because both a lexer and parser are now available as Go libraries. The project roadmap shows more details [13].

Many of Go's concepts sound promising, but the language still needs to prove its value in major projects. The simple, but powerful, syntax, which is somewhere between Java and C, will interest both camps. The pragmatic library design also promises fast results. Some details of the Go language seem fairly academic, but the language itself is undeniably interesting. The build and execution environment and core parameters, such as execution speed, still need some polishing, but the Go team is already at work addressing those issues. The only remaining question is whether mainstream developers will adopt Go.

INFO
[1] Xkcd "Compiling": http://xkcd.com/303/
[2] The Go Programming Language Google Tech Talk: http://www.youtube.com/watch?v=rKnDgT73v8s
[3] Installation how-to by the project: http://golang.org/doc/install.html
[4] HTTP client-server failures: http://code.google.com/p/go/issues/detail?id=5
[5] Plan 9 from Bell Labs: http://plan9.bell-labs.com/plan9/
[6] Go tutorial: http://golang.org/doc/go_tutorial.html
[7] Effective Go: http://golang.org/doc/effective_go.html
[8] Source code for this article: http://www.linux-magazine.com/Resources/Article-Code
[9] Reader interface io.go: http://golang.org/src/pkg/io/io.go
[10] Stringer interface print.go: http://golang.org/src/pkg/fmt/print.go
[11] Setting up and using gccgo: http://golang.org/doc/gccgo_install.html
[12] Bubble sort: http://www.sorting-algorithms.com/bubble-sort
[13] Go roadmap: http://go.googlecode.com/hg/doc/devel/roadmap.html
AUTHORS

Marcus Nutzinger and Rainer Poisel are on the scientific staff of the Institute for IT Security Research at St. Pölten University of Applied Science in Austria. Within the scope of the "StegIT-2" project, they are researching methods of preventing the embedding of secret messages in voice over IP calls. Both authors lecture on the subject of IT security.