Sunday, 6 March 2016

Understanding GOPATH

While writing the Bing Web API in Go, I learned a lot about how to use the GOPATH environment variable properly, and about gathering all my projects under the same project path. Doing this makes all Go development easier. It took me a long time to accept that this was the best way to do it, but now I'm reorganising all my code to fit this pattern.

For those who want a better understanding of GOPATH, have a look at the documentation for the GOPATH environment variable and for the go/build package. It basically says:

GOPATH=/home/user/gocode

/home/user/gocode/
    src/
        foo/
            bar/               (go code in package bar)
                x.go
            quux/              (go code in package main)
                y.go
    bin/
        quux                   (installed command)
    pkg/
        linux_amd64/
            foo/
                bar.a          (installed package object)

In my case "foo/" is "github.com/borglefink/" and "bar/" is "bingapi/". All new projects hosted on my github.com account are added as siblings to "bar/" (like the suggested "quux/" above). Each of those "siblings" is connected to its own git repository. Projects that are not open-sourced I can keep at the "foo/" or the "bar/" level, whichever works for me.
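
To make the mapping from import path to directory concrete, here is a minimal sketch. It is not how the go tool itself is implemented; the helper name srcDir is made up for illustration, and the paths are the example ones from the listing above. Only build.Default.GOPATH and the path/filepath functions come from the standard library.

```go
package main

import (
	"fmt"
	"go/build"
	"path/filepath"
)

// srcDir is a hypothetical helper: it returns the directory where a
// GOPATH-style workspace expects the source of importPath to live,
// i.e. <gopath>/src/<importPath>.
func srcDir(gopath, importPath string) string {
	return filepath.Join(gopath, "src", filepath.FromSlash(importPath))
}

func main() {
	// The example workspace from the listing above.
	fmt.Println(srcDir("/home/user/gocode", "foo/bar"))
	fmt.Println(srcDir("/home/user/gocode", "github.com/borglefink/bingapi"))

	// GOPATH may hold several workspaces separated by the OS list
	// separator; print the source directory under each of them.
	for _, root := range filepath.SplitList(build.Default.GOPATH) {
		fmt.Println(srcDir(root, "github.com/borglefink/bingapi"))
	}
}
```

On a Unix-like system the first two lines printed are /home/user/gocode/src/foo/bar and /home/user/gocode/src/github.com/borglefink/bingapi; the remaining lines depend on the GOPATH of the machine it runs on.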

If you don't quite "get" the structure, I really recommend looking at the documentation mentioned above.

Bing Search API in Go

In my thesis work I'm currently using Node.js (with Express etc.) as a client for doing web search with the Bing Search API under the hood. This works fine; it is snappy and a delight to work with.

However, under the hood there is more than meets the eye. The Bing Search API has a "page size" limit of 50 and an overall search limit of 1000 (searching through the API gives at most 1000 results). This means that getting all 1000 results takes 20 searches of 50 results each.

This is not entirely straightforward due to Node.js' asynchronous nature. Some plumbing is needed to be sure that all 20 result pages have arrived before presenting them to the user. There are ways around this (like showing the first 50, and putting the rest in storage when they arrive), but I wanted to get all results at the same time. So some synchronisation was needed.

As part of testing the API, a set of offline (batch) routines was devised, and I decided to use Go for this. Using a language designed for concurrency must be smart, yes? Only there wasn't a full Bing API for Go.

So I wrote one. Have a look at the source code on GitHub, and the documentation on GoDoc.
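
To illustrate the "wait for all 20 pages" problem described above, here is a minimal sketch of how it could be handled in Go with a sync.WaitGroup. It is not the code from my package: fetchPage is a hypothetical stand-in that fabricates placeholder strings instead of calling Bing, and the names fetchAll, pageSize and maxResults are made up for the example. Only the numbers 50 and 1000 come from the limits mentioned above.

```go
package main

import (
	"fmt"
	"sync"
)

const (
	pageSize   = 50   // per-request limit mentioned above
	maxResults = 1000 // overall limit mentioned above
)

// fetchPage is a HYPOTHETICAL stand-in for one search request. It makes
// no network call and just returns placeholder results.
func fetchPage(offset, count int) []string {
	page := make([]string, count)
	for i := range page {
		page[i] = fmt.Sprintf("result %d", offset+i+1)
	}
	return page
}

// fetchAll requests every page concurrently and returns all results, in
// their original order, once the last page has arrived.
func fetchAll(fetch func(offset, count int) []string) []string {
	numPages := maxResults / pageSize // 1000 / 50 = 20
	pages := make([][]string, numPages)

	var wg sync.WaitGroup
	for p := 0; p < numPages; p++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no mutex is needed.
			pages[p] = fetch(p*pageSize, pageSize)
		}(p)
	}
	wg.Wait() // block until all 20 pages are in

	all := make([]string, 0, maxResults)
	for _, page := range pages {
		all = append(all, page...)
	}
	return all
}

func main() {
	results := fetchAll(fetchPage)
	fmt.Println(len(results), results[0], results[len(results)-1])
	// prints: 1000 result 1 result 1000
}
```

Giving each goroutine its own slot in the pages slice keeps the results in order without any locking. A real client would also have to deal with errors and rate limits, which this sketch leaves out.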