URL Shortener Golang web service with MongoDB

Web technologies are at the heart of the software industry today. One of the most popular ways for web services to communicate with the outside world is the HTTP RESTful API design, which is integral to making modern software scalable and open. At the same time, NoSQL databases are gaining more and more market share as the data storage engines of choice for a lot of software professionals. Since Go is a very modern language, combining the two technologies in Go is a smooth experience. This article serves as a practical tutorial on how to build a Golang web service combined with the popular MongoDB NoSQL document store database. The web service does URL shortening, meaning you can use it to replace a long, ugly URL with a shorter URL of your choosing (think http://tinyurl.com).

All the code can be found at GitHub. Here are the different sections for the tutorial:

Go and HTTP REST

Go ships with a toolbox that is sufficient to build a powerful web service with a RESTful API. However, I found it more fun, as well as more practical, to work with the Gorilla web toolkit, which provides a thin layer on top of Go’s native web libraries. Gorilla makes it more convenient and straightforward to write web services in Go. This tutorial uses the Gorilla Mux library to build our Golang web service. If you are interested in how Go’s native web libraries work, check out this article. There are of course other options for writing web software in Go, including Martini and Revel.

Go and MongoDB

MongoDB is a popular, scalable NoSQL document store. What this means is that Mongo doesn’t rely on a bunch of tables with a bunch of relations between them like your typical SQL database. Instead it relies on “Collections” and “Documents”. A Document is what the name implies: a document that contains some data, equivalent to a row in a SQL database. A Collection is a bunch of similar documents stacked together, much like a SQL table. Multiple collections then fall under a “database”, which serves the same purpose as a database in a SQL environment.

NoSQL is gaining more and more popularity nowadays due to its practicality. If you want to store a lot of data that doesn’t necessarily need complex relations or table-like data structures, a NoSQL database will be a worthy companion.

More information on Collections and Documents is here. A good read to get the gist of Mongo is here. When writing code in Go, we use the powerful mgo package to interface with MongoDB and unleash its full potential.

The link shortener Golang web service design

Now it’s time to build the API. Like any nice piece of software, let’s start with the design phase.

First I envisioned the API mechanics as follows:

  1. The Golang web service starts by listening to http requests
  2. The requester sends an HTTP POST request to the URL <http path>/Create with a JSON body that looks like this:
    {
        "shorturl": "myshorturl",
        "longurl": "http://path/to/a/long/url!"
    }
    
  3. The web service behind the API stores a mapping between the shorturl and the longurl in a MongoDB document
  4. A requester sends an HTTP GET request with a short url that looks like this: http://path/shorturl
  5. If the short url exists in the database, the web service redirects the request to the long url that corresponds to the short url provided
  6. If the short url does not exist, return an error message
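
To make step 2 concrete, here is a minimal client-side sketch that builds (but does not send) the POST request. The helper names newCreateRequest and describe, the createPayload type, and the base URL http://localhost:5100 are assumptions made for illustration; only the /Create path and the two JSON field names come from the design above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// createPayload mirrors the JSON body described in step 2.
type createPayload struct {
	ShortURL string `json:"shorturl"`
	LongURL  string `json:"longurl"`
}

// newCreateRequest builds the POST request for step 2 without sending it.
// baseURL is a placeholder for wherever the service happens to be listening.
func newCreateRequest(baseURL, short, long string) (*http.Request, error) {
	body, err := json.Marshal(createPayload{ShortURL: short, LongURL: long})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/Create", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// describe renders a request as "METHOD URL BODY" so it can be inspected.
func describe(req *http.Request) string {
	body, _ := io.ReadAll(req.Body)
	return req.Method + " " + req.URL.String() + " " + string(body)
}

func main() {
	req, err := newCreateRequest("http://localhost:5100", "go", "https://golang.org/doc/")
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	fmt.Println(describe(req))
}
```

Passing the request to http.DefaultClient.Do would actually send it to a running service.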

Sounds good? Good. Now let’s think about which components should go into building the web service. Based on the mechanics of the API explained above, we need:

  1. A web server that awaits and serves HTTP requests (the REST API server layer)
  2. A data layer to handle the interaction with the database

Figure: Golang web service design

The link shortener Golang web service REST layer

In order to write a proper web server in Go via the Gorilla toolkit, we need to:

  1. Create routes that describe the URLs we support in our web server
  2. Create handler functions that describe the action the API needs to take for each URL
  3. Create a router to handle incoming traffic, the router will need to be initialized with the routes and the handlers
  4. Start listening to incoming traffic and serve the incoming requests via the newly created router

Let’s look at each of these points in more detail.

Handler functions

In the Go world, http requests are typically handled by functions of type http.HandlerFunc. This simply means that if your function has the following signature, you will be able to use it to define the logic you want to trigger when http requests to your web server occur:

func myHandlerFunction(w http.ResponseWriter, r *http.Request) {
    //read from r and write to w!!    
}

Here is an example of the handler function we would want to trigger when a user visits the root URL of the web server:

func UrlRoot(w http.ResponseWriter, r *http.Request) {
	fmt.Fprint(w, "Hello and welcome to the Go link shortener API \n"+
		"Do a GET request with the short Link to get the long Link \n"+
		"Do a POST request with long Link to get a short Link \n")
}

The handler functions contain two parameters:

  1. An http.ResponseWriter which is used to write our response to the http request
  2. An *http.Request which we use to understand the contents of the incoming request
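
Because a handler is just a function of those two parameters, you can call one directly without starting a server. The sketch below does that with the standard net/http/httptest package; the handler hello and the helper invoke are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// hello has the http.HandlerFunc signature: it reads from r and writes to w.
func hello(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintf(w, "%s %s", r.Method, r.URL.Path)
}

// invoke calls a handler with an in-memory request and recorder, then
// returns the status code and body that the handler produced.
func invoke(h http.HandlerFunc, method, target string) (int, string) {
	req := httptest.NewRequest(method, target, nil)
	rec := httptest.NewRecorder()
	h(rec, req)
	return rec.Code, rec.Body.String()
}

func main() {
	status, body := invoke(hello, "GET", "/")
	fmt.Println(status, body)
}
```

The recorder reports status 200 here because the handler never calls WriteHeader, which is the same default a real response gets.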

But how do I map a handler function to a specific incoming HTTP request?

Enter the concept of routes and routers. First you create a list of routes that link your handler functions to a URL and an HTTP request method, then you feed these routes to a router, which will take care of the rest:

type Route struct {
	Name        string
	Method      string
	Pattern     string
	HandlerFunc http.HandlerFunc
}

type Routes []Route

/*
	Create the routes for the API. The API supports three URLs:
		1- GET "/" => Shows a description for the API
		2- GET "/{shorturl}" => If the shortUrl exists in the backend database, redirect to the long url that corresponds to it
		3- POST "/Create" => Takes a post request with http body of {
																	shorturl: "short Link"
																	longurl:  "original long link"
																	}
		 Causes the API to create a mapping between the short url and the long url in the backend database
*/

func CreateRoutes()  Routes {
	return Routes{
		Route{
			"UrlRoot",
			"GET",
			"/",
			UrlRoot,
		},
		Route{
			"UrlShow",
			"GET",
			"/{shorturl}",
			UrlShow,
		},
		Route{
			"UrlCreate",
			"POST",
			"/Create",
			UrlCreate,
		},
	}
}

Wondering what {shorturl} is? The curly braces are how we tell the router to expect a variable. In the case of the UrlShow GET request, this variable stores the short url.

You can then retrieve the variable in the handler function by calling mux.Vars(r)["shorturl"]. Here is what the function looks like:

func (Ls *LinkShortnerAPI) UrlShow(w http.ResponseWriter, r *http.Request) {
	//retrieve the variable from the request
	vars := mux.Vars(r)
	sUrl := vars["shorturl"]
	if len(sUrl) > 0 {
		//Ls.myconnection is a pointer to the data layer
		//Find long url that corresponds to the short url from the database
		lUrl, err := Ls.myconnection.FindlongUrl(sUrl)
		if err != nil {
			//Answer with 404 so that clients can tell a miss from a successful response
			w.WriteHeader(http.StatusNotFound)
			fmt.Fprintf(w, "Could not find saved long url that corresponds to the short url %s \n", sUrl)
			return
		}
		//Redirect to the corresponding long url
		http.Redirect(w, r, lUrl, http.StatusFound)
	}
}
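
If you want to see the redirect behaviour without a router or a database, the following sketch may help. It is a standard-library stand-in, not the tutorial’s code: it reads the short url straight from the request path instead of using mux.Vars, the map lookup is a hypothetical replacement for the data layer, and answering 404 on a miss is a choice made for this sketch.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// lookup is a hypothetical in-memory stand-in for the data layer.
var lookup = map[string]string{"go": "https://golang.org/doc/"}

// show redirects a known short url to its long url and answers 404 otherwise.
// It takes the short url from the path directly rather than from mux.Vars.
func show(w http.ResponseWriter, r *http.Request) {
	short := strings.TrimPrefix(r.URL.Path, "/")
	long, ok := lookup[short]
	if !ok {
		http.Error(w, "unknown short url: "+short, http.StatusNotFound)
		return
	}
	http.Redirect(w, r, long, http.StatusFound)
}

// probe sends a GET for path to show and reports the status code and the
// Location header of the recorded response.
func probe(path string) (int, string) {
	rec := httptest.NewRecorder()
	show(rec, httptest.NewRequest("GET", path, nil))
	return rec.Code, rec.Header().Get("Location")
}

func main() {
	fmt.Println(probe("/go"))
	fmt.Println(probe("/missing"))
}
```

http.StatusFound is status 302, and the Location header carries the long url that the browser follows next.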

Now to feed these routes to a router:

func NewLinkShortenerRouter(routes Routes) *mux.Router {
	// When StrictSlash is set to true, if the route path is "/path/", accessing "/path" will redirect
	// to the former and vice versa.
	router := mux.NewRouter().StrictSlash(true)
	//Feed the router the necessary information for the web service to function properly
	for _, route := range routes {
		router.
			Methods(route.Method).
			Path(route.Pattern).
			Name(route.Name).
			Handler(route.HandlerFunc)
	}
	return router
}

“mux” in the code is the name of the Gorilla package that includes the router.
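
To get a feel for what a router does with a table of routes, here is a deliberately tiny version built only on the standard library. It is a sketch of the idea, not of how Gorilla Mux works internally: it matches on exact method and path only and has no support for variables such as {shorturl}. The names route, newRouter, text and request are all made up for this example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// route ties an HTTP method and an exact path to a handler.
type route struct {
	Method  string
	Pattern string
	Handler http.HandlerFunc
}

// newRouter returns a handler that dispatches to the first route whose
// method and path both match, and answers 404 when none does.
func newRouter(routes []route) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		for _, rt := range routes {
			if rt.Method == r.Method && rt.Pattern == r.URL.Path {
				rt.Handler(w, r)
				return
			}
		}
		http.NotFound(w, r)
	})
}

// text builds a handler that always writes msg.
func text(msg string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, msg) }
}

// request runs one in-memory request through h and returns status and body.
func request(h http.Handler, method, path string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	router := newRouter([]route{
		{"GET", "/", text("root")},
		{"POST", "/Create", text("create")},
	})
	fmt.Println(request(router, "GET", "/"))
	fmt.Println(request(router, "POST", "/Create"))
	fmt.Println(request(router, "GET", "/Create")) // method does not match
}
```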

JSON parsing

Now that we have a router that can map a function to a specific HTTP request, it’s time to figure out how to handle a POST request. Like any respectable REST API, when the Golang web service receives a POST request, it will need to read the JSON body of the request and parse it into something meaningful. In Go, the package that encodes and decodes JSON is called encoding/json. Here is the gist of how it works:

  1. You create a Go struct to represent the JSON data using struct tags. Note that the key name inside a tag must be wrapped in double quotes. Here is what it looks like for our program:
    type UrlMapping struct {
        ShortUrl string `json:"shorturl"`
        LongUrl  string `json:"longurl"`
    }
    
  2. If you want to parse JSON data from an HTTP request body into the struct, you create a decoder around the HTTP request body reader (which is like a stream in other languages), then give it a pointer to a variable of the struct type that represents the JSON body. For our program, the code will look like this:
    func (Ls *LinkShortnerAPI) UrlCreate(w http.ResponseWriter, r *http.Request) {
        //create a pointer to the UrlMapping struct
        //the UrlMapping struct maps to the JSON body
        reqBodyStruct := new(UrlMapping)
        //create a new decoder around the http request body, then feed it the pointer to the newly created struct
        if err := json.NewDecoder(r.Body).Decode(reqBodyStruct); err != nil {
            w.WriteHeader(http.StatusBadRequest)
            return
        }
        //use the data layer to add url mapping
        Ls.myconnection.AddUrls(reqBodyStruct.LongUrl, reqBodyStruct.ShortUrl)
    }
  3. If you want to write JSON data from a Go struct to an HTTP response, you create an encoder around your HTTP response writer (again like a stream in other languages), then give it the variable of the struct type that represents the JSON body. Let’s say for our program, we want to add a status message response that the API sends back to the requester whenever we receive a new POST request:
    • First, the struct containing our status message will look like this:
      type APIResponse struct {
          StatusMessage string `json:"statusmessage"`
      }
    • Second, the encoder will look like this:
      // w is http.ResponseWriter
      responseEncoder := json.NewEncoder(w)
      // Ls.myconnection is a pointer to our data layer
      err := Ls.myconnection.AddUrls(reqBodyStruct.LongUrl, reqBodyStruct.ShortUrl)
      if err != nil {
          w.WriteHeader(http.StatusConflict)
          if err := responseEncoder.Encode(&APIResponse{StatusMessage: err.Error()}); err != nil {
              fmt.Fprintf(w, "Error %s occurred while trying to add the url \n", err.Error())
          }
          return
      }
      responseEncoder.Encode(&APIResponse{StatusMessage: "Ok"})
  4. In a proper Golang web service, chances are you will end up putting the encoder and decoder together in the same handler function, since you would decipher the request and then send some kind of response back to the requester. Here is what the final function to create a short url to long url mapping looks like in our code:
    func (Ls *LinkShortnerAPI) UrlCreate(w http.ResponseWriter, r *http.Request) {
        reqBodyStruct := new(UrlMapping)
        responseEncoder := json.NewEncoder(w)
        //decode the JSON request body into the struct
        if err := json.NewDecoder(r.Body).Decode(reqBodyStruct); err != nil {
            w.WriteHeader(http.StatusBadRequest)
            if err := responseEncoder.Encode(&APIResponse{StatusMessage: err.Error()}); err != nil {
                fmt.Fprintf(w, "Error occurred while processing post request %v \n", err.Error())
            }
            return
        }
        //store the mapping, reporting a conflict if the data layer rejects it
        err := Ls.myconnection.AddUrls(reqBodyStruct.LongUrl, reqBodyStruct.ShortUrl)
        if err != nil {
            w.WriteHeader(http.StatusConflict)
            if err := responseEncoder.Encode(&APIResponse{StatusMessage: err.Error()}); err != nil {
                fmt.Fprintf(w, "Error %s occurred while trying to add the url \n", err.Error())
            }
            return
        }
        responseEncoder.Encode(&APIResponse{StatusMessage: "Ok"})
    }
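
The decode-then-encode flow from the steps above can be tried out end to end with the standard library alone. The sketch below is a simplified stand-in for UrlCreate: the map saved is a hypothetical replacement for the data layer (and is not safe for concurrent use), and the helper post and the conflict message are invented for this example. The UrlMapping and APIResponse structs follow the ones described above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

type UrlMapping struct {
	ShortUrl string `json:"shorturl"`
	LongUrl  string `json:"longurl"`
}

type APIResponse struct {
	StatusMessage string `json:"statusmessage"`
}

// saved is a hypothetical in-memory stand-in for the data layer.
var saved = map[string]string{}

// create decodes the JSON request body, stores the mapping, and encodes a
// JSON status message back to the requester.
func create(w http.ResponseWriter, r *http.Request) {
	var m UrlMapping
	enc := json.NewEncoder(w)
	if err := json.NewDecoder(r.Body).Decode(&m); err != nil {
		w.WriteHeader(http.StatusBadRequest)
		enc.Encode(APIResponse{StatusMessage: err.Error()})
		return
	}
	if _, taken := saved[m.ShortUrl]; taken {
		w.WriteHeader(http.StatusConflict)
		enc.Encode(APIResponse{StatusMessage: "short url already exists"})
		return
	}
	saved[m.ShortUrl] = m.LongUrl
	enc.Encode(APIResponse{StatusMessage: "Ok"})
}

// post sends body to create and returns the status code and response body.
func post(body string) (int, string) {
	rec := httptest.NewRecorder()
	create(rec, httptest.NewRequest("POST", "/Create", strings.NewReader(body)))
	return rec.Code, strings.TrimSpace(rec.Body.String())
}

func main() {
	const body = `{"shorturl":"go","longurl":"https://golang.org/doc/"}`
	fmt.Println(post(body))       // first request stores the mapping
	fmt.Println(post(body))       // second request hits the duplicate check
	fmt.Println(post("not json")) // malformed body is rejected
}
```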

Is that all for the web server?

Nope, but there is only one little step left before our web server can accept incoming requests. We need to tell our software to listen on a specific address, which will be our root. In Go, this is done via the http.ListenAndServe() function. In our code, say we want to listen on local port 5100, so it will look like this:

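//This will start the web server on local port 5100.
//ListenAndServe only returns when the server fails to start or stops, so we log that error and exit.
log.Fatal(http.ListenAndServe(":5100", router))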
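
If you need more control than http.ListenAndServe() gives you, the standard library also lets you configure an http.Server value yourself. The sketch below shows one way to do that; the helper names newServer and describe and all three timeout values are assumptions picked for illustration, not settings taken from this tutorial’s code.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newServer wraps a handler in an http.Server with explicit timeouts.
// The timeout values are illustrative assumptions, not recommendations.
func newServer(addr string, h http.Handler) *http.Server {
	return &http.Server{
		Addr:         addr,
		Handler:      h,
		ReadTimeout:  5 * time.Second,
		WriteTimeout: 10 * time.Second,
		IdleTimeout:  60 * time.Second,
	}
}

// describe summarizes a server's configuration for inspection.
func describe(s *http.Server) string {
	return fmt.Sprintf("addr=%s read=%s write=%s", s.Addr, s.ReadTimeout, s.WriteTimeout)
}

func main() {
	srv := newServer(":5100", http.NotFoundHandler())
	fmt.Println(describe(srv))
	// srv.ListenAndServe() would block here and serve requests. It is left
	// out so that this sketch exits immediately.
}
```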

And this should take care of the web part of our Golang web service, now let’s switch focus to the data layer.

The link shortener Golang web service data layer

Now let’s talk database. In a well-designed piece of software, the data layer is the part of your program that takes care of any interaction with the backend database where you save and retrieve data. This ensures the rest of your program doesn’t have to know about any of the specifics of the exchange between the web service and the database, which makes the code cleaner, easier for multiple teams to maintain, and easier to extend.
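
One way to picture that separation in Go is an interface that lists what the REST layer needs, with the database hidden behind it. The sketch below is an assumption about how this could be arranged, not the tutorial’s actual code: the URLStore interface, the memoryStore type and the two sentinel errors are hypothetical, while the method names AddUrls and FindlongUrl match the ones the handlers call.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrDuplicate and ErrNotFound are hypothetical sentinel errors.
var (
	ErrDuplicate = errors.New("short url already exists")
	ErrNotFound  = errors.New("short url not found")
)

// URLStore is everything the REST layer needs to know about the data layer.
type URLStore interface {
	AddUrls(longUrl, shortUrl string) error
	FindlongUrl(shortUrl string) (string, error)
}

// memoryStore is an in-memory URLStore, handy for tests and experiments.
type memoryStore struct {
	mu   sync.RWMutex
	urls map[string]string
}

func newMemoryStore() *memoryStore {
	return &memoryStore{urls: map[string]string{}}
}

// AddUrls stores the mapping unless the short url is already taken.
func (s *memoryStore) AddUrls(longUrl, shortUrl string) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, ok := s.urls[shortUrl]; ok {
		return ErrDuplicate
	}
	s.urls[shortUrl] = longUrl
	return nil
}

// FindlongUrl returns the long url stored for shortUrl, if any.
func (s *memoryStore) FindlongUrl(shortUrl string) (string, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	long, ok := s.urls[shortUrl]
	if !ok {
		return "", ErrNotFound
	}
	return long, nil
}

func main() {
	var store URLStore = newMemoryStore()
	fmt.Println(store.AddUrls("https://golang.org/doc/", "go"))
	fmt.Println(store.AddUrls("https://example.com/", "go"))
	fmt.Println(store.FindlongUrl("go"))
}
```

With this shape, a handler that depends on URLStore can be tested against memoryStore and run in production against a database-backed implementation.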

I like MongoDB: it is simple yet powerful, and has a passionate community around it. It’s free, open source, and can solve a lot of complex application problems. MongoDB uses a binary form of JSON called BSON to store data in documents, which means that data stored in a MongoDB document can be modeled much like any JSON document. I use the mgo package to interact with MongoDB from within Go.

So how to use mgo?

mgo uses the concept of “sessions” to connect to the database. A session can be thought of as a socket connection taken from a socket pool. To get a working session and use it, you need to:

  1. Dial the database, similar to tcp dialing in Go, mgo will give you back a session
  2. You use the session provided to query one of the databases in Mongo
  3. From the database, retrieve a collection
  4. Now, you can do all the updates, the inserts or the reads you need on the documents inside the collection

In our code, we will create a struct to be the model between our data layer and the rest of the Golang web service. Structs are the closest thing Go has to an object. Our struct will need to store an instance of the session so that we can reuse it for subsequent requests.

type MongoConnection struct {
    originalSession *mgo.Session
}

Unique indexing

For the purpose of the link shortener API, we need to ensure that we won’t have multiple shorturls with the same name in our database. This will ensure the mapping is unique and no collisions would occur when two users request to create the same shorturl. We accomplish that by creating a unique index and attaching it to the shorturl field.

index := mgo.Index{
	Key:      []string{"$text:shorturl"},
	Unique:   true,
	DropDups: true,
}
urlcollection.EnsureIndex(index)

Now we attach a function to that struct to create the connection for our Golang web service:

func (c *MongoConnection) createLocalConnection() (err error) {
	fmt.Println("Connecting to local mongo server....")
	c.originalSession, err = mgo.Dial(CONNECTIONSTRING)
	if err != nil {
		fmt.Printf("Error occurred while creating mongodb connection: %s\n", err.Error())
		return
	}
	fmt.Println("Connection established to mongo server")
	urlcollection := c.originalSession.DB("LinkShortnerDB").C("UrlCollection")
	if urlcollection == nil {
		//Return early so that we never call EnsureIndex on a nil collection
		err = errors.New("Collection could not be created, maybe need to create it manually")
		return
	}
	//This will create a unique index to ensure that there won't be duplicate shorturls in the database.
	index := mgo.Index{
		Key:      []string{"$text:shorturl"},
		Unique:   true,
		DropDups: true,
	}
	//Report a failed index creation to the caller instead of ignoring it
	err = urlcollection.EnsureIndex(index)
	return
}

Multiple concurrent sessions

Now that we have our connection to the database established, we need to consider the possibility of heavy traffic. Say you built the link shortener Golang web service today and everybody likes it: you will get tons of people sending POST and GET requests. You would guess that having a single database session handle all of the traffic would not be performant, and you’d be right. What to do then?

mgo offers the ability to generate multiple concurrent sessions via a socket pool. You can invoke a new session from an original session by calling originalSession.Copy(). This will create a new parallel session with the original session’s authentication information that you could use at your own leisure. You then close the new session when you are done with it. After you close the session, the socket it used will go back to a socket pool.

func (c *MongoConnection) getSessionAndCollection() (session *mgo.Session, urlCollection *mgo.Collection, err error) {
	if c.originalSession != nil {
		session = c.originalSession.Copy()
		urlCollection = session.DB("LinkShortnerDB").C("UrlCollection")
	} else {
		err = errors.New("No original session found")
	}
	return
}
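
If the “take a socket, use it, give it back” cycle feels abstract, the toy pool below may help. It is only an analogy built on a buffered channel and says nothing about how mgo implements its own pool; the types conn and pool and their methods are hypothetical names for this sketch.

```go
package main

import "fmt"

// conn is a hypothetical stand-in for a socket to the database.
type conn struct{ id int }

// pool hands out connections from a fixed set. It loosely mirrors the
// Copy/Close cycle described above and is not how mgo implements its pool.
type pool struct{ idle chan *conn }

// newPool creates a pool that holds size idle connections.
func newPool(size int) *pool {
	p := &pool{idle: make(chan *conn, size)}
	for i := 0; i < size; i++ {
		p.idle <- &conn{id: i}
	}
	return p
}

// acquire blocks until a connection is free, then hands it out.
func (p *pool) acquire() *conn { return <-p.idle }

// release puts a connection back so that another caller can use it.
func (p *pool) release(c *conn) { p.idle <- c }

// available reports how many connections are currently idle.
func (p *pool) available() int { return len(p.idle) }

func main() {
	p := newPool(2)
	c := p.acquire()
	fmt.Println("idle while in use:", p.available())
	p.release(c)
	fmt.Println("idle after release:", p.available())
}
```

In the mgo version of this cycle, Copy() plays the role of acquire and Close() plays the role of release.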

Read and write data to MongoDB

To work directly with MongoDB documents via mgo, we first need to create a struct that can host the document. The struct has to match the MongoDB document it models, and we use struct tags to map its fields to the document’s field names. Here is what it looks like:

type mongoDocument struct {
	Id       bson.ObjectId `bson:"_id"`
	ShortUrl string        `bson:"shorturl"`
	LongUrl  string        `bson:"longurl"`
}

Now we can use that struct in our queries to read or write data to the database. Read queries in MongoDB are straightforward. For example, to find a shorturl called “blah” in the database from the mongo shell, the query will look like this:

db.collection.find({shorturl:"blah"})

In case of Go and mgo, the API call will look like this:

result := mongoDocument{}
err = urlCollection.Find(bson.M{"shorturl": shortUrl}).One(&result)
if err != nil {
	return
}

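bson.M is a map type (map[string]interface{}) that mgo encodes as a BSON document, so bson.M{"shorturl": shortUrl} is the Go counterpart of the query document in the shell example. Find() in mgo accepts the query in this form.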
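
To see why a Go map lines up so neatly with a query document, here is a small standard-library sketch. The type M below is a local stand-in with the same underlying type as bson.M, and JSON is used as a readable substitute for BSON, so this illustrates the shape of the query rather than what mgo sends over the wire. The helper render is a made-up name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// M mirrors the shape of bson.M: a map from string keys to arbitrary values.
type M map[string]interface{}

// render shows the document that a query map describes, using JSON as a
// human-readable stand-in for BSON.
func render(q M) string {
	out, err := json.Marshal(q)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(render(M{"shorturl": "blah"}))
}
```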

Now let’s see what the entire function looks like:

func (c *MongoConnection) FindlongUrl(shortUrl string) (lUrl string, err error) {
	//create an empty document struct
	result := mongoDocument{}
	//get a copy of the original session and a collection
	session, urlCollection, err := c.getSessionAndCollection()
	if err != nil {
		return
	}
	defer session.Close()
	//Find the shorturl that we need
	err = urlCollection.Find(bson.M{"shorturl": shortUrl}).One(&result)
	if err != nil {
		return
	}
	return result.LongUrl, nil
}

Now, to insert data into MongoDB, we use the mgo Insert function. You pass Insert a struct that contains the new data you want to insert, then check whether an error occurred. When an error occurs, we can verify whether it was caused by a duplicate short url so that we can report it back. To check for the duplicate error, we use the mgo.IsDup(err) function.

func (c *MongoConnection) AddUrls(longUrl string, shortUrl string) (err error) {
	//get a copy of the session
	session, urlCollection, err := c.getSessionAndCollection()
	if err == nil {
		defer session.Close()
		//insert a document with the provided function arguments
		err = urlCollection.Insert(
			&mongoDocument{
				Id:       bson.NewObjectId(),
				ShortUrl: shortUrl,
				LongUrl:  longUrl,
			},
		)
		if err != nil {
			//check if the error is due to duplicate shorturl
			if mgo.IsDup(err) {
				err = errors.New("Duplicate name exists for the shorturl")
			}
		}
	}
	return
}

And that should cover all the major parts of the web service. Hope that was useful to you!