Microservices in Go: Pagination using Elasticsearch
Jun 14, 2021

Disclaimer: This post includes Amazon affiliate links. If you click on one of them and make a purchase, I’ll earn a commission. Please note that your final price is not affected at all by using those links.

I previously covered Elasticsearch; in that post I described what is needed to support searching values in our microservice. This time I’m going to implement another common search feature: pagination.

How does pagination work in Elasticsearch?

The records indexed in Elasticsearch that match our search criteria form a collection of n records. That n is the maximum number of records at our disposal to return to our customers.

Pagination Elasticsearch - n records

With that in mind, the Elasticsearch API uses two values, from and size, to determine which records to retrieve:

Pagination Elasticsearch - n records

In practice this API works like a sliding window: we explicitly indicate the start (from) and the end (from + size) of the window, while keeping in mind the maximum value (total) we can retrieve.
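The sliding window can be sketched in a few lines of Go. This is only an illustration of the arithmetic, not code from the service: the helper names window and pageToFrom are hypothetical, and the 1-based page number is an assumption about how a caller might express pages.

```go
package main

import "fmt"

// window returns the half-open range [start, end) of record positions
// covered by a request with the given from and size, clamped to total.
// Hypothetical helper, used only to illustrate the sliding window.
func window(from, size, total int64) (start, end int64) {
	if from < 0 {
		from = 0
	}
	if size < 0 {
		size = 0
	}
	start = from
	if start > total {
		start = total
	}
	end = from + size
	if end > total {
		end = total
	}
	return start, end
}

// pageToFrom converts a 1-based page number into the "from" offset.
func pageToFrom(page, size int64) int64 {
	if page < 1 {
		page = 1
	}
	return (page - 1) * size
}

func main() {
	const total, size = 25, 10
	for page := int64(1); page <= 4; page++ {
		from := pageToFrom(page, size)
		start, end := window(from, size, total)
		fmt.Printf("page=%d from=%d size=%d -> records [%d, %d)\n", page, from, size, start, end)
	}
}
```

When start equals end the window is empty, which is what a request past the last record gives back.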

Besides passing those two arguments during search, we should also index our records correctly, depending on how we plan to search the values and which fields we plan to sort by. Without a well-defined sort order, the same record could show up on different pages between requests, so this is what gives us a deterministic way to fetch the results.

Another thing to consider is that this way of searching has a limitation: by default, from + size cannot go beyond 10,000 records (the index.max_result_window setting). If we want to search past that point, we should consider using something like Search after (the search_after parameter), which offers a more advanced way to walk through search results.
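To make the limit and the alternative a bit more concrete, here is a minimal sketch of how a request body using search_after could be built. The function names fitsWindow and searchAfterBody are hypothetical, the sort shown (score first, then id) is an assumption, and the cursor values in main are made up; in a real request they would be copied from the sort array of the last hit of the previous page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maxResultWindow mirrors the default value of index.max_result_window.
const maxResultWindow = 10000

// fitsWindow reports whether a from/size request stays within the default limit.
func fitsWindow(from, size int64) bool {
	return from+size <= maxResultWindow
}

// searchAfterBody builds a search body that pages with search_after instead
// of from. lastSort holds the sort values of the last hit already seen; pass
// nil for the first page. Hypothetical helper for illustration only.
func searchAfterBody(size int64, lastSort []interface{}) map[string]interface{} {
	body := map[string]interface{}{
		"size": size,
		// search_after needs a sort; a unique field as tiebreaker keeps it deterministic.
		"sort": []interface{}{
			"_score",
			map[string]interface{}{"id": "asc"},
		},
	}
	if len(lastSort) > 0 {
		body["search_after"] = lastSort
	}
	return body
}

func main() {
	fmt.Println(fitsWindow(9990, 10), fitsWindow(9995, 10))

	// Made-up cursor: score and id of the last hit on the previous page.
	body := searchAfterBody(10, []interface{}{1.5, "task-0042"})
	out, err := json.Marshal(body)
	if err != nil {
		fmt.Println("encoding body:", err)
		return
	}
	fmt.Println(string(out))
}
```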



Adding pagination support to our Go service

The code used for this post is available on GitHub.

In the previous implementation we didn’t explicitly create a mapping for our records; we let Elasticsearch do that for us. That won’t work correctly for our pagination feature, so this time we need to define the mapping in advance, before indexing new records.

To do that we have to indicate the properties to use in our mapping as well as the concrete type of each field. In our case, something like the following is enough:

curl -X PUT -H 'Content-Type: application/json' "http://localhost:9200/tasks" -d '
{
  "mappings": {
    "properties": {
      "id": {
        "type": "keyword"
      },
      "description": {
        "type": "text"
      }
    }
  }
}'

That way:

  • id is defined as keyword, so we can use it as an actual ID but also for sorting purposes (you will see how this is used if you continue reading), and
  • description is defined as text, so we can search by the content of the description.

Next we have to update our API to support the two new values, from and size, and to give us a way to return the results we found. The way I chose to do that is by defining two new types:

type SearchArgs struct {
	Description *string
	Priority    *Priority
	IsDone      *bool
	From        int64
	Size        int64
}

type SearchResults struct {
	Tasks []Task
	Total int64
}

Both are added to the domain package (internal); that way we could add concrete validations to the SearchArgs type, if needed. This approach may not be ideal when talking about Domain Driven Design, but that’s the tradeoff I’m making here.

What’s next is to update the current APIs used for searching so they support those new types. Specifically I’m referring to the Elasticsearch Repository, changing it from:
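As an example of what such validations could look like, here is a minimal sketch. The Validate method and its rules are assumptions made for illustration, not part of the post’s code, and SearchArgs is trimmed to the fields the sketch needs so that it compiles on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// SearchArgs is a trimmed copy of the type above; Priority and IsDone are
// left out so the sketch is self-contained.
type SearchArgs struct {
	Description *string
	From        int64
	Size        int64
}

// maxWindow mirrors the default Elasticsearch limit for from + size.
const maxWindow = 10000

// Validate is a hypothetical method; the rules are assumptions.
func (a SearchArgs) Validate() error {
	switch {
	case a.From < 0:
		return errors.New("from must not be negative")
	case a.Size <= 0:
		return errors.New("size must be positive")
	case a.From+a.Size > maxWindow:
		return fmt.Errorf("from + size must not exceed %d", maxWindow)
	}
	return nil
}

func main() {
	for _, args := range []SearchArgs{
		{From: 0, Size: 10},
		{From: -1, Size: 10},
		{From: 9995, Size: 10},
	} {
		fmt.Printf("from=%d size=%d -> %v\n", args.From, args.Size, args.Validate())
	}
}
```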

func (t *Task) Search(ctx context.Context, description *string, priority *internal.Priority, isDone *bool) ([]internal.Task, error) {

to:

func (t *Task) Search(ctx context.Context, args internal.SearchArgs) (internal.SearchResults, error) {

And finally, we should update the payload in the search request to use those new fields:

query["sort"] = []interface{}{
	"_score",
	map[string]interface{}{"id": "asc"},
}

query["from"] = args.From
query["size"] = args.Size

With all of those changes we have support for pagination!

Conclusion

Implementing pagination in Elasticsearch is more or less easy to do; the hard parts are defining the correct mapping and the rules to use when searching records. Pagination does force us (in a way) to define those mappings, but that is not what makes it harder. Using Elasticsearch in general is difficult when we don’t know in advance how our customers are planning to search records; as soon as that is formalized and implemented, everything else gets better.

Elasticsearch can be overwhelming at times, but I assure you that after reading the official documentation a lot of your questions should have concrete answers.

If you’re looking to sink your teeth into more Elasticsearch-related topics I recommend the following books:

