Wikipedia defines Schema Migrations as (emphasis mine):
… refers to the management of incremental, reversible changes and version control to relational database schemas. A schema migration is performed on a database whenever it is necessary to update or revert that database’s schema to some newer or older version.
… and a related definition of Database refactoring as:
… simple change to a database schema that improves its design while retaining both its behavioral and informational semantics
A repository including the code examples accompanies this post.
When building software products, part of the point of using version control systems is to improve collaboration between all the team members working on the project. That definitely includes checking in the code and the dependencies used to build the artifacts customers ultimately use, but also everything else needed for that final product to be available.
The products we build usually need a datastore, and most of the time that datastore happens to be a relational database. Keeping track of how that datastore evolves is important, especially when evolving live environments, to determine whether new changes break existing data.
Tools
In general, no matter what language they were written in, database schema migration tools come in two flavors:
- DSL-based: where there’s some sort of pseudo-language for defining the instructions meant to be used for interacting with the database engine; and
- Explicit SQL: where literal SQL statements are used to define the required instructions.
For Go specifically the following are some concrete options available:
- Build your own 😄
- golang-migrate/migrate: CLI + Package (Explicit SQL),
- github.com/go-pg/migrations: Package (Explicit SQL),
- github.com/jackc/tern: CLI (Explicit SQL),
- github.com/gobuffalo/pop: CLI (DSL + Explicit SQL), and
- many more…
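Whichever option you pick, the core job is the same: discover the versioned migrations, skip the ones already applied, and run the rest in order. A minimal sketch of that bookkeeping from the “build your own” angle (the versions, SQL statements, and `applied` set here are illustrative, not taken from any real tool):

```go
package main

import (
	"fmt"
	"sort"
)

// pending returns the migration versions that still need to run, in order.
// available maps version -> SQL statement; applied records versions already run.
func pending(available map[int]string, applied map[int]bool) []int {
	var versions []int
	for v := range available {
		if !applied[v] {
			versions = append(versions, v)
		}
	}
	sort.Ints(versions) // migrations must run in a deterministic order
	return versions
}

func main() {
	available := map[int]string{
		1: "CREATE TABLE users (id serial PRIMARY KEY);",
		2: "ALTER TABLE users ADD COLUMN email text;",
		3: "CREATE INDEX users_email_idx ON users (email);",
	}
	applied := map[int]bool{1: true} // version 1 already ran

	for _, v := range pending(available, applied) {
		// a real tool would execute the SQL and record v
		// in something like a schema_migrations table
		fmt.Printf("applying %d: %s\n", v, available[v])
	}
}
```

Real tools add the hard parts on top of this loop: reading migrations from a source, locking so two deploys don’t migrate concurrently, and recording applied versions in the database itself.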
My personal favorite is golang-migrate/migrate because it supports a variety of data inputs (migrate calls them sources), to mention a few:
- Filesystem, reads from filesystem, or
- AWS S3, reads files from Amazon Web Services S3.
You also have the option to implement your own source in case yours is not supported, and most of the popular database engines are covered as well.
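When used as a package, the source and database are both selected via URL schemes. A hedged sketch of that usage (the migrations directory and the connection string are placeholders; the blank imports register the drivers):

```go
package main

import (
	"log"

	"github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/postgres"
	_ "github.com/golang-migrate/migrate/v4/source/file"
)

func main() {
	// file:// reads migrations from the local filesystem;
	// an s3:// URL would read them from an S3 bucket instead.
	m, err := migrate.New(
		"file://db/migrations",
		"postgres://user:pass@localhost:5432/app?sslmode=disable", // placeholder DSN
	)
	if err != nil {
		log.Fatal(err)
	}
	// ErrNoChange just means there was nothing pending to apply.
	if err := m.Up(); err != nil && err != migrate.ErrNoChange {
		log.Fatal(err)
	}
}
```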
And because of this range of different sources and databases, I recommend you keep an eye out when installing migrate: by default it installs all of them, and maybe that is something you don’t want, because it brings dependencies into your project that you don’t need. For example, to build migrate with filesystem and PostgreSQL support only, you would need something like:
```shell
go install -tags 'postgres' github.com/golang-migrate/migrate/v4/cmd/migrate
```
Running migrate -h should indicate the supported source and database drivers; in the case above it prints out:

```
Source drivers: file
Database drivers: postgres, postgresql, stub
```
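With a build like that in place, day-to-day usage of the migrate CLI looks roughly like this (the db/migrations directory and the DATABASE_URL variable are placeholders for your own setup):

```shell
# create a timestamp-prefixed pair of empty migration files (.up.sql / .down.sql)
migrate create -ext sql -dir db/migrations create_users_table

# apply all pending migrations
migrate -source file://db/migrations -database "$DATABASE_URL" up

# roll back the most recent migration
migrate -source file://db/migrations -database "$DATABASE_URL" down 1
```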
Best Practices, Conventions and Guidelines
Selecting a tool for handling migrations is one part of the puzzle; the other is defining the best practices, conventions and guidelines to follow when working with database schema migrations:
Define filename conventions
Tools that support creating migrations, like migrate, provide a default format for those generated files; in the case of migrate, by default the files are prefixed with the timestamp at which the tool was invoked. It also supports sequentially numbered files.
Prefer using SQL instead of DSL
Using a DSL instead of explicit SQL has at least two benefits: it allows engineers without much SQL experience to contribute changes to database schema migration files quickly, and it avoids database engine lock-in. In my experience, learning SQL and using explicit statements is a better time investment, because it lets you understand exactly what instructions are being executed in your migration files.
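As a concrete example, an explicit-SQL migration is just a pair of up/down files whose statements you control completely (the file names and schema below are illustrative):

```sql
-- 20060102150405_create_users.up.sql
CREATE TABLE users (
    id    serial PRIMARY KEY,
    email text NOT NULL
);

-- 20060102150405_create_users.down.sql
DROP TABLE users;
```

The down file should undo exactly what the up file did, so the pair stays reversible.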
The actual migration should be part of the deployment pipeline, deployed at the same time as the code that refers to those changes.
The idea behind this is to allow reacting quickly to possible failures. If we deploy the code changes before the database schema changes, we could get errors when the code refers to columns or tables that don’t exist yet; a similar situation happens if we deploy the migration first, but in a separate step. We should make an effort to migrate the database up and then apply the new code in the same step of the pipeline; that way, if the migration fails, the code meant to use those changes is not deployed either.
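As a sketch, a deployment step in a CI pipeline could chain the migration and the rollout, aborting the rollout if the migration fails (the paths, variable, and deploy command are illustrative, not from any particular CI system):

```shell
#!/bin/sh
set -e  # stop immediately if any command fails

# apply pending migrations first...
migrate -source file://db/migrations -database "$DATABASE_URL" up

# ...and only deploy the new code if the migration succeeded
./deploy-new-version.sh  # hypothetical deploy step
```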
Conclusion
Database schema migrations help you keep track of the changes made to your database, and although there are a lot of tools available in the Go ecosystem, the tricky part is not selecting one of them but finding the right balance when applying those changes to live environments, and defining the steps to follow when reacting to migration errors.