If you are one of the lucky ones who had the opportunity to work on a Java web project from scratch, or if the project you are maintaining isn’t legacy and wasn’t poorly designed (at least from a data modeling perspective), then you have most likely used Spring and Hibernate (likely with JPA) to develop and manage the persistence of your application.
In the case you have been working for a client using these technologies, chances are you have had to manage multiple environments (two at the very least: Development and Production), each with its own configuration and datasource.
Once you have defined the persistent model of your application, Hibernate is kind enough to offer you a handy tool for creating the schema of your database automatically: Auto DDL. This feature is seen as magical by many Java newcomers, and with reason: not only can it generate the initial schema for your database without you having to write any SQL, it can also keep up with the changes of your model as it evolves over time.
This is all good, but the thing is that Hibernate only has 2 ways of “keeping up” with your model: it can update the schema, which really only works for schema additions, or it can throw everything out the window (create-drop) and build it again from the ground up.
This is rarely a problem in development, especially if you are working with an in-memory database like H2, but it can be catastrophic in production!
Just imagine throwing away all your client’s precious data just to add a column to a table!
In reality, when dealing with real data you usually disable automatic schema generation, either by removing auto-ddl completely or by setting it to validate mode. It’s so unsafe that even the Hibernate team advises against using it in production environments in their book “Java Persistence with Hibernate”.
Then to perform any modifications to your schema, you usually proceed with the following steps:
1) You export your data from the production database, either to modify it in order to adjust it to the new schema or merely as a backup.
2) If you are productive (i.e. lazy) like me, you tell your DB engine to make a schema generation script for your latest version.
Or, if you are really into SQL, you manually create a script with all the changes that need to be performed on the production schema to bring it up to the latest version.
3) And finally you either run your custom script, or you drop your database, run the new schema generation script, and re-import the production data into the new schema.
As you can see, much of this process is manual. But the worst part is GOING BACK!
What if you made a mistake?
What if you want to test if a newly encountered bug existed with a previous version of the schema?
You’d better have that “schema” folder on your local machine (or in Git) with all the versions and the differences of all the changes made each time…
Wouldn’t it be great to have a Git-like tool for your database schema? Something that can both generate and remember every single change you made to your persistent model, and in which order you made them, with the ability to rollback to a previous “commit” whenever you feel like it?
Liquibase is one of the most popular database version control tools and the one we are gonna be looking at here.
It has a community version which is open-source and a Pro version which comes with a couple extra punches here and there.
I know what you are thinking and don’t worry, the community version is more than enough for most use cases, and for the ones it can’t handle you probably need a DBA anyway.
The basic idea of Liquibase is to provide an automated way of generating database migration scripts to tell your DB Engine how to go from “Schema A” to “Schema B”, where “Schema A” can be any state (including an empty database) and “Schema B” can be any other database schema; and to keep a version log of each of the changes in those scripts.
Leveraging a very powerful plugin called “liquibase-hibernate”, we can also use our own JPA entities as a schema, so we can generate our migration scripts comparing our current (outdated) database schema with our updated model without the need of first pouring our model into a second database.
“Ok, I get it, Liquibase is good. Now how do I use it?”
Glad you asked, let’s use a basic Spring JPA application (using Java 8) as an example.
Since I don’t want to waste your time (or mine) writing 4 or 5 pages of Spring configuration, we are gonna turn on easy mode (2017+ mode) and use Spring Boot to handle the heavy lifting for us, which is something I highly encourage you to do in your everyday Spring programming as well.
We are gonna use Maven to handle our dependencies and make the process even more straightforward, and we are going to use MySQL as our database engine.
Here is the pom.xml file we will use on our little example app:
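Since the original listing was an image, here is a sketch of what that pom.xml contains. The group/artifact ids are the real ones, but the versions, the package name and the ${env} profile property are illustrative assumptions, so adjust them to your project:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.9.RELEASE</version> <!-- illustrative 2017-era version -->
    </parent>

    <groupId>com.example</groupId>
    <artifactId>liquibase-demo</artifactId>
    <version>1.0</version>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-jpa</artifactId>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
        </dependency>
        <dependency>
            <groupId>org.liquibase</groupId>
            <artifactId>liquibase-core</artifactId>
        </dependency>
    </dependencies>

    <build>
        <plugins>
            <plugin>
                <groupId>org.liquibase</groupId>
                <artifactId>liquibase-maven-plugin</artifactId>
                <version>3.5.3</version> <!-- illustrative version -->
                <configuration>
                    <!-- the ${env} property is set by the Maven profiles below -->
                    <propertyFile>src/main/resources/liquibase/config/liquibase-${env}.properties</propertyFile>
                </configuration>
                <dependencies>
                    <!-- the liquibase-hibernate plugin mentioned earlier -->
                    <dependency>
                        <groupId>org.liquibase.ext</groupId>
                        <artifactId>liquibase-hibernate5</artifactId>
                        <version>3.6</version> <!-- illustrative version -->
                    </dependency>
                </dependencies>
            </plugin>
        </plugins>
    </build>

    <profiles>
        <profile>
            <id>development</id>
            <activation><activeByDefault>true</activeByDefault></activation>
            <properties><env>development</env></properties>
        </profile>
        <profile>
            <id>production</id>
            <properties><env>production</env></properties>
        </profile>
    </profiles>
</project>
```

Depending on your versions, the plugin section may also need a few Spring and Hibernate dependencies so that the hibernate:spring reference URL (covered below) can be resolved.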
We are simulating the creation of a REST API so we are importing both Spring Web and Spring Data JPA.
If you look closely you can see that we are not only adding liquibase-core, but also a Liquibase Maven plugin, which we will use to run Liquibase commands through Maven, as well as the liquibase-hibernate plugin we mentioned earlier.
We have defined only 2 environments to keep things simple, Production and Development.
Naturally, Spring Boot provides built-in support for Liquibase and you can easily configure it with a couple of steps:
- Create a folder inside src/main/resources to place all your liquibase related files.
I usually have a “liquibase” folder and a “liquibase/config” folder specific for configuration files, so we are gonna use that setup in this example.
- Inside your liquibase config folder create one liquibase.properties file for each environment. The naming convention is as follows:
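The exact file names are an assumption here, but they have to match the suffix picked by the Maven profiles (the -Pdevelopment / -Pproduction flags used later):

```
src/main/resources/liquibase/config/liquibase-development.properties
src/main/resources/liquibase/config/liquibase-production.properties
```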
- The configuration for each environment is essentially the same in this example, since our “production” database will just be a different local database. But the contents of these files will obviously differ once you deploy to a cloud environment.
So here is the configuration of the example:
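The original file was shown as an image, so here is a sketch of liquibase-development.properties. The credentials, database name and package name are placeholders, and the exact set of reference* properties may vary slightly with your plugin versions:

```properties
# where Liquibase reads the changelog from
changeLogFile=src/main/resources/liquibase/liquibase-changeLog.xml
# where the diff command appends changes (same file, so diffs accumulate)
diffChangeLogFile=src/main/resources/liquibase/liquibase-changeLog.xml
# where generateChangeLog writes, kept separate to avoid overwriting the changelog
outputChangeLogFile=src/main/resources/liquibase/liquibase-changeLog-new.xml

# how to connect to the database
url=jdbc:mysql://localhost:3306/customers
username=root
password=root
driver=com.mysql.jdbc.Driver

# the reference schema used by diff: our own JPA entities,
# courtesy of the liquibase-hibernate plugin
referenceUrl=hibernate:spring:com.example.model?dialect=org.hibernate.dialect.MySQL5Dialect
referenceDriver=liquibase.ext.hibernate.database.connection.HibernateDriver
referenceUsername=root
```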
There is a lot to unpack here but don’t worry, it’s actually pretty simple.
- Everything between url and driver just tells Liquibase how to connect to your database. We could rely on the Spring Boot datasource configuration in the application.properties files, but since we also want to use Liquibase through the Maven plugin we have to add it here as well.
- The first 3 lines tell Liquibase, respectively, where to read the changelog file from, where to append the changes when using the diff command (more on this in a moment), and where to output any generated changelog files.
We have set the diff to append any new changes to our existing changelog file, and also told Liquibase to generate a brand new changelog file when running the generateChangeLog command, mainly to avoid unwanted overwrites of our precious changelog. (Too many changelogs, right? :))
- The last 3 lines, the ones that start with “reference”, are used by the diff command to know which schema to use as the reference when comparing against the currently configured database.
This is where our liquibase-hibernate plugin comes in. You may have already noticed how weird the referenceUrl looks; that is because, with the help of this lovely plugin, we are telling Liquibase to use OUR MODEL as the reference schema when generating diff files. If you look closely you will spot the “hibernate:spring:” prefix in the referenceUrl, followed by the package our entities live in.
- By default, if Spring Boot detects that you have imported liquibase into the project it will enable it.
If you want your app to use liquibase directly and keep the schema up to date every time it runs, this is exactly what you want, and all you have to do is tell Spring where to find your changelog file. We do this by adding the following property into the application.properties file:
spring.liquibase.change-log = PATH_TO_CHANGELOG
In my case, the change log is in the “liquibase” folder inside the app’s standard resources folder so the path looks something like this:
spring.liquibase.change-log = classpath:liquibase/liquibase-changeLog.xml
- In the case that you want to use Auto-DDL for your development environment instead of relying on liquibase, or if you don’t want to run liquibase when running the app and instead just want to use it externally to update your production schema, you should disable it using the following spring boot property:
spring.liquibase.enabled = false
So here is how the application.properties should look like:
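A sketch of the development application.properties under the assumptions above (the datasource values are placeholders):

```properties
spring.datasource.url=jdbc:mysql://localhost:3306/customers
spring.datasource.username=root
spring.datasource.password=root

# let Liquibase own the schema; Hibernate only validates/maps
spring.jpa.hibernate.ddl-auto=none

spring.liquibase.change-log=classpath:liquibase/liquibase-changeLog.xml
# uncomment to skip Liquibase at startup and only run it via Maven:
# spring.liquibase.enabled=false
```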
Super simple right?
And that’s pretty much it as far as configuration is concerned. So now that we are all set up, let’s get our hands dirty…
Our app’s model (Version 1.0)
Since we are focusing on how to use liquibase and not how to use JPA we are gonna keep things as simple as possible and start with just a single entity → Customer
This is the way our app models customers:
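The original class was shown as an image; here is a sketch of Version 1.0 of Customer, with the fields reconstructed from the changes discussed later in the article (the package name and field types are assumptions):

```java
package com.example.model;

import javax.persistence.Entity;

// Version 1.0: a customer with a plain-String address and a
// running total of the amount spent.
@Entity
public class Customer extends AbstractEntity {

    private String name;
    private String address;
    private Double amountSpent;
}
```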
(I have omitted getters and setters for simplicity’s sake)
And here is the base class for all our persistent entities:
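This listing was also an image; a minimal sketch of what such a base class usually looks like (beyond the id, the exact contents are an assumption):

```java
package com.example.model;

import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.MappedSuperclass;

// Shared mapping for all persistent entities: a generated surrogate key.
@MappedSuperclass
public abstract class AbstractEntity {

    @Id
    @GeneratedValue
    private Long id;
}
```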
So far so good, a very simple model indeed. So let’s move on to how to make liquibase use this to create the schema.
Generating the initial changeLog
In the real world you will most likely already have a database up and running by the time you make the (brilliant) decision to start using liquibase. It is in this situation that you generate your initial changelog, using the “generateChangeLog” command (a shocker, I know).
In order to generate the changelog using the liquibase maven plugin all you have to do is run the command specifying the appropriate profile file:
mvn liquibase:generateChangeLog -Pdevelopment
Like we just mentioned, this will generate the changelog file based on an already existing schema, which is the one pointed at by the “url” property we defined in the liquibase.properties file. So if we assume that the customer database has already been created, running the generateChangeLog command will create a file like the following:
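The generated file was shown as an image in the original post; here is a sketch of what generateChangeLog produces for this schema (the changeset ids, author and column types are illustrative and will differ on your machine):

```xml
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog logicalFilePath="path-independent"
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
                        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.5.xsd">

    <changeSet author="dev (generated)" id="1510000000000-1">
        <createTable tableName="customer">
            <column autoIncrement="true" name="id" type="BIGINT">
                <constraints primaryKey="true"/>
            </column>
            <column name="name" type="VARCHAR(255)"/>
            <column name="address" type="VARCHAR(255)"/>
            <column name="amount_spent" type="DOUBLE"/>
        </createTable>
    </changeSet>
</databaseChangeLog>
```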
IMPORTANT: If you are following along you may have noticed that I have added a logicalFilePath="path-independent" property to the <databaseChangeLog> tag.
By default, Liquibase identifies each migration/changeset by a combination of its id, author and filename (including its path). Since we are using a classpath-based changelog path, this could cause problems later on when trying to rollback changes, because the filename might differ depending on the context.
To avoid this, we set the property to a constant value to make sure that every migration/changeset in the changelog is identified only by its id and author fields.
“But I’m following this example, so I don’t have a database already…”
In this example we are starting with liquibase from scratch, so we don’t have an already existing schema to build the changelog from, but we DO have our entities. So what we can do is use the “diff” command to tell liquibase that we want to append all the changes necessary to go from our empty schema to our entities’ schema. We do this by running the command as follows:
mvn liquibase:diff -Pdevelopment
Running this command generates practically the same result as running the generateChangeLog command on an already existing database with this schema. The only difference is that the liquibase-hibernate plugin we are using has a little more information when going from entities to schema, so it can even add a name to the primary key of the table:
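An illustrative version of the diff-generated changeset; note the named primary key constraint, which the plain generateChangeLog run could not produce (the exact constraint name is an assumption):

```xml
<changeSet author="dev (generated)" id="1510000000001-1">
    <createTable tableName="customer">
        <column autoIncrement="true" name="id" type="BIGINT">
            <!-- the extra bit of information the hibernate plugin contributes -->
            <constraints primaryKey="true" primaryKeyName="customerPK"/>
        </column>
        <column name="name" type="VARCHAR(255)"/>
        <column name="address" type="VARCHAR(255)"/>
        <column name="amount_spent" type="DOUBLE"/>
    </createTable>
</changeSet>
```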
Like we said earlier, running the diff command will always append the necessary changes to go from A to B to your changelog file. If the file doesn’t exist it will create it with those changes only.
It’s important to remember that it APPENDS the changes, so if you run it twice without updating the schema after the first run, you guessed it, you will end up with duplicated changes in your changelog.
Now that our changelog is good to go, all we have to do to apply these changes to the database is run the app (if we have liquibase enabled in the properties file), or run the “update” command from the liquibase maven plugin:
mvn liquibase:update -Pdevelopment
After running the command we can see that the schema has been correctly generated.
Evolving the model (Version 1.1)
Now that we have our initial schema up and running, let’s make some changes to the model, simulating the natural evolution of our application code.
After some time, our client tells us that he wants to store more data about each purchase, like its date and its amount.
So now we have to update our model with this new entity. Easy peasy right? We just create the Purchase entity and add the relationship with the Customer:
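A sketch of the new entity (the field names are inferred from the article, and the mapping details are assumptions):

```java
package com.example.model;

import java.util.Date;

import javax.persistence.Entity;
import javax.persistence.ManyToOne;

// A purchase belongs to exactly one customer.
@Entity
public class Purchase extends AbstractEntity {

    private Date date;
    private Double amount;

    @ManyToOne
    private Customer customer;
}
```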
And after creating it we realize that we can now calculate a customer’s amount spent by just adding up the amounts of its purchases, so we add the list of purchases to the customer and remove the “amountSpent” property:
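A sketch of the updated Customer (the mappedBy side of the relationship is an assumption):

```java
package com.example.model;

import java.util.List;

import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.OneToMany;

// Version 1.1: amountSpent is gone, the list of purchases is in.
@Entity
public class Customer extends AbstractEntity {

    private String name;
    private String address;

    @OneToMany(mappedBy = "customer", fetch = FetchType.EAGER)
    private List<Purchase> purchases;
}
```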
I’m naively assuming we will always need the customer’s purchase information when retrieving them, and hence I have defined an EAGER fetch strategy.
We now run the diff command again and the changes to the schema are appended to our already existing changelog:
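The appended changesets would look roughly like this (generated ids and constraint names are illustrative):

```xml
<changeSet author="dev (generated)" id="1510000000002-1">
    <createTable tableName="purchase">
        <column autoIncrement="true" name="id" type="BIGINT">
            <constraints primaryKey="true" primaryKeyName="purchasePK"/>
        </column>
        <column name="amount" type="DOUBLE"/>
        <column name="date" type="datetime"/>
        <column name="customer_id" type="BIGINT"/>
    </createTable>
</changeSet>
<changeSet author="dev (generated)" id="1510000000002-2">
    <addForeignKeyConstraint baseColumnNames="customer_id" baseTableName="purchase"
        constraintName="FK_purchase_customer"
        referencedColumnNames="id" referencedTableName="customer"/>
</changeSet>
<changeSet author="dev (generated)" id="1510000000002-3">
    <dropColumn columnName="amount_spent" tableName="customer"/>
</changeSet>
```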
Sweet, now we just run the app or use the update command and our database is up to date once more.
Evolving the model, again (Version 1.2)
After reviewing the app’s features we realize that the customer’s address shouldn’t be just a string, and should instead be a Value Object with a few fields: Street, City and Country.
So we refactor, run the diff again and generate a new version of the schema:
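A sketch of the refactor (names are assumptions): the address String is replaced by an embedded value object, and the resulting diff will, among other things, drop the old address column and add street, city and country columns.

```java
package com.example.model;

import javax.persistence.Embeddable;

// The address value object; its fields are flattened into
// the owning entity's table as separate columns.
@Embeddable
public class Address {

    private String street;
    private String city;
    private String country;
}
```

On the Customer side, the `private String address;` field becomes an `@Embedded private Address address;`.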
This is becoming almost second nature right?
Let’s now imagine for a moment that we screwed up. I know, I know, it’s hard to picture something so unusual, but bear with me for a moment here ;)
We messed up when updating the model and found out that the customer address should be just a string after all. We curse our OO mindset for a bit and try to figure out how to fix the error.
We could update the model to leave everything how it was, generate a new diff for this “1.3” version (which really is 1.1 in disguise), and run the update on the database. Or… we could just rollback to the previous version :)
Luckily for us, using liquibase rollback functionality is quite straightforward, but first we need to understand how liquibase handles migrations under the hood.
Databasechangelog and Databasechangeloglock tables
If you inspect the database we have been modifying up until now, you will notice that there are 2 tables that don’t belong to our schema: “databasechangelog” and “databasechangeloglock”.
These tables are automatically generated by liquibase and serve the following purposes:
- databasechangelog: This is where liquibase keeps a record of every changeset/migration run on the database. As the name implies, it’s basically the database-side counterpart of the changelog file.
Each changeset corresponds to a row in the table, and is identified by a combination of the id, author, and filename columns.
- databasechangeloglock: This table allows liquibase to acquire a lock on the databasechangelog table, ensuring that concurrent runs don’t mess up the migrations.
So if we inspect our current databasechangelog table we can see every changeset run on the database so far.
Going back to “Going back”
So now that we understand how migrations are stored, if we inspect the databasechangelog table a bit the solution to our problem comes up pretty easily. All we need to do is tell liquibase to rollback the latest 4 changesets in that table and we will be back to version 1.1.
Great, so how do we do that? Using the “rollback” command:
mvn liquibase:rollback -Dliquibase.rollbackCount=4 -Pdevelopment
After performing the rollback we can also add a tag, so that later on we can rollback to it without having to count rows in a table:
mvn liquibase:tag -Dliquibase.tag=tag1 -Pdevelopment
mvn liquibase:rollback -Dliquibase.rollbackTag=tag1 -Pdevelopment
So we first rollback our code using git and then we rollback our database schema using liquibase, and we are back to 1.1 in no time!
But wait… THERE IS A CATCH!
Unfortunately there are some changesets that liquibase prefers the developer to decide how to rollback. Liquibase can automatically rollback commands like “create table”, “rename column” and “add column”, but things like “drop table”, “insert data” and “drop column” (like our migration number 8) must have an explicitly defined rollback strategy.
I know, this is a bummer, but don’t sweat it, it’s actually pretty easy to define our rollback strategy. We just add a <rollback> tag inside the changeset we want to rollback, with the steps needed to roll it back.
In this case we just need to add the “address” column again to the Customers table so it is as simple as modifying the dropColumn changeset to the following:
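A sketch of the modified changeset (the id and author shown are placeholders for the generated values in your changelog, and the column type should match the original one):

```xml
<changeSet author="dev (generated)" id="1510000000003-2">
    <dropColumn columnName="address" tableName="customer"/>
    <!-- how to undo this changeset: put the column back -->
    <rollback>
        <addColumn tableName="customer">
            <column name="address" type="varchar(255)"/>
        </addColumn>
    </rollback>
</changeSet>
```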
See? Nothing fancy. Now we can safely run our rollback command and go back to our good old version 1.1!
Keeping it clean
By now you should have a decent idea of how to use Liquibase to your advantage and after seeing (and perhaps copy-pasting) the code examples above, you might as well have just left this page to never return.
But in the event that you are still here, why don’t we apply some good practices and tidy up our log a bit.
Since your model will likely evolve over time you are gonna find yourself adding dozens of changes to your changelog, which will make it very hard to read the file and to find out which changes were introduced in a particular version of the code.
The ideal way of handling this issue would be to have one changelog file for each set of changes introduced in a particular version.
Luckily for us, liquibase makes this pretty easy using the “<include>” tag.
All it takes is:
1) Defining a separate output file for the diff command so that it doesn’t append the changes directly to our changelog. We do this by simply changing the “diffChangeLogFile” path to point to a different file, in the liquibase.properties file.
2) Running the diff command to generate the file and adding the logicalFilePath="path-independent" property to its changelog tag, as we did before with our base changelog file.
3) Renaming the diff file to something like liquibase-changeLog-VERSION.xml to make it easily identifiable.
4) Adding the reference to the file using the <include> tag in the base changelog file.
After doing that you should end up with something like this:
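A sketch of the resulting base changelog (the file names follow the VERSION convention suggested above; adjust the paths to your layout):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<databaseChangeLog logicalFilePath="path-independent"
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog
                        http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-3.5.xsd">

    <!-- one changelog file per version, run in order -->
    <include file="liquibase/liquibase-changeLog-1.0.xml"/>
    <include file="liquibase/liquibase-changeLog-1.1.xml"/>
    <include file="liquibase/liquibase-changeLog-1.2.xml"/>
</databaseChangeLog>
```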
You can surely see how much we have improved the readability of the file with this simple change!
- As you saw in the rollback section, despite being quite flexible, liquibase is not an auto-magical solution and requires the developer to manually define rollback strategies for some of the generated changesets.
- In spite of working correctly most of the time (like 98% in my experience), developers should review the changelogs generated by liquibase to make sure the changes fully make sense given the changes in the model, because liquibase may define one or two extra changesets for particular operations. The renaming of columns is a good example: there is no way liquibase can figure out on its own that a renaming has occurred, so by default it will just generate a dropColumn + addColumn combination.
- The current setup was tested using Java 8. In order to work with higher versions of Java (Java 9+) you will most likely need to bump the versions of both the Maven and the liquibase-hibernate plugins, on top of adding some extra dependencies to your pom file.
Liquibase is a powerful tool that allows developers to have more granular control over the evolution of their application model’s schema. It provides useful and flexible features to ease the burden of downgrading and upgrading database schemas, especially in production environments.
Nevertheless, it takes some time to get used to, and it is probably best adopted on an already existing database, once the application starts handling real data and the options provided by auto-ddl become increasingly risky.
We have only scratched the surface here and had a look at the basics using a very simple example, but you will find that Liquibase provides much more than what we have seen here. Now that you understand the fundamentals behind this database version control tool, please feel free to investigate further at Liquibase.org
Further reading:
- Liquibase | Database Version Control: www.liquibase.org
- Liquibase Docs | Maven: the Liquibase Maven plug-in, available from the central Maven repository
- liquibase/liquibase-hibernate: the Liquibase Hibernate integration plugin, on GitHub
- Hibernate Tools, Chapter 5 “Ant Tools”: the hibernate-tools.jar, available from tools.hibernate.org
Would you like to know more? Do you need our help? Contact Us!