Continuous Delivery for Java Applications
If writing working code is a challenge, then writing working, deployable code is a far bigger one. If you’ve been in the software business for more than a few months, you’ve heard statements like “it works fine on my machine” or “it worked in {Environment A}, so I don’t know why it doesn’t work in {Environment B}.” As responsible engineers and developers, it’s our duty to deliver working, deployable code.
Delivering code that cannot be reliably and repeatably deployed to its target environments is not far from delivering no code at all, and it is certainly not an enterprise-grade solution. Understanding and controlling the environment an application is designed to run in is a core part of every application development effort; indeed, it is key to delivering reliable, repeatable deployments. Yet all too often we see deployment pushed to the end of the development cycle. Sometimes we even see it outsourced to teams that know nothing of the application the development team spent considerable time building. These are clearly anti-patterns.
The DevOps approach merges software development with systems administration, erasing the line between developer and deployer, which in turn enables development teams to move toward a Continuous Delivery approach to coding, deploying, and releasing software. One very important pattern that has emerged from DevOps and CD thinking has come to be known as the Dancing Skeleton pattern. As we apply it within our group, each new software project starts with the installation and configuration of Continuous Integration, Build, and Deployment tools. We prioritize this work, building a skeleton application that follows our target CI/CB/CD workflow. Moving the setup of the entire infrastructure to the beginning of each project’s lifecycle allows us to fully realize the Agile vision of moving each software feature from design to development to testing to deployment. Indeed, we fully automate unit testing, code quality analysis, code coverage analysis, integration testing, deployment, and even documentation generation as our first priority. Over the past year, we have selected tools that let us perform each of these functions and have standardized their usage through a combination of system- and process-level automation.
The Environment
For our current projects, most of our software development is done in Java, an established language with a huge base of open-source libraries. Of course, managing all those library dependencies can be a deployment nightmare, which is why we employ a dependency management system. We chose Maven, which we have found to be the Swiss Army Knife of Java build management. It is based on a plugin framework, so it is easily extensible; it has a massive contributor base that provides an incredible amount of functionality through simple configuration; and most popular software libraries have Maven packages readily available. For these reasons, it is the cornerstone of our Java-based continuous delivery environment.
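To make this concrete, here is a minimal sketch of how a dependency is declared in a project’s POM file. The library shown (Apache Commons Lang) is just an illustrative example; Maven fetches it, along with its own transitive dependencies, from a central repository at build time.

```xml
<!-- Illustrative only: declares one open-source library as a dependency. -->
<dependencies>
  <dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-lang3</artifactId>
    <version>3.1</version>
  </dependency>
</dependencies>
```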
Each developer on our teams runs the IDE of their choice on the OS of their choice. We require only that agreed-upon versions of Java and Maven be installed on the developers’ machines. Like all other projects at Medidata, we use Git as our source code repository, with our central repo hosted on GitHub.com. Each project starts from a basic Maven config file known as a POM file. This basic config references a centrally stored configuration (we call it the SuperPOM) that carries settings for code quality, unit testing, BDD testing, code coverage, code documentation, and more. Thus, when a new developer joins an existing project, or when we start a new one, the SuperPOM sets up a development environment that includes all of this for us.
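As an illustration, a new project’s POM can be little more than a pointer to the SuperPOM. The coordinates below (com.example:superpom and the project name) are hypothetical; the real SuperPOM lives in our internal repository.

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates: inherit the centrally stored SuperPOM,
       which carries the shared quality, testing, and coverage config. -->
  <parent>
    <groupId>com.example</groupId>
    <artifactId>superpom</artifactId>
    <version>1.0.0</version>
  </parent>

  <artifactId>my-new-service</artifactId>
  <version>0.1.0-SNAPSHOT</version>
</project>
```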
Using Maven, we have incorporated a number of tools that help us develop quality code. We use CheckStyle to analyze our code against a set of style criteria that we manage through our SuperPOM; these criteria include checks for proper headers and JavaDoc-style comments, proper formatting, Java coding best practices, and code quality metrics. We use JaCoCo to analyze and report on code coverage, which allows us to set minimum standards and fail builds when we don’t meet them. We use Maven plugins to run JUnit unit tests and Cucumber scenarios as well.
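For readers unfamiliar with how this looks in practice, here is a sketch of the kind of plugin configuration a SuperPOM can carry. The plugin versions, the checkstyle.xml location, and the 80% line-coverage minimum are illustrative, not our actual settings.

```xml
<build>
  <plugins>
    <!-- CheckStyle: fail the build on style violations. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-checkstyle-plugin</artifactId>
      <version>2.17</version>
      <configuration>
        <configLocation>checkstyle.xml</configLocation>
        <failOnViolation>true</failOnViolation>
      </configuration>
      <executions>
        <execution>
          <goals><goal>check</goal></goals>
        </execution>
      </executions>
    </plugin>
    <!-- JaCoCo: record coverage during the tests, then fail the build
         if line coverage falls below the chosen minimum. -->
    <plugin>
      <groupId>org.jacoco</groupId>
      <artifactId>jacoco-maven-plugin</artifactId>
      <version>0.7.9</version>
      <executions>
        <execution>
          <goals><goal>prepare-agent</goal></goals>
        </execution>
        <execution>
          <id>check-coverage</id>
          <goals><goal>check</goal></goals>
          <configuration>
            <rules>
              <rule>
                <element>BUNDLE</element>
                <limits>
                  <limit>
                    <counter>LINE</counter>
                    <value>COVEREDRATIO</value>
                    <minimum>0.80</minimum>
                  </limit>
                </limits>
              </rule>
            </rules>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

With something like this inherited from the SuperPOM, a plain `mvn verify` runs the tests, the style checks, and the coverage check in a single pass.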
Once code is checked into Git and pushed to GitHub, we use Jenkins to manage Continuous Build and Integration. Using Maven and our SuperPOM, Jenkins pulls down the code and executes the appropriate commands to build and deploy on the Jenkins server. If the app builds successfully, passes all tests, and meets all coverage and quality standards, Jenkins tags the commit in GitHub to indicate that it is a candidate for deployment to our AWS cloud environment. At Medidata, we use a highly customized version of the Capistrano remote server automation tool, which we call Medistrano, to deploy our code to AWS EC2 instances. Right now, the push to AWS EC2 using Medistrano is manual in the sense that someone needs to click the Execute Deployment button within the application, but we are working on a Jenkins extension that will do this for us, monitoring the outcome and reporting it back to us as well.
We also use Jenkins and Maven to realize another goal - Continuous Validation. Using the Maven Site Plugin, we are able to generate all the documentation required to support our formal validation process. With each successful build within our Validation environment, Maven generates a web site that houses all our documentation, and Jenkins copies that documentation up to our central Validation Portal. This gives us a completely DRY (Don’t Repeat Yourself) process - we write all our documentation to a designated directory in our GitHub repo, Maven runs all our tests and generates dynamic content for us, and Jenkins moves those documents to our Validation Portal.
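As a sketch of what that looks like in a POM (versions and report choices are illustrative, not our exact validation setup), the site plugin plus a couple of standard reporting plugins is enough to turn each build into a browsable documentation site:

```xml
<build>
  <plugins>
    <!-- Generates the project web site on each build. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-site-plugin</artifactId>
      <version>3.3</version>
    </plugin>
  </plugins>
</build>

<reporting>
  <plugins>
    <!-- Standard project information pages. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-project-info-reports-plugin</artifactId>
      <version>2.7</version>
    </plugin>
    <!-- HTML report of the unit test results. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-surefire-report-plugin</artifactId>
      <version>2.17</version>
    </plugin>
  </plugins>
</reporting>
```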
Finally, some of you who have used package management systems in the past may be saying, “Wait a minute - what do you do if a package dependency isn’t available from a central Maven repo?” Others may be saying, “We don’t want to rely on the Internet for the download and installation of critical package dependencies!” The last piece of the puzzle for us has been the setup of an internal Maven package management system; we selected Artifactory for this purpose. The free version of this tool handles all of our Maven needs, while the licensed version will allow us to extend support to .NET, Ruby, RPM, and Debian packages. We leverage this internal repo to close the remaining gaps in our automation process. With Artifactory, we are able to generate Maven packages for proprietary software (e.g. Microsoft SQL Server JDBC drivers) as well as internally developed utility libraries. By relying on Maven’s version management plugin and its integration with Artifactory, we are easily able to manage multiple production versions of shared resources. In addition, we leverage a Jenkins integration with Artifactory to keep development versions of our libraries in sync.
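To round out the picture, here is a sketch of how a project can point at an internal Artifactory instance, both for resolving dependencies and for publishing its own artifacts. The host name below is hypothetical; the repository names (libs-release, libs-release-local, libs-snapshot-local) follow Artifactory’s default naming conventions.

```xml
<!-- Resolve dependencies from the internal repo rather than the public Internet. -->
<repositories>
  <repository>
    <id>internal-releases</id>
    <url>https://artifactory.example.com/artifactory/libs-release</url>
  </repository>
</repositories>

<!-- Publish this project's own artifacts back to the internal repo. -->
<distributionManagement>
  <repository>
    <id>internal-releases</id>
    <url>https://artifactory.example.com/artifactory/libs-release-local</url>
  </repository>
  <snapshotRepository>
    <id>internal-snapshots</id>
    <url>https://artifactory.example.com/artifactory/libs-snapshot-local</url>
  </snapshotRepository>
</distributionManagement>
```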
The most amazing thing about this environment is that it has cost the company absolutely nothing in software license fees. Every tool we have employed to achieve this is completely free - as in freedom and as in free beer. It is a truly satisfying experience when, at the end of a feature deployment cycle, we simply hit the Execute button in Medistrano and watch the code deploy and the application start running. In fact, I’ll even buy the beer for that.