Best DevOps Practices: Axial

Hi all! We're continuing our Best Practices interviews with companies that are known for their innovative approaches in "DevOps." The second short interview is with Matt Story, Director of Engineering at Axial.

"Devops" is a term loosely thrown around these days. What does it mean to you?

DevOps is a holistic approach to systems engineering and operations that emphasizes programmatic solutions for provisioning, deployment, configuration management and testing rather than doing them by hand.

What's the most important thing an engineering organization can do to cultivate 'systems thinking' in its engineers? 

We try to educate every engineer on the strengths and weaknesses of their underlying operating system and hardware. To make sure we're hiring engineers with a systems focus, we always ask systems questions of candidates, and place a high premium on a solid understanding of UNIX systems for every role.

One other thing we've found key is to require each engineer to package their own software before it's "done". This means that the automated build/deploy is everyone's responsibility, and means you can never "throw it over the fence" to someone in ops.

Do you primarily run your apps/services in or out of the cloud? Why?

We are 100% hosted on AWS. We've chosen to stay there because it lets us run leaner: nobody at Axial wastes time dealing with hardware or ISP vendors, and nobody has to spend a weekend putting drives on sleds, putting sleds into 2U boxes, or putting 2U boxes into the rack. All the time we would have spent doing that, we spend automating our build/deploy systems and supporting engineers.

Where have you introduced automation into your systems engineering process? What were the results?

We've automated our software build and deploy, and our provisioning. After completing this project, engineers went from needing 1-3 days to get their box back up and running after destroying it to needing 5-10 minutes. Engineers are more productive and can recover easily from mistakes, which lets them move faster and experiment freely in their own environments.

How does your team/organization approach automated testing?

This is a work in progress. We're just starting to build a functional test suite using Selenium, and we've built a few unit tests for critical portions of our platform. We're definitely not there yet, but we're starting to invest in getting there.

What do you think automation is especially *not* suited for?

Acceptance testing. At some point you need a human to look at a feature before it goes out, poke at it, and determine that it makes sense and is ready to be released to the wild.

What are your perspectives on continuous integration tools? Are you using any?

As our testing becomes more robust, we're planning to move from daily releases to continuous ones. We're not quite ready to start evaluating tools yet.

What profile of individual makes the best systems engineer? (assuming automation is key)

Great systems engineers are extremely hard to find. Systems engineering is part distributed programming, part engineer support. To be great at automated deployment, configuration and dependency management, you need a keen understanding of the hardware/virtual machines and the OS, plus a set of heuristics that help you avoid the race conditions inherent to distributed systems. To be great at supporting engineers, you need to be patient, but not a push-over. Finding a great programmer with the requisite systems knowledge who is also patient, plays well with engineers, and knows how to say no gracefully is a big challenge.

What tools/training do you suggest a more traditional systems administrator acquire in order to keep his/her skills relevant (assuming systems automation will increasingly become the norm)?

It used to be that if you knew section 1 of the UNIX manual and had a strong understanding of the TCP/IP model, you were qualified to be a sysadmin. As the tools for going fully automated become increasingly robust, however, mastery of a general-purpose scripting language like Python or Ruby is becoming key. In addition to learning one of these, sysadmins should be familiarizing themselves with an automation framework (pick one of Chef, Puppet, Ansible or CFEngine).

Which engineering teams/individuals do you watch for cutting edge tips, tools, and architectures when it comes to systems automation?

Etsy and Netflix always seem to be doing great work on automation. Etsy's Deployinator and Netflix's Chaos Monkey were a breath of fresh air and a shot across the bow of traditional systems engineering when they were released, and both companies are still doing interesting work today.