Rand has talked about the technical debt that is impacting our ability to grow and deliver new products. We knew we’d have to bite that bullet at some point, but sometimes it’s not a clean bite…you’ve got to gnaw away at it until you finally break through.
To that end, we created an 18-month roadmap to pay back that technical debt, and have worked out the stepping stones needed for each team to chip away at that proverbial bullet. It’s going to take a lot of hard work and some of our funding to help get us there, with the ultimate goals of giving you, our customers, greater value, enabling further growth, and getting to 99.9% uptime. We’ll update you as we take each step along the way. But for now, take a look at the roadmap as we see it.
Get to 99.9% Uptime
The first step on the road to success is upgrading system operations. We’re focusing our efforts here on hardening our network infrastructure and increasing system redundancy and monitoring, with the following key goals:
- Better and redundant equipment: We’re implementing the network at our own colocation facility in a way that allows us to grow and is less vulnerable to equipment failures. We are also moving off hosted servers, load balancers, and switches in favor of our own equipment. The new equipment is much higher quality, and will be duplicated here in Seattle and at our colocation site in Herndon, Virginia.
- Rigorous monitoring: I love that we have enthusiastic customers willing to tweet when one of our systems is down, but that is not the normal way to monitor systems! Our system administrators are implementing monitoring not only on our servers, but also on the jobs, queues, and a plethora of other things that keep our service running. Increased monitoring will help us catch problems before the servers go down, and hopefully head off problems like the latest rankings outage before they affect our customers.
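To put the 99.9% target in perspective, here is a rough sketch of the downtime it allows. The function name and the period lengths are illustrative assumptions, not part of our actual tooling; the arithmetic is simply the unavailable fraction multiplied by the length of the period.

```python
def downtime_budget_minutes(availability: float, period_days: float) -> float:
    """Return the minutes of downtime allowed in a period at a given availability.

    `availability` is a fraction (0.999 for 99.9%). The name and signature
    are hypothetical, chosen only for this illustration.
    """
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    minutes_in_period = period_days * 24 * 60
    return (1.0 - availability) * minutes_in_period


if __name__ == "__main__":
    # Period lengths are assumptions: a 30-day month and a 365-day year.
    for label, days in [("day", 1), ("30-day month", 30), ("365-day year", 365)]:
        budget = downtime_budget_minutes(0.999, days)
        print(f"99.9% uptime allows about {budget:.1f} minutes of downtime per {label}")
```

Under these assumptions the budget works out to roughly 43 minutes per 30-day month, which is why catching problems before the servers go down matters so much.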
The Tech Ops Team
Mark | David | Stephen | Jacob | Nicholas
Fay | Dave K | New System/ | New DBA
The Tech Ops Stepping Stones
Deliver Our Largest, Freshest, Most Reliable Index
In parallel to this systems work, we are also working on our applications reliability and scalability. The Big Data team’s work includes:
- More reliable data processing: We’re moving our processing out of the cloud and onto our own hardware.
- Fix things right: We now have the luxury of time and a little cash in the bank to do things right. We’re not going to cobble together a hack that will get us over the hump today, but will come back to bite us tomorrow.
- Improve the index: Our goal is to triple our index size, getting back to our May 2012 index size, and to release more frequently with fresher data, with the ultimate goal of creating an index every 7-10 working days.
The Big Data Team
The Big Data Stepping Stones
Make Everything Bullet-proof
The Production Engineering Team (PE) is knee-deep in the bowels of the production systems: reviewing code, suggesting where new or more hardware could be used, and making things more maintainable and bullet-proof in general. PE has already implemented code changes to our core systems over the last few weeks to address some of the current sticking points. Some of the things this team is working on:
- New servers: We’re in the process of standing up over 200 new servers.
- Reducing complexity: We’re reducing the number of database and queuing system types we run on. We’re picking systems that we can either support ourselves or that have dependable support, to help us reach our goal of 99.9% uptime. Between data storage/retrieval and queuing, we have 7 (that I know of) different types of systems. We aim to get down to one queuing system and two or three different database types.
For more information on these recent fixes, check out the blog post Where are My Rankings?
The Production Engineering Team
The Production Engineering Stepping Stones
Net New Development
The Net New Development Team is working on implementing new product features. Shhhhh!
The Net New Development Team
Net New Development Stepping Stones
Rock the Marketing Website
Inbound Engineering is the team focused on the Marketing website. The team goals are:
- Create new services: Create the Common Email service, the new Moz Authorization service, and the front end for Q&A.
- Upgrade billing: Upgrade our billing infrastructure for more reliable payment processing.
- Upgrade the website: Build additional functionality into the marketing website.
Inbound Engineering Team
Casey | Dudley | Devin | New PHP | New PHP | New PHP
Inbound Stepping Stones
Make Tweets Sing
The Followerwonk team is working on advancing the customer experience and digging deeper into Twitter and what makes Tweets sing. We’re going to use split testing against specific goals to measure customer experience, which will help us decide on the designs and features that our customers like best.
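As a rough illustration of how a split test can be judged against a goal, the sketch below compares the goal-completion rates of two design variants with a two-proportion z-test. This is one common approach, not necessarily the method the team will use; the function name and all visitor counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare goal-completion rates of variants A and B.

    Returns (z, p_value) for a two-sided two-proportion z-test.
    All names here are hypothetical and exist only for this sketch.
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("each variant needs at least one visitor")
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both variants perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("cannot test when no one or everyone completed the goal")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up numbers: 1,000 visitors per variant, goal = completing a report.
    z, p = two_proportion_z_test(conv_a=100, n_a=1000, conv_b=150, n_b=1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between variants is unlikely to be chance alone, which is the kind of evidence that can back a design decision.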
Followerwonk Team
Peter | Galen | Marc | Amy
Followerwonk Stepping Stones
Test and Document
In lockstep with these teams, our test and doc folks are adding testing and documentation that will improve quality and communication across the company. These teams are still small, but are already having a big impact. We have already seen an improvement: testing helped our last index release go out with no issues.
Test and Docs Team
Docs Roadmap
Test Roadmap
Sharing Our Success
As we take each step along our technical roadmap we will share our accomplishments, turning these planned stepping stones green over the next 18 months. As we gnaw away at our technical debt, we hope you’ll start seeing benefits from the changes along the way. Stay tuned!