Site Reliability Engineer

Tumblr New York, NY (relocation offered)

Job Description

Tumblr's network is growing rapidly. Join us in managing our dozen routers, 4000+ switch ports, and /20 and /44 netblocks.

What You'll Do:

  • Manage the availability, scalability and performance of Tumblr platforms
  • Create the tools and infrastructure leveraged by the rest of the Tumblr engineering teams
  • Diagnose and repair network, application, and hardware bottlenecks
  • Test and tune network, hardware, and software configurations to maximize performance
  • Deploy and manage monitoring and diagnostic tools
  • Guide our product and platform teams to keep new features fast and stable

Skills & Requirements

What We’re Looking For:

  • Experience scaling high-traffic web sites
  • Experience with Unix systems administration, including solid scripting skills in Ruby, PHP or Python
  • Expertise in data structures and algorithms
  • Expertise in troubleshooting large-scale distributed systems
  • Smarts, humility, and equal willingness to learn and teach
  • A sense of ownership, initiative, and drive

Tools We Like:

  • Nginx, Varnish and HAProxy
  • Memcached and Redis
  • MySQL (InnoDB)
  • Puppet
  • PHP5, pushed to its limits
  • git and GitHub
  • Ruby, Scala and PHP
  • Asynchronous services and queues
  • Hadoop, Pig, ZooKeeper, and other Java/JVM projects
  • Nagios/Icinga, OpenTSDB

About Tumblr

Tumblr is home to the most creative people in the world, and we’re looking for equally inventive people to join our team in New York City. Get ready to define the next generation of creative tools, and make beautiful ways for our community to share, discover, and connect with the stuff they love. We work in an open, positive, and collaborative environment at our headquarters near Union Square. Founded in 2007, Tumblr has more than 167 million blogs and 74 billion posts made by passionate creators from all over the world.

Joel Test score: 9 out of 12

The Joel Test is a twelve-question measure of the quality of a software team.

  • Do you use source control?
  • Can you make a build in one step?
  • Do you make daily builds?
  • Do you have a bug database?
  • Do you fix bugs before writing new code?
  • Do you have an up-to-date schedule?
  • Do you have a spec?
  • Do programmers have quiet working conditions?
  • Do you use the best tools money can buy?
  • Do you have testers?
  • Do new candidates write code during their interview?
  • Do you do hallway usability testing?

How to apply

If interested (and qualified), please apply here:


