Downsizing to SSDs

System management can be a big deal. At Etsy, we DBAs have been feeling the pain of getting spread too thin. You get a nice glibc vulnerability and have to patch and reboot hundreds of servers. There go your plans for the week.

We decided last year to embark on a 2016 mission to get better performance, easier management and reduced power utilization by reducing the server count of the farm that holds our user-generated, sharded data.

Source of Truth or Source of Madness?

This year at Etsy, we spun up a “Database Working Group” that talks about all things data. It’s made up of members from many teams: DBA, core development, development tools and data engineering (Hadoop/Vertica). At our last two meetings, we started talking about how many “sources of information” we have in our environment. I hesitate to call them “sources of truth” because in many cases we only report information to them rather than act on the data they hold. We spent a session whiteboarding all of these sources and drawing the relationships between them. It was a bit overwhelming to actually visualize the madness.

KeyError: ‘/dev/sda’

At Etsy, we have a nice, clean, streamlined build process. We have a command for setting up RAID, and another for OS installation. OS installation comes with automagic for LDAP, Chef roles, etc.

We came across an odd scenario today when a co-worker was building a box that gave the following error:

Traceback (most recent call first):
  File "/usr/lib/anaconda/storage/", line 1066, in allocatePartitions
    disklabel = disklabels[_disk.path]
  File "/usr/lib/anaconda/storage/", line 977, in doPartitioning
    allocatePartitions(storage, disks, partitions, free)
  File "/usr/lib/anaconda/storage/", line 274, in doAutoPartition
  File "/usr/lib/anaconda/", line 210, in moveStep
    rc = stepFunc(self.anaconda)
  File "/usr/lib/anaconda/", line 126, in gotoNext
  File "/usr/lib/anaconda/", line 233, in currentStep
  File "/usr/lib/anaconda/", line 602, in run
    (step, instance) = anaconda.dispatch.currentStep()
  File "/usr/bin/anaconda", line 1131, in <module>
KeyError: '/dev/sda'

It suggests a problem with setting up partitions on /dev/sda, where we would put the boot partition. I knew it seemed familiar but I couldn’t recall the solution, and Google, while usually wonderful, got us to a Red Hat Support article behind a paywall. A few other results suggested the boot order was incorrect, meaning the OS thought the drives were out of order. Being a Dell box, I checked the virtual drive order, which in my experience has always matched the boot order:

[Screenshot: virtual drive order]

After the anaconda failure, I switched to another terminal, got a prompt and checked /proc/partitions. Sure enough, the disks started at sdb, not sda. Then it hit me. There were 4 people viewing the console in iDRAC, so what if someone else had mounted a virtual disk and that was /dev/sda? Sure enough:

[Screenshot: virtual media session in iDRAC]

After deleting the virtual media session, rebooting and starting the OS install again, everything worked fine.

The bonus humor here is that this isn’t the first time we’ve run into this. Hopefully after posting this, Google will index this page and point us to the answer a bit quicker next time.
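
For next time, the /proc/partitions check above can be scripted. What follows is only a rough sketch, not part of our build tooling: the function names are made up for illustration, and the sample text is invented to resemble a box whose disks start at sdb. It assumes the usual four-column /proc/partitions layout (major, minor, #blocks, name).

```python
def parse_partitions(text):
    """Return device names from /proc/partitions-style text."""
    names = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have four columns and start with the numeric major number;
        # the header row and blank lines are skipped.
        if len(fields) == 4 and fields[0].isdigit():
            names.append(fields[3])
    return names


def whole_disks(names):
    """Keep only whole-disk sdX names, dropping partitions such as sdb1."""
    return sorted(n for n in names if n.startswith("sd") and n[2:].isalpha())


def first_disk_missing(names, expected="sda"):
    """True when the disk we expect to hold the boot partition is absent."""
    return expected not in whole_disks(names)


# Invented sample: the listing starts at sdb, as it did on the failing box.
SAMPLE = """major minor  #blocks  name

   8       16  292968750 sdb
   8       17     512000 sdb1
   8       32  292968750 sdc
"""

if __name__ == "__main__":
    names = parse_partitions(SAMPLE)
    print("whole disks:", whole_disks(names))
    if first_disk_missing(names):
        print("sda is missing; check for a stray virtual media session")
```

On a real host you would pass in the contents of /proc/partitions instead of SAMPLE.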

Operationalizing TokuDB

In my previous post, I talked about implementing multi-threaded replication (MTR) using Percona Server 5.6. The server pairs that are utilizing MTR are also exclusively using the TokuDB storage engine.

I find TokuDB to be a fascinating engine. I can tell I will need to re-watch our DBHangOps session where Tim Callaghan talked about the differences between B-Tree and Fractal Tree indexes. There’s also a session on how compression works in TokuDB, and they continue to innovate with read-free replication.

As with all new technology, there is a learning curve to understanding a new component or system. I thought it appropriate to try to document my experiences operationalizing TokuDB in our environment. This is nowhere near comprehensive, as I just don’t have enough experience with it yet to know the deeper intricacies of the engine.

XFS and EXT4 Testing Redux

In my concluded testing post, I declared EXT4 my winner vs XFS for my scenario. My coworker, @keyurdg, was unwilling to let XFS lose out and made a few observations:

  • XFS wasn’t *really* being formatted optimally for the RAID stripe size
  • XFS wasn’t being mounted with the inode64 option, which means that all of the inodes are kept in the first 2TB. (Side note: the inode64 option is the default in newer kernels but not on CentOS 6’s 2.6.32.)
  • Single-threaded testing isn’t entirely accurate because, although replication is single threaded, InnoDB collects the writes and then flushes them to disk using multiple threads governed by innodb_write_io_threads.
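
To make the first two observations concrete, here is a hedged sketch of deriving mkfs.xfs stripe options from the RAID geometry and mounting with inode64. The helper names, device path and geometry are assumptions for illustration, not the values used in the tests. It also assumes the controller's "stripe size" is the per-disk stripe unit, which maps to su, while sw is the number of data-bearing disks.

```python
def data_disks(raid_level, total_disks):
    """Number of disks that carry data (not parity or mirror copies)."""
    if raid_level == 0:
        count = total_disks
    elif raid_level == 5:
        count = total_disks - 1
    elif raid_level == 6:
        count = total_disks - 2
    elif raid_level == 10:
        if total_disks % 2:
            raise ValueError("RAID 10 needs an even number of disks")
        count = total_disks // 2
    else:
        raise ValueError("unsupported RAID level: %r" % raid_level)
    if count < 1:
        raise ValueError("not enough disks for this RAID level")
    return count


def mkfs_xfs_command(device, stripe_kib, raid_level, total_disks):
    """Build an mkfs.xfs argument list with su (stripe unit) and sw (stripe width)."""
    sw = data_disks(raid_level, total_disks)
    return ["mkfs.xfs", "-d", "su=%dk,sw=%d" % (stripe_kib, sw), device]


def mount_command(device, mountpoint):
    """Build a mount argument list that sets inode64 explicitly."""
    return ["mount", "-o", "inode64", device, mountpoint]


if __name__ == "__main__":
    # Hypothetical geometry: 8-disk RAID 10 with a 64 KiB stripe unit.
    print(" ".join(mkfs_xfs_command("/dev/sdb1", 64, 10, 8)))
    print(" ".join(mount_command("/dev/sdb1", "/var/lib/mysql")))
```

The sketch only builds the argument lists and prints them, so nothing gets formatted by accident.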

Armed with new data, I have – for real – the last round of testing.
