Operationalizing TokuDB

In my previous post, I talked about implementing multi-threaded replication (MTR) using Percona Server 5.6. The server pairs that are utilizing MTR are also exclusively using the TokuDB storage engine.

I find TokuDB to be a fascinating engine. I can tell I will need to re-watch our Dbhangops session where Tim Callaghan talked about the differences between B-Tree and Fractal Tree indexes. There’s also a session on how compression works in TokuDB and they continue to innovate with read-free replication.

As with all new technology, there is a learning curve to understanding a new component or system. I thought it appropriate to try to document my experiences on operationalizing TokuDB into our environment. This is no where near comprehensive as I just don’t have enough experience with it yet to know the deeper intricacies of the engine.


Out of the box, TokuDB’s server variables have very sane defaults and I left things this way on purpose to try to get a feel for a “defaults baseline”.


After a week of using the engine, we started seeing “low available memory” alerts. A bit of digging led me to Shlomi Noach’s blog post where he tried TokuDB, As he mentions, by default, TokuDB relies on the OS file cache for compressed data and for uncompressed, a configured memory limit (tokudb_cache_size) that defaults to 50%. Having 50% to TokuDB + our default 80% innodb_buffer_pool_size made it pretty obvious that we were just overallocated. There is a configuration option to switch from buffered to direct IO, in which you could practice a more standard InnoDB configuration of setting buffer pool to 80% of installed RAM. For our case, I adjusted our InnoDB memory requirements down to about 5%, left the default 50% for uncompressed and everything else for buffered reads from the OS.


At some point, I might consider changing tokudb_data_dir to something other than the MySQL data directory only because I have OCD issues with how TokuDB stores its files on the file system. In a typical InnoDB file-per-table setup, you would have something like /var/lib/mysql/my_db/my_table.ibd (and my_table.frm) file. For TokuDB, the frm file is still present, but you have /var/lib/mysql/_my_db_my_table_$type_$hash.tokudb where $type is main, status or key_$indexname. This is purely my own OCD, but I would have preferred these files live inside of each database directory and not at the top level of the data directory. I also see I wasn‘t the only one with this issue.


Working in the MySQL world for so long, I’ve seen my share of “backup” methods: rsync, cold tar, DRBD, mysqldump and LVM snapshots. Most recently, I’ve had the pleasure of working with Percona Xtrabackup, which is what Etsy uses to facilitate online backups and it makes the process of taking and restoring backups extremely simple. As TokuDB is a different engine, the method that Xtrabackup uses to follow changed InnoDB pages simply doesn’t work.

The TokuDB FAQ entry on backups covers the possibilities here rather nicely. I opted to use mylvmbackup to take consistent LVM snapshots and send them over to our backup server directly. I wrote a quick patch to mylvmbackup that allows tar+compression to go over SSH instead of relying on local backup+rsync or rsync directly. Rsync directly copies all the files individually and while this is pretty convenient, it feels cleaner to me to have all the files together in one file.


On the surface, monitoring TokuDB is still basically monitoring MySQL. We still want to ensure the service is up and responding, replication is connected and caught up and make sure that Threads_connected is not getting too close to max_connections. We have some additional sanity checking we do on all MySQL hosts like keeping an eye out for duplicate indexes and verifying that schemas and grants are in sync with what we expect them to be.

We also have several performance indicators that we use for InnoDB, such has history list length, buffer pool hit rate, buffer pool waits and log waits. We use Ganglia for collecting several metrics about MySQL and then use a Nagios plugin to look at these values for alerting purposes. The metrics are collected using a slightly modified version of the Ganglia mysql collector which (among other things) does what we’ve all had to do at some point: write some script to parse SHOW ENGINE INNODB STATUS.


Thankfully, TokuDB is way more sane here (although yes, several of the key values from SEIS are available on global status now). SHOW ENGINE TOKUDB STATUS presents clean output, similar to SHOW GLOBAL STATUS, with Name/Status pairs that can easily be pulled into Ganglia. There’s also a clean way they are presented, in that many of them are prefixed with an identifier, such as checkpoint, cachetable, ft (fractal tree), memory, filesystem, etc. This makes it very easy to visually identify similar statuses. Without a better understanding of the engine, I didn’t find anything identifying any important values from this list that might help me indicate trouble. Tracking some of the success vs fail values might provide some benefit but I think it’s too early to tell.


The I_S database also contains a few tables. The TokuDB_file_map contains all the files used for each database/table. There are also tables to help with lock visualization in the txn, locks/lock_waits tables. On a high level review, I had a hard time finding anything that my workload depended upon me to monitor.


The global status counters contain a lot of internal information that might help under specific workloads, but nothing that was immediately apparent.


While not a subject of automated monitoring, TokuDB adds an insanely nice touch to SHOW PROCESSLIST whereby the State column actually gives you estimated row counts affected by statements such as UPDATE. I’m not the only one who loves this idea.

Backup Monitoring

Having only used Percona Xtrabackup in our environment, adding a new backup method in LVM necessitated a bit of additional monitoring. We added the Nagios lvm-snapshot plugin from the Percona Monitoring Plugins to ensure we don’t have any stale snapshot volumes laying around, or that they don’t become too full during backups. One of the last steps for mylvmbackup is to print LVM usage during the backup process. I think this would be a good value to shove into a time series store like Graphite so we could keep an eye on how we are trending but rely on monitoring to make sure it doesn’t impact our ability to take clean backups.


I’m very excited about what TokuDB can bring about for us. As one who finds themselves disk space constrained more than IOPS, the compression along will be a big benefit. I’m looking forward to being able to dive a bit deeper on the monitoring aspect as we start to push it harder.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s