Category Archives: tuning

XFS and EXT4 Testing Concluded

I had a few more suggestions thrown out at me before I could wrap this one up.

  • Try disabling the RAID controller read-ahead
  • Try a few custom options to XFS
  • Try RAID-10

First, my final “best” state benchmarks for comparison:

FS    RAID  Size  Mount Options                 Transfer/s    Requests/s  Avg/Request  95%/Request
xfs   6     4T    noatime,nodiratime,nobarrier  28.597Mb/sec  1830.24     0.51ms       2.06ms
ext4  6     4T    noatime,nodiratime,nobarrier  32.583Mb/sec  2085.33     0.46ms       1.89ms
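
For reference, those mount options are applied at mount time along these lines (a minimal sketch: /data is a hypothetical mount point, /dev/sdb2 is the device from the mkfs command further down, and the same option string works for both XFS and EXT4):

    # Mount the benchmark file system with the options from the table above
    mount -o noatime,nodiratime,nobarrier /dev/sdb2 /data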

Disabling the read-ahead was an interesting thought.
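
Exactly how depends on the RAID controller; as a sketch, assuming an LSI MegaRAID managed with MegaCli, disabling it for every logical drive looks roughly like this:

    # Show the current cache/read-ahead policy for every logical drive
    MegaCli -LDGetProp -Cache -LAll -aAll

    # Disable read-ahead on all logical drives on all adapters
    MegaCli -LDSetProp NORA -LAll -aAll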

FS    RAID  Size  Mount Options                 Transfer/s    Requests/s  Avg/Request  95%/Request
xfs   6     4T    noatime,nodiratime,nobarrier  28.704Mb/sec  1837.07     0.50ms       2.04ms
ext4  6     4T    noatime,nodiratime,nobarrier  32.715Mb/sec  2093.75     0.46ms       1.88ms

It didn’t seem to make any real difference, however.


The second suggestion was to build the XFS file system with custom geometry options (mkfs.xfs -f -d sunit=128,swidth=$((512*8)),agcount=32 /dev/sdb2), which align the file system to the RAID stripe and fix the allocation group count at 32.
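
As a quick sanity check after a mkfs like that, the geometry the file system actually picked up can be read back with xfs_info once it is mounted (/data is a hypothetical mount point):

    # Reports bsize, agcount and the sunit/swidth values (in blocks)
    xfs_info /data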

FS    RAID  Size  Mount Options                 Transfer/s    Requests/s  Avg/Request  95%/Request
xfs   6     4T    noatime,nodiratime,nobarrier  26.376Mb/sec  1688.07     0.55ms       2.18ms

It’s hard to tell, but it seems these actually degraded performance.


The last test was to switch to RAID-10. This would reduce overall storage capacity to 72TB, but given our requirements, this really shouldn’t cause any problem for the project. RAID-10 should give a significant boost to write performance.

FS    RAID  Size  Mount Options                 Transfer/s    Requests/s  Avg/Request  95%/Request
xfs   10    36T   noatime,nodiratime,nobarrier  32.808Mb/sec  2099.72     0.46ms       1.80ms
ext4  10    36T   noatime,nodiratime,nobarrier  54.112Mb/sec  3463.17     0.28ms       1.11ms

These numbers back up the improvement to write speed, but XFS still lags behind at larger volume sizes.

Since I had to reconfigure the array anyway, I wanted to try the larger volume size (36T) above and then a smaller size (2.2T) to try to reproduce my earlier results showing XFS performing better at smaller volume sizes.

FS    RAID  Size  Mount Options                 Transfer/s    Requests/s  Avg/Request  95%/Request
xfs   10    2.2T  noatime,nodiratime,nobarrier  60.066Mb/sec  3844.2      0.25ms       1.00ms
ext4  10    2.2T  noatime,nodiratime,nobarrier  64.766Mb/sec  4145.01     0.23ms       0.90ms

These were by far the best test results I had seen, roughly double the results from the original async test.


Testing conclusions

  • XFS seems to be very sensitive to partition size
  • In all but one case, EXT4 performed better on the random read-write tests
  • Know the other caveats of both file systems before picking the one that is right for you

More EXT4 vs XFS IO Testing

Following my previous post, I got some excellent feedback in the form of comments, tweets and other chat. In no particular order:

  • Commenter Tibi noted that mounting with noatime, nodiratime and nobarrier should all improve performance.
  • Commenter benbradley pointed out a missing flag on some of my sysbench tests which will necessitate re-testing.
  • Former co-worker @preston4tw suggests looking at different IO schedulers. For all past tests I used deadline, which seems to be best, but re-testing with noop could be useful (see the example just after this list).
  • Fellow DBA @kormoc encouraged me to try many smaller partitions to limit the number of concurrent fsyncs.
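
On the scheduler point, checking and switching it on the fly is just a poke at sysfs (sdb here is an example device name):

    # Show the available schedulers; the active one appears in brackets
    cat /sys/block/sdb/queue/scheduler

    # Switch this device to noop for the next test run
    echo noop > /sys/block/sdb/queue/scheduler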

There seem to be plenty of options here that should allow me to re-run my testing with a slightly more consistent method. The one consistent result so far is the difference between the file systems, EXT4 vs XFS, with XFS performing at about half the speed of EXT4.
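
For anyone following along, the random read-write numbers in these posts come from sysbench’s fileio test; a sketch using the classic sysbench 0.4/0.5 syntax looks something like this (the file size, runtime and thread count are illustrative, not the exact values from my runs):

    # Create the test files once per file system under test
    sysbench --test=fileio --file-total-size=100G prepare

    # Random read-write workload for five minutes
    sysbench --test=fileio --file-total-size=100G --file-test-mode=rndrw \
             --num-threads=16 --max-time=300 --max-requests=0 run

    # Remove the test files afterwards
    sysbench --test=fileio --file-total-size=100G cleanup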


Bloating Your Buffers

There’s one thing that has always bugged me when I review configuration files. People always seem to want to set certain buffer settings to ridiculously high values without really understanding what they’re doing.

Case in point: sort_buffer_size.

The misconception seems to be that setting this value to a high number will always improve a server’s performance, because everything works better with more memory, right? For some variables (key_buffer_size, innodb_buffer_pool_size), maybe, but the entire sort buffer is allocated whenever ORDER BY is used. In practice, I find it best to leave it at the default and watch the Sort_merge_passes status counter to determine whether you need to adjust this value. Also, adjust it in small amounts.
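
As a rough illustration of that workflow (the 4M value below is a placeholder, not a recommendation):

    # Check how often sorts have spilled to disk and needed extra merge passes
    mysql -e "SHOW GLOBAL STATUS LIKE 'Sort_merge_passes'"

    # If the counter keeps climbing, raise the buffer a little and re-check;
    # note that SET GLOBAL only affects sessions opened after the change
    mysql -e "SET GLOBAL sort_buffer_size = 4 * 1024 * 1024"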