Wednesday, August 26, 2009

file io tracing on centos ( part 1)

I'm now working in a CentOS Linux environment, and since my Solaris days are over, I'm feeling a bit handicapped by not having DTrace and iosnoop. It still amazes me that monitoring MyISAM table data and index IO is not more common, especially in heavy duty environments.

My search for an analogous tracing system on Linux led me to strace. I have been playing with it a little, and although I am disappointed overall, I have been able to figure out how to gather what I need. It isn't pretty.

So far, my favored invocation of strace on an active mysqld process is (PID stands for the process id of mysqld):

strace -q -tt -e trace=open,close,read,write,pread64,pwrite64 -s 256 -f -v -p PID 2>&1

strace by default directs its output to STDERR, unless you specify a file ( -o ). I wanted the output to go to STDOUT so I can pipe it into a script that will roll up the output and report by the minute. My goal is to print output something like:
(blogger doesn't usually honor my formatting..not sure how this will look)

time   file                            MbReadMin  ReadMin  MbWriteMin  WriteMin
=====  ==============================  =========  =======  ==========  ========
13:04  /mnt/mysql1/customers.MYI              26      134           0         0
13:04  /mnt/mysql1/sales_summary.MYD           0        0          15       124



This involves a bit of parsing of the output, depending on the syscall. The open() syscall gives the actual file name and the "id" (the file descriptor it returns), which can be cross-referenced with the first argument of the pread64() and pwrite64() calls, whose return value gives me the size.

The problem is that you can only get the file name if the table is opened after you invoke strace; for a table that was already open, all you see is the file descriptor. And beware, this is a CPU hog.
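
Here is a rough sketch of the kind of roll-up script I have in mind, in Python. It is not a finished tool, and it makes some assumptions: the input lines look like what strace prints with -tt and -f (an optional "[pid N]" prefix, then the timestamp, then the call), calls that strace splits into "unfinished" and "resumed" halves are simply skipped, and the function names rollup and format_report are made up for this example. Reads and writes on a descriptor that was opened before the trace started are reported under the descriptor number, since the file name is unknown.

```python
import re
from collections import defaultdict

MB = 1024 * 1024

# Example line this is meant to match (assumed strace -tt -f layout):
# [pid  4321] 13:04:02.000001 pread64(45, "..."..., 1024, 0) = 1024
LINE_RE = re.compile(
    r'^(?:\[pid\s+\d+\]\s+)?'               # optional thread prefix from -f
    r'(?P<minute>\d\d:\d\d):\d\d\.\d+\s+'   # -tt timestamp, keep only HH:MM
    r'(?P<call>open|close|read|write|pread64|pwrite64)\('
    r'(?P<args>.*)\)\s+=\s+(?P<ret>-?\d+)'  # arguments and return value
)


def rollup(lines):
    """Return {(minute, file): [read_bytes, reads, write_bytes, writes]}."""
    fd_to_path = {}  # descriptors are shared by all threads of one mysqld
    stats = defaultdict(lambda: [0, 0, 0, 0])
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # unfinished/resumed halves and anything unexpected
        call, args, ret = m['call'], m['args'], int(m['ret'])
        if ret < 0:
            continue  # failed syscall
        try:
            if call == 'open':
                # first quoted argument is the path, return value is the fd
                fd_to_path[ret] = args.split('"')[1]
                continue
            fd = int(args.split(',', 1)[0])
        except (IndexError, ValueError):
            continue
        if call == 'close':
            fd_to_path.pop(fd, None)
            continue
        path = fd_to_path.get(fd, f"fd {fd} (opened before trace)")
        row = stats[(m['minute'], path)]
        if call in ('read', 'pread64'):
            row[0] += ret
            row[1] += 1
        else:
            row[2] += ret
            row[3] += 1
    return dict(stats)


def format_report(stats):
    """Turn the roll-up into printable rows, one per minute and file."""
    rows = ["time  file  MbReadMin  ReadMin  MbWriteMin  WriteMin"]
    for (minute, path), (rb, rc, wb, wc) in sorted(stats.items()):
        rows.append(f"{minute}  {path}  {rb / MB:.1f}  {rc}  {wb / MB:.1f}  {wc}")
    return rows
```

To use it, the strace output would be piped into a small wrapper that passes sys.stdin to rollup() and prints the rows from format_report(). As written it only reports once the input ends; printing each minute as it completes is left out to keep the sketch short.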

Saturday, August 15, 2009

parting shots

This was my last week working at Wiland Direct and as a parting shot, I really wanted to get this project out of the way. The adventures started on Saturday with the rebuilding of a data warehouse server. The goal was to convert the existing ZFS file system to UFS, and standardize the disk layouts of the MyISAM tables to a new scheme that was more friendly to our day to day processes. This dw server, one of 5, was an experiment with ZFS. My beef with ZFS was that I could not trace table level data and index IO using DTrace ( http://opensolaris.org/jive/thread.jspa?threadID=70175 ).



The game plan was pretty simple:

  • Halt Processing
  • Backup
  • Validate backup
  • Shut down MySQL
  • Capture "before" catalog of all database files. ( my backup system does this).
  • Remove existing file systems and database files.
  • Create new file systems.
  • Touch and symbolic link "empty" database files to the default data directory.
  • Restore ( cp -p honors symbolic links)
  • Validate restore.
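
The "touch and symbolic link" and restore steps can be sketched roughly like this in Python. This is an illustration only, not the script we actually ran: the function names, the layout dictionary, and every path in it are hypothetical. The idea is that each database file is created empty at its new location, a symbolic link with the original name is placed in the data directory, and the restore then copies onto the link, so the data lands on the new file system. shutil.copy2 stands in for cp -p here.

```python
import os
import shutil


def place_empty_files(datadir, layout):
    """Create empty target files and link them into the data directory.

    layout maps a name relative to datadir (for example
    'sales/customers.MYI') to the full path where the file should
    really live. Both are assumptions made for this example.
    """
    for rel_name, target in layout.items():
        os.makedirs(os.path.dirname(target), exist_ok=True)
        open(target, "a").close()  # same effect as touch
        link = os.path.join(datadir, rel_name)
        os.makedirs(os.path.dirname(link), exist_ok=True)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale file or link
        os.symlink(target, link)


def restore_file(backup_path, datadir, rel_name):
    """Copy a backed-up file onto its link; the data follows the link."""
    shutil.copy2(backup_path, os.path.join(datadir, rel_name))
```

After the restore, the name in the data directory is still a symbolic link and the bytes sit in the target file, which is what makes the new disk layout invisible to MySQL.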

This went well. And it was so much fun, that we decided to do the remaining 4 servers, at the same time, on Tuesday night. It was a long day, but I went fishing during the backup phase and caught this guy:






When all was said and done, 5 transaction servers, with 53,000 tables consuming 2.5 terabytes of MyISAM data and index, were backed up, rebuilt, and restored.
-----------------------------------------------------

Friday was my last day before starting a new job at Lijit Networks. The many kind wishes from my former colleagues leave me truly humbled. This was a really great gig with some awesome folks.