Wednesday, January 02, 2008

SQL_MODE and MySQL Data Quality

As my former boss will attest, I have a reputation for being a bit of a data quality zealot. The storage of data that is unfit for use leads to many problems, but I suppose that’s another subject for another day.

It’s tough enough to manage data quality problems introduced by source code errors, system failures, and requirements misunderstandings…But a default installation of MySQL introduces a new and exciting way to give us data quality evangelists fits: It allows unfit data to be inserted in the database. That’s the bad news. The good news is that by making a simple configuration change you can prevent this, and override the setting when you don’t care.

In a default MySQL installation, the value of the SQL_MODE system variable is set to ‘’. This allows you to force inserts and updates that may violate the intended design of the table. This point is more philosophical than technical, but in a mission critical database I believe that your first line of defense against poor data is requiring well formed data manipulation statements.

Here is an example:

Create table jd_test ( id int NOT NULL , name varchar(30) NOT NULL ) ;

Insert into jd_test ( id ) values ( 1) ;

Select * from jd_test ;

In my opinion, this insert is ill-formed since it is missing non-optional columns in the specification. Therefore, I prefer that the statement fail and return an error to the application. In this example, with SQL_MODE = ‘’, MySQL returns a warning but the data still gets inserted with an empty string assigned to the name column.

Try that example with a DATETIME column and you will see similar behavior.

Recommendations:

First of all, check the value of SQL_MODE to see what it is set at. This is defined at both the global ( server-wide) and session level. Some third party utilities ( like SQLYog) set the session level value to ‘’, overriding the server value.

Set the global value of the SQL_MODE to ‘TRADITIONAL’ in the server configuration file ( /etc/my.cnf). Hold it! Be careful here because if your MySQL database serves an existing production system, then you may have to test the effects of this in a development or test environment first to see what breaks. For new databases, start off with this setting on your development, testing, and production systems. It’s best to be in the habit of working with this setting before your system goes live in production.


Encourage your developers to explicitly check the session value of this variable :

Select @@session.sql_mode


Discourage development of application code that specifically overrides the value of SQL_MODE at the session level with the exception of clearly defined processes requiring it.

When developing stored procedures and triggers, explicitly set the SQL_MODE to ‘TRADITIONAL’ at the top of the source code file before the “CREATE…” statement. These server objects enforce the value of SQL_MODE in effect at the time of creation.

Set session SQL_MODE = ‘TRADITIONAL’ ;
CREATE PROCEDURE myproc()
…..




Recommended Reading:

Section 5.2.6 “SQL Modes” of the Mysql 5.1 Reference Manual

No comments: