Originally, I assumed this had something to do with jdbc. However, since I can re-produce the issue without any SQL, I'm sure this has nothing to do with the version of the MySQL Connector/J jdbc driver. I tried the 5.0.8 version of the driver and I observed the same behavior. I couldn't even get my transform to work correctly with the 5.1.12 version of the connector -- it does not recognize column aliases in my SQL query.
Now for the rest of the story:
My comparison of numeric data was between 2 fields from two data streams, initiated by two separate SQL table inputs.
- The first data stream is from a "raw" table. From the input step it is passed through a "sort rows" and then a "group by" step to aggregate the numeric values.
- The second data stream is from a "rollup" table where the raw data is summarized.
The two streams are then merged ( by a unique id) and compared in order to validate the data in the rollup.
At this point, the problem seems more related to the metadata of the fields. I found two resolutions to choose from:
Use "group by" in SQL. The data types of the output numerics are magically set to BigNumber.
Place a "Select/Rename Values" step after the sort and before the "Group By" step to coerce the metadata of the fields to be of BigNumber type.
Personally, I prefer the second option because I like to extract the data as quickly as possible. Pentaho can handle sorting and grouping of somewhat large datasets just fine.
I have a new example transform here: numeric_compare_filter_values_try2.ktr
Here is a picture of the example transform: