Wednesday, August 19, 2009

Netezza TwinFin: A step towards a potential acquisition?

Ever since Netezza became a public company, every once in a while someone tries to start a rumor that Netezza is on the verge of being acquired (likely started by people who want a quick return on their Netezza stock buy). These rumors usually involve a company like Oracle buying Netezza, which never made a lot of sense to me, since Oracle has their own DBMS product and has very little reason to buy a much smaller competitor like Netezza and maintain two lines of code that target the same market. This is why it wasn’t surprising that Microsoft chose to acquire DATAllegro instead of Netezza, even though Netezza was much farther along than DATAllegro and had a larger customer base. DATAllegro essentially left the DBMS engine intact, with its key technological assets sitting on top of the DBMS, turning many single-node Ingres instances into a large, shared-nothing, MPP DBMS. Since DATAllegro used a nice, modular architecture, Microsoft was able to replace Ingres with SQL Server, and use DATAllegro’s technology to turn SQL Server into an MPP DBMS without significant modifications to the core SQL Server DBMS engine (see Microsoft’s Project Madison).

But two events now make me wonder if Netezza might actually end up being acquired by a vendor that currently sells a competing DBMS product (likely either IBM with DB2 or HP with NeoView).

First, there was the release of the Oracle Database Machine. Oracle openly admits that the Oracle Database Machine frequently gets a factor of between 10 and 70 performance improvement relative to previous Oracle offerings (i.e. Oracle RAC) on scan-heavy analytical workloads. But the center of the Oracle Database Machine is ... Oracle RAC! So how does it get the order of magnitude performance improvement relative to RAC? By connecting RAC (using InfiniBand) to a shared-nothing storage layer (Exadata) that can perform database scans at extremely high speeds and do some basic database operations like tuple selection and projection. Since scan-oriented queries are limited by the speed with which the scan can occur, simply connecting RAC to a storage layer that can do scans really well yields significant improvement.
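To make the idea concrete, here is a minimal sketch of why doing selection and projection in the storage layer helps scan-heavy queries. It is not Oracle's or Netezza's implementation; the table, the column names, and both scan functions are hypothetical, and "cells shipped" is just a stand-in for the data that has to cross the link between storage and the DBMS engine.

```python
from typing import Callable, Dict, List, Sequence

Row = Dict[str, object]

# Hypothetical table: 1000 rows, 4 columns, one row in four is in region "EU".
REGIONS = ["EU", "US", "APAC", "LATAM"]
table: List[Row] = [
    {"id": i, "region": REGIONS[i % 4], "amount": i, "note": "x" * 20}
    for i in range(1000)
]


def scan_plain(rows: Sequence[Row]) -> List[Row]:
    """Dumb storage layer: ships every column of every row to the engine."""
    return [dict(r) for r in rows]


def scan_pushdown(
    rows: Sequence[Row],
    predicate: Callable[[Row], bool],
    columns: Sequence[str],
) -> List[Row]:
    """Smart storage layer: applies selection and projection before shipping."""
    return [{c: r[c] for c in columns} for r in rows if predicate(r)]


def cells_shipped(rows: Sequence[Row]) -> int:
    """Count the values that cross the storage-to-engine link."""
    return sum(len(r) for r in rows)


def is_eu(r: Row) -> bool:
    return r["region"] == "EU"


# Query: SELECT SUM(amount) FROM table WHERE region = 'EU'

# Without pushdown, the engine receives everything and filters afterwards.
shipped_plain = scan_plain(table)
total_plain = sum(r["amount"] for r in shipped_plain if is_eu(r))

# With pushdown, the engine only receives the rows and columns it needs.
shipped_push = scan_pushdown(table, is_eu, ["amount"])
total_push = sum(r["amount"] for r in shipped_push)

cells_plain = cells_shipped(shipped_plain)
cells_push = cells_shipped(shipped_push)

print(total_plain == total_push, cells_plain, cells_push)
```

Both paths compute the same answer, but in this toy setup the pushdown path ships 250 values instead of 4000. The real systems differ in many ways, but the basic effect is the same: less data has to leave the storage layer.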

Perhaps Netezza’s greatest asset is its ability to achieve high performance on table scans. By using FPGAs to perform decompression, selection, and projection as data is read off of disk, Netezza is able to perform scans faster than what competitors (at least row-store competitors) can do on commodity hardware. If the Oracle Database Machine is successful (Larry Ellison said at a recent earnings call that it "is shaping up to be our most exciting and successful new product introduction in Oracle’s 30 year history"), I would expect its competitors to follow suit --- and connect their DBMS engines to a high performance storage layer the way Oracle did with Exadata.

Second, Netezza’s recent move to re-architect their appliance via TwinFin (announced a few weeks ago) is a clear embrace of commodity hardware components. Before this redesign, Netezza was a monolithic appliance. As detailed by ComputerWeekly, if you wanted to upgrade storage or processing capacity, you had to wait for the next Netezza release and replace the whole appliance with Netezza’s next generation. Now, the core part of the Netezza technology can be placed in the “sidecar” expansion slot in the standard IBM BladeCenter family of servers. This allows customers to upgrade the IBM blades independently of the Netezza technology.

Looking at it a different way: the technology behind Netezza’s stellar scan performance can now be found in a nice modular component, the “DB Accelerator” card, that can be placed in standard expansion slots in blade servers. The move towards a more modular architecture is reminiscent of the DATAllegro architecture that allowed Microsoft to replace Linux with Windows and Ingres with SQL Server and keep the majority of the rest of the DATAllegro technology. DATAllegro was sold for $275 million to Microsoft when it only had 3-4 customers.

Netezza’s market cap is currently $550 million and it has orders of magnitude more customers than DATAllegro did (and is currently profitable). Hence it seems like a prime candidate for an acquisition. Its recent architectural redesign allows it to be acquired even by a company with a competing data warehouse product, since its core technology can be used in the storage layer as a drop-in accelerator for table scans, in a similar way to how Oracle uses Exadata. IBM seems like a natural fit given their close partnership on TwinFin. Otherwise HP seems like an option since NeoView seems like it is having trouble getting off the ground. Time will tell, but I will no longer ignore Netezza acquisition rumors the way I once did.

Tuesday, August 4, 2009

Netezza's competitors open fire

It's fairly unusual to see a company openly attack a competitor in a public forum. I don't know the exact reason, but I presume it has something to do with the old dictum "there's no such thing as bad publicity", so giving a competitor free publicity of any kind (even the negative sort) is deemed a bad idea. Or maybe it's because the attack might backfire --- if the attack is not made using solid reasoning, the company might come off looking foolish. Or maybe it's because attacking a competitor is, in a way, a tacit validation of their position as an equal --- small, insignificant companies would be ignored; an attack is acknowledging that the two companies see each other often in competitive situations, and might encourage a potential customer to consider the competitor in the same POC when they might not have done so otherwise.

This is why the back and forth between Netezza and its competitors has been so jarring. By my estimation, it started when Larry Ellison positioned the new Oracle Exadata release back in September 2008 against Netezza, questioning Netezza's fault tolerance, DW functionality, and DBMS know-how (http://www.rittmanmead.com/2008/09/24/live-blogging-from-the-larry-ellison-keynote-oow-2008/). Netezza then responded by becoming absolutely obsessed with Oracle Exadata, attacking it in multiple postings on their Data Liberators blog (see here, here, here, and here) and even resorting to name calling ("Oracle Exaggerdata").

In the last few days, the attacks on Netezza have increased in intensity. First, I came across an Oracle blog post that basically claimed that Oracle Exadata is better than Netezza along every possible dimension: storage, CPU power, memory, interconnect, load performance, query performance, and architecture.

Then, I came across an Aster Data blog post which made wild claims, such as that Netezza's new release is an indication that Netezza regrets building their DBMS around FPGAs and is now desperately trying to abandon ship and switch to mainstream CPUs.

Be careful reading both of these attacks. I find them both full of FUD and disagree with much of the premise of both of them. Oracle's blog post omits comparing Netezza and Oracle along perhaps the two most important dimensions: price/performance and total cost of ownership. Oracle brags that their database machine uses a 20Gb InfiniBand interconnect while Netezza only uses 1Gb Ethernet. But presumably the price of the expensive interconnect gets passed on to the customer --- it could easily be argued that Netezza's use of 1Gb Ethernet is an indication that their architecture might be superior --- Oracle needs InfiniBand to connect the storage and computation layers of their system; Netezza's ability to push computation to the data allows them to avoid having to include the high cost interconnect. I would guess that Netezza's price/performance is significantly superior to Oracle's, but calculating the price of Oracle's database machine is far too complicated for me to put some meat behind this statement. Furthermore, Netezza's superior total cost of ownership relative to Oracle is common knowledge; I would be surprised to see someone argue otherwise.
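A back-of-envelope calculation shows why a slower interconnect is not automatically a weakness. The numbers below are made up purely for illustration (a 1 TB scan, a query that keeps 1% of the data after filtering near the disk); they are not measurements of either product, and the function name is my own.

```python
def transfer_seconds(num_bytes: int, link_gbps: float) -> float:
    """Idealized time to move num_bytes over a link of link_gbps gigabits/s.

    Ignores protocol overhead, latency, and contention.
    """
    bits = num_bytes * 8
    return bits / (link_gbps * 1e9)


# Hypothetical workload: a scan over 1 TB of table data.
table_bytes = 10**12

# Case A: ship the whole table over a fast 20 Gb/s link, filter at the engine.
ship_everything_fast_link = transfer_seconds(table_bytes, 20.0)

# Case B: filter next to the disk so only 1% of the data survives,
# then ship the survivors over a slow 1 Gb/s link.
selected_bytes = table_bytes // 100
ship_filtered_slow_link = transfer_seconds(selected_bytes, 1.0)

print(f"20 Gb/s, no pushdown:  {ship_everything_fast_link:.0f} s")
print(f"1 Gb/s, with pushdown: {ship_filtered_slow_link:.0f} s")
```

Under these assumed numbers the slow link moves its (much smaller) payload in about 80 seconds, against about 400 seconds for the fast link carrying everything. The comparison flips for queries that return most of the table, so this is an argument about selective scans, not a general claim.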

I also find the Aster Data post full of FUD. Claiming that Netezza is trying to abandon their FPGA approach is ridiculous (in my opinion). They have invested a huge amount into doing decompression, projections, selections, and other DBMS operations inside the FPGA, and there are performance advantages in doing so. The redesign of their architecture was necessary to be able to improve caching of data in memory (to improve repeated scans of the same table) and to add more commodity components to their system, allowing them to take better advantage of upgrades to the disk and CPU technology they incorporate.

Though the Oracle and Aster Data reactions are misleading, I'm still a little worried about Netezza. Like everyone else, I was looking forward to their "big" announcement at TDWI, and was disappointed when I found out what it was. Sure, the architectural redesign is big news internally at Netezza, but ultimately, all it means to the customers is that Netezza can now do some things that its competitors already can do. Sure, lower prices and a better ability to handle mixed workloads are nice, but I was expecting something a little more radical. I guess the lesson to be learned is that it is never a good idea to prepare people for a big announcement --- it just leaves lots of potential for disappointment.