Sunday, 21 February 2010

The Transportation Business

A lot of the action we get involved with CloudTran relates to finance, investment banking and so on. But we met with a media company last week, who have a setup that seems just as suitable for CloudTran and GigaSpaces.

The company collects data about consumer responses to promotions every day - about 10 million data points - then slice and dice this into usable information for the consumer product manufacturers.They're a UK company, with a stable business based on applications in Java and coping quite happily at the moment ... but now having a little think about business and technical strategy in the longer term. This brought a few things to mind.

Globalisation and Consolidation

Consumer information collection and processing? Sounds like a business that could go global - if I were CocaCola, I'd want to get analytics across countries, regions or continents from a single supplier. If the business can possibly go global, the management has to decide if they want to play in that space. (And if they don't, they may soon find someone else occupying that high ground.)

So ... what would it take to go global? Globalisation in this area is likely to lead to consolidation of worldwide processing through a single system in order to save operational costs. This means there could be more processing runs, many of them upwards of 100 million data points.

The 2 hour day

The crunch point for this company is a small processing window, in their case the 2 hours each night where all the raw data is in and has to be turned into saleable information.

So the maximum power of the system is used for 2 hours per day - 8% usage. That matches typical CPU utilisation averages of around 5-10% in the worldwide server base.

For large enterprises, the answer to making their estate more efficient is virtualisation, which can move your system utilization up to the 40-50% range. For SMEs, the answer is the cloud. Buy two hours of a few machines - maybe $50 per night - for heavy processing power, then release them. Compared to the value of the information being provided, surely the cost must be trivial. And the "-as-a-service" model means that management, provision, backups and all that stuff is done by the provider, simplifying the SMEs own IT function.

Red-hot Oracle

Given that the company's Oracle database machine is already red-hot during peak processing, positioning for growth means looking at the data processing structure. The current process consists of a number of select+calculate+save cycles - reading then writing to the database disk.

In thinking about this, I was struck by how long the 'back-end database' architecture has survived, and how the disk vs. electronics equation has changed.

My company evaluated databases back in 1982-3 and ended up buying Oracle. At that point, the hot CPU was Motorola 68000 at 1 Mips, typically with 256kb memory. The hot disk was the Shugart/Seagate ST-412 - 10MB capacity, transfer rate of 5Mbps (million bits/sec) and 85ms average access time (on the Seagate site ... or 15-30ms in Wikipaedia!).

Nowadays, for testing we have a number of fairly hot Intel i7 920, 4 cores - which I reckon is at least 5,000X faster than the 68000 - with 8GB memory (=16,000X). The hard disk is 250Gb (25,000X), transfer rate of 150MB/s - million bytes/sec - or 1.2Gbps (240X), and access time of 13ms. (probably 6.5X).

ThenNowMultiplier (X)
CPU1 Mips5,000 Mips5,000
Disk Capacity10MB250GB25,000
Transfer rate10MB1.2Gbps240
Access time85ms13ms6.5

So - disk capacity has kept up with CPUs and memory, but disk transfer rates are well down by comparison ... and access times have fallen off the pace dramatically. And furthermore, the discrepancy is going to keep getting worse: in the next 10 years, disk seek time and IO rates won't improve greatly whereas CPU and memory will improve by 10-100 times. There just aren't the advances in mechanical devices there are in electronics.

For an analytics application like this, paying for expensive licenses to pump information in and out of the database through a bunged-up pipe when it's irrelevant to the business - that's just a waste. It is much less effective than loading the in-coming data and doing the analysis in-memory.

Easy Scalability

What I was trying (!) to say in the previous section was, the company's database-centric architecture is almost certain to hit the wall when the volumes go up by 10-100X - these sort of changes in quantity usually trigger a qualitative change. Especially if the DB machine is
already smoking.

The combination of GigaSpaces' scalable data partitioning with CloudTran's automatic distribution and collection of data across machines meansthat more nodes, more memory and more processors can easily be deployed on the application.

As CloudTran allows you to easily create a number of deployments and keep them in step with the app, it might be worthwhile to have different size deployments on hand, and spin up large and small versions for processing depending on the size of data-set to be processed.

Data Grid = Compute Grid

A common approach to using grids is to have separate data and compute grids.

However, this company's application can hugely benefit from putting related data into a single machine; when you spread this out, the separate grids become a single data+compute grid. This means that the data to perform a distinct step in the calculation is mostly, if not completely, within the memory of the CPU running a calculation.

Our rule of thumb is that if the cost of getting a piece of data from an in-memory transactional store is 1 unit, then across the network the cost is 50-100 units, and from a database 50,000-100,000 units.

By spreading the data around, the data+compute grid also can apply many processors to the data. Multiply the above numbers by 10 or 100 nodes working on the problem and soon you're talking about real money.

Organising this with CloudTran takes no effort - the co-location of the data, and scatter-gather of information from other nodes, is all done automatically. The calculations are split up, with each node working on its sub-process, using its local data.


We mentioned that the typical processing cycle is select+calculate+commit. In an in-memory architecture, the select will be done in-memory, as of course will the calculate. This leaves the "commit cycle".

If the information being committed is simply intermediate calculations, then in an in-memory architecture you can skip that step entirely. However, there will be cases in long-running calculations where you won't have the time to start over if anything goes wrong: in that case, you really do want to commit this information to a database.

CloudTran makes it possible to commit results to a transaction buffer machine very quickly (i.e. small numbers of milliseconds for small transactions) and have the processing cycle move onto the next phase while the previous one commits. In other words, we overlap the commit of one stage with the in-memory processing of the next. If the slowest part is the information analysis, using CloudTran to commit in parallel means that committing costs nothing (in the critical path sense).

And What About Flash?

Now that I've moaned about hard disks, does flash ruin the whole argument? Well, for analytics applications, the answer is probably not. If in any given phase of processing (select or commit) your database machine is smoking today, then adding flash drives won't help at all because the bottleneck is the CPU. You won't be able to create a scalable solution with a "database tier" architecture.

What is the database tier doing with its CPU cycles? The database is doing some "real" processing steps, such as merging indexes and sorting for SELECT. Then there are the overhead steps. On the storage side, this means packing rows into storage pages on disk (e.g. Oracle blocks), constructing indexes and mapping them to storage. On the comms side, the overhead steps are interfacing through JDBC, serialising the response and then the processing for TCP/IP.

You may be able to improve the "real" processing steps - but to scale up by 10-100X, you'll almost certainly hit the "overhead" wall. This is why we prefer to build on an integrated data+compute tier.

Stephen Foskett has a complementary analysis of flash and cloud storage issues.

IT - Data and Processing

A while back, I visited a large IT company's headquarters with my boss. As we walked towards reception, he looked up at the huge building and said "Guess how many people work here?". I forget my answer - his was "About a third of them".

Ever since I started in enterprise IT, I have been struck by how many IT components don't do real work either. So much shipping data around goes on. If you're looking for an customer's personal information and orders, the real data might be 2KB - but you'll probably end up shifting a many megabytes around various components to get to it. Then the real processing is usually trivial - a few hundred instructions.

As Peter Drucker would have said, right now we're in the transportation business - we should be in the information business.

1 comment:

  1. Great post, Matthew! It's time for all of us IT infrastructure folks to start thinking about the bigger (and really biggest) picture of performance and scalability - we're all linked up through the stack!