Category Archives: TwitYourl

TwitYourl, Twitter, and Treemaps from Panopticon

This is a treemap visualization (from Panopticon) hooked up to some static data as crunched by our TwitYourl project, our first foray into Cloud Event Processing together.

Cloud Event Processing – Where For Art Thou oh CEP?

In a recent post, Louis Lovas of Progress Apama explains why the first generation CEP vendors don’t have many, if any, cloud deployments.  Here’s a quote from his post:

Typical of event processing applications that do things are those in Capital Markets like algorithmic trading, pricing and market making. These applications perform some business function, often critical in nature in their own right. Save connectivity to data sources and destinations, they are the key ingredient or the only ingredient to a business process.  In the algo world CEP systems tap into the firehose of data, and the data rates in these markets (Equities, Futures & Options, etc.) is increasing at a dizzying pace. CEP-based trading systems are focused on achieving the lowest latency possible. Investment banks, hedge funds, and others in the arms race demand the very best in hardware and software platforms to shave microseconds off each trade. Anything that gets in the (latency) way is quickly shed.

Shall I hear more, or shall I speak at this? (Romeo’s reply…)

Louis goes on to describe additional possible reasons for the current lack of CEP in the cloud.  Amongst them he cites that many businesses have not migrated their key business processes to the Cloud, and that having those business processes in the Cloud is a requirement for CEP in the cloud.

Without citing potential evidence to the contrary, I think that Louis’s point is valid.  It’s valid because Progress Apama’s current customer base, like those of StreamBase and Sybase, consists predominantly of Capital Markets participants actively engaged in high frequency, or algo, trading.  And I agree with Louis that it currently doesn’t make any sense to move algo trading to the cloud, no matter which, if any, of the 21 definitions of cloud you identify with.

Meanwhile, Back at the Ranch

We’re busy hooking up some big data visualization to our TwitYourl – and we’ll be sharing that shortly.  And although TwitYourl could hardly be described as a key business process, Cloud Event Processing based big data analytics can certainly help refine and focus key business processes.

If You Store My Tweets Today, I'll Gladly Look at Them Next Tuesday

Cloud Event Processing – Where’s The Data?

There are several things left to cover in our #TwitYourl project.  One of the most glaring absences so far is storage – where do we put all of these Tweets?  What happens if we’d like to make changes to our RuleBots and test them with historical data?  Right now, we can’t.

In keeping with some of the rules of our implementation, an OnRamp is not allowed to update a database directly.  This would tie the OnRamp to the database implementation and make our architecture more brittle than it has to be.  Also, it would mean that the OnRamp had to have specific information about where it was sending a message – this is not allowed.

Remember, we are architecting for the cloud – the idea of baked in, point to point, connections inside of our cloud should have you recoiling in horror by now.  So how can we update databases easily within our Cloud Event Processing framework?

Well, it would be really easy if we had some metadata that described all of the data, or events, that the OnRamps were publishing on the bus.  And it would be even neater if all of that metadata were available in a central database or directory.  This way, when the infrastructure that #TwitYourl is built on top of is running, all one would need to do is go to some management console, bring up the information about how we’d like to handle a particular event, click the ‘Store in a Database’ button and be done with it.  Click, click, click and we’re storing events – automagically; somewhere in the cloud!
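
To make the idea a bit more concrete, here is a rough sketch of what metadata-driven storage could look like.  None of this is the actual #TwitYourl code: the EventBus, InMemoryStore, EVENT_METADATA and tweet_on_ramp names are all made up for illustration, and an in-memory list stands in for a real database.  The point is only that the OnRamp publishes an event type and never learns where the event ends up.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, object]
Handler = Callable[[Event], None]


class EventBus:
    """Tiny in-memory stand-in for the message bus (hypothetical)."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, event: Event) -> None:
        # The publisher names only the event type, never a destination.
        for handler in list(self._subscribers[event_type]):
            handler(event)


class InMemoryStore:
    """Placeholder for a real database; it just keeps events in a list."""

    def __init__(self) -> None:
        self.rows: List[Event] = []

    def save(self, event: Event) -> None:
        self.rows.append(dict(event))


def apply_metadata(bus: EventBus, metadata: Dict[str, Dict[str, bool]],
                   store: InMemoryStore) -> None:
    """Attach a storage subscriber for every event type flagged 'store'."""
    for event_type, settings in metadata.items():
        if settings.get("store"):
            bus.subscribe(event_type, store.save)


def tweet_on_ramp(bus: EventBus, text: str) -> None:
    """Hypothetical OnRamp: it publishes and knows nothing about storage."""
    bus.publish("tweet", {"text": text})


# Assumed metadata, as it might come back from a central directory after
# someone clicks the 'Store in a Database' button for the 'tweet' event.
EVENT_METADATA = {"tweet": {"store": True}, "heartbeat": {"store": False}}

bus = EventBus()
store = InMemoryStore()
apply_metadata(bus, EVENT_METADATA, store)

tweet_on_ramp(bus, "hello http://example.com")
bus.publish("heartbeat", {"ok": True})  # not flagged, so not stored
```

Flipping the flag in the metadata is the only change needed to start or stop storing an event type; the OnRamp itself is never touched.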

So, let’s do this for #TwitYourl – I vote for MongoDB.  I know that Cassandra seems to be a fairly popular choice this past week, but we’re going to buck that trend and hook up MongoDB inside of our cloud and we’re going to feed Mongo some yummy Tweets!

Sorry, I couldn’t resist.

But before feeding Mongo, we’re going to work on some visualization this week.  We’re running a little behind schedule here and we need to get caught up!

It's a New Week!

Well, getting code up and running is one thing.  Getting it deployed is another.  I have questionable skills in the former, and not so questionable skills in the latter!  In fact, there’s no question about my lack of skill in the latter whatsoever!  So it’s taking me a little extra time to deploy TwitYourl to the Rackspace cloud.  But I did get the cloud configured this weekend.  So that’s exciting.

Rackspace

After selecting Rackspace over EC2 for this project (Rackspace is #2, and seems to be gaining rapidly on EC2), Rackspace verified my credit card and sent me an email.  Within a couple of hours (and that’s because I was multitasking at the time), I was able to create a number of Ubuntu instances.  I wish that there was more to report than that – but there isn’t.  It just worked.  I installed a RabbitMQ hub, and was publishing/subscribing to messages using my new cloud instance (“Look ma!  Isn’t it shiny!?”) very quickly.

So This Week

I will be packaging up TwitYourl for deployment via Rackspace, getting all of the source up to github, and looking to integrate Panopticon’s Treemaps into the project as well.

Meanwhile, Back at the Ranch

As I’ve been getting this relatively small project to the point where we can actually show something publicly, I’ve been building a list of ‘Gee, wouldn’t it be nice if…’ items with regard to cloud event processing: what an IDE might look like, what capabilities a management console would provide, etc.  Once we’ve got TwitYourl up and running in the cloud, I will start discussing that list out here, in the open.

Suggestions – Keep Them Coming

I’ve been getting a lot of email, and a lot of traffic on the blog, and I think that’s a good thing.  I’ve received suggestions, requests for briefings, and queries as to my whereabouts in the near future.  Keep it up – I love to hear from the community, and the suggestions and feedback have been invaluable.  Thank you.

Just When I Thought I Was Done… Bring on the Visualization!

“They pulled me back in.” – The Godfather.

I’ve received some interest/emails about TwitURL – our Map/Reduce as it applies to CEP (cloud event processing) project.  Seems that people would like to see the results of these processes visually.  Who can blame them, right?  So, I was thinking, how can I add a little sizzle to TwitURL?

Panopticon

Panopticon offers some pretty slick visualization capabilities – you can check them out here.  And based upon some feedback/requests, I’m going to hook up a heat map to the output of TwitURL.  The heat map will show which URLs are the hottest, that is, the most tweeted URLs over a given time frame.  Once we deploy TwitURL in the cloud, we’ll use this app to see what’s going on in there.  Here’s a picture of the heat map: