Wednesday, May 11, 2011

On big events and big data



The "Big Data" phenomenon has gained a lot of traction, interest, and related work in recent years. The Internet and the digitization of everything have produced amounts of data beyond past imagination, and the rate of growth is amazing. Mark Palmer, in a blog posting, drew an analogy between data and sand,



saying that  "If every grain of sand in the bucket was 1 byte of data, then:
  • The entire work of Shakespeare fills just one bucket of sand (about 5MB)
  • A fast financial market data feed (OPRA) fills a beach of sand in 24 hours (about 5TB) 
  • Google processes all the sand in the world every week (about 100PB)
  • We generate 60% more sand every year" 
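
To get a feel for the proportions in this analogy, here is a small back-of-the-envelope sketch. It assumes decimal units (1 MB = 10^6 bytes), which the quote does not specify, and the constant and function names are mine, chosen only for illustration.

```python
import math

# Assumption: decimal units, since the quote does not say which convention it uses.
MB = 10**6
TB = 10**12
PB = 10**15

BUCKET = 5 * MB        # the entire work of Shakespeare
BEACH = 5 * TB         # one day of the OPRA feed
WORLD_WEEK = 100 * PB  # what Google processes in a week, per the quote


def buckets(n_bytes: int) -> int:
    """Express an amount of data as a number of 'Shakespeare buckets'."""
    return n_bytes // BUCKET


# With 60% growth per year, the amount of sand doubles every
# log(2) / log(1.6) years.
doubling_years = math.log(2) / math.log(1.6)

print(f"A beach is {buckets(BEACH):,} buckets")
print(f"A week of Google is {buckets(WORLD_WEEK):,} buckets")
print(f"The sand doubles roughly every {doubling_years:.2f} years")
```

Under these assumptions a beach is a million buckets, and 60% yearly growth means the total doubles in about a year and a half.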


Using this analogy: if all the data in the world is sand, much of the sand describes facts. By the way, the fact that a fact appears as data in the big data universe does not mean that this fact is actually true.

Events generate some of this data, but in many cases an event is the occurrence of a fact becoming true or false, and this occurrence is not really kept in the data.
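
To illustrate the idea of an event as "a fact becoming true or false", here is a minimal sketch. It is not taken from any event processing product; the `FactEvent` class, the `detect_transitions` function, and the temperature readings are all hypothetical. The stored data is just a sequence of readings; the events are the transitions, which appear nowhere in the data itself and have to be derived.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass(frozen=True)
class FactEvent:
    """Hypothetical event: a named fact changed its truth value."""
    index: int    # position of the reading at which the change was observed
    fact: str     # human-readable name of the fact
    became: bool  # the new truth value


def detect_transitions(
    states: Iterable[float],
    fact: str,
    predicate: Callable[[float], bool],
) -> Iterator[FactEvent]:
    """Yield an event each time the predicate flips between consecutive states."""
    previous: Optional[bool] = None
    for i, state in enumerate(states):
        current = predicate(state)
        # The first reading only establishes a baseline; it is not a transition.
        if previous is not None and current != previous:
            yield FactEvent(index=i, fact=fact, became=current)
        previous = current


# Made-up readings: the data holds the values, not the moments of change.
readings = [18, 22, 25, 19]
for event in detect_transitions(readings, "temperature above 20", lambda t: t > 20):
    print(event)
```

In this sketch the four readings produce two events: the fact becomes true at the second reading and false again at the fourth.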

The "Dagstuhl grand challenge", which is part of the event processing manifesto, talks about an "event fabric", which will be the equivalent of the Internet for events instead of data. I guess that the quantities will be of the same order of magnitude, so it will face the same scalability challenge. The main difference is the type of processing: event processing instead of queries and information retrieval. Getting to an "event fabric" indeed involves many challenges. At DEBS 2011 there will be a tutorial about this grand challenge. I'll write more about this challenge in the future.


(and this is of course Schloss Dagstuhl) 

Tuesday, May 10, 2011

Watson meets DEBS 2011




DEBS 2011 will take place in the Yorktown Heights auditorium, the same site where the famous Jeopardy! programs, in which the computer won against two human champions, were recorded. The Watson hardware resides very close to this auditorium. With some effort, we succeeded in bringing Eddie Epstein, one of the senior people behind Watson, who has agreed to be an invited speaker at DEBS 2011. Eddie will also provide a demo of the system.


The full program of DEBS 2011 will be published around this weekend; the list of accepted papers can be found on this blog.


It should be noted that if you plan to attend DEBS, you should plan to stay for the entire conference: besides the 33 presentations, the attractions will be spread across the days, so no day can be skipped!


Monday, July 11 will be the tutorial day, and in the late afternoon there will be the EPTS award granting event.
Tuesday, July 12 will have the two industrial keynote speakers, Chris Bird and Don Ferguson (see here for more about the keynote speakers), a reception in the late afternoon, and, after getting some alcohol, the "gong show", in which the participants can express outrageous ideas about future features and uses of event processing technology.
Wednesday, July 13 will have the keynote talk of Johannes Gehrke, the demo and poster session including the grand challenge, and the conference banquet with an artistic performance and the DEBS awards (best paper, best idea in the gong show, best demo, and maybe more).
Thursday, July 14 will have the keynote talk of Calton Pu and the invited talk of Eddie Epstein on Watson.


None of these days should be missed, so plan accordingly!


The fifth day, July 15, will be the PhD workshop, held adjacent to the conference.


See you all in Yorktown.  

Sunday, May 8, 2011

1000 contacts in LinkedIn


Today, two more connections were added to my LinkedIn contacts list, and the number of contacts went up from 998 to 1000. The 1000th contact is my IBM Haifa Research Lab colleague, Ran Ettinger. LinkedIn is the social network in which I have the most contacts. I also have a Twitter account with 236 followers and a Facebook account with 96 friends. There are some more social networks I am somehow registered in, but I am not active in any of them.


Among the 1000 members of the contact list, 331 are (or have been at some point) IBM employees. The two other high-tech companies with the most contacts are Google (16) and Microsoft (10). There are also some contacts from my second universe, the academic world: the Technion (29), followed by Tel-Aviv University (8).
Some are friends and classmates from high school. Out of the 1000, 423 live in Israel, 131 in the Greater NY area, 69 in the SFO Bay area, and 46 in the UK, and the rest come from many other countries on all continents. I am also registered in about 40 LinkedIn groups, from Temple University alumni to the Haifa high-tech community, and of course all the event processing related groups, and many others.


I have never asked anybody to recommend me on LinkedIn; one person did so in response to my having recommended him. I have recommended six people at their request, but I never knew whether it was effective.


That's enough statistics for today.