Thursday, August 15, 2013

On machine learning as means for decision velocity

Chris Taylor has written in the HBR Blog a piece that advocates the idea that machine learning should be used to handle the main issue of big data - decision velocity.  I have written recently on decision latency, which according to some opinions - real-time analytics will be the next generation of what big data is about.
Chris' thesis is that the amount of data is substantially increasing with the Internet of Things, and thus one cannot get a decision manually in viewing all relevant data,  there will also not be enough data scientists to look at the data.   Machine learning which is goal oriented and not hypothesis asserting oriented will take this role.     I agree that machine learning will take a role in the solution, but here are some comments about the details:

Currently machine learning is off-line technology, case sensitive, and cannot be the sole source for decisions.


It is off-line technology, systems have to be trained, and typically it looks at historical data in perspective and learns trends and patterns using statistical reasoning methods.  There are cases of applying continuous learning, which again done mostly off-line, but is incrementally updated on-line.    When a pattern is learned it needs to be detected in real-time on streaming data, and here technology like event processing is quite useful, since what it does is indeed detect that predefined patterns occur on streaming data.  These predefined patterns can be achieved by machine learning.    The main challenge will be the online learning -- when the patterns need change, how fast this can be done in learning techniques.  There are some attempts at real-time machine learning (see presentation about Tumra as an example), but it is not a mature technology yet.

Case sensitive means that there is no one-size-fits-all solution for machine learning, and for each case the models have to be established in a very specific way for that case.  Thus, the shortage in data scientists will be replaced by shortage of statisticians,  there are not enough skills around to build all these systems, thus the state of the art need to be improved to make the machine learning process itself more automated.

Last but not least - I have written before that get decisions merely based on history is like driving a car by looking at the rear mirror.  Conclusion from historical knowledge should be combined with human knowledge and experience sometimes over incomplete or uncertain information.  Thus besides the patterns discovered by machine learning, a human expert may also insert additional patterns that should be considered, or modify the machine learning introduced patterns.




Tuesday, August 13, 2013

On event-driven, request-driven,stateful and stateless



This slide is taken from our DEBS 2013 tutorial, explaining what are the differences in thinking between the traditional request-driven way and the event-driven way.   It shows the differences by answering three questions.     This goes back to the differences between business rules and event processing, and old topic, on which I have written first time around 6 years ago!    One of the claims that I've heard several times is that the distinction between them is that business rules are stateless and event processing is stateful.     I think that the main difference is that business rules are treated as request driven,  the rule is activated on request and provides a response, while event driven logic is driven by event occurrence as shown in the slide above.

While it is true that there is correlation between event based/request based and stateful/stateless,  these are really orthogonal issues.

Event-driven logic can be stateless.  If we only wish to filter an event and trigger some action,  this can be stateless (most filters are indeed stateless), but it has all the characteristics of event-driven, including the fact that if the event is filtered out - no response is given.   

On the other hand -- a request-driven logic may be stateful, there are many instances of session oriented and other stateful request-response protocols.    One can also implement stateful rule engine in a request-response way, where invocation of rule is based on result of previous rules that are retained by the system.  

Bottom line:  stateful vs. stateless is not equivalent to event-driven vs. request-driven.