CrateDB SQL Database Puts IoT and Machine Data to Work


Space-Time Insight joined Crate.io in its launch of CrateDB 1.0, an open source SQL database that enables real-time analytics for machine data applications. We make extensive use of machine learning and streaming analytics, and CrateDB is particularly well suited to the geospatial and temporal data we work with. It also supports distributed joins. It allows us to write and query sensor data at more than 200,000 rows per second, and query terabytes of data. Typical relational databases can’t handle anywhere near the rate of ingestion that Crate can.

We also get image (BLOB) and text support, which is important for our IoT solutions, as they are often used to capture images on mobile devices in the field and to provide two-way communication between people and machines. Crate is also microservice-ready; we’ve Dockerized our IoT cloud service, for example.

Finally, our SI Studio platform uses Java and SQL and expects an SQL interface, so choosing Crate made integration straightforward and allowed us to leverage existing internal skill sets.

Read more at Space-Time Insight and The Register.


Analytics: The Valuable “A-ha” Moment


Businesses and other enterprises invest in advanced analytics and situational awareness solutions to improve operations.  The value of this investment typically equates to one or more of the following benefits: customer satisfaction and retention, competitive advantage, cost savings, productivity improvement, workflow consistency and business agility, as well as being more responsive to actual or likely problems, failures, service disruptions, and crises.  The value and benefits are often visible, measurable and aligned with the impetus for installing the solution.

Other ongoing secondary benefits and value from these solutions can come about by happenstance, arising organically when end-users have an “a-ha moment” and apply their system, data, and/or other resources to purposes other than those for which they were originally intended.  Another serendipitous way of finding value is by performing “what if” analyses.  These types of secondary benefits come about by finding value hidden in your data.

Hidden value

Finding value hidden in your data brings about substantial benefits because many small optimal decisions and corresponding favorable outcomes can equal or exceed gains from a single large-scope optimal decision or action.  This is especially true if the smaller optimal decisions are repeatable and support recurring situations.  Benefits such as these do indeed positively impact top-line revenues, bottom-line earnings, and the common key performance areas that I listed in the opening paragraph.

Consider that your data may be generated from multiple systems and stored in databases and repositories that are a tightly coupled component of those systems.  As an example, your purchase transaction data is in an order management system, your production data is in an ERP system, your financial data is in a general ledger system, your customer support interactions are in a CRM system and your customer loyalty program data is in yet another system.  Some of these systems may run in-house and others in the cloud.  They may even be located in different geographies, including different countries.  Hence your difficulty in seeing or readily finding value in all of your data.

There is correlation between most if not all of your data.  Each purchase has a correlation to the production data, the financial data, the loyalty data, and possibly the customer support data.  However, because the data is often in various formats, systems, repositories and locations, it’s impractical for a human to sift through the data to find either overt or subtle correlations that reveal the value hidden in your data.  The situation is exacerbated by data streaming from Internet of Things (IoT) devices.

These same issues also make it challenging for data processing systems, including analytics and business intelligence (BI), to find value hidden in your data.  Data processing systems should be up to the challenge, given their strength – performing highly repetitive and iterative tasks involving large amounts of data.  However, finding value hidden in data, especially if that data is spread across systems and in silos, requires specific architectures and technologies to overcome these challenges.

Correlated items and events are more likely to be identified, and identified more often, by architectures and technologies that aggregate data from multiple sources and feed that data to advanced analytics, machine learning, pattern matching, statistical methods and complex models.  Such correlations lead to value that can be found and brought forward for decision-making and action.
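As a rough illustration of aggregating data from separate systems before applying a statistical method, here is a minimal Python sketch. The two dictionaries stand in for extracts from an order management system and a CRM system; the customer IDs, field meanings and values are all hypothetical, and a production pipeline would involve far more data and far more care.

```python
from math import sqrt

# Hypothetical extracts from two separate systems, keyed by customer ID.
orders = {"c1": 1200.0, "c2": 300.0, "c3": 950.0, "c4": 150.0}  # annual spend
support = {"c1": 1, "c2": 6, "c3": 2, "c5": 4}                  # tickets filed


def join_on_key(left, right):
    """Return (left_value, right_value) pairs for keys present in both sources."""
    shared = sorted(left.keys() & right.keys())
    return [(left[key], right[key]) for key in shared]


def pearson(pairs):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


pairs = join_on_key(orders, support)
r = pearson(pairs)
print(f"{len(pairs)} customers appear in both systems, r = {r:.2f}")
```

Only customers present in both extracts contribute to the result, which is one reason siloed data hides correlations: until the sources are joined, there is nothing to correlate.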

Here are some examples of found and realized value that was hidden in data:

  • Awareness of not using the most economical way of transporting items, and adjusting accordingly
  • Identification of excess cargo capacity while optimizing shipping routes
  • Identification of assets most likely to fail by drilling down into key performance indicators

If you have found and benefited from finding value hidden in your data, please share your “eureka” story with us.

For more on this topic of finding value hidden in your data, please see the article “Through the Looking Glass: Critical Asset Insight and Transparency Increases Operational Efficiencies & Customer Confidence.”


Resolving Different Conclusions from the Same Data


In the era of big data analytics, there may still be room for human input and judgement.

A recent Harvard Business Review article discusses the very real likelihood of reaching different conclusions from the same data. The article recounts how multiple teams of analysts were given the same question to answer and the same data set to research. Of 29 teams working the problem, 20 found a statistically significant relationship that answered the question. Nine teams found no significant relationship.

In the end, the teams “converged toward agreement” that there was “a small, statistically significant relationship,” the cause of which was “unknown.”

This phenomenon could be helpful. If you have the luxury of multiple teams, you can generate a more thorough investigation and debate. It could also be harmful, leading to an endless sort of analysis paralysis.

Big data only magnifies this problem. Imagine multiple teams working with multiple data sets, each of which is relevant to the answer, but none of which is sufficient by itself.

How can you tackle this?

Aside from compromise or consensus answers, the article mentions averaging different conclusions as another possible approach.

In big data analytics, you might substitute multiple algorithms for multiple teams. Ensemble methodologies have gained strong traction recently. For example, the Netflix Prize was won by an ensemble that blended many models, including restricted Boltzmann machines (RBM). It’s fair to say that ensembles of boosted regression trees (BT) are among the most popular methodologies for classification. Amex, for example, uses BT for fraud and credit worthiness.
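The two simplest ways of combining conclusions, averaging and majority voting, can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the scores are hypothetical probabilities produced by three independent models (or teams) for the same question, and real ensemble methods such as boosted trees are considerably more involved.

```python
from statistics import mean


def average_scores(scores):
    """Combine model outputs by taking their arithmetic mean."""
    return mean(scores)


def majority_vote(scores, threshold=0.5):
    """Return True when a strict majority of models score at or above threshold."""
    votes = sum(1 for score in scores if score >= threshold)
    return votes * 2 > len(scores)


# Hypothetical probabilities that a relationship is real, one per model.
scores = [0.62, 0.48, 0.71]

print("averaged score:", round(average_scores(scores), 3))
print("majority says yes:", majority_vote(scores))
```

Averaging keeps the strength of each conclusion, while voting keeps only its direction, so the two can disagree when one model is far more confident than the others.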

Outside the application of analytics, business considerations might provide additional, deciding constraints for sorting out multiple approaches. Feasibility, budget or timeline for implementation, safety, regulatory constraints and other considerations could be the deciding factor when choosing an algorithm. For example, a financial company could use a BT for training their analytics at scale, but once in production they may switch to using simple regression-based classification to stay in compliance with regulations.

Having data supporting your conclusions is usually better than having no data. Better yet is a thorough examination of methods behind your analytical approach to deriving and applying value from big data.


Situational Intelligence: Your Next Chief IoT Officer?


A recent Datamation blog post argues that gaining full value from the Internet of Things requires “a broadminded corporate vision, a radically new approach to product/service design, highly specialized technical skills, and fundamentally rethinking an organization’s go-to-market strategies.”

All this demands “a multidimensional perspective that spans the traditional corporate silos, and bridges the gap between business and technology.”

How are you supposed to achieve this multidimensional perspective that spans silos? According to Datamation, by appointing a Chief IoT Officer. This role would have three main functions:

  • Aligning technological components to corporate objectives
  • Capturing relevant data
  • Utilizing data to support operations and achieve corporate objectives

That sounds like a solid job description for a new C-suite role. But adding to the C-suite is a centralized approach, while the Internet of Things is a highly decentralized phenomenon. Wouldn’t you want everyone in your organization working across silos, aligning technology to objectives, and capturing and applying relevant data to solve business problems?

If so, deriving value from the Internet of Things becomes a question of culture, not leadership.

Situational intelligence, by definition, correlates, analyzes and visualizes data from multiple data silos.

By making situational intelligence applications widely available across your organization, including in your boardroom, you can build a culture of spanning silos, applying data, and solving business problems. A new C-suite role doesn’t guarantee cultural change in your organization.



Invite Situational Intelligence to the Board Room


IBM says that it’s time to invite data scientists to the board room. Fortune recently declared “The Algorithmic CEO.” A Hong Kong venture capital firm, Deep Knowledge Ventures, has appointed an algorithm to its board of directors.

What is happening with analytics at the corporate board level?

According to a 2013 survey by Tata Consultancy Services, three consistent challenges that executives face in realizing ROI from Big Data projects are:

  • “Getting business units to share information across organizational silos”
  • “Building high levels of trust between the data scientists who present insights on Big Data and the functional managers”
  • “Determining what data to use for different business decisions”

These challenges have little or nothing to do with technology.

The first two are issues of organizational culture. The last pertains to business strategy.  Organizational culture and business strategy are squarely the purview of corporate officers and directors.

One could fault corporate officers and directors for being behind the Big Data curve. It’s not like this stuff just happened in the past two months.

What is new is the awareness of potential competitive advantage by applying analytics across silos of IT, operational, and external data, not just within silos. Corporate officers and directors (should) work above individual data silos, looking for advantage. Until the advent of situational intelligence solutions, they have not had powerful, easy tools for analytics across silos.

A defining characteristic of situational intelligence solutions is the ability to access and correlate multiple sources of IT, operational, and external data into a single platform for analysis. Related to that is the ability to visualize the results of analysis for multiple types of users on multiple devices.

Thus, situational intelligence solutions are useful board-level tools for sharing information across silos, building trust across the organization and collaborating on what data can support the best business decisions. Maybe it’s time to invite situational intelligence to the board room.


In Analytics, There Are No Black Boxes


You can’t just blindly accept numbers—that’s not scientific. Yet when you rely on black-box analytics, you’re forced to accept that the logic of the unseen algorithms precisely matches your needs.

Opening up the black box gives you three types of confidence in the data that is driving your decisions.

  • Understanding how an analytical value is derived. Even if you couldn’t do the math yourself or were never a mathematician, you should know what inputs and logic went into creating the analytical results on which you are relying.
  • Modifying or creating analytics to fit your specific needs. Although you may be working with an established analytics vendor or product, you may need to modify existing algorithms or even create your own in order to meet your needs. For instance, you may want to add or change the inputs given to an algorithm, or change the weighting given to the inputs used.
  • Auditing the analytics process. If you are using analytics to make significant, data-driven decisions, odds are you will need to show your math at some time to someone: regulators, investors, boards of directors, insurance companies. Black-box analytics don’t give you this opportunity for auditability and transparency.
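To make the three points above concrete, here is a minimal Python sketch of a score whose inputs, weights and per-input contributions are all visible. The function name, input names and weights are hypothetical and chosen only for illustration; they do not describe any particular vendor's analytics.

```python
def score_with_audit(inputs, weights):
    """Return a weighted score plus each input's contribution to it.

    Returning the contributions alongside the total lets a reviewer see
    exactly how the number was derived.
    """
    missing = set(weights) - set(inputs)
    if missing:
        raise KeyError(f"missing inputs: {sorted(missing)}")
    contributions = {name: inputs[name] * weight for name, weight in weights.items()}
    return sum(contributions.values()), contributions


# Hypothetical asset-risk inputs and weights; changing a weight or adding
# an input is a visible, reviewable edit rather than a hidden one.
inputs = {"age_years": 12.0, "load_factor": 0.8, "faults_last_year": 3.0}
weights = {"age_years": 0.5, "load_factor": 10.0, "faults_last_year": 2.0}

total, contributions = score_with_audit(inputs, weights)
print("score:", total)
for name, value in contributions.items():
    print(f"  {name}: {value}")
```

Because the weights are ordinary data, adjusting how much an input counts is a one-line change, and the contribution breakdown can be handed to whoever needs to see the math.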

Open source analytics packages are increasingly the norm. R and Spark are two leading examples. Open source allows you to create, understand, modify, and audit analytics to match your specific needs and assure your stakeholders.