Data Science for Everyone (Part 2)

The next topic is mostly about data processing. It is important to understand that data processing and data science are two separate yet related disciplines. Data processing is critical to the maturation of data science.

We previously identified two separate classes of data-based decisions.

  1. “Discover” or understand data: This group requires fairly traditional approaches to data processing. Generally speaking, data have to be sourced from a wide variety of applications and/or systems. These data tend to arrive in a wide array of formats (though they are mostly structured), which makes them difficult to process. In the past, data warehouses were typically used for data discovery. Now, with Big Data, a wider variety of toolsets is available for data processing.
  2. Decisions that repeat: This type of decision requires a slightly different approach to data processing. Generally, reporting/monitoring and alerting tools are required and should be used for repeating decisions based on well-understood data. However, data warehouses, data lakes or other architectural approaches can be used as well. These types of decisions are also based on data in motion (as opposed to data at rest).
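To make the "decisions that repeat" idea a little more concrete, here is a minimal sketch of an alerting rule applied to data in motion. Everything in it is my own assumption for illustration: the `rolling_alerts` name, the window size, the threshold and the sample readings are hypothetical and not tied to any particular monitoring product.

```python
from collections import deque
from statistics import mean


def rolling_alerts(readings, window=3, threshold=100.0):
    """Yield (index, rolling_mean) whenever the rolling mean exceeds threshold.

    `readings` can be any iterable, including a generator fed by a live
    stream, so values are handled one at a time as they arrive.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` values
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window:  # wait until the first window is full
            avg = mean(recent)
            if avg > threshold:
                yield i, avg


# Hypothetical sample readings standing in for a live feed.
stream = [90, 95, 100, 120, 130, 80, 70]
alerts = list(rolling_alerts(stream))
for index, avg in alerts:
    print(f"alert at reading {index}: rolling mean {avg:.1f}")
```

The rule itself is deliberately simple. The point is that a repeating decision is encoded once and then evaluated continuously as new data arrive, rather than rediscovered each time.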

With this basic difference between data processing and data science in mind, it will be interesting to look at data science approaches and what can be done to fulfill the promise of purely data-based decision making.

I will summarize the data science segments (and a few solutions) in the next post. Stay tuned….

Data Science for Everyone (Part 1)

Data science seems like a brand new term, but it isn’t. We have always had data science, typically defined as the principles, processes and techniques used to understand the world around us through the analysis of data.

Data analysis does not necessarily result in decision making. So what do we need to do to become a data-driven decision-making organization? The first step is to understand what is generally involved in data science and data-driven decision making.

I would say that two groups of data-based decisions are generally identified:

  1. “Discover” or understand data: This group is often ignored or is not identified as a key element by most organizations. This probably comes from a place of hubris: “Well, we know our data well!” However, the new norm (given that more data are available) is to continuously discover data.
  2. Decisions that repeat: This group is a very popular candidate when it comes to data-driven decisions. Customer churn, for example, is an age-old problem that has haunted even the best marketers.

During the past few years, we have seen tremendous improvements in technology and the natural rise of “Big Data”. So how can we make use of these advances, think analytically at a massive scale and process giant volumes of data on a daily basis?

I will summarize the data processing challenge (and a few solutions) in the next post. Stay tuned….

Dashboard Design Principles using Jaspersoft

Jaspersoft is gaining ground rapidly, and as users get accustomed to using it on a daily basis, the problem of designing optimal dashboards and/or visualizations becomes urgent.

Having designed dashboards and other BI artifacts for a number of years, I have come to adopt a few simple fundamental principles that have helped me a great deal.

The five core principles are described below:

  1. Data complexity: It is important to identify the complexity of the data at the very beginning. The complexity of data usually depends on the source-of-record system as well as the use cases attached to the data. As an example, an accountant will understand accounting data (and KPIs) a lot more easily than the average Joe. So if you are designing a dashboard for data sourced from an accounting system, it is better to “simplify” the data for general consumption based on the user groups. This leads us directly to principle #2…
  2. User Expertise: You should make the users’ expertise with the data the primary yardstick for your dashboard design. I have often found that, depending on the end users’ expertise, even a simple combo chart with two Y axes can be difficult for some users to read. The user expertise problem is sometimes multiplied by the volume of data and the refresh frequency, which gets us to principle #3…
  3. Data Refresh: Providing timestamp context to users as you design the dashboards is fairly important. Most organizations would like to see data refreshed in real time or near real time BUT a key consideration is to determine WHO is monitoring the data refreshes and towards what end?
  4. Screen Resolution: Screen size and resolution should play a critical role in your considerations as well. I have seen requirements from customers where dashboards needed to be displayed on shop floors, in manufacturing plants, retail spaces etc. Clearly, 20-inch monitors would not work for these venues. Having access to more “real estate” makes our job of designing dashboards a little bit easier.
  5. Dashboard Delivery: Knowing the technology that you have to use for dashboard delivery is also important. Some technologies make it easier to distribute dashboards on mobile devices, while others are more geared towards desktop delivery.
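As a small illustration of the data refresh principle, here is a minimal sketch of how a dashboard could tell users how old the data are. This is not a Jaspersoft feature or API: the `refresh_label` helper, the 15-minute staleness cutoff and the label wording are all assumptions of mine, chosen only to show the idea.

```python
from datetime import datetime, timedelta, timezone


def refresh_label(last_refresh, now, stale_after=timedelta(minutes=15)):
    """Build a short caption giving users timestamp context for the data.

    `stale_after` is an assumed cutoff; pick one that matches how often
    someone actually monitors the dashboard.
    """
    age = now - last_refresh
    minutes = int(age.total_seconds() // 60)
    status = "STALE" if age > stale_after else "fresh"
    return (
        f"Data as of {last_refresh:%Y-%m-%d %H:%M} UTC "
        f"({minutes} min ago, {status})"
    )


# Hypothetical timestamps; a real dashboard would read these from its data source.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
label = refresh_label(now - timedelta(minutes=42), now)
print(label)
```

A caption like this answers the "who is monitoring and towards what end" question in a modest way: if nobody reacts to data that are a few minutes old, a slower and cheaper refresh schedule with a clear timestamp may be enough.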

Hope this provides you with a good starting point. Do not hesitate to reach out to us at [email protected] if you have further questions.

A Business Case for Embedded Analytics

We have often heard the term embedded analytics. It has gained more traction since 2008/2009. The term often refers to end users having access to reports/dashboards and deeper analytics right “where they need it”. In other words, context is highly important when it comes to operational analytics.

Imagine a customer service representative sitting in a customer contact center. The representative gets an email or a call from a customer. After confirming the identity of the customer (by customer ID, SSN or something else), the rep’s typical next step is to ascertain the reason for the contact.

Today, customer interaction histories are maintained “forever”. All of a sudden, there are a lot of data to sift through and understand:

  1. The customer interaction history: What has the customer bought in the past? What has he/she bought in the recent months? How valuable is this customer? What are some of the preferences the customer may have? Now imagine the same rep having to pull up another window on the browser, enter the customer “key” and generate the report that shows this interaction history. All this takes valuable time and affects customer experience.
  2. Customer predictions: This is harder to achieve in the context of the application(s) the rep may use as part of the contact center portfolio. What might the customer be calling about? Are there opportunities to upsell/cross-sell while the rep has the customer online?

These simple use cases should shine a bright light on the need for embedded analytics. Now imagine that the same rep “receives” a screen pop with all of the relevant information (past history and future predictions) as the call/email arrives at his/her desk.
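The screen pop described above can be sketched as a small function that assembles the interaction history and a prediction into one payload for the rep’s application. All names here (`Purchase`, `build_screen_pop`, the 90-day window) are hypothetical, and the upsell flag is a placeholder rule standing in for whatever predictive model a real deployment would call.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Purchase:
    item: str
    amount: float
    when: date


def build_screen_pop(customer_id, purchases, today, recent_days=90):
    """Assemble the information a rep would otherwise look up by hand."""
    recent = [p for p in purchases if (today - p.when).days <= recent_days]
    return {
        "customer_id": customer_id,
        # Rough proxy for "how valuable is this customer?"
        "lifetime_value": sum(p.amount for p in purchases),
        "recent_items": [p.item for p in recent],
        # Placeholder rule, not a real model: flag recently active customers.
        "suggest_upsell": len(recent) > 0,
    }


# Hypothetical history for one customer.
history = [
    Purchase("router", 120.0, date(2024, 1, 10)),
    Purchase("modem", 80.0, date(2023, 3, 1)),
]
pop = build_screen_pop("C-1001", history, today=date(2024, 2, 1))
print(pop)
```

The design point is that the payload is built when the call or email arrives, in the context of the rep’s own application, so no second browser window or manual report run is needed.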

The agility provided by embedded analytics is tremendous and can directly affect revenues, cost savings and customer retention. These types of use cases exist in all industry verticals, and we at DataNinjas have been implementing solutions that fit the need for a number of years.

Our go-to tool for delivering BI has been Jaspersoft and with our deep expertise, we have been able to deliver these solutions to our customers in a cost-effective way.

If you have additional questions/queries, do not hesitate to reach out to us at [email protected]