Analytics & Big Data Solution Offerings

Maximize the value of your data.


Analytics & Big Data value add

Vision & Strategy (Enabling)
Data Warehouse Modernization - Vision
  • Migration from a traditional DW to a multi-purpose, governed Data Lake, covering architecture, functionality and the organization’s adoption
  • High-Level roadmap
The Business Case supporting Data Lake and AI initiatives
  • Linking data lake and AI investments to business goals and objectives
  • Defining high-level platform and development costs, and the business benefits
Data Strategy
  • Governance, Management, Architecture, Security and Compliance
  • Processes, stakeholders, technologies, bimodal considerations, scalability and elasticity, lineage, quality, usage, logging, access rights, MDM, analytics, platforms, …
  • Priorities, target state, Vision & Roadmap
Data Governance & Architecture (Guiding)
Data Governance Strategy
  • Organization Data Policies covering access, availability, validity, integrity, compliance, privacy, coverage, lifecycle and data principles across multiple geographies
  • Data Management Framework providing guidance to stakeholders on processes and best practices
  • Mapping data to support the organization’s goals and objectives
Data Management
  • Operational data management using Policies, Principles and best practices
  • Data Quality Targets and monitoring to support Business goals and objectives
  • Master Data Management
Data Warehouse Modernization - Architecture & Governance
  • Data & Analytics Platform & Data Lake architecture
  • Data & Analytics Value Chain: Data Governance & Management
  • Migration process - Detailed roadmap
Master Data Management - Architecture & Governance
  • Target level of uniformity & accuracy, with stakeholders, for core data assets (customers, suppliers, products, etc.) with required processes and technologies
  • Identification of organization’s specific core data assets to master with contributing data sources
  • Strategy of a unique ‘golden record’ or multiple usage-driven perspectives for each core data asset
Data Lake Architecture - Definition of Data Lake zones to cover the various types of analytics, with bimodal usage, governance, data management, exploration and operationalization
Conceptual Enterprise Data Model (CEDM)
  • Enterprise wide and using business language
  • Organized in Data Domains and Sub Domains with Views
  • Business Dictionary with or without a Data Management framework
Dimensional Enterprise Data Model (DEDM)
  • Re-organization of the CEDM in Dimensions & Fact Tables
  • Enterprise-wide set of conformed dimensions and clear metrics
  • Used for both traditional DW & Datamarts and/or the consumption zone of a Data Lake
Data Quality - Subset of Data Management addressing integrity/accuracy, availability, accessibility, coverage, compliance and timeliness/currency
Data Warehouse & Data Mart Architecture - Data architecture, data modeling and governance covering the Data Warehouse and the derived Datamarts, in standard Inmon/Kimball fashion or a hybrid
Data Value Chain (Executing)
Data Capture & Management for Operations
Master Data Management - Execution
  • Design, QA and implementation of automated mastering processes for core data assets (including matching rules), plus manual processes/procedures defined with stakeholders (governor, steward and custodian)
  • MDM dashboard with roadmaps for addressing data issues
Collect & Organize for Analytics
IoT and other real-time Ingestion
  • The process of absorbing data in batches, micro-batches and real time using tools such as Kafka, NiFi, Storm, Spark Streaming, Sqoop, Pig, Oozie, ADF, Kinesis, Data Pipeline, …
  • Ingestion code Development, QA and deployment
  • Securing data acquisition including IoT device management layer
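To make the micro-batch ingestion idea concrete, here is a minimal, illustrative Python sketch (device names and payloads are hypothetical; production ingestion would rely on the tools listed above, such as Kafka or Spark Streaming):

```python
from typing import Iterable, Iterator, List

def micro_batches(stream: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Group an incoming record stream into micro-batches.

    Illustrative stand-in for what streaming tools do at scale:
    records are buffered until a batch is full, then handed off
    to the next stage of the pipeline.
    """
    batch: List[dict] = []
    for record in stream:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final partial batch
        yield batch

# Simulated IoT sensor readings (hypothetical payloads)
readings = [{"device": f"d{i}", "temp": 20 + i} for i in range(7)]
batches = list(micro_batches(readings, batch_size=3))
print([len(b) for b in batches])  # → [3, 3, 1]
```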
ETL migration to Hadoop/Spark
  • Migrate ETL code developed in tools that now support code generation towards Hadoop/Spark, avoiding a full rewrite
  • License conversion strategy (includes exit strategy)
  • Staging data through the various Data Lake zones (best practices) with parallel testing
Pattern-based ETL
  • ETL/Ingestion is a very significant portion of the costs of the overall Data & Analytics Value Chain
  • Pattern-based ETL/Ingestion aims at reducing these costs by up to 50% using patterns such as Data Vault, with technologies supporting pattern-based ETL generation
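A hedged sketch of the pattern-based idea: a small amount of metadata drives a code generator instead of hand-coding each ETL job. The example below generates a Data Vault-style hub-load statement; all table and column names are hypothetical:

```python
def generate_hub_load(hub: str, business_key: str, source_table: str) -> str:
    """Generate a Data Vault hub-load SQL statement from metadata.

    One template serves every hub: only the metadata changes,
    which is where the cost reduction of pattern-based ETL comes from.
    """
    return (
        f"INSERT INTO {hub} ({business_key}, load_date, record_source)\n"
        f"SELECT DISTINCT s.{business_key}, CURRENT_DATE, '{source_table}'\n"
        f"FROM {source_table} s\n"
        f"LEFT JOIN {hub} h ON h.{business_key} = s.{business_key}\n"
        f"WHERE h.{business_key} IS NULL"
    )

# Hypothetical hub and staging table names
sql = generate_hub_load("hub_customer", "customer_id", "staging_customers")
print(sql)
```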
Design & Implement a Data Lake with proper level of Governance
  • A Data Lake could be a throwaway used for one specific purpose with non-sensitive data, or it could be the persistent repository of all of an organization’s data plus external data, with varying degrees of sensitive data
  • In all variations between these two extremes, governance (including security, compliance and privacy) needs to be considered in bimodal fashion
ETL - A type of data integration that refers to the three steps (extract, transform, load) used to blend data from multiple sources. Extract, Load, Transform (ELT) is an alternate but related approach designed to push processing down to the database for improved performance
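The three ETL steps can be sketched end to end in a few lines; this is a minimal illustration using an in-memory SQLite target and made-up order records, not a production pipeline:

```python
import sqlite3

# Extract: pull raw rows (here, an in-memory stand-in for a source system)
raw_orders = [("o1", "  ACME ", 100.0), ("o2", "globex", 250.0)]

# Transform: clean and normalize BEFORE loading (classic ETL ordering)
transformed = [(oid, name.strip().title(), amt) for oid, name, amt in raw_orders]

# Load: write the blended result into the target database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id TEXT, customer TEXT, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", transformed)

rows = conn.execute("SELECT customer FROM orders ORDER BY order_id").fetchall()
print(rows)  # → [('Acme',), ('Globex',)]
```

In the ELT variant, the raw rows would be loaded first and the cleanup step would run as SQL inside the target database, exploiting its processing power.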
Analyze (incl. ML/DL/NLP & AI)
Descriptive - The interpretation of historical data to better understand changes that have happened in a business (the what). Descriptive analytics describes the past using a range of data to draw comparisons. Most commonly reported financial metrics are a product of descriptive analytics, e.g., year-over-year pricing changes, month-over-month sales growth, the number of users, or the total revenue per subscriber.
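A descriptive metric such as year-over-year growth reduces to simple arithmetic on historical data; the revenue figures below are hypothetical:

```python
# Yearly revenue (hypothetical figures) for a descriptive metric
revenue = {2021: 1_200_000, 2022: 1_380_000}

# Year-over-year growth: (current - prior) / prior
yoy_growth = (revenue[2022] - revenue[2021]) / revenue[2021]
print(f"YoY revenue growth: {yoy_growth:.1%}")  # → YoY revenue growth: 15.0%
```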
Diagnostic - The interpretation of historical data to understand why it happened (the why), usually relying on OLAP capabilities that allow slicing, drilling and pivoting. Additional capabilities, such as but not limited to market basket analysis and forecasting, have also been embedded in OLAP tools.
Predictive
  • Determining patterns and predicting future outcomes and trends (what will happen)
  • Using multiple techniques from data mining, statistics, modeling, machine learning and artificial intelligence, applied to existing data sets
Prescriptive - Recommending the best path forward based on multiple business constraints, objectives and metrics (what should I do), by applying mathematical and computational algorithms
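The prescriptive idea of recommending a best path under business constraints can be sketched with a toy optimization; real engagements would use mathematical programming solvers, and the products, profits and capacity below are invented for illustration:

```python
from itertools import product

# Hypothetical planning problem: how many units of products A and B to make
profit = {"A": 30, "B": 50}   # profit per unit
hours = {"A": 1, "B": 2}      # machine-hours per unit
capacity = 10                 # total machine-hours available

# Brute-force search over feasible plans (fine at this tiny scale;
# a real prescriptive engine would use linear/integer programming)
best_plan, best_profit = None, -1
for a, b in product(range(capacity + 1), repeat=2):
    if a * hours["A"] + b * hours["B"] <= capacity:
        p = a * profit["A"] + b * profit["B"]
        if p > best_profit:
            best_plan, best_profit = (a, b), p

print(best_plan, best_profit)  # → (10, 0) 300
```

The "recommendation" here (make only product A) falls out of comparing profit per constrained hour: A yields 30/hour versus 25/hour for B.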
AI & Big Data – Executive advice - This is a consulting mandate. The objective of this offering is to guide and educate CIOs and CDOs in their AI and Big Data endeavors. The target audience is organizations that have challenges positioning the variety of concepts, technologies, model frameworks, Hadoop distributions, cloud offerings, and the varying applications and specializations of AI, cognitive services, NLP, deep learning, and so on.
Rent a Data Scientist
  • Without Governance - using the expertise of a Data Scientist to build Models and Algorithms
  • With Governance - provide expertise of a Data Scientist to build Models and Algorithms, along with Data Management, Architecture, Governance and Project Management
Dashboards & Reports - Visualizing information in the form of dashboards and/or reports, including multi-dimensional views
Models & Algorithms - deployment in process/application
  • Embedding models & algorithms (for predicting, recommending or classifying) in processes or applications
  • Includes the management of all these analytics assets (lifecycle, versions,…)
Callable AI
  • Using APIs to call models and algorithms
  • KPI offers Models & Algorithms, available to our clients through one or more cloud providers via callable APIs
Monitoring of Data Ingestion and Integration - Via dashboards and alerting, monitors the ingestion and integration stages of the Data & Analytics Value Chain to report on the state of the analytics data
Monitoring of Model Performance - By reviewing the recommendations and predictions produced by the various models, this monitoring aims to detect model degradation in order to retire models or replace them with a new version or a brand-new model
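A minimal sketch of degradation detection, assuming a simple setup where each prediction can later be scored against the observed outcome (window size and threshold are illustrative choices, not fixed recommendations):

```python
from collections import deque

def drift_alert(outcomes, window=5, threshold=0.7):
    """Flag model degradation when rolling accuracy drops below a threshold.

    `outcomes` is a stream of 1 (prediction matched the observed result)
    and 0 (it did not). When the accuracy over the last `window`
    outcomes falls below `threshold`, the model is a candidate for
    retraining, retirement or replacement.
    """
    recent = deque(maxlen=window)
    for hit in outcomes:
        recent.append(hit)
        if len(recent) == window and sum(recent) / window < threshold:
            return True
    return False

# A healthy run, then a stream where accuracy decays
print(drift_alert([1, 1, 1, 1, 1, 1, 1, 1]))            # → False
print(drift_alert([1, 1, 1, 0, 1, 1, 0, 0, 0, 1]))      # → True
```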
Data & Analytics Platforms (Providing Capabilities)
Data & Analytics Platforms technical architecture (includes multi-cloud and/or on-premises)
  • Intended for various Data and Analytics platforms in a context of multi-clouds and/or on premises
  • Includes hyperconverged infrastructure covering the distributed data plane and a management plane (Hortonworks DataPlane, Nutanix, VMware, …). Also includes SDLC considerations
  • Elastic processing capabilities, GPU based processing, containerized AI frameworks, …
Data & Analytics Platforms recommendations
  • The quantity of data & analytics platforms and their packaging is constantly increasing:
    o PaaS combinations are multiplying, and the same is happening for SaaS
    o Integrated analytics (Salesforce, SAP, …) is becoming more common
  • KPI Digital maintains a Data & Analytics Platforms technology watch
Hadoop/Spark distribution Install
  • A multi-purpose, governed and secured Hadoop/Spark cluster: 6+ months of effort
  • Single use Hadoop Cluster with non-sensitive data: a few days
Data & Analytics Platforms sizing
  • Sizing a governed Data Lake implies understanding the data lake zones, replication strategy, compression used, ratio of processing to storage, special nodes, edge and admin node ratios, elasticity, scalability, Kappa/Lambda architecture, …
  • Translation of the above into a number of nodes for dev/QA and production
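The translation from raw data volume to a node count can be sketched as back-of-the-envelope arithmetic; every parameter below (replication factor, compression ratio, usable storage per node, growth factor) is an illustrative assumption that a real sizing exercise would refine:

```python
import math

def data_nodes_needed(raw_tb, replication=3, compression_ratio=0.4,
                      usable_tb_per_node=20, growth_factor=1.5):
    """Back-of-the-envelope data-node count for a governed Data Lake.

    Raw volume is compressed, replicated (HDFS-style 3x by default)
    and padded for growth, then divided by the usable storage per
    worker node. Edge, admin and other special nodes are added on top.
    """
    stored_tb = raw_tb * compression_ratio * replication * growth_factor
    return math.ceil(stored_tb / usable_tb_per_node)

# 500 TB raw → 500 * 0.4 * 3 * 1.5 = 900 TB stored → 45 nodes
print(data_nodes_needed(500))  # → 45
```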
Big Data – Architecture (including Lambda, Kappa, …) - Adopting Kappa versus Lambda, or other variations, has immediate effects on the overall Big Data architecture. For example, the Lambda approach uses a 3-layer mindset: a speed layer, a batch layer, and a serving layer that uses the batch and speed layers to support queries with low latency
Securing a Hadoop/Spark platform - Securing a Hadoop cluster includes: Kerberos, encryption at rest/in memory/in motion, access profiles, authentication, zones, remote VMs, pentests, key management, metadata management, logging, policies, …
Training - KPI currently offers training in Cognos and will expand to cover additional areas of the KPI Analytics & Big Data Offering