Khiops © software for data mining

Orange Labs software for mining large databases

Predictive analytics software built natively for big and fast data

Presentation

Domain & Definition

Because proper knowledge of customers' behavior and market dynamics is now essential to master business opportunities, the use of efficient data mining solutions has become a key factor of success in all economic fields.

Data preparation is a very important stage of the data mining process, but it is often carried out manually and requires the involvement of skilled statisticians.

Until now, it has been very difficult to find software that speeds up both information analysis and the preparation of data before modeling.

Highlights

Orange Labs has developed a data mining solution called Khiops that automates the descriptive and exploratory phases of data analysis, as well as the preparation of data before the modeling phase in supervised classification.

Khiops makes it easier to analyze the predictive value of the variables, a phase that can take up to 70% of the time spent on building a predictive model.

The scoring phase, which starts once data preparation is complete, builds an efficient predictive model by combining the information carried by all the descriptive variables. This model can be deployed to score new data.

The Khiops solution includes the following components:

  • Khiops: back-end, main data preparation and scoring software
  • Khiops Visualization: to visualize results from the Khiops component
  • Khiops Coclustering: back-end, to analyze correlations between variables using hierarchical coclustering
  • Khiops Covisualization: to visualize results from the Khiops Coclustering component

Advantages

Comparative benchmarks carried out with Khiops showed the high quality of its discretizations and value groupings on all of the following criteria:

  • Accuracy of prediction
  • Robustness (ratio of the accuracy between test and training)
  • Strong resistance to noise
  • High explanatory capacity (low number of intervals or groups)
  • Computational efficiency
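
To make the robustness criterion concrete, here is a minimal Python sketch that computes it for a toy example. It assumes the ratio is taken as test accuracy divided by training accuracy, so that a value close to 1 means the model performs about as well on unseen data as on the data it was trained on. The labels and predictions are made up for illustration and are not benchmark results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def robustness(train_accuracy, test_accuracy):
    """Test accuracy divided by training accuracy (assumed direction)."""
    if train_accuracy <= 0:
        raise ValueError("training accuracy must be positive")
    return test_accuracy / train_accuracy


# Made-up labels and predictions, for illustration only.
train_true = ["yes", "no", "yes", "no", "yes"]
train_pred = ["yes", "no", "yes", "no", "no"]
test_true = ["yes", "no", "no", "yes"]
test_pred = ["yes", "no", "yes", "no"]

train_acc = accuracy(train_true, train_pred)  # 4 of 5 correct
test_acc = accuracy(test_true, test_pred)     # 2 of 4 correct
ratio = robustness(train_acc, test_acc)
print(f"train={train_acc:.2f} test={test_acc:.2f} robustness={ratio:.3f}")
```

In this toy case the ratio is well below 1, which is the signature of a model that has over-fitted its training data.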

Khiops has already been applied to problems with up to tens of thousands of variables and hundreds of thousands of instances, and has proved its ability to find the most discriminating variables.

The most valuable characteristic of the data preparation method probably lies in the robust explanation of the data that it provides: the method builds the most probable discretization-based, grouping-based or coclustering-based explanation of the data.
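
As an illustration of what a discretization-based explanation looks like, the following Python sketch splits a numeric variable into intervals and counts the class labels in each one. It is only a sketch under stated assumptions: the interval boundaries are chosen by hand here, whereas Khiops selects them automatically by a method this page does not describe. The function name `discretize`, the variable names and the data are all hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def discretize(values, labels, bounds):
    """Count class labels per interval defined by sorted boundaries.

    Interval 0 holds values below bounds[0]; interval i holds values v
    with bounds[i-1] <= v < bounds[i]; the last interval is open-ended.
    """
    counts = [Counter() for _ in range(len(bounds) + 1)]
    for v, label in zip(values, labels):
        counts[bisect_right(bounds, v)][label] += 1
    return counts


# Made-up customer ages and churn labels, for illustration only.
ages = [22, 25, 31, 38, 45, 52, 58, 63]
churn = ["yes", "yes", "yes", "no", "no", "no", "no", "yes"]
bounds = [35, 60]  # hand-picked boundaries (an assumption, not Khiops output)

table = discretize(ages, churn, bounds)
for index, interval_counts in enumerate(table):
    print(index, dict(interval_counts))
```

A small number of intervals with clearly different class distributions is what makes such a summary easy to read, which is the sense of "high explanatory capacity" in the criteria above.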

Benefits to users

Khiops saves time in the construction of forecasting models, as it performs data crunching and data scoring. It helps search for discriminating variables on very large data samples, analyze the predictive value of attributes, and produce statistical reports that can easily be exported by copy-paste. It also includes a fully automatic tool for building scoring models, which has proved to be very efficient and is very quick, even on very large datasets.

Khiops Coclustering is able to detect highly informative patterns by means of hierarchical coclustering models, suitable for the task of explanatory analysis. This novel type of statistical analysis provides insights in many domains, such as:

  • Market analysis: clusters of customers versus clusters of products
  • Text corpus analysis: clusters of texts versus clusters of words
  • Web log analysis: clusters of cookies versus clusters of web pages
  • Graph analysis: clusters of source versus target nodes
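
To show what "clusters of customers versus clusters of products" means in practice, the Python sketch below aggregates individual purchases into counts per pair of clusters. It does not perform the coclustering itself: the cluster assignments are written by hand, whereas Khiops Coclustering finds them automatically. All names (`block_counts`, the customers, products and cluster labels) are hypothetical.

```python
from collections import Counter

# Made-up (customer, product) purchases, for illustration only.
purchases = [
    ("alice", "tent"), ("alice", "boots"),
    ("bob", "boots"), ("bob", "stove"),
    ("carol", "novel"), ("carol", "comic"),
    ("dave", "novel"), ("dave", "tent"),
]

# Hand-written cluster assignments (an assumption, not Khiops output).
customer_cluster = {
    "alice": "outdoor", "bob": "outdoor",
    "carol": "readers", "dave": "readers",
}
product_cluster = {
    "tent": "camping", "boots": "camping", "stove": "camping",
    "novel": "books", "comic": "books",
}


def block_counts(pairs, row_cluster, col_cluster):
    """Count observations falling in each (row cluster, column cluster) cell."""
    return Counter((row_cluster[r], col_cluster[c]) for r, c in pairs)


blocks = block_counts(purchases, customer_cluster, product_cluster)
for (customers, products), count in sorted(blocks.items()):
    print(f"{customers:8s} x {products:8s}: {count}")
```

When the clusters are well chosen, most of the observations concentrate in a few cells of this table, and those dense cells are the patterns an analyst reads.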

Who is it for?

This application is licensed to companies and organizations that believe the availability of clean, formatted and ready-to-use information is crucial to a rich and successful data mining process, in sectors such as:

  • Retail
  • Telecom, Water, Energy
  • Banking, Financial
  • Airlines
  • Government/Administrations
  • Health
  • Pharmaceuticals
  • Education/Research

Screenshots


Prepare and Learn

Build informative features and fit a reliable model to the data

Score

Apply a model to new data and embed the result into your decision process

  • “Khiops is an excellent automatic tool for predictive model creation. Its optimized algorithm avoids usual over-fitting issues present in other tools. Along with an interactive user interface, its performance is proven when the number of attributes or the volume of data to score is important.”

    Tanguy Le Nouvel / Data Scientist, Customer Intelligence and Data Mining Manager / Micropole

  • “We were very impressed by the speed with which your software was able to build the model and make predictions, we had tried other software(s) on this problem and the results were roughly the same, but it took orders of magnitude more time to build the model. We were also equally impressed by the ease of installation/deployment demonstrated by your software.”

    Mukund Deshpande / AVP, Head of BI/Analytics Business Unit / Persistent
