Java Data Mining: Strategy, Standard, and Practice: A Practical Guide for Architecture, Design, and Implementation

Mark F. Hornick, Erik Marcadé, Sunil Venkayala




Whether you are a software developer, systems architect, data analyst, or business analyst, if you want to take advantage of data mining when developing advanced analytic applications, Java Data Mining (JDM) is a key solution component. JDM is the new standard now implemented in core DBMS and data mining/analysis software. This book is the essential guide to using the JDM standard interface, written by contributors to the JDM standard.

The book discusses and illustrates how to solve real problems using the JDM API. The authors provide you with:

  • Data mining introduction: an overview of data mining and the problems it can address across industries; JDM’s place in strategic solutions to data mining-related problems;
  • JDM essentials: concepts, design approach, and design issues, with detailed code examples in Java; a Web Services interface to enable JDM functionality in an SOA environment; and an illustration of the JDM XML Schema for JDM objects;
  • JDM in practice: the use of JDM from vendor implementations and approaches to customer applications, integration, and usage; the impact of data mining on IT infrastructure; a how-to guide for building applications that use the JDM API;
  • Free, downloadable KJDM source code referenced in the book.

    Table of Contents

    Guide to Readers

    Part I - Strategy
    1. Overview of Data Mining
    1.1. Why is data mining relevant today?
    1.2. Introducing Data Mining
    1.3. The Value of Data Mining
    1.4. Summary
    1.5. References

    2. Solving Problems in Industry
    2.1. Cross-industry data mining solutions
    2.2. Data Mining in Industries
    2.3. Summary
    2.4. References

    3. Data Mining Process
    3.1. A standardized data mining process
    3.2. Data Analysis and Preparation…a more detailed view
    3.3. Data mining modeling, analysis, and scoring processes
    3.4. The Role of databases and data warehouses in Data Mining
    3.5. Data mining in enterprise software architectures
    3.6. Advances in automated data mining
    3.7. Summary
    3.8. References

    4. Mining Functions and Algorithms
    4.1. Data mining functions
    4.2. Classification
    4.3. Regression
    4.4. Attribute Importance
    4.5. Association
    4.6. Clustering
    4.7. Summary
    4.8. References

    5. JDM Strategy
    5.1. What is the JDM strategy?
    5.2. Role of Standards
    5.3. Summary
    5.4. References

    6. Getting Started
    6.1. Business Understanding
    6.2. Data Understanding
    6.3. Data Preparation
    6.4. Modeling
    6.5. Evaluation
    6.6. Deployment
    6.7. Summary
    6.8. References

    Part II - Standard
    7. Java Data Mining Concepts
    7.1. Classification problem
    7.2. Regression problem
    7.3. Attribute importance
    7.4. Association rules problem
    7.5. Clustering problem
    7.6. Summary
    7.7. References

    8. Design of the JDM API
    8.1. Object Modeling of Data Mining Concepts
    8.2. Modular Packages
    8.3. Connection Architecture
    8.4. Object Factories
    8.5. URI for Datasets
    8.6. Enumerated Types
    8.7. Exceptions
    8.8. Discovering DME Capabilities
    8.9. Summary
    8.10. References

    9. Using the JDM API
    9.1. Connection Interfaces
    9.2. Using JDM Enumerations
    9.3. Using data specification interfaces
    9.4. Using classification interfaces
    9.5. Using Regression interfaces
    9.6. Using Attribute Importance interfaces
    9.7. Using Association interfaces
    9.8. Using Clustering interfaces
    9.9. Summary
    9.10. References

    10. XML Schema
    10.1. Overview
    10.2. Schema Elements
    10.3. Schema Types
    10.4. Using PMML with the JDM Schema
    10.5. Use cases for JDM XML Schema and Documents
    10.6. Summary
    10.7. References

    11. Web Services
    11.1. What is a Web Service?
    11.2. Service Oriented Architecture (SOA)
    11.3. JDM Web Service (JDMWS)
    11.4. Enabling JDM Web Services using JAX-RPC
    11.5. Summary
    11.6. References

    Part III - Practice
    12. Practical Problem Solving
    12.1. Business Scenario 1: Targeted Marketing Campaign
    12.2. Business Scenario 2: Understanding Key Factors
    12.3. Business Scenario 3: Using Customer Segmentation
    12.4. Summary
    12.5. Bibliography

    13. Building Data Mining Tools using JDM
    13.1. Data mining tools
    13.2. Administrative Console
    13.3. User Interface to build and save a model
    13.4. User Interface to test model quality
    13.5. Summary

    14. Getting Started with JDM Web Services
    14.1. A Web Service client in PHP
    14.2. A Web Service client in Java
    14.3. Summary
    14.4. References

    15. Impacts on IT Infrastructure
    15.1. What does Data Mining require from IT?
    15.2. Impacts on computing hardware
    15.3. Impacts on data storage hardware
    15.4. Data access
    15.5. Backup and recovery
    15.6. Scheduling
    15.7. Workflow
    15.8. Summary
    15.9. References

    16. Vendor implementations
    16.1. Oracle Data Mining
    16.2. KXEN (Knowledge eXtraction ENgines)
    16.3. Process for new Vendors
    16.4. Process for new JDM users
    16.5. Summary
    16.6. References

    Part IV - Wrapping Up
    17. Evolution of Data Mining Standards
    17.1. Data Mining Standards
    17.2. Java Community Process
    17.3. Why so many standards?
    17.4. Where data mining standards have been and where will they go?
    17.5. Directions for data mining standards
    17.6. Summary
    17.7. References

    18. Preview of Java Data Mining 2.0
    18.1. Transformations
    18.2. Time Series
    18.3. Apply for Association
    18.4. Feature Extraction
    18.5. Statistics
    18.6. Multi-target Models
    18.7. Text Mining
    18.8. Summary
    18.9. References

    19. Summary

    App. A. Further Reading
    App. B. Glossary