Data Mining

Publication date: 2001-5  Publisher: Higher Education Press  Author: Jiawei Han  Pages: 550  Word count: 762000

Preface

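At the end of the 20th century, information science and technology, represented by computing and communications, had a profound effect on the world's economy, military affairs, science and technology, education, culture, and health care. The information industry that grew out of it has become a pillar of global economic development. Entering the 21st century, countries have increased funding and policy support to accelerate their own information industries.

To speed up the development of China's information industry, the Outline of the Tenth Five-Year Plan for National Economic and Social Development explicitly calls for "using informatization to drive industrialization, exploiting the latecomer advantage, and achieving leapfrog development of social productive forces." International competition in the information industry will grow increasingly fierce. Following China's accession to the WTO, its information industry will face serious challenges from foreign competitors. Success or failure in that competition will ultimately depend on the number and quality of people trained in information science and technology.

At the end of the 20th century, China's information industry developed rapidly, but a large gap remained between it and those of the leading countries. To catch up with and surpass the advanced international level, China must accelerate the training of information technology professionals, and in particular must train a large number of high-level professionals who can compete internationally, so as to raise the overall level of its information industry and of national informatization. To this end, the Department of Higher Education of the Ministry of Education, acting on the views of Vice Minister Lü Fuyuan and building on its long-standing emphasis on the teaching of information science and technology in higher education, will pursue a forward-looking development strategy and take a number of important measures to accelerate teaching in information science and technology and related disciplines at higher education institutions. While continuing to publicize and recommend textbooks written by Chinese experts for information science and technology courses, both those oriented toward the 21st century and the key textbooks of the Ninth Five-Year Plan, it is promoting the use of photo-reprint editions of outstanding foreign textbooks for English or bilingual teaching in selected information science and technology courses at institutions where conditions permit. The aim is to narrow the gap between computer education in China and the advanced international level, and also to strengthen the English proficiency of Chinese university students.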

Summary

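This book explains the concepts, methods, and applications of data mining (often called knowledge discovery in databases). Starting from an emphasis on data analysis, it introduces the concepts of databases and data mining, describes data mining as the automated or convenient extraction of patterns representing knowledge from large databases, data warehouses, and other large information repositories, and reviews the products currently available on the market within a general framework.

Data mining is an interdisciplinary field that draws on database technology, artificial intelligence, machine learning, neural networks, statistics, pattern recognition, knowledge-based systems, knowledge acquisition, information retrieval, high-performance computing, and data visualization. Taking a database perspective, the book describes the primitives, architectures, characteristics, and methods of data mining systems, with particular attention to the feasibility, usefulness, effectiveness, and scalability of pattern discovery in large databases.

Chapter by chapter, the book covers the concepts and techniques of classification, prediction, association, and clustering. Each topic comes with examples, the best algorithms for each class of problem, and rules of thumb, tested in practice, for applying the techniques. This style of presentation makes the book highly readable and lets readers learn the field of data mining and follow the latest industry developments. The book is suitable for computer science students, application developers, business professionals, and researchers in related fields.

Contents: 1. Introduction to Data Mining 2. Data Warehouse and OLAP Technology for Data Mining 3. Data Preprocessing 4. Data Mining Primitives, Languages, and System Architectures 5. Concept Description: Characterization and Comparison 6. Mining Association Rules in Large Databases 7. Classification and Prediction 8. Cluster Analysis 9. Mining Complex Types of Data 10. Applications and Trends in Data Mining Appendix A. Introduction to Microsoft's OLE DB for Data Mining Appendix B. An Introduction to DBMiner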

About the Author

Jiawei Han is director of the Intelligent Database Systems Research Laboratory and professor in the School of Computing Science at Simon Fraser University. Well known for his research in the areas of data mining and database systems, he has served on progr

Table of Contents

Foreword
Preface
Chapter 1 Introduction
 1.1 What Motivated Data Mining? Why Is It Important?
 1.2 So, What Is Data Mining?
 1.3 Data Mining-On What Kind of Data?
 1.4 Data Mining Functionalities-What Kinds of Patterns Can Be Mined?
 1.5 Are All of the Patterns Interesting?
 1.6 Classification of Data Mining Systems
 1.7 Major Issues in Data Mining
 1.8 Summary
 Exercises
 Bibliographic Notes
Chapter 2 Data Warehouse and OLAP Technology for Data Mining
 2.1 What Is a Data Warehouse?
 2.2 A Multidimensional Data Model
 2.3 Data Warehouse Architecture
 2.4 Data Warehouse Implementation
 2.5 Further Development of Data Cube Technology
 2.6 From Data Warehousing to Data Mining
 2.7 Summary
 Exercises
 Bibliographic Notes
Chapter 3 Data Preprocessing
 3.1 Why Preprocess the Data?
 3.2 Data Cleaning
 3.3 Data Integration and Transformation
 3.4 Data Reduction
 3.5 Discretization and Concept Hierarchy Generation
 3.6 Summary
 Exercises
 Bibliographic Notes
Chapter 4 Data Mining Primitives, Languages, and System Architectures
Chapter 5 Concept Description: Characterization and Comparison
Chapter 6 Mining Association Rules in Large Databases
Chapter 7 Classification and Prediction
Chapter 8 Cluster Analysis
Chapter 9 Mining Complex Types of Data
Chapter 10 Applications and Trends in Data Mining
Appendix A Introduction to Microsoft's OLE DB for Data Mining
Appendix B An Introduction to DBMiner
Bibliography
Index

Chapter Excerpt

In this section, we examine a number of different data stores on which mining can be performed. In principle, data mining should be applicable to any kind of information repository. This includes relational databases, data warehouses, transactional databases, advanced database systems, flat files, and the World Wide Web. Advanced database systems include object-oriented and object-relational databases, and specific application-oriented databases, such as spatial databases, time-series databases, text databases, and multimedia databases. The challenges and techniques of mining may differ for each of the repository systems.

Although this book assumes that readers have primitive knowledge of information systems, we provide a brief introduction to each of the major data repository systems listed above. In this section, we also introduce the fictitious AllElectronics store, which will be used to illustrate concepts throughout the text.

A database system, also called a database management system (DBMS), consists of a collection of interrelated data, known as a database, and a set of software programs to manage and access the data. The software programs involve mechanisms for the definition of database structures; for data storage; for concurrent, shared, or distributed data access; and for ensuring the consistency and security of the information stored, despite system crashes or attempts at unauthorized access.

A relational database is a collection of tables, each of which is assigned a unique name. Each table consists of a set of attributes (columns or fields) and usually stores a large set of tuples (records or rows). Each tuple in a relational table represents an object identified by a unique key and described by a set of attribute values. A semantic data model, such as an entity-relationship (ER) data model, which models the database as a set of entities and their relationships, is often constructed for relational databases.

Consider the following example.
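The book's own example is not reproduced in this excerpt. As a stand-in, the relational model described above can be sketched in Python with the standard-library `sqlite3` module. The `customer` table, its columns, the sample rows, and the helper functions are all hypothetical and chosen only for illustration; none of them come from the book.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# A table has a unique name and a fixed set of attributes (columns).
# The schema below is an assumption made for this illustration.
conn.execute(
    """
    CREATE TABLE customer (
        cust_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        city    TEXT,
        income  REAL
    )
    """
)

# Each tuple (row) describes one object through its attribute values
# and is identified by the unique key cust_id.
rows = [
    (1, "Ann Lee", "Vancouver", 52000.0),
    (2, "Bo Chen", "Burnaby", 41000.0),
    (3, "Cy Diaz", "Vancouver", 67000.0),
]
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", rows)


def customers_in(city):
    """Return (cust_id, name) for every customer tuple in the given city."""
    cur = conn.execute(
        "SELECT cust_id, name FROM customer WHERE city = ? ORDER BY cust_id",
        (city,),
    )
    return cur.fetchall()


def key_is_unique():
    """Return True if the DBMS rejects a second tuple with an existing key."""
    try:
        conn.execute("INSERT INTO customer VALUES (1, 'Duplicate', 'X', 0.0)")
    except sqlite3.IntegrityError:
        return True
    return False


print(customers_in("Vancouver"))
print(key_is_unique())
```

The primary-key constraint is what lets the DBMS enforce the "identified by a unique key" property mentioned in the excerpt: the duplicate insert is refused rather than silently stored.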


User Reviews (1 total)

 
 

  •   The cover is in Chinese, but the content is in English. Baffling.
 
