Soumen Chakrabarti (India) / Posts & Telecom Press (人民邮电出版社) / 2009-02 / Paperback
Price: ¥4.00
Condition: 90% new (九品)
Listed on: 2024-05-18
Mining the Web: Discovering Knowledge from Hypertext Data (Chinese title: Web数据挖掘:超文本数据的知识发现)
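This book is a classic in the field of information retrieval. It explains in depth the techniques for extracting and generating knowledge from large volumes of unstructured Web data. It first covers the foundations of the Web (including Web crawling, Web indexing, and keyword-based or similarity-based search), then systematically describes the fundamentals of Web mining, with emphasis on hypertext-based machine learning and data mining methods such as clustering, collaborative filtering, supervised learning, and semisupervised learning, and finally discusses how these principles are applied in Web mining. The book gives readers a solid technical background and up-to-date knowledge.
The book is an ideal reference for professionals engaged in data mining research and development, and is also suitable as a textbook for graduate students in computer science and related disciplines.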
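Soumen Chakrabarti is a well-known expert in Web search and mining and an associate editor of ACM Transactions on the Web. He received his PhD from the University of California, Berkeley, and is currently an associate professor in the Department of Computer Science and Engineering at the Indian Institute of Technology. He previously worked at the IBM Almaden Research Center on hypertext databases and data mining. He has extensive hands-on project experience, has built several Web mining systems, and holds several US patents.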
1 INTRODUCTION
  1.1 Crawling and Indexing
  1.2 Topic Directories
  1.3 Clustering and Classification
  1.4 Hyperlink Analysis
  1.5 Resource Discovery and Vertical Portals
  1.6 Structured vs. Unstructured Data Mining
  1.7 Bibliographic Notes
PART Ⅰ INFRASTRUCTURE
2 CRAWLING THE WEB
  2.1 HTML and HTTP Basics
  2.2 Crawling Basics
  2.3 Engineering Large-Scale Crawlers
    2.3.1 DNS Caching, Prefetching, and Resolution
    2.3.2 Multiple Concurrent Fetches
    2.3.3 Link Extraction and Normalization
    2.3.4 Robot Exclusion
    2.3.5 Eliminating Already-Visited URLs
    2.3.6 Spider Traps
    2.3.7 Avoiding Repeated Expansion of Links on Duplicate Pages
    2.3.8 Load Monitor and Manager
    2.3.9 Per-Server Work-Queues
    2.3.10 Text Repository
    2.3.11 Refreshing Crawled Pages
  2.4 Putting Together a Crawler
    2.4.1 Design of the Core Components
    2.4.2 Case Study: Using w3c-libwww
  2.5 Bibliographic Notes
3 WEB SEARCH AND INFORMATION RETRIEVAL
  3.1 Boolean Queries and the Inverted Index
    3.1.1 Stopwords and Stemming
    3.1.2 Batch Indexing and Updates
    3.1.3 Index Compression Techniques
  3.2 Relevance Ranking
    3.2.1 Recall and Precision
    3.2.2 The Vector-Space Model
    3.2.3 Relevance Feedback and Rocchio's Method
    3.2.4 Probabilistic Relevance Feedback Models
    3.2.5 Advanced Issues
  3.3 Similarity Search
    3.3.1 Handling "Find-Similar" Queries
    3.3.2 Eliminating Near Duplicates via Shingling
    3.3.3 Detecting Locally Similar Subgraphs of the Web
  3.4 Bibliographic Notes
PART Ⅱ LEARNING
4 SIMILARITY AND CLUSTERING
  4.1 Formulations and Approaches
    4.1.1 Partitioning Approaches
    4.1.2 Geometric Embedding Approaches
    4.1.3 Generative Models and Probabilistic Approaches
  4.2 Bottom-Up and Top-Down Partitioning Paradigms
    4.2.1 Agglomerative Clustering
    4.2.2 The k-Means Algorithm
  4.3 Clustering and Visualization via Embeddings
    4.3.1 Self-Organizing Maps (SOMs)
    4.3.2 Multidimensional Scaling (MDS) and FastMap
    4.3.3 Projections and Subspaces
    4.3.4 Latent Semantic Indexing (LSI)
  4.4 Probabilistic Approaches to Clustering
    4.4.1 Generative Distributions for Documents
    4.4.2 Mixture Models and Expectation Maximization (EM)
    4.4.3 Multiple Cause Mixture Model (MCMM)
    4.4.4 Aspect Models and Probabilistic LSI
    4.4.5 Model and Feature Selection
  4.5 Collaborative Filtering
    4.5.1 Probabilistic Models
    4.5.2 Combining Content-Based and Collaborative Features
  4.6 Bibliographic Notes
5 SUPERVISED LEARNING
  5.1 The Supervised Learning Scenario
  5.2 Overview of Classification Strategies
  5.3 Evaluating Text Classifiers
    5.3.1 Benchmarks
    5.3.2 Measures of Accuracy
  5.4 Nearest Neighbor Learners
    5.4.1 Pros and Cons
    5.4.2 Is TFIDF Appropriate?
  5.5 Feature Selection
    5.5.1 Greedy Inclusion Algorithms
    5.5.2 Truncation Algorithms
    5.5.3 Comparison and Discussion
  5.6 Bayesian Learners
    5.6.1 Naive Bayes Learners
    5.6.2 Small-Degree Bayesian Networks
  5.7 Exploiting Hierarchy among Topics
    5.7.1 Feature Selection
    5.7.2 Enhanced Parameter Estimation
    5.7.3 Training and Search Strategies
  5.8 Maximum Entropy Learners
  5.9 Discriminative Classification
    5.9.1 Linear Least-Square Regression
    5.9.2 Support Vector Machines
  5.10 Hypertext Classification
    5.10.1 Representing Hypertext for Supervised Learning
    5.10.2 Rule Induction
  5.11 Bibliographic Notes
6 SEMISUPERVISED LEARNING
  6.1 Expectation Maximization
    6.1.1 Experimental Results
    6.1.2 Reducing the Belief in Unlabeled Documents
    6.1.3 Modeling Labels Using Many Mixture Components
……
PART Ⅲ APPLICATIONS