Publisher: O’Reilly Media
Language: English
ISBN: 0596521979
Paperback: 524 pages
Data: Jun 2009
Format: PDF
Description: Organizations large and small are adopting Apache Hadoop to deal with huge application datasets. Hadoop: The Definitive Guide provides you with the key for unlocking the wealth this data holds. Hadoop is ideal for storing and processing massive amounts of data, but until now, information on this open source project has been lacking — especially with regard to best practices. This comprehensive resource demonstrates how to use Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters. This book helps you: Use the Hadoop Distributed File System (HDFS) for storing large datasets, and running distributed computations over those datasets, using MapReduce Become familiar with Hadoop’s data and IO building blocks for compression, data integrity, serialization, and persistence Discover common pitfalls and advanced features for writing real-world MapReduce programs Use Pig, a high-level query language for large-scale data processing Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud Use HBase, Hadoop’s database for structured and semi-structured data
Download & Buy:
Google Book
Amazon
0 comments
Post a Comment