Author Pasupuleti, Pradeep.

Title Pig design patterns : simplify Hadoop programming to create complex end-to-end enterprise big data solutions with Pig / Pradeep Pasupuleti.

Publication Info. Birmingham, UK : Packt Pub., [2014]
©2014

Description 1 online resource (1 volume) : illustrations.
text file
Series Community experience distilled
Note Includes index.
Contents Cover; Copyright; Credits; Foreword; About the Author; Acknowledgments; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Setting the Context for Design Patterns in Pig; Understanding design patterns; The scope of design patterns in Pig; Hadoop demystified -- a quick reckoner; The enterprise context; Common challenges of distributed systems; The advent of Hadoop; Hadoop under the covers; Understanding the Hadoop Distributed File System; HDFS design goals; Working of HDFS; Understanding MapReduce; Understanding how MapReduce works; The MapReduce internals.
Pig -- a quick intro; Understanding the rationale of Pig; Understanding the relevance of Pig in the enterprise; Working of Pig -- an overview; Firing up Pig; The use case; Code listing; The dataset; Understanding Pig through the code; Pig's extensibility; Operators used in code; The EXPLAIN operator; Understanding Pig's data model; Primitive types; Complex types; Summary; Chapter 2: Data Ingest and Egress Patterns; The context of data ingest and egress; Types of data in the enterprise; Ingest and egress patterns for multistructured data; Considerations for log ingestion.
The Apache log ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The Custom log ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The image ingress and egress pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for the NoSQL data; MongoDB ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results.
Additional information; The HBase ingress and egress pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for structured data; The Hive ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; The ingress and egress patterns for semi-structured data; The mainframe ingestion pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; XML ingest and egress patterns.
Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; JSON ingress and egress patterns; Background; Motivation; Use cases; Pattern implementation; Code snippets; Results; Additional information; Summary; Chapter 3: Data Profiling Patterns; Data profiling for Big Data; Big Data profiling dimensions; Sampling considerations for profiling Big Data; Sampling support in Pig; Rationale for using Pig in data profiling; The data type inference pattern; Background; Motivation; Use cases; Pattern implementation; Code snippets; Pig script; Java UDF.
Summary Pig makes Hadoop programming simple, intuitive, and fun to work with. It removes the complexity from MapReduce programming by giving the programmer immense power through its flexibility. What used to be extremely lengthy and intricate code written in other high-level languages can now be written in almost one-tenth of the size using Pig's easy-to-understand constructs. Pig has proven to be the easiest way to learn how to program Hadoop clusters, as evidenced by its widespread adoption. This comprehensive guide enables readers to readily use design patterns to simplify the creation of complex da.
Local Note eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - North America
Subject Apache Hadoop.
Big data -- Data processing.
Big data.
Genre/Form Electronic books.
Other Form: Print version: Pasupuleti, Pradeep. Pig Design Patterns. Birmingham : Packt Publishing, ©2014 9781783285556
ISBN 9781783285563 (electronic book)
1783285567 (electronic book)
1783285559
9781783285556