Title Apache Flume: distributed log collection for Hadoop : design and implement a series of Flume agents to send streamed data into Hadoop / Steve Hoffman.

Publication Info.

Birmingham, UK : Packt Publishing, 2015.

Online access
Online eBook via EBSCO. Access restricted to current Rider University students, faculty, and staff.

Edition	Second edition.
Description	1 online resource (1 volume) : illustrations.
	text file
Series	Community experience distilled
Note	Includes index.
Contents	Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Overview and Architecture; Flume 0.9; Flume 1.X (Flume-NG); The problem with HDFS and streaming data/logs; Sources, channels, and sinks; Flume events; Interceptors, channel selectors, and sink processors; Tiered data collection (multiple flows and/or agents); The Kite SDK; Summary; Chapter 2: A Quick Start Guide to Flume; Downloading Flume; Flume in Hadoop distributions; An overview of the Flume configuration file; Starting up with "Hello, World!"; Summary.
	Chapter 3: Channels; The memory channel; The file channel; Spillable Memory Channel; Summary; Chapter 4: Sinks and Sink Processors; HDFS sink; Path and filename; File rotation; Compression codecs; Event Serializers; Text output; Text with headers; Apache Avro; User-provided Avro schema; File type; SequenceFile; DataStream; CompressedStream; Timeouts and workers; Sink groups; Load balancing; Failover; MorphlineSolrSink; Morphline configuration files; Typical SolrSink configuration; Sink configuration; ElasticSearchSink; LogStash Serializer; Dynamic Serializer; Summary.
	Chapter 5: Sources and Channel Selectors; The problem with using tail; The Exec source; Spooling Directory Source; Syslog sources; The syslog UDP source; The syslog TCP source; The multiport syslog TCP source; JMS source; Channel selectors; Replicating; Multiplexing; Summary; Chapter 6: Interceptors, ETL, and Routing; Interceptors; Timestamp; Host; Static; Regular expression filtering; Regular expression extractor; Morphline interceptor; Custom interceptors; The plugins directory; Tiering flows; The Avro source/sink; Compressing Avro; SSL Avro flows; The Thrift source/sink.
	Using command-line Avro; The Log4J appender; The Log4J load-balancing appender; The embedded agent; Configuration and startup; Sending data; Shutdown; Routing; Summary; Chapter 7: Putting it All Together; Web logs to searchable UI; Setting up the web server; Configuring log rotation to the spool directory; Setting up the target -- Elasticsearch; Setting up Flume on collector/relay; Setting up Flume on the client; Creating more search fields with an interceptor; Setting up a better user interface -- Kibana; Archiving to HDFS; Summary; Chapter 8: Monitoring Flume; Monitoring the agent process.
	Monit; Nagios; Monitoring performance metrics; Ganglia; Internal HTTP server; Custom monitoring hooks; Summary; Chapter 9: There Is No Spoon -- the Realities of Real-time Distributed Data Collection; Transport time versus log time; Time zones are evil; Capacity planning; Considerations for multiple data centers; Compliance and data expiry; Summary; Index.
Summary	If you are a Hadoop programmer who wants to learn about Flume to be able to move datasets into Hadoop in a timely and replicable manner, then this book is ideal for you. No prior knowledge about Apache Flume is necessary, but a basic knowledge of Hadoop and the Hadoop File System (HDFS) is assumed.
Local Note	eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - North America
Subject	Apache Hadoop.
	Electronic data processing -- Distributed processing.
	File organization (Computer science)
Genre/Form	Electronic books.
Added Title	Design and implement a series of Flume agents to send streamed data into Hadoop
Other Form:	Print version: Hoffman, Steve. Apache Flume : Distributed Log Collection for Hadoop. Birmingham : Packt Publishing, ©2015 9781784392178
ISBN	9781784399146 (electronic book)
	1784399140 (electronic book)
	1784392170
	9781784392178
