Author Banerjee, Kyle, author.

Title The data wrangler's handbook : simple tools for powerful results / Kyle Banerjee.

Publication Info. Chicago : ALA Neal-Schuman, 2019.

Description 1 online resource
text file
Bibliography Includes bibliographical references and index.
Summary "Data manipulation and analysis are far easier than you might imagine - in fact, using tools that come standard with your desktop computer, you can learn how to extract, manipulate, and analyze data (and metadata) of any size and complexity. In this handbook, data wizard Banerjee will familiarize you with easily digestible but powerful concepts that will enable you to feel confident working with data. With his expert guidance, you'll learn how to use a single-word command to sort files of any size by any criteria, identify duplicates, and perform numerous other common library tasks; understand data formats, delimited text and CSV files, XML, JSON, scripting, and other key components of data; undertake more sophisticated tasks such as comparing files, converting data from one format to another, reformatting values, combining data from multiple files, and communicating with APIs (Application Programming Interfaces); and save time and stress through simple techniques for transforming text, recognizing symbols that perform important tasks, a Regular Expression cheat sheet, a glossary, and other tools"-- Provided by publisher.
Contents Cover -- Title Page -- Copyright Page -- Contents -- List of Figures and Tables -- Acknowledgments -- Introduction -- Chapter 1. Getting Started with the Command Line -- Finding the Command Line -- Mac -- Windows -- Meet the Command Line -- Chapter 2. Command Line Concepts -- Two Powerful Symbols -- Direct Output to a File (Greater than Symbol) -- Direct Output to Another Program (Pipe Symbol) -- Command Substitution -- Regular Expressions-The Swiss Army Knife for Data -- Literal Characters -- Special Characters -- Wildcard Characters -- Logical Operators -- Grouping -- Scripting
Chapter 3. Understanding Formats, by David Forero -- Chapter 4. Simplify Complicated Problems -- Isolating Specific Data Elements -- Converting Data into Formats That Are Easier to Work With -- Chapter 5. Delimited Text -- CSV (Comma Separated Values) -- Commas and Quotation Marks in CSV Files -- Multiline Fields in CSV Files -- Multivalued Fields in Delimited Files -- Chapter 6. XML -- So What Is XML, Really? -- What Makes XML So Useful? -- Why Is XML So Easy? -- DOM (Document Object Model) -- XPath -- XSLT (eXtensible Stylesheet Language Transformations) -- Working with Large XML Files
Working with Complex XML Files -- XmlStarlet -- Installing XmlStarlet -- Converting XML Documents -- Chapter 7. JSON (JavaScript Object Notation) -- Chapter 8. Scripting -- Variables -- Arguments -- Conditional Execution -- Loops -- Chapter 9. Solving Common Problems -- Viewing Large Files -- Locating Files That Contain Particular Data -- Finding Files with Specific Characteristics -- Working with Internal Metadata -- Working with APIs -- Combining Data from Different Sources -- Other Tasks -- Chapter 10. Conclusions -- One-Line Wonders -- Locating, Viewing, and Performing Basic File Operations
Combine Information from Multiple Files into a Single File -- Combine Three Files, Each Consisting of a Single Column, into a Three-Column Table -- Extract 1,000 Random Lines or Records from a File -- Find Files with Specific Characteristics -- Find All Lines in All Files in the Current Directory as Well as All Subdirectories Containing a Regular Expression -- Identify All Files in Current Directories and Subdirectories That Contain a Value -- List All Files in Current Directory and Subdirectories over a 100 MB in Order of Decreasing Size
List the Names, Pixel Dimensions, and File Sizes of All Files in the Current Directory and Subdirectories in Tab Delimited Format -- Print Line Number of File That Match Occurred On -- Split Large Files into Smaller Chunks with Each File Breaking on a Line -- View 200 Characters Starting at Position 385621 in a File -- View Lines 4369-4374 of a File -- Retrieving and Sending Information over a Network -- Retrieve a Document from the Web and Send It to a File -- Send an XML Document to an API Requiring HTTP Authentication -- Sorting, Counting, Deduplication, and File Comparison
Local Note eBooks on EBSCOhost EBSCO eBook Subscription Academic Collection - North America
Subject Database design.
Data structures (Computer science)
Information retrieval.
File conversion (Computer science)
Genre/Form Electronic books.
Other Form: Print version: Banerjee, Kyle. The data wrangler's handbook. Chicago : ALA Neal-Schuman, 2019. 9780838919095 (DLC) 2019024258
ISBN 9780838919132 (PDF)
0838919138
0838919103
9780838919118 (Kindle)
0838919111
9780838919101 (electronic book)
9780838919095 (paper ; permanent paper)