DynamoDB Cookbook

Author: Tanmay Deshpande

Publisher: Packt Publishing Ltd

ISBN: 1784391093

Category: Computers

Page: 266

View: 4342

Over 90 hands-on recipes to design Internet-scalable web and mobile applications with Amazon DynamoDB

About This Book
- Construct top-notch mobile and web applications with the Internet-scalable NoSQL database and host them in the cloud
- Integrate your applications with other AWS services such as AWS EMR, AWS S3, AWS Redshift, and AWS CloudSearch to achieve a one-stop application stack
- Step-by-step implementation guide that provides real-world use with hands-on recipes

Who This Book Is For
This book is intended for those who have a basic understanding of AWS services and want to take their knowledge to the next level by getting their hands dirty with coding recipes in DynamoDB.

What You Will Learn
- Design DynamoDB tables to achieve high read and write throughput
- Discover best practices such as caching, exponential back-offs and auto-retries, storing large items in AWS S3, and storing compressed data
- Use DynamoDB Local effectively to make your development smooth and cost-effective
- Implement cost-effective best practices to reduce the burden of DynamoDB charges
- Create and maintain secondary indexes to support improved data access
- Integrate other AWS services such as AWS EMR, AWS CloudSearch, and AWS Data Pipeline with DynamoDB

In Detail
AWS DynamoDB is an excellent example of a production-ready NoSQL database. In recent years, DynamoDB has attracted many customers because of features such as high availability, reliability, and infinite scalability. DynamoDB can be easily integrated with massive data-crunching tools such as Hadoop/EMR, which are an essential part of this data-driven world, and hence it is widely accepted. Its cost- and time-efficient design makes DynamoDB stand out amongst its peers. The design of DynamoDB is so neat and clean that it has inspired many NoSQL databases to simply follow it.

This book will get your hands on some engineering best practices that DynamoDB engineers use, which you can apply day to day to build robust and scalable applications. You will start by operating with DynamoDB tables and learn to manipulate items and manage indexes. You will also discover how to easily integrate applications with other AWS services such as EMR, S3, CloudSearch, and Redshift. A couple of chapters talk in detail about how to use DynamoDB as a backend database and host the application on AWS Elastic Beanstalk. This book also focuses on DynamoDB security measures, providing techniques such as data encryption and masking. By the end of the book you'll be adroit in designing web and mobile applications using DynamoDB and hosting them in the cloud.

Style and approach
An easy-to-follow guide, full of real-world examples, which takes you through the world of DynamoDB following a step-by-step, problem-solution based approach.

Mastering DynamoDB

Author: Tanmay Deshpande

Publisher: Packt Publishing Ltd

ISBN: 1783551968

Category: Computers

Page: 236

View: 456

If you have an interest in DynamoDB, want to know what it is all about, and want to become proficient in using it, this is the book for you. If you are an intermediate user who wishes to enhance your knowledge of DynamoDB, this book is aimed at you. Basic familiarity with programming, NoSQL, and cloud computing concepts would be helpful.

DynamoDB Applied Design Patterns

Author: Uchit Vyas,Prabhakaran Kuppusamy

Publisher: Packt Publishing Ltd

ISBN: 1783551909

Category: Computers

Page: 202

View: 1269

If you are an intermediate to advanced DynamoDB developer looking to learn the best practices associated with efficient data modeling, this book is for you.

Getting Started with Amazon Redshift

Author: Stefan Bauer

Publisher: Packt Publishing Ltd

ISBN: 1782178090

Category: Business & Economics

Page: 154

View: 7638

Getting Started With Amazon Redshift is a step-by-step, practical guide to the world of Redshift. Learn to load, manage, and query data on Redshift. This book is for CIOs, enterprise architects, developers, and anyone else who needs to get familiar with Redshift. The CIO will gain an understanding of what their technical staff is working on; the technical implementation personnel will get an in-depth view of the technology and of what it will take to implement their own solutions.

Handbook of Research on Big Data Storage and Visualization Techniques

Author: Segall, Richard S.,Cook, Jeffrey S.

Publisher: IGI Global

ISBN: 1522531432

Category: Computers

Page: 917

View: 7231

The digital age has brought exponential growth in the amount of data available to individuals looking to draw conclusions based on given or collected information across industries. Challenges associated with the analysis, security, sharing, storage, and visualization of large and complex data sets continue to plague data scientists and analysts alike as traditional data processing applications struggle to adequately manage big data. The Handbook of Research on Big Data Storage and Visualization Techniques is a critical scholarly resource that explores big data analytics and technologies and their role in developing a broad understanding of issues pertaining to the use of big data in multidisciplinary fields. Featuring coverage of a broad range of topics, such as architecture patterns, programming systems, and computational energy, this publication is geared towards professionals, researchers, and students seeking current research and application topics on the subject.

Amazon EC2 Cookbook

Author: Sekhar Reddy,Aurobindo Sarkar

Publisher: Packt Publishing Ltd

ISBN: 1785282255

Category: Computers

Page: 194

View: 5418

Over 40 hands-on recipes to develop and deploy real-world applications using Amazon EC2

About This Book
- Design and build applications using Amazon EC2 and a range of supporting AWS tools
- Find highly effective solutions to your AWS Cloud-based application development, deployment, and infrastructural issues
- A comprehensive set of recipes to implement your product's functional and non-functional requirements

Who This Book Is For
This book is targeted at Cloud-based developers who have prior exposure to AWS concepts and features. Some experience in building small applications and creating some proof-of-concept applications is required.

What You Will Learn
- Select and configure the right EC2 instances
- Create, configure, and secure a Virtual Private Cloud
- Create an AWS CloudFormation template
- Use AWS Identity and Access Management to secure access to EC2 instances
- Configure auto-scaling groups using CloudWatch
- Choose and use the right data service, such as SimpleDB or DynamoDB, for your cloud applications
- Access key AWS services using client tools and AWS SDKs
- Deploy AWS applications using Docker containers

In Detail
Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides flexible and resizable compute capacity in the cloud. The main purpose of Amazon EC2 is to make web-scale cloud computing easier for developers. It offers developers and companies raw building blocks such as load balancers, object stores, and virtual machines running on general hardware (that is, Amazon runs a multitude of hardware components but presents them as a generic utility to its users), with accessible APIs, in order to create scalable software products.

This book covers designing, developing, and deploying scalable, highly available, and secure applications on the AWS platform. By following the steps in the recipes, you will be able to effectively and systematically resolve issues related to development, deployment, and infrastructure for enterprise-grade cloud applications or products. This book starts by helping you choose and configure the right EC2 instances to meet your application-specific requirements. The book then moves on to creating a CloudFormation template and teaches you how to work with stacks. You will then be introduced to using IAM services to configure users, groups, roles, and multi-factor authentication. You will also learn how to connect AD to AWS IAM. Next, you will use AWS data services and access other AWS services including Route 53, Amazon S3, and AWS SES (Amazon Simple Email Service). Finally, you will deploy AWS applications using Docker containers.

Style and approach
This book contains a rich set of recipes that cover not only the full spectrum of real-world cloud application development using Amazon EC2, but also the services and security of the applications. The book contains easy-to-follow recipes with step-by-step instructions to leverage EC2 within your applications.

Learning Apache Flink

Author: Tanmay Deshpande

Publisher: Packt Publishing Ltd

ISBN: 1786467267

Category: Computers

Page: 280

View: 6758

Discover the definitive guide to crafting lightning-fast data processing for distributed systems with Apache Flink

About This Book
- Build your expertise in processing real-time data with Apache Flink and its ecosystem
- Gain insights into the working of all components of Apache Flink, such as FlinkML, Gelly, and the Table API, filled with real-world use cases
- Exploit Apache Flink's capabilities such as distributed data streaming, in-memory processing, pipelining, and iteration operators to improve performance
- Solve real-world big data problems with the real-time in-memory and disk-based processing capabilities of Apache Flink

Who This Book Is For
Big data developers who are looking to process batch and real-time data on distributed systems. Basic knowledge of Hadoop and big data is assumed. Reasonable knowledge of Java or Scala is expected.

What You Will Learn
- Learn how to build end-to-end real-time analytics projects
- Integrate with the existing big data stack and utilize existing infrastructure
- Build predictive analytics applications using FlinkML
- Use the graph library to perform graph querying and search
- Understand Flink's "Streaming First" architecture to implement real streaming applications
- Learn Flink logging and monitoring best practices in order to efficiently design your data pipelines
- Explore the detailed processes to deploy a Flink cluster on Amazon Web Services (AWS) and Google Cloud Platform (GCP)

In Detail
With the advent of massive computer systems, organizations in different domains generate large amounts of data on a real-time basis. The latest entrant to big data processing, Apache Flink, is designed to process continuous streams of data at a lightning-fast pace. This book will be your definitive guide to batch and stream data processing with Apache Flink. The book begins by introducing the Apache Flink ecosystem, setting it up, and using the DataSet and DataStream APIs for processing batch and streaming datasets.

Bringing the power of SQL to Flink, this book then explores the Table API for querying and manipulating data. In the latter half of the book, readers will learn the remaining ecosystem of Apache Flink to achieve complex tasks such as event processing, machine learning, and graph processing. The final part of the book covers topics such as scaling Flink solutions, performance optimization, and integrating Flink with other tools such as ElasticSearch. Whether you want to dive deeper into Apache Flink, or want to investigate how to get more out of this powerful technology, you'll find everything you need inside.

Style and approach
This book is a comprehensive guide that covers advanced features of Apache Flink, and communicates them with a practical understanding of the underlying concepts for how, when, and why to use them.

Hadoop Blueprints

Author: Anurag Shrivastava,Tanmay Deshpande

Publisher: Packt Publishing Ltd

ISBN: 1783980311

Category: Computers

Page: 316

View: 7401

Use Hadoop to solve business problems by learning from a rich set of real-life case studies

About This Book
- Solve real-world business problems using Hadoop and other Big Data technologies
- Build efficient data lakes in Hadoop, and develop systems for various business cases such as improving marketing campaigns, fraud detection, and more
- Power-packed with six case studies to get you going with Hadoop for Business Intelligence

Who This Book Is For
If you are interested in building efficient business solutions using Hadoop, this is the book for you. This book assumes that you have basic knowledge of Hadoop, Java, and any scripting language.

What You Will Learn
- Learn about the evolution of Hadoop as the big data platform
- Understand the basics of Hadoop architecture
- Build a 360-degree view of your customer using Sqoop and Hive
- Build and run classification models on Hadoop using BigML
- Use Spark and Hadoop to build a fraud detection system
- Develop a churn detection system using Java and MapReduce
- Build an IoT-based data collection and visualization system
- Get to grips with building a Hadoop-based Data Lake for large enterprises
- Learn about the coexistence of NoSQL and In-Memory databases in the Hadoop ecosystem

In Detail
If you have a basic understanding of Hadoop and want to put your knowledge to use to build fantastic Big Data solutions for business, then this book is for you. Build six real-life, end-to-end solutions using the tools in the Hadoop ecosystem, and take your knowledge of Hadoop to the next level. Start off by understanding various business problems which can be solved using Hadoop. You will also get acquainted with the common architectural patterns which are used to build Hadoop-based solutions. Build a 360-degree view of the customer by working with different types of data, and build an efficient fraud detection system for a financial institution. You will also develop a system in Hadoop to improve the effectiveness of marketing campaigns.

Build a churn detection system for a telecom company, develop an Internet of Things (IoT) system to monitor the environment in a factory, and build a data lake, all making use of the concepts and techniques mentioned in this book. The book covers other technologies and frameworks such as Apache Spark, Hive, Sqoop, and more, and how they can be used in conjunction with Hadoop. You will be able to try out the solutions explained in the book and use the knowledge gained to extend them further in your own problem space.

Style and approach
This is an example-driven book where each chapter covers a single business problem and describes its solution by explaining the structure of a dataset and the tools required to process it. Every project is demonstrated with a step-by-step approach, and explained in a very easy-to-understand manner.

Oracle Database 11gR2 Performance Tuning Cookbook

Over 80 Recipes to Help Beginners Achieve Better Performance from Oracle Database Applications

Author: Ciro Fiorillo

Publisher: Packt Publishing Ltd

ISBN: 1849682615

Category: Computers

Page: 521

View: 905

This book covers both examples and theoretical concepts. Every recipe is based on a script or procedure explained step by step, with screenshots, while theoretical concepts are explained in the context of the recipe to show why one solution performs better than another. This book is aimed at software developers, software and data architects, and DBAs who are using or are planning to use the Oracle Database, who have some experience, and who want to solve performance problems faster and in a rigorous way. If you are an architect who wants to design better applications, a DBA who is keen to dig into the causes of performance issues, or a developer who wants to learn why and where the application is running slow, this is the book for you. Basic knowledge of the SQL language is required and general knowledge of the Oracle Database architecture is preferable.

Salesforce CRM Admin Cookbook

Author: Paul Goodey

Publisher: Packt Publishing Ltd

ISBN: 1849684251

Category: Computers

Page: 266

View: 1699

This book is written in a Cookbook-style format and provides you with immediately usable recipes that extend the functionality of Salesforce CRM and solve real-world problems encountered within the Salesforce CRM application. The recipes in this Cookbook contain proven, step-by-step instructions along with detailed screenshots. This Cookbook has been designed so that you can read it chapter by chapter, starting with recipes that provide enhancements to the user interface, and finishing with recipes that cover data and systems integration. You can also refer to the list of recipes and choose to access them in no particular order. Either method allows you to rapidly implement solutions in your organization that extend and enhance the functionality of Salesforce CRM for your users. This book is for Salesforce administrators and developers who want to quickly incorporate enhanced functionality and extend the power of Salesforce CRM. Whether you are a Salesforce novice or a more experienced administrator, this book provides practical, step-by-step instructions in the use of hidden features, advanced user interface techniques, and solutions for process automation, plus data and systems integration. Not only are standard Salesforce CRM features covered, such as workflow and approval processes, validation rules, and formula fields, but you will also be exposed to further technologies that include HTML, JavaScript, CSS, Apex, and Visualforce.

The JHipster Mini-Book

Author: Matt Raible

Publisher: Lulu.com

ISBN: 132963814X

Category: Computers

Page: 162

View: 8930

The things you need to do to set up a new software project can be daunting. First, you have to select the back-end framework to create your API, choose your database, set up security, and choose your build tool. Then you have to choose the tools to create your front end: select a UI framework, configure a build tool, set up Sass processing, configure your browser to auto-refresh when you make changes, and configure the client and server so they work in unison. If you're building a new application using Spring Boot and Angular, you can save days by using JHipster. JHipster generates a complete and modern web app, unifying:
- A high-performance and robust Java stack on the server side with Spring Boot
- A sleek, modern, mobile-first front-end with Angular and Bootstrap
- A robust microservice architecture with the JHipster Registry, Netflix OSS, the ELK stack, and Docker
- A powerful workflow to build your application with Yeoman, Webpack, and Maven/Gradle

IBM Cognos 8 Report Studio Cookbook

Author: Abhishek Sanghani

Publisher: Packt Publishing Ltd

ISBN: 1849680353

Category: Computers

Page: 252

View: 9744

Written in cookbook style, this book offers learning and techniques through recipes. It contains step-by-step instructions for Report Studio 8 users to author effective reports. The book is designed in such a way that you can refer to things chapter by chapter, and read them in no particular order. You will see a new fictional business case in each recipe that will relate to a real-life problem and then you will learn how to crack it in Report Studio. If you are a Business Intelligence or MIS Developer (programmer) working on Cognos Report Studio who wants to author impressive reports by putting to use what this tool has to offer, this book is for you. You could also be a Business Analyst or Power User who authors his own reports and wants to look beyond the conventional features of Report Studio 8. This book assumes that you can do basic authoring, are aware of the Cognos architecture, and are familiar with Studio.

MySQL Cookbook

Solutions for Database Developers and Administrators

Author: Paul DuBois

Publisher: "O'Reilly Media, Inc."

ISBN: 1449374158

Category: Computers

Page: 866

View: 6351

MySQL’s popularity has brought a flood of questions about how to solve specific problems, and that’s where this cookbook is essential. When you need quick solutions or techniques, this handy resource provides scores of short, focused pieces of code, hundreds of worked-out examples, and clear, concise explanations for programmers who don’t have the time (or expertise) to solve MySQL problems from scratch. Ideal for beginners and professional database and web developers, this updated third edition covers powerful features in MySQL 5.6 (and some in 5.7). The book focuses on programming APIs in Python, PHP, Java, Perl, and Ruby. With more than 200 recipes, you’ll learn how to:
- Use the mysql client and write MySQL-based programs
- Create, populate, and select data from tables
- Store, retrieve, and manipulate strings
- Work with dates and times
- Sort query results and generate summaries
- Use stored routines, triggers, and scheduled events
- Import, export, validate, and reformat data
- Perform transactions and work with statistics
- Process web input, and generate web content from query results
- Use MySQL-based web session management
- Provide security and server administration

AWS Administration Cookbook

Author: Lucas Chan,Rowan Udell

Publisher: Packt Publishing Ltd

ISBN: 1787121526

Category: Computers

Page: 394

View: 3414

Build, automate, and manage your AWS-based cloud environments

About This Book
- Install, configure, and administer computing, storage, and networking in the AWS cloud
- Automate your infrastructure and control every aspect of it through infrastructure as code
- Work through exciting recipes to administer your AWS cloud

Who This Book Is For
If you are an administrator, DevOps engineer, or an IT professional who is moving to an AWS-based cloud environment, then this book is for you. It assumes familiarity with cloud computing platforms, and that you have some understanding of virtualization, networking, and other administration-related tasks.

What You Will Learn
- Discover the best practices to achieve an automated repeatable infrastructure in AWS
- Bring down your IT costs by managing AWS successfully and deliver high availability, fault tolerance, and scalability
- Make any website faster with static and dynamic caching
- Create monitoring and alerting dashboards using CloudWatch
- Migrate a database to AWS
- Set up consolidated billing to achieve simple and effective cost management with accounts
- Host a domain and find out how you can automate health checks

In Detail
Amazon Web Services (AWS) is a bundled remote computing service that provides cloud computing infrastructure over the Internet with storage, bandwidth, and customized support for application programming interfaces (APIs). Implementing these services to efficiently administer your cloud environments is a core task. This book will help you build and administer your cloud environment with AWS. We'll begin with the AWS fundamentals, and you'll build the foundation for the recipes you'll work on throughout the book. Next, you will find out how to manage multiple accounts and set up consolidated billing. You will then learn to set up reliable and fast hosting for static websites, share data between running instances, and back up your data for compliance.

Moving on, you will find out how to use the compute service to enable consistent and fast instance provisioning, and will see how to provision storage volumes and autoscale an application server. Next, you'll discover how to effectively use the networking and database services of AWS. You will also learn about the different management tools of AWS along with securing your AWS cloud. Finally, you will learn to estimate the costs for your cloud. By the end of the book, you will be able to easily administer your AWS cloud.

Style and approach
This practical guide is packed with clear, practical, instruction-based recipes that will enable you to use and implement the latest features of AWS.

Python and AWS Cookbook

Author: Mitch Garnaat

Publisher: "O'Reilly Media, Inc."

ISBN: 144930544X

Category: Computers

Page: 76

View: 2742

This book focuses on Elastic Compute Cloud (EC2) and Simple Storage Service (S3) for developers writing in Python.

Hadoop Real-World Solutions Cookbook

Author: Tanmay Deshpande

Publisher: Packt Publishing Ltd

ISBN: 1784398004

Category: Computers

Page: 290

View: 9800

Over 90 hands-on recipes to help you learn and master the intricacies of Apache Hadoop 2.X, YARN, Hive, Pig, Oozie, Flume, Sqoop, Apache Spark, and Mahout

About This Book
- Implement outstanding Machine Learning use cases on your own analytics models and processes
- Solutions to common problems when working with the Hadoop ecosystem
- Step-by-step implementation of end-to-end big data use cases

Who This Book Is For
Readers who have a basic knowledge of big data systems and want to advance their knowledge with hands-on recipes.

What You Will Learn
- Install and maintain a Hadoop 2.X cluster and its ecosystem
- Write advanced MapReduce programs and understand design patterns
- Perform advanced data analysis using Hive, Pig, and MapReduce programs
- Import and export data from various sources using Sqoop and Flume
- Store data in various file formats such as Text, Sequential, Parquet, ORC, and RC files
- Apply machine learning principles with libraries such as Mahout
- Process batch and stream data using Apache Spark

In Detail
Big data is the current requirement. Most organizations produce huge amounts of data every day. With the arrival of Hadoop-like tools, it has become easier for everyone to solve big data problems with great efficiency and at minimal cost. Grasping Machine Learning techniques will help you greatly in building predictive models and using this data to make the right decisions for your organization. Hadoop Real World Solutions Cookbook gives readers insights into learning and mastering big data via recipes. The book not only clarifies most big data tools in the market but also provides best practices for using them. The book provides recipes that are based on the latest versions of Apache Hadoop 2.X, YARN, Hive, Pig, Sqoop, Flume, Apache Spark, Mahout, and many more such ecosystem tools. This real-world-solution cookbook is packed with handy recipes you can apply to your own everyday issues. Each chapter provides in-depth recipes that can be referenced easily.

This book provides detailed practices on the latest technologies such as YARN and Apache Spark. Readers will be able to consider themselves big data experts on completion of this book. This guide is an invaluable tutorial if you are planning to implement a big data warehouse for your business.

Style and approach
An easy-to-follow guide that walks you through the world of big data. Each tool in the Hadoop ecosystem is explained in detail and the recipes are placed in such a manner that readers can implement them sequentially. Plenty of reference links are provided for advanced reading.

AWS Certified Solutions Architect Official Study Guide

Associate Exam

Author: Joe Baron,Hisham Baz,Tim Bixler,Biff Gaut,Kevin E. Kelly,John Stamper,Sean Senior

Publisher: John Wiley & Sons

ISBN: 1119138558

Category: Computers

Page: 504

View: 7740

Compute Basics -- Securely Using an Instance -- The Lifecycle of Instances -- Options -- Instance Stores -- Amazon Elastic Block Store (Amazon EBS) -- Elastic Block Store Basics -- Types of Amazon EBS Volumes -- Protecting Data -- Summary -- Exam Essentials -- Exercises -- Review Questions -- Chapter 4 Amazon Virtual Private Cloud (Amazon VPC) -- Introduction -- Amazon Virtual Private Cloud (Amazon VPC) -- Subnets -- Route Tables -- Internet Gateways -- Dynamic Host Configuration Protocol (DHCP) Option Sets -- Elastic IP Addresses (EIPs) -- Elastic Network Interfaces (ENIs) -- Endpoints

DynamoDB

EVERYTHING YOU NEED to KNOW about AMAZON WEB SERVICE's NoSQL DATABASE

Author: Derek Rangel

Publisher: N.A

ISBN: 9781517635084

Category:

Page: 80

View: 383

This book is an exploration of DynamoDB in detail. It begins by explaining what DynamoDB is, where it is used, and how it works. The next step is a guide on how to get started with DynamoDB, including setting up the environment. You will learn how to set up both DynamoDB Local and the hosted DynamoDB service that Amazon provides online. The process of creating tables in DynamoDB is examined in detail, and you will master how to do it. The process of inserting data into DynamoDB is also described, along with the DynamoDB API. You will learn to handle both HTTP requests and HTTP responses. The formatting of the HTTP body is also covered. This book will guide you on how to format JSON data in DynamoDB. You will also learn how to handle errors by catching them in DynamoDB. The operations supported in DynamoDB are explored, including those for creating and deleting tables, getting an item, and others. After reading this book, you will know how to perform these operations in DynamoDB. The book will also guide you on how to retrieve the data that you insert into DynamoDB. Updating and deleting data are then explained.

Apache Spark 2.x Cookbook

Author: Rishi Yadav

Publisher: Packt Publishing Ltd

ISBN: 1787127516

Category: Computers

Page: 294

View: 7991

Over 70 recipes to help you use Apache Spark as your single big data computing platform and master its libraries

About This Book
- Contains recipes on how to use Apache Spark as a unified compute engine
- Covers how to connect various source systems to Apache Spark
- Covers various parts of machine learning, including supervised and unsupervised learning and recommendation engines

Who This Book Is For
This book is for data engineers, data scientists, and those who want to implement Spark for real-time data processing. Anyone who is using Spark (or is planning to) will benefit from this book. The book assumes you have a basic knowledge of Scala as a programming language.

What You Will Learn
- Install and configure Apache Spark with various cluster managers and on AWS
- Set up a development environment for Apache Spark, including the Databricks Cloud notebook
- Find out how to operate on data in Spark with schemas
- Get to grips with real-time streaming analytics using Spark Streaming and Structured Streaming
- Master supervised learning and unsupervised learning using MLlib
- Build a recommendation engine using MLlib
- Process graphs using the GraphX and GraphFrames libraries
- Develop a set of common applications or project types, and solutions that solve complex big data problems

In Detail
While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, performance, Structured Streaming, and simplifying building blocks to build better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments.

Further on, you will be introduced to working with RDDs, DataFrames, and Datasets to operate on schema-aware data, and to real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning, and recommendation engines in Spark. Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.

Style and approach
This book is packed with intuitive recipes supported with line-by-line explanations to help you understand Spark 2.x's real-time processing capabilities and deploy scalable big data solutions. This is a valuable resource for data scientists and those working on large-scale data projects.