Bigdata and Hadoop

Analytics of Things and its significance to IoT

The idea of the Internet of Things (IoT) is revolutionary, and it is expected to find its way into everyone's life; sooner or later, everyone will be affected by it. Various surveys and studies project a future with a huge number of interconnected devices. For example, the ABI Research


Learn How To Develop And Test Pig Scripts For Data Processing

In the first part of the Pig tutorial we explained how Pig fits into the Hadoop ecosystem as a tool for performing data extraction, transformation and loading (ETL). That tutorial demonstrated the installation and configuration of Pig, and we also reviewed some operations that can be performed on data. For a review of those concepts please refer
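The kind of ETL flow the tutorial covers can be sketched in a few lines of Pig Latin. This is only a sketch: the file path, field names and schema below are hypothetical examples.

```pig
-- Extract: load raw log data from HDFS (path and schema are hypothetical)
logs = LOAD 'hdfs:///data/access_logs.csv' USING PigStorage(',')
       AS (ip:chararray, url:chararray, status:int);

-- Transform: keep only successful requests, then count hits per URL
ok     = FILTER logs BY status == 200;
by_url = GROUP ok BY url;
hits   = FOREACH by_url GENERATE group AS url, COUNT(ok) AS n;

-- Load: write the results back to HDFS
STORE hits INTO 'hdfs:///out/url_hits';
```

Each statement defines a relation; Pig only builds and runs the underlying MapReduce jobs when an output statement such as STORE or DUMP is reached.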


Learn How To Write Advanced Queries To Manipulate Data Using Hive

In previous Hive tutorials we looked at installing and configuring Hive, data modeling, and the use of partitions to improve query response time. For a review of these concepts please refer to the tutorials on setting up Hive, creating effective data models in Hive, and using partitioning. A good level of understanding of
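As a taste of the kind of advanced query covered, here is a HiveQL sketch that combines a join, aggregation and a HAVING filter; the orders and customers tables and their columns are hypothetical examples, not from the tutorial itself.

```sql
-- Hypothetical tables: orders(order_id, customer_id, amount)
-- and customers(customer_id, country)
SELECT c.country,
       COUNT(*)      AS num_orders,
       SUM(o.amount) AS revenue
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
GROUP BY c.country
HAVING SUM(o.amount) > 10000   -- filter on the aggregate, not on rows
ORDER BY revenue DESC
LIMIT 10;
```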


Learn How To Use Partitioning In Hive To Improve Query Performance

In previous Hive tutorials we have looked at Hive as the Hadoop project that offers data warehousing features. Installing and configuring Hive was demonstrated, and guidelines on best practices when creating data models were discussed. If you are new to these concepts please refer to setting up Hive and creating effective data models in
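A minimal sketch of how partitioning improves query performance, assuming a hypothetical logs table: each partition maps to its own directory on HDFS, so queries that filter on the partition columns scan only the matching directories instead of the whole table.

```sql
-- One HDFS directory per (year, month) combination
CREATE TABLE logs (ip STRING, url STRING, status INT)
PARTITIONED BY (year INT, month INT)
STORED AS ORC;

-- Load data into a specific partition
INSERT INTO TABLE logs PARTITION (year=2016, month=5)
SELECT ip, url, status FROM staging_logs;

-- Partition pruning: only the year=2016/month=5 directory is scanned
SELECT COUNT(*) FROM logs WHERE year = 2016 AND month = 5;
```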


Learn How To Use HBase Shell To Create Tables, Query Data, Update And Delete Data

In previous HBase tutorials we looked at how to install HBase and develop suitable data models. In this tutorial we will build on those concepts to demonstrate how to perform create, read, update and delete (CRUD) operations using the HBase shell. If you have not installed HBase please refer to the base HBase tutorial to learn how
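The CRUD operations can be sketched as a short HBase shell session; the table, row key and column names below are illustrative examples only.

```
# Create a table 'users' with one column family 'info'
create 'users', 'info'

# Create and update: put writes a cell; a second put on the same
# cell becomes the latest version
put 'users', 'row1', 'info:name',  'alice'
put 'users', 'row1', 'info:email', 'alice@example.com'

# Read: fetch one row, or scan the whole table
get 'users', 'row1'
scan 'users'

# Delete: one cell, then the entire row
delete 'users', 'row1', 'info:email'
deleteall 'users', 'row1'
```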


Learn How To Simplify Management Of A Hadoop Cluster Using Ambari

Within the Hadoop ecosystem, Apache Ambari was developed to provide a simple way of managing Hadoop clusters through a web-based interface. Ambari supports provisioning, managing and monitoring clusters. To help with provisioning, an installation wizard installs Hadoop on a desired number of


Learn How To Process Data Interactively And In Batch Using Apache Tez Framework

Within Hadoop, MapReduce has been the most widely used approach to processing data. In this approach, data processing happens in batch mode and can take minutes, hours or days to produce results. MapReduce is useful when waiting a long time for query results is not a problem. However, when you need to get query results in
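The batch model MapReduce uses can be illustrated with a small, self-contained word-count sketch in plain Python. This is not Hadoop code, just the map, shuffle and reduce idea that both MapReduce and Tez build on; the input lines are made-up examples.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit a (word, 1) pair for every word in every record."""
    for record in records:
        for word in record.split():
            yield word, 1

def shuffle(pairs):
    """Shuffle: group all values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: sum the counts emitted for each word."""
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data", "big cluster", "data data"]
counts = reduce_phase(shuffle(map_phase(lines)))
print(counts)  # {'big': 2, 'data': 3, 'cluster': 1}
```

In Hadoop, each phase runs distributed across the cluster and the shuffle moves data between nodes, which is where much of the batch latency comes from.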


Learn How To Use Apache Oozie To Schedule Hadoop Jobs

Within the Hadoop ecosystem, Oozie provides services that enable jobs to be scheduled. With job scheduling you are able to organize multiple jobs into a single unit that is run sequentially. The types of jobs that are supported are MapReduce, Pig, Hive, Sqoop, Java programs and shell scripts. To support scheduling, Oozie provides two types of
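A minimal sketch of the workflow flavor of Oozie job, as a hypothetical workflow.xml with a single Pig action; the workflow name, script name and transition labels are illustrative.

```xml
<!-- Minimal Oozie workflow: start -> one Pig action -> end (or kill on error) -->
<workflow-app name="etl-wf" xmlns="uri:oozie:workflow:0.5">
  <start to="pig-etl"/>
  <action name="pig-etl">
    <pig>
      <job-tracker>${jobTracker}</job-tracker>
      <name-node>${nameNode}</name-node>
      <script>etl.pig</script>
    </pig>
    <ok to="end"/>
    <error to="fail"/>
  </action>
  <kill name="fail">
    <message>Pig action failed</message>
  </kill>
  <end name="end"/>
</workflow-app>
```

The workflow is itself a directed acyclic graph of actions: each action declares where control goes on success and on error.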


Learn How To Coordinate Hadoop Clusters Using ZooKeeper

Hadoop was designed to be a distributed system that scales up to thousands of nodes. Even with a cluster of a few hundred nodes, managing all those servers is not easy, and problems that can kill your application, such as deadlocks, inconsistency and race conditions, arise. ZooKeeper was developed to solve the challenges that arise when administering a large
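ZooKeeper coordinates distributed processes through znodes, a small filesystem-like tree kept consistent across the ensemble. A short zkCli.sh session sketch (the paths and data are examples) shows the persistent, ephemeral and sequential znode flavors that coordination recipes such as locks and membership lists are built from.

```
# Persistent znode: survives client disconnects
create /app "config"

# Ephemeral znode: removed automatically if this session dies,
# which lets other clients detect a crashed worker
create -e /app/worker1 ""

# Sequential znode: the server appends a monotonically increasing
# counter, the basis of distributed lock recipes
create -s /app/lock- ""

ls /app
get /app
```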


Learn How To Create Topologies In Storm To Process Data

In part 1 of this tutorial, key concepts used in Storm were discussed. In that tutorial it was explained that Storm topologies are expressed as directed acyclic graphs whose nodes are either bolts or spouts. Spouts represent the source of data streams; for example, a Twitter spout is used to acquire
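Wiring spouts and bolts into such a graph is done with Storm's TopologyBuilder. The Java sketch below shows the wiring only; the SentenceSpout, SplitBolt and CountBolt classes are hypothetical and their implementations are elided, so this is an illustration rather than a runnable program.

```java
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;
import org.apache.storm.tuple.Fields;

public class WordCountTopology {
    public static void main(String[] args) throws Exception {
        TopologyBuilder builder = new TopologyBuilder();

        // Spout: the source of the data stream
        builder.setSpout("sentences", new SentenceSpout(), 1);

        // Bolt: splits sentences into words; shuffleGrouping spreads
        // tuples randomly across the bolt's tasks
        builder.setBolt("split", new SplitBolt(), 2)
               .shuffleGrouping("sentences");

        // Bolt: counts words; fieldsGrouping routes tuples with the
        // same "word" field to the same task, so counts stay correct
        builder.setBolt("count", new CountBolt(), 2)
               .fieldsGrouping("split", new Fields("word"));

        // Submit the directed acyclic graph to a local cluster for testing
        new LocalCluster().submitTopology("word-count", new Config(),
                                          builder.createTopology());
    }
}
```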