Latest

Hadoop vs Teradata : When to Use Which

While there are certain use cases that are distinct to Hadoop or the data warehouse, there is also overlap where either technology could be effective. The following table is a good starting point for deciding which platform to use based on your requirements.
Requirement | Data Warehouse | Hadoop
Low latency, interactive reports, and OLAP | YES
ANSI 2003 SQL compliance ...

Mastering the Five Stages of Analytics Maturity

What Is Analytics Maturity? “Most organizations use analytics across their operations, but there are still many that have misconceptions about how well they utilize analytics to inform their business decisions,” explains Venkat Viswanathan, founder and chairman of LatentView Analytics. LatentView developed a free online tool, the Analytics Maturity Self-Assessment, allowing individuals to answer a survey about ...

Hadoop Architecture

mapreduce

Apache Hadoop is an open-source software framework for storage and large-scale processing of data sets on clusters of commodity hardware. There are mainly five building blocks inside this runtime environment (from bottom to top): the cluster is the set of host machines (nodes). Nodes may be partitioned in racks. This is the hardware part of the ...

Anatomy of a MapReduce Job

In MapReduce, a YARN application is called a Job. The implementation of the Application Master provided by the MapReduce framework is called MRAppMaster.
Timeline of a MapReduce Job
This is the timeline of a MapReduce Job execution:
Map Phase: several Map Tasks are executed
Reduce Phase: several Reduce Tasks are executed
Notice that the Reduce Phase may start ...
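The two phases in the timeline above can be illustrated with a minimal in-process sketch. This is only a toy word count in plain Python, not the Hadoop API: the names `map_task`, `shuffle`, `reduce_task`, and `run_job` are hypothetical, everything runs sequentially in one process, and no YARN or MRAppMaster is involved.

```python
from collections import defaultdict


def map_task(split):
    """One Map Task: emit a (word, 1) pair for every word in its input split."""
    return [(word.lower(), 1) for line in split for word in line.split()]


def shuffle(map_outputs):
    """Group the intermediate values by key across all Map Task outputs."""
    grouped = defaultdict(list)
    for output in map_outputs:
        for key, value in output:
            grouped[key].append(value)
    return grouped


def reduce_task(key, values):
    """One reduce call: sum the counts collected for a single key."""
    return key, sum(values)


def run_job(splits):
    # Map Phase: one Map Task per input split.
    map_outputs = [map_task(split) for split in splits]
    # Reduce Phase: one reduce call per distinct key.
    grouped = shuffle(map_outputs)
    return dict(reduce_task(key, values) for key, values in grouped.items())


# Two hypothetical input splits, each a list of text lines.
splits = [["hadoop runs map tasks"], ["hadoop runs reduce tasks"]]
counts = run_job(splits)
```

In this sketch the Reduce Phase only begins after every Map Task has finished, which is a simplification of the real timeline.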

SQL vs HIVE vs PIG

PIG HIVE

SQL | HIVE | PIG
SQL is the oldest data analysis option among the three, and its ability to update itself in line with growing user expectations makes it relevant even today. SQL is Structured Query Language, which is mainly used for OLTP, that is, day-to-day operations. SQL doesn't support petabytes or terabytes of data as Hive ...

Hive vs Pig

Pig and Hive are two key components of the Hadoop ecosystem. But why Pig, and why Hive, is a question that strikes many Hadoop newbies. Pig and Hive have a similar goal: they are tools that ease the complexity of writing Java MapReduce programs. However, when to use Pig Latin ...

WebHCat Reference

WebHCat Reference: DDL Resources
This is an overview page for the WebHCat DDL resources. The full list of WebHCat resources. For information about HCatalog DDL commands, see HCatalog DDL. For information about Hive DDL commands, see Hive Data Definition Language.
Object | Resource (Type) | Description
DDL Command | ddl (POST) | Perform an HCatalog DDL command.
Database | ddl/database (GET) | List HCatalog databases.
 | ddl/database/:db (GET) | Describe an HCatalog database.
 | ddl/database/:db (PUT) | Create ...
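As a rough illustration of how the resource paths listed above map onto request URLs, the sketch below only builds (method, URL) pairs and sends nothing. The host `localhost`, the user `hive`, the database `sales`, and the helper `build_url` are assumptions for illustration. The port 50111 and the `/templeton/v1` prefix are the usual WebHCat defaults, but a given cluster may be configured differently.

```python
from urllib.parse import quote, urlencode

# Assumed server location; 50111 and /templeton/v1 are the usual WebHCat defaults.
BASE_URL = "http://localhost:50111/templeton/v1"

# (HTTP method, resource template) pairs taken from the listing above.
DDL_RESOURCES = [
    ("POST", "ddl"),
    ("GET", "ddl/database"),
    ("GET", "ddl/database/:db"),
    ("PUT", "ddl/database/:db"),
]


def build_url(template, user, db=None):
    """Hypothetical helper: fill in :db and append the user.name parameter."""
    if ":db" in template:
        if db is None:
            raise ValueError("this resource requires a database name")
        template = template.replace(":db", quote(db, safe=""))
    return f"{BASE_URL}/{template}?{urlencode({'user.name': user})}"


# Build every request from the listing for a hypothetical user and database.
requests_to_send = [
    (method, build_url(template, user="hive", db="sales"))
    for method, template in DDL_RESOURCES
]
```

Each pair can then be handed to whichever HTTP client is in use. Keeping URL construction separate from the network call makes the mapping easy to check without a running server.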