
Apache Pig Interview Question-Answer

By Smart Answer


Q.1 You can run Pig in batch mode using __________.

       A. Pig shell command

       B. Pig Latin statements

       C. Pig scripts

       D. All of the options

Ans : Pig scripts
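
For reference, a minimal sketch of batch mode, assuming a script file called wordcount.pig and an input file input.txt (both names are illustrative):

       -- wordcount.pig: Pig Latin statements saved to a file
       lines  = LOAD 'input.txt' AS (line:chararray);
       words  = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
       grpd   = GROUP words BY word;
       counts = FOREACH grpd GENERATE group, COUNT(words);
       STORE counts INTO 'wordcount_out';

The script is then submitted in batch mode with pig wordcount.pig (or pig -x local wordcount.pig on a single machine).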


Q.2 Which of the following is correct about Pig?

       A. Pig may generate a different number of Hadoop jobs given a particular script, dependent on the amount/type of data that is being processed

       B. Pig replaces the MapReduce core with its own execution engine

       C. When doing a default join, Pig will detect which join-type is probably the most efficient

       D. Pig always generates the same number of Hadoop jobs given a particular script, independent of the amount/type of data that is being processed

Ans : Pig may generate a different number of Hadoop jobs given a particular script, dependent on the amount/type of data that is being processed


Q.3 Pig Latin statements are generally organized in ____________.

       A. A series of “transformation” statements to process the data

       B. A DUMP statement to view results or a STORE statement to save the results

       C. A LOAD statement to read data from the file system

       D. All of the options

Ans : All of the options
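
A hedged sketch of that typical flow; the file name, delimiter, and fields below are assumptions:

       emp  = LOAD 'employees.csv' USING PigStorage(',')
                  AS (name:chararray, dept:chararray, salary:int);   -- 1. LOAD
       rich = FILTER emp BY salary > 50000;                          -- 2. transformation
       DUMP rich;                                                    -- 3a. view results
       STORE rich INTO 'high_earners';                               -- 3b. or save them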


Q.4 Which of the following is false about Pig operators?

       A. To run Pig in mapreduce mode, you need access to a Hadoop cluster and HDFS installation

       B. The DISPLAY operator will display the results to your terminal screen

       C. To run Pig in local mode, you need access to a single machine

       D. All of the options

Ans : The DISPLAY operator will display the results to your terminal screen


Q.5 Which command is used to run Pig in local mode?

       A. pig -x local

       B. pig -x tez-local

       C. pig

       D. None of the options

Ans : pig -x local
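
A quick sketch of both uses of local mode (the script name is illustrative):

       pig -x local                  # open the grunt shell against the local filesystem
       pig -x local myscript.pig     # run a script in local mode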


Q.6 Which of the following is correct?

       A. Pig is an execution engine that utilizes the MapReduce core in Hadoop

       B. Pig is an execution engine that compiles Pig Latin scripts into HDFS

       C. Pig is an execution engine that replaces the MapReduce core in Hadoop

       D. Pig is an execution engine that compiles Pig Latin scripts into database queries

Ans : Pig is an execution engine that utilizes the MapReduce core in Hadoop


Q.7 Pig Latin statements are generally organized in ____________.

       A. A LOAD statement to read data from the file system

       B. A series of “transformation” statements to process the data

       C. A DUMP statement to view results or a STORE statement to save the results

       D. All of the options

Ans : All of the options


Q.8 Interactive mode of Pig is ____________

       A. grunt

       B. FS

       C. HDFS

       D. None of the options

Ans : grunt
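
A short illustrative grunt session (file name and fields are assumptions):

       $ pig -x local
       grunt> emp = LOAD 'employees.csv' USING PigStorage(',') AS (name:chararray, salary:int);
       grunt> DUMP emp;
       grunt> quit;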


Q.9 Which mode does PigUnit work by default?

       A. tez

       B. mapreduce

       C. local

       D. None of the options

Ans : local


Q.10 Which of the following is an entry in jobconf?

       A. pig.feature

       B. pig.job

       C. pig.input.dirs

       D. None of the options

Ans : pig.input.dirs


Q.11 Which of the following helps to unit test Pig scripts?

       A. PigUnitX

       B. PigXUnit

       C. PigUnit

       D. None of the options

Ans : PigUnit


Q.12 You are asked to find the unique names in the file. Which operator will you choose?

       A. filter, distinct

       B. filter

       C. foreach, distinct

       D. foreach

Ans : foreach, distinct
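
A sketch of that combination, assuming a file with a name column:

       people = LOAD 'people.txt' AS (name:chararray, age:int);
       names  = FOREACH people GENERATE name;   -- project only the name column
       uniq   = DISTINCT names;                 -- remove duplicate names
       DUMP uniq;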


Q.13 Which of the following is used to deal with metadata?

       A. LoadCaster

       B. LoadPushDown

       C. LoadMetadata

       D. All of the options

Ans : LoadMetadata


Q.14 Which function is used to return HDFS files to ship to the distributed cache?

       A. getShipFiles()

       B. setUdfContextSignature()

       C. relativeToAbsolutePath()

       D. getCacheFiles()

Ans : getCacheFiles()


Q.15 TOP() is used to find the top-n tuples of a group.

       A. True

       B. False

Ans : True
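
More precisely, the built-in TOP(n, column, bag) returns the top-n tuples of a bag ranked by the given (zero-based) column; a hedged sketch with assumed data:

       scores    = LOAD 'scores.txt' AS (player:chararray, game:chararray, points:int);
       by_player = GROUP scores BY player;
       -- top 2 tuples per player, ranked by column index 2 (points)
       best      = FOREACH by_player GENERATE group, TOP(2, 2, scores);
       DUMP best;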


Q.16 What is Piggybank?

       A. It’s a framework

       B. It’s a platform

       C. It’s a repository

       D. None of the options

Ans : It’s a repository
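
Piggybank is the community repository of contributed Pig UDFs, shipped as a jar that you REGISTER; a sketch in which the jar path is an assumption and the class name follows the usual piggybank package layout:

       REGISTER /usr/lib/pig/piggybank.jar;     -- illustrative path to the piggybank jar
       DEFINE Reverse org.apache.pig.piggybank.evaluation.string.Reverse();
       names = LOAD 'people.txt' AS (name:chararray);
       rev   = FOREACH names GENERATE Reverse(name);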


Q.17 Pig Latin is __________ while SQL is declarative.

       A. procedural

       B. functional

       C. declarative

       D. None of the options

Ans : procedural
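
The procedural style shows up as an explicit sequence of named steps rather than one declarative statement; a rough sketch with assumed data:

       emp     = LOAD 'employees.csv' USING PigStorage(',')
                     AS (name:chararray, dept:chararray, salary:int);
       by_dept = GROUP emp BY dept;
       avg_sal = FOREACH by_dept GENERATE group AS dept, AVG(emp.salary);
       DUMP avg_sal;
       -- roughly the declarative SQL equivalent, expressed in a single statement:
       -- SELECT dept, AVG(salary) FROM employees GROUP BY dept;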


Q.18 Pig uses __________.

       A. Lazy evaluation

       B. pipeline splits

       C. ETL

       D. All of the options

Ans : All of the options
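
Lazy evaluation means Pig only builds a logical plan as statements are entered and launches jobs when output is actually requested; a sketch:

       raw     = LOAD 'access_log.txt' AS (url:chararray, hits:int);   -- nothing executes yet
       popular = FILTER raw BY hits > 100;                             -- still only a plan
       DUMP popular;   -- only now are the underlying Hadoop job(s) launched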


Q.19 Which of the following is not a scalar data type?

       A. long

       B. int

       C. float

       D. Map

Ans : Map
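
For context: int, long, and float (along with double, chararray, and bytearray) are scalar types, while map, tuple, and bag are complex types; a schema sketch with assumed fields:

       -- id, age, and score use scalar types; props is a complex map type
       users = LOAD 'users.txt'
                   AS (id:long, age:int, score:float, props:map[chararray]);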


Q.20 Which operator is used to view the schema of a table?

       A. DUMP

       B. DESCRIBE

       C. STORE

       D. EXPLAIN

Ans : DESCRIBE
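
A grunt sketch of DESCRIBE; the relation and output shown are illustrative:

       grunt> emp = LOAD 'employees.csv' AS (name:chararray, dept:chararray, salary:int);
       grunt> DESCRIBE emp;
       emp: {name: chararray,dept: chararray,salary: int}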


Q.21 Which of the following is true about Pig?

       A. Pig works with data from many sources

       B. LoadPredicatePushdown is same as LoadMetadata.setPartitionFilter

       C. getOutputFormat() is called by Pig to get the InputFormat used by the loader

       D. None of the options

Ans : Pig works with data from many sources


Q.22 There is no connection between aggregate functions and group.

       A. True

       B. False

Ans : False
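
The connection is visible in everyday use: aggregate functions such as COUNT and SUM are applied to the bags that GROUP produces; a sketch with assumed data:

       sales   = LOAD 'sales.txt' AS (store:chararray, amount:double);
       grouped = GROUP sales BY store;
       totals  = FOREACH grouped GENERATE group AS store, COUNT(sales), SUM(sales.amount);
       DUMP totals;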


Q.23 Which operator is used to set values for keys used in Pig?

       A. show

       B. declare

       C. DESCRIBE

       D. set

Ans : set
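
A few typical SET statements; the values are only illustrative:

       set job.name 'monthly-report';   -- label for the resulting Hadoop job
       set default_parallel 10;         -- number of reducers to request
       set debug on;                    -- enable debug-level logging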


Q.24 Which command is used to run a Pig script from the grunt shell?

       A. run

       B. All of the options

       C. fetch

       D. declare

Ans : run


Q.25 Which of the following is used for debugging?

       A. exec

       B. execute

       C. error

       D. throw

Ans : exec
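
For reference, both run and exec launch a script from grunt; a sketch (the script name is an assumption). run executes the script inside the current grunt session, so its aliases remain available for interactive inspection, while exec runs it in a separate context and leaves the session's aliases untouched.

       grunt> run cleanup.pig
       grunt> exec cleanup.pig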


Q.26 Which of the following is used to view the map reduce execution steps?

       A. DESCRIBE

       B. explain

       C. declare

       D. show

Ans : explain
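
A grunt sketch of EXPLAIN, which prints the logical, physical, and MapReduce plans for an alias (data and fields are assumptions):

       grunt> emp  = LOAD 'employees.csv' AS (name:chararray, salary:int);
       grunt> rich = FILTER emp BY salary > 50000;
       grunt> EXPLAIN rich;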


Q.27 Which of the following is true?

       A. To implement a task, the number of lines of code in Pig and Hadoop are roughly the same

       B. Code written for the Pig engine is directly compiled into machine code

       C. Pig makes use of Hadoop job chaining

       D. None of the options

Ans : Pig makes use of Hadoop job chaining


Q.28 Pig Latin statements are generally organized in ____________.

       A. A DUMP statement to view results or a STORE statement to save the results

       B. A series of “transformation” statements to process the data

       C. A LOAD statement to read data from the file system

       D. All of the options

Ans : All of the options


Q.29 pig -x tez_local will enable ____ mode in Pig.

       A. Mapreduce

       B. tez

       C. local

       D. None of the options

Ans : tez
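
For reference, the execution-mode flags (the actual flag uses an underscore, tez_local; the script name is illustrative):

       pig -x mapreduce myscript.pig   # default: run on a Hadoop MapReduce cluster
       pig -x tez myscript.pig         # run on a Tez cluster
       pig -x local myscript.pig       # run locally in a single JVM
       pig -x tez_local myscript.pig   # Tez execution engine, but entirely local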


Q.30 Which of the following is the default mode ?

       A. Mapreduce

       B. Local

       C. Tez

       D. None of the options

Ans : Mapreduce


Q.31 The data can be loaded with or without defining the schema.

       A. True

       B. False

Ans : True
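
Both forms below are valid; without a schema the fields are referenced positionally as $0, $1, and so on (file and fields are assumptions):

       with_schema    = LOAD 'people.txt' AS (name:chararray, age:int);
       without_schema = LOAD 'people.txt';
       adults         = FILTER without_schema BY (int)$1 > 17;   -- positional reference, cast as needed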


Q.32 Which of the following guarantees that Hadoop provides does Pig break?

       A. All values associated with a single key are processed by the same Reducer

       B. The Combiner (if defined) may run multiple times, on the Map-side as well as the Reduce-side

       C. Task stragglers due to slow machines (not data skew) can be sped up through speculative execution

       D. Calls to the Reducer’s reduce() method only occur after the last Mapper has finished running

Ans : All values associated with a single key are processed by the same Reducer
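
The guarantee in question is bent by Pig's ORDER BY, which runs a sampling job and may spread a heavily skewed key across several reducers to balance the total sort; the statement itself looks ordinary (data is assumed):

       logs   = LOAD 'access_log.txt' AS (url:chararray, hits:int);
       sorted = ORDER logs BY hits DESC;   -- adds a sampling pass; skewed keys may span reducers
       STORE sorted INTO 'sorted_logs';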

