
HiveServer2 Clients

Beeline – Command Line Shell

Example

% bin/beeline 
Hive version 0.11.0-SNAPSHOT by Apache
beeline> !connect jdbc:hive2://localhost:10000 scott tiger
!connect jdbc:hive2://localhost:10000 scott tiger 
Connecting to jdbc:hive2://localhost:10000
Connected to: Hive (version 0.10.0)
Driver: Hive (version 0.10.0-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
0: jdbc:hive2://localhost:10000> show tables;
show tables;
+-------------------+
|     tablename     |
+-------------------+
| Alien             |
+-------------------+
1 rows selected (1.079 seconds)

You can also specify the connection parameters on the command line. This means the full command, including the connection string, can later be found in your UNIX shell history.
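For scripted use, the command-line form can be assembled programmatically. The following is a minimal Python sketch, assuming a `beeline` executable is on the PATH and reusing the example URL and the scott/tiger credentials from the session above. The helper name `beeline_argv` is hypothetical and not part of Hive; the -u, -n and -p options are the ones described under Beeline Command Options.

```python
import shlex

def beeline_argv(url, user=None, password=None):
    """Build a beeline argument list using the -u, -n and -p options."""
    argv = ["beeline", "-u", url]
    if user is not None:
        argv += ["-n", user]
    if password is not None:
        argv += ["-p", password]
    return argv

argv = beeline_argv("jdbc:hive2://localhost:10000", "scott", "tiger")
# Print the equivalent command line, quoted for a POSIX shell.
print(shlex.join(argv))
```

The list can be handed to `subprocess.run(argv)` to launch the client. Because a password passed this way is visible in the process arguments and in shell history, a prompt or a password file may be preferable outside of tests.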

Beeline Commands

Command

Description

!<SQLLine command> List of SQLLine commands available at http://sqlline.sourceforge.net/.

Example: !quit exits the Beeline client.

Beeline Hive Commands

Hive-specific commands (the same as Hive CLI commands) can be run from Beeline when the Hive JDBC driver is used.

Use “;” (semicolon) to terminate commands. Comments in scripts can be specified using the “--” prefix.

Command

Description

reset Resets the configuration to the default values.
set <key>=<value> Sets the value of a particular configuration variable (key).
Note: If you misspell the variable name, Beeline will not show an error.
set Prints a list of configuration variables that are overridden by the user or Hive.
set -v Prints all Hadoop and Hive configuration variables.
add FILE[S] <filepath> <filepath>*
add JAR[S] <filepath> <filepath>*
add ARCHIVE[S] <filepath> <filepath>*
Adds one or more files, jars, or archives to the list of resources in the distributed cache. See Hive Resources for more information.
add FILE[S] <ivyurl> <ivyurl>* 
add JAR[S] <ivyurl> <ivyurl>* 
add ARCHIVE[S] <ivyurl> <ivyurl>*
As of Hive 1.2.0, adds one or more files, jars or archives to the list of resources in the distributed cache using an Ivy URL of the form ivy://group:module:version?query_string. See Hive Resources for more information.
list FILE[S]
list JAR[S]
list ARCHIVE[S]
Lists the resources already added to the distributed cache. See Hive Resources for more information. (As of Hive 0.14.0: HIVE-7592).
list FILE[S] <filepath>*
list JAR[S] <filepath>*
list ARCHIVE[S] <filepath>*
Checks whether the given resources are already added to the distributed cache or not. See Hive Resources for more information.
delete FILE[S] <filepath>*
delete JAR[S] <filepath>*
delete ARCHIVE[S] <filepath>*
Removes the resource(s) from the distributed cache.
delete FILE[S] <ivyurl> <ivyurl>* 
delete JAR[S] <ivyurl> <ivyurl>* 
delete ARCHIVE[S] <ivyurl> <ivyurl>*
As of Hive 1.2.0, removes the resource(s) which were added using the <ivyurl> from the distributed cache. See Hive Resources for more information.
reload As of Hive 0.14.0, makes HiveServer2 aware of any jar changes in the path specified by the configuration parameter hive.reloadable.aux.jars.path (without needing to restart HiveServer2). The changes can be adding, removing, or updating jar files.
dfs <dfs command> Executes a dfs command.
<query string> Executes a Hive query and prints results to standard output.

Beeline Command Options

The Beeline CLI supports these command line options:

Option

Description

-u <database URL> The JDBC URL to connect to.

Usage: beeline -u db_URL

-r Reconnect to last used URL (if a user has previously used !connect to a URL and used !save to a beeline.properties file).

Usage: beeline -r

Version: 2.1.0 (HIVE-13670)

-n <username> The username to connect as.

Usage: beeline -n valid_user

-p <password> The password to connect with.

Usage: beeline -p valid_password

Optional password mode:

Starting with Hive 2.2.0 (HIVE-13589), the argument for the -p option is optional.

Usage: beeline -p [valid_password]

If the password is not provided after -p, Beeline prompts for it while initiating the connection. When the password is provided, Beeline uses it to initiate the connection without prompting.

-d <driver class> The driver class to use.

Usage: beeline -d driver_class

-e <query> Query that should be executed. Double or single quotes enclose the query string. This option can be specified multiple times.

Usage: beeline -e "query_string"

Support to run multiple SQL statements separated by semicolons in a single query_string: 1.2.0 (HIVE-9877)
Bug fix (null pointer exception): 0.13.0 (HIVE-5765)
Bug fix (--headerInterval not honored): 0.14.0 (HIVE-7647)
Bug fix (running -e in background): 1.3.0 and 2.0.0 (HIVE-6758); workaround available for earlier versions
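Since -e can be given more than once, a script can emit one -e option per statement. This is a minimal Python sketch under the assumption that `beeline` is on the PATH; the helper name `beeline_query_argv` is hypothetical, and the URL is the example one used earlier.

```python
def beeline_query_argv(url, queries):
    """Return a beeline argument list with one -e option per query."""
    argv = ["beeline", "-u", url]
    for query in queries:
        # Each query is its own list element, so no shell quoting is needed.
        argv += ["-e", query]
    return argv

argv = beeline_query_argv(
    "jdbc:hive2://localhost:10000",
    ["show databases;", "show tables;"],
)
print(argv)
```

Passing the list to `subprocess.run(argv, check=True)` avoids having to wrap each query in single or double quotes, which is only required when the command is typed into a shell.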

-f <file> Script file that should be executed.

Usage: beeline -f filepath

Version: 0.12.0 (HIVE-4268)
Note: If the script contains tabs, query compilation fails in version 0.12.0. This bug is fixed in version 0.13.0 (HIVE-6359).
Bug fix (running -f in background): 1.3.0 and 2.0.0 (HIVE-6758); workaround available for earlier versions
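A script file for -f follows the rules given under Beeline Hive Commands: statements end with a semicolon and comments start with "--". The sketch below writes such a file and builds the matching argument list. It is illustrative only: the `write_script` helper and the `.hql` suffix are assumptions, and the statements are taken from the commands listed earlier on this page.

```python
import os
import tempfile

# Statements are terminated with ";" and comments use the "--" prefix.
# The text deliberately contains no tab characters (see the 0.12.0 note above).
SCRIPT = """\
-- List the tables in the current database.
show tables;
-- Print the configuration variables overridden by the user or Hive.
set;
"""

def write_script(text):
    """Write the script text to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".hql")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    return path

path = write_script(SCRIPT)
argv = ["beeline", "-u", "jdbc:hive2://localhost:10000", "-f", path]
print(argv)
```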

-i (or) --init <file or files> The init files for initialization.

Usage: beeline -i /tmp/initfile

Single file:

Version: 0.14.0 (HIVE-6561)

Multiple files:

Version: 2.1.0 (HIVE-11336)

-w (or) --password-file <password file> The file to read the password from.

Version: 1.2.0 (HIVE-7175)

-a (or) --authType <auth type> The authentication type passed to JDBC as an auth property.

Version: 0.13.0 (HIVE-5155)

--property-file <file> File to read configuration properties from.

Usage: beeline --property-file /tmp/a

Version: 2.2.0 (HIVE-13964)

--hiveconf property=value Use value for the given configuration property. Properties that are listed in hive.conf.restricted.list cannot be reset with hiveconf (see Restricted List and Whitelist).

Usage: beeline --hiveconf prop1=value1

Version: 0.13.0 (HIVE-6173)

--hivevar name=value Hive variable name and value. This is a Hive-specific setting in which variables can be set at the session level and referenced in Hive commands or queries.

Usage: beeline --hivevar var1=value1
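To illustrate how a session-level variable is set and then referenced, here is a small Python sketch. It assumes `beeline` is on the PATH and that the variable is referenced with Hive's `${hivevar:name}` substitution syntax; the helper `hivevar_args`, the variable name `tbl`, and the query are hypothetical, with the table name borrowed from the example session at the top of this page.

```python
def hivevar_args(variables):
    """Turn a dict into repeated --hivevar name=value options."""
    args = []
    for name, value in variables.items():
        args += ["--hivevar", f"{name}={value}"]
    return args

argv = ["beeline", "-u", "jdbc:hive2://localhost:10000"]
argv += hivevar_args({"tbl": "Alien"})
# The query is a plain string (not an f-string), so ${hivevar:tbl}
# is passed through unchanged and substituted by Hive, not by Python.
argv += ["-e", "SELECT * FROM ${hivevar:tbl} LIMIT 5;"]
print(argv)
```

When the same command is typed into a shell rather than passed as a list, the query should be enclosed in single quotes so that the shell does not try to expand `${...}` itself.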

--color=[true/false] Control whether color is used for display. Default is false.

Usage: beeline --color=true

(Not supported for Separated-Value Output formats. See HIVE-9770)

--showHeader=[true/false] Show column names in query results (true) or not (false). Default is true.

Usage: beeline --showHeader=false

--headerInterval=ROWS The interval for redisplaying column headers, in number of rows, when outputformat is table. Default is 100.

Usage: beeline --headerInterval=50

(Not supported for Separated-Value Output formats. See HIVE-9770)

--fastConnect=[true/false] When connecting, skip building a list of all tables and columns for tab-completion of HiveQL statements (true) or build the list (false). Default is true.

Usage: beeline --fastConnect=false

--autoCommit=[true/false] Enable/disable automatic transaction commit. Default is false.

Usage: beeline --autoCommit=true

--verbose=[true/false] Show verbose error messages and debug information (true) or do not show (false). Default is false.

Usage: beeline --verbose=true

--showWarnings=[true/false] Display warnings that are reported on the connection after issuing any HiveQL commands. Default is false.

Usage: beeline --showWarnings=true

--showDbInPrompt=[true/false] Display the current database name in the prompt. Default is false.

Usage: beeline --showDbInPrompt=true

Version: 2.2.0 (HIVE-14123)

--showNestedErrs=[true/false] Display nested errors. Default is false.

Usage: beeline --showNestedErrs=true

--numberFormat=[pattern] Format numbers using a DecimalFormat pattern.

Usage: beeline --numberFormat="#,###,##0.00"

--force=[true/false] Continue running script even after errors (true) or do not continue (false). Default is false.

Usage: beeline --force=true

--maxWidth=MAXWIDTH The maximum width to display before truncating data, in characters, when outputformat is table. Default is to query the terminal for current width, then fall back to 80.

Usage: beeline --maxWidth=150

--maxColumnWidth=MAXCOLWIDTH The maximum column width, in characters, when outputformat is table. Default is 50 in Hive version 2.2.0+ (see HIVE-14135) or 15 in earlier versions.

Usage: beeline --maxColumnWidth=25

--silent=[true/false] Reduce the number of informational messages displayed (true) or not (false). It also stops displaying the log messages for the query from HiveServer2 (Hive 0.14 and later) and the HiveQL commands (Hive 1.2.0 and later). Default is false.

Usage: beeline --silent=true

--autosave=[true/false] Automatically save preferences (true) or do not autosave (false). Default is false.

Usage: beeline --autosave=true

--outputformat=[table/vertical/csv/tsv/dsv/csv2/tsv2] Format mode for result display. Default is table. See Separated-Value Output Formats below for description of recommended sv options.

Usage: beeline --outputformat=tsv

Version: dsv/csv2/tsv2 added in 0.14.0 (HIVE-8615)

--truncateTable=[true/false] If true, truncates table columns in the console when they exceed the console length.

Version: 0.14.0 (HIVE-6928)

--delimiterForDSV=DELIMITER The delimiter for delimiter-separated values output format. Default is the ‘|’ character.

Version: 0.14.0 (HIVE-7390)

--isolation=LEVEL Set the transaction isolation level to TRANSACTION_READ_COMMITTED or TRANSACTION_SERIALIZABLE. See the “Field Detail” section in the Java Connection documentation.

Usage: beeline --isolation=TRANSACTION_SERIALIZABLE

--nullemptystring=[true/false] Use historic behavior of printing null as empty string (true) or use current behavior of printing null as NULL (false). Default is false.

Usage: beeline --nullemptystring=false

Version: 0.13.0 (HIVE-4485)

--incremental=[true/false] Defaults to false. When set to false, the entire result set is fetched and buffered before being displayed, yielding optimal display column sizing. When set to true, result rows are displayed immediately as they are fetched, yielding lower latency and memory usage at the price of extra display column padding. Setting --incremental=true is recommended if you encounter an OutOfMemory error on the client side (due to the fetched result set size being large).

--help Display a usage message.

Usage: beeline --help

JDBC Connection URL Format

The HiveServer2 URL is a string with the following syntax:

jdbc:hive2://<host1>:<port1>,<host2>:<port2>/dbName;initFile=<file>;sess_var_list?hive_conf_list#hive_var_list

where

  • <host1>:<port1>,<host2>:<port2> is a server instance or a comma separated list of server instances to connect to (if dynamic service discovery is enabled). If empty, the embedded server will be used.
  • dbName is the name of the initial database.
  • <file> is the path of the init script file (Hive 2.2.0 and later). The script file contains SQL statements that are executed automatically after connection. This option can be empty.
  • sess_var_list is a semicolon separated list of key=value pairs of session variables (e.g., user=foo;password=bar).
  • hive_conf_list is a semicolon separated list of key=value pairs of Hive configuration variables for this session.
  • hive_var_list is a semicolon separated list of key=value pairs of Hive variables for this session.
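
The pieces of the URL listed above can be put together mechanically. The following Python sketch composes a URL in that syntax; the function `hive2_url` and its parameter names are hypothetical, the configuration key in the example is only illustrative, and no escaping of special characters in keys or values is attempted.

```python
def hive2_url(hosts, db="", init_file=None, sess_vars=None,
              hive_conf=None, hive_vars=None):
    """Compose a HiveServer2 JDBC URL from its documented parts."""

    def pairs(mapping):
        # key=value pairs joined by semicolons, as in the syntax above.
        return ";".join(f"{key}={value}" for key, value in mapping.items())

    url = "jdbc:hive2://" + ",".join(f"{host}:{port}" for host, port in hosts)
    url += "/" + db
    if init_file:
        url += f";initFile={init_file}"
    if sess_vars:
        url += ";" + pairs(sess_vars)
    if hive_conf:
        url += "?" + pairs(hive_conf)   # Hive configuration variables
    if hive_vars:
        url += "#" + pairs(hive_vars)   # Hive variables
    return url

print(hive2_url(
    [("localhost", 10000)],
    db="default",
    sess_vars={"user": "foo", "password": "bar"},
    hive_conf={"hive.exec.parallel": "true"},
    hive_vars={"var1": "value1"},
))
```

With a single host and no optional parts, the function returns the same form as the example URL used elsewhere on this page, jdbc:hive2://localhost:10000/default.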

SQuirrel SQL Client

  1. Download, install and start the SQuirrel SQL Client from the SQuirrel SQL website.
  2. Select ‘Drivers -> New Driver…’ to register Hive’s JDBC driver that works with HiveServer2.
    1. Enter the driver name and example URL:
      Name: Hive
      Example URL: jdbc:hive2://localhost:10000/default
  3. Select ‘Extra Class Path -> Add’ to add the following jars from your local Hive and Hadoop distribution.
    HIVE_HOME/lib/hive-jdbc-*-standalone.jar
    HADOOP_HOME/share/hadoop/common/hadoop-common-*.jar

    Version information

    Hive JDBC standalone jars are used in Hive 0.14.0 onward (HIVE-538); for previous versions of Hive, use HIVE_HOME/build/dist/lib/*.jar instead.

    The hadoop-common jars are for Hadoop 2.0; for previous versions of Hadoop, use HADOOP_HOME/hadoop-*-core.jar instead.

  4. Select ‘List Drivers’. This will cause SQuirrel to parse your jars for JDBC drivers and might take a few seconds. From the ‘Class Name’ input box select the Hive driver for working with HiveServer2:
    org.apache.hive.jdbc.HiveDriver
  5. Click ‘OK’ to complete the driver registration.
  6. Select ‘Aliases -> Add Alias…’ to create a connection alias to your HiveServer2 instance.
    1. Give the connection alias a name in the ‘Name’ input box.
    2. Select the Hive driver from the ‘Driver’ drop-down.
    3. Modify the example URL as needed to point to your HiveServer2 instance.
    4. Enter ‘User Name’ and ‘Password’ and click ‘OK’ to save the connection alias.
    5. To connect to HiveServer2, double-click the Hive alias and click ‘Connect’.
