Monday, January 13, 2014

Setup MySQL as Hive metastore

The default Derby database only allows one active user at a time. You may get the error message:
Another instance of Derby may have already booted the database /home/.../metastore_db
in your log if you want to access it using multiple users or JDBC/ODBC connections.

The most popular solution is to use MySQL as the underlying metastore for Hive. It's pretty simple.

1. install mysql

2. download and copy mysql-connector-java-xxxxx.jar to $HIVEHOME/lib folder

3. create hive user/password:
mysql -u root
CREATE USER 'hiveuser'@'pool-08' IDENTIFIED BY 'hivepass';
GRANT ALL PRIVILEGES ON *.* TO 'hiveuser'@'pool-08' WITH GRANT OPTION;


4. add the following configuration into hive-site.xml:

  hive.metastore.local
  true



  javax.jdo.option.ConnectionURL
  jdbc:mysql://pool-08:3306/hive?createDatabaseIfNotExist=true



  javax.jdo.option.ConnectionDriverName
  com.mysql.jdbc.Driver



  javax.jdo.option.ConnectionUserName
  hiveuser



  javax.jdo.option.ConnectionPassword
  hivepass


5. // Starting Hive Metastore
hive --service metastore &

// Starting Hive Server
hive --service hiveserver &

If you get the error:
org.apache.thrift.transport.TTransportException: Could not create ServerSocket on address 0.0.0.0/0.0.0.0:10000.
That means some earlier process already using the port. Check the service and kill it:
sudo lsof -i -P | grep -i "listen"
kill -9 pid

Then retry and you should be fine.