Connecting Spark SQL to Hive

June 19, 2022

Server side (hive-site.xml in the hive/conf directory on host bigdata166):

<configuration> 
  <property> 
    <name>hive.metastore.warehouse.dir</name>  
    <value>/user/hive/warehouse</value> 
  </property>  
  <property> 
    <name>javax.jdo.option.ConnectionURL</name>  
    <value>jdbc:mysql://bigdata166:3306/metastore?serverTimezone=UTC</value> 
  </property>  
  <property> 
    <name>javax.jdo.option.ConnectionDriverName</name>  
    <value>com.mysql.jdbc.Driver</value> 
  </property>  
  <property> 
    <name>javax.jdo.option.ConnectionUserName</name>  
    <value>root</value> 
  </property>  
  <property> 
    <name>javax.jdo.option.ConnectionPassword</name>  
    <value>000000</value> 
  </property>  
  <property> 
    <name>hive.querylog.location</name>  
    <value>/user/hive/iotmp</value> 
  </property>  
  <property> 
    <name>hive.server2.logging.operation.log.location</name>  
    <value>/user/hive/operation_logs</value> 
  </property>  
  <property> 
    <name>datanucleus.readOnlyDatastore</name>  
    <value>false</value> 
  </property>  
  <property> 
    <name>datanucleus.fixedDatastore</name>  
    <value>false</value> 
  </property>  
  <property> 
    <name>datanucleus.autoCreateSchema</name>  
    <value>true</value> 
  </property>  
  <property> 
    <name>datanucleus.autoCreateTables</name>  
    <value>true</value> 
  </property>  
  <property> 
    <name>datanucleus.autoCreateColumns</name>  
    <value>true</value> 
  </property> 
  <property> 
    <name>datanucleus.schema.autoCreateAll</name>  
    <value>true</value> 
  </property> 
</configuration>

 

Start the server side: hive --service metastore

This command stays in the foreground and occupies the session; append & to run it in the background: hive --service metastore &

Client side (hive-site.xml; the hive directory from the server host also needs to be copied to the client machine):

<configuration> 
  <property> 
    <name>hive.metastore.warehouse.dir</name>  
    <value>/user/hive/warehouse</value> 
  </property>  
  <property> 
    <name>hive.metastore.local</name>  
    <value>false</value> 
  </property>  
  <property> 
    <name>hive.metastore.uris</name>  
    <value>thrift://bigdata166:9083</value> 
  </property> 
</configuration>

Start the client: hive (used only for testing and for checking the results of Spark/Hive interoperation)

Then copy the client configuration file (hive-site.xml) into the conf directory of Spark on the host.

Then copy core-site.xml and hdfs-site.xml into spark/conf on the host:

cp /usr/local/hadoop-2.8.4/etc/hadoop/core-site.xml ./conf/
cp /usr/local/hadoop-2.8.4/etc/hadoop/hdfs-site.xml ./conf/

On the master node, start Spark: sbin/start-all.sh

The spark-shell must be started in cluster mode. (If the Hive connection is not needed, disable the three files just copied into spark/conf by renaming them: mv xxx .xxx-bak.)
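For comparison with the spark-shell workflow below, the same Hive-backed session can also be opened from a compiled application. A minimal sketch, assuming the cluster and metastore addresses configured above; the app name is arbitrary:

import org.apache.spark.sql.SparkSession

// Build a session with Hive support. A hive-site.xml placed in spark/conf is
// picked up automatically; the metastore URI is repeated here only to make
// the dependency explicit.
val spark = SparkSession.builder()
  .appName("SparkHiveDemo")
  .master("spark://bigdata166:7077")
  .config("hive.metastore.uris", "thrift://bigdata166:9083")
  .enableHiveSupport()
  .getOrCreate()

spark.sql("show tables").show()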

Other operations:

./bin/spark-shell --master spark://bigdata166:7077

spark.sql("show tables").show
	
spark.sql("select * from company.emp limit 10").show
spark.sql("select * from student1 limit 10").show  //hive中的表

spark.sql("create table company.emp_1104(empno Int,ename String,job String,mgr String,hiredate String,sal Int,comm String,deptno Int) row format delimited fields terminated by ','");

spark.sql("load data local inpath '/testdata/emp.csv' overwrite into table company.emp_1104");

spark.sql("select * from company.emp_1104 limit 10").show



scala> spark.sql("show tables").show
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
| default|      emp|      false|
| default| emp_1120|      false|
+--------+---------+-----------+


scala> spark.sql("create table default.student (id Int,name String,age Int) row format delimited fields terminated by "\t"").show
<console>:24: error: value \ is not a member of String
       spark.sql("create table default.student (id Int,name String,age Int) row format delimited fields terminated by "\t"").show
                                                                                                                       ^
<console>:24: error: value t is not a member of StringContext
       spark.sql("create table default.student (id Int,name String,age Int) row format delimited fields terminated by "\t"").show
                                                                                                                        ^

scala> spark.sql("create table default.student (id Int,name String,age Int) row format delimited fields terminated by '\t'").show
++
||
++
++
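The failed attempt above is a Scala problem, not a Hive one: the double quotes around \t close the Scala string literal early, so the compiler sees stray tokens after it. Quoting the delimiter with single quotes inside the SQL, as in the command that succeeds, avoids the clash; a triple-quoted Scala string is another option, sketched here with a hypothetical table name:

spark.sql("""create table default.student2 (id Int, name String, age Int) row format delimited fields terminated by '\t'""").show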


scala> spark.sql("show tables").show
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
| default|      emp|      false|
| default| emp_1120|      false|
| default|  student|      false|
+--------+---------+-----------+


scala> spark.sql("load data local inpath '/opt/module/test/student.txt' into table default.student").show
20/01/07 06:08:10 ERROR KeyProviderCache: Could not find uri with key [dfs.encryption.key.provider.uri] to create a keyProvider !!
++
||
++
++
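(The KeyProviderCache ERROR above relates to HDFS transparent encryption, which is not configured in this setup; the load still succeeds, as the queries below show, and the message can be ignored.)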


scala> spark.sql("show tables").show
+--------+---------+-----------+
|database|tableName|isTemporary|
+--------+---------+-----------+
| default|      emp|      false|
| default| emp_1120|      false|
| default|  student|      false|
+--------+---------+-----------+


scala> spark.sql("select * from student").show
+---+----+----+
| id|name| age|
+---+----+----+
|  1|  吕布|3500|
|  2|  貂蝉|3480|
|  3|  董卓|5000|
|  4|  赵云|2500|
|  5|  张飞|2500|
+---+----+----+
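Any Hive-compatible query runs the same way; for example, a hypothetical aggregate over the table just loaded:

spark.sql("select age, count(*) as cnt from default.student group by age order by age").show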
