博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
HiveQL DML 常用QL示例资料
阅读量:7111 次
发布时间:2019-06-28

本文共 4405 字,大约阅读时间需要 14 分钟。

 hive 2.1.1  DML操作

将文件加载到hive表
//官方指导LOAD DATA [LOCAL] INPATH 'filepath' [OVERWRITE] INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]
//示例LOAD DATA LOCAL INPATH '/root/a.txt' OVERWRITE INTO TABLE testdb.test partition(dd='a',ed='2018');

说明:local 如果没有指定local关键字,则filepath引用表所在的文件系统中的文件。

查询数据到hive表
//官方指导Standard syntax:INSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1 FROM from_statement;INSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1 FROM from_statement; Hive extension (multiple inserts):FROM from_statementINSERT OVERWRITE TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...) [IF NOT EXISTS]] select_statement1[INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2][INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2] ...;FROM from_statementINSERT INTO TABLE tablename1 [PARTITION (partcol1=val1, partcol2=val2 ...)] select_statement1[INSERT INTO TABLE tablename2 [PARTITION ...] select_statement2][INSERT OVERWRITE TABLE tablename2 [PARTITION ... [IF NOT EXISTS]] select_statement2] ...; Hive extension (dynamic partition inserts):INSERT OVERWRITE TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;INSERT INTO TABLE tablename PARTITION (partcol1[=val1], partcol2[=val2] ...) select_statement FROM from_statement;
//示例insert into table test2  partition(area,year) select id,ad,bd,cd,dd,ed  from test where ed='2019';

说明:后面的字段对应的是分区字段

  into是拼接新的值

  overwrite是新值覆盖掉全部旧的值,又分区会覆盖指定分区内的值

表中写入到文件系统
//官方指导Standard syntax:INSERT OVERWRITE [LOCAL] DIRECTORY directory1  [ROW FORMAT row_format] [STORED AS file_format] (Note: Only available starting with Hive 0.11.0)  SELECT ... FROM ... Hive extension (multiple inserts):FROM from_statementINSERT OVERWRITE [LOCAL] DIRECTORY directory1 select_statement1[INSERT OVERWRITE [LOCAL] DIRECTORY directory2 select_statement2] ...  row_format  : DELIMITED [FIELDS TERMINATED BY char [ESCAPED BY char]] [COLLECTION ITEMS TERMINATED BY char]        [MAP KEYS TERMINATED BY char] [LINES TERMINATED BY char]        [NULL DEFINED AS char] (Note: Only available starting with Hive 0.13)
//示例insert  overwrite local directory  '/root' row format delimited fields terminated by '\t ' collection items terminated by ',' map keys terminated by ':' lines terminated by '\n'  select * from test;

说明:如果不指定字段分割方式 将以  ^ A   ^ B  ^C 分割

修改表数据
//官方指导更新只支持在ACID表上执行。 客户端hive.support.concurrency – truehive.enforce.bucketing – true (Not required as of Hive 2.0) hive.exec.dynamic.partition.mode – nonstricthive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager服务端hive.compactor.initiator.on – true (See table below for more details)hive.compactor.worker.threads – a positive number on at least one instance of the Thrift metastore service Standard Syntax:UPDATE tablename SET column = value [, column = value ...] [WHERE expression]
//示例//服务端set hive.compactor.initiator.on = true ;set hive.compactor.worker.threads = 1;//客户端set hive.support.concurrency = true;set hive.enforce.bucketing = true;set hive.exec.dynamic.partition.mode = nonstrict;set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;update testdb.test set ad = 'xiaoming' where id =1 ;

说明:创建的表要为分桶表,还要设置参数为 'transactional'='true'

  不支持子查询

  当where条件匹配上了才能更新

  分区,分桶的列不能更新

  设置hive.optimize.sort.dynamic.partition=false可以提高效率

  属性都可以在hive-site.xml文件中配置 

删除表数据
//官方指导只支持在ACID表上执行。 客户端hive.support.concurrency – truehive.enforce.bucketing – true (Not required as of Hive 2.0) hive.exec.dynamic.partition.mode – nonstricthive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager服务端hive.compactor.initiator.on – true (See table below for more details)hive.compactor.worker.threads – a positive number on at least one instance of the Thrift metastore serviceStandard Syntax:DELETE FROM tablename [WHERE expression]
//示例//服务端set hive.compactor.initiator.on = true ;set hive.compactor.worker.threads = 1;set hive.optimize.sort.dynamic.partition = false;//客户端set hive.support.concurrency = true;set hive.enforce.bucketing = true;set hive.exec.dynamic.partition.mode = nonstrict;set hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager; delete from testdb.test where id = 1;

说明:见修改表数据说明

从hive2.2开始支持表合并

 

转载于:https://www.cnblogs.com/jottings/p/7727571.html

你可能感兴趣的文章
区块链是一种用一种不可变的形式存储数字信息
查看>>
fiddler跨域
查看>>
[LeetCode]两数相加(Add Two Numbers)
查看>>
SQLServer数据库增删改查
查看>>
[前端工坊] 微信小游戏|萌狗冠军之路,纯干货分享!
查看>>
Redux
查看>>
mac 安装 lightgbm 无法导入(以及解决cmake命令无法编译)
查看>>
人人都能懂的Vue源码系列—09—initEvents
查看>>
express+nginx 搭建最简单web项目
查看>>
在Laravel中使用事件记录SQL查询到日志
查看>>
教你如何修改github上的项目语言类型
查看>>
spring mvc中的几类拦截器对比
查看>>
PAI在线深度学习开发产品DSW发布
查看>>
Branckets快捷键
查看>>
赶紧登陆百家号看看,有没有被降级
查看>>
ATAC-Seq 数据分析(上)
查看>>
WPS Office 2019 上架微软商城,全新可定制 UI
查看>>
服务器宕机原因
查看>>
「mysql优化专题」视图应用竟然还可以这么优化?不得不收藏(8)
查看>>
Facebook开源看图问答模型Pythia:拿下VQA比赛冠军就靠它
查看>>