spark-sql integrated with Hive: query execution log

The session below starts the spark-sql CLI against a standalone HA master pair (hadoop1/hadoop2) and runs a few statements against the Hive warehouse: show databases, use default, show tables, and select * from person. The --driver-class-path flag puts the MySQL JDBC driver on the driver's classpath so that the embedded Hive metastore client can reach its MySQL backend.

[root@hadoop1 spark]# spark-sql --master spark://hadoop1:7077,hadoop2:7077 --executor-memory 1g --total-executor-cores 2 --driver-class-path /usr/local/hive/lib/mysql-connector-java-5.1.35-bin.jar
log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.Shell).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
16/11/05 19:00:43 INFO SparkContext: Running Spark version 1.6.2
16/11/05 19:00:43 INFO SecurityManager: Changing view acls to: root
16/11/05 19:00:43 INFO SecurityManager: Changing modify acls to: root
16/11/05 19:00:43 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); users with modify permissions: Set(root)
16/11/05 19:00:44 INFO Utils: Successfully started service 'sparkDriver' on port 45828.
16/11/05 19:00:46 INFO Slf4jLogger: Slf4jLogger started
16/11/05 19:00:46 INFO Remoting: Starting remoting
16/11/05 19:00:47 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.215.133:52901]
16/11/05 19:00:47 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 52901.
16/11/05 19:00:47 INFO SparkEnv: Registering MapOutputTracker
16/11/05 19:00:47 INFO SparkEnv: Registering BlockManagerMaster
16/11/05 19:00:47 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-d22d1ecd-436b-4070-a353-5ff243f76846
16/11/05 19:00:47 INFO MemoryStore: MemoryStore started with capacity 517.4 MB
16/11/05 19:00:47 INFO SparkEnv: Registering OutputCommitCoordinator
16/11/05 19:00:47 INFO Utils: Successfully started service 'SparkUI' on port 4040.
16/11/05 19:00:47 INFO SparkUI: Started SparkUI at http://192.168.215.133:4040
16/11/05 19:00:48 INFO AppClient$ClientEndpoint: Connecting to master spark://hadoop1:7077...
16/11/05 19:00:48 INFO AppClient$ClientEndpoint: Connecting to master spark://hadoop2:7077...

16/11/05 19:00:48 INFO SparkDeploySchedulerBackend: Connected to Spark cluster with app ID app-20161105190048-0000
16/11/05 19:00:48 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 42456.
16/11/05 19:00:48 INFO NettyBlockTransferService: Server created on 42456
16/11/05 19:00:48 INFO BlockManagerMaster: Trying to register BlockManager
16/11/05 19:00:48 INFO BlockManagerMasterEndpoint: Registering block manager 192.168.215.133:42456 with 517.4 MB RAM, BlockManagerId(driver, 192.168.215.133, 42456)
16/11/05 19:00:48 INFO BlockManagerMaster: Registered BlockManager
16/11/05 19:00:48 INFO AppClient$ClientEndpoint: Executor added: app-20161105190048-0000/0 on worker-20161105185605-192.168.215.132-48933 (192.168.215.132:48933) with 1 cores
16/11/05 19:00:48 INFO SparkDeploySchedulerBackend: Granted executor ID app-20161105190048-0000/0 on hostPort 192.168.215.132:48933 with 1 cores, 1024.0 MB RAM
16/11/05 19:00:49 INFO SparkDeploySchedulerBackend: SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.0
16/11/05 19:00:50 INFO AppClient$ClientEndpoint: Executor updated: app-20161105190048-0000/0 is now RUNNING
16/11/05 19:00:51 INFO HiveContext: Initializing execution hive, version 1.2.1
16/11/05 19:00:51 INFO ClientWrapper: Inspected Hadoop version: 2.6.0
16/11/05 19:00:51 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
SET hive.support.sql11.reserved.keywords=false
16/11/05 19:00:52 INFO HiveContext: default warehouse location is /user/hive/warehouse
16/11/05 19:00:52 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.
16/11/05 19:00:52 INFO ClientWrapper: Inspected Hadoop version: 2.6.0
16/11/05 19:00:52 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.6.0
16/11/05 19:00:57 INFO HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
16/11/05 19:00:57 INFO ObjectStore: ObjectStore, initialize called
16/11/05 19:00:58 INFO Persistence: Property datanucleus.cache.level2 unknown - will be ignored
16/11/05 19:00:58 INFO Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
16/11/05 19:00:59 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/11/05 19:01:00 WARN Connection: BoneCP specified but not present in CLASSPATH (or one of dependencies)
16/11/05 19:01:02 INFO ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
16/11/05 19:01:03 INFO SparkDeploySchedulerBackend: Registered executor NettyRpcEndpointRef(null) (hadoop3:35524) with ID 0
16/11/05 19:01:03 INFO BlockManagerMasterEndpoint: Registering block manager hadoop3:40617 with 517.4 MB RAM, BlockManagerId(0, hadoop3, 40617)
16/11/05 19:01:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/11/05 19:01:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/11/05 19:01:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
16/11/05 19:01:05 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
16/11/05 19:01:06 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
16/11/05 19:01:06 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
16/11/05 19:01:06 INFO ObjectStore: Initialized ObjectStore
16/11/05 19:01:06 INFO HiveMetaStore: Added admin role in metastore
16/11/05 19:01:06 INFO HiveMetaStore: Added public role in metastore
16/11/05 19:01:07 INFO HiveMetaStore: No user is added in admin role, since config is empty
16/11/05 19:01:07 INFO HiveMetaStore: 0: get_all_databases
16/11/05 19:01:07 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/11/05 19:01:07 INFO HiveMetaStore: 0: get_functions: db=default pat=*
16/11/05 19:01:07 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_functions: db=default pat=*
16/11/05 19:01:07 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MResourceUri" is tagged as "embedded-only" so does not have its own datastore table.
16/11/05 19:01:07 INFO SessionState: Created local directory: /tmp/862a773f-674f-4949-9934-f6257fc5e434_resources
16/11/05 19:01:07 INFO SessionState: Created HDFS directory: /tmp/hive/root/862a773f-674f-4949-9934-f6257fc5e434
16/11/05 19:01:07 INFO SessionState: Created local directory: /tmp/root/862a773f-674f-4949-9934-f6257fc5e434
16/11/05 19:01:07 INFO SessionState: Created HDFS directory: /tmp/hive/root/862a773f-674f-4949-9934-f6257fc5e434/_tmp_space.db
SET spark.sql.hive.version=1.2.1
SET spark.sql.hive.version=1.2.1
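
The startup log above shows what the CLI wires together: a HiveContext with Hive 1.2.1 compatibility, a metastore ObjectStore that DataNucleus connects to MySQL ("Using direct SQL, underlying DB is MYSQL"), and one executor registered on hadoop3. A minimal Scala sketch of the same setup, assuming hive-site.xml and the MySQL JDBC driver are on the classpath (the role played by --driver-class-path in the command above):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    // Standalone HA master list, same as the --master flag above.
    val conf = new SparkConf()
      .setAppName("spark-sql-hive-demo")
      .setMaster("spark://hadoop1:7077,hadoop2:7077")
    val sc = new SparkContext(conf)

    // spark-sql builds one of these internally; it reads hive-site.xml
    // and talks to the MySQL-backed metastore on first use.
    val hiveContext = new HiveContext(sc)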
spark-sql> show databases;
16/11/05 19:06:59 INFO ParseDriver: Parsing command: show databases
16/11/05 19:07:01 INFO ParseDriver: Parse Completed
16/11/05 19:07:02 INFO PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:02 INFO PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:02 INFO PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:03 INFO PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:03 INFO ParseDriver: Parsing command: show databases
16/11/05 19:07:04 INFO ParseDriver: Parse Completed
16/11/05 19:07:04 INFO PerfLogger: </PERFLOG method=parse start=1478398023173 end=1478398024886 duration=1713 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:04 INFO PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:05 INFO Driver: Semantic Analysis Completed
16/11/05 19:07:05 INFO PerfLogger: </PERFLOG method=semanticAnalyze start=1478398024896 end=1478398025166 duration=270 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO ListSinkOperator: Initializing operator OP[0]
16/11/05 19:07:06 INFO ListSinkOperator: Initialization Done 0 OP
16/11/05 19:07:06 INFO ListSinkOperator: Operator 0 OP initialized
16/11/05 19:07:06 INFO Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:database_name, type:string, comment:from deserializer)], properties:null)
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=compile start=1478398022976 end=1478398026099 duration=3123 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO Driver: Concurrency mode is disabled, not creating a lock manager
16/11/05 19:07:06 INFO PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO Driver: Starting command(queryId=root_20161105190703_c5231ad0-745a-4daa-8a2e-08b3407d74f5): show databases
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=TimeToSubmit start=1478398022976 end=1478398026105 duration=3129 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO Driver: Starting task [Stage-0:DDL] in serial mode
16/11/05 19:07:06 INFO HiveMetaStore: 0: get_all_databases
16/11/05 19:07:06 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_all_databases
16/11/05 19:07:06 INFO DDLTask: results : 1
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=runTasks start=1478398026105 end=1478398026181 duration=76 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=Driver.execute start=1478398026099 end=1478398026181 duration=82 from=org.apache.hadoop.hive.ql.Driver>
OK
16/11/05 19:07:06 INFO Driver: OK
16/11/05 19:07:06 INFO PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=releaseLocks start=1478398026215 end=1478398026215 duration=0 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=Driver.run start=1478398022976 end=1478398026215 duration=3239 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
16/11/05 19:07:06 INFO FileInputFormat: Total input paths to process : 1
16/11/05 19:07:06 INFO ListSinkOperator: 0 finished. closing... 
16/11/05 19:07:06 INFO ListSinkOperator: 0 Close done
16/11/05 19:07:06 INFO PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:06 INFO PerfLogger: </PERFLOG method=releaseLocks start=1478398026441 end=1478398026441 duration=0 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:07:07 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/11/05 19:07:07 INFO DAGScheduler: Got job 0 (processCmd at CliDriver.java:376) with 1 output partitions
16/11/05 19:07:07 INFO DAGScheduler: Final stage: ResultStage 0 (processCmd at CliDriver.java:376)
16/11/05 19:07:07 INFO DAGScheduler: Parents of final stage: List()
16/11/05 19:07:07 INFO DAGScheduler: Missing parents: List()
16/11/05 19:07:07 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[1] at processCmd at CliDriver.java:376), which has no missing parents
16/11/05 19:07:07 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1968.0 B, free 1968.0 B)
16/11/05 19:07:08 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 1224.0 B, free 3.1 KB)
16/11/05 19:07:08 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 192.168.215.133:42456 (size: 1224.0 B, free: 517.4 MB)
16/11/05 19:07:08 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
16/11/05 19:07:08 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[1] at processCmd at CliDriver.java:376)
16/11/05 19:07:08 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
16/11/05 19:07:08 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, hadoop3, partition 0,PROCESS_LOCAL, 2199 bytes)
16/11/05 19:07:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on hadoop3:40617 (size: 1224.0 B, free: 517.4 MB)
16/11/05 19:07:10 INFO DAGScheduler: ResultStage 0 (processCmd at CliDriver.java:376) finished in 2.285 s
16/11/05 19:07:10 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 2268 ms on hadoop3 (1/1)
16/11/05 19:07:10 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 
16/11/05 19:07:10 INFO DAGScheduler: Job 0 finished: processCmd at CliDriver.java:376, took 3.836982 s
default
Time taken: 12.808 seconds, Fetched 1 row(s)

16/11/05 19:07:11 INFO CliDriver: Time taken: 12.808 seconds, Fetched 1 row(s)
spark-sql> 16/11/05 19:07:11 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@...
16/11/05 19:07:11 INFO StatsReportListener: task runtime:(count: 1, mean: 2268.000000, stdev: 0.000000, max: 2268.000000, min: 2268.000000)
16/11/05 19:07:11 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:07:11 INFO StatsReportListener: 2.3 s  2.3 s  2.3 s  2.3 s  2.3 s  2.3 s  2.3 s  2.3 s  2.3 s
16/11/05 19:07:11 INFO StatsReportListener: task result size:(count: 1, mean: 1053.000000, stdev: 0.000000, max: 1053.000000, min: 1053.000000)
16/11/05 19:07:11 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:07:11 INFO StatsReportListener: 1053.0 B  1053.0 B  1053.0 B  1053.0 B  1053.0 B  1053.0 B  1053.0 B  1053.0 B  1053.0 B
16/11/05 19:07:11 INFO StatsReportListener: executor (non-fetch) time pct: (count: 1, mean: 2.733686, stdev: 0.000000, max: 2.733686, min: 2.733686)
16/11/05 19:07:11 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:07:11 INFO StatsReportListener: 3 %  3 %  3 %  3 %  3 %  3 %  3 %  3 %  3 %
16/11/05 19:07:11 INFO StatsReportListener: other time pct: (count: 1, mean: 97.266314, stdev: 0.000000, max: 97.266314, min: 97.266314)
16/11/05 19:07:11 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:07:11 INFO StatsReportListener: 97 %  97 %  97 %  97 %  97 %  97 %  97 %  97 %  97 %
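
Two things are worth noting in the run above. First, show databases is compiled and executed by Hive's own Driver as a DDL task against the metastore; a one-task Spark job (Job 0) then fetches the result row. Second, the 12.808 seconds are mostly one-off warm-up of the Hive session and metastore connection; the statements that follow run in well under a second. The programmatic equivalent, using the hiveContext from the sketch above, would be roughly:

    // The first SQL statement pays the lazy-initialization cost seen above.
    val dbs = hiveContext.sql("show databases")
    dbs.collect().foreach(println)   // prints [default]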


         > use default;

16/11/05 19:10:13 INFO ParseDriver: Parsing command: use default
16/11/05 19:10:13 INFO ParseDriver: Parse Completed
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO ParseDriver: Parsing command: use default
16/11/05 19:10:13 INFO ParseDriver: Parse Completed
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=parse start=1478398213682 end=1478398213684 duration=2 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO HiveMetaStore: 0: get_database: default
16/11/05 19:10:13 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_database: default
16/11/05 19:10:13 INFO Driver: Semantic Analysis Completed
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=semanticAnalyze start=1478398213684 end=1478398213833 duration=149 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=compile start=1478398213679 end=1478398213834 duration=155 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO Driver: Concurrency mode is disabled, not creating a lock manager
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO Driver: Starting command(queryId=root_20161105191013_da499c87-0b8b-4976-8ac2-a9cce4e09a44): use default
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=TimeToSubmit start=1478398213679 end=1478398213834 duration=155 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO Driver: Starting task [Stage-0:DDL] in serial mode
16/11/05 19:10:13 INFO HiveMetaStore: 0: get_database: default
16/11/05 19:10:13 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_database: default
16/11/05 19:10:13 INFO HiveMetaStore: 0: get_database: default
16/11/05 19:10:13 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_database: default
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=runTasks start=1478398213834 end=1478398213874 duration=40 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=Driver.execute start=1478398213834 end=1478398213874 duration=40 from=org.apache.hadoop.hive.ql.Driver>
OK
16/11/05 19:10:13 INFO Driver: OK
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=releaseLocks start=1478398213874 end=1478398213874 duration=0 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=Driver.run start=1478398213679 end=1478398213875 duration=196 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:13 INFO PerfLogger: </PERFLOG method=releaseLocks start=1478398213875 end=1478398213875 duration=0 from=org.apache.hadoop.hive.ql.Driver>
16/11/05 19:10:14 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/11/05 19:10:14 INFO DAGScheduler: Got job 1 (processCmd at CliDriver.java:376) with 1 output partitions
16/11/05 19:10:14 INFO DAGScheduler: Final stage: ResultStage 1 (processCmd at CliDriver.java:376)
16/11/05 19:10:14 INFO DAGScheduler: Parents of final stage: List()
16/11/05 19:10:14 INFO DAGScheduler: Missing parents: List()
16/11/05 19:10:14 INFO DAGScheduler: Submitting ResultStage 1 (MapPartitionsRDD[3] at processCmd at CliDriver.java:376), which has no missing parents
16/11/05 19:10:14 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 1968.0 B, free 5.0 KB)
16/11/05 19:10:14 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 1219.0 B, free 6.2 KB)
16/11/05 19:10:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 192.168.215.133:42456 (size: 1219.0 B, free: 517.4 MB)
16/11/05 19:10:14 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
16/11/05 19:10:14 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 1 (MapPartitionsRDD[3] at processCmd at CliDriver.java:376)
16/11/05 19:10:14 INFO TaskSchedulerImpl: Adding task set 1.0 with 1 tasks
16/11/05 19:10:14 INFO TaskSetManager: Starting task 0.0 in stage 1.0 (TID 1, hadoop3, partition 0,PROCESS_LOCAL, 2059 bytes)
16/11/05 19:10:14 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on hadoop3:40617 (size: 1219.0 B, free: 517.4 MB)
16/11/05 19:10:14 INFO DAGScheduler: ResultStage 1 (processCmd at CliDriver.java:376) finished in 0.151 s
16/11/05 19:10:14 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@...
16/11/05 19:10:14 INFO DAGScheduler: Job 1 finished: processCmd at CliDriver.java:376, took 0.243612 s
Time taken: 0.638 seconds
16/11/05 19:10:14 INFO CliDriver: Time taken: 0.638 seconds
spark-sql> 16/11/05 19:10:14 INFO StatsReportListener: task runtime:(count: 1, mean: 156.000000, stdev: 0.000000, max: 156.000000, min: 156.000000)
16/11/05 19:10:14 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:14 INFO StatsReportListener: 156.0 ms  156.0 ms  156.0 ms  156.0 ms  156.0 ms  156.0 ms  156.0 ms  156.0 ms  156.0 ms
16/11/05 19:10:14 INFO StatsReportListener: task result size:(count: 1, mean: 916.000000, stdev: 0.000000, max: 916.000000, min: 916.000000)
16/11/05 19:10:14 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:14 INFO StatsReportListener: 916.0 B  916.0 B  916.0 B  916.0 B  916.0 B  916.0 B  916.0 B  916.0 B  916.0 B
16/11/05 19:10:14 INFO TaskSetManager: Finished task 0.0 in stage 1.0 (TID 1) in 156 ms on hadoop3 (1/1)
16/11/05 19:10:14 INFO TaskSchedulerImpl: Removed TaskSet 1.0, whose tasks have all completed, from pool 
16/11/05 19:10:14 INFO StatsReportListener: executor (non-fetch) time pct: (count: 1, mean: 6.410256, stdev: 0.000000, max: 6.410256, min: 6.410256)
16/11/05 19:10:14 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:14 INFO StatsReportListener: 6 %  6 %  6 %  6 %  6 %  6 %  6 %  6 %  6 %
16/11/05 19:10:14 INFO StatsReportListener: other time pct: (count: 1, mean: 93.589744, stdev: 0.000000, max: 93.589744, min: 93.589744)
16/11/05 19:10:14 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:14 INFO StatsReportListener: 94 %  94 %  94 %  94 %  94 %  94 %  94 %  94 %  94 %


         > show tables;

16/11/05 19:10:33 INFO HiveMetaStore: 0: get_tables: db=default pat=.*
16/11/05 19:10:33 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_tables: db=default pat=.*
16/11/05 19:10:34 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/11/05 19:10:34 INFO DAGScheduler: Got job 2 (processCmd at CliDriver.java:376) with 1 output partitions
16/11/05 19:10:34 INFO DAGScheduler: Final stage: ResultStage 2 (processCmd at CliDriver.java:376)
16/11/05 19:10:34 INFO DAGScheduler: Parents of final stage: List()
16/11/05 19:10:34 INFO DAGScheduler: Missing parents: List()
16/11/05 19:10:34 INFO DAGScheduler: Submitting ResultStage 2 (MapPartitionsRDD[5] at processCmd at CliDriver.java:376), which has no missing parents
16/11/05 19:10:34 INFO MemoryStore: Block broadcast_2 stored as values in memory (estimated size 1968.0 B, free 8.2 KB)
16/11/05 19:10:34 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in memory (estimated size 1217.0 B, free 9.3 KB)
16/11/05 19:10:34 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 192.168.215.133:42456 (size: 1217.0 B, free: 517.4 MB)
16/11/05 19:10:34 INFO SparkContext: Created broadcast 2 from broadcast at DAGScheduler.scala:1006
16/11/05 19:10:34 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 2 (MapPartitionsRDD[5] at processCmd at CliDriver.java:376)
16/11/05 19:10:34 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
16/11/05 19:10:34 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, hadoop3, partition 0,PROCESS_LOCAL, 2200 bytes)
16/11/05 19:10:34 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on hadoop3:40617 (size: 1217.0 B, free: 517.4 MB)
16/11/05 19:10:34 INFO DAGScheduler: ResultStage 2 (processCmd at CliDriver.java:376) finished in 0.151 s
16/11/05 19:10:34 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@...
16/11/05 19:10:34 INFO DAGScheduler: Job 2 finished: processCmd at CliDriver.java:376, took 0.202041 s
person false
Time taken: 0.591 seconds, Fetched 1 row(s)

16/11/05 19:10:34 INFO CliDriver: Time taken: 0.591 seconds, Fetched 1 row(s)
spark-sql> 16/11/05 19:10:34 INFO StatsReportListener: task runtime:(count: 1, mean: 154.000000, stdev: 0.000000, max: 154.000000, min: 154.000000)
16/11/05 19:10:34 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:34 INFO StatsReportListener: 154.0 ms  154.0 ms  154.0 ms  154.0 ms  154.0 ms  154.0 ms  154.0 ms  154.0 ms  154.0 ms
16/11/05 19:10:34 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) in 154 ms on hadoop3 (1/1)
16/11/05 19:10:34 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
16/11/05 19:10:34 INFO StatsReportListener: task result size:(count: 1, mean: 1054.000000, stdev: 0.000000, max: 1054.000000, min: 1054.000000)
16/11/05 19:10:34 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:34 INFO StatsReportListener: 1054.0 B  1054.0 B  1054.0 B  1054.0 B  1054.0 B  1054.0 B  1054.0 B  1054.0 B  1054.0 B
16/11/05 19:10:34 INFO StatsReportListener: executor (non-fetch) time pct: (count: 1, mean: 6.493506, stdev: 0.000000, max: 6.493506, min: 6.493506)
16/11/05 19:10:34 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:34 INFO StatsReportListener: 6 %  6 %  6 %  6 %  6 %  6 %  6 %  6 %  6 %
16/11/05 19:10:34 INFO StatsReportListener: other time pct: (count: 1, mean: 93.506494, stdev: 0.000000, max: 93.506494, min: 93.506494)
16/11/05 19:10:34 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:10:34 INFO StatsReportListener: 94 %  94 %  94 %  94 %  94 %  94 %  94 %  94 %  94 %
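
In the "show tables" output above, the "false" printed next to "person" is the isTemporary column: show tables in Spark SQL 1.6 reports each table name together with whether it is a session-local temporary table. A quick way to confirm this from code, again assuming the hiveContext from the earlier sketch:

    val tables = hiveContext.sql("show tables")
    tables.printSchema()              // tableName: string, isTemporary: boolean
    tables.collect().foreach(println) // [person,false]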


         > select * from person;

16/11/05 19:10:52 INFO ParseDriver: Parsing command: select * from person
16/11/05 19:10:52 INFO ParseDriver: Parse Completed
16/11/05 19:10:52 INFO HiveMetaStore: 0: get_table : db=default tbl=person
16/11/05 19:10:52 INFO audit: ugi=root ip=unknown-ip-addr cmd=get_table : db=default tbl=person
16/11/05 19:10:55 INFO BlockManagerInfo: Removed broadcast_2_piece0 on hadoop3:40617 in memory (size: 1217.0 B, free: 517.4 MB)
16/11/05 19:10:55 INFO BlockManagerInfo: Removed broadcast_2_piece0 on 192.168.215.133:42456 in memory (size: 1217.0 B, free: 517.4 MB)
16/11/05 19:10:55 INFO ContextCleaner: Cleaned accumulator 3
16/11/05 19:10:55 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 192.168.215.133:42456 in memory (size: 1219.0 B, free: 517.4 MB)
16/11/05 19:10:55 INFO BlockManagerInfo: Removed broadcast_1_piece0 on hadoop3:40617 in memory (size: 1219.0 B, free: 517.4 MB)
16/11/05 19:10:55 INFO ContextCleaner: Cleaned accumulator 2
16/11/05 19:10:55 INFO BlockManagerInfo: Removed broadcast_0_piece0 on 192.168.215.133:42456 in memory (size: 1224.0 B, free: 517.4 MB)
16/11/05 19:10:55 INFO BlockManagerInfo: Removed broadcast_0_piece0 on hadoop3:40617 in memory (size: 1224.0 B, free: 517.4 MB)
16/11/05 19:10:55 INFO ContextCleaner: Cleaned accumulator 1
16/11/05 19:10:56 INFO deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/11/05 19:10:56 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 466.5 KB, free 466.5 KB)
16/11/05 19:10:56 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 41.5 KB, free 508.0 KB)
16/11/05 19:10:56 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on 192.168.215.133:42456 (size: 41.5 KB, free: 517.4 MB)
16/11/05 19:10:56 INFO SparkContext: Created broadcast 3 from processCmd at CliDriver.java:376
16/11/05 19:10:57 INFO FileInputFormat: Total input paths to process : 1
16/11/05 19:10:57 INFO SparkContext: Starting job: processCmd at CliDriver.java:376
16/11/05 19:10:57 INFO DAGScheduler: Got job 3 (processCmd at CliDriver.java:376) with 2 output partitions
16/11/05 19:10:57 INFO DAGScheduler: Final stage: ResultStage 3 (processCmd at CliDriver.java:376)
16/11/05 19:10:57 INFO DAGScheduler: Parents of final stage: List()
16/11/05 19:10:57 INFO DAGScheduler: Missing parents: List()
16/11/05 19:10:57 INFO DAGScheduler: Submitting ResultStage 3 (MapPartitionsRDD[9] at processCmd at CliDriver.java:376), which has no missing parents
16/11/05 19:10:58 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 6.4 KB, free 514.4 KB)
16/11/05 19:10:58 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 3.5 KB, free 518.0 KB)
16/11/05 19:10:58 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on 192.168.215.133:42456 (size: 3.5 KB, free: 517.4 MB)
16/11/05 19:10:58 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1006
16/11/05 19:10:58 INFO DAGScheduler: Submitting 2 missing tasks from ResultStage 3 (MapPartitionsRDD[9] at processCmd at CliDriver.java:376)
16/11/05 19:10:58 INFO TaskSchedulerImpl: Adding task set 3.0 with 2 tasks
16/11/05 19:10:58 INFO TaskSetManager: Starting task 0.0 in stage 3.0 (TID 3, hadoop3, partition 0,NODE_LOCAL, 2161 bytes)
16/11/05 19:10:58 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on hadoop3:40617 (size: 3.5 KB, free: 517.4 MB)
16/11/05 19:11:01 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on hadoop3:40617 (size: 41.5 KB, free: 517.4 MB)
16/11/05 19:11:07 INFO TaskSetManager: Starting task 1.0 in stage 3.0 (TID 4, hadoop3, partition 1,NODE_LOCAL, 2161 bytes)
16/11/05 19:11:07 INFO TaskSetManager: Finished task 0.0 in stage 3.0 (TID 3) in 9102 ms on hadoop3 (1/2)
16/11/05 19:11:07 INFO DAGScheduler: ResultStage 3 (processCmd at CliDriver.java:376) finished in 9.180 s
16/11/05 19:11:07 INFO StatsReportListener: Finished stage: org.apache.spark.scheduler.StageInfo@...
16/11/05 19:11:07 INFO DAGScheduler: Job 3 finished: processCmd at CliDriver.java:376, took 9.399122 s
16/11/05 19:11:07 INFO StatsReportListener: task runtime:(count: 2, mean: 4620.000000, stdev: 4482.000000, max: 9102.000000, min: 138.000000)
16/11/05 19:11:07 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:11:07 INFO StatsReportListener: 138.0 ms  138.0 ms  138.0 ms  138.0 ms  9.1 s  9.1 s  9.1 s  9.1 s  9.1 s
16/11/05 19:11:07 INFO TaskSetManager: Finished task 1.0 in stage 3.0 (TID 4) in 138 ms on hadoop3 (2/2)
16/11/05 19:11:07 INFO TaskSchedulerImpl: Removed TaskSet 3.0, whose tasks have all completed, from pool 
16/11/05 19:11:07 INFO StatsReportListener: task result size:(count: 2, mean: 2276.500000, stdev: 4.500000, max: 2281.000000, min: 2272.000000)
16/11/05 19:11:07 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:11:07 INFO StatsReportListener: 2.2 KB  2.2 KB  2.2 KB  2.2 KB  2.2 KB  2.2 KB  2.2 KB  2.2 KB  2.2 KB
16/11/05 19:11:07 INFO StatsReportListener: executor (non-fetch) time pct: (count: 2, mean: 39.008866, stdev: 25.240750, max: 64.249615, min: 13.768116)
16/11/05 19:11:07 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:11:07 INFO StatsReportListener: 14 %  14 %  14 %  14 %  64 %  64 %  64 %  64 %  64 %
16/11/05 19:11:07 INFO StatsReportListener: other time pct: (count: 2, mean: 60.991134, stdev: 25.240750, max: 86.231884, min: 35.750385)
16/11/05 19:11:07 INFO StatsReportListener: 0%  5%  10%  25%  50%  75%  90%  95%  100%
16/11/05 19:11:07 INFO StatsReportListener: 36 %  36 %  36 %  36 %  86 %  86 %  86 %  86 %  86 %
1   xiaozhang  23
2   xiaowang   24
3   xiaoli     25
4   xiaoxiao   26
5   xiaoxiao   27
6   xiaolizi   39
7   xiaodaye   10
8   dageda     12
9   daye       24
10  dada       25

Time taken: 14.784 seconds, Fetched 10 row(s)
16/11/05 19:11:07 INFO CliDriver: Time taken: 14.784 seconds, Fetched 10 row(s)
spark-sql>
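
The ten rows close the loop: table metadata comes from the MySQL-backed metastore, and the data itself is read from the warehouse through FileInputFormat in two NODE_LOCAL tasks on hadoop3. The log never shows how the person table was created; a plausible DDL, inferred purely from the three output columns (an int id, a string name, an int age) and therefore only an assumption, would look like this:

    // Hypothetical schema for person; the real table may differ.
    hiveContext.sql(
      """CREATE TABLE IF NOT EXISTS person (
        |  id   INT,
        |  name STRING,
        |  age  INT
        |) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'""".stripMargin)

    hiveContext.sql("select * from person").show()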
