This is day 4 of my participation in the August Update Challenge; see the event page "8月更文挑战" for details.
NO.3 CORRECT TEXT
Problem Scenario 84: In continuation of the previous question (NO.2), please accomplish the following activities.
-
Select all products whose product code is null
-
Select all products whose name starts with Pen, ordering the results by price in descending order
-
Select all products whose name starts with Pen, ordering the results by price in descending order and quantity in ascending order
-
Select the top 2 products by price
Answer:
See the explanation for the step-by-step solution and configuration.
Explanation:
Solution:
Step 1: Select all products whose product code is null
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val results = sqlContext.sql("SELECT * FROM products WHERE code IS NULL")
results.show()
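Note that `WHERE code = NULL` would never match any row, because comparisons with NULL evaluate to unknown in SQL's three-valued logic; `IS NULL` is required. A minimal plain-Python sketch of the same filter, using hypothetical in-memory rows (not the exam's actual products table) with `None` standing in for SQL NULL:

```python
# Hypothetical rows standing in for the products table.
products = [
    {"name": "Pen Red", "code": "P1"},
    {"name": "Notebook", "code": None},
]

# Equivalent of WHERE code IS NULL: keep only rows whose code is absent.
no_code = [p for p in products if p["code"] is None]
```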
Step 2: Select all products whose name starts with Pen, ordering the results by price in descending order
val results = sqlContext.sql("SELECT * FROM products WHERE name LIKE 'Pen%' ORDER BY price DESC")
results.show()
Step 3: Select all products whose name starts with Pen, ordering the results by price in descending order and quantity in ascending order
val results = sqlContext.sql("SELECT * FROM products WHERE name LIKE 'Pen%' ORDER BY price DESC, quantity ASC")
results.show()
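A common pitfall here: in SQL, `DESC` binds only to the column it immediately follows, so "price descending, then quantity ascending" must be written `ORDER BY price DESC, quantity ASC` (writing `ORDER BY quantity, price DESC` would sort by quantity ascending first). A plain-Python sketch of that compound ordering, using hypothetical data:

```python
# Hypothetical pen products; sort by price descending, then quantity ascending,
# mirroring ORDER BY price DESC, quantity ASC.
pens = [
    {"name": "Pen Red",  "price": 3.0, "quantity": 5},
    {"name": "Pen Blue", "price": 3.0, "quantity": 2},
    {"name": "Pen Ink",  "price": 1.0, "quantity": 9},
]
ordered = sorted(pens, key=lambda p: (-p["price"], p["quantity"]))
```

Negating the price in the sort key gives descending order on that field while quantity stays ascending.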
Step 4: Select the top 2 products by price
val results = sqlContext.sql("SELECT * FROM products ORDER BY price DESC LIMIT 2")
results.show()
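`ORDER BY price DESC LIMIT 2` is simply "sort descending, keep the first two rows". The same idea in plain Python, on hypothetical data:

```python
# Hypothetical products; take the two most expensive,
# mirroring ORDER BY price DESC LIMIT 2.
products = [
    {"name": "Pen Red",  "price": 3.0},
    {"name": "Eraser",   "price": 0.5},
    {"name": "Notebook", "price": 5.0},
]
top2 = sorted(products, key=lambda p: p["price"], reverse=True)[:2]
```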
NO.4 CORRECT TEXT
Problem Scenario 4: You have been given a MySQL DB with the following details.
user=retail_dba
password=cloudera
database=retail_db
table=retail_db.categories
jdbc URL = jdbc:mysql://quickstart:3306/retail_db
Please accomplish the following activities.
Import the single table categories (subset data) into a Hive managed table, where category_id is between 1 and 22
Note: 1. My database username and password differ from the exam environment. 2. In the exam the table already exists; here I first create some test data:
create table categories (id int, name varchar(12));
insert into categories values (1,'yee'),(2,'xun'),(3,'zhang'),(4,'ya'),(5,'guang'),(6,'zyg'),(7,'ya'),(8,'guang'),(9,'zyg'),(10,'ya'),(11,'guang'),(12,'zyg'),(13,'ya'),(14,'guang'),(15,'zyg'),(16,'ya'),(17,'guang'),(18,'zyg'),(19,'ya'),(20,'guang'),(21,'zyg'),(22,'ya'),(23,'guang'),(24,'zyg'),(25,'ya'),(26,'guang'),(27,'zyg'),(28,'ya'),(29,'guang'),(30,'zyg');
Answer:
See the explanation for the step-by-step solution and configuration.
Explanation:
Solution:
Step 1: Import the single table categories (subset data)
sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba \
--password cloudera --table categories --where "category_id between 1 and 22" --hive-import -m 1
The sqoop command in my test environment: sqoop import --connect jdbc:mysql://manager:3306/retail_db --username root --password yeexun123 --table categories --where "id between 1 and 22" --hive-import -m 1. Part of the execution output follows:
Note: This command creates a managed table, and its content is written to the following directory.
/user/hive/warehouse/categories
The directory here comes from my HDFS configuration and differs from the exam environment's directory.
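One detail worth remembering for the `--where` clause: SQL's BETWEEN is inclusive on both ends, so `category_id between 1 and 22` imports ids 1 through 22 — 22 of the 30 test rows. A plain-Python sketch of that predicate:

```python
# BETWEEN 1 AND 22 is inclusive on both ends.
ids = list(range(1, 31))  # mirrors the 30 test rows inserted above
imported = [i for i in ids if 1 <= i <= 22]
```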
Step 2: Check whether the table was created (in Hive)
show tables;
select * from categories;