Using the jq Command

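A recent project of mine uses the jq command to process JSON data. This article covers what I needed there: working with arrays and maps, and merging data. For the basics, see the official website. jq runs on Windows, Mac, Linux, and other platforms; pay particular attention to the differences between Windows and the others, which mostly concern string concatenation. This article uses Linux throughout, and the results are shown on the jqplay playground.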

1. Installing jq

Official website

Official manual

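2. Supported types

Data structures: Object --> '{}', Array --> '[]'

Scalar types: string, number, boolean (true / false), null. Together with Object and Array, that makes 6 types in total.

Object: a pair of curly braces {}; the whole thing is one object, and every member inside must be a key:value pair.

Array: a pair of square brackets []; the whole thing is one array. Each element can be any of the six types above, including a nested [] or {}.

Characteristics: data is stored as key:value pairs, and each value carries a data type that tells it apart.

Note: complex data structures are built by nesting {} and [].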
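
To make the six types concrete, here is a minimal Go sketch (standard library only, not jq itself) that decodes a few JSON documents and reports which type each one is. The helper names kindOf and decodeKind are made up for this illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// kindOf reports which of the six JSON types a decoded value belongs to.
// encoding/json uses exactly these Go types when decoding into interface{}.
func kindOf(v interface{}) string {
	switch v.(type) {
	case map[string]interface{}:
		return "object"
	case []interface{}:
		return "array"
	case string:
		return "string"
	case float64:
		return "number"
	case bool:
		return "boolean"
	case nil:
		return "null"
	default:
		return "unknown"
	}
}

// decodeKind (hypothetical helper) parses one JSON document and returns
// the name of its top-level type.
func decodeKind(doc string) (string, error) {
	var v interface{}
	if err := json.Unmarshal([]byte(doc), &v); err != nil {
		return "", err
	}
	return kindOf(v), nil
}

func main() {
	docs := []string{`{"cpu": 2}`, `["sg-ho7g150o"]`, `"RUNNING"`, `50`, `false`, `null`}
	for _, doc := range docs {
		kind, err := decodeKind(doc)
		if err != nil {
			fmt.Println("invalid JSON:", err)
			continue
		}
		fmt.Printf("%-18s -> %s\n", doc, kind)
	}
}
```

Note that Go decodes every JSON number into float64 by default, so an integer such as 50 and a decimal are not distinguished at this level.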

This article uses the following data as its test data source.

{
  "version": 4,
  "terraform_version": "1.2.4",
  "serial": 0,
  "lineage": "4003a2ff-2140-30ba-cb97-0dfbdabb4005",
  "outputs": {
    "output_mjy_test_state_cvm-ap-nanjing": {
      "value": "success",
      "type": "string"
    }
  },
  "resources": [
    {
      "mode": "data",
      "type": "tencentcloud_instances_set",
      "name": "dtci_mjy_test_state_cvm-ap-nanjing",
      "provider": "provider[\"registry.terraform.io/tencentcloudstack/tencentcloud\"]",
      "instances": [
        {
          "schema_version": 0,
          "attributes": {
            "availability_zone": null,
            "id": "3806323117",
            "instance_id": null,
            "instance_list": [
              {
                "allocate_public_ip": false,
                "availability_zone": "ap-nanjing-1",
                "cam_role_name": "",
                "cpu": 2,
                "create_time": "2022-07-13T08:31:43Z",
                "data_disks": [],
                "expired_time": "",
                "image_id": "img-8inj6zre",
                "instance_charge_type": "POSTPAID_BY_HOUR",
                "instance_charge_type_prepaid_renew_flag": "",
                "instance_id": "ins-3vvr6nzg",
                "instance_name": "chicken-test-cvm-chicken-l5jcd25l39j2bzykocn-1",
                "instance_type": "SA2.MEDIUM2",
                "internet_charge_type": "TRAFFIC_POSTPAID_BY_HOUR",
                "internet_max_bandwidth_out": 100,
                "memory": 2,
                "private_ip": "10.1.2.96",
                "project_id": 0,
                "public_ip": "119.45.9.173",
                "security_groups": [
                  "sg-ho7g150o"
                ],
                "status": "RUNNING",
                "subnet_id": "subnet-n02ejqv8",
                "system_disk_id": "disk-d8d8s3nw",
                "system_disk_size": 50,
                "system_disk_type": "CLOUD_PREMIUM",
                "tags": {},
                "vpc_id": "vpc-q6attagl"
              },
              {
                "allocate_public_ip": false,
                "availability_zone": "ap-nanjing-2",
                "cam_role_name": "",
                "cpu": 2,
                "create_time": "2022-07-13T08:31:44Z",
                "data_disks": [],
                "expired_time": "",
                "image_id": "img-8inj6zre",
                "instance_charge_type": "POSTPAID_BY_HOUR",
                "instance_charge_type_prepaid_renew_flag": "",
                "instance_id": "ins-6xwvisg4",
                "instance_name": "team-test-cvm-team-l5jcdfpkw1drnff2pb-12",
                "instance_type": "SA2.MEDIUM2",
                "internet_charge_type": "TRAFFIC_POSTPAID_BY_HOUR",
                "internet_max_bandwidth_out": 100,
                "memory": 2,
                "private_ip": "10.1.34.130",
                "project_id": 0,
                "public_ip": "118.195.206.186",
                "security_groups": [
                  "sg-ho7g150o"
                ],
                "status": "RUNNING",
                "subnet_id": "subnet-bobaaste",
                "system_disk_id": "disk-qik07loq",
                "system_disk_size": 50,
                "system_disk_type": "CLOUD_PREMIUM",
                "tags": {},
                "vpc_id": "vpc-q6attagl"
              },
              {
                "allocate_public_ip": false,
                "availability_zone": "ap-nanjing-2",
                "cam_role_name": "",
                "cpu": 2,
                "create_time": "2022-07-13T08:31:42Z",
                "data_disks": [],
                "expired_time": "",
                "image_id": "img-8inj6zre",
                "instance_charge_type": "POSTPAID_BY_HOUR",
                "instance_charge_type_prepaid_renew_flag": "",
                "instance_id": "ins-byxdidsc",
                "instance_name": "team-test-cvm-team-l5jcdfpkw1drnff2pb-11",
                "instance_type": "SA2.MEDIUM2",
                "internet_charge_type": "TRAFFIC_POSTPAID_BY_HOUR",
                "internet_max_bandwidth_out": 100,
                "memory": 2,
                "private_ip": "10.1.34.63",
                "project_id": 0,
                "public_ip": "118.195.225.180",
                "security_groups": [
                  "sg-ho7g150o"
                ],
                "status": "RUNNING",
                "subnet_id": "subnet-bobaaste",
                "system_disk_id": "disk-djoe22iy",
                "system_disk_size": 50,
                "system_disk_type": "CLOUD_PREMIUM",
                "tags": {},
                "vpc_id": "vpc-q6attagl"
              }
            ],
            "instance_name": null,
            "project_id": null,
            "result_output_file": null,
            "subnet_id": null,
            "tags": null,
            "vpc_id": "vpc-q6attagl"
          },
          "sensitive_attributes": []
        }
      ]
    }
  ]
}

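3. Basic usage

key: must be a string, wrapped in double quotes. value: any scalar type or data structure. Detail: do not add a comma after the last element; a trailing comma is a parse error.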
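
The trailing-comma and quoted-key rules can be checked quickly. This is a small Go sketch using json.Valid from the standard library; isValidJSON is a hypothetical wrapper introduced only for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// isValidJSON (hypothetical helper) reports whether doc is syntactically valid JSON.
func isValidJSON(doc string) bool {
	return json.Valid([]byte(doc))
}

func main() {
	good := `{"value": "success", "type": "string"}`
	trailing := `{"value": "success", "type": "string",}` // comma after the last pair
	bareKey := `{value: "success"}`                       // key without double quotes

	fmt.Println(isValidJSON(good))     // prints true
	fmt.Println(isValidJSON(trailing)) // prints false
	fmt.Println(isValidJSON(bareKey))  // prints false
}
```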

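# Output all the data
jq '.'

4. Advanced usage

  • Core idea: nesting of data structures. JSON returned by a server is often not flat; it contains several levels of nesting.
  • Emphasis: the JSON structure generally stays consistent.
  • Intuition: it is similar to a C struct, which you are free to compose as needed.

4.1 Get the first element of an array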

jq '.resources[0].instances[0]'

4.2 Get the second element of an array

jq '.resources[0].instances[0].attributes.instance_list[1]'

4.3 Get all the array elements

jq '.resources[0].instances[0].attributes.instance_list'
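
For comparison, the same path can be walked in Go with typed structs. This is only a sketch under assumptions: the State and Instance types and the instanceList helper are invented for this example, they mirror just the fields on the path, and the sample constant is a trimmed copy of the article's data.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Instance keeps two of the fields found in each instance_list element.
type Instance struct {
	InstanceID string `json:"instance_id"`
	PrivateIP  string `json:"private_ip"`
}

// State mirrors only the part of the document needed for the path
// .resources[0].instances[0].attributes.instance_list; other fields are ignored.
type State struct {
	Resources []struct {
		Instances []struct {
			Attributes struct {
				InstanceList []Instance `json:"instance_list"`
			} `json:"attributes"`
		} `json:"instances"`
	} `json:"resources"`
}

// instanceList (hypothetical helper) returns the list at the path above,
// or nil when one of the arrays on the way is empty.
func instanceList(doc []byte) ([]Instance, error) {
	var s State
	if err := json.Unmarshal(doc, &s); err != nil {
		return nil, err
	}
	if len(s.Resources) == 0 || len(s.Resources[0].Instances) == 0 {
		return nil, nil
	}
	return s.Resources[0].Instances[0].Attributes.InstanceList, nil
}

// sample is a trimmed copy of the article's data: two instances, two fields each.
const sample = `{"resources":[{"instances":[{"attributes":{"instance_list":[
  {"instance_id":"ins-3vvr6nzg","private_ip":"10.1.2.96"},
  {"instance_id":"ins-6xwvisg4","private_ip":"10.1.34.130"}]}}]}]}`

func main() {
	list, err := instanceList([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(len(list)) // every element, as in 4.3
	if len(list) > 1 {
		fmt.Println(list[1].InstanceID) // the second element, as in 4.2
	}
}
```

The jq path is a single expression, while the Go version needs the types declared up front; that trade-off is one reason to keep such extraction in jq.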

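For other usage, refer to the official manual.

5. Filters used in this project

5.1 Select private_ip, public_ip, instance_name, instance_id, and availability_zone for every instance, sorted by creation time, and output an array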

jq '.resources[0].instances[0].attributes.instance_list | [sort_by(.create_time) |.[] |{private_ip:.private_ip,public_ip:.public_ip,instance_name:.instance_name, instance_id:.instance_id,availability_zone:.availability_zone}]'

5.2 Filter by availability zone, select private_ip, public_ip, instance_name, instance_id, and availability_zone, and output an array

jq '.resources[0].instances[0].attributes.instance_list | [.[] | select(.availability_zone | match("ap-nanjing-1")) |{private_ip:.private_ip,public_ip:.public_ip,instance_name:.instance_name, instance_id:.instance_id,availability_zone:.availability_zone}]'

5.3 Sort by creation time, filter by availability zone, select private_ip, public_ip, instance_name, instance_id, and availability_zone, and output an array

jq '.resources[0].instances[0].attributes.instance_list | [sort_by(.create_time) |.[] | select(.availability_zone | match("ap-nanjing-1")) |{private_ip:.private_ip,public_ip:.public_ip,instance_name:.instance_name, instance_id:.instance_id,availability_zone:.availability_zone}]'

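5.4 Sort by creation time, filter by availability zone and by whether a public IP may be allocated, select private_ip, public_ip, instance_name, instance_id, and availability_zone, and output an array

For fuzzy matching on strings you can use match, which takes a regular expression. For types such as int and bool, use an exact comparison with ==. Combining the two lets you filter out exactly the data you need.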

jq '.resources[0].instances[0].attributes.instance_list | [sort_by(.create_time) |.[] | select(.availability_zone | match("ap-nanjing-1"))|select(.allocate_public_ip ==false) |{private_ip:.private_ip,public_ip:.public_ip,instance_name:.instance_name, instance_id:.instance_id,availability_zone:.availability_zone}]'
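
The sort, filter, and select steps of this pipeline can be sketched in plain Go as follows. Several things here are assumptions for illustration: the Instance type and filterInstances function are invented names, strings.Contains stands in for jq's regex match (equivalent only for a literal pattern such as "ap-nanjing-1"), and sorting relies on the timestamps all being UTC strings in the same format, so string order equals time order.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Instance holds the fields the pipeline sorts and filters on.
type Instance struct {
	InstanceID       string
	AvailabilityZone string
	CreateTime       string
	AllocatePublicIP bool
}

// filterInstances (hypothetical) sorts a copy of list by CreateTime, then keeps
// entries whose zone contains zone and whose AllocatePublicIP equals allocate.
func filterInstances(list []Instance, zone string, allocate bool) []Instance {
	sorted := append([]Instance(nil), list...) // leave the caller's slice untouched
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].CreateTime < sorted[j].CreateTime
	})
	var out []Instance
	for _, in := range sorted {
		if strings.Contains(in.AvailabilityZone, zone) && in.AllocatePublicIP == allocate {
			out = append(out, in)
		}
	}
	return out
}

// sampleInstances copies the relevant values from the article's sample data.
func sampleInstances() []Instance {
	return []Instance{
		{InstanceID: "ins-3vvr6nzg", AvailabilityZone: "ap-nanjing-1", CreateTime: "2022-07-13T08:31:43Z"},
		{InstanceID: "ins-6xwvisg4", AvailabilityZone: "ap-nanjing-2", CreateTime: "2022-07-13T08:31:44Z"},
		{InstanceID: "ins-byxdidsc", AvailabilityZone: "ap-nanjing-2", CreateTime: "2022-07-13T08:31:42Z"},
	}
}

func main() {
	for _, in := range filterInstances(sampleInstances(), "ap-nanjing-2", false) {
		fmt.Println(in.InstanceID, in.CreateTime)
	}
}
```

With the sample data, the ap-nanjing-2 query yields ins-byxdidsc before ins-6xwvisg4, because the former was created two seconds earlier.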

5.5 Get the availability zones as an array

jq '.resources[0].instances[0].attributes.instance_list | [sort_by(.create_time) |.[] |{availability_zone:.availability_zone}]'

The array above contains duplicates. When duplicates are not allowed, the result can be output as a map instead:

jq '.resources[0].instances[0].attributes.instance_list|map( { (.availability_zone|tostring): {} } ) | add'

5.6 Output data in a specific format

jq '.resources[0].instances[0].attributes.instance_list |map( { (.private_ip|tostring): {private_ip:.private_ip,public_ip:.public_ip,instance_name:.instance_name,instance_id:.instance_id}}) |add'
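
The result of this filter is one object keyed by private_ip, with the selected fields as each value. The same shape can be sketched in Go as a map; the Instance type and indexByPrivateIP function below are illustrative names, not part of jq or gojq.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Instance carries the four fields selected by the filter.
type Instance struct {
	PrivateIP    string `json:"private_ip"`
	PublicIP     string `json:"public_ip"`
	InstanceName string `json:"instance_name"`
	InstanceID   string `json:"instance_id"`
}

// indexByPrivateIP (hypothetical) builds a lookup table keyed by private IP.
// If two instances shared an IP, the later one would win, as with jq's add.
func indexByPrivateIP(list []Instance) map[string]Instance {
	index := make(map[string]Instance, len(list))
	for _, in := range list {
		index[in.PrivateIP] = in
	}
	return index
}

func main() {
	// Two entries copied from the article's sample data.
	list := []Instance{
		{PrivateIP: "10.1.2.96", PublicIP: "119.45.9.173",
			InstanceName: "chicken-test-cvm-chicken-l5jcd25l39j2bzykocn-1", InstanceID: "ins-3vvr6nzg"},
		{PrivateIP: "10.1.34.130", PublicIP: "118.195.206.186",
			InstanceName: "team-test-cvm-team-l5jcdfpkw1drnff2pb-12", InstanceID: "ins-6xwvisg4"},
	}
	body, err := json.MarshalIndent(indexByPrivateIP(list), "", "  ")
	if err != nil {
		fmt.Println("encode error:", err)
		return
	}
	fmt.Println(string(body))
}
```

Keying by private_ip makes later lookups by address a direct map access instead of a scan over the array.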

6. Processing the data in Go

The open-source gojq library is used for parsing. Official website

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"os"

	"github.com/itchyny/gojq"
)

func main() {
	readdata()
}

// readJson reads the JSON file at filepath and decodes it into a generic map,
// the input shape that gojq accepts for a JSON object.
func readJson(filepath string) (map[string]interface{}, error) {
	byteValue, err := os.ReadFile(filepath)
	if err != nil {
		return nil, err
	}
	input := map[string]interface{}{}
	if err := json.Unmarshal(byteValue, &input); err != nil {
		return nil, err
	}
	return input, nil
}

// readdata runs the filter from section 5.6 and prints each result as JSON.
func readdata() {
	cmd := ".resources[0].instances[0].attributes.instance_list |map( { (.private_ip|tostring): {private_ip:.private_ip,public_ip:.public_ip,instance_name:.instance_name,instance_id:.instance_id}}) |add"
	query, err := gojq.Parse(cmd)
	if err != nil {
		log.Fatalln(err)
	}
	path := "path/to/file" // replace with the path to the JSON file
	input, err := readJson(path)
	if err != nil {
		log.Fatalln(err)
	}
	iter := query.Run(input) // or query.RunWithContext
	for {
		v, ok := iter.Next()
		if !ok {
			break
		}
		// gojq reports runtime errors as values yielded by the iterator.
		if err, ok := v.(error); ok {
			fmt.Println(err)
			continue
		}
		body, err := json.Marshal(v)
		if err != nil {
			log.Fatalln(err)
		}
		fmt.Println(string(body))
		fmt.Println("==============================")
	}
}

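7. Summary

  • Using jq to process the existing JSON data keeps the code concise and makes later maintenance easier.

  • With well-chosen filters, data processing can be fast as well.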