Tags: #jq #yq #shell #bash #json
💡 Documentation with interactive search for functions (very useful)
Here’s some basic usage examples…
$ cat /tmp/example.json | jq
{
"foo": "bar",
"errors": [
"beep"
]
}
{
"foo": null,
"errors": [
"boop"
]
}
{
"foo": {
"a": 1,
"b": 2
},
"errors": [
"beep"
]
}
{
"foo": {
"a": 3,
"b": 4
},
"errors": [
"beep",
"boop"
]
}
$ jq < /tmp/example.json
{
"foo": "bar",
"errors": [
"beep"
]
}
{
"foo": null,
"errors": [
"boop"
]
}
{
"foo": {
"a": 1,
"b": 2
},
"errors": [
"beep"
]
}
{
"foo": {
"a": 3,
"b": 4
},
"errors": [
"beep",
"boop"
]
}
$ jq .foo < /tmp/example.json
"bar"
null
{
"a": 1,
"b": 2
}
{
"a": 3,
"b": 4
}
$ jq keys < /tmp/example.json
[
"errors",
"foo"
]
[
"errors",
"foo"
]
[
"errors",
"foo"
]
[
"errors",
"foo"
]
$ jq '{renamed: .foo}' < /tmp/example.json
{
"renamed": "bar"
}
{
"renamed": null
}
{
"renamed": {
"a": 1,
"b": 2
}
}
{
"renamed": {
"a": 3,
"b": 4
}
}
$ jq 'select(.foo != null) | {renamed: .foo}' < /tmp/example.json
{
"renamed": "bar"
}
{
"renamed": {
"a": 1,
"b": 2
}
}
{
"renamed": {
"a": 3,
"b": 4
}
}
$ jq 'select(.foo != null and (.foo | type == "object") and .foo.a > 1) | {renamed: .foo}' < /tmp/example.json
{
"renamed": {
"a": 3,
"b": 4
}
}
$ jq 'select(.foo != null and (.foo | type == "object") and .foo.a > 1 and (.errors | length > 1 and any(.[]; contains("boop")))) | {renamed: .foo, errs: .errors}' < /tmp/example.json
{
"renamed": {
"a": 3,
"b": 4
},
"errs": [
"beep",
"boop"
]
}
When opened in Neovim:
:%!jq
:%!jq -c
:'<,'>!jq
:%!jq -c 'select(.errors | length > 1)'
More advanced jq
examples…
yq
examples…
With the following nd-json stream:
{
"result": {
"agent": "foo",
"count": "80",
}
}
{
"result": {
"agent": "bar",
"count": "123",
}
}
You can produce the following output:
"client: foo, count: 80"
"client: bar, count: 123"
By executing the following command:
cat data.json | jq '.result | "client: \(.agent), count: \(.count)"'
The .result
accesses the relevant field on each object in the stream.
We use |
to pipe the data to the next ‘script’ which defines the output string we want.
The \()
is interpolation syntax (you pass in the field, e.g. .count
). Interpolation must be used inside a string.
NOTE: Make sure the interpolation syntax is quoted! See below JSON example that would break otherwise…
# i.e. JSON strings need to be quoted hence "\(<FIELD>)"
# GOOD
fastly kv-store list --json | jq '.Data.[] | {"name": "\(.Name)", "store_id": "\(.StoreID)"}'
# BROKEN
fastly kv-store list --json | jq '.Data.[] | {"name": \(.Name), "store_id": \(.StoreID)}'
Here is the example JSON
{
"commands": [
{
"name": "...",
...
},
]
}
We want to get the object that has a "name"
field set with a value of "compute"
. This means we need to first get the top-level "commands"
field, then dip inside its assigned list and find the relevant object.
go run cmd/fastly/main.go help --format json | jq '.commands[] | select(.name | contains("compute"))'
NOTE: an alternative to
contains
would be==
operator:fastly service list --json | jq 'map(select(.Name == "testing-tf-provider-reactivation-bug"))'
$ curl -s https://api.github.com/repos/hashicorp/terraform-plugin-docs/releases/latest | \
jq '.assets[] | select(.name | test("darwin")) | .browser_download_url' | \
xargs -I {} curl -sLo /tmp/tfplugindocs.zip {} | \
cd /tmp | \
unzip tfplugindocs.zip tfplugindocs | \
chmod +x ./tfplugindocs | \
./tfplugindocs -h
Below is an alternative approach I found via https://smarterco.de/download-latest-version-from-github-with-curl/
DOWNLOAD_URL=$(curl -s https://api.github.com/repos/felixb/swamp/releases/latest \
| grep browser_download_url \
| grep swamp_amd64 \
| cut -d '"' -f 4)
curl -s -L --create-dirs -o ~/downloadDir "$DOWNLOAD_URL"
function configure_rig_environment {
# The rig platform sets up the environment variables from the config.yml
# before even the scripts/prebuild (as part of a docker image build) is
# triggered. Because of the config.yml interpolation with other yaml files,
# it means we need to reset the CONFIG environment variable for testing.
# first we need to convert the config.yml into json
python -c 'import sys, yaml, json; json.dump(yaml.load(sys.stdin), sys.stdout, indent=4)' < /app/config.yml > /tmp/config.json
# we extract the location/overrides (which doesn't change between environments).
cat /tmp/config.json | jq .default.config.locations > /tmp/locations.json
cat /tmp/config.json | jq .default.config.overrides > /tmp/overrides.json
# then we store off the current config (this is missing the
# locations/overrides) as they're only merged in after the image was built.
echo $CONFIG > /tmp/config.json
# next we combine the three configs back into one
# and we generate a new environment.json file for it
# -c is for generating compact json and not pretty-printed json
jq -c -s '.[0] * {"locations": .[1]} * {"overrides": .[2]}' /tmp/config.json /tmp/locations.json /tmp/overrides.json > /tmp/environment.json
# finally, we re-export the config with the new combined values
# shellcheck disable=SC2155
export CONFIG=$(cat /tmp/environment.json)
}
We have the following YAML, and I want all API paths and HTTP methods that contain a x-fastly-preprocess-exclude
that matches a string I’m searching for (e.g. only filter the data when I’m searching for fastly-ruby
)…
paths:
/content/edge_check:
get:
summary: Check status of content in each POP's cache
description: Retrieve headers and MD5 hash of the content for a particular URL from each Fastly edge server. This API is limited to 200 requests per hour.
operationId: content-check
parameters:
- name: url
in: query
description: Full URL (host and path) to check on all nodes. if protocol is omitted, http will be assumed.
style: form
explode: true
schema:
type: string
example: https://www.example.com/foo/bar
responses:
"200":
description: OK
content:
application/json:
schema:
type: array
items:
$ref: "#/components/schemas/content"
examples:
body:
value:
$ref: "examples/content-edge-check.yaml"
x-fastly-preprocess-exclude:
- fastly-ruby # fastly-ruby cannot handle a property named 'hash'
We convert it to JSON and pipe that JSON data to a jq
script…
#!/bin/bash
__dir=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)
CLIENT=$1
ISSUES="\nThe $CLIENT API client currently does not support the following endpoints:\n\n"
HAS_ISSUES=false
DEFAULT_URL="https://developer.fastly.com/reference/api/"
# Create temporary file that will eventually contain all unsupported endpoints.
tmp="$(mktemp)"
for schema in ./api-documentation/src/*.yaml; do
# NOTE: We convert each YAML schema into JSON and process it with jq.
json=$(ruby -rjson -ryaml -e "puts YAML.load_file('$schema', permitted_classes: [Time]).to_json")
# We need the API endpoint so we can link to it in the API client's README.
url=$(echo $json | jq -r '.externalDocs.url')
if [[ "$url" == null ]]; then
url="$DEFAULT_URL"
fi
exclude_all=$(echo $json | jq --arg client "$CLIENT" -r '. | if has("x-fastly-preprocess-exclude") then (if (.["x-fastly-preprocess-exclude"] | contains([$client])) or (.["x-fastly-preprocess-exclude"] | contains(["api-client"])) then true else false end) else false end')
EXCLUDE_ALL="$exclude_all"
result=$(echo $json | jq -f "$__dir/transmogrify.jq" --arg exclude_all "$EXCLUDE_ALL" --arg client "$CLIENT" --arg url "$url")
if [[ "$result" != "" ]]; then
HAS_ISSUES=true
# NOTE:
# We get the 'raw' value (jq -r) to avoid having "" around the output string.
# The reason for storing the results into a tmp file is so we can sort the endpoints.
echo $result | jq -r >> $tmp
cat $tmp | sort -o $tmp
fi
done
# NOTE:
# The Awk command is used to replace line breaks with the literal \n.
# Otherwise Awk will later error with the message "newline in string".
ISSUES="${ISSUES}$(cat $tmp | awk '{ printf "%s\\n", $0 }')"
if [[ "$HAS_ISSUES" == true ]]; then
# NOTE: I've used Awk as (when installed) it's consistent across OS' unlike sed.
awk -v issues="$ISSUES" '1;/## Issues/{ print issues; }' "$CLIENT/README.md"
fi
# Cleanup.
rm -rf tmp
Here’s our jq
transmorgrify.jq script file…
.paths | with_entries(
# If we're excluding all endpoints, then all we need to do is to grab the
# keys and convert them into a comma-separated string.
#
# Otherwise, we need to find only those endpoints that have an
# x-fastly-preprocess-exclude containing the API client we're looking for.
#
# .key is the API endpoint path (e.g. /service/{service_id}/version/{version_id}/acl)
# .value is an object containing the supported HTTP methods (e.g. {"get": {...metadata...}, "post": {...metadata...}})
if $exclude_all == "true" then
.value |= (
. | keys | map(select(. != "parameters")) | join(", ")
)
else
# We modify the .value to replace the metadata associated with each HTTP
# method with a list containing the API client we're searching for inside
# of the x-fastly-preprocess-exclude field.
#
# e.g. {"/some/path": {"get": ["fastly-rust"], "post": ["fastly-rust"]}}
.value |= (
. | with_entries(
# .key is the HTTP method (e.g. "get", "post" etc)
# .value is the metadata object (e.g. {"summary": "...", "x-fastly-preprocess-exclude": [...]})
#
# We modify the .value from an object of metadata to a list
# containing only the excluded API client we're searching for
# inside of the x-fastly-preprocess-exclude field.
#
# e.g. {"get": ["fastly-rust"]} or {"get": ["api-client"]}
.value |= (
if (type=="object" and has("x-fastly-preprocess-exclude")) then
.["x-fastly-preprocess-exclude"]
else
[]
end
)
# We make sure to filter out the API clients that aren't relevant
# to our search. We also search for the generic "api-client",
# which means exclude ALL clients.
| select((.value | contains([$client])) or (.value | contains(["api-client"])))
)
)
# For each HTTP method object, keep it, if it's not empty.
| select(.value != {})
# We modify the .value from an object like:
# {"get": ["fastly-rust"], "post": ["fastly-rust"]}
#
# To a comma-separated string like:
# "get, post".
#
# We do this by first getting the keys as an array ["get", "post"], then
# using join() to turn it into a string.
#
# e.g. {"/some/path": "get, post"}
| .value |= (. | keys | map(select(. != "parameters")) | join(", "))
end
)
# We only keep objects (e.g. {"/some/path": "get, post"}) that aren't empty.
| select(. != {})
# Finally, we convert the object (e.g. {"/some/path": "get, post"}) into a
# string that is formated so it can be inserted into a Markdown file.
#
# e.g. "- /some/path (get, post)"
#
# The trick is to modify the object key to be the final string, then get all the
# keys as an array, and join the array with a newline.
#
# I use `with_entries` to modify the key to also contain its value (e.g. the key
# becomes "- /some/path (GET, POST)").
#
# You'll notice I use `ascii_upcase` to capitalize the HTTP methods.
# Then I create an array from the modified keys and join those by a new line.
#
# Additionally I wrap the path in a Markdown link and set the URL using the $url
# variable that is passed into the jq script via jq's --arg flag.
| with_entries(.value as $v | .key |= "- [" + . + "](" + $url +") (" + ($v | ascii_upcase) + ")") | keys | join("\n")
# EXAMPLE OUTPUT:
#
# - [/user-groups/{user_group_id}](...) (PATCH)
# - [/user-groups/{user_group_id}/roles](...) (DELETE, POST)
# - [/user-groups/{user_group_id}/service-groups](...) (DELETE, POST)
# - [/user-groups/{user_group_id}/members](...) (DELETE, POST)
#
# DOCUMENTATION REFERENCE:
#
# with_entries:
# converts object to list of key/value objects, modifies the data, then converts back to an object.
# https://stedolan.github.io/jq/manual/#to_entries,from_entries,with_entries
#
# select:
# returns input value if it matches the given condition.
# https://stedolan.github.io/jq/manual/#select(boolean_expression)
#
# keys:
# takes object and returns the keys in an array.
# https://stedolan.github.io/jq/manual/#keys,keys_unsorted
#
# |= is a "modification assignment operator" which means it assigns the new value after processing.
You can use the +=
with an object {...}
# .paths == {"/api/path": {"parameters": [...], "get": {...}, "post": {...}}}
.paths | with_entries(
# .key == /api/path
# .value == {parameters: [...], get: {...}, post: {...}}
.value |= (
. | with_entries(
# .key == 'parameters' and all support HTTP methods (e.g. 'get', 'post').
# .value == the values assigned to the keys, e.g. HTTP methods have {"summary": "", ...etc}.
.value |= (
if (type=="object" and has("summary")) then
.summary
elif (type=="array") then # i.e. "parameters" is an array
. | map(."$ref" | split("/") | last)
else
"No summary"
end
)
) | . += {"documentation": $api_url}
)
)
Yaml file:
- metadata:
foo: some_string
bar: 123
baz:
- an
- array
qux:
an: object
with: more_keys
Command:
yq -o=json example.yaml | jq '[.[] | select(.metadata != null) | .metadata | paths | join(".")] | unique | sort'
Yaml file:
- host: foo
tls_server_name: foo
- host: bar
tls_server_name: ... # this has a different value to the host key
- host: baz
tls_server_name: baz
Command:
yq e '.[] | select(has("tls_server_name") and .tls_server_name != .host) | {"tls_server_name": .tls_server_name, "host": .host}' resources/checkers.yaml
Output:
host: bar
tls_server_name: ...
Yaml file:
- name: foo
timeout: '10s'
- name: bar
timeout: '5s'
- name: baz
timeout: '20s'
Command:
# converts yaml to json so we use jq
cat resources/checkers.yaml | yq eval -o=json | jq '.[] | select(.timeout != null and (.timeout | sub("s$"; "") | tonumber) > 10) | {name: .name, timeout: .timeout}'
# just uses yq but notice we MUST quote the json key names
# also yq will continue to output yaml even though we define a json output!
yq '.[] | select(.timeout != null and (.timeout | sub("s$"; "") | tonumber) > 10) | {"name": .name, "timeout": .timeout}' resources/checkers.yaml
Output:
{
"name": "nic.in",
"timeout": "30s"
}
{
"name": "nic.gdn",
"timeout": "15s"
}
{
"name": "zdns",
"timeout": "20s"
}
name: nic.in
timeout: 30s
name: nic.gdn
timeout: 15s
name: zdns
timeout: 20s
- name: foo
metadata:
logins:
console: # KEY IS SET HERE
url: 123
- name: bar
metadata:
logins:
ote: # KEY IS NOT SET HERE
url: 123
yq eval -o=json '.' example.yaml | jq -r 'map(select(.metadata.logins.console == null) | .name)'
We have to wrap our data in an array '[.[] | select(.tls_certificate != null)]'
and then we can use unique_by
:
$ yq -o=json example.yaml | jq '[.[] | select(.tls_certificate != null)] | unique_by(.tls_certificate)[] | {tls_cert: .tls_certificate}'
{
"tls_cert": "$EXAMPLE_1"
}
{
"tls_cert": "$EXAMPLE_2"
}