jq Manual (v1.5)
This document began as a Chinese translation of stedolan.github.io/jq/manual/v1.5/, made to promote the use of jq in China.
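A jq program is a "filter": it takes an input and produces an output. There are many builtin filters for extracting a particular field of an object, converting a number to a string, and various other standard tasks.
Filters can be combined in several ways: you can pipe the output of one filter into the input of another, or collect the output of a filter into an array.
Some filters produce multiple results; for instance, one filter produces every element of its input array. Piping that filter into a second one runs the second filter once for each element of the array. In general, tasks that other languages handle with loops and iteration are done in jq by gluing filters together.
Remember that every filter has an input and an output. Even literals like "hello" or 42 are filters: they take an input but always produce the same literal as output. Operations that combine two filters, such as addition, generally feed the same input to both filters and combine the results. So you can implement an averaging filter as add / length, which feeds the input array to both the add filter and the length filter and then divides one result by the other.
But that is getting ahead of ourselves. :) Let's start with something simpler: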
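The idea that constants and add / length are all filters can be sketched in Python. This is only a model, not how jq is implemented: each filter is assumed to be a function from one input value to an iterator of outputs, and the names constant, add, length, and divide are hypothetical helpers chosen for the illustration.

```python
def constant(c):
    """Model of a literal such as 42: ignores its input, always yields c."""
    def run(_input):
        yield c
    return run

def add(value):
    """Model of `add` for an array of numbers only."""
    yield sum(value)

def length(value):
    """Model of `length` for an array."""
    yield len(value)

def divide(left, right):
    """Model of `left / right`: both filters receive the same input."""
    def run(value):
        for r in right(value):
            for l in left(value):
                yield l / r
    return run

average = divide(add, length)          # models `add / length`
avg_result = list(average([1, 2, 3, 6]))
const_result = list(constant(42)("this input is ignored"))
print(avg_result, const_result)
```

The point of the design is that divide never looks at the data itself; it only hands the same input to both sub-filters and combines what they yield.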
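Invoking jq
jq filters run on a stream of JSON data. The input to jq is parsed as a sequence of whitespace-separated JSON values, which are passed through the provided filter one at a time. The outputs of the filter are written to standard output, again as a sequence of whitespace-separated JSON data.
Note: it is important to mind the shell's quoting rules. As a general rule it is best to always quote the jq program (with single quotes), because many characters with special meaning to jq are also shell metacharacters. For example, jq "foo" will fail on most Unix shells because it is the same as jq foo, which generally fails with an error such as foo is not defined. When using the Windows command shell (cmd.exe) it is best to put double quotes around the jq program when it is given on the command line (rather than through the -f program-file option), but then double quotes inside the jq program need backslash escaping.
You can affect how jq reads and writes its input and output using some command-line options: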
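- --version: Output the jq version and exit with status 0.
- --seq: Use the application/json-seq MIME type scheme for separating JSON texts in jq's input and output. This means that an ASCII RS (record separator) character is printed before each value on output and an ASCII LF (line feed) is printed after every output. Input JSON texts that fail to parse are ignored (but warned about), discarding all subsequent input until the next RS. This mode also parses the output of jq run without the --seq option.
- --stream: Parse the input in streaming fashion, outputting arrays of path and leaf values (scalars and empty arrays or empty objects). For example, "a" becomes [[],"a"], and [[],"a",["b"]] becomes [[0],[]], [[1],"a"], and [[1,0],"b"]. This is useful for processing very large inputs. Use this in conjunction with filtering and the reduce and foreach syntax to reduce large inputs incrementally.
- --slurp / -s: Instead of running the filter for each JSON object in the input, read the entire input stream into one large array and run the filter just once.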
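- --raw-input / -R: Don't parse the input as JSON. Instead, each line of text is passed to the filter as a string. If combined with --slurp, the entire input is passed to the filter as a single long string.
- --null-input / -n: Don't read any input at all! Instead, the filter is run once using null as the input. This is useful when using jq as a simple calculator or to construct JSON data from scratch.
- --compact-output / -c: By default, jq pretty-prints JSON output. Using this option results in more compact output, with each JSON object on a single line.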
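- --tab: Use a tab for each indentation level instead of two spaces.
- --indent n: Use the given number of spaces (no more than 8) for indentation.
- --color-output / -C and --monochrome-output / -M: By default, jq outputs colored JSON if writing to a terminal. You can force it to produce color even when writing to a pipe or a file using -C, and disable color with -M.
- --ascii-output / -a: jq usually outputs non-ASCII Unicode codepoints as UTF-8, even if the input specified them as escape sequences (like "\u03bc"). Using this option, you can force jq to produce pure ASCII output with every non-ASCII character replaced with the equivalent escape sequence.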
- --unbuffered: Flush the output after each JSON object is printed (useful if you're piping a slow data source into jq and piping jq's output elsewhere).
- --sort-keys / -S: Output the fields of each object with the keys in sorted order.
- --raw-output / -r: With this option, if the filter's result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems.
- --join-output / -j: Like -r, but jq won't print a newline after each output.
- -f filename / --from-file filename: Read the filter from the file rather than from the command line, like awk's -f option. You can also use # to write comments in the file.
- -Ldirectory / -L directory: Prepend directory to the search list for modules. If this option is used then no builtin search list is used. See the section on modules below.
- -e / --exit-status: Sets the exit status of jq to 0 if the last output value was neither false nor null, 1 if the last output value was either false or null, or 4 if no valid result was ever produced. Normally jq exits with 2 if there was any usage problem or system error, 3 if there was a jq program compile error, or 0 if the jq program ran.
- --arg name value: This option passes a value to the jq program as a predefined variable. If you run jq with --arg foo bar, then $foo is available in the program and has the value "bar". Note that value will be treated as a string, so --arg foo 123 will bind $foo to "123".
- --argjson name JSON-text: This option passes a JSON-encoded value to the jq program as a predefined variable. If you run jq with --argjson foo 123, then $foo is available in the program and has the value 123.
- --slurpfile variable-name filename: This option reads all the JSON texts in the named file and binds an array of the parsed JSON values to the given global variable. If you run jq with --slurpfile foo bar, then $foo is available in the program and holds an array whose elements correspond to the JSON texts in the file named bar.
- --argfile variable-name filename: Do not use. Use --slurpfile instead. (This option is like --slurpfile, but when the file has just one JSON text, then that text is used; otherwise an array of texts is used, as in --slurpfile.)
- --run-tests [filename]: Runs the tests in the given file or standard input. This must be the last option given and does not honor all preceding options. The input consists of comment lines, empty lines, and program lines followed by one input line, as many lines of output as are expected (one per output), and a terminating empty line. Compilation failure tests start with a line containing only "%%FAIL", then a line containing the program to compile, then a line containing an error message to compare to the actual. Be warned that this option can change backwards-incompatibly.
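A few of the options above differ only in how values reach the filter. The following Python sketch models that difference under simplifying assumptions: the "program" is a plain Python function, inputs are given as a list of JSON texts, and bind_arg, bind_argjson, and run are hypothetical names, not part of jq.

```python
import json

def bind_arg(value):
    """Model of --arg: the value is always bound as a string."""
    return str(value)

def bind_argjson(text):
    """Model of --argjson: the text is parsed as JSON before binding."""
    return json.loads(text)

def run(program, json_texts, slurp=False, null_input=False):
    """Apply `program` the way the input options describe."""
    if null_input:
        inputs = [None]                                  # -n: run once on null
    elif slurp:
        inputs = [[json.loads(t) for t in json_texts]]   # -s: one big array
    else:
        inputs = [json.loads(t) for t in json_texts]     # default: one run per value
    return [program(value) for value in inputs]

texts = ["[1, 2]", "[3]"]
per_value = run(len, texts)                               # one result per input
slurped = run(len, texts, slurp=True)                     # a single result
from_null = run(lambda _: 1 + 1, texts, null_input=True)  # input is ignored
print(per_value, slurped, from_null)
```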
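Basic filters
.
The absolute simplest (and least interesting) filter is `.`. It takes its input and produces it unchanged as output.
Since jq pretty-prints all output by default, this trivial program can be a useful way of formatting JSON output from, say, curl.
jq '.'
--------------------
Input "Hello, world!"
Output "Hello, world!"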
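.foo, .foo.bar
The simplest useful filter is .foo. When given a JSON object (also known as a dictionary or hash) as input, it produces the value at the key "foo", or null if there is no such key.
If the key contains special characters, you need to surround it with double quotes, like this: ."foo$".
A filter of the form .foo.bar is equivalent to .foo|.bar.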
jq '.foo'
--------------------
Input {"foo": 42, "bar": "less interesting data"}
Output 42
jq '.foo'
--------------------
Input {"notfoo": true, "alsonotfoo": false}
Output null
jq '.["foo"]'
--------------------
Input {"foo": 42}
Output 42
.foo?
Just like .foo, but does not output even an error when . is not an array or an object.
jq '.foo?'
--------------------
Input {"foo": 42, "bar": "less interesting data"}
Output 42
jq '.foo?'
--------------------
Input {"notfoo": true, "alsonotfoo": false}
Output null
jq '.["foo"]?'
--------------------
Input {"foo": 42}
Output 42
jq '[.foo?]'
--------------------
Input [1,2]
Output []
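.[<string>], .[2], .[10:15]
You can also look up fields of an object using syntax like .["foo"] (.foo above is a shorthand version of this). This syntax also works for arrays, if the key is an integer. Arrays are zero-based (like JavaScript), so .[2] returns the third element of the array.
The .[10:15] syntax can be used to return a subarray of an array or a substring of a string. The array returned by .[10:15] will be of length 5, containing the elements from index 10 (inclusive) to index 15 (exclusive). Either index may be negative (in which case it counts backwards from the end of the array) or omitted (in which case it refers to the start or end of the array).
The .[2] syntax can be used to return the element at the given index. Negative indices are allowed, with -1 referring to the last element, -2 referring to the next to last element, and so on.
The .foo syntax only works for simple keys, that is, keys that are all alphanumeric characters.
The .[<string>] syntax works with keys that contain special characters such as colons and dots. For example .["foo::bar"] and .["foo.bar"] work, while .foo::bar and .foo.bar would not (.foo.bar is read as .foo|.bar rather than as a lookup of the key "foo.bar").
The ? "operator" can also be used with the slice operator, as in .[10:15]?, which outputs values where the inputs are slice-able.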
jq '.[0]'
--------------------
Input [{"name":"JSON", "good":true}, {"name":"XML", "good":false}]
Output {"name":"JSON", "good":true}
jq '.[2]'
--------------------
Input [{"name":"JSON", "good":true}, {"name":"XML", "good":false}]
Output null
jq '.[2:4]'
--------------------
Input ["a","b","c","d","e"]
Output ["c", "d"]
jq '.[2:4]'
--------------------
Input "abcdefghi"
Output "cd"
jq '.[:3]'
--------------------
Input ["a","b","c","d","e"]
Output ["a", "b", "c"]
jq '.[-2:]'
--------------------
Input ["a","b","c","d","e"]
Output ["d", "e"]
jq '.[-2]'
--------------------
Input [1,2,3]
Output 2
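The index and slice rules above line up closely with Python's own sequence slicing, with one difference worth noting: an out-of-range .[n] yields null in jq, whereas Python raises an exception. The sketch below models the array and string cases only; index and slice_ are hypothetical helper names.

```python
def index(array, i):
    """Model of `.[i]` on an array: negative i counts from the end,
    and an out-of-range index gives None (jq's null) instead of an error."""
    if -len(array) <= i < len(array):
        return array[i]
    return None

def slice_(value, start=None, stop=None):
    """Model of `.[start:stop]` on an array or string; either bound may be omitted."""
    return value[start:stop]

letters = ["a", "b", "c", "d", "e"]
print(index([1, 2, 3], -2))       # second-to-last element
print(index(["x", "y"], 2))       # out of range
print(slice_(letters, 2, 4))
print(slice_("abcdefghi", 2, 4))
print(slice_(letters, None, 3))
print(slice_(letters, -2, None))
```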
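.[]
If you use the .[index] syntax but omit the index entirely, it will return all of the elements of an array. Running .[] with the input [1,2,3] will produce the numbers as three separate results, rather than as a single array.
You can also use this on an object, and it will return all the values of the object.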
jq '.[]'
--------------------
Input [{"name":"JSON", "good":true}, {"name":"XML", "good":false}]
Output {"name":"JSON", "good":true}
{"name":"XML", "good":false}
jq '.[]'
--------------------
Input []
Output
jq '.[]'
--------------------
Input {"a": 1, "b": 1}
Output 1
1
.[]?
Like .[], but no errors will be output if . is not an array or object.
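,
If two filters are separated by a comma, then the input will be fed into both and there will be multiple outputs: first, all of the outputs produced by the left expression, and then all of the outputs produced by the right. For instance, the filter .foo, .bar produces both the "foo" field and the "bar" field as separate outputs.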
jq '.foo, .bar'
--------------------
Input {"foo": 42, "bar": "something else", "baz": true}
Output 42
"something else"
jq '.user, .projects[]'
--------------------
Input {"user":"stedolan", "projects": ["jq", "wikiflow"]}
Output "stedolan"
"jq"
"wikiflow"
jq '.[4,2]'
--------------------
Input ["a","b","c","d","e"]
Output "e"
"c"
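|
The | operator combines two filters by feeding the output(s) of the one on the left into the input of the one on the right. It is pretty much the same as the Unix shell's pipe, if you are used to that.
If the filter on the left produces multiple results, the one on the right will be run for each of those results. So, the expression .[] | .foo retrieves the "foo" field of each element of the input array.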
jq '.[] | .name'
--------------------
Input [{"name":"JSON", "good":true}, {"name":"XML", "good":false}]
Output "JSON"
"XML"
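The way , and | handle multiple results can be modeled with Python generators. This is a sketch under the assumption that a filter is a function from one input to an iterator of outputs; iterate, field, pipe, and comma are hypothetical names for the illustration, not jq internals.

```python
def iterate(value):
    """Model of `.[]`: every element of an array, or every value of an object."""
    if isinstance(value, dict):
        yield from value.values()
    else:
        yield from value

def field(name):
    """Model of `.name`: None (null) if the key is missing or the input is null."""
    def run(value):
        yield None if value is None else value.get(name)
    return run

def pipe(left, right):
    """Model of `left | right`: run `right` once per output of `left`."""
    def run(value):
        for intermediate in left(value):
            yield from right(intermediate)
    return run

def comma(first, second):
    """Model of `first, second`: all outputs of `first`, then all of `second`."""
    def run(value):
        yield from first(value)
        yield from second(value)
    return run

formats = [{"name": "JSON", "good": True}, {"name": "XML", "good": False}]
names = list(pipe(iterate, field("name"))(formats))            # `.[] | .name`

record = {"user": "stedolan", "projects": ["jq", "wikiflow"]}
user_and_projects = list(
    comma(field("user"), pipe(field("projects"), iterate))(record)
)                                                              # `.user, .projects[]`
print(names, user_and_projects)
```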
Types and Values
jq supports the same set of datatypes as JSON: numbers, strings, booleans, arrays, objects (which in JSON-speak are hashes with only string keys), and null.
Booleans, null, strings and numbers are written the same way as in JavaScript. Just like everything else in jq, these simple values take an input and produce an output: 42 is a valid jq expression that takes an input, ignores it, and returns 42 instead.
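Array construction - []
As in JSON, [] is used to construct arrays, as in [1,2,3]. The elements of the array can be any jq expression. All of the results produced by all of the expressions are collected into one big array. You can use it to construct an array out of a known quantity of values (as in [.foo, .bar, .baz]) or to "collect" all the results of a filter into an array (as in [.items[].name]).
Once you understand the , operator, you can look at jq's array syntax in a different light: the expression [1,2,3] is not using a built-in syntax for comma-separated arrays, but is instead applying the [] operator (collect results) to the expression 1,2,3 (which produces three different results).
If you have a filter X that produces four results, then the expression [X] will produce a single result, an array of four elements.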
jq '[.user, .projects[]]'
-------------------------------------
Input {"user":"stedolan", "projects": ["jq", "wikiflow"]}
Output ["stedolan", "jq", "wikiflow"]
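Objects - {}
As in JSON, {} is for constructing objects (also known as dictionaries or hashes), as in: {"a": 42, "b": 17}.
If the keys are "sensible" (all alphabetic characters), then the quotes can be left off. The value can be any expression (although you may need to wrap it in parentheses if it is a complicated one), which gets applied to the {} expression's input (remember, all filters have an input and an output).
{foo: .bar}
will produce the JSON object {"foo": 42} if given the JSON object {"bar":42, "baz":43}.
You can use this to select particular fields of an object: if the input is an object with "user", "title", "id", and "content" fields and you just want "user" and "title", you can write
{user: .user, title: .title}
Because that is so common, there is a shortcut syntax: {user, title}.
If one of the expressions produces multiple results, multiple dictionaries will be produced. If the input is
{"user":"stedolan","titles":["JQ Primer", "More JQ"]}
then the expression
{user, title: .titles[]}
will yield two outputs:
{"user":"stedolan", "title": "JQ Primer"}
{"user":"stedolan", "title": "More JQ"}
Putting parentheses around the key means it will be evaluated as an expression. With the same input as above,
{(.user): .titles}
produces
{"stedolan": ["JQ Primer", "More JQ"]}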
jq '{user, title: .titles[]}'
-------------------------------------
Input {"user":"stedolan","titles":["JQ Primer", "More JQ"]}
Output {"user":"stedolan", "title": "JQ Primer"}
{"user":"stedolan", "title": "More JQ"}
jq '{(.user): .titles}'
-------------------------------------
Input {"user":"stedolan","titles":["JQ Primer", "More JQ"]}
Output {"stedolan": ["JQ Primer", "More JQ"]}
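The "one object per combination of results" behaviour can be sketched in Python as a Cartesian product over the outputs of each value expression. This is a model only: build_objects is a hypothetical helper, each value expression is assumed to be a function returning a list of outputs, and the ordering jq uses when several fields each produce multiple results is not modeled here.

```python
from itertools import product

def build_objects(fields, value):
    """fields: list of (key, expression) pairs, where expression(value)
    returns the list of outputs for that field. One dict per combination."""
    keys = [key for key, _ in fields]
    outputs = [expression(value) for _, expression in fields]
    return [dict(zip(keys, combo)) for combo in product(*outputs)]

record = {"user": "stedolan", "titles": ["JQ Primer", "More JQ"]}

# Models `{user, title: .titles[]}`
objects = build_objects(
    [
        ("user", lambda v: [v["user"]]),         # `.user`: a single output
        ("title", lambda v: list(v["titles"])),  # `.titles[]`: one output per title
    ],
    record,
)
for obj in objects:
    print(obj)
```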
Builtin operators and functions
Some jq operators (for instance, +) do different things depending on the type of their arguments (arrays, numbers, etc.). However, jq never does implicit type conversions. If you try to add a string to an object you'll get an error message and no result.
Addition - +
The operator + takes two filters, applies them both to the same input, and adds the results together. What "adding" means depends on the types involved:
- Numbers are added by normal arithmetic.
- Arrays are added by being concatenated into a larger array.
- Strings are added by being joined into a larger string.
- Objects are added by merging, that is, inserting all the key-value pairs from both objects into a single combined object. If both objects contain a value for the same key, the object on the right of the + wins. (For recursive merge use the * operator.)
null can be added to any value, and returns the other value unchanged.
jq '.a + 1'
--------------------
Input {"a": 7}
Output 8
jq '.a + .b'
--------------------
Input {"a": [1,2], "b": [3,4]}
Output [1,2,3,4]
jq '.a + null'
--------------------
Input {"a": 1}
Output 1
jq '.a + 1'
--------------------
Input {}
Output 1
jq '{a: 1} + {b: 2} + {c: 3} + {a: 42}'
--------------------
Input null
Output {"a": 42, "b": 2, "c": 3}
Subtraction - -
As well as normal arithmetic subtraction on numbers, the - operator can be used on arrays to remove all occurrences of the second array's elements from the first array.
jq '4 - .a'
--------------------
Input {"a":3}
Output 1
jq '. - ["xml", "yaml"]'
--------------------
Input ["xml", "yaml", "json"]
Output ["json"]
Multiplication, division, modulo - *, /, and %
These infix operators behave as expected when given two numbers. Division by zero raises an error. x % y computes x modulo y.
Multiplying a string by a number produces the concatenation of that string that many times. "x" * 0 produces null.
Dividing a string by another splits the first using the second as separators.
Multiplying two objects will merge them recursively: this works like addition but if both objects contain a value for the same key, and the values are objects, the two are merged with the same strategy.
jq '10 / . * 3'
--------------------
Input 5
Output 6
jq '. / ", "'
--------------------
Input "a, b,c,d, e"
Output ["a","b,c,d","e"]
jq '{"k": {"a": 1, "b": 2}} * {"k": {"a": 0,"c": 3}}'
--------------------
Input null
Output {"k": {"a": 0, "b": 2, "c": 3}}
jq '.[] | (1 / .)?'
--------------------
Input [1,0,-1]
Output 1
-1
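The type-dependent meaning of + and the recursive merge done by * on objects can be sketched in Python as follows. This is a model of the rules stated above, not jq's implementation; jq_add and deep_merge are hypothetical names, and mismatched types raise TypeError to mirror "an error message and no result".

```python
def jq_add(a, b):
    """Model of `a + b` following the rules listed for the + operator."""
    if a is None:
        return b                      # null + x is x
    if b is None:
        return a                      # x + null is x
    if isinstance(a, bool) or isinstance(b, bool):
        raise TypeError("booleans cannot be added")
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return a + b                  # arithmetic
    if isinstance(a, str) and isinstance(b, str):
        return a + b                  # string concatenation
    if isinstance(a, list) and isinstance(b, list):
        return a + b                  # array concatenation
    if isinstance(a, dict) and isinstance(b, dict):
        return {**a, **b}             # shallow merge, right side wins
    raise TypeError("no implicit type conversion")

def deep_merge(a, b):
    """Model of `a * b` on two objects: like +, but nested objects are merged too."""
    merged = dict(a)
    for key, value in b.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

chained = jq_add(jq_add(jq_add({"a": 1}, {"b": 2}), {"c": 3}), {"a": 42})
nested = deep_merge({"k": {"a": 1, "b": 2}}, {"k": {"a": 0, "c": 3}})
print(chained, nested)
```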
length
The builtin function length gets the length of various different types of value:
- The length of a string is the number of Unicode codepoints it contains (which will be the same as its JSON-encoded length in bytes if it's pure ASCII).
- The length of an array is the number of elements.
- The length of an object is the number of key-value pairs.
- The length of null is zero.
jq '.[] | length'
--------------------
Input [[1,2], "string", {"a":2}, null]
Output 2
6
1
0
keys, keys_unsorted
The builtin function keys, when given an object, returns its keys in an array.
The keys are sorted "alphabetically", by unicode codepoint order. This is not an order that makes particular sense in any particular language, but you can count on it being the same for any two objects with the same set of keys, regardless of locale settings.
When keys is given an array, it returns the valid indices for that array: the integers from 0 to length-1.
The keys_unsorted function is just like keys, but if the input is an object then the keys will not be sorted, instead the keys will roughly be in insertion order.
jq 'keys'
--------------------
Input {"abc": 1, "abcd": 2, "Foo": 3}
Output ["Foo", "abc", "abcd"]
jq 'keys'
--------------------
Input [42,3,35]
Output [0,1,2]
has(key)
The builtin function has returns whether the input object has the given key, or the input array has an element at the given index.
has($key) has the same effect as checking whether $key is a member of the array returned by keys, although has will be faster.
jq 'map(has("foo"))'
--------------------
Input [{"foo": 42}, {}]
Output [true, false]
jq 'map(has(2))'
--------------------
Input [[0,1], ["a","b","c"]]
Output [false, true]
in
The builtin function in returns whether or not the input key is in the given object, or the input index corresponds to an element in the given array. It is, essentially, an inversed version of has.
jq '.[] | in({"foo": 42})'
--------------------
Input ["foo", "bar"]
Output true
false
jq 'map(in([0,1]))'
--------------------
Input [2, 0]
Output [false, true]
path(path_expression)
Outputs array representations of the given path expression in `.`. The outputs are arrays of strings (object keys) and/or numbers (array indices).
Path expressions are jq expressions like .a, but also .[]. There are two types of path expressions: ones that can match exactly, and ones that cannot. For example, .a.b.c is an exact match path expression, while .a[].b is not.
path(exact_path_expression) will produce the array representation of the path expression even if it does not exist in `.`, if `.` is null or an array or an object.
path(pattern) will produce array representations of the paths matching pattern if the paths exist in `.`.
Note that the path expressions are not different from normal expressions. The expression path(..|select(type=="boolean")) outputs all the paths to boolean values in `.`, and only those paths.
jq 'path(.a[0].b)'
--------------------
Input null
Output ["a",0,"b"]
jq '[path(..)]'
--------------------
Input {"a":[{"b":1}]}
Output [[],["a"],["a",0],["a",0,"b"]]
del(path_expression)
The builtin function del removes a key and its corresponding value from an object.
jq 'del(.foo)'
--------------------
Input {"foo": 42, "bar": 9001, "baz": 42}
Output {"bar": 9001, "baz": 42}
jq 'del(.[1, 2])'
--------------------
Input ["foo", "bar", "baz"]
Output ["foo"]
to_entries, from_entries, with_entries
These functions convert between an object and an array of key-value pairs. If to_entries is passed an object, then for each k: v entry in the input, the output array includes {"key": k, "value": v}.
from_entries does the opposite conversion, and with_entries(foo) is a shorthand for to_entries | map(foo) | from_entries, useful for doing some operation to all keys and values of an object. from_entries accepts key, Key, Name, value and Value as keys.
jq 'to_entries'
--------------------
Input {"a": 1, "b": 2}
Output [{"key":"a", "value":1}, {"key":"b", "value":2}]
jq 'from_entries'
--------------------
Input [{"key":"a", "value":1}, {"key":"b", "value":2}]
Output {"a": 1, "b": 2}
jq 'with_entries(.key |= "KEY_" + .)'
--------------------
Input {"a": 1, "b": 2}
Output {"KEY_a": 1, "KEY_b": 2}
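The to_entries | map(foo) | from_entries shorthand can be sketched in Python. This is a simplified model: it handles only the lowercase "key" and "value" spellings (not the Key, Name, and Value variants mentioned above), and it assumes the per-entry function returns exactly one entry.

```python
def to_entries(obj):
    """Model of `to_entries`: one {"key", "value"} pair per object member."""
    return [{"key": k, "value": v} for k, v in obj.items()]

def from_entries(entries):
    """Model of `from_entries`, restricted to lowercase "key"/"value"."""
    return {entry["key"]: entry["value"] for entry in entries}

def with_entries(f, obj):
    """Model of `with_entries(f)` as to_entries | map(f) | from_entries."""
    return from_entries([f(entry) for entry in to_entries(obj)])

# Models `with_entries(.key |= "KEY_" + .)`
renamed = with_entries(
    lambda entry: {**entry, "key": "KEY_" + entry["key"]},
    {"a": 1, "b": 2},
)
print(renamed)
```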
select(boolean_expression)
The function select(foo) produces its input unchanged if foo returns true for that input, and produces no output otherwise.
It's useful for filtering lists: [1,2,3] | map(select(. >= 2)) will give you [2,3].
jq 'map(select(. >= 2))'
--------------------
Input [1,5,3,0,7]
Output [5,3,7]
jq '.[] | select(.id == "second")'
--------------------
Input [{"id": "first", "val": 1}, {"id": "second", "val": 2}]
Output {"id": "second", "val": 2}
arrays, objects, iterables, booleans, numbers, normals, finites, strings, nulls, values, scalars
These built-ins select only inputs that are arrays, objects, iterables (arrays or objects), booleans, numbers, normal numbers, finite numbers, strings, null, non-null values, and non-iterables, respectively.
jq '.[]|numbers'
--------------------
Input [[],{},1,"foo",null,true,false]
Output 1
empty
empty returns no results. None at all. Not even null.
It's useful on occasion. You'll know if you need it :)
jq '1, empty, 2'
--------------------
Input null
Output 1
2
jq '[1,2,empty,3]'
--------------------
Input null
Output [1,2,3]
error(message)
Produces an error, just like .a applied to values other than null and objects would, but with the given message as the error's value.
$__loc__
Produces an object with a "file" key and a "line" key, with the filename and line number where $__loc__ occurs, as values.
jq 'try error("\($__loc__)") catch .'
--------------------
Input null
Output "{\"file\":\"<top-level>\",\"line\":1}"
map(x), map_values(x)
For any filter x, map(x) will run that filter for each element of the input array, and return the outputs in a new array. map(.+1) will increment each element of an array of numbers.
Similarly, map_values(x) will run that filter for each element, but it will return an object when an object is passed.
map(x) is equivalent to [.[] | x]. In fact, this is how it's defined. Similarly, map_values(x) is defined as .[] |= x.
jq 'map(.+1)'
--------------------
Input [1,2,3]
Output [2,3,4]
jq 'map_values(.+1)'
--------------------
Input {"a": 1, "b": 2, "c": 3}
Output {"a": 2, "b": 3, "c": 4}
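The difference between the two definitions, [.[] | x] versus .[] |= x, can be sketched in Python. This is a model under stated assumptions: for jq_map the filter is a function returning a list of outputs (so a filter with several outputs lengthens the result), while for map_values the filter is assumed to return exactly one value per element. Both names are hypothetical.

```python
def jq_map(f, array):
    """Model of `map(f)` as `[.[] | f]`: collect every output of f."""
    return [output for element in array for output in f(element)]

def map_values(f, value):
    """Model of `map_values(f)` as `.[] |= f`: the container shape is kept."""
    if isinstance(value, dict):
        return {key: f(item) for key, item in value.items()}
    return [f(item) for item in value]

incremented = jq_map(lambda x: [x + 1], [1, 2, 3])       # models `map(.+1)`
duplicated = jq_map(lambda x: [x, x], [1, 2])            # models `map(., .)`
object_result = map_values(lambda x: x + 1, {"a": 1, "b": 2, "c": 3})
print(incremented, duplicated, object_result)
```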
paths, paths(node_filter), leaf_paths
paths outputs the paths to all the elements in its input (except it does not output the empty list, representing . itself).
paths(f) outputs the paths to any values for which f is true. That is, paths(numbers) outputs the paths to all numeric values.
leaf_paths is an alias of paths(scalars); leaf_paths is deprecated and will be removed in the next major release.
jq '[paths]'
--------------------
Input [1,[[],{"a":2}]]
Output [[0],[1],[1,0],[1,1],[1,1,"a"]]
jq '[paths(scalars)]'
--------------------
Input [1,[[],{"a":2}]]
Output [[0],[1,1,"a"]]
add
The filter add takes as input an array, and produces as output the elements of the array added together. This might mean summed, concatenated or merged depending on the types of the elements of the input array - the rules are the same as those for the + operator (described above).
If the input is an empty array, add returns null.
jq 'add'
--------------------
Input ["a","b","c"]
Output "abc"
jq 'add'
--------------------
Input [1, 2, 3]
Output 6
jq 'add'
--------------------
Input []
Output null
any, any(condition), any(generator; condition)
The filter any takes as input an array of boolean values, and produces true as output if any of the elements of the array are true.
If the input is an empty array, any returns false.
The any(condition) form applies the given condition to the elements of the input array.
The any(generator; condition) form applies the given condition to all the outputs of the given generator.
jq 'any'
--------------------
Input [true, false]
Output true
jq 'any'
--------------------
Input [false, false]
Output false
jq 'any'
--------------------
Input []
Output false
all, all(condition), all(generator; condition)
The filter all takes as input an array of boolean values, and produces true as output if all of the elements of the array are true.
The all(condition) form applies the given condition to the elements of the input array.
The all(generator; condition) form applies the given condition to all the outputs of the given generator.
If the input is an empty array, all returns true.
jq 'all'
--------------------
Input [true, false]
Output false
jq 'all'
--------------------
Input [true, true]
Output true
jq 'all'
--------------------
Input []
Output true
flatten, flatten(depth)
The filter flatten takes as input an array of nested arrays, and produces a flat array in which all arrays inside the original array have been recursively replaced by their values. You can pass an argument to it to specify how many levels of nesting to flatten.
flatten(2) is like flatten, but going only up to two levels deep.
jq 'flatten'
--------------------
Input [1, [2], [[3]]]
Output [1, 2, 3]
jq 'flatten(1)'
--------------------
Input [1, [2], [[3]]]
Output [1, 2, [3]]
jq 'flatten'
--------------------
Input [[]]
Output []
jq 'flatten'
--------------------
Input [{"foo": "bar"}, [{"foo": "baz"}]]
Output [{"foo": "bar"}, {"foo": "baz"}]
range(upto), range(from;upto), range(from;upto;by)
The range function produces a range of numbers. range(4;10) produces 6 numbers, from 4 (inclusive) to 10 (exclusive). The numbers are produced as separate outputs. Use [range(4;10)] to get a range as an array.
The one argument form generates numbers from 0 to the given number, with an increment of 1.
The two argument form generates numbers from from to upto with an increment of 1.
The three argument form generates numbers from from to upto with an increment of by.
jq 'range(2;4)'
--------------------
Input null
Output 2
3
jq '[range(2;4)]'
--------------------
Input null
Output [2,3]
jq '[range(4)]'
--------------------
Input null
Output [0,1,2,3]
jq '[range(0;10;3)]'
--------------------
Input null
Output [0,3,6,9]
jq '[range(0;10;-1)]'
--------------------
Input null
Output []
jq '[range(0;-5;-1)]'
--------------------
Input null
Output [0,-1,-2,-3,-4]
floor
The floor function returns the floor of its numeric input.
jq 'floor'
--------------------
Input 3.14159
Output 3
sqrt
The sqrt function returns the square root of its numeric input.
jq 'sqrt'
--------------------
Input 9
Output 3
tonumber
The tonumber function parses its input as a number. It will convert correctly-formatted strings to their numeric equivalent, leave numbers alone, and give an error on all other input.
jq '.[] | tonumber'
--------------------
Input [1, "1"]
Output 1
1
tostring
The tostring function prints its input as a string. Strings are left unchanged, and all other values are JSON-encoded.
jq '.[] | tostring'
--------------------
Input [1, "1", [1]]
Output "1"
"1"
"[1]"
type
The type function returns the type of its argument as a string, which is one of null, boolean, number, string, array or object.
jq 'map(type)'
--------------------
Input [0, false, [], {}, null, "hello"]
Output ["number", "boolean", "array", "object", "null", "string"]
infinite, nan, isinfinite, isnan, isfinite, isnormal
Some arithmetic operations can yield infinities and "not a number" (NaN) values. The isinfinite builtin returns true if its input is infinite. The isnan builtin returns true if its input is a NaN. The infinite builtin returns a positive infinite value. The nan builtin returns a NaN. The isnormal builtin returns true if its input is a normal number.
Note that division by zero raises an error.
Currently most arithmetic operations operating on infinities, NaNs, and sub-normals do not raise errors.
jq '.[] | (infinite * .) < 0'
--------------------
Input [-1, 1]
Output true
false
jq 'infinite, nan | type'
--------------------
Input null
Output "number"
"number"
sort, sort_by(path_expression)
The sort function sorts its input, which must be an array. Values are sorted in the following order:
- null
- false
- true
- numbers
- strings, in alphabetical order (by unicode codepoint value)
- arrays, in lexical order
- objects
The ordering for objects is a little complex: first they're compared by comparing their sets of keys (as arrays in sorted order), and if their keys are equal then the values are compared key by key.
sort_by may be used to sort by a particular field of an object, or by applying any jq filter.
sort_by(foo) compares two elements by comparing the result of foo on each element.
jq 'sort'
--------------------
Input [8,3,null,6]
Output [null,3,6,8]
jq 'sort_by(.foo)'
--------------------
Input [{"foo":4, "bar":10}, {"foo":3, "bar":100}, {"foo":2, "bar":1}]
Output [{"foo":2, "bar":1}, {"foo":3, "bar":100}, {"foo":4, "bar":10}]
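The cross-type ordering described above can be sketched as a Python sort key. This is a model of the stated rules, not jq's comparison code: sort_key is a hypothetical helper that maps each JSON value to a tuple whose first element is the type rank, so values of different types never get compared directly.

```python
def sort_key(value):
    """Map a JSON value to a tuple that sorts in the order:
    null < false < true < numbers < strings < arrays < objects."""
    if value is None:
        return (0,)
    if value is False:
        return (1,)
    if value is True:
        return (2,)
    if isinstance(value, (int, float)):
        return (3, value)
    if isinstance(value, str):
        return (4, value)                      # Python compares by code point
    if isinstance(value, list):
        return (5, [sort_key(item) for item in value])   # lexical order
    keys = sorted(value)                       # objects: key sets first,
    return (6, keys, [sort_key(value[k]) for k in keys])  # then values key by key

mixed = sorted(["b", [1], True, None, {"a": 1}, 2, False], key=sort_key)
numbers = sorted([8, 3, None, 6], key=sort_key)

# Models `sort_by(.foo)`: compare elements by the key of their "foo" field.
records = [{"foo": 4, "bar": 10}, {"foo": 3, "bar": 100}, {"foo": 2, "bar": 1}]
by_foo = sorted(records, key=lambda record: sort_key(record["foo"]))
print(mixed, numbers, by_foo)
```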
group_by(path_expression)
group_by(.foo) takes as input an array, groups the elements having the same .foo field into separate arrays, and produces all of these arrays as elements of a larger array, sorted by the value of the .foo field.
Any jq expression, not just a field access, may be used in place of .foo. The sorting order is the same as described in the sort function above.
jq 'group_by(.foo)'
--------------------
Input [{"foo":1, "bar":10}, {"foo":3, "bar":100}, {"foo":1, "bar":1}]
Output [[{"foo":1, "bar":10}, {"foo":1, "bar":1}], [{"foo":3, "bar":100}]]
min, max, min_by(path_exp), max_by(path_exp)
Find the minimum or maximum element of the input array.
The min_by(path_exp) and max_by(path_exp) functions allow you to specify a particular field or property to examine, e.g. min_by(.foo) finds the object with the smallest foo field.
jq 'min'
--------------------
Input [5,4,2,7]
Output 2
jq 'max_by(.foo)'
--------------------
Input [{"foo":1, "bar":14}, {"foo":2, "bar":3}]
Output {"foo":2, "bar":3}
unique, unique_by(path_exp)
The unique function takes as input an array and produces an array of the same elements, in sorted order, with duplicates removed.
The unique_by(path_exp) function will keep only one element for each value obtained by applying the argument. Think of it as making an array by taking one element out of every group produced by group_by.
jq 'unique'
--------------------
Input [1,2,5,3,5,3,1,3]
Output [1,2,3,5]
jq 'unique_by(.foo)'
--------------------
Input [{"foo": 1, "bar": 2}, {"foo": 1, "bar": 3}, {"foo": 4, "bar": 5}]
Output [{"foo": 1, "bar": 2}, {"foo": 4, "bar": 5}]
jq 'unique_by(length)'
--------------------
Input ["chunky", "bacon", "kitten", "cicada", "asparagus"]
Output ["bacon", "chunky", "asparagus"]
reverse
This function reverses an array.
jq 'reverse'
--------------------
Input [1,2,3,4]
Output [4,3,2,1]
contains(element)
The filter contains(b) will produce true if b is completely contained within the input. A string B is contained in a string A if B is a substring of A. An array B is contained in an array A if all elements in B are contained in any element in A. An object B is contained in object A if all of the values in B are contained in the value in A with the same key. All other types are assumed to be contained in each other if they are equal.
jq 'contains("bar")'
--------------------
Input "foobar"
Output true
jq 'contains(["baz", "bar"])'
--------------------
Input ["foobar", "foobaz", "blarp"]
Output true
jq 'contains(["bazzzzz", "bar"])'
--------------------
Input ["foobar", "foobaz", "blarp"]
Output false
jq 'contains({foo: 12, bar: [{barp: 12}]})'
--------------------
Input {"foo": 12, "bar":[1,2,{"barp":12, "blip":13}]}
Output true
jq 'contains({foo: 12, bar: [{barp: 15}]})'
--------------------
Input {"foo": 12, "bar":[1,2,{"barp":12, "blip":13}]}
Output false
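The recursive containment rule can be sketched directly in Python. This is a model written from the description above, with contains(a, b) standing for "input a, argument b"; it is not jq's implementation, and the function name and argument order are choices made for the illustration.

```python
def contains(a, b):
    """Model of `contains(b)` applied to input a."""
    if isinstance(a, str) and isinstance(b, str):
        return b in a                                   # substring test
    if isinstance(a, list) and isinstance(b, list):
        # every element of b must be contained in some element of a
        return all(any(contains(x, y) for x in a) for y in b)
    if isinstance(a, dict) and isinstance(b, dict):
        # every value of b must be contained in a's value at the same key
        return all(k in a and contains(a[k], v) for k, v in b.items())
    if isinstance(a, bool) or isinstance(b, bool):
        return a is b                                   # keep true distinct from 1
    return a == b                                       # other types: plain equality

haystack = {"foo": 12, "bar": [1, 2, {"barp": 12, "blip": 13}]}
print(contains("foobar", "bar"))
print(contains(["foobar", "foobaz", "blarp"], ["baz", "bar"]))
print(contains(["foobar", "foobaz", "blarp"], ["bazzzzz", "bar"]))
print(contains(haystack, {"foo": 12, "bar": [{"barp": 12}]}))
print(contains(haystack, {"foo": 12, "bar": [{"barp": 15}]}))
```

Swapping the two arguments gives a model of inside(b), which the manual describes as the inverse of contains.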
indices(s)
Outputs an array containing the indices in `.` where s occurs. The input may be an array, in which case if s is an array then the indices output will be those where all elements in `.` match those of s.
jq 'indices(", ")'
--------------------
Input "a,b, cd, efg, hijk"
Output [3,7,12]
jq 'indices(1)'
--------------------
Input [0,1,2,1,3,1,4]
Output [1,3,5]
jq 'indices([1,2])'
--------------------
Input [0,1,2,3,1,4,2,5,1,2,6,7]
Output [1,8]
index(s), rindex(s)
Outputs the index of the first (index) or last (rindex) occurrence of s in the input.
jq 'index(", ")'
--------------------
Input "a,b, cd, efg, hijk"
Output 3
jq 'rindex(", ")'
--------------------
Input "a,b, cd, efg, hijk"
Output 12
inside
The filter inside(b) will produce true if the input is completely contained within b. It is, essentially, an inversed version of contains.
jq 'inside("foobar")'
--------------------
Input "bar"
Output true
jq 'inside(["foobar", "foobaz", "blarp"])'
--------------------
Input ["baz", "bar"]
Output true
jq 'inside(["foobar", "foobaz", "blarp"])'
--------------------
Input ["bazzzzz", "bar"]
Output false
jq 'inside({"foo": 12, "bar":[1,2,{"barp":12, "blip":13}]})'
--------------------
Input {"foo": 12, "bar": [{"barp": 12}]}
Output true
jq 'inside({"foo": 12, "bar":[1,2,{"barp":12, "blip":13}]})'
--------------------
Input {"foo": 12, "bar": [{"barp": 15}]}
Output false
startswith(str)
Outputs true if . starts with the given string argument.
jq '[.[]|startswith("foo")]'
--------------------
Input ["fo", "foo", "barfoo", "foobar", "barfoob"]
Output [false, true, false, true, false]
endswith(str)
Outputs true if . ends with the given string argument.
jq '[.[]|endswith("foo")]'
--------------------
Input ["foobar", "barfoo"]
Output [false, true]
combinations, combinations(n)
Outputs all combinations of the elements of the arrays in the input array. If given an argument n, it outputs all combinations of n repetitions of the input array.
jq 'combinations'
--------------------
Input [[1,2], [3, 4]]
Output [1, 3]
[1, 4]
[2, 3]
[2, 4]
jq 'combinations(2)'
--------------------
Input [0, 1]
Output [0, 0]
[0, 1]
[1, 0]
[1, 1]
ltrimstr(str)
Outputs its input with the given prefix string removed, if it starts with it.
jq '[.[]|ltrimstr("foo")]'
--------------------
Input ["fo", "foo", "barfoo", "foobar", "afoo"]
Output ["fo","","barfoo","bar","afoo"]
rtrimstr(str)
Outputs its input with the given suffix string removed, if it ends with it.
jq '[.[]|rtrimstr("foo")]'
--------------------
Input ["fo", "foo", "barfoo", "foobar", "foob"]
Output ["fo","","bar","foobar","foob"]
explode
Converts an input string into an array of the string's codepoint numbers.
jq 'explode'
--------------------
Input "foobar"
Output [102,111,111,98,97,114]
implode
The inverse of explode.
jq 'implode'
--------------------
Input [65, 66, 67]
Output "ABC"
split
Splits an input string on the separator argument.
jq 'split(", ")'
--------------------
Input "a, b,c,d, e, "
Output ["a","b,c,d","e",""]
join(str)
Joins the array of elements given as input, using the argument as separator. It is the inverse of split: that is, running split("foo") | join("foo") over any input string returns said input string.
jq 'join(", ")'
--------------------
Input ["a","b,c,d","e"]
Output "a, b,c,d, e"
ascii_downcase, ascii_upcase
Emit a copy of the input string with its alphabetic characters (a-z and A-Z) converted to the specified case.
jq 'ascii_upcase'
--------------------
Input "useful but not for é"
Output "USEFUL BUT NOT FOR é"
- title: "`while(cond; update)`"
body: |
The `while(cond; update)` function allows you to repeatedly apply an update to `.` until `cond` is false.
Note that `while(cond; update)` is internally defined as a recursive jq function. Recursive calls within `while` will not consume additional memory if `update` produces at most one output for each input. See advanced topics below.
examples:
- program: '[while(.<100; .*2)]'
input: '1'
output: ['[1,2,4,8,16,32,64]']
- title: "`until(cond; next)`"
body: |
The `until(cond; next)` function allows you to repeatedly apply the expression `next`, initially to `.` then to its own output, until `cond` is true. For example, this can be used to implement a factorial function (see below).
Note that `until(cond; next)` is internally defined as a recursive jq function. Recursive calls within `until()` will not consume additional memory if `next` produces at most one output for each input. See advanced topics below.
examples:
- program: '[.,1]|until(.[0] < 1; [.[0] - 1, .[1] * .[0]])|.[1]'
input: '4'
output: ['24']
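The difference between the two builtins (`while` emits every intermediate value, `until` emits only the final one) may be easier to see in a small Python sketch. The names `jq_while` and `jq_until` are hypothetical, and the sketch assumes that `update` and `next` each produce exactly one output per input.

```python
def jq_while(value, cond, update):
    """Yield value, update(value), ... for as long as cond holds."""
    while cond(value):
        yield value
        value = update(value)


def jq_until(value, cond, step):
    """Apply step repeatedly and return the first value for which cond holds."""
    while not cond(value):
        value = step(value)
    return value


# [while(.<100; .*2)] with input 1
doubles = list(jq_while(1, lambda x: x < 100, lambda x: x * 2))

# [.,1] | until(.[0] < 1; [.[0] - 1, .[1] * .[0]]) | .[1] with input 4
factorial = jq_until([4, 1], lambda s: s[0] < 1, lambda s: [s[0] - 1, s[1] * s[0]])[1]

print(doubles, factorial)
```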
- title: "`recurse(f)`, `recurse`, `recurse(f; condition)`, `recurse_down`"
body: |
The `recurse(f)` function allows you to search through a recursive structure, and extract interesting data from all levels. Suppose your input represents a filesystem:
{"name": "/", "children": [
{"name": "/bin", "children": [
{"name": "/bin/ls", "children": []},
{"name": "/bin/sh", "children": []}]},
{"name": "/home", "children": [
{"name": "/home/stephen", "children": [
{"name": "/home/stephen/jq", "children": []}]}]}]}
Now suppose you want to extract all of the filenames present. You need to retrieve `.name`, `.children[].name`, `.children[].children[].name`, and so on. You can do this with:
recurse(.children[]) | .name
When called without an argument, `recurse` is equivalent to `recurse(.[]?)`.
`recurse(f)` is identical to `recurse(f; . != null)` and can be used without concerns about recursion depth.
`recurse(f; condition)` is a generator which begins by emitting . and then emits in turn .|f, .|f|f, .|f|f|f, ... so long as the computed value satisfies the condition. For example, to generate all the integers, at least in principle, one could write `recurse(.+1; true)`.
For legacy reasons, `recurse_down` exists as an alias to calling `recurse` without arguments. This alias is considered deprecated and will be removed in the next major release.
The recursive calls in `recurse` will not consume additional memory whenever `f` produces at most a single output for each input.
examples:
- program: 'recurse(.foo[])'
input: '{"foo":[{"foo": []}, {"foo":[{"foo":[]}]}]}'
output:
- '{"foo":[{"foo":[]},{"foo":[{"foo":[]}]}]}'
- '{"foo":[]}'
- '{"foo":[{"foo":[]}]}'
- '{"foo":[]}'
- program: 'recurse'
input: '{"a":0,"b":[1]}'
output:
- '{"a":0,"b":[1]}'
- '0'
- '[1]'
- '1'
- program: 'recurse(. * .; . < 20)'
input: 2
output:
- 2
- 4
- 16
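As an illustration of the traversal order, `recurse(f; condition)` can be modelled as a Python generator. This is only a sketch: the name `recurse`, the callable `f` returning a list of outputs, and the `cond` default (mirroring `. != null`) are conventions chosen here, not jq's implementation.

```python
def recurse(value, f, cond=lambda v: v is not None):
    """Emit value first, then recurse into each output of f(value) that passes cond."""
    yield value
    for child in f(value):
        if cond(child):
            yield from recurse(child, f, cond)


filesystem = {"name": "/", "children": [
    {"name": "/bin", "children": [
        {"name": "/bin/ls", "children": []},
        {"name": "/bin/sh", "children": []}]},
    {"name": "/home", "children": [
        {"name": "/home/stephen", "children": [
            {"name": "/home/stephen/jq", "children": []}]}]}]}

# recurse(.children[]) | .name
names = [node["name"] for node in recurse(filesystem, lambda n: n["children"])]

# recurse(. * .; . < 20) with input 2
squares = list(recurse(2, lambda x: [x * x], lambda x: x < 20))

print(names)
print(squares)
```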
- title: "`..`"
body: |
Short-hand for `recurse` without arguments. This is intended to resemble the XPath `//` operator. Note that `..a` does not work; use `..|a` instead. In the example below we use `..|.a?` to find all the values of object keys "a" in any object found "below" `.`.
examples:
- program: '..|.a?'
input: '[[{"a":1}]]'
output: ['1']
- title: "`env`"
body: |
Outputs an object representing jq's environment.
examples:
- program: 'env.PAGER'
input: 'null'
output: ['"less"']
- title: "`transpose`"
body: |
Transpose a possibly jagged matrix (an array of arrays). Rows are padded with nulls so the result is always rectangular.
examples:
- program: 'transpose'
input: '[[1], [2,3]]'
output: ['[[1,2],[null,3]]']
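The null padding can be pictured with Python's `itertools.zip_longest`, using `None` in place of JSON `null`. The function name `transpose` is simply reused here for the sketch; this is an analogy, not how jq implements it.

```python
from itertools import zip_longest


def transpose(rows):
    """Turn rows into columns, padding short rows with None (JSON null)."""
    return [list(column) for column in zip_longest(*rows, fillvalue=None)]


print(transpose([[1], [2, 3]]))
```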
- title: "`bsearch(x)`"
body: |
bsearch(x) conducts a binary search for x in the input array. If the input is sorted and contains x, then bsearch(x) will return its index in the array; otherwise, if the array is sorted, it will return (-1 - ix) where ix is an insertion point such that the array would still be sorted after the insertion of x at ix. If the array is not sorted, bsearch(x) will return an integer that is probably of no interest.
examples:
- program: 'bsearch(0)'
input: '[0,1]'
output: ['0']
- program: 'bsearch(0)'
input: '[1,2,3]'
output: ['-1']
- program: 'bsearch(4) as $ix | if $ix < 0 then .[-(1+$ix)] = 4 else . end'
input: '[1,2,3]'
output: ['[1,2,3,4]']
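The `(-1 - ix)` encoding is easy to misread, so here is a small Python sketch of the same return convention built on the standard `bisect` module. It assumes a sorted list of mutually comparable values; with duplicate elements jq may report a different matching index than `bisect_left` does.

```python
from bisect import bisect_left


def bsearch(array, x):
    """Return the index of x, or (-1 - ix) where ix is the insertion point."""
    ix = bisect_left(array, x)
    if ix < len(array) and array[ix] == x:
        return ix
    return -1 - ix


found = bsearch([0, 1], 0)
missing = bsearch([1, 2, 3], 4)
insert_at = -(1 + missing)  # recover the insertion point from a negative result

print(found, missing, insert_at)
```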
- title: "String interpolation - `\\(foo)`"
body: |
Inside a string, you can put an expression inside parens after a backslash. Whatever the expression returns will be interpolated into the string.
examples:
- program: '"The input was \(.), which is one less than \(.+1)"'
input: '42'
output: ['"The input was 42, which is one less than 43"']
- title: "Convert to/from JSON"
body: |
The `tojson` and `fromjson` builtins dump values as JSON texts or parse JSON texts into values, respectively. The tojson builtin differs from tostring in that tostring returns strings unmodified, while tojson encodes strings as JSON strings.
examples:
- program: '[.[]|tostring]'
input: '[1, "foo", ["foo"]]'
output: ['["1","foo","[\"foo\"]"]']
- program: '[.[]|tojson]'
input: '[1, "foo", ["foo"]]'
output: ['["1","\"foo\"","[\"foo\"]"]']
- program: '[.[]|tojson|fromjson]'
input: '[1, "foo", ["foo"]]'
output: ['[1,"foo",["foo"]]']
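The distinction between `tostring` and `tojson` only shows up for string inputs. A minimal Python sketch with the standard `json` module illustrates it; the three function names mirror the jq builtins but are defined here purely for illustration.

```python
import json


def tostring(value):
    """Strings pass through unchanged; everything else is JSON-encoded."""
    if isinstance(value, str):
        return value
    return json.dumps(value, separators=(",", ":"))


def tojson(value):
    """Every value, strings included, is JSON-encoded."""
    return json.dumps(value, separators=(",", ":"))


def fromjson(text):
    """Parse a JSON text back into a value."""
    return json.loads(text)


values = [1, "foo", ["foo"]]
print([tostring(v) for v in values])
print([tojson(v) for v in values])
```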
- title: "Format strings and escaping"
body: |
The `@foo` syntax is used to format and escape strings, which is useful for building URLs, documents in a language like HTML or XML, and so forth. `@foo` can be used as a filter on its own, the possible escapings are:
- `@text`: Calls `tostring`, see that function for details.
- `@json`: Serializes the input as JSON.
- `@html`: Applies HTML/XML escaping, by mapping the characters `<>&'"` to their entity equivalents `&lt;`, `&gt;`, `&amp;`, `&#39;`, `&quot;`.
- `@uri`: Applies percent-encoding, by mapping all reserved URI characters to a `%XX` sequence.
- `@csv`: The input must be an array, and it is rendered as CSV with double quotes for strings, and quotes escaped by repetition.
- `@tsv`: The input must be an array, and it is rendered as TSV (tab-separated values). Each input array will be printed as a single line. Fields are separated by a single tab (ascii `0x09`). Input characters line-feed (ascii `0x0a`), carriage-return (ascii `0x0d`), tab (ascii `0x09`) and backslash (ascii `0x5c`) will be output as escape sequences `\n`, `\r`, `\t`, `\\` respectively.
- `@sh`: The input is escaped suitable for use in a command-line for a POSIX shell. If the input is an array, the output will be a series of space-separated strings.
- `@base64`: The input is converted to base64 as specified by RFC 4648.
This syntax can be combined with string interpolation in a useful way. You can follow a `@foo` token with a string literal. The contents of the string literal will not be escaped. However, all interpolations made inside that string literal will be escaped. For instance,
@uri "https://www.google.com/search?q=\(.search)"
will produce the following output for the input `{"search":"what is jq?"}`:
"https://www.google.com/search?q=what%20is%20jq%3F"
Note that the slashes, question mark, etc. in the URL are not escaped, as they were part of the string literal.
examples:
- program: '@html'
input: '"This works if x < y"'
output: ['"This works if x &lt; y"']
- program: '@html "Anonymous said: \(.)"'
input: '"<script>alert(\"lol hax\");</script>"'
output: ['"Anonymous said: &lt;script&gt;alert(&quot;lol hax&quot;);&lt;/script&gt;"']
- program: '@sh "echo \(.)"'
input: "\"O'Hara's Ale\""
output: ["\"echo 'O'\\\\''Hara'\\\\''s Ale'\""]
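The rule that only interpolated values are escaped, while the literal text is left alone, can be mimicked in Python with `urllib.parse.quote`. This is a loose analogy: `at_uri` and `search_url` are hypothetical helpers, and the set of characters Python leaves unescaped is not guaranteed to match jq's `@uri` in every version.

```python
from urllib.parse import quote


def at_uri(value):
    """Percent-encode a single interpolated value (no characters marked safe)."""
    return quote(str(value), safe="")


def search_url(doc):
    # The literal prefix is left untouched; only the interpolated value is escaped.
    return "https://www.google.com/search?q=" + at_uri(doc["search"])


print(search_url({"search": "what is jq?"}))
```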
- title: "Dates"
body: |
jq provides some basic date handling functionality, with some
high-level and low-level builtins. In all cases these
builtins deal exclusively with time in UTC.
The `fromdateiso8601` builtin parses datetimes in the ISO 8601
format to a number of seconds since the Unix epoch
(1970-01-01T00:00:00Z). The `todateiso8601` builtin does the
inverse.
The `fromdate` builtin parses datetime strings. Currently
`fromdate` only supports ISO 8601 datetime strings, but in the
future it will attempt to parse datetime strings in more
formats.
The `todate` builtin is an alias for `todateiso8601`.
The `now` builtin outputs the current time, in seconds since
the Unix epoch.
Low-level jq interfaces to the C-library time functions are
also provided: `strptime`, `strftime`, `mktime`, and `gmtime`.
Refer to your host operating system's documentation for the
format strings used by `strptime` and `strftime`. Note: these
are not necessarily stable interfaces in jq, particularly as
to their localization functionality.
The `gmtime` builtin consumes a number of seconds since the
Unix epoch and outputs a "broken down time" representation of
time as an array of numbers representing (in this order): the
year, the month (zero-based), the day of the month, the hour
of the day, the minute of the hour, the second of the minute,
the day of the week, and the day of the year -- all one-based
unless otherwise stated.
The `mktime` builtin consumes "broken down time"
representations of time output by `gmtime` and `strptime`.
The `strptime(fmt)` builtin parses input strings matching the
`fmt` argument. The output is in the "broken down time"
representation consumed by `mktime` and output by `gmtime`.
The `strftime(fmt)` builtin formats a time with the given
format.
The format strings for `strptime` and `strftime` are described
in typical C library documentation. The format string for ISO
8601 datetime is `"%Y-%m-%dT%H:%M:%SZ"`.
jq may not support some or all of this date functionality on
some systems.
examples:
- program: 'fromdate'
input: '"2015-03-05T23:51:47Z"'
output: ['1425599507']
- program: 'strptime("%Y-%m-%dT%H:%M:%SZ")'
input: '"2015-03-05T23:51:47Z"'
output: ['[2015,2,5,23,51,47,4,63]']
- program: 'strptime("%Y-%m-%dT%H:%M:%SZ")|mktime'
input: '"2015-03-05T23:51:47Z"'
output: ['1425599507']
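For readers who want to cross-check the numbers in the examples above, the same conversions can be sketched with Python's `time` and `calendar` modules. The function names are made up for this sketch, and the field adjustments in `broken_down` (zero-based month, Sunday-based weekday, zero-based day of year) are inferred from the `strptime` example output rather than taken from jq's source.

```python
import calendar
import time

ISO_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def fromdate(text):
    """ISO 8601 string -> seconds since the Unix epoch (UTC)."""
    return calendar.timegm(time.strptime(text, ISO_FORMAT))


def todate(seconds):
    """Seconds since the Unix epoch -> ISO 8601 string (UTC)."""
    return time.strftime(ISO_FORMAT, time.gmtime(seconds))


def broken_down(seconds):
    """Array in the field order shown by the strptime example above."""
    t = time.gmtime(seconds)
    return [t.tm_year, t.tm_mon - 1, t.tm_mday, t.tm_hour, t.tm_min, t.tm_sec,
            (t.tm_wday + 1) % 7, t.tm_yday - 1]


stamp = fromdate("2015-03-05T23:51:47Z")
print(stamp, todate(stamp), broken_down(stamp))
```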
- title: Conditionals and Comparisons
entries:
- title: "`==`, `!=`"
body: |
The expression 'a == b' will produce 'true' if the results of a and b are equal (that is, if they represent equivalent JSON documents) and 'false' otherwise. In particular, strings are never considered equal to numbers. If you're coming from Javascript, jq's == is like Javascript's === - considering values equal only when they have the same type as well as the same value.
!= is "not equal", and 'a != b' returns the opposite value of 'a == b'
examples:
- program: '.[] == 1'
input: '[1, 1.0, "1", "banana"]'
output: ['true', 'true', 'false', 'false']
- title: if-then-else
body: |
`if A then B else C end` will act the same as `B` if `A` produces a value other than false or null, but act the same as `C` otherwise.
Checking for false or null is a simpler notion of "truthiness" than is found in Javascript or Python, but it means that you'll sometimes have to be more explicit about the condition you want: you can't test whether, e.g. a string is empty using `if .name then A else B end`, you'll need something more like `if (.name | length) > 0 then A else B end` instead.
If the condition `A` produces multiple results, then `B` is evaluated once for each result that is not false or null, and `C` is evaluated once for each false or null.
More cases can be added to an if using `elif A then B` syntax.
examples:
- program: |-
if . == 0 then
"zero"
elif . == 1 then
"one"
else
"many"
end
input: 2
output: ['"many"']
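Because jq's truthiness differs from Python's own, a short Python sketch may help: only `None` (JSON null) and `False` count as false, so empty strings, zero and empty arrays are all true. The helpers `is_truthy` and `if_then_else` are hypothetical, and the branches are simplified to constant values.

```python
def is_truthy(value):
    """jq-style truthiness: everything except null and false is true."""
    return value is not None and value is not False


def if_then_else(cond_outputs, then_value, else_value):
    """One result per output of the condition, as described above."""
    for result in cond_outputs:
        yield then_value if is_truthy(result) else else_value


print([is_truthy(v) for v in ["", 0, [], None, False]])
print(list(if_then_else([True, None, 0], "B", "C")))
```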
- title: "`>, >=, <=, <`"
body: |
The comparison operators `>`, `>=`, `<=`, `<` return whether their left argument is greater than, greater than or equal to, less than or equal to or less than their right argument (respectively).
The ordering is the same as that described for `sort`, above.
examples:
- program: '. < 5'
input: 2
output: ['true']
- title: and/or/not
body: |
jq supports the normal Boolean operators and/or/not. They have the same standard of truth as if expressions - false and null are considered "false values", and anything else is a "true value".
If an operand of one of these operators produces multiple results, the operator itself will produce a result for each input.
`not` is in fact a builtin function rather than an operator, so it is called as a filter to which things can be piped rather than with special syntax, as in `.foo and .bar | not`.
These three only produce the values "true" and "false", and so are only useful for genuine Boolean operations, rather than the common Perl/Python/Ruby idiom of "value_that_may_be_null or default". If you want to use this form of "or", picking between two values rather than evaluating a condition, see the "//" operator below.
examples:
- program: '42 and "a string"'
input: 'null'
output: ['true']
- program: '(true, false) or false'
input: 'null'
output: ['true', 'false']
- program: '(true, false) and (true, false)'
input: 'null'
output: ['true', 'false', 'false', 'false']
- program: '(true, true) and (true, false)'
input: 'null'
output: ['true', 'false', 'true', 'false']
- program: '[true, false | not]'
input: 'null'
output: ['[false, true]']
- title: Alternative operator - `//`
body: |
A filter of the form `a // b` produces the same results as `a`, if `a` produces results other than `false` and `null`. Otherwise, `a // b` produces the same results as `b`.
This is useful for providing defaults: `.foo // 1` will evaluate to `1` if there's no `.foo` element in the input. It's similar to how `or` is sometimes used in Python (jq's `or` operator is reserved for strictly Boolean operations).
examples:
- program: '.foo // 42'
input: '{"foo": 19}'
output: [19]
- program: '.foo // 42'
input: '{}'
output: [42]
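A hedged Python sketch of the default-value behaviour follows. Filters are represented as plain lists of outputs, `alternative` is a made-up name, and error suppression on the left-hand side is deliberately left out.

```python
def alternative(a_outputs, b_outputs):
    """Model of `a // b`: keep a's outputs that are neither null nor false;
    fall back to b's outputs when none remain."""
    kept = [v for v in a_outputs if v is not None and v is not False]
    return kept if kept else list(b_outputs)


# .foo // 42 against {"foo": 19} and against {}
with_value = alternative([{"foo": 19}.get("foo")], [42])
without_value = alternative([{}.get("foo")], [42])

print(with_value, without_value)
```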
- title: try-catch
body: |
Errors can be caught by using `try EXP catch EXP`. The first expression is executed, and if it fails then the second is executed with the error message. The output of the handler, if any, is output as if it had been the output of the expression to try.
The `try EXP` form uses `empty` as the exception handler.
examples:
- program: 'try .a catch ". is not an object"'
input: 'true'
output: ['". is not an object"']
- program: '[.[]|try .a]'
input: '[{}, true, {"a":1}]'
output: ['[null, 1]']
- program: 'try error("some exception") catch .'
input: 'true'
output: ['"some exception"']
- title: Breaking out of control structures
body: |
A convenient use of try/catch is to break out of control structures like `reduce`, `foreach`, `while`, and so on.
For example:
# Repeat an expression until it raises "break" as an
# error, then stop repeating without re-raising the error.
# But if the error caught is not "break" then re-raise it.
try repeat(exp) catch (if . == "break" then empty else error end)
jq has a syntax for named lexical labels to "break" or "go (back) to":
label $out | ... break $out ...
The `break $label_name` expression will cause the program to act as though the nearest (to the left) `label $label_name` produced `empty`.
The relationship between the `break` and corresponding `label` is lexical: the label has to be "visible" from the break.
To break out of a `reduce`, for example:
label $out | reduce .[] as $item (null; if .==false then break $out else ... end)
The following jq program produces a syntax error:
break $out
because no label `$out` is visible.
- title: "`?` operator"
body: |
The `?` operator, used as `EXP?`, is shorthand for `try EXP`.
examples:
- program: '[.[]|(.a)?]'
input: '[{}, true, {"a":1}]'
output: ['[null, 1]']
- title: Regular expressions (PCRE)
body: |
jq uses the Oniguruma regular expression library, as do php, ruby, TextMate, Sublime Text, etc, so the description here will focus on jq specifics.
The jq regex filters are defined so that they can be used using one of these patterns:
STRING | FILTER( REGEX )
STRING | FILTER( REGEX; FLAGS )
STRING | FILTER( [REGEX] )
STRING | FILTER( [REGEX, FLAGS] )
where:
- STRING, REGEX and FLAGS are jq strings and subject to jq string interpolation;
- REGEX, after string interpolation, should be a valid PCRE regex;
- FILTER is one of `test`, `match`, or `capture`, as described below.
FLAGS is a string consisting of one or more of the supported flags:
- `g` - Global search (find all matches, not just the first)
- `i` - Case insensitive search
- `m` - Multi line mode ('.' will match newlines)
- `n` - Ignore empty matches
- `p` - Both s and m modes are enabled
- `s` - Single line mode ('^' -> '\A', '$' -> '\Z')
- `l` - Find longest possible matches
- `x` - Extended regex format (ignore whitespace and comments)
To match whitespace in an x pattern use an escape such as \s, e.g.
- test( "a\\sb"; "x" ).
Note that certain flags may also be specified within REGEX, e.g.
- jq -n '("test", "TEst", "teST", "TEST") | test( "(?i)te(?-i)st" )'
evaluates to: true, true, false, false.
entries:
- title: "`test(val)`, `test(regex; flags)`"
body: |
Like `match`, but does not return match objects, only `true` or `false` for whether or not the regex matches the input.
examples:
- program: 'test("foo")'
input: '"foo"'
output: ['true']
- program: '.[] | test("a b c # spaces are ignored"; "ix")'
input: '["xabcd", "ABC"]'
output: ['true', 'true']
- title: "`match(val)`, `match(regex; flags)`"
body: |
`match` outputs an object for each match it finds. Matches have the following fields:
- `offset` - offset in UTF-8 codepoints from the beginning of the input
- `length` - length in UTF-8 codepoints of the match
- `string` - the string that it matched
- `captures` - an array of objects representing capturing groups.
Capturing group objects have the following fields:
- `offset` - offset in UTF-8 codepoints from the beginning of the input
- `length` - length in UTF-8 codepoints of this capturing group
- `string` - the string that was captured
- `name` - the name of the capturing group (or `null` if it was unnamed)
Capturing groups that did not match anything return an offset of -1
examples:
- program: 'match("(abc)+"; "g")'
input: '"abc abc"'
output:
- '{"offset": 0, "length": 3, "string": "abc", "captures": [{"offset": 0, "length": 3, "string": "abc", "name": null}]}'
- '{"offset": 4, "length": 3, "string": "abc", "captures": [{"offset": 4, "length": 3, "string": "abc", "name": null}]}'
- program: 'match("foo")'
input: '"foo bar foo"'
output: ['{"offset": 0, "length": 3, "string": "foo", "captures": []}']
- program: 'match(["foo", "ig"])'
input: '"foo bar FOO"'
output:
- '{"offset": 0, "length": 3, "string": "foo", "captures": []}'
- '{"offset": 8, "length": 3, "string": "FOO", "captures": []}'
- program: 'match("foo (?<bar123>bar)? foo"; "ig")'
input: '"foo bar foo foo  foo"'
output:
- '{"offset": 0, "length": 11, "string": "foo bar foo", "captures": [{"offset": 4, "length": 3, "string": "bar", "name": "bar123"}]}'
- '{"offset": 12, "length": 8, "string": "foo  foo", "captures": [{"offset": -1, "length": 0, "string": null, "name": "bar123"}]}'
- program: '[ match("."; "g")] | length'
input: '"abc"'
output: [3]
- title: "`capture(val)`, `capture(regex; flags)`"
body: |
Collects the named captures in a JSON object, with the name of each capture as the key, and the matched string as the corresponding value.
examples:
- program: 'capture("(?<a>[a-z]+)-(?<n>[0-9]+)")'
input: '"xyzzy-14"'
output: ['{ "a": "xyzzy", "n": "14" }']
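For readers more familiar with Python's `re` module, the shape of the `match` and `capture` results can be approximated as below. This is an analogy only: Python's regex dialect is not Oniguruma, named groups are written `(?P<name>...)` rather than `(?<name>...)`, the `captures` array is omitted, and the helper names `match_all` and `capture` are made up for the sketch.

```python
import re


def match_all(pattern, text, flags=0):
    """Approximate jq's match(regex; "g"): one object per match, without captures."""
    return [
        {"offset": m.start(), "length": m.end() - m.start(), "string": m.group()}
        for m in re.finditer(pattern, text, flags)
    ]


def capture(pattern, text):
    """Approximate jq's capture: named groups of the first match.
    Returns None when nothing matches (jq would produce no output)."""
    m = re.search(pattern, text)
    return m.groupdict() if m else None


print(match_all("foo", "foo bar FOO", re.IGNORECASE))
print(capture(r"(?P<a>[a-z]+)-(?P<n>[0-9]+)", "xyzzy-14"))
```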
- title: "`scan(regex)`, `scan(regex; flags)`"
body: |
Emit a stream of the non-overlapping substrings of the input that match the regex in accordance with the flags, if any have been specified. If there is no match, the stream is empty. To capture all the matches for each input string, use the idiom `[ expr ]`, e.g. `[ scan(regex) ]`.
example:
- program: 'scan("c")'
input: '"abcdefabc"'
output: ['"c"', '"c"']
- program: 'scan("b")'
input: ("", "")
output: ['[]', '[]']
- title: "`split(regex; flags)`"
body: |
For backwards compatibility, `split` splits on a string, not a regex.
example:
- program: 'split(", *"; null)'
input: '"ab,cd, ef"'
output: ['"ab","cd","ef"']
- title: "`splits(regex)`, `splits(regex; flags)`"
body: |
These provide the same results as their `split` counterparts, but as a stream instead of an array.
example:
- program: 'splits(", *")'
input: '("ab,cd", "ef, gh")'
output: ['"ab"', '"cd"', '"ef"', '"gh"']
- title: "`sub(regex; tostring)`, `sub(regex; string; flags)`"
body: |
Emit the string obtained by replacing the first match of regex in the input string with `tostring`, after interpolation. `tostring` should be a jq string, and may contain references to named captures. The named captures are, in effect, presented as a JSON object (as constructed by `capture`) to `tostring`, so a reference to a captured variable named "x" would take the form: "\(.x)".
example:
- program: 'sub("^[^a-z]*(?<x>[a-z]*).*"; "Z\(.x)Z\(.x)")'
input: '"123abc456"'
output: '"ZabcZabc"'
- title: "`gsub(regex; string)`, `gsub(regex; string; flags)`"
body: |
`gsub` is like `sub` but all the non-overlapping occurrences of the regex are replaced by the string, after interpolation.
example:
- program: 'gsub("(?<x>.)[^a]*"; "+\(.x)-")'
input: '"Abcabc"'
output: '"+A-+a-"'
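The idea that the replacement string is evaluated against an object of named captures can be sketched in Python with `re.sub` and a callback. The helpers `sub` and `gsub` are hypothetical, `{x}` placeholders stand in for jq's `\(.x)` interpolation, and Python's regex dialect differs from Oniguruma, so treat this as an approximation.

```python
import re


def sub(pattern, template, text, count=1):
    """Replace the first match; the template sees the named captures."""
    return re.sub(pattern, lambda m: template.format(**m.groupdict()), text, count=count)


def gsub(pattern, template, text):
    """Replace every non-overlapping match (count=0 means no limit in re.sub)."""
    return sub(pattern, template, text, count=0)


print(sub(r"^[^a-z]*(?P<x>[a-z]*).*", "Z{x}Z{x}", "123abc456"))
print(gsub(r"(?P<x>.)[^a]*", "+{x}-", "Abcabc"))
```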
- title: Advanced features
body: |
Variables are an absolute necessity in most programming languages, but they're relegated to an "advanced feature" in jq.
In most languages, variables are the only means of passing around data. If you calculate a value, and you want to use it more than once, you'll need to store it in a variable. To pass a value to another part of the program, you'll need that part of the program to define a variable (as a function parameter, object member, or whatever) in which to place the data.
It is also possible to define functions in jq, although this is a feature whose biggest use is defining jq's standard library (many jq functions such as `map` and `find` are in fact written in jq).
jq has reduction operators, which are very powerful but a bit tricky. Again, these are mostly used internally, to define some useful bits of jq's standard library.
It may not be obvious at first, but jq is all about generators (yes, as often found in other languages). Some utilities are provided to help deal with generators.
Some minimal I/O support (besides reading JSON from standard input, and writing JSON to standard output) is available.
Finally, there is a module/library system.
entries:
- title: Variables
body: |
In jq, all filters have an input and an output, so manual plumbing is not necessary to pass a value from one part of a program to the next. Many expressions, for instance `a + b`, pass their input to two distinct subexpressions (here `a` and `b` are both passed the same input), so variables aren't usually necessary in order to use a value twice.
For instance, calculating the average value of an array of numbers requires a few variables in most languages - at least one to hold the array, perhaps one for each element or for a loop counter. In jq, it's simply `add / length` - the `add` expression is given the array and produces its sum, and the `length` expression is given the array and produces its length.
So, there's generally a cleaner way to solve most problems in jq than defining variables. Still, sometimes they do make things easier, so jq lets you define variables using `expression as $variable`. All variable names start with `$`. Here's a slightly uglier version of the array-averaging example:
length as $array_length | add / $array_length
We'll need a more complicated problem to find a situation where using variables actually makes our lives easier.
Suppose we have an array of blog posts, with "author" and "title" fields, and another object which is used to map author usernames to real names. Our input looks like:
{"posts": [{"title": "Frist psot", "author": "anon"},
{"title": "A well-written article", "author": "person1"}],
"realnames": {"anon": "Anonymous Coward",
"person1": "Person McPherson"}}
We want to produce the posts with the author field containing a real name, as in:
{"title": "Frist psot", "author": "Anonymous Coward"}
{"title": "A well-written article", "author": "Person McPherson"}
We use a variable, $names, to store the realnames object, so that we can refer to it later when looking up author usernames:
.realnames as $names | .posts[] | {title, author: $names[.author]}
The expression `exp as $x | ...` means: for each value of expression `exp`, run the rest of the pipeline with the entire original input, and with `$x` set to that value. Thus `as` functions as something of a foreach loop.
Just as `{foo}` is a handy way of writing `{foo: .foo}`, so `{$foo}` is a handy way of writing `{foo:$foo}`.
Multiple variables may be declared using a single `as` expression by providing a pattern that matches the structure of the input (this is known as "destructuring"):
. as {realnames: $names, posts: [$first, $second]} | ...
The variable declarations in array patterns (e.g., `. as [$first, $second]`) bind to the elements of the array in order, from the element at index zero on up. When there is no value at the index for an array pattern element, `null` is bound to that variable.
Variables are scoped over the rest of the expression that defines them, so
.realnames as $names | (.posts[] | {title, author: $names[.author]})
will work, but
(.realnames as $names | .posts[]) | {title, author: $names[.author]}
won't.
For programming language theorists, it's more accurate to say that jq variables are lexically-scoped bindings. In particular there's no way to change the value of a binding; one can only set up a new binding with the same name, but which will not be visible where the old one was.
examples:
- program: '.bar as $x | .foo | . + $x'
input: '{"foo":10, "bar":200}'
output: ['210']
- program: '. as $i|[(.*2|. as $i| $i), $i]'
input: '5'
output: ['[10,5]']
- program: '. as [$a, $b, {c: $c}] | $a + $b + $c'
input: '[2, 3, {"c": 4, "d": 5}]'
output: ['9']
- program: '.[] as [$a, $b] | {a: $a, b: $b}'
input: '[[0], [0, 1], [2, 1, 0]]'
output: ['{"a":0,"b":null}', '{"a":0,"b":1}', '{"a":2,"b":1}']
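As a point of comparison, the blog-post lookup from the body above might be written in Python roughly as follows. The function name `posts_with_realnames` is invented for the sketch, and `dict.get` is used so that a missing username yields `None`, in the spirit of jq returning `null`.

```python
data = {
    "posts": [
        {"title": "Frist psot", "author": "anon"},
        {"title": "A well-written article", "author": "person1"},
    ],
    "realnames": {"anon": "Anonymous Coward", "person1": "Person McPherson"},
}


def posts_with_realnames(doc):
    names = doc["realnames"]        # .realnames as $names
    for post in doc["posts"]:       # .posts[]
        # {title, author: $names[.author]}
        yield {"title": post["title"], "author": names.get(post["author"])}


for post in posts_with_realnames(data):
    print(post)
```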
- title: 'Defining Functions'
body: |
You can give a filter a name using "def" syntax:
def increment: . + 1;
From then on, `increment` is usable as a filter just like a builtin function (in fact, this is how some of the builtins are defined). A function may take arguments:
def map(f): [.[] | f];
Arguments are passed as filters, not as values. The same argument may be referenced multiple times with different inputs (here `f` is run for each element of the input array). Arguments to a function work more like callbacks than like value arguments. This is important to understand. Consider:
def foo(f): f|f;
5|foo(.*2)
The result will be 20 because `f` is `.*2`, and during the first invocation of `f` `.` will be 5, and the second time it will be 10 (5 * 2), so the result will be 20. Function arguments are filters, and filters expect an input when invoked.
If you want the value-argument behaviour for defining simple functions, you can just use a variable:
def addvalue(f): f as $f | map(. + $f);
Or use the short-hand:
def addvalue($f): ...;
With either definition, `addvalue(.foo)` will add the current input's `.foo` field to each element of the array.
Multiple definitions using the same function name are allowed. Each re-definition replaces the previous one for the same number of function arguments, but only for references from functions (or main program) subsequent to the re-definition.
examples:
- program: 'def addvalue(f): . + [f]; map(addvalue(.[0]))'
input: '[[1,2],[10,20]]'
output: ['[[1,2,1], [10,20,10]]']
- program: 'def addvalue(f): f as $x | map(. + $x); addvalue(.[0])'
input: '[[1,2],[10,20]]'
output: ['[[1,2,1,2], [10,20,1,2]]']
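The "arguments are callbacks" point maps fairly directly onto passing functions around in Python. In this sketch a filter is modelled as a one-argument callable with a single output; `foo` and `addvalue` are re-creations of the jq definitions above for illustration only.

```python
def foo(f):
    """def foo(f): f|f;  The filter argument is applied twice in sequence."""
    return lambda value: f(f(value))


def addvalue(f):
    """def addvalue(f): f as $f | map(. + $f);"""
    def run(array):
        extra = f(array)  # evaluate f once against the current input, like `f as $f`
        return [item + extra for item in array]
    return run


print(foo(lambda x: x * 2)(5))
print(addvalue(lambda a: a[0])([[1, 2], [10, 20]]))
```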
- title: Reduce
body: |
The `reduce` syntax in jq allows you to combine all of the results of an expression by accumulating them into a single answer. As an example, we'll pass `[3,2,1]` to this expression:
reduce .[] as $item (0; . + $item)
For each result that `.[]` produces, `. + $item` is run to accumulate a running total, starting from 0. In this example, `.[]` produces the results 3, 2, and 1, so the effect is similar to running something like this:
0 | (3 as $item | . + $item) |
(2 as $item | . + $item) |
(1 as $item | . + $item)
examples:
- program: 'reduce .[] as $item (0; . + $item)'
input: '[10,2,5,3]'
output: ['20']
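Readers who know `functools.reduce` from Python may find the following correspondence useful. It is a loose analogy that assumes the update expression yields exactly one value per item; `jq_reduce` is a made-up wrapper name.

```python
from functools import reduce


def jq_reduce(items, init, update):
    """reduce ITEMS as $item (INIT; UPDATE), with update(state, item) -> new state."""
    return reduce(update, items, init)


# reduce .[] as $item (0; . + $item)
total = jq_reduce([10, 2, 5, 3], 0, lambda state, item: state + item)
print(total)
```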
- title: "`limit(n; exp)`"
body: |
The `limit` function extracts up to `n` outputs from `exp`.
examples:
- program: '[limit(3;.[])]'
input: '[0,1,2,3,4,5,6,7,8,9]'
output: ['[0,1,2]']
- title: "`first(expr)`, `last(expr)`, `nth(n; expr)`"
body: |
The `first(expr)` and `last(expr)` functions extract the first and last values from `expr`, respectively.
The `nth(n; expr)` function extracts the nth value output by `expr`. This can be defined as `def nth(n; expr): last(limit(n + 1; expr));`. Note that `nth(n; expr)` doesn't support negative values of `n`.
examples:
- program: '[first(range(.)), last(range(.)), nth(./2; range(.))]'
input: '10'
output: ['[0,9,5]']
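These helpers behave much like slicing a lazy iterator. The Python sketch below uses `itertools.islice` and follows the `nth` definition quoted above; the function names are chosen for the sketch, and the inputs are assumed to be non-empty (jq produces no output for an empty stream, which this model does not reproduce).

```python
from itertools import islice


def limit(n, outputs):
    """Take at most n outputs without consuming the rest."""
    return islice(outputs, n)


def first(outputs):
    return next(iter(outputs))


def last(outputs):
    result = None
    for result in outputs:
        pass
    return result


def nth(n, outputs):
    """def nth(n; expr): last(limit(n + 1; expr));"""
    return last(limit(n + 1, outputs))


print(list(limit(3, range(10))), first(range(10)), last(range(10)), nth(5, range(10)))
```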
- title: "`first`, `last`, `nth(n)`"
body: |
The `first` and `last` functions extract the first and last values from any array at `.`.
The `nth(n)` function extracts the nth value of any array at `.`.
examples:
- program: '[range(.)]|[first, last, nth(5)]'
input: '10'
output: ['[0,9,5]']
- title: "`foreach`"
body: |
The `foreach` syntax is similar to `reduce`, but intended to allow the construction of `limit` and reducers that produce intermediate results (see example).
The form is `foreach EXP as $var (INIT; UPDATE; EXTRACT)`. Like `reduce`, `INIT` is evaluated once to produce a state value, then each output of `EXP` is bound to `$var`, `UPDATE` is evaluated for each output of `EXP` with the current state and with `$var` visible. Each value output by `UPDATE` replaces the previous state. Finally, `EXTRACT` is evaluated for each new state to extract an output of `foreach`.
This is mostly useful only for constructing `reduce`- and `limit`-like functions. But it is much more general, as it allows for partial reductions (see the example below).
examples:
- program: '[foreach .[] as $item ([[],[]]; if $item == null then [[],.[0]] else [(.[0] + [$item]),[]] end; if $item == null then .[1] else empty end)]'
input: '[1,2,3,4,null,"a","b",null]'
output: ['[[1,2,3,4],["a","b"]]']
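The INIT/UPDATE/EXTRACT cycle can be written out as a Python generator. This sketch assumes `UPDATE` produces exactly one new state per item and lets `EXTRACT` return a list of zero or more outputs; the names `foreach`, `update` and `extract` are illustrative, and the second half replays the null-delimited grouping example above.

```python
def foreach(items, init, update, extract):
    """foreach ITEMS as $item (INIT; UPDATE; EXTRACT)"""
    state = init
    for item in items:
        state = update(state, item)        # new state replaces the old one
        yield from extract(state, item)    # zero or more outputs per step


def update(state, item):
    # if $item == null then [[], .[0]] else [(.[0] + [$item]), []] end
    return [[], state[0]] if item is None else [state[0] + [item], []]


def extract(state, item):
    # if $item == null then .[1] else empty end
    return [state[1]] if item is None else []


groups = list(foreach([1, 2, 3, 4, None, "a", "b", None], [[], []], update, extract))
running = list(foreach([1, 2, 3], 0, lambda s, i: s + i, lambda s, i: [s]))
print(groups, running)
```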
- title: Recursion
body: |
As described above, `recurse` uses recursion, and any jq function can be recursive. The `while` builtin is also implemented in terms of recursion.
Tail calls are optimized whenever the expression to the left of the recursive call outputs its last value. In practice this means that the expression to the left of the recursive call should not produce more than one output for each input.
For example:
def recurse(f): def r: ., (f | select(. != null) | r); r;
def while(cond; update): def _while: if cond then ., (update | _while) else empty end; _while;
def repeat(exp): def _repeat: exp, _repeat; _repeat;
- title: Generators and iterators
body: |
Some jq operators and functions are actually generators in that they can produce zero, one, or more values for each input, just as one might expect in other programming languages that have generators. For example, `.[]` generates all the values in its input (which must be an array or an object), `range(0; 10)` generates the integers from 0 up to, but not including, 10, and so on.
Even the comma operator is a generator, generating first the values generated by the expression to the left of the comma, then for each of those, the values generated by the expression on the right of the comma.
The `empty` builtin is the generator that produces zero outputs. The `empty` builtin backtracks to the preceding generator expression.
All jq functions can be generators just by using builtin generators. It is also possible to define new generators using only recursion and the comma operator. If the recursive call(s) is(are) "in tail position" then the generator will be efficient. In the example below the recursive call by `_range` to itself is in tail position. The example shows off three advanced topics: tail recursion, generator construction, and sub-functions.
examples:
- program: 'def range(init; upto; by): def _range: if (by > 0 and . < upto) or (by < 0 and . > upto) then ., ((.+by)|_range) else . end; if by == 0 then init else init|_range end | select((by > 0 and . < upto) or (by < 0 and . > upto)); range(0; 10; 3)'
input: 'null'
output: ['0', '3', '6', '9']
- program: 'def while(cond; update): def _while: if cond then ., (update | _while) else empty end; _while; [while(.<100; .*2)]'
input: '1'
output: ['[1,2,4,8,16,32,64]']
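Python generators give a close analogue of the behaviour described here. The sketch below restates the three-argument `range` example as an iterative generator (Python has no tail-call optimisation, so a loop replaces the recursion); `jq_range` is a hypothetical name and the arguments are assumed to be plain numbers.

```python
def jq_range(init, upto, by):
    """Yield init, init+by, ... while the value stays on the near side of upto.
    A step of 0 yields nothing, matching the select() in the jq definition."""
    value = init
    while (by > 0 and value < upto) or (by < 0 and value > upto):
        yield value
        value += by


print(list(jq_range(0, 10, 3)))
print(list(jq_range(5, 0, -2)))
```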
-
-
title: ‘Math’ body: |
jq currently only has IEEE754 double-precision (64-bit) floating point number support.
Besides simple arithmetic operators such as `+`, jq also has most standard math functions from the C math library. C math functions that take a single input argument (e.g., `sin()`) are available as zero-argument jq functions. C math functions that take two input arguments (e.g., `pow()`) are available as two-argument jq functions that ignore `.`.
Availability of standard math functions depends on the availability of the corresponding math functions in your operating system and C math library. Unavailable math functions will be defined but will raise an error.
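As a rough illustration of the two calling conventions, the sketch below assumes a `jq` executable on the `PATH` whose C library provides `sqrt`, `floor`, and `pow`; the variable name `math` is hypothetical.

```shell
# One-argument C functions read their argument from `.`;
# two-argument functions such as pow take both as jq arguments and ignore `.`.
math=$(jq -c -n '[(16 | sqrt), (3.7 | floor), pow(2; 10)]')
echo "$math"    # [4,3,1024]
```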
-
title: ‘I/O’ body: |
At this time jq has minimal support for I/O, mostly in the form of control over when inputs are read. Two builtin functions are provided for this, `input` and `inputs`, that read from the same sources (e.g., `stdin`, files named on the command-line) as jq itself. These two builtins, and jq’s own reading actions, can be interleaved with each other.
One builtin provides minimal output capabilities, `debug`. (Recall that a jq program’s output values are always output as JSON texts on `stdout`.) The `debug` builtin can have application-specific behavior, such as for executables that use the libjq C API but aren’t the jq executable itself.
entries:
-
title: “`input`” body: |
Outputs one new input.
-
title: “`inputs`” body: |
Outputs all remaining inputs, one by one.
This is primarily useful for reductions over a program’s inputs.
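A small sketch of both builtins, assuming a `jq` executable on the `PATH` and four numbers supplied on standard input. The variable names `pairs` and `total` are illustrative only.

```shell
# `input` pulls the next value, so the inputs are consumed two at a time.
pairs=$(printf '1\n2\n3\n4\n' | jq -c '[., input]')
echo "$pairs"    # [1,2] then [3,4], one per line

# With -n jq reads nothing itself, leaving every input for `inputs`.
total=$(printf '1\n2\n3\n4\n' | jq -n 'reduce inputs as $x (0; . + $x)')
echo "$total"    # 10
```

Using `-n` together with `inputs` keeps the first value from being consumed by jq's own read before the reduction starts.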
-
title: “`debug`” body: |
Causes a debug message based on the input value to be produced. The jq executable wraps the input value with `["DEBUG:", <input-value>]` and prints that and a newline on stderr, compactly. This may change in the future.
-
title: “`input_filename`” body: |
Returns the name of the file whose input is currently being filtered. Note that this will not work well unless jq is running in a UTF-8 locale.
-
title: “`input_line_number`” body: |
Returns the line number of the input currently being filtered.
-
-
title: ‘Streaming’ body: |
With the `--stream` option jq can parse input texts in a streaming fashion, allowing jq programs to start processing large JSON texts immediately rather than after the parse completes. If you have a single JSON text that is 1GB in size, streaming it will allow you to process it much more quickly.
However, streaming isn’t easy to deal with as the jq program will have `[<path>, <leaf-value>]` (and a few other forms) as inputs.
Several builtins are provided to make handling streams easier.
The examples below use the streamed form of `[0,[1]]`, which is `[[0],0],[[1,0],1],[[1,0]],[[1]]`.
Streaming forms include `[<path>, <leaf-value>]` (to indicate any scalar value, empty array, or empty object), and `[<path>]` (to indicate the end of an array or object). Future versions of jq run with `--stream` and `--seq` may output additional forms such as `["error message"]` when an input text fails to parse.
entries:
-
title: “`truncate_stream(stream_expression)`” body: |
Consumes a number as input and truncates the corresponding number of path elements from the left of the outputs of the given streaming expression.
examples:
- program: '[1|truncate_stream([[0],1],[[1,0],2],[[1,0]],[[1]])]' input: '1' output: ['[[[0],2],[[0]]]']
-
title: “`fromstream(stream_expression)`” body: |
Outputs values corresponding to the stream expression’s outputs.
examples:
- program: 'fromstream(1|truncate_stream([[0],1],[[1,0],2],[[1,0]],[[1]]))' input: 'null' output: ['[2]']
-
title: “`tostream`” body: |
The `tostream` builtin outputs the streamed form of its input.
examples:
- program: '. as $dot|fromstream($dot|tostream)|.==$dot' input: '[0,[1,{"a":1},{"b":2}]]' output: ['true']
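To see the individual stream events rather than only the round-trip result, the following sketch prints the streamed form of `[0,[1]]` and then rebuilds it. It assumes a `jq` executable on the `PATH`; `events` and `rebuilt` are illustrative variable names.

```shell
# tostream emits [path, leaf] events plus closing [path] events.
events=$(jq -c -n '[0,[1]] | tostream')
echo "$events"
# [[0],0]
# [[1,0],1]
# [[1,0]]
# [[1]]

# fromstream rebuilds the original value from those events.
rebuilt=$(jq -c -n 'fromstream([0,[1]] | tostream)')
echo "$rebuilt"    # [0,[1]]
```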
-
-
title: Assignment body: |
Assignment works a little differently in jq than in most programming languages. jq doesn’t distinguish between references to and copies of something - two objects or arrays are either equal or not equal, without any further notion of being “the same object” or “not the same object”.
If an object has two fields which are arrays, `.foo` and `.bar`, and you append something to `.foo`, then `.bar` will not get bigger, even if you’ve just set `.bar = .foo`. If you’re used to programming in languages like Python, Java, Ruby, Javascript, etc. then you can think of it as though jq does a full deep copy of every object before it does the assignment (for performance, it doesn’t actually do that, but that’s the general idea).
All the assignment operators in jq have path expressions on the left-hand side.
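The copy-like behaviour described above can be observed with a short pipeline. This is only a sketch, assuming a `jq` executable on the `PATH`; the variable name `copies` is hypothetical.

```shell
# .bar is set from .foo, then .foo grows; .bar keeps the old value.
copies=$(jq -c -n '{foo: [1, 2]} | .bar = .foo | .foo += [3] | [.foo, .bar]')
echo "$copies"    # [[1,2,3],[1,2]]
```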
entries:
-
title: “`=`” body: |
The filter `.foo = 1` will take as input an object and produce as output an object with the “foo” field set to 1. There is no notion of “modifying” or “changing” something in jq - all jq values are immutable. For instance,
.foo = .bar | .foo.baz = 1
will not have the side effect of setting `.bar.baz` to 1, as the similar-looking program in Javascript, Python, Ruby or other languages would. Unlike these languages (but like Haskell and some other functional languages), there is no notion of two arrays or objects being “the same array” or “the same object”. They can be equal, or not equal, but if we change one of them in no circumstances will the other change behind our backs.
This means that it’s impossible to build circular values in jq (such as an array whose first element is itself). This is quite intentional, and ensures that anything a jq program can produce can be represented in JSON.
Note that the left-hand side of ‘=’ refers to a value in `.`. Thus `$var.foo = 1` won’t work as expected (`$var.foo` is not a valid or useful path expression in `.`); use `$var | .foo = 1` instead.
If the right-hand side of ‘=’ produces multiple values, then for each such value jq will set the paths on the left-hand side to the value and then it will output the modified `.`. For example, `(.a,.b)=range(2)` outputs `{"a":0,"b":0}`, then `{"a":1,"b":1}`. The “update” assignment forms (see below) do not do this.
Note too that `.a,.b=0` does not set `.a` and `.b`, but `(.a,.b)=0` sets both.
-
title: “`|=`” body: |
As well as the assignment operator ‘=’, jq provides the “update” operator ‘|=’, which takes a filter on the right-hand side and works out the new value for the property of `.` being assigned to by running the old value through this expression. For instance, `.foo |= .+1` will build an object with the “foo” field set to the input’s “foo” plus 1.
This example should show the difference between ‘=’ and ‘|=’: Provide input `{"a": {"b": 10}, "b": 20}` to the programs:
.a = .b
.a |= .b
The former will set the “a” field of the input to the “b” field of the input, and produce the output `{"a": 20, "b": 20}`. The latter will set the “a” field of the input to the “a” field’s “b” field, producing `{"a": 10, "b": 20}`.
The left-hand side can be any general path expression; see `path()`.
Note that the left-hand side of ‘|=’ refers to a value in `.`. Thus `$var.foo |= . + 1` won’t work as expected (`$var.foo` is not a valid or useful path expression in `.`); use `$var | .foo |= . + 1` instead.
If the right-hand side outputs multiple values, only the last one will be used.
examples:
- program: '(..|select(type=="boolean")) |= if . then 1 else 0 end' input: '[true,false,[5,true,[true,[false]],false]]' output: ['[1,0,[5,1,[1,[0]],0]]']
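The difference between ‘=’ and ‘|=’ can be reproduced from a shell with the input quoted in the text. This is a sketch that assumes a `jq` executable on the `PATH`; `doc`, `assigned`, and `updated` are illustrative variable names.

```shell
doc='{"a": {"b": 10}, "b": 20}'

# '=' evaluates the right-hand side against the whole input, so .b is 20.
assigned=$(printf '%s' "$doc" | jq -c '.a = .b')
echo "$assigned"    # {"a":20,"b":20}

# '|=' evaluates the right-hand side against the old value of .a, so .b is 10.
updated=$(printf '%s' "$doc" | jq -c '.a |= .b')
echo "$updated"     # {"a":10,"b":20}
```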
-
title: “`+=`, `-=`, `*=`, `/=`, `%=`, `//=`” body: |
jq has a few operators of the form `a op= b`, which are all equivalent to `a |= . op b`. So, `+= 1` can be used to increment values.
examples:
- program: .foo += 1 input: '{"foo": 42}' output: ['{"foo": 43}']
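Two of the other operators in this family behave the same way. The sketch below assumes a `jq` executable on the `PATH`; the field names `name` and `count` and the variable `defaults` are made up for the illustration.

```shell
# `//=` fills in a null field; `*=` multiplies the existing value.
defaults=$(jq -c -n '{name: null, count: 5} | .name //= "unknown" | .count *= 2')
echo "$defaults"    # {"name":"unknown","count":10}
```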
-
title: Complex assignments body: | Lots more things are allowed on the left-hand side of a jq assignment than in most languages. We’ve already seen simple field accesses on the left hand side, and it’s no surprise that array accesses work just as well:
.posts[0].title = "JQ Manual"
What may come as a surprise is that the expression on the left may produce multiple results, referring to different points in the input document:
.posts[].comments |= . + ["this is great"]
That example appends the string “this is great” to the “comments” array of each post in the input (where the input is an object with a field “posts” which is an array of posts).
When jq encounters an assignment like ‘a = b’, it records the “path” taken to select a part of the input document while executing a. This path is then used to find which part of the input to change while executing the assignment. Any filter may be used on the left-hand side of an equals - whichever paths it selects from the input will be where the assignment is performed.
This is a very powerful operation. Suppose we wanted to add a comment to blog posts, using the same “blog” input above. This time, we only want to comment on the posts written by “stedolan”. We can find those posts using the “select” function described earlier:
.posts[] | select(.author == "stedolan")
The paths provided by this operation point to each of the posts that “stedolan” wrote, and we can comment on each of them in the same way that we did before:
(.posts[] | select(.author == "stedolan") | .comments) |= . + ["terrible."]
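The selective update can be tried on a small stand-in for the blog input. The sample document below is an assumption made for the sketch (the manual does not give one), as are the variable names `blog` and `comments`; a `jq` executable is assumed to be on the `PATH`.

```shell
# Hypothetical blog input with one post by "stedolan" and one by someone else.
blog='{"posts":[{"author":"stedolan","comments":[]},{"author":"someone","comments":["hi"]}]}'

# Only the paths selected on the left-hand side are updated.
comments=$(printf '%s' "$blog" |
  jq -c '(.posts[] | select(.author == "stedolan") | .comments) |= . + ["terrible."] | [.posts[].comments]')
echo "$comments"    # [["terrible."],["hi"]]
```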
-
-
title: Modules body: |
jq has a library/module system. Modules are files whose names end in `.jq`.
Modules imported by a program are searched for in a default search path (see below). The `import` and `include` directives allow the importer to alter this path.
Paths in a search path are subject to various substitutions.
For paths starting with “~/”, the user’s home directory is substituted for “~”.
For paths starting with “$ORIGIN/”, the path of the jq executable is substituted for “$ORIGIN”.
For paths starting with “./” or paths that are “.”, the path of the including file is substituted for “.”. For top-level programs given on the command-line, the current directory is used.
Import directives can optionally specify a search path to which the default is appended.
The default search path is the search path given to the `-L` command-line option, else `["~/.jq", "$ORIGIN/../lib/jq", "$ORIGIN/../lib"]`.
Null and empty string path elements terminate search path processing.
A dependency with relative path “foo/bar” would be searched for in “foo/bar.jq” and “foo/bar/bar.jq” in the given search path. This is intended to allow modules to be placed in a directory along with, for example, version control files, README files, and so on, but also to allow for single-file modules.
Consecutive components with the same name are not allowed to avoid ambiguities (e.g., “foo/foo”).
For example, with `-L$HOME/.jq` a module `foo` can be found in `$HOME/.jq/foo.jq` and `$HOME/.jq/foo/foo.jq`.
If “$HOME/.jq” is a file, it is sourced into the main program.
entries:
-
title: “`import RelativePathString as NAME [<metadata>];`” body: |
Imports a module found at the given path relative to a directory in a search path. A “.jq” suffix will be added to the relative path string. The module’s symbols are prefixed with “NAME::”.
The optional metadata must be a constant jq expression. It should be an object with keys like “homepage” and so on. At this time jq only uses the “search” key/value of the metadata. The metadata is also made available to users via the `modulemeta` builtin.
The “search” key in the metadata, if present, should have a string or array value (array of strings); this is the search path to be prefixed to the top-level search path.
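A minimal end-to-end sketch of importing a module follows. The module file `mathlib.jq`, its `double` function, the alias `m`, and the temporary directory are all hypothetical names created for the demonstration; a `jq` executable with module support is assumed to be on the `PATH`.

```shell
# Hypothetical module directory and file name, created just for this demo.
moddir=$(mktemp -d)
printf 'def double: . * 2;\n' > "$moddir/mathlib.jq"

# -L adds the directory to the search path; symbols are reached as m::name.
answer=$(jq -n -L "$moddir" 'import "mathlib" as m; 21 | m::double')
echo "$answer"    # 42

rm -rf "$moddir"
```

The relative path string is given without the “.jq” suffix, since jq appends it when resolving the module.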
-
title: “`include RelativePathString [<metadata>];`” body: |
Imports a module found at the given path relative to a directory in a search path as if it were included in place. A “.jq” suffix will be added to the relative path string. The module’s symbols are imported into the caller’s namespace as if the module’s content had been included directly.
The optional metadata must be a constant jq expression. It should be an object with keys like “homepage” and so on. At this time jq only uses the “search” key/value of the metadata. The metadata is also made available to users via the `modulemeta` builtin.
-
title: “`import RelativePathString as $NAME [<metadata>];`” body: |
Imports a JSON file found at the given path relative to a directory in a search path. A “.json” suffix will be added to the relative path string. The file’s data will be available as `$NAME::NAME`.
The optional metadata must be a constant jq expression. It should be an object with keys like “homepage” and so on. At this time jq only uses the “search” key/value of the metadata. The metadata is also made available to users via the `modulemeta` builtin.
The “search” key in the metadata, if present, should have a string or array value (array of strings); this is the search path to be prefixed to the top-level search path.
-
title: “`module <metadata>;`” body: |
This directive is entirely optional. It’s not required for proper operation. It serves only the purpose of providing metadata that can be read with the `modulemeta` builtin.
The metadata must be a constant jq expression. It should be an object with keys like “homepage”. At this time jq doesn’t use this metadata, but it is made available to users via the `modulemeta` builtin.
-
title: “`modulemeta`” body: |
Takes a module name as input and outputs the module’s metadata as an object, with the module’s imports (including metadata) as an array value for the “deps” key.
Programs can use this to query a module’s metadata, which they could then use to, for example, search for, download, and install missing dependencies.
-