SunSpec is a Modbus standard common in the United States for energy devices: think solar panels, inverters, and battery packs. After finding the SunSpec Alliance on GitHub and playing with their pysunspec2 package, I wanted an overview of all the attributes that are available. They do provide a models repository where the models are published individually as JSON or SMDX files; there are 106 of them. I found these pretty unwieldy and decided to convert all the JSON files into a single CSV.
Tools:

- fish (could use bash with tweaks)
- jq
- qsv
The SunSpec Alliance provides a schema.json, but as a general rule the files look like:
```json
{
"id": 1,
"group": {
"desc": "All SunSpec compliant devices must include this as the first model",
"label": "Common",
"name": "common",
"points": [
{
"desc": "Model identifier",
"label": "Model ID",
"mandatory": "M",
"name": "ID",
"size": 1,
"static": "S",
"type": "uint16",
"value": 1,
"standards": [
"IEEE 1547-2018"
]
}
]
}
}
```
Beginning
The first bit is to get the points, as that is mainly what I am interested in:
```fish
cat model_2.json | jq -c '.["group"]["points"]'
```

```
[{"desc":"Model identifier","label":"Model ID","mandatory":"M", ...}, ...]
```
Pretty good! Let's pipe that into qsv and see what happens:
```fish
cat model_2.json | jq -c '.["group"]["points"]' | qsv json
```

```
Selector name 'standards' does not exist as a named header in the given CSV data.
```
Not good! My original idea was to exclude standards altogether, but in the end I decided to keep it. Either way, both options require the same solution: re-formatting the dictionary, which can be done with a pipe. This also has the added benefit of putting every file's fields in the same order for qsv.
```fish
cat model_2.json | jq -c '[.["group"]["points"][] | {
name: .name,
label: .label,
access: .access,
mandatory: .mandatory,
type: .type,
units: .units
}]' | qsv json
```

```
name,label,access,mandatory,type,units
ID,Model ID,,M,uint16,
L,Model Length,,M,uint16,
...
```
Standards
While including the standards, an array of strings, isn't useful to me, it would be required to be thorough. This led me to learn a few nice features of jq (quick standalone demos follow the list):

- null checks: `(.field != null)`
- if statements: `if (condition) then do_a else do_b end`
- string concatenation: `.field | join(",")`
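These are easy to try in isolation; here is a quick sketch against toy JSON (the inputs are made up for illustration):

```fish
echo '{"a": 1}' | jq '.b != null'
# false

echo '{"a": 1}' | jq 'if (.b != null) then .b else "default" end'
# "default"

echo '{"xs": ["a", "b"]}' | jq '.xs | join(",")'
# "a,b"
```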
Combining those we get:
```fish
cat model_2.json | \
jq -c '.["group"]["points"][]' | \
jq -c '(if (.standards != null) then (.standards | join(",")) else "" end)'
```

```
"IEEE 1547-2018"
"IEEE 1547-2018"
"IEEE 1547-2018"
""
```
Include the name to verify they are correct:
```fish
cat model_2.json | \
jq -c '.["group"]["points"][]' | \
jq -c '(if (.standards != null) then [.name, (.standards | join(","))] else [.name, ""] end)'
```

```
["ID","IEEE 1547-2018"]
["L","IEEE 1547-2018"]
["Mn","IEEE 1547-2018"]
["Md","IEEE 1547-2018"]
["Opt",""]
```
We can now add that to our main re-formatting statement, as well as provide default values for fields like access using if statements:
```fish
cat model_2.json | jq -c '[.["group"]["points"][] | {
name: .name,
label: .label,
access: (if (.access != null) then .access else "R" end),
mandatory: (if (.mandatory != null) then .mandatory else "O" end),
type: .type,
units: .units,
  standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
}]' | qsv json
```
Symbols and Enums
Enums were an important thing for me to include, and many get more complicated than 0=off and 1=on. Enums in these files live under the key symbols and are an array of dictionaries with two keys: value and name.
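For reference, the symbols array behind the enum used below looks roughly like this (abridged; reconstructed from the output shown later rather than copied verbatim from model_2.json):

```json
"symbols": [
  { "name": "OFF",   "value": 1 },
  { "name": "ON",    "value": 2 },
  { "name": "FULL",  "value": 3 },
  { "name": "FAULT", "value": 4 }
]
```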
Using what was learned with standards, and sticking with model_2.json, this could be a decent first attempt:
```fish
cat model_2.json | \
jq -c '.["group"]["points"][]' | \
jq -c '(if (.symbols != null) then (.symbols[] | ([.name, .value] | join("="))) else .symbols end)'
```

```
null
null
"OFF=1"
"ON=2"
"FULL=3"
"FAULT=4"
```
The issue is that this code places each enum entry on its own row, and we want them crammed into a single cell in the CSV. The solution is another nice jq feature called map, demoed below.
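As a standalone sketch, map applies a filter to every element of an array and returns the results as a single array (toy input for illustration):

```fish
echo '[{"name":"OFF","value":1},{"name":"ON","value":2}]' | jq -c 'map([.name, .value] | join("="))'
# ["OFF=1","ON=2"]
```

Applied to model_2.json: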
```fish
cat model_2.json | \
jq -c '.["group"]["points"][]' | \
jq -c '(if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) ) else .symbols end)'
```

```
null
["OFF=1","ON=2","FULL=3","FAULT=4"]
```
Wrap that with another join, to combine the items with commas, and it’s
ready for the main re-formatting statement:
```fish
cat model_2.json | jq -c '[.["group"]["points"][] | {
name: .name,
label: .label,
access: (if (.access != null) then .access else "R" end),
mandatory: (if (.mandatory != null) then .mandatory else "O" end),
type: .type,
units: .units,
symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
}]' | qsv json
```
Injecting Groups
The one thing missing now is injecting the group information into each row. This can be done with --argjson.
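As a quick aside, --argjson parses its value as JSON rather than as a raw string, which is why the quoted output of the inner jq call can be passed straight through. A minimal made-up demo:

```fish
echo '{}' | jq -c --argjson n 42 '. + { answer: $n }'
# {"answer":42}
```

Applied to the model file: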
```fish
cat model_2.json | \
jq --argjson group_name $(cat model_2.json | jq -c '.["group"]["name"]') \
'.["group"]["points"][] += { group_name: $group_name }'
```
This injects the key into each point’s dictionary. Now all we have to do
is chain this before the main re-formatting statement and add the new key.
```fish
cat model_2.json | \
jq --argjson group_name $(cat model_2.json | jq -c '.["group"]["name"]') \
'.["group"]["points"][] += { group_name: $group_name }' | \
jq -c '[.["group"]["points"][] | {
group_name: .group_name,
name: .name,
label: .label,
access: (if (.access != null) then .access else "R" end),
mandatory: (if (.mandatory != null) then .mandatory else "O" end),
type: .type,
units: .units,
symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
}]' | qsv json
```
Piping into a file
There is a caveat with piping this directly into a CSV: the headers. I have not been able to figure out how to exclude them using qsv, and the --no-headers argument has no effect here. To counter this, on the first run of the command we write directly to the file:
```fish
cat model_1.json | \
jq --argjson group_name $(cat model_1.json | jq -c '.["group"]["name"]') \
'.["group"]["points"][] += { group_name: $group_name }' | \
jq -c '[.["group"]["points"][] | {
group_name: .group_name,
name: .name,
label: .label,
access: (if (.access != null) then .access else "R" end),
mandatory: (if (.mandatory != null) then .mandatory else "O" end),
type: .type,
units: .units,
symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
}]' | qsv json > sunspec.csv
```
Any subsequent writes need to trim the header off and then append:
```fish
cat model_2.json | \
jq --argjson group_name $(cat model_2.json | jq -c '.["group"]["name"]') \
'.["group"]["points"][] += { group_name: $group_name }' | \
jq -c '[.["group"]["points"][] | {
group_name: .group_name,
name: .name,
label: .label,
access: (if (.access != null) then .access else "R" end),
mandatory: (if (.mandatory != null) then .mandatory else "O" end),
type: .type,
units: .units,
symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
}]' | qsv json | tail -n +2 >> sunspec.csv
```
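As an aside: qsv also ships a behead command that strips the header row, which (assuming it is in your build) could replace the tail trick. A tiny made-up demo of the idea:

```fish
echo '[{"a":1},{"a":2}]' | qsv json | qsv behead
# 1
# 2
```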
Script
You could manually copy+paste+edit this command to hit all 106 model files cloned locally to your machine. I decided to expand on this and make a fish script that curls the files directly from GitHub. With all this in mind, and adding a few more columns, this is the final script:
Path: scrape-sunspec.fish
```fish
# Raw GitHub URL prefix for the SunSpec model JSON files
set -l base_url "https://raw.githubusercontent.com/sunspec/models/refs/heads/master/json/model_"
set -l model_numbers 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 101 102 103 111 112 113 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 160 201 202 203 204 211 212 213 214 220 302 303 304 305 306 307 308 401 402 403 404 501 502 601 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 801 802 803 804 805 806 807 808 809 63001 63002 64001 64020 64101 64111 64112
set -l first true
for model in $model_numbers
    # Fetch the raw model JSON quietly
    set -l raw (curl -s "$base_url$model.json")
    set -l parsed (
        echo $raw |
        jq \
            --argjson group_label $(echo $raw | jq -c '.["group"]["label"]') \
            --argjson group_desc $(echo $raw | jq -c '.["group"]["desc"]') \
            --argjson group_name $(echo $raw | jq -c '.["group"]["name"]') \
            '.["group"]["points"][] += {
                group_name: $group_name,
                group_label: $group_label,
                group_desc: $group_desc
            }' | \
        jq '[.["group"]["points"][] | {
            group_name: .group_name,
            group_label: .group_label,
            group_desc: .group_desc,
            name: .name,
            label: .label,
            access: (if (.access != null) then .access else "R" end),
            mandatory: (if (.mandatory != null) then .mandatory else "O" end),
            static: (if (.static != null) then .static else "D" end),
            size: .size,
            value: .value,
            count: .count,
            sf: .sf,
            type: .type,
            units: .units,
            symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
            desc: .desc,
            comments: (if (.comments != null) then (.comments | join(",")) else .comments end),
            standards: (if (.standards != null) then (.standards | join(",")) else .standards end),
            detail: .detail,
            notes: .notes
        }]')
    # The first file keeps the header row; every later file is appended without it
    if test "$first" = true
        echo $parsed | qsv json > sunspec.csv
        set first false
    else
        echo $parsed | qsv json | tail -n +2 >> sunspec.csv
    end
end
```
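Assuming it is saved as scrape-sunspec.fish, running it is just:

```fish
fish scrape-sunspec.fish
```

and sunspec.csv lands in the current directory.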
Conclusion
This whole thing was a good learning experience. I wish I had used qsv more during this process, but it wasn't meant to be! Hopefully you've walked away with a new trick or two. Now to figure out the actual register addressing scheme!
Note: some of the model files have nested groups. Those can be accounted for by extending the script; I took a look at a few of them and decided it wasn't worth the time, but a sketch follows below.
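If you do need them, here is an untested sketch: nested groups sit in a groups array inside their parent group, so jq's recurse can walk the whole tree and tag each point with the group it came from (model_N.json is a placeholder for one of the nested model files):

```fish
cat model_N.json | jq -c '[
  .group
  | recurse(.groups[]?)            # the top group plus any nested groups
  | . as $g
  | (.points // [])[]              # guard against groups without points
  | . + { group_name: $g.name }
]'
```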