SunSpec CSV

SunSpec is a Modbus standard common in the United States for energy devices; think solar panels, inverters and battery packs. After finding the SunSpec Alliance on GitHub and playing with their pysunspec2 package, I wanted to get an overview of all the attributes that are available. They do provide a models repository where each model lives in its own JSON or SMDX file; there are 106 of them. I found these pretty unwieldy and decided to convert all the JSON files into a single CSV.

Tools:

  1. fish (could use bash with tweaks)
  2. jq
  3. qsv

General Format

The SunSpec Alliance provides a schema.json, but as a general rule the files look like this:

{
  "id": 1,
  "group": {
    "desc": "All SunSpec compliant devices must include this as the first model",
    "label": "Common",
    "name": "common",
    "points": [
      {
        "desc": "Model identifier",
        "label": "Model ID",
        "mandatory": "M",
        "name": "ID",
        "size": 1,
        "static": "S",
        "type": "uint16",
        "value": 1,
        "standards": [
          "IEEE 1547-2018"
        ]
      }
    ]
  }
}

Beginning

The first step is to get the points, as that is mainly what I am interested in:

cat model_2.json | jq -c '.["group"]["points"]'
[{"desc":"Model identifier","label":"Model ID","mandatory":"M", ...}, ...]

Pretty good! Let's pipe that into qsv and see what happens:

cat model_2.json | jq -c '.["group"]["points"]' | qsv json
Selector name 'standards' does not exist as a named header in the given CSV data.

Not good! My original idea was to exclude standards altogether, but in the end I decided to keep it. Either way, both options require the same solution: re-formatting the dictionary, which can be done with a pipe. This also has the added benefit of keeping the fields in the same order across all files for qsv.

cat model_2.json | jq -c '[.["group"]["points"][] | {
    name: .name,
    label: .label,
    access: .access,
    mandatory: .mandatory,
    type: .type,
    units: .units
  }]' | qsv json
name,label,access,mandatory,type,units
ID,Model ID,,M,uint16,
L,Model Length,,M,uint16,
...

Standards

While including the standards, which is an array of strings, is not especially useful to me, it would be required to be thorough. This led me to learn three nice features of jq:

  1. checking for missing attributes: (.field != null)
  2. if statements: if (condition) then do_a else do_b end
  3. joining arrays into a string: .field | join(",")

Combining those we get:

cat model_2.json | \
  jq -c '.["group"]["points"][]' | \
  jq -c '(if (.standards != null) then (.standards | join(",")) else .name end)'
"IEEE 1547-2018"
"IEEE 1547-2018"
"IEEE 1547-2018"
""

Include the name to verify they are correct:

cat model_2.json | \
  jq -c '.["group"]["points"][]' | \
  jq -c '(if (.standards != null) then [.name, (.standards | join(","))] else .name end)'
["ID","IEEE 1547-2018"]
["L","IEEE 1547-2018"]
["Mn","IEEE 1547-2018"]
["Md","IEEE 1547-2018"]
["Opt",""]

We can now add that to our main re-formatting statement as well as provide default values for fields like access using if statements:

cat model_2.json | jq -c '[.["group"]["points"][] | {
    name: .name,
    label: .label,
    access: (if (.access != null) then .access else "R" end),
    mandatory: (if (.mandatory != null) then .mandatory else "O" end),
    type: .type,
    units: .units,
    standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
  }]' | qsv json

Symbols and Enums

Enums were an important thing for me to include, and many get more complicated than 0=off and 1=on. Enums in these files live under the key symbols and are an array of dictionaries, each with two keys: value and name.
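
For reference, a point that carries enums looks roughly like this in the raw JSON (the names and values here are taken from the output shown further below):

"symbols": [
  {
    "name": "OFF",
    "value": 1
  },
  {
    "name": "ON",
    "value": 2
  }
]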

Using what was learned with standards, and using model_2.json again, this could be a decent first attempt:

cat model_2.json | \
  jq -c '.["group"]["points"][]' | \
  jq -c '(if (.symbols != null) then (.symbols[] | ([.name, .value] | join("="))) else .symbols end)'
null
null
"OFF=1"
"ON=2"
"FULL=3"
"FAULT=4"

The issue is that this code places each enum on its own row, and we want them crammed into a single cell in the CSV. The solution is another nice jq feature called map:

cat model_2.json | \
  jq -c '.["group"]["points"][]' | \
  jq -c '(if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) ) else .symbols end)'
null
["OFF=1","ON=2","FULL=3","FAULT=4"]

Wrap that with another join, to combine the items with commas, and it’s ready for the main re-formatting statement:

cat model_2.json | jq -c '[.["group"]["points"][] | {
    name: .name,
    label: .label,
    access: (if (.access != null) then .access else "R" end),
    mandatory: (if (.mandatory != null) then .mandatory else "O" end),
    type: .type,
    units: .units,
    symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
    standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
  }]' | qsv json

Injecting Groups

The one thing missing now is injecting the group information into each row. This can be done with jq's --argjson flag:

cat model_2.json | \
  jq --argjson group_name $(cat model_2.json | jq -c '.["group"]["name"]') \
    '.["group"]["points"][] += { group_name: $group_name }'

This injects the key into each point’s dictionary. Now all we have to do is chain this before the main re-formatting statement and add the new key.

cat model_2.json | \
  jq --argjson group_name $(cat model_2.json | jq -c '.["group"]["name"]') \
    '.["group"]["points"][] += { group_name: $group_name }' | \
  jq -c '[.["group"]["points"][] | {
    group_name: .group_name,
    name: .name,
    label: .label,
    access: (if (.access != null) then .access else "R" end),
    mandatory: (if (.mandatory != null) then .mandatory else "O" end),
    type: .type,
    units: .units,
    symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
    standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
  }]' | qsv json

Piping into a file

There is a caveat with piping this directly into a CSV: the headers. I have not been able to figure out how to exclude them with qsv; the --no-headers argument has no effect. To counter this, on the first run of the command we write directly to the file:

cat model_1.json | \
  jq --argjson group_name $(cat model_1.json | jq -c '.["group"]["name"]') \
    '.["group"]["points"][] += { group_name: $group_name }' | \
  jq -c '[.["group"]["points"][] | {
    group_name: .group_name,
    name: .name,
    label: .label,
    access: (if (.access != null) then .access else "R" end),
    mandatory: (if (.mandatory != null) then .mandatory else "O" end),
    type: .type,
    units: .units,
    symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
    standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
  }]' | qsv json > sunspec.csv

Any subsequent writes need to trim the header row off and then append:

cat model_2.json | \
  jq --argjson group_name $(cat model_2.json | jq -c '.["group"]["name"]') \
    '.["group"]["points"][] += { group_name: $group_name }' | \
  jq -c '[.["group"]["points"][] | {
    group_name: .group_name,
    name: .name,
    label: .label,
    access: (if (.access != null) then .access else "R" end),
    mandatory: (if (.mandatory != null) then .mandatory else "O" end),
    type: .type,
    units: .units,
    symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
    standards: (if (.standards != null) then (.standards | join(",")) else .standards end)
  }]' | qsv json | tail -n +2 >> sunspec.csv

Script

You could manually copy+paste+edit this command to hit all 106 model files cloned locally to your machine. I decided to expand on this and make a fish script that curls the files directly from GitHub.

With all this in mind and adding more columns, this is the final script:

Path: scrape-sunspec.fish
set -l base_url "https://raw.githubusercontent.com/sunspec/models/refs/heads/master/json/model_"
set -l model_numbers 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 101 102 103 111 112 113 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 160 201 202 203 204 211 212 213 214 220 302 303 304 305 306 307 308 401 402 403 404 501 502 601 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 801 802 803 804 805 806 807 808 809 63001 63002 64001 64020 64101 64111 64112 

set -l first true
for model in $model_numbers
    set -l raw (curl "$base_url$model.json")
    set -l parsed (
        echo $raw |
        jq \
            --argjson group_label $(echo $raw | jq -c '.["group"]["label"]') \
            --argjson group_desc $(echo $raw | jq -c '.["group"]["desc"]') \
            --argjson group_name $(echo $raw | jq -c '.["group"]["name"]') \
            '.["group"]["points"][] += {
                group_name: $group_name,
                group_label: $group_label,
                group_desc: $group_desc
            }' | \
        jq '[.["group"]["points"][] | {
            group_name: .group_name,
            group_label: .group_label,
            group_desc: .group_desc,
            name: .name,
            label: .label,
            access: (if (.access != null) then .access else "R" end),
            mandatory: (if (.mandatory != null) then .mandatory else "O" end),
            static: (if (.static != null) then .static else "D" end),
            size: .size,
            value: .value,
            count: .count,
            sf: .sf,
            type: .type,
            units: .units,
            symbols: (if (.symbols != null) then (.symbols | map([.name, .value] | join("=")) | join(",")) else .symbols end),
            desc: .desc,
            comments: (if (.comments != null) then (.comments | join(",")) else .comments end),
            standards: (if (.standards != null) then (.standards | join(",")) else .standards end),
            detail: .detail,
            notes: .notes
        }]')

    if test "$first" = true
        echo $parsed | qsv json > sunspec.csv
        set first false
    else
        echo $parsed | qsv json | tail -n +2 >> sunspec.csv
    end
end
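
Running fish scrape-sunspec.fish produces the combined sunspec.csv in whatever directory you run it from.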

Conclusion

This whole thing was a good learning experience. I wish I had used qsv more during this process, but it wasn't meant to be! Hopefully you've walked away with a new trick or two! Now time to figure out the actual register address scheme!

Note: some of the model files have nested groups. Those can be accounted for by extending the script. I took a look at a few of these nested groups and decided it wasn’t worth the time.
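
If you did want to pick the nested points up, a rough sketch (not something I ended up using) would be to walk the group tree with jq's recurse. The groups key for nested groups and model_704.json as an example file are assumptions here, not something shown above:

# sketch only: recurse into nested groups and collect points from every level
cat model_704.json | jq -c '[
    .group
    | recurse(.groups[]?)
    | .name as $group_name
    | .points[]?
    | { group_name: $group_name, name: .name, type: .type }
  ]' | qsv json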
