27 Aug 2018
It's been a while...
I've been fiddling with home automation for a few years, but only recently found my way to http://www.home-assistant.io - a project with great developers and a great community. I've been hanging about the official Discord Chat Server, and try to give something back by helping people when I can.
One thing I noticed that people often struggle with is the YAML configuration files. A strange choice for the kind of quasi-programming you may want to do when automating your home appliances.
In this post I have tried to describe how YAML works, and how I think about the way it represents basic data structures. I hope it can be useful to someone...
To understand YAML, you need to understand what it's describing. YAML works with two main data structures - Dictionaries and Lists.
Lists are easy to understand. They're just a list - ordered collections of things. There are two things you need to remember though. Let me show them to you in a list :)
A list of lists might sound weird, but just think about the lists you have in your home.
Dictionaries (or dicts for short) are not much more complicated. You've probably encountered them in some way or another, but perhaps with a different name. In different programming languages they can be called dictionaries, hashes, maps, hash tables, tables, collections or even just objects. The technical name is Associative Array.
Regardless of name, the concept is simple. A dictionary is an unordered collection of entries - where each entry has a Key and a Value. The Key can be thought of as a name or title for the entry, and the Value is the data of the entry. Remember:
Let's look at some sample dictionaries:
Monday: Sausage and beans
Tuesday: Fish
Wednesday: French onion soup
Thursday: Pea soup and pancakes
Friday: Pizza
Each day is labeled by a key, and has a value - what you are going to eat that day. Note that while you could add another Wednesday to the end of the list, it wouldn't really make sense. Thus keys have to be unique. The values don't, however. It would make perfect sense to eat pizza again on Saturday.
Since keys are unique, their order is not important:
Wednesday: French onion soup
Friday: Pizza
Monday: Sausage and beans
Tuesday: Fish
Thursday: Pea soup and pancakes
This dictionary contains exactly the same data as the one above. A clear difference from a list, where the order itself is a part of the data.
Another dictionary:
Name: Thomas Lovén
Email: thomasloven@example.com
That's a dictionary. Looks kind of like a database of a sort, doesn't it? Like the address book in your email program? Ah! But don't get fooled. The address book is a list, not a dictionary. However - the items in the list are dictionaries.
Let's add on to that dict:
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: singing, woodworking, home automation
Now we added an entry to the dict where the value is a list. I have three hobbies. This illustrates that the value of a dictionary entry can be anything. Even a dictionary:
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: singing, woodworking, home automation
Phones: Home: +46 (0)XX XXXXXX, Work: +46 (0)XX XXXXXX
And remember that lists can contain dictionaries too...
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: singing, woodworking, home automation
Phones: Home: +46 (0)XX XXXXXX, Work: +46 (0)XX XXXXXX
Children: Name: N, Age: 3 ; Name: H, Age: 1
But at this point things are getting advanced. It's hard to keep track of what is a dict and what is a list and what contains what...
If only there was a language to describe these concepts... a sort of Markup Language, if you will... but who needs Yet Another one of those?
Let me tell you about JavaScript Object Notation - JSON. You thought I was going to say YAML, didn't you? Soon...
JSON is a simple way of writing down the concepts described above, which can be easily understood both by humans and by computers.
JSON describes objects and arrays, but let's call them dictionaries and lists instead.
Lists are surrounded by square brackets and contain items separated by commas. The items can be strings, numbers, dictionaries, lists or any of the magic values true, false, or null.
Dictionaries are surrounded by curly braces and contain entries separated by commas. The key and value of each entry are separated by a colon. Keys must be strings, but values can be anything that can be in a list.
Let's look at our dict from above in JSON format:
{
  "Name": "Thomas Lovén",
  "Email": "thomasloven@example.com",
  "Hobbies": [
    "singing",
    "woodworking",
    "home automation"
  ],
  "Phones": {
    "Home": "+46 (0)XX XXXXXX",
    "Work": "+46 (0)XX XXXXXX"
  },
  "Children": [
    { "Name": "N", "Age": 3 },
    { "Name": "H", "Age": 1 }
  ]
}
I added some line breaks and indentations to make it more pretty, but this is much easier to read. Even the last entry with the nested dict-list-dict about my children. Two things to note:
OK. Now you understand one markup language. Let's learn something different.
YAML Ain't Markup Language - but it's pretty darn close, to be honest.
While probably not historically accurate, YAML can be seen as an evolution or superset of JSON. In fact, any valid JSON is also valid YAML. That might be useful to remember. There are some notable differences, though.
First of all, YAML doesn't require brackets or braces. Instead, items in lists are separated by newlines where each item starts with a dash:
- Item 1
- Item 2
- Item 3 is a long one that stretches over multiple lines.
  The new item won't start until we get to a line that starts
  with a dash, like the one below this one.
- Item 4
-
  - Item 5a
  - Item 5b
  - Item 5c
- Item 5
Some things to note:
"true" is a string, but true is a boolean value.
I'll just take the opportunity to say this again: Indentation is important. Probably the most important concept of YAML.
Slightly simplified, one can say that by indenting lines, those lines become a continuation of the line above, until the next de-dented line.
Dictionaries are also written one entry per line, as key: value:
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies:
  - singing
  - woodworking
  - home automation
Phones:
  Home: +46 (0)XX XXXXXX
  Work: +46 (0)XX XXXXXX
Children:
  - Name: N
    Age: 3
  - Name: H
    Age: 1
Things to note:
Hobbies is a list. Like above, each line of the value is indented by an equal number of spaces.
Phones is a dictionary. The same indentation rules apply.
Children is a list where each item is a dictionary. So each line in each dictionary is indented twice.
And that's all the basics of data representation using YAML.
As I've been trying to help people with their configurations on the homeassistant Discord server, I have found one problem which occurs more than any other. Indentation errors.
Indentation is important
It must be correct or the YAML either won't be accepted by the parser or will describe something entirely different from what you intended.
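To illustrate how a single slip in indentation changes the meaning, here is a small sketch based on the Phones dictionary from above. The top-level keys intended and mistake are only labels I made up for the comparison; they are not part of any real configuration.

```yaml
# "intended" and "mistake" are hypothetical labels, used only to show both versions side by side.
intended:
  # Phones is a dictionary with two entries: Home and Work.
  Phones:
    Home: +46 (0)XX XXXXXX
    Work: +46 (0)XX XXXXXX

mistake:
  # Work has lost two spaces of indentation.
  # It is now a sibling of Phones, not an entry inside it.
  Phones:
    Home: +46 (0)XX XXXXXX
  Work: +46 (0)XX XXXXXX
```

Both versions are perfectly valid YAML, so the parser will not complain about the second one. It simply describes a different structure.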
The only advice I can give is to think carefully about the structure of the data you are trying to represent. What is your object? Is it a dictionary or a list? Where is it contained? Is it freestanding? Is it an entry in a dict? Is it an item in a list? What is its parent? What are its children? What are its siblings?
If you have a complex structure - say some nested cards in lovelace - it might help to make a drawing on actual paper of what you want.
In the YAML dictionary sample above, I used a contracted form in my list of dicts. This is common practice, but may be a bit confusing at first since it makes the indentation unclear.
It might be easier to understand the structure of the document if you use the expanded form:
Children:
  -
    Name: N
    Age: 3
  -
    Name: H
    Age: 1
Also, remember that dictionary entries are unordered. The YAML below describes the same thing as that above, but might be more confusing.
Children:
  - Name: N
    Age: 3
  - Age: 1
    Name: H
Often software-generated YAML (such as what's produced by lovelace-gen) writes the entries out in such a way that the keys are ordered alphabetically.
Adding comments to your code makes it easier to understand. Both to other people, and - more importantly - to yourself when you return to it in six months because something stopped working.
In YAML, comments begin with a number sign #, last until the end of the line, and are simply ignored by the parser.
# A dictionary about me
Name: Thomas Lovén
Email: thomasloven@example.com # This isn't really my email
Hobbies: # Just some of the ways I like to waste time
  - singing # choir, mostly
  - woodworking
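If you want a number sign in your entry, the safe habit is to quote the entry. (Strictly speaking, # only starts a comment at the beginning of a line or after a space.)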
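A short sketch of what that means in practice. The keys below (Room, Quoted room, Colour) are made up for the illustration.

```yaml
# Unquoted: everything from " #" onward is a comment, so the value is just "Living room".
Room: Living room #1

# Quoted: the number sign is part of the value.
Quoted room: "Living room #1"

# Values that start with a number sign, such as hex colours, need quotes as well.
Colour: "#ff0000"
```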
As mentioned, YAML doesn't require quotes around strings, but they are allowed. Quotes can be useful to tweak the parsing. Imagine for example the following list:
- Halflife 1
- Halflife 2
- Halflife 2: Episode Two
This is a list of strings, right? Wrong. The third entry is a dictionary with a single entry "Halflife 2" with value "Episode Two" (keys can contain spaces, by the way).
To fix this, you can use quotes:
- Halflife 1
- Halflife 2
- "Halflife 2: Episode Two"
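Quotes also help when a value looks like something other than a string. The list below is a small sketch of how I would expect a YAML 1.1 parser such as PyYAML (which, as far as I know, is what homeassistant builds on) to read a few similar-looking items. Other parsers may treat the on/off style words differently.

```yaml
- true     # a boolean
- "true"   # a string
- 42       # a number
- "42"     # a string
- on       # read as the boolean true by YAML 1.1 parsers
- "on"     # a string - what you usually want when you mean the state "on"
```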
There's a reason I went through JSON to explain YAML. As I said, all JSON is also valid YAML. This allows for compact notation:
Name: Thomas Lovén
Email: thomasloven@example.com
Hobbies: [singing, woodworking, home automation]
Phones:
  Home: +46 (0)XX XXXXXX
  Work: +46 (0)XX XXXXXX
Children:
  - {Name: N, Age: 3}
  - {Name: H, Age: 1}
I mention this because you just might run into it sometime. I like to use it in my configurations to bring down the line count, but it's easy to go overboard and make the data hard to read instead. In the end it's a matter of taste.
Note that there are still no quotes. That's OK as long as you don't want commas, } or ] in the value.
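Further, some homeassistant options accept a single-line, comma-separated list without brackets:
Hobbies: singing, woodworking, home automation
Note that this is not a YAML feature. To the YAML parser the value above is one single string, and it is up to the program reading it to split it at the commas. Inside square brackets or curly braces, however, a string containing a comma must be quoted.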
Dictionaries can be merged using the merging operator: <<. For example:
a key: a value
b key: b value
<<: {c key: c value, d key: d value}
will be parsed as
a key: a value
b key: b value
c key: c value
d key: d value
and so will
a key: a value
b key: b value
<<:
  c key: c value
  d key: d value
In short, the << operator takes a dictionary as its value, and merges it into the parent dictionary.
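One detail worth knowing is what happens when the same key exists both in the parent and in the merged dictionary. As far as I understand the merge rules, keys written directly in the parent win and the merged value is dropped. A minimal sketch, reusing the made-up keys from above:

```yaml
a key: a value
<<: {a key: merged value, c key: c value}

# Expected to be parsed as:
#   a key: a value
#   c key: c value
```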
Merging is very convenient when used in combination with node anchors.
Node anchors are a way of saving a dictionary, and reusing it later:
my_dict: &my_dict
  a: 1
  b: 2
  c: 3
In this case &my_dict is NOT the value of my_dict, but a node anchor - as signified by the ampersand &.
The anchor saves the value for later reuse and can be recalled any number of times using an asterisk *:
a dictionary: &saved
  a: 1
  b: 2
  c: 3
another dictionary: *saved
a list:
  - *saved
  - *saved
This will be parsed as:
a dictionary:
  a: 1
  b: 2
  c: 3
another dictionary:
  a: 1
  b: 2
  c: 3
a list:
  - a: 1
    b: 2
    c: 3
  - a: 1
    b: 2
    c: 3
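The examples so far anchor dictionaries, but as far as I know an anchor can be attached to any value, including lists and plain strings. A small sketch, where all keys and anchor names (favourite hobbies, hobbies, default icon, icon, front door) are invented for the illustration:

```yaml
# An anchor on a list
favourite hobbies: &hobbies
  - singing
  - woodworking

Thomas:
  Hobbies: *hobbies     # the same two-item list

# An anchor on a single string
default icon: &icon mdi:home

front door:
  icon: *icon           # the string mdi:home
```

Keep in mind that the << operator still expects a dictionary, so a list or a string can be recalled with an asterisk but not merged.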
You can also merge an anchor if you want to add more entries to the dict:
base: &base
  a: 1
  b: 2
extended version:
  <<: *base
  c: 3
You should be able to guess how this parses out.
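In case you want to check your guess, this is what I would expect the parser to produce:

```yaml
base:
  a: 1
  b: 2
extended version:
  a: 1
  b: 2
  c: 3
```

Note that base itself is still there as an ordinary entry. The anchor only adds a name to it for later reuse.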
Now, for my final trick:
The problem with the above examples is that you need to put the definition somewhere. The YAML snippets above will have the dictionary entries "a dictionary" and "base" defined and set no matter what. Sometimes that's impractical, which is why you often see the following in homeassistant packages:
homeassistant:
  customize:
    package.node_anchors:
      common: &common
        key1: val1
        key2: val2
    sensor.my_sensor:
      <<: *common
      icon: mdi:temp
    sensor.another_sensor:
      <<: *common
      icon: mdi:home
The package.node_anchors entry in the customize dictionary contains a dictionary of stuff that is simply ignored. Anything you put there will have no effect on the package, so it's a great place to define anchors.
Another possibility is to put the definition in the first place it is used, and merge it immediately:
homeassistant:
  customize:
    sensor.my_sensor:
      <<: &common {key1: val1, key2: val2}
      icon: mdi:temp
    sensor.another_sensor:
      <<: *common
      icon: mdi:temp
Not all YAML parsers allow this, but it seems to work with homeassistant.