lethe.props – Key-value properties

Much of a node’s metadata is represented as properties (props for short), which are stored in property list files.

The aim is to store multiple pairs of a string key and a string value without having to adapt the data store layer for new node or site metadata.

This module preserves the order of properties that share a name, since that order may carry meaning in higher layers.

The API

class lethe.props.Properties

A map with string keys and values, supporting serialization and deserialization with the file format used for the store’s props files.

The collections.MutableMapping abstract base class is implemented by this class. Use its dictionary-like interface to access the keys and values stored.

For serialization reasons, this object is called a property list, although it is not kept as a list in memory. The property list has two persistent representations: a props file in the DVCS and key-value pairs for the node in the index.

Since there can be multiple properties with the same key, the value for a key is represented as a list of strings. Properties are saved in the order of the list values. Assigning a string to a key makes it the only value for that key; assigning a list replaces all values for the key. Remove properties by changing the list, not by deleting the whole key.

Assigning an object that is neither a string nor a list of strings raises a TypeError. This check is not performed when the list for a key is modified in place; in that case the TypeError is raised later, when the property list is serialized.

The constructor creates an empty property list.
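The assignment rules described above can be illustrated with a minimal Python 3 sketch. PropertiesSketch is a hypothetical stand-in written for this example, not the actual lethe.props.Properties implementation; it mirrors only the documented behavior of item assignment and in-place list changes.

```python
from collections.abc import MutableMapping


class PropertiesSketch(MutableMapping):
    """Hypothetical stand-in for lethe.props.Properties (not the real class)."""

    def __init__(self):
        # Start with an empty property list.
        self._values = {}

    def __getitem__(self, key):
        # The stored list itself is returned, so callers can edit it in place.
        return self._values[key]

    def __setitem__(self, key, value):
        if isinstance(value, str):
            # A single string becomes the only value for the key.
            self._values[key] = [value]
        elif isinstance(value, list) and all(isinstance(v, str) for v in value):
            # A list of strings replaces all values for the key.
            self._values[key] = list(value)
        else:
            raise TypeError("expected a string or a list of strings")

    def __delitem__(self, key):
        del self._values[key]

    def __iter__(self):
        return iter(self._values)

    def __len__(self):
        return len(self._values)


props = PropertiesSketch()
props["title"] = "Home"               # stored as ["Home"]
props["alias"] = ["Start", "Index"]   # replaces every value for "alias"
props["alias"].remove("Start")        # remove one property by editing the list
```

Because the list returned for a key is edited directly, nothing in this sketch validates values appended to it, which matches the caveat about unchecked in-place modification.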

dump()

Return the property list as a string.

load(data)

Load properties from the string data.

If this instance already contains properties with a given key, the values loaded from data follow the existing ones.
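The appending behavior of load can be sketched as follows in Python 3, assuming the one-tuple-literal-per-line format described in the on-disk format section. The helper load_into and the plain dict of lists are hypothetical simplifications for illustration; parsing with ast.literal_eval is an assumption, not a confirmed detail of the module.

```python
import ast


def load_into(store, data):
    """Append the pairs parsed from ``data`` to ``store`` (a dict of lists)."""
    for line in data.splitlines():
        if not line.strip():
            continue  # whitespace-only lines are ignored
        key, value = ast.literal_eval(line.strip())
        # Existing values stay first; loaded values follow them.
        store.setdefault(key, []).append(value)


store = {"spam": ["eggs"]}
load_into(store, "('spam', 'bacon')\n('foo', 'bar')\n")
print(store)  # {'spam': ['eggs', 'bacon'], 'foo': ['bar']}
```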

The on-disk format

There are two main design rules for this format: it must be diffable (i.e. a line-oriented text format that leaves little room for different files to represent the same objects) and it must support any key-value pair. It has two minor rules: it should be easy to implement in a Python program and it should be easy for a Python programmer to read and write using a text editor.

The file stores individual key-value pairs: the key string is repeated for each of its values, unlike the key-to-list mapping exposed by the module interface.

The lines are encoded as Python tuple literals containing two strings: the key and the value. In Python 2 only Unicode strings are used, leading to Python 2 and Python 3 rewriting property lists with different literals.

When writing the file, the lines are sorted by the keys (string objects, not literals). The sort must be stable to keep the order of multiple properties with the same key. The file can be read in any order: sorting is done only to produce a canonical order of properties, so that diffs do not show unchanged properties as removed and re-added. Lines containing only whitespace are ignored, so an empty file and a file containing a single empty line are both read as empty property lists.
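The writing rule above can be sketched in Python 3 as follows. dump_pairs is a hypothetical helper, and producing the tuple literals with repr() is an assumption that the text does not confirm. Under Python 3 the literals carry no u prefix, which illustrates the difference between Python versions mentioned earlier.

```python
def dump_pairs(pairs):
    """Serialize (key, value) pairs, one tuple literal per line."""
    # sorted() is guaranteed to be stable, so values that share a key
    # keep their relative order.
    ordered = sorted(pairs, key=lambda pair: pair[0])
    return "".join(repr((key, value)) + "\n" for key, value in ordered)


pairs = [("spam", "eggs"), ("foo", "bar"), ("spam", "bacon")]
print(dump_pairs(pairs), end="")
# ('foo', 'bar')
# ('spam', 'eggs')
# ('spam', 'bacon')
```

Sorting on the key alone, rather than on the whole pair, is what keeps eggs ahead of bacon here.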

Two obvious alternatives to this custom format are JSON and a line-oriented Unix configuration file format. They were not chosen because they either support more complex object graphs (which neither the relational index nor the user interface would handle) or do not support some possibly useful key or value contents. In any format, multiline property values require escaping to satisfy the one-property-per-line rule, so this format escapes all values.

Example property list file:

(u'foo', u'bar')
(u'spam', u'eggs')
(u'spam', u'bacon')

This file represents a property named foo with the value bar and two properties named spam with the values eggs and bacon. The same property list is read regardless of the position of the foo line, while swapping the spam lines changes the order of their values.
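The ordering claim can be checked with a small Python 3 sketch. read_props is a hypothetical reader based on ast.literal_eval (an assumption about how such a file could be parsed); Python 3 accepts the u prefix in string literals, so the example file parses unchanged.

```python
import ast


def read_props(text):
    """Parse a property list file into a dict mapping keys to value lists."""
    result = {}
    for line in text.splitlines():
        if line.strip():  # skip whitespace-only lines
            key, value = ast.literal_eval(line.strip())
            result.setdefault(key, []).append(value)
    return result


original = "(u'foo', u'bar')\n(u'spam', u'eggs')\n(u'spam', u'bacon')\n"
foo_moved = "(u'spam', u'eggs')\n(u'spam', u'bacon')\n(u'foo', u'bar')\n"
spam_swapped = "(u'foo', u'bar')\n(u'spam', u'bacon')\n(u'spam', u'eggs')\n"

print(read_props(original) == read_props(foo_moved))     # True
print(read_props(original) == read_props(spam_swapped))  # False
```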

Known properties

At least the following properties will be used in Lethe:

title
displayed node title
alias
title alias, like a redirect title in other wikis
name
like title, but used in the user interface only. This property is intended for small nodes that serve as building blocks of larger works and are not shown directly except in the editing interface.
license
URI of the license or URI of the license notice node. Todo: implement UI and editing for this property.

author
name of the node or site author
bib_author
name of the author of the document described by the node (the document pointed to by the url property or the binary attachment)
author_uri
URI to link when showing the above author name
index_revision
an identifier of the index schema used (a site property, stored in the index only, not the DVCS)
content_type
content type of node text, see the formatted text section for list of known values
binary_content_type
content type of the node binary attachment
main_node
a site property containing the UUID of the main node for the Web interface to show
url
a URL of a document that this node represents or describes
last_visited
a date in the W3C date and time format representing the last time the url property referred to the document relevant to the node, useful if the node represents a formal citation of the document
bookmark_folder
name of the browser folder, for nodes exported from browser bookmark databases that have folders, such as Mozilla’s Places or XBEL. This field is not used in the Web UI.

meta_description
description used in the HTML meta description tag
meta_keywords
keywords used in the HTML meta keywords tag, probably not useful; multiple values are joined by commas, and tags are used instead if no keywords are specified
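As an illustration of the last_visited property listed above, the following Python 3 sketch produces a timestamp in the complete date-and-time form of the W3C format (YYYY-MM-DDThh:mm:ssTZD). The helper w3c_datetime and the plain dict standing in for a property list are hypothetical, and the text does not say which granularity of the format Lethe actually writes.

```python
from datetime import datetime, timezone


def w3c_datetime(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ssTZD."""
    if moment.tzinfo is None:
        raise ValueError("a timezone-aware datetime is required")
    # Drop microseconds so the result has whole seconds only.
    return moment.replace(microsecond=0).isoformat()


visited = datetime(2012, 3, 4, 15, 30, tzinfo=timezone.utc)
props = {"last_visited": [w3c_datetime(visited)]}
print(props["last_visited"])  # ['2012-03-04T15:30:00+00:00']
```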
