lethe.dvcs – Distributed version control system interface

Todo

reorganize the documentation, explain how to implement support for another DVCS

Distributed version control interface of Lethe.

This module contains the base class representing a DVCS used to store the data. Only the data store module and subclasses implementing specific DVCSes should use it directly.

Why call it a ‘DVCS’ while the ‘distributed’ part is not needed for the design of Lethe? Centralized version control systems are broken and don’t support the functionality needed for decentralized projects like publicly readable wikis.

Todo

describe how the interface works

The repository is a collection of files. Each file has revisions which are atomic objects with revision identifier (a human-readable string), content and some metadata.

Revisions of a single file form a partial order. There is usually a single head revision representing the current version of the file.
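The partial order can be illustrated by finding the head revisions from parent links. This is a minimal sketch: the `parents` mapping and the `find_heads` function are hypothetical and not part of lethe.dvcs, which does not document how parent links are exposed.

```python
def find_heads(parents):
    """Return identifiers of revisions that are not a parent of any other.

    `parents` is a hypothetical mapping from a revision identifier to an
    iterable of its parent identifiers (empty for the first revision).
    """
    referenced = {parent for ids in parents.values() for parent in ids}
    # A head is a revision that no other revision builds upon.
    return sorted(rev for rev in parents if rev not in referenced)


# A linear history has exactly one head.
linear = {"r1": [], "r2": ["r1"], "r3": ["r2"]}
# Two revisions made from the same parent give two heads.
forked = {"r1": [], "r2": ["r1"], "r3": ["r1"]}
```

With the linear history the only head is `r3`; with the forked one both `r2` and `r3` are heads, which is the case an edit conflict would produce.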

Revision identifiers are unique for a given file, while the same one can be used for unrelated revisions of different files – check the revision object attributes to know if it is the same revision. If a file is removed and then added, its future changes can reuse the previous identifiers, since it is not the same file. You can skip this paragraph with any proper DVCS that uses global (or per repository) revision identifiers, unlike some experimental backends that might be supported by Lethe.

Todo

Think how commits of multiple files should be handled if the DVCS doesn’t support committing more than one file in a single revision. Probably the solution is to require support for revisions changing multiple files unless the version control system used is lethe.ext.dvcs.ephemural.

DVCSes usually support a single revision changing multiple files. Such revisions are fully supported by this interface.

Directories aren’t represented in this interface, but Repository.get_subtree_revisions can be used to get revisions of all files in a directory. Directory and file names are always separated by slashes in this interface, even if the local filesystem uses different directory separators.
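The slash convention can be sketched as a pair of conversion helpers. The names `to_repo_name` and `to_local_path` are hypothetical and not part of lethe.dvcs; the sketch assumes relative paths and takes the path module as a parameter so both separator conventions can be shown on any system.

```python
import ntpath
import posixpath


def to_repo_name(local_path, pathmod=posixpath):
    """Convert a relative local path to a slash-separated repository name.

    `pathmod` is the path module of the local filesystem, for example
    `os.path` in real use.
    """
    # Normalize first so redundant separators and "." elements vanish.
    normalized = pathmod.normpath(local_path)
    if pathmod.sep != "/":
        normalized = normalized.replace(pathmod.sep, "/")
    return normalized


def to_local_path(repo_name, pathmod=posixpath):
    """Convert a slash-separated repository name to a local relative path."""
    return pathmod.join(*repo_name.split("/"))
```

For example, `to_repo_name("pages\\Main\\index.txt", ntpath)` gives `pages/Main/index.txt`, and `to_local_path` reverses the conversion.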

Todo

we need branches for features like edit conflicts and unapproved revisions, think how to handle them in this interface

class lethe.dvcs.Repository(path)

Base class for interaction with a distributed version control system repository.

An instance represents a single repository.

Todo

which methods should be overridden?

The constructor opens an existing repository.

Parameters: path – the local filesystem path of the repository
Raises RepoNotFoundError: no supported repository found at the path

commit(new_contents, description, files_to_remove=(), author=None, date=None, parent=None)

Make a DVCS commit.

Parameters:
  • parent – parent revision identifier, by default the current revision
  • new_contents – a map from file name to its new content as a byte string
  • description – commit message
  • files_to_remove – an iterable of names of files that this commit removes
  • author – the user name and email string for commit author, None means the system default
  • date – datetime.datetime of the revision being made, usually left None to use the current date. This parameter is designed for uses like history import from other systems. Support for values other than None is not implemented yet.
Returns:

the new Revision object. This can be used to obtain the revision identifier which isn’t generally useful on its own.

Todo

exceptions, handling of edit conflicts

classmethod create(path)

Create a repository at path if it doesn’t exist.

The repo should be made to ignore the index.sqlite file and cache directory internally used by the data store layer.

do_commit(new_contents, description, files_to_remove, author, parent)

Implementation of commit. Every subclass must implement it.

Parameters:
  • new_contents – a map from file name to its new content
  • description – commit message
  • files_to_remove – an iterable of names of files to remove in this commit
  • author – author name followed by email in angle brackets
  • parent – parent revision identifier or None for first revision
Returns:

Revision object representing the new commit
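The split between commit and do_commit can be sketched as follows. This is not the actual Lethe code: `SketchRepository`, `MemoryRepository` and the stand-in `Revision` are hypothetical, the way defaults are filled in is an assumption based on the parameter descriptions above, and treating `content` as a snapshot of all files is one possible reading of the Revision documentation.

```python
class Revision:
    """Stand-in for lethe.dvcs.Revision with only the documented attributes."""

    def __init__(self):
        self.author = None
        self.content = None
        self.date = None
        self.repository = None
        self.revision_id = None


class SketchRepository:
    """Hypothetical base class showing how commit could wrap do_commit."""

    def commit(self, new_contents, description, files_to_remove=(),
               author=None, date=None, parent=None):
        if date is not None:
            # The documentation says non-None dates are not implemented yet;
            # raising here is an assumption about how that could surface.
            raise NotImplementedError("explicit commit dates are not supported")
        if author is None:
            author = self.get_default_author()
        if parent is None:
            parent = self.get_current_revision_id()
        return self.do_commit(new_contents, description,
                              list(files_to_remove), author, parent)


class MemoryRepository(SketchRepository):
    """Toy backend keeping a linear history in memory (illustration only)."""

    def __init__(self):
        self._revisions = []
        self._messages = []
        self._files = {}

    def get_default_author(self):
        return "Anonymous <anonymous@example.org>"

    def get_current_revision_id(self):
        return self._revisions[-1].revision_id if self._revisions else None

    def do_commit(self, new_contents, description, files_to_remove,
                  author, parent):
        for name in files_to_remove:
            self._files.pop(name, None)
        self._files.update(new_contents)
        revision = Revision()
        revision.author = author
        revision.repository = self
        revision.revision_id = str(len(self._revisions) + 1)
        # Bind each value now so later commits do not change old revisions.
        revision.content = {
            name: (lambda data=data: data)
            for name, data in self._files.items()
        }
        self._revisions.append(revision)
        self._messages.append(description)
        return revision
```

A caller only uses commit; the backend only implements do_commit and never has to repeat the default handling.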

get_content(file_name, revision=None)

Get the file content at the given revision, or at the head revision if none is given.

Parameters:
  • file_name – name of the file to find
  • revision – a Revision object or a revision identifier. Set it to get a version of the file other than the newest one.
Returns:

the file content as a bytestring or None if there is no such file at the given revision
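Accepting either a Revision object or an identifier can be sketched with a small helper. The names `resolve_revision_id`, `FakeRevision` and the `history` dictionary are hypothetical; a real backend would query its DVCS instead of a dictionary.

```python
class FakeRevision:
    """Stand-in for a Revision object; only revision_id is used here."""

    def __init__(self, revision_id):
        self.revision_id = revision_id


def resolve_revision_id(revision, head_id):
    """Return a revision identifier for a get_content-style argument."""
    if revision is None:
        return head_id
    # Revision objects carry the identifier in revision_id; anything else
    # is assumed to be an identifier already.
    return getattr(revision, "revision_id", revision)


def get_content(history, head_id, file_name, revision=None):
    """Look up file content in `history`, a hypothetical mapping from
    revision identifier to a dict of file name -> bytes."""
    files = history.get(resolve_revision_id(revision, head_id), {})
    # dict.get returns None when the file does not exist at that revision.
    return files.get(file_name)


history = {
    "r1": {"page": b"old"},
    "r2": {"page": b"new", "other": b"x"},
}
```

Here `get_content(history, "r2", "page")` reads the head version, while passing `"r1"` or `FakeRevision("r1")` reads the older one.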

get_current_revision_id()

Return the current head revision identifier.

Returns: current revision identifier as a string, or None if there are no revisions or the DVCS doesn’t have the concept of a single head revision

get_default_author()

Return the author name and email that is default for this repository.

get_revisions(file_name)

Iterate revisions of a file.

Revisions of a removed file are not returned. If the file was removed and added again, revisions from before it was most recently added are not returned either.

The order of revisions returned is unspecified.

Returns: an iterable of Revision instances.

get_subtree_revisions(path)

Iterate revisions containing objects with file names starting with path.

Revisions without any currently existing files at the path might be returned too.

The order of revisions returned is unspecified.

Depending on the DVCS used, path might have to contain only full path elements, with no substring matching being done on directory or file names.
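The difference between matching full path elements and plain substring matching can be sketched like this. The function `in_subtree` is hypothetical and only illustrates the stricter behaviour; it is not part of lethe.dvcs.

```python
def in_subtree(file_name, path):
    """Return True if file_name lies at or below path, comparing whole
    slash-separated elements instead of raw string prefixes."""
    if path in ("", "/"):
        # The empty path stands for the repository root.
        return True
    prefix = path.strip("/").split("/")
    elements = file_name.strip("/").split("/")
    return elements[:len(prefix)] == prefix
```

With this check `pages/Main/index` is inside `pages/Main`, but `pages/MainPage` is not, although the string `pages/MainPage` starts with `pages/Main`. Passing only full path elements as path gives the same result with either kind of backend.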

list_dir(path=u'')

Iterate names of files in the specified repo directory.

Absolute names (starting with path) are returned. Removed files might be returned too.

exception lethe.dvcs.RepoNotFoundError

Exception raised when opening a nonexistent repository.

class lethe.dvcs.Revision

A representation of a DVCS revision of a file.

Only Repository subclasses should construct such objects directly. The constructor makes an empty revision with no useful data.

For performance reasons, file content is not read when a Revision object is created. Instead the content dictionary maps to functions that open the file (which isn’t always a local filesystem file). File’s read and close methods should be used to obtain the file content.

author = None

File author name and email. None if unknown.

content = None

A map from file name to a callable that returns the file content as a byte string. All files that this revision contains are keys of this attribute.
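A mapping of this kind can be sketched for files stored in a local directory. The helpers `read_bytes` and `make_content_map` are hypothetical; a real backend would read from its DVCS storage, which isn’t always a local filesystem file.

```python
import functools
import os


def read_bytes(path):
    """Read a whole file as a byte string."""
    with open(path, "rb") as stream:
        return stream.read()


def make_content_map(root, file_names):
    """Map slash-separated names to callables that read the file on demand.

    Nothing is read here; each file is opened only when its callable
    is invoked.
    """
    return {
        name: functools.partial(
            read_bytes, os.path.join(root, *name.split("/")))
        for name in file_names
    }
```

Building the map is cheap even for large repositories, since the cost of reading is paid only for the files whose content is actually requested.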

date = None

Date of origin. None if unknown, which should not happen with a real DVCS.

repository = None

The Repository object containing this revision.

revision_id = None

Revision identifier.

DVCS backends

Extension modules in the lethe.ext.dvcs package implement lethe.dvcs.Repository subclasses for use with specific DVCSes. Currently there is one experimental implementation, lethe.ext.dvcs.ephemural, designed only for development of Lethe: it doesn’t store the history.

lethe.ext.dvcs.ephemural – DVCS implementation with no history

A DVCS backend with no history.

Don’t use it if you have data.

Each file has exactly one revision, always representing the version that is stored in the local filesystem. No extra files are added for version control. Every file is considered to have been created by its newest modification: while files have a creation date, there is no data to make a revision object with it.

This module is not designed to handle concurrent file modifications. An interruption or power loss may leave a partially committed state, but each modified file should be replaced atomically.
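Atomic replacement of a single file is commonly done by writing a temporary file in the same directory and renaming it over the target. The following is a minimal sketch of that technique, not necessarily what this module does; `atomic_write` is a hypothetical name.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data without exposing a partial file."""
    directory = os.path.dirname(path) or "."
    # The temporary file is created next to the target so that the final
    # rename stays within one filesystem.
    fd, temp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as stream:
            stream.write(data)
            stream.flush()
            os.fsync(stream.fileno())
        # os.replace overwrites the target in a single step.
        os.replace(temp_path, path)
    except BaseException:
        # Leave the original file untouched and drop the temporary one.
        try:
            os.unlink(temp_path)
        except FileNotFoundError:
            pass
        raise
```

A commit touching several files would call this once per file, which is why an interruption can leave some files updated and others not.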

The name comes from Ephemural, an antagonist in Twokinds who made Trace Legacy forget his past. It also refers to the lack of persistence in this implementation.

Those who cannot remember the past are condemned to repeat it.

– George Santayana

class lethe.ext.dvcs.ephemural.Repository(path)

lethe.dvcs.Repository implementation.

lethe.ext.dvcs.git – storage in a git repository

A DVCS backend accessing a Git repository using the Dulwich implementation of Git file formats.

This module will evolve into a working and persistent data storage backend for Lethe wikis with support for history and sharing changes.

The data is stored as a bare repository, with neither a working copy nor an index.

Todo

rework the lethe.dvcs interfaces so this can work

class lethe.ext.dvcs.git.Repository(path)

lethe.dvcs.Repository implementation.