lethe.node_import – Import nodes from other sources

This module supports importing data in other formats (handled by extension modules) as nodes to be added to a store.

With various node properties, Lethe can be used to organize your browser bookmarks or the bibliography of your article. With this module and node importer extensions, you can add new entries to such a site directly from the browser files or BibTeX databases.

Why name it ‘node importers’? Importing whole wikis with their histories should be implemented in the future, but this interface won’t be appropriate for that. Here, complete nodes are imported with no previous history.

To implement your own importer extension, add a module to the lethe.ext.node_import package (and update __all__ there) with a subclass of Importer or one of its abstract subclasses defined in this module. To use the importers, use get_importer, types or, for complex tasks, ImporterRegistry.

class lethe.node_import.Importer(store)

Base class for node importers.

An instance handles imports into a single store, but it does not save the nodes.

Use get_importer instead of instantiating these objects.

Subclasses must implement nodes_from_path and define name_pattern or implement supports_file for more complex matching.

name_pattern = None

A string or a compiled regular expression describing the names of files that this importer supports. A string is compared with the file name without directory names; a regular expression is matched against the whole path given.
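
The two kinds of name_pattern can be illustrated with a minimal sketch. The helper name_matches is hypothetical and not part of lethe; in particular, using search() rather than match() for the regular expression is an assumption.

```python
import os.path
import re


def name_matches(name_pattern, path):
    """Return True if path matches name_pattern (hypothetical helper)."""
    if name_pattern is None:
        return False
    if isinstance(name_pattern, str):
        # A plain string is compared with the file name only.
        return os.path.basename(path) == name_pattern
    # Otherwise assume a compiled regular expression; it sees the whole path.
    # Using search() instead of match() is an assumption of this sketch.
    return name_pattern.search(path) is not None


by_name = name_matches("places.sqlite", "/home/user/.mozilla/places.sqlite")
by_regex = name_matches(re.compile(r"\.rec$"), "/tmp/bookmarks.rec")
no_match = name_matches("places.sqlite", "/tmp/bookmarks.rec")
```

Importers whose files cannot be recognized by name alone would override supports_file instead.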

nodes_from_path(path)

Iterate new uncommitted nodes for entries of the import file at path.

store = None

lethe.datastore.Store instance to which the imported nodes belong.

classmethod supports_file(path)

Return True if the file at path can be imported by this class.

type_name = None

A human-readable name for this import format.

class lethe.node_import.TextImporter(store)

An importer that can import nodes from a string.

Subclasses must implement nodes_from_string.

nodes_from_string(string)

Iterate new uncommitted nodes for entries in the string.
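
As a rough sketch of what a TextImporter subclass might look like, the example below uses stand-in base classes so that it runs without lethe. The line-per-node format, the LineImporter class, the delegation from nodes_from_path to nodes_from_string, and the use of dicts in place of real uncommitted nodes are all assumptions made for illustration.

```python
import re


class Importer(object):
    """Stand-in for lethe.node_import.Importer (not the real class)."""
    name_pattern = None
    type_name = None

    def __init__(self, store):
        self.store = store

    def nodes_from_path(self, path):
        raise NotImplementedError


class TextImporter(Importer):
    """Stand-in for lethe.node_import.TextImporter (not the real class)."""

    def nodes_from_path(self, path):
        # Assumption: the file is read and handed to nodes_from_string.
        with open(path) as f:
            return self.nodes_from_string(f.read())

    def nodes_from_string(self, string):
        raise NotImplementedError


class LineImporter(TextImporter):
    """Hypothetical format: one node per non-empty line."""
    name_pattern = re.compile(r"\.lines$")
    type_name = "lines"

    def nodes_from_string(self, string):
        for line in string.splitlines():
            line = line.strip()
            if line:
                # A real importer would create an uncommitted node in
                # self.store; a dict stands in for it here.
                yield {"text": line}


nodes = list(LineImporter(store=None).nodes_from_string("first\n\nsecond\n"))
```

In a real extension the subclass would live in a module of the lethe.ext.node_import package and derive from the real base classes.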

class lethe.node_import.XMLImporter(store)

An importer that imports nodes from an XML document.

Subclasses must implement nodes_from_xml.

nodes_from_xml(tree)

Iterate new uncommitted nodes for entries in an XML document.

Parameters: tree – an lxml.etree.ElementTree instance representing the document with entries to import

class lethe.node_import.SQLiteImporter(store)

An importer supporting SQLite database files as input.

Subclasses must implement nodes_from_db.

nodes_from_db(con)

Iterate new uncommitted nodes for entries in an SQLite database.

Parameters: con – an sqlite3.Connection instance for a connection to the input database

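
The following sketch shows how a subclass might implement nodes_from_db. It uses a stand-in base class instead of the real SQLiteImporter; the NotesImporter class, the notes table, the way nodes_from_path opens the connection, and the dicts used in place of nodes are assumptions.

```python
import sqlite3


class SQLiteImporter(object):
    """Stand-in for lethe.node_import.SQLiteImporter (not the real class)."""

    def __init__(self, store):
        self.store = store

    def nodes_from_path(self, path):
        # Assumption: the file is opened here and passed to nodes_from_db.
        con = sqlite3.connect(path)
        try:
            for node in self.nodes_from_db(con):
                yield node
        finally:
            con.close()

    def nodes_from_db(self, con):
        raise NotImplementedError


class NotesImporter(SQLiteImporter):
    """Hypothetical importer for a database with a notes(title, body) table."""

    def nodes_from_db(self, con):
        query = "SELECT title, body FROM notes ORDER BY title"
        for title, body in con.execute(query):
            # A dict stands in for a new uncommitted node.
            yield {"title": title, "text": body}


# Build a small in-memory database to exercise the importer.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE notes (title TEXT, body TEXT)")
con.executemany("INSERT INTO notes VALUES (?, ?)",
                [("b", "second"), ("a", "first")])
notes = list(NotesImporter(store=None).nodes_from_db(con))
con.close()
```
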
class lethe.node_import.ImporterRegistry

Manage node importer extensions.

add_default_importers()

Add all importers from lethe.ext.node_import.

add_importer(importer)

Register an importer.

get_importer(store, path, type_name=None, classes=(Importer,))

Return an Importer instance for use with the file at path.

This function does not import the nodes. To do that, call Importer.nodes_from_path or a method of its subclass on the returned object.

Parameters:
  • store – lethe.datastore.Store instance for the imported nodes
  • path – input file path
  • type_name – optional type of the importer (from Importer.type_name in IMPORTER_TYPES) or None for one to be chosen automatically from the file name
  • classes – a sequence of classes that the chosen importer must be an instance of; useful e.g. if you require the TextImporter.nodes_from_string interface
Returns:

an Importer instance that can handle the specified file or None
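
The selection described above can be sketched with a small stand-in registry. This is not the real ImporterRegistry: the Registry class, the suffix attribute (used here in place of name_pattern), and the decision to skip file name matching when type_name is given are assumptions.

```python
class Importer(object):
    """Stand-in for lethe.node_import.Importer (not the real class)."""
    type_name = None
    suffix = None  # hypothetical; the real class uses name_pattern

    def __init__(self, store):
        self.store = store

    @classmethod
    def supports_file(cls, path):
        return cls.suffix is not None and path.endswith(cls.suffix)


class TextImporter(Importer):
    """Stand-in for lethe.node_import.TextImporter."""


class PlacesLikeImporter(Importer):
    type_name = "Places"
    suffix = ".sqlite"


class RecImporter(TextImporter):
    type_name = "recfile"
    suffix = ".rec"


class Registry(object):
    """Hypothetical registry mirroring the documented interface."""

    def __init__(self):
        self._importers = []

    def add_importer(self, importer):
        self._importers.append(importer)

    @property
    def types(self):
        return tuple(cls.type_name for cls in self._importers)

    def get_importer(self, store, path, type_name=None, classes=(Importer,)):
        for cls in self._importers:
            # The importer must derive from every required class.
            if not all(issubclass(cls, required) for required in classes):
                continue
            if type_name is not None:
                # Assumption: an explicit type bypasses file name matching.
                if cls.type_name != type_name:
                    continue
            elif not cls.supports_file(path):
                continue
            return cls(store)
        return None


registry = Registry()
registry.add_importer(PlacesLikeImporter)
registry.add_importer(RecImporter)

automatic = registry.get_importer(None, "bookmarks.rec")
text_only = registry.get_importer(None, "places.sqlite", classes=(TextImporter,))
explicit = registry.get_importer(None, "notes.txt", type_name="recfile")
```

Here automatic and explicit are RecImporter instances, while text_only is None because the only importer matching the file name does not provide the TextImporter interface.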

types

A sequence of Importer.type_name for supported importers.

lethe.node_import.get_importer

ImporterRegistry.get_importer for the default registry providing all lethe.ext.node_import extensions.

lethe.node_import.types = (u'Places', u'recfile')

ImporterRegistry.types for the default registry providing all lethe.ext.node_import extensions.

Importer extensions

The lethe.ext.node_import package contains modules implementing specific importers. Don’t import them directly: lethe.node_import.get_importer should find an appropriate importer for the requested file.

lethe.ext.node_import.places – Mozilla Places importer

Node importer for Mozilla places.sqlite.

See <https://developer.mozilla.org/en-US/docs/The_Places_database> for documentation of the database format used. Only bookmarks are imported; history and annotations are not used. Favicons are not imported.

Each bookmark is imported into a single node. Metadata is represented using props and descriptions are stored as node text.
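
To give an idea of how bookmarks can be read from a Places database, here is a minimal sketch that runs against an in-memory database with a heavily reduced schema. It is not the query used by this module. It relies on the Places convention that moz_bookmarks.type is 1 for bookmarks and that moz_bookmarks.fk refers to moz_places.id; the bookmarks function, the property names, and the dict used in place of a node are assumptions.

```python
import sqlite3

# A reduced stand-in for places.sqlite; the real tables have more columns.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE moz_places (id INTEGER PRIMARY KEY, url TEXT, title TEXT);
CREATE TABLE moz_bookmarks (id INTEGER PRIMARY KEY, type INTEGER,
                            fk INTEGER, parent INTEGER, title TEXT);
INSERT INTO moz_places VALUES (1, 'https://www.gnu.org/', 'GNU');
INSERT INTO moz_bookmarks VALUES (1, 2, NULL, 0, 'Root');
INSERT INTO moz_bookmarks VALUES (2, 1, 1, 1, 'The GNU Operating System');
""")


def bookmarks(con):
    """Yield one dict per bookmark (hypothetical helper)."""
    query = """
        SELECT b.title, p.url, f.title
        FROM moz_bookmarks AS b
        JOIN moz_places AS p ON p.id = b.fk
        LEFT JOIN moz_bookmarks AS f ON f.id = b.parent
        WHERE b.type = 1
        ORDER BY b.id
    """
    for title, url, folder in con.execute(query):
        # Metadata goes into props; the text is left empty in this sketch.
        yield {"props": {"title": title, "url": url, "folder": folder},
               "text": ""}


found = list(bookmarks(con))
con.close()
```
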

Todo

The functionality of this module is limited to what the author needed. With some understanding of the format used, it could be made more generic and useful for other ways of using bookmarks.

class lethe.ext.node_import.places.PlacesImporter(store)

Node importer for places.sqlite.

lethe.ext.node_import.recfile – recfile importer

Node importer for a custom recfile record type.

The input syntax is the same as supported by GNU recutils.

Bookmarks are read from a record set compatible with the following type:

%rec: Bookmark
%key: URL
%mandatory: Title Folder
%unique: Title
%unique: Folder
%type: Visited date
%type: Tag line
%type: Language line
%sort: Title

Records like this one can be imported:

URL: https://www.gnu.org/philosophy/bsd.html
Title: BSD License Problem - GNU Project - Free Software Foundation (FSF)
Folder: Root
Tag: bsd
Tag: free software
Tag: licensing
Visited: 2012-08-30 14:10:00.669158

For simplicity, the parser skips record descriptors and uses all records that have the URL field. The Description field is used for node text. All unknown fields are imported as properties with lowercased keys. Visited dates are supported only in the above format and are assumed to be UTC.
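
The rules above can be sketched as a small parser. This is not the module's actual parser: the function names, the dict used in place of a node, and storing every property as a list of values are assumptions, and continuation lines of the recfile syntax are not handled.

```python
from datetime import datetime, timezone

VISITED_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_records(text):
    """Yield a list of (field, value) pairs per blank-line-separated record."""
    record = []
    for line in text.splitlines():
        if not line.strip():
            if record:
                yield record
                record = []
        elif line.startswith("#"):
            continue  # comment line
        elif ":" in line:
            # Split at the first colon only, so URLs keep theirs.
            field, _, value = line.partition(":")
            record.append((field.strip(), value.strip()))
    if record:
        yield record


def bookmarks(text):
    """Yield one dict per record that has a URL field (hypothetical helper)."""
    for record in parse_records(text):
        names = [name for name, _ in record]
        if any(name.startswith("%") for name in names):
            continue  # record descriptor, skipped
        if "URL" not in names:
            continue
        node = {"text": "", "props": {}}
        for name, value in record:
            if name == "Description":
                node["text"] = value
            elif name == "Visited":
                visited = datetime.strptime(value, VISITED_FORMAT)
                # Visited dates are assumed to be UTC.
                node["props"]["visited"] = visited.replace(tzinfo=timezone.utc)
            else:
                # Fields may repeat (e.g. Tag), so collect values in a list.
                node["props"].setdefault(name.lower(), []).append(value)
        yield node


SAMPLE = """\
%rec: Bookmark
%key: URL

URL: https://www.gnu.org/philosophy/bsd.html
Title: BSD License Problem
Folder: Root
Tag: bsd
Tag: licensing
Visited: 2012-08-30 14:10:00.669158

Title: no URL, so this record is skipped
"""

imported = list(bookmarks(SAMPLE))
```
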

Todo

This code should use GNU recutils Python bindings instead of the custom parser that it has now.

class lethe.ext.node_import.recfile.RecfileImporter(store)

Importer for recfiles.