Source code

Important

The Python API is not guaranteed to be stable. Egaia is distributed with the intention that it will be used primarily as a command-line application; as such, all functions should be considered internal.

egaia.egaia module

Main dispatching module for the command-line interface.

egaia.egaia.tracefunc(frame, event, arg, indent=[0])[source]

Trace function calls, for debugging purposes.

egaia.egaia_archivedotorg module

egaia.egaia_archivedotorg.listVideos(uuid=None, new=True)[source]

Get a list of videos from the current collection that have not been uploaded, and matching the uuid if specified. If new is False, we will not filter out the previously uploaded items.

egaia.egaia_archivedotorg.post(items, upload=True, faststart=False, collection='opensource_movies', dry_run=False)[source]

Upload items to the Internet Archive. Takes a list of metadata dictionaries, corresponding to csv rows, as provided by listVideos(). If upload is not True, the remote metadata will be updated but items will not be re-uploaded.

egaia.egaia_archivedotorg.updateDocx(uuid)[source]

Populate the “remote_embed_url” metadata field for an item.

egaia.egaia_bag module

egaia.egaia_bag.init(description=None, title=None, new=True)[source]

Wrap the newBag function in the bagit utility, and create an empty csv file for user metadata entry.

egaia.egaia_bag.loadBag()[source]

Load bag-info.txt and return a dictionary of its contents.

egaia.egaia_bag.newBag(description='None provided', title='None provided')[source]

Check if a bag already exists in the current path, and if not, create a new one.

egaia.egaia_bag.updateBag()[source]

Update bag manifests

egaia.egaia_bag.updateDocx(docinfo)[source]

Update the collection docx file

egaia.egaia_bag.updateFindingAid()[source]

Rebuild the docx finding aid based on bag-info.txt.

egaia.egaia_bag.validateBag()[source]

Validate a bag

egaia.egaia_collage module

egaia.egaia_collage.getFilenames(uuids=None)[source]

Create a list of filepaths for items in the current collection that have thumbnail images available.

egaia.egaia_collage.getFilenamesFromUuids(uuids)[source]

Retrieve a list of filenames from a list of UUIDs.

egaia.egaia_collage.mkcollage(filenames=None, prefix='')[source]

Generate a collage out of a list of images.

egaia.egaia_collage.resize(src, target, size, border=None)[source]

Create a resized image for the collage.

egaia.egaia_config module

egaia.egaia_config.dumpConfig(cfg_path)[source]

Write the currently set configuration values to the file located at cfg_path.

egaia.egaia_config.getConfig(section, key, boolean=False)[source]

Get the value of a particular configuration setting

egaia.egaia_config.init(path)[source]
egaia.egaia_config.loadConfig()[source]

Load configuration values, and return a ConfigParser object.

egaia.egaia_config.printConfig(section=None)[source]

Print a list containing configuration values

egaia.egaia_config.setConfig(cfg_path, section, key, value)[source]

Set the value of a particular configuration setting

egaia.egaia_config.updateConfig(cfg_path, section='archive')[source]

Update the specified configuration file located at cfg_path, providing a user prompt with the key and output prefilled with the current value.

egaia.egaia_derive module

egaia.egaia_derive.findRules(fn, frame, force, update)[source]

Find the matching processing rule(s) for a given filename

egaia.egaia_derive.makeDeriv(rule, mtype, fn, UUID, tmpFile, outDir, outFile, basename, N)[source]

Create a derivative. This function contains the processing rules and commands.

egaia.egaia_derive.process(uuid=None, frame='0', force=False, update=False)[source]

Generate a list of files to process

egaia.egaia_derive.run(arguments)[source]

Run an external command. Do nothing if the “dry-run” flag is set.

egaia.egaia_docx module

egaia.egaia_docx.collectionMetadata()[source]

Provide a json string with all existing metadata

egaia.egaia_docx.csv2docx(csv_file, force=True)[source]

Convert csv to docx files

egaia.egaia_docx.docx2csv(docx_files, csv_file)[source]

Convert docx to csv

egaia.egaia_docx.docx2json(docx_files, json_file=None)[source]

Convert docx to json

egaia.egaia_docx.filter_files(pattern)[source]

Return a list of metadata files from a globbing pattern.

egaia.egaia_docx.getCoreFields(filtered=True)[source]

Return an ordered list of core metadata field labels.

egaia.egaia_docx.loadCsv(csv_path)[source]

Load metadata csv using the DictReader. Return a list of dicts.

egaia.egaia_docx.parseDocx(docx_file)[source]

Process a docx file and return a dict of metadata, or None.

egaia.egaia_docx.update(update_existing=False)[source]

Add new metadata files to the collection.

egaia.egaia_docx.writeDocx(filename, append=None, write=None, out_file=None, force=False)[source]

Write a dict to docx.

egaia.egaia_figs module

Place a hyperlink within a paragraph object. This code is from python-docx.

egaia.egaia_figs.writeDocx(uuids)[source]

Append figures corresponding to a list of uuids to a template document.

egaia.egaia_list module

egaia.egaia_list.applyExclude(name, exclude)[source]

Return True or False to indicate whether a given filename matches a list of exclusion patterns.

egaia.egaia_list.countItems(filepath=None)[source]

Provide a count of the items in a collection.

egaia.egaia_list.decodeName(name)[source]
egaia.egaia_list.listDirs(filepath=None, filter_type=None, uuid=None, exclude=None)[source]

List directories that can be tagged

egaia.egaia_list.listFiles(filepath=None, filter_type=None, uuid=None, exclude=None, include_dirs=False)[source]

List the files in a collection.

egaia.egaia_list.listItems(filepath=top)[source]

List the UUIDs corresponding to the items in a collection.

egaia.egaia_list.matchFilter(name, filter_type)[source]

Return True or False to indicate whether a given filename matches a filter.

egaia.egaia_list.setFilepath()[source]

egaia.egaia_log module

egaia.egaia_log.logRename(original, modified)[source]

Log renames so that we can undo changes later

egaia.egaia_log.writeLog(line)[source]

Write a line to the log file

egaia.egaia_make module

egaia.egaia_make.copyDerivatives(derivs, destdir, force=False)[source]

Copy derivatives to the “pub” directory for distribution.

egaia.egaia_make.generateList(items)[source]

Create a responsive grid of items with thumbnails.

egaia.egaia_make.generatePlainList(items)[source]

Generate a list of items, not as a grid.

egaia.egaia_make.getDestDir(uuid, type='item')[source]

create destination dir for items

egaia.egaia_make.getPreview(uuid, preview_type='item')[source]

Retrieve an html preview for an item or collection.

egaia.egaia_make.getThumb(uuid, size='thumb')[source]

Identify and return the base filename of a thumb image from a list of derivatives.

egaia.egaia_make.init()[source]

Initialize

egaia.egaia_make.loadJson(uuid, type)[source]

Load json metadata for an item.

egaia.egaia_make.makeAlias(alias, target)[source]

Generate an alias (redirect page) for an item. Target should be a UUID; alias should be a filename.

egaia.egaia_make.makeCollectionJson(delete=False, outdir=None)[source]

Create json representations for all items in the current collection.

egaia.egaia_make.makeCollectionPage(uuid, delete=False)[source]

Generate an html description page for the current collection.

egaia.egaia_make.makeCollectionPreview(uuid)[source]

Generate the preview image and brief description for the collection that will be included in the list or indexes of collections.

egaia.egaia_make.makeDownloadsList(derivs)[source]

Prepare an html list of download links for derivatives

egaia.egaia_make.makeEmbedTag(derivs, page)[source]

Prepare an embeddable media tag from a list of derivatives. Return a full embed object tag, with url reference to the media file(s). This is usable in an item description page. The “page” parameter should be the uuid of the item.

egaia.egaia_make.makeErrorPages()[source]

Generate 404 and other error pages for the archive.

egaia.egaia_make.makeHomePage()[source]

Generate a home page for the archive.

egaia.egaia_make.makeIndexes()[source]

Regenerate the indexes for the entire archive.

egaia.egaia_make.makeItemPage(uuid, force=False, delete=False, nolinks=False)[source]

Generate an html page for an item.

egaia.egaia_make.makeItemPreview(uuid)[source]

Generate an html preview for an item, to be included in indexes or lists.

egaia.egaia_make.makeLocalEmbed(embed_type=None, src=None, mtype=None)[source]

Assemble an html5 audio or video embed tag.

egaia.egaia_make.makeMetadataTable(data, nolinks=False)[source]

Create an HTML definition list from a dict containing item metadata (normally via json), or the values in bag-info.txt.

Return pagination links

egaia.egaia_make.makePreview(uuid, preview_type='item', delete=False)[source]

Create an html preview for an item or collection, to be included in indexes or lists.

egaia.egaia_make.makeRemoteEmbed(url)[source]

Prepare an embeddable video tag from a URL. Return an iframe. This is usable in an item description page.

egaia.egaia_make.makeStaticDir(force=False)[source]
egaia.egaia_make.makeTags(tags, tagtype='tags')[source]

Generate an HTML list of keyword tags, from an input list.

egaia.egaia_make.makeTextEmbed(fn)[source]

Embed a text or html file.

egaia.egaia_make.updateItemIndex(uuid, delete=False)[source]

Update the keyword indexes based on json metadata for a given item.

egaia.egaia_make.writeHtml(html_content, destdir, filename=u'index-en.html')[source]

Write html file

egaia.egaia_make.writeIndex(out, target, title, subtitle, paginate_size=24)[source]

Prepare an index page.

egaia.egaia_meta module

egaia.egaia_meta.getDate(filename)[source]

Retrieve the file modification date, for dc_date. This is not necessarily reliable but is sometimes useful.

egaia.egaia_meta.getDimensions(filename)[source]

Retrieve the dimensions of images, for dc_extent

egaia.egaia_meta.getDuration(filename)[source]

Retrieve the duration of audiovisual media, for dc_extent

egaia.egaia_meta.getMetadata(filename, existing=None, restrict=False)[source]

Extract all the metadata for a given file. Pass the “existing” parameter in order to merge new metadata with an existing record. The existing data will not be overwritten. The “restrict” parameter limits updating to DCTERMS.modified and DCTERMS.extent, which are always derived from the file on disk, and can therefore safely be updated without losing user input.

egaia.egaia_meta.getMtype(filename)[source]

Retrieve the MIME Type, for dc_format

egaia.egaia_meta.getSize(filename)[source]

Retrieve the file size in bytes, for dc_extent

egaia.egaia_meta.getType(filename)[source]

Identify the DCMI Type, for dc_type

egaia.egaia_parsefn module

egaia.egaia_parsefn.getBasename(filepath)[source]

Return the base part of a tagged filename.

egaia.egaia_parsefn.getExtension(filepath)[source]

Return the extension for a filename.

egaia.egaia_parsefn.getFormat(filepath)[source]

Return the format of a derivative file.

egaia.egaia_parsefn.getUuid(filepath)[source]

Return the uuid for a tagged filename, if available, or None.

egaia.egaia_parsefn.parseFilename(path)[source]

Parse a tagged file and return a triple: (basename, uuid, extension).

egaia.egaia_parsefn.separate(filepath)[source]

Search for the UUID in a string representing a filepath, and separate

egaia.egaia_root module

egaia.egaia_root.get_root(path=None)[source]

Return the root of the current collection.

egaia.egaia_roughcut module

egaia.egaia_roughcut.cat(clip_list)[source]

Concatenate clips from the file clip_list. The format is: #, reel, clip, inpoint, outpoint, title, caption, notes

egaia.egaia_roughcut.get_dict(clip_list)[source]

Read a csv file and return a dict.

egaia.egaia_roughcut.get_thumb(clipname)[source]

Extract a thumbnail image from the clip.

egaia.egaia_roughcut.intertitle(text, clipdir)[source]

Create an intertitle clip and return the clip name.

egaia.egaia_roughcut.make_finding_aid(clip_list, title='Video finding aid')[source]

Generate a finding aid with thumbnails and captions.

egaia.egaia_roughcut.processLine(clipdir, reel, clip, inpoint, outpoint, **kwargs)[source]

Parse a line containing a clip name.

egaia.egaia_sanitize module

egaia.egaia_sanitize.getSanitizedName(filepath)[source]

Create a sanitized filename

egaia.egaia_sanitize.makeSlug(tag)[source]

Create a latin-encoded keyword tag. This allows us to use a standard and predictable system for two-way mapping of URL/filenames and keywords in various character sets.

egaia.egaia_sanitize.sanitize(filepath)[source]

Sanitize a filename

egaia.egaia_sanitize.sanitizeFiles()[source]

Sanitize ALL the files in a collection

egaia.egaia_serve module

class egaia.egaia_serve.MyTCPServer(server_address, RequestHandlerClass, bind_and_activate=True)[source]

Bases: SocketServer.TCPServer

server_bind()[source]
egaia.egaia_serve.serve(port)[source]

egaia.egaia_setup module

egaia.egaia_setup.setBinaries()[source]

Locate binary files required for format conversions. Add paths to the configuration files, and list missing binaries on stdout. If a binary is not found automatically, the path in the configuration file is left intact.

egaia.egaia_setup.setup(interactive=True)[source]

Run all the setup commands.

egaia.egaia_tag module

egaia.egaia_tag.getTaggedName(filepath, UUID=None)[source]

Return a filename tagged with uuid

egaia.egaia_tag.mkuuid()[source]

Generate a new uuid.

egaia.egaia_tag.tagDirs()[source]

Tag the directories ending in ”.dir” or ”.vclips”.

egaia.egaia_tag.tagFile(filepath, UUID=None)[source]

Rename a file, tagged with a uuid

egaia.egaia_tag.tagFiles()[source]

Tag ALL the files in a collection

egaia.egaia_tag.untagFile(filepath)[source]

Rename a file, with uuid tag removed

egaia.egaia_tag.untagFiles()[source]

UNTAG ALL the files in a collection

egaia.strings module

These are css, javascript, and similar strings for use in html page generation. They are placed here in order to maintain the legibility of the Python code in the modules that use them.

egaia.utils module

Multipurpose utilities for egaia.

egaia.utils.byteSize(num, suffix='B')[source]

Give a human-readable representation of a filesize.

egaia.utils.current_time()[source]
egaia.utils.docx2html(document)[source]

Convert a docx document to a standalone file.

egaia.utils.docx2str(docx_file)[source]

Convert a docx document to a fragment string.

egaia.utils.fmtTime(secs)[source]

Convert seconds to hh:mm:ss

egaia.utils.isotime(timestamp)[source]
egaia.utils.makeDocument(uuid)[source]

Return a python-docx Document instance.

egaia.utils.md2html(text, meta=False)[source]

Convert Markdown to HTML.

egaia.utils.readDocument(docx_file)[source]

Return paragraph and table instances from a docx document

egaia.utils.rlinput(prompt, prefill='')[source]

Retrieve user input via interactive prompt

egaia.utils.rm(path)[source]

Remove a file or directory.

egaia.utils.run(command)[source]

Run a system command.

egaia.utils.truncate(description, length=200)[source]

Make a truncated description for index pages. The description truncates on the last full stop under the length limit, so very long descriptions should ideally be broken into several sentences.

egaia.version module

Get version identification from git

See the documentation of get_version for more information

egaia.version.get_version(pep440=False)[source]

Tracks the version number.

pep440: bool
When True, this function returns a version string suitable for a release as defined by PEP 440. When False, the githash (if available) will be appended to the version string.

The file VERSION holds the version information. If this is not a git repository, then it is reasonable to assume that the version is not being incremented and the version returned will be the release version as read from the file.

However, if the script is located within an active git repository, git-describe is used to get the version information.

The file VERSION will need to be changed by manually. This should be done before running git tag (set to the same as the version in the tag).