Installation

Warning

Egaia is an experimental program under active development. The latest code is generally usable, but can be expected to have a few bugs. Some features or commands may change in the next release. Please use at your own risk! Always be sure to keep backup copies of your data.

Requirements

Egaia should run on any operating system, though it has been tested thoroughly only on Linux.

To run egaia you will need Python 2.7. If you want egaia to create derivatives, you will also need several file-conversion utilities: ImageMagick, Inkscape, FFmpeg, and LibreOffice.

If you are using Linux or OS X, you probably already have Python installed. Otherwise you can obtain it from the Python Software Foundation.

The binary dependencies are all available for download for various systems. On Debian/Ubuntu you can install them as follows:

$ sudo apt-get install inkscape libreoffice imagemagick ffmpeg \
  libffi-dev wkhtmltopdf wget

The egaia setup command (see below) will attempt to locate these binaries and write their locations to the configuration file.
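
The lookup that the setup command performs can be approximated as follows. This is a minimal sketch, not egaia's actual code: the locate_binaries helper and the mapping from cmd_* keys to executable names are assumptions based on the [system] settings shown later in this section. It uses shutil.which from the Python 3 standard library; on Python 2.7, distutils.spawn.find_executable serves a similar purpose.

```python
import shutil

# Hypothetical mapping from configuration keys to executable names.
# The key names follow the [system] section; the executable names are assumed.
BINARIES = {
    "cmd_convert": "convert",
    "cmd_ffmpeg": "ffmpeg",
    "cmd_inkscape": "inkscape",
    "cmd_libreoffice": "libreoffice",
    "cmd_wget": "wget",
}


def locate_binaries(binaries=BINARIES):
    """Return {config_key: path found on PATH, or "" when not found}."""
    return {key: shutil.which(name) or "" for key, name in binaries.items()}


# Print the results in the same "key = value" layout as the configuration file.
for key, path in sorted(locate_binaries().items()):
    print("{:<32}= {}".format(key, path))
```

An empty value in the output would indicate a utility that is not on the search path, in which case the corresponding cmd_* setting would need to be filled in by hand.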

Installing egaia

The easiest way to get and install egaia is with pip. The following command will install the current development version:

$ pip install git+http://mcdrc.org/git/egaia2.git [ --upgrade ]

This command will automatically download and install egaia along with the necessary Python libraries.

Configuration

Once you have the program successfully installed, run the following command to generate a configuration file:

$ egaia setup

You will be prompted for the name of your archive, your identity, and the locations on your computer where the log file and HTML output should be saved. By default the local configuration file is stored in /home/USER/.config/egaia/shared.cfg. You can use a “~” in the configuration settings to refer to your home directory.
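
How a “~” in a path setting might be resolved is illustrated below. This is a sketch only: expand_setting is a hypothetical helper, and whether egaia calls os.path.expanduser internally is an assumption.

```python
import os.path


def expand_setting(value):
    """Replace a leading "~" with the current user's home directory."""
    return os.path.expanduser(value)


# These two values are the defaults shown in the [archive] section.
print(expand_setting("~/egaia/log.txt"))
print(expand_setting("~/egaia/pub"))
```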

The default settings are listed below. They are loaded first, then overridden by anything you specify in your local configuration file. You can also supply an additional configuration file, named local.cfg and placed in the base configuration directory, to override selected settings, for example to create output in a different language or to generate a beta version of a catalogue.
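
The override order described above (defaults, then shared.cfg, then local.cfg) can be sketched with the standard configparser module (named ConfigParser on Python 2.7). This is an illustration under stated assumptions rather than egaia's own loader: load_layers is a hypothetical helper, and the overriding values "My Field Recordings" and "fr" are invented for the example.

```python
import configparser

# Abbreviated stand-ins for the three layers. Real files would be read
# from disk in the same order, for example with parser.read([...]).
DEFAULTS = """
[archive]
archive_name = Ethnographic Archive
language = en
"""

SHARED = """
[archive]
archive_name = My Field Recordings
"""

LOCAL = """
[archive]
language = fr
"""


def load_layers(*layers):
    """Parse each layer in turn; later layers override earlier ones."""
    parser = configparser.ConfigParser()
    for text in layers:
        parser.read_string(text)
    return parser


config = load_layers(DEFAULTS, SHARED, LOCAL)
print(config.get("archive", "archive_name"))  # My Field Recordings
print(config.get("archive", "language"))      # fr
```

Settings that appear in no later layer keep their default value, so a local.cfg only needs to contain the handful of options that differ.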

A configuration file can contain any or all of the settings listed below, with each appropriate section name indicated in square brackets. (Spacing has been added for legibility, but has no impact on the parsing of this file.)


[system]

# The number of processor cores to use in video conversion
cores                           = 1

# If True, will not generate derivatives
no_deriv                        = False

# whether to generate lossless ffv1 versions of videos
# caution: these are extremely large!
ffv1                            = False
debug                           = False

# Commands for binary applications. Normally these paths should be
# inserted automatically by "egaia setup"
cmd_convert                     = 
cmd_montage                     =
cmd_ffmpeg                      = 
cmd_ffprobe                     = 
cmd_mogrify                     = 
cmd_inkscape                    = 
cmd_libreoffice                 = 
cmd_identify                    = 
cmd_wget                        = 
cmd_wkhtmltopdf                 = 
cmd_wkhtmltoimage               = 


[archive]

archive_name                    = Ethnographic Archive
organization                    = 
user                            = Anonymous
email                           = 

# The URL for the archive, with trailing slash. This will be used as the
# base for absolute hyperlinks.
archive_url                     = http://example.com/

# A prefix for RDF (should not include spaces etc.).
# This prefix will also be used as the archive short-name or acronym
# in the header for HTML output for very small devices
archive_prefix                  = ARCHIVE

# Log of file renames and similar reversible operations
logfile                         = ~/egaia/log.txt

# The default language. This will be used in HTML output.
language                        = en

# Location of a directory for HTML output
pub_path                        = ~/egaia/pub

# Location of the home page. This should be the full path to a docx file,
# which will be converted to HTML by the ``pub --home-page`` command.
home_page                       = ~/egaia/README-en.docx

# Whether or not to include still images from videos in HTML output.
# This is nice to have, but can be overwhelming for FTP transfer and might
# exceed shared web server inode quotas. The PDF version of video stills
# will be exported regardless of this setting.
export_stills                   = false

# Whether to generate HTML pages using embedded media from remote sites
# (YouTube, Vimeo, etc.) when given in the "remote_embed_url" metadata field.
# For offline distribution, set this to False. If this value is True, video
# derivatives will still be copied to the public directory but download links
# will not be created.
remote_embeds                   = true

# When publishing videos to the Internet Archive, we have the option to
# exclude them from the index, making them unlisted. To do so, set this 
# value to "true" in the global or collection-level settings.
ia_noindex                      = false

# The list of default metadata fields that will be included in ALL item
# and collection descriptions. The labels for any terms given here should be
# listed in the [terms] section below, along with any optional fields, which
# will be included in metadata output below the core fields. Note that core
# metadata fields will always be listed for each item, even if empty, whereas
# optional elements will be ignored unless supplied by the user.
core_metadata                   = DCTERMS.identifier, 
                                  DCTERMS.title, 
                                  DCTERMS.creator, 
                                  DCTERMS.type,
                                  DCTERMS.coverage, 
                                  DCTERMS.description, 
                                  DCTERMS.publisher, 
                                  DCTERMS.source, 
                                  DCTERMS.rights, 
                                  DCTERMS.subject, 
                                  DCTERMS.date,
                                  DCTERMS.language,
                                  DCTERMS.tableOfContents


[terms]

# These are the terms and localized / customized labels in use in the 
# archive. The keys may be used as html template variables, in urls, and as 
# keys in json. Caution: changing the label for a term that is already in
# use will break metadata parsing, and may require manual updating in item
# descriptions through a global find-and-replace.
# Note that the keys are case-sensitive, but labels may be converted to
# title case or lowercase by the application depending on the context.
DCTERMS.abstract                = abstract 
DCTERMS.accessRights            = access rights 
DCTERMS.accrualMethod           = accrual method
DCTERMS.accrualPeriodicity      = accrual periodicity
DCTERMS.accrualPolicy           = accrual policy
DCTERMS.alternative             = alternative
DCTERMS.audience                = audience
DCTERMS.available               = available
DCTERMS.bibliographicCitation   = bibliographic citation
DCTERMS.conformsTo              = conforms to
DCTERMS.contributor             = contributor
DCTERMS.coverage                = coverage
DCTERMS.created                 = created
DCTERMS.creator                 = creator
DCTERMS.date                    = date
DCTERMS.dateAccepted            = date accepted
DCTERMS.dateCopyrighted         = date copyrighted
DCTERMS.dateSubmitted           = date submitted
DCTERMS.description             = description
DCTERMS.educationLevel          = education level
DCTERMS.extent                  = extent
DCTERMS.format                  = format
DCTERMS.hasFormat               = has format
DCTERMS.hasPart                 = has part
DCTERMS.hasVersion              = has version
DCTERMS.identifier              = identifier
DCTERMS.instructionalMethod     = instructional method
DCTERMS.isFormatOf              = is format of
DCTERMS.isPartOf                = is part of
DCTERMS.isReferencedBy          = is referenced by
DCTERMS.isReplacedBy            = is replaced by
DCTERMS.isRequiredBy            = is required by
DCTERMS.issued                  = issued
DCTERMS.isVersionOf             = is version of
DCTERMS.language                = language
DCTERMS.license                 = license
DCTERMS.mediator                = mediator
DCTERMS.medium                  = medium
DCTERMS.modified                = modified
DCTERMS.provenance              = provenance
DCTERMS.publisher               = publisher
DCTERMS.references              = references
DCTERMS.relation                = relation
DCTERMS.replaces                = replaces
DCTERMS.requires                = requires
DCTERMS.rights                  = rights
DCTERMS.rightsHolder            = rights holder
DCTERMS.source                  = source
DCTERMS.spatial                 = spatial
DCTERMS.subject                 = subject
DCTERMS.tableOfContents         = table of contents
DCTERMS.temporal                = temporal
DCTERMS.title                   = title
DCTERMS.type                    = type
DCTERMS.valid                   = valid

# The original filename, as set by egaia on ingestion
original_filename               = original filename

# Field designating a URL that represents a version of the resource that can
# be embedded in an iframe, such as a video
remote_embed_url                = remote embed url

# The parent collection UUID; normally set automatically by egaia
collection                      = collection

# A shortcut; this designates the filename for an html document in the
# root directory that redirects to the current item
alias                           = alias


[prefixes]

# URIs for metadata terms in RDF output
DCTERMS                         = http://purl.org/dc/terms/


[template_fields]

# Variables used in html templates. These are included here in order to
# enable end-user localization or extension.
toggle                          = Toggle navigation
Home                            = Home
Collections                     = Collections
All_items                       = All items
Keywords                        = Keywords
Subject                         = Subject
Coverage_area                   = Coverage area
Media_type                      = Media type
Creator                         = Creator
Language                        = Language
Items                           = Items
About                           = About
generator                       = Generated by 
                                  <a href="http://mcdrc.org/egaia/html/">
                                  egaia</a>.
last_updated                    = Last updated:
collection                      = collection
Metadata                        = Metadata
files                           = files
keyword_index                   = keyword index
index                           = index
language_index                  = language index
creator_index                   = creator index
subject_index                   = subject index
type_index                      = type index
coverage_index                  = coverage index
source                          = source
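
The listing above relies on three parsing details: term keys are case-sensitive, core_metadata is a comma-separated value that continues over indented lines, and boolean settings are written as both "False" and "false". The following sketch shows one way to read such values with the standard configparser module (ConfigParser on Python 2.7). The helpers parse_settings and split_list are hypothetical, and whether egaia parses its configuration this way is an assumption.

```python
import configparser

# A shortened excerpt of the settings shown above.
SAMPLE = """
[archive]
export_stills = false
core_metadata = DCTERMS.identifier,
                DCTERMS.title,
                DCTERMS.creator

[terms]
DCTERMS.title = title
DCTERMS.rightsHolder = rights holder
"""


def parse_settings(text):
    parser = configparser.ConfigParser()
    # configparser lowercases option names by default; keep them as written
    # so that keys such as DCTERMS.rightsHolder remain case-sensitive.
    parser.optionxform = str
    parser.read_string(text)
    return parser


def split_list(value):
    """Split a comma-separated, possibly multi-line value into clean items."""
    return [item.strip() for item in value.split(",") if item.strip()]


settings = parse_settings(SAMPLE)
print(split_list(settings.get("archive", "core_metadata")))
print(settings.getboolean("archive", "export_stills"))
print(settings.get("terms", "DCTERMS.rightsHolder"))
```

Because getboolean compares values case-insensitively, "False" and "false" are read identically.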