Emdros 1.2.0 pre231 review

by rbytes.net

License: GPL (GNU General Public License)
Developer: Ulrik Petersen

Emdros is an Open-Source text database engine for storage and retrieval of analyzed or annotated text.

Emdros has a powerful query-language for asking relevant questions of the data.

Emdros has wide applicability in fields that deal with analyzed or annotated text. Application domains include linguistics, publishing, and text processing.

Here are some key features of "Emdros":
Linguistic analyses are the primary target domain. This includes all levels of analysis, such as morphology, syntax, and discourse analysis, and even phonology to some extent.
Publishing is also a field where Emdros can be useful. Emdros supports breaking a text down into pages, chapters, paragraphs, etc.
Text processing may benefit from Emdros if the problem involves annotating the text.

Emdros provides a conceptual model of text which can be quite liberating to use once it has been grasped.

Meta-data may also be stored, so long as there is some textual element with which it can be associated.

Emdros is suited both to corpus linguistics (large amounts of text) and to field linguistics (smaller amounts of data).

Fixed corpora, such as Biblical texts, are good candidates for Emdros. Emdros is currently being used for large databases of the Hebrew Bible.

Dictionaries are also a target possibility. Emdros supports structuring of text documents down to minute details, while not losing the big picture.

Emdros embodies a particular model of text called the EMdF model. The primary advantage over XML's data model is that object types (such as pages and chapters) need not be hierarchically structured or embedded, but may overlap. In addition, objects (such as a clause or a phrase) need not be contiguous, but may have gaps.
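The two properties described above, overlapping object types and objects with gaps, can be illustrated with a minimal Python sketch. It is not Emdros code and does not use the Emdros API; the `TextObject` class, its fields, and the idea of numbering text units with integers are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextObject:
    """Hypothetical stand-in for an annotated object: a typed set of text positions."""
    object_type: str
    positions: frozenset  # integer positions of the text units the object covers

    def has_gaps(self) -> bool:
        # An object is discontiguous if it skips a position between its first and last.
        return len(self.positions) != max(self.positions) - min(self.positions) + 1

    def overlaps(self, other: "TextObject") -> bool:
        # Two objects overlap if they share positions but neither contains the other.
        shared = self.positions & other.positions
        return bool(shared) and not (
            self.positions <= other.positions or other.positions <= self.positions
        )


# Text units 1..10; a page break falls in the middle of a chapter.
page = TextObject("page", frozenset(range(1, 7)))         # units 1-6
chapter = TextObject("chapter", frozenset(range(4, 11)))  # units 4-10
# A clause interrupted by other material at units 3-4.
clause = TextObject("clause", frozenset({1, 2, 5, 6}))

print(page.overlaps(chapter))  # True
print(clause.has_gaps())       # True
print(page.has_gaps())         # False
```

A strictly nested tree cannot represent `page` and `chapter` at the same time, because neither contains the other; treating each object as a set of positions sidesteps that restriction.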

Emdros can output its results in XML. The XML carries its own standalone DTD and validates with a validating parser.

Emdros architecture

Emdros fits into a software architecture as follows:

| Layer  | Provided by         |
| Client | User-written        |
| MQL    | Emdros              |
| EMdF   | Emdros              |
| DB     | PostgreSQL or MySQL |

At the top, there is a client which you, the user, must write. This client uses Emdros's services to meet the needs of your particular database domain.

Then come the two Emdros layers: the MQL layer and the EMdF layer. The MQL layer provides an interface to the MQL language. It relies on the EMdF layer, which translates the MQL queries into SQL calls to the underlying database.

The underlying database takes care of storing the data, and retrieving it as directed by the EMdF layer.
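The layering described above can be sketched in a few lines of Python. This is only an illustration of the division of labour, not the Emdros API: the class names `StorageLayer` and `QueryLayer`, the table layout, and the toy `FIND <type>` query syntax are all hypothetical (the toy syntax is not MQL), and SQLite stands in for the backend so that the sketch runs without a database server.

```python
import sqlite3


class StorageLayer:
    """Hypothetical stand-in for the EMdF layer: turns object requests into SQL."""

    def __init__(self, connection):
        self.db = connection
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS objects "
            "(object_type TEXT, first_pos INTEGER, last_pos INTEGER)"
        )

    def add_object(self, object_type, first_pos, last_pos):
        self.db.execute(
            "INSERT INTO objects VALUES (?, ?, ?)",
            (object_type, first_pos, last_pos),
        )

    def objects_of_type(self, object_type):
        rows = self.db.execute(
            "SELECT first_pos, last_pos FROM objects "
            "WHERE object_type = ? ORDER BY first_pos",
            (object_type,),
        )
        return rows.fetchall()


class QueryLayer:
    """Hypothetical stand-in for the MQL layer; the query syntax is a toy, not MQL."""

    def __init__(self, storage):
        self.storage = storage

    def run(self, query):
        # Toy syntax: "FIND <object_type>"
        keyword, object_type = query.split()
        if keyword != "FIND":
            raise ValueError(f"unsupported query: {query!r}")
        return self.storage.objects_of_type(object_type)


# The "client" is user-written code that talks only to the query layer.
storage = StorageLayer(sqlite3.connect(":memory:"))
storage.add_object("chapter", 1, 40)
storage.add_object("page", 1, 25)
storage.add_object("chapter", 41, 90)
queries = QueryLayer(storage)
print(queries.run("FIND chapter"))  # [(1, 40), (41, 90)]
```

The client never issues SQL itself: it hands a query to the upper layer, which delegates to the lower layer, which in turn is the only component that talks to the database.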

The data domain which Emdros handles is that of text. Emdros provides a certain abstraction of text that makes it ideally suited to storing and retrieving annotated text, such as linguistic analyses of a text.

These analyses can be, e.g., syntactic analyses, morphological analyses, or discourse analyses, or all of these. Phonological analyses are also supported to some extent.

Emdros is particularly useful in domains where research questions need to be asked of databases of annotated text. This includes dictionary-making, Biblical-language research (Greek or Hebrew), other linguistic research, and research on annotated text in general.

Emdros has a particular model of text called the EMdF model. Users have attested, and our experience shows, that the EMdF model can be quite liberating when dealing with text as a programmer or program designer. Thus any application that deals with annotated text will likely benefit from Emdros and the EMdF model.

What's New in This Release:
Multiple memory leaks were fixed.
SQLite 3 is now included.
Various build issues were fixed, especially on Mac OS X.
The Chunking Tool now has a complete User's Guide, and Doxygen docs were added.
mqldump(1) now economizes with the characters it emits.
Lots of small bugfixes were made.
