Saturday, April 20, 2013

Reshaped relational algebra - From Amazon Wiki

Copied from http://www.amazon.com/wiki/Reshaped_relational_algebra

Reshaped relational algebra

The reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984; reformulated in Raz 1986) is a modification of the relational algebra (RA) with enhanced operators. A major motivation for defining RRA has been providing a missing manipulative component to the entity-relationship model (ERM). The primary distinguishing feature of RRA operators is that they possess strong analogies to Natural language (NL) constructs. The main difference between RRA and RA is with natural join and projection operations embedded in various RRA operators. RRA is equivalent to RA in expressive power, in the sense that each algebra's operators can be expressed by the operators of the other (Raz 1986).
RRA has been utilized to support ERROL (Markowitz and Raz 1983a), an entity-relationship model (ERM) database language, over relational databases. It has been utilized both to define the semantics of ERROL, and to conveniently and effectively implement ERROL over relational databases.

[hide]

Motivation

The motivation for RRA is explained in the Abstract part of (Markowitz and Raz 1984):

"The entity-relationship model (ERM) has been defined without a manipulative part. Since a link between the ERM and the relational model may be established by choosing the relation to be the structural unit of the data-representational level of the ERM, this model could benefit by somehow inheriting the relational algebra. This paper proposes as manipulative part for the entity-relationship model a reshaped relational algebra (RRA). It is shown that just as the entity-relationship model concepts are based on the way people perceive information, the RRA operators bear analogies to the way people communicate, i.e., natural language, and are therefore convenient in describing the semantics of query languages within ERM."

The linguistic aspect of the ERM

Main articles: Entity-relationship model and ERROL

Entity Relationship Diagrams (ERDs) and Schemas can be described accurately by simple sentences of Natural languages (NLs). These simple sentences can be utilized to construct a complex sentence, or a compound sentence, or a combination, or a sequence of sentences which comprise a logic predicate. A predicate is either "true" or "false" depending on the substitution in it of the data (attribute) values (from some specific database described by the related ER diagram/schema). A database query is a combination of a set of attributes (the target set or the target list) and a logic predicate. A set of values that correspond to the target set is in the query's result if their corresponding values (from the same entity and relationship instances) substituted in the predicate (possibly of other attributes) results in the value "true" for the predicate.

Straightforward examples

These examples relate to a factory database. The portion of the database relevant to the examples has departments, items in stock, and suppliers of these items as entities. Departments REQUEST (order) items from suppliers. Suppliers SUPPLY items to departments. Both last sentences above define ternary (three-way) relationships.

Entities are defined by the following relations:

DEPARTMENT(d_key, name, floor)

ITEM(i_key, name, color)

SUPPLIER(s_key, name)

The relationships are defined by the following relations:

REQUEST(d_key, i_key, s_key, quantity)

SUPPLY(d_key, i_key, s_key, quantity)

Possible simple ERROL queries over this schema for respective Natural language (NL) queries are as follows:

NL: "Find key and name for items supplied to the Engineering department."

ERROL: Get i_key, name of item supplied to department with name="engineering"

NL: "Find the names of suppliers from whom red or blue items are requested by the engineering department."

ERROL: Get names of suppliers requested items with color="red" or "blue" by department with name="engineering"

Each query has a target set (list) of attribute types to be retrieved, and a logic predicate. The predicate needs to assume a true value for related (same entity and relationship instances) attribute values substitution in order to extract the corresponding target set values.
For example, in the first query above, "id and name of an item" is the target list; "items supplied to the Engineering department" is the logic predicate.

Simple NL surface sentences of an ERM scheme

A sentence describing a Relationship among Entities.
A sentence connecting an Attribute to its Relationship.
A sentence connecting an Attribute to its Entity.
Is-a sentence – generalizing Entities.

Combining the simple NL sentences to NL logical predicates

Utilizing linguistic constructs

Restriction (by a constant or variable)
Relativization
Coordination (OR and AND)
Quantifiers
Aggregation
Referencing

NL queries

Navigating in the ER Diagram
Combining a target data type set (of the query; what is being retrieved) with a logical predicate to produce a query.

RRA operators and their linguistic counterparts

For major syntactic natural language (NL) constructs needed for queries in most common NLs, each has a matching RRA operator. The relationships between these NL constructs and RRA operators are summarized in the following table:

RRA operators and their relationship with Natural language constructs; from (Raz 1987)

The RRA operators are defined as follows:

Notation and definitions

RRA definitions; from (Raz 1987)

Renaming

RRA definitions; from (Raz 1987)

Projection

RRA definitions; from (Raz 1987)

Selection

RRA definitions; from (Raz 1987)

Restriction by a constant.

Product (natural join, Cartesian product, intersection)

RRA definitions; from (Raz 1987)

NL: Both Relativization and AND coordination (Conjunction).

Bordered union

RRA definitions; from (Raz 1987)

NL: OR coordination (Disjunction).

θ-join

Attribute join

RRA definitions; from (Raz 1987)

Restriction by a variable.

Set-join

Generalized division

RRA definitions; from (Raz 1987)

Set comparison; Universal quantification.

Negation/Complement

Not

RRA definitions; from (Raz 1987) Negation.

The not operator is defined in (Raz 1986) and replaces the subtruction/difference operator originally used for RRA in (Markowitz and Raz 1983b, 1984). not allows a simpler representation of NL negation (e.g., the English "not").

Aggregate functions

RRA definitions; from (Raz 1987)

Sum, Count, Minimum, Maximum, Average, etc.

RRA example: The not (negation) operator

Basic properties

RRA – The not operator 1; from (Raz 1986) page 36

RRA – The not operator 2; from (Raz 1986) page 37

Negation and operators with comparison

The operators with comparison are select, θ-join (attribute join), and set-join.
By applying negation (not) to the comparison they all obey the following formula (within the defined semantics):

$not(operator parameter1 comparison parameter2) = operator parameter1 not(comparison) parameter2$

Specifically,

select operator with comparison $θ$ from { =, <, > }:

$not(select attribute θ constant R) = select attribute not(θ) constant R$

Related NL (and ERROL) example:

NOT AN ITEM with COLOR = RED

is equivalent to:

AN ITEM WITH COLOR not= RED

attribute-join operator with comparison $θ$ from { =, <, > }:

$not(R 1 attribute-join attribute1 θ attribute2 R2) = R 1 attribute-join attribute1 not(θ) attribute2 R2$

NL (and ERROL) example:

DEPARTMENT THAT REQUESTS ITEMS FROM A SUPPLIER IN QUANTITY THAT IS SMALLER THAN (<) THE QUANTITY IT (reference; the same department) IS SUPPLIED WITH BY THE SAME (reference) SUPPLIER

is the negation of:

DEPARTMENT THAT REQUESTS ITEMS FROM A SUPPLIER IN QUANTITY THAT IS NOT-SMALLER THAN (not <) THE QUANTITY THAT IT (reference; the same department) IS SUPPLIED WITH BY THE SAME (reference) SUPPLIER.

set-join operator with comparison $δ$ from { =, contains, contained-in }:

$not(R 1 set-join set1 δ set2 R2) = R 1 set-join set1 not(δ) set2 R2$

NL (and ERROL) example:

SET OF ITEMS (or "all the items"; universal quantifier) SUPPLIED TO DEPARTMENT IS CONTAINED-IN ("are included in") SET OF ITEMS REQUESTED BY THAT (reference; the same) DEPARTMENT

is the negation of

SET OF ITEMS (or "all the items") SUPPLIED TO A DEPARTMENT IS NOT-CONTAINED-IN ("are not included in") THE SET OF ITEMS REQUESTED BY THAT (reference; the same) DEPARTMENT.

Comments:

For database context for the English expressions in the examples above see the Straightforward examples section above.
These English examples can be used by the respective application to check consistency between REQUEST and SUPPLY (what is not requested should not be supplied; a hidden aggregation on suppliers exists in the set containment example since SUPPLIER does not appear explicitly, which makes it "by all relevant suppliers").

De Morgan's laws

Main article: De Morgan's laws

De Morgan's laws are demonstrated below for RRA and English sentences:

RRA – De-Morgan's Laws; from (Raz 1986) page 39

Comment: For database context for the English expressions in the examples above see the Straightforward examples section above.

NL queries and their RRA expressions

Assumptions

An RRA expression allows one to compute the result of respective NL (or ERROL) query from a relational database. For this purpose two condition should be met:

Entities and Relationships as defined by the entity-relationship model (ERM) are represented by relations as define by the relational model (RM) in a certain general, natural, normalized form (see below; applicability to other representations of ERM and other data models is established by proper mapping from the normal form).
The NL sentence can unambiguously be parsed to extract certain syntactic structures (see table above). Most major NL languages are based on these structures and can be translated to RRA. From these structures and their order the RRA expression is constructed. NLs are vague and ambiguous in nature, and sometime several iterations by a user are needed to produce the intended unambiguous NL sentence (or a sequence of sentences). In contrast, ERROL which is a super-set (with added parentheses for disambiguation) of a subset of the English language, is unambiguous and can be parsed and translated to RRA immediately, if phrased correctly (by its own syntax rules; this has been the motivation for defining ERROL).

(see Straightforward examples at section 2 above)

RRA and its relational representation of the ERM

The ER normal form (of a relational scheme).

Constructing an RRA expression from an NL query

The attribute type set to be retrieved by the query and the NL-based logical predicate in the query (which together comprise the query) produce a Hyper-tree (acyclic Hypergraph) of RRA operators by navigating in the ER Diagram according to the predicate. The hyper-tree has a straightforward translation to the intended RRA expression for the query (Cohen 1984, Raz 1987).

RRA development and utilization

RRA has been developed by Victor M. Markowitz (together with an initial version of ERROL) as his M.Sc. thesis (Markowitz 1983) at the Technion – Israel Institute of Technology (advisor: Yoav Raz). RRA has been developed to support all needed (for queries) basic linguistic constructs in a Natural language (NL), with an RRA operator for each such construct. The thesis also includes RRA expressions implementation guidelines. RRA has been utilized both to describe the semantics of ERROL, and to effectively implement ERROL over relational databases. ERROL to RRA Compiler has been constructed in the M.Sc. project of Reuven Cohen (Cohen 1984; Advisor: Yoav Raz). This compiler has been utilized in the ERROL system, an entity-relationship model (ERD) based database system prototype developed during the years 1982–1988 by Yoav Raz together with graduate students at the Technion, Israel and later enhancements at UCSD, United states (Raz et al. 1984, Raz 1987). RRA has been reformulated in (Raz 1986) with a proof of equivalence to relational algebra (RA): Each of one algebra's operators can be expressed with the other algebra's operators.

References

Victor M. Markowitz, Yoav Raz (1983a): "ERROL – An Entity Relationship Role Oriented Query Language", In Entity Relationship Approach to Software Engineering (ER 1983), October 5–9, Anaheim, California. Davis, G.C. et al. (eds.), pp. 329–345, North-Holland.

Victor M. Markowitz, Yoav Raz (1983b): "A Modified Relational Algebra and Its Use in an Entity Relationship Environment", In Entity Relationship Approach to Software Engineering (ER 1983), October 5–9, Anaheim, California. Davis, G. C. et al. (eds.), pp. 315–328, North-Holland, 1983.

Victor M. Markowitz, Yoav Raz (1984): "An entity-relationship algebra and its semantic description capabilities", Journal of Systems and Software, Volume 4, Issues 2–3, pp. 147–162, Elsevier Science Inc., July 1984.

Yoav Raz, Reuven Cohen, Victor M. Markowitz (1984): "ERROL – An Entity Relationship, Role Oriented Query and Data Manipulation Language" (Extended abstract), The 9th National Conference on Information Processing together with The 4th Jerusalem Conference on Information Technology (JCIT4), May 21–25, Jerusalem, Israel (1984 ILA award in Computer Science for The ERROL System).

Victor M. Markowitz (1983): ERROL – An Entity Relationship Role Oriented Query Language, M.Sc. Thesis, Dept. of Computer Science, Technion – Israel Institute of Technology, Haifa, January 1983.

Reuven Cohen (1984): The Translation of ERROL to RRA – A Reshaped Relational Algebra, M.Sc. Thesis, Dept. of Computer Science, Technion – Israel Institute of Technology, Haifa, July 1984.

Yoav Raz (1986): A Precise Definition of RRA – a Reshaped Relational Algebra which Follows Natural Language Constructs (pdf), Technical Report TR #405, Computer Science Dept., Technion, Israel, March 1986.

Yoav Raz (1987): Supporting Structured English Interfaces for Relational Databases Using RRA (pdf), Technical Report TR CS-093, EECS Dept., UCSD (ERROL, RRA, ERROL translation to RRA, and The ERROL System examples). Earlier version appeared as Supporting Natural Language Interfaces for Relational Databases Using RRA, TR #410, Computer Science Dept., Technion, Israel, May 1986.

External links

ERROL page in Yoav Raz's site

ERROL - from Amazon wiki

Copied from http://www.amazon.com/wiki/ERROL

ERROL

ERROL (Entity Relationship Role Oriented Language; Markowitz and Raz 1983a) is a declarative database query and manipulation language for the Entity-relationship model (ERM). It is applicable to any data model on which ERM can be mapped, virtually any general purpose database data model. It is based on the capability of ER diagrams to be described accurately by simple Natural language (NL) sentences. A specification of a complex operation upon an ERM database can be described accurately by a complex and/or compound NL sentence constructed from the simple sentences describing the respective ER diagram. An ERROL expression mimics such NL sentence with one-to-one correspondence between ERROL subexpressions and NL subsentences: An ERROL expression can look like the corresponding NL sentence, or at least like a similar, equivalent one. This allows to write in ERROL very complex queries by simple conversion from their NL specifications. It also allows a straightforward checking of an ERROL expression meeting a complex NL specification. With such characteristics it can be a foundation for future Data management languages, more convenient for humans to use than existing languages which may need complex expressions even for moderately complex NL expressions (e.g., SQL; see Example below).
ERROL is also applicable to newer applications like querying the Semantic web using ontologies. As well it is applicable to other than ERM semantic data models, e.g., Object-Role Modeling (ORM), which has many similarities to the ERM. ERROL has some similarities to the later Gellish (in particular to Gellish English), a formal language with a strong connection to natural languages, and can use its dictionaries.
Reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984; Raz 1986), with operators that follow the semantics of respective major NL constructs, has been developed to support ERROL over relational databases. It is used both to specify ERROL's semantics concisely and accurately, and to implement ERROL effectively over relational databases.
ERROL and its RRA translation expose and exploit the connection between the way humans reason and talk about needed information, possibly very complex, and the database operations needed in order to compute this information from the database data. A sequence of such RRA operations is generated automatically from an ERROL expression by an ERROL-to-RRA compiler. The compiler output has been applied directly to a relational database, and also has a translation to SQL, the standard interface, for a straightforward, "regular" execution by SQL database systems, to take advantage of their query optimization.

[hide]

Introduction

While Natural language (NL) can be vague and ambiguous, ERROL (Markowitz and Raz 1983a) is accurate with well-defined semantics, and thus provides a check for NL queries. With these properties ERROL is a good compromise between NL as a query language and other database languages: Long experience with database languages shows that simple queries are easy to phrase with both NL and most database languages. However, some complicated queries may be hard to phrase accurately in most languages, including NL. In most cases NL is the natural query tool for humans, but being sometimes vague and ambiguous, often a complex NL query's meaning is not accurately defined, and a dialogue that involves a well-defined formal language feedback is needed to verify the intended meaning when NL is used. In practice a query specification is typically initially thought of, formulated, in NL, and then translated by an expert to the desired database language. Validating translation correctness (semantic equivalence) is usually difficult for complex queries and relies completely on the expert's understanding of the NL specification. Wrong interpretation can be very costly, both in execution time and implications. Being very close to NL, and identical to it in most cases, ERROL eases these difficulties.
It is worthwhile noting that the English language utilization in ERROL is different in nature from its utilization in other computer languages like COBOL and SQL: While in the latter English-like language commands are utilized, in ERROL a predicate (in a query or other manipulation) is specified by an English-like expression.
Though originally motivated by the linguistic aspect of ERM, ERROL has been found to provide a powerful paradigm beyond this aspect: Specification of a complex query predicate by Navigation in an ER Diagram ("query by navigation"; the emphasis is on navigation in an ER Diagram, versus any other data structure). Even without exploiting many NL elements, the navigation itself together with basic schema elements provides accurate specification. As a matter of fact the navigation is independent of any specific NL, and presents the semantic relations (Relationships and comparisons) between objects of interest (Entities and attributes, and resulting objects by arithmetic operations, aggregations, and logic derivations), where the ER diagram provides a limited form of semantic network. All the needed information for a complete specification over a given schema exists in a skeleton ERROL specification which includes only ER Diagram elements, and may include constants, aggregate functions, logical connectives, arithmetic operations and comparison operators. Using such language-independent representation, a specification can be easily machine-translated accurately among different natural languages, using small numbers of each language's syntactic constructs, and having an ER diagram described by respective simple sentences in different languages. Specification reconstruction in English (that can be re-processed correctly by the ERROL compiler) from the language-independent representation has been done successfully (within the ERROL System project; see the Brief history section below).
Reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984) has been developed to support ERROL over relational databases. RRA is equivalent to Relational algebra (RA) in expression power (each one algebra's operator can be expressed by the other algebra's operators (Raz 1986) ). A strong correlation exists between ERROL constructs (and corresponding NL's) and RRA operators. This is used both to specify ERROL's semantics concisely and accurately, and to implement ERROL effectively over relational databases. For any ERROL specification (expression; query or other manipulation) a corresponding RRA expression is derived in a straightforward way. This RRA expresstion computes the specification over any relational database with schema that is semantically relevant to the specification (possibly through schema transformation for ERM compatibility with the specification, when needed; the computation result is a relation).

Language overview

ERROL is a declarative language (a specification of what is requested is given; rather than a procedural language which specifies the way to compute it; for example, SQL comprises portions of each type). An ERROL expression describes a navigation hypertree (hypertree is an acyclic hypergraph), generated by navigating in the ER diagram. Names and constants are nodes. Entities and Relationships define one types of edges, sets of attribute names. A connecting ERROL construct defines a second type of edge, a "regular" tree (dual node) edge. (See (Raz 1987) below for more detail.) The navigation may include jumps in the diagram (in case of comparisons) or repetitions over same sections in the diagrams (if the query specification requires repeated utilization of same diagram elements). A complex specification of a Hypertree, a computaion by a long NL sentence, may suffer from ambiguities due to unclear connection between sentence parts. ERROL solves this problem by inserted parantheses (primarily parentheses) when needed, to uniquely define an expression's hypertree with the correct intended meaning.

Structured English

Structured English (SE; Raz 1987; not to be confused with Structured English used for pseudocode) is an extension of a subset of the English language and an enhancement of the initial ERROL. In SE several "syntactic sugar" elements of ERROL have been made more flexible, and further flexibility with word usage and sentence structure has been introduced, to get it closer to NL. However, the original correspondence between NL and respective ERROL basic constructs has been preserved. In what follows no distinction is made between ERROL and SE.

Highlights

The navigation hypertree with correct RRA interpretation (respective RRA operator substitution for any ERROL construct type) provides all the semantics needed to define accurately an ERROL (SE) expression (see examples in Raz 1987). A navigation hypertree has one-to-one correspondence with a skeleton expression of ERROL, a canonical form that describes the navigation hypertree only by Entitys, Relationships, and Attributes, together with Constants, Aggregate functions, Logical connectives, Arithmetic operations, and Comparison operators.
An ERROL expression has one navigation hypertree, but typically several navigation hypertrees exist (for different ways of NL phrasing) that provide the same result, by navigating in the ER Diagram through same elements in different ways (e.g., in different directions, in different orders).
Scope of an aggregate function is determined by parantheses when the intended scope is not properly determined by the defaults. Any ERROL subexpression can be aggregated by inserting aggregation function name and parentheses in the right place.
Decomposition: A long NL specification can be broken into several simpler sentences. Correspondingly a respective ERROL expression can be replaced by a set of respective simple expressions separated by delimiters, (e.g., " ; "), which is equivalent to the original expression (see example below).
Reference or correlation: Often it is required that entities can be referenced properly in a sentence or across sentences. In NL it is done by using expressions like "his", "the above", "the second above", etc. In ERROL it is done as well, primarily by reference (correlation) symbols like (x), attached to entity name repetitions in an expression, and by some supported, or ad-hoc defined per a given schema, English constructs. All entity name repetitions marked with (x) are the same entity occurrence in the database for any instance for which the query predicate is true (see examples below).
Assignment and Substitution: An ERROL subexpression can be named using assignment and be utilized in a long expression by its name (see example below). In particular entities and relationships can be defined and utilized.
Reserved words: Errol has a set of reserved words, for example, for commands (e.g., "get") and logic operators (e.g., "or").
Transitive closure. This feature enhances the expression and computing power of ERROL considerably. It is done by allowing to define (and name) relationships which are the transitive closure of existing reflexive relationships (i.e., between an entity to itself, e.g., EMPLOYEE MANAGES EMPLOYEE: A manager can be direct, or several levels above) either already appearing in the database, or defined by an ERROL expression.
An interactive graphical interface (Graphical ERROL) can be built in a straightforward way to construct the navigation hypertree (and corresponding ERROL expression; the ERROL system can already reconstruct an NL expression from an RRA expression using the system's dictionary) by picking up elements from the ER Diagram, and other needed elements (logical, comparison, constants, etc.). Such component has not been implemented within the framework of the ERROL System project described below (see the Brief history section below).
SE can be replaced by a respective subset of any other Natural language (e.g., French, Hebrew, etc.) with basic syntactic constructs that have semantics similar to the English constructs used for SE. Skeleton expression translations require only names' translations, and language fonts replacement when needed. Such components have not been implemented within the framework of the ERROL System project described below (see the Brief history section below).

The linguistic aspect of the Entity relationship model, and the way it is utilized by ERROL for navigation in an ER Diagram is demonstrated by the following example:

Example

Assume an ER diagram for a relational database of employees, with EMPLOYEE entity (as defined in ERM) and MANAGE relationship between two employees.

EMPLOYEE(id, salary) is the database relation for the entity EMPLOYEE

MANAGE(id1, id2) is the relation for the relationship MANAGE: employee with id1 manages employee with id2.

Consider the following query:

"Find the employees that earn more than their manager."

The following equivalent (in result; some also in navigation hypertree) ERROL queries provide a correct solution:

Get employees that earn more than their manager

Get employee that earns more than his manager

Get EMPLOYEE that EARNS more than his MANAGER

Get id of EMPLOYEE that has salary greater than that of his MANAGING EMPLOYEE

Get EMPLOYEE(x) having salary > salary of EMPLOYEE that MANAGES EMPLOYEE(x)

Get id of EMPLOYEE(x) MANAGED by EMPLOYEE having salary < salary of EMPLOYEE(x)

Get id EMPLOYEE(x) salary > salary EMPLOYEE MANAGE EMPLOYEE(x) ("Skeleton query")

Get EMPLOYEE(x) WHERE EMPLOYEE(x) is MANAGED by EMPLOYEE(y) AND has salary> salary of EMPLOYEE(y)

Get EMPLOYEE(x); EMPLOYEE(y) MANAGE EMPLOYEE(x); salary EMPLOYEE(x)>salary EMPLOYEE(y) (delimited expressions are connected by logical conjunction (AND) )

E1:= salary of EMPLOYEE(x) > salary of EMPLOYEE(y); Get EMPLOYEE(x) WHERE E1 AND E2; E2:= EMPLOYEE(x) is MANAGED by EMPLOYEE(y) (substitution; assignment location in expression does not matter)

Comments:

The equivalent ERROL examples above become unnecessarily more and more complex just to demonstrate ERROL's constructs, which allow to conveniently handle queries with very complex specifications.

Many more equivalent query variations are possible.

The symbols x, y in the examples above are used for reference, correlation: All entity name repetitions marked with (x) are the same entity occurrence in the database for any instance for which the query predicate is true.

Capital letters are used here for convenience only to emphasize entities, relationships, and certain reserved words; ERROL is not case sensitive.

Compare expression complexity: write this query in SQL; All SQL expressions for this query are relatively complex and long.

See additional examples in a section below.

Reshaped relational algebra

Main article: Reshaped relational algebra

The Reshaped relational algebra (RRA; Markowitz and Raz 1983b, 1984; an alternative formulation can be found in Raz 1986 ) has been developed to support ERROL over relational databases. RRA is equivalent to the Relational Algebra (RA) in expression power (each one algebra's operator can be expressed by the other algebra's operators (Raz 1986) ), and has strong analogies to some basic NL constructs. As such it is ideal for describing the semantics of ERROL, as well as for implementing ERROL over relational databases. For any ERROL specification (expression; query or other manipulation) a crresponding RRA expression computes the specification over a relational database. The main difference between RRA and RA is with natural join and projection operations embedded in various RRA operators.

RRA operators and their relationship with Natural language constructs; from (Raz 1987)

An ERROL expression (or respective hypertree) can be translated in a straightforward way to an RRA expression. An RRA expression is a partial order (typically with several compatible sequences) of RRA operations, which provides a procedure for computing the value of the ERROL expression over a relational database (i.e., computing the query or data manipulation resulting relation) by manipulating its relations. Each ERROL subexpression type has a corresponding RRA operator. The subexpression variables (entity, relationship, and attribute names) and constants comprise the operator's parameters. The operators' order is determined by the hypertree structure of the (possibly paranthesized) ERROL expression. By proper renaming, the join operation embedded in RRA operators automatically connects between corresponding attributes in entities and relationships, and resolves references by using a single entity identifying temporary name for all that entity occurrences in the query that have the same reference symbol (for same entity occurrences with different reference symbols, different temporary names are given; for same entity occurrences with no reference symbol, different temporary names are given).
RRA expression simplification, and consistency checking together with subexpression contradiction and tautology identification, can be done using RRA axioms and theorems (Raz 1986). RRA expression computation optimization can be done similarly to the ways it is done for Relational algebra (RA).

Brief history

Both RRA and initial version of ERROL, with all needed for queries linguistic constructs, including implementation guidelines, have been developed by Victor M. Markowitz as the subject of his M.Sc. thesis (Markowitz 1983) at the Technion - Israel Institute of Technology (advisor: Yoav Raz). Further ERROL enhancements and implementation, including ERROL to RRA translation (M.Sc. project of Reuven Cohen; Cohen 1984), have been done by Yoav Raz together with graduate students. In 1984 Yoav Raz, Victor Markowitz, and Reuven Cohen won the Computer Science Award of ILA – The Israeli Information Technology Association.

ERROL System output example: ERROL query, compilation output (c+), and then execution output (x+). t - intermediate relations; x - (renamed) temporary attribute names; the left-side t operand in each RRA operation is the operation's result relation; from (Raz 1986)

The ERROL System (Raz et al. 1984) implements database queries and manipulations over a relational database using SE and RRA. During the years 1982-1988 it has been developed at the Technion, Israel, using UNIX, Lex, YACC, and Ingres, and further enhanced at UCSD (see output examples in Raz 1987).

ERROL (SE) examples

The examples below are given primarily in their skeleton representation, or close to that, rather than their more close-to-NL forms (e.g., set operations rather than quantifiers) to more clearly hint on their RRA translations.

Example 1

This example relates to a factory database. The portion of the database relevant to this example (and other examples below) has departments, items in stock, and suppliers of these items as entities. Departments REQUEST (order) items from suppliers. Suppliers SUPPLY items to departments. Both last sentences above define ternary (three-way) relationships.

Entities are defined by the following relations:

DEPARTMENT(did, name, floor)

ITEM(iid, name, color)

SUPPLIER(sid, name)

The relationships are defined by the following relations:

REQUEST(did, iid, sid, quantity)

SUPPLY(did, iid, sid, quantity)

Possible simple ERROL queries over this schema for respective Natural language (NL) specifications are as follows:

NL: "Find id and name for items supplied to the Engineering department."

Get id, name of ITEM SUPPLIED to DEPARTMENT with name="Engineering", or

Get id, name of ITEMS SUPPLIED to the Engineering DEPARTMENT - (additional dictionary definition is required for using this ERROL expression)

NL: "Find the names of suppliers from whom red or blue items are requested by the Engineering department."

Get names of SUPLIERS REQUESTED ITEMS with color="Red" OR "Blue" by DEPARTMENT with name="Engineering"

Example 2

(from example 2 in (Raz 1987) )
Using the schema in Example 1 above consider the following imaginary complex query:

"Find departments such that each ( is located in floor lower than the floor of a department requesting number of items greater than 20) or ( (not supplied by supplier named "Tom") and has the name "Engineering" )."

A possible straightforward skeleton ERROL expression for this query is the following:

Get DEPARTMENT (floor < floor DEPARTMENT REQUEST COUNT ITEM > 20) OR (NOT SUPPLY SUPPLIER name = "Tom") AND name = "Engineering"

or the following:

Get DEPARTMENT(x) WHERE (E1 AND E2) OR ((NOT E3) AND E4); E1:=floor DEPARTMENT(x)< floor DEPARTMENT(y); E2:=DEPARTMENT(y) REQUEST COUNT ITEM>20; E3:=DEPARTMENT(x) SUPPLY SUPPLIER name = "Tom"; E4:=DEPARTMENT(x) name="Engineering" (comment: the parentheses in the first subexpression are unnecessary due to common defaults)

Comment: Judging by the ERROL compiler output in example 2 of the last reference below, it seems that some parantheses in the input expression there have been lost, probably due to a (paper) "cut and paste" error, that had also propagated to the example 2 text there.

Example 3

Using the schema in Example 1 above:

"Find pairs of supplier, department, such that the supplier supplies to the department more than half the number of all the "Red" items requested by all the departments."

ERROL query:

Get SUPPLIER(x), DEPARTMENT SUPPLIED by SUPPLIER(x) COUNT ITEMS > 0.5*COUNT ITEMS (REQUESTED AND have color="Red")

If half the number of "Red" items requested by any single department is considered, the NL sentence is slightly changed:

"Find pairs of supplier, department, such that the supplier supplies to the department more than half the number of "Red" items requested by any department."

A possible ERROL query is the following:

Get SUPPLIER(x), DEPARTMENT SUPPLIED by SUPPLIER(x) COUNT ITEMS > 0.5*COUNT ITEMS (REQUESTED AND have color="Red") by DEPARTMENT

Here in the second COUNT operator the aggregation is done separately for each DEPARTMENT.

The symbol x is for a reference.

Example 4

Using the schema Example 1 above:

"Find pairs of supplier, department such that the supplier supplies to the department all the items that this department requests from that supplier."

ERROL query:

Get SUPPLIER(x), DEPARTMENT(y) SUPPLIED by SUPPLIER(x) SET ITEMS = SET ITEMS REQUESTED by DEPARTMENT(y) from SUPPLIER(x)."

The symbols x, y are for references.

Example 5

Using the schema in Example 1 above:

"Find the departments that get all the quantity of Red Bolts they order."

ERROL query:

Get DEPARTMENT(x) SUPPLIED with SUM quantity of ITEMS with color="RED" AND name="Bolt" = SUM quantity of REQUESTED ITEMS with color="RED" AND name="Bolt" by DEPARTMENT(x)

The symbol x is for reference.

References

Victor M. Markowitz, Yoav Raz (1983a): "ERROL - An Entity Relationship Role Oriented Query Language", In Entity Relationship Approach to Software Engineering (ER 1983), October 5-9, Anaheim, California. Davis, G.C. et al. (eds.), pp. 329–345, North- Holland.
Victor M. Markowitz, Yoav Raz (1983b): "A Modified Relational Algebra and Its Use in an Entity Relationship Environment", In Entity Relationship Approach to Software Engineering (ER 1983), October 5-9, Anaheim, California. Davis, G. C. et al. (eds.), pp. 315–328, North-Holland, 1983.
Victor M. Markowitz, Yoav Raz (1984): "An entity-relationship algebra and its semantic description capabilities", Journal of Systems and Software, Volume 4, Issues 2-3, pp. 147-162, Elsevier Science Inc., July 1984.
Yoav Raz, Reuven Cohen, Victor M. Markowitz (1984): "ERROL - An Entity Relationship, Role Oriented Query and Data Manipulation Language" (Extended abstract), The 9th National Conference on Information Processing together with The 4th Jerusalem Conference on Information Technology (JCIT4), May 21-25, Jerusalem, Israel (1984 ILA award in Computer Science for The ERROL System).
Victor M. Markowitz (1983): ERROL - An Entity Relationship Role Oriented Query Language, M.Sc. Thesis, Dept. of Computer Science, Technion - Israel Institute of Technology, Haifa, January 1983.
Reuven Cohen (1984): The Translation of ERROL to RRA - A Reshaped Relational Algebra, M.Sc. Thesis, Dept. of Computer Science, Technion - Israel Institute of Technology, Haifa, July 1984.
Yoav Raz (1986): A Precise Definition of RRA - a Reshaped Relational Algebra which Follows Natural Language Constructs (pdf), Technical Report TR #405, Computer Science Dept., Technion, Israel, March 1986.
Yoav Raz (1987): Supporting Structured English Interfaces for Relational Databases Using RRA (pdf), Technical Report TR CS-093, EECS Dept., UCSD (ERROL, RRA, ERROL translation to RRA, and The ERROL System examples). Earlier version appeared as Supporting Natural Language Interfaces for Relational Databases Using RRA, TR #410, Computer Science Dept., Technion, Israel, May 1986.

External links

ERROL page in Yoav Raz's site

Saturday, April 20, 2013

Reshaped relational algebra - From Amazon Wiki

Reshaped relational algebra

Contents

Motivation

The linguistic aspect of the ERM

Straightforward examples

Simple NL surface sentences of an ERM scheme

Combining the simple NL sentences to NL logical predicates

NL queries

RRA operators and their linguistic counterparts

Notation and definitions

Renaming

Projection

Selection

Product (natural join, Cartesian product, intersection)

Bordered union

θ-join

Set-join

Negation/Complement

Aggregate functions

RRA example: The not (negation) operator

Basic properties

Negation and operators with comparison

De Morgan's laws

NL queries and their RRA expressions

Assumptions

RRA and its relational representation of the ERM

Constructing an RRA expression from an NL query

RRA development and utilization

See also

References

External links

ERROL - from Amazon wiki

ERROL

Contents

Introduction

Language overview

Structured English

Highlights

Example

Reshaped relational algebra

Brief history

ERROL (SE) examples

Example 1

Example 2

Example 3

Example 4

Example 5

See also

References

External links