Object Oriented DBMS Advanced Application Requirements

1.         Advanced applications required support for persistent data storage, new features 

2.         Limitations in RDM cannot be changed as they are using the standards laid down by Codd. and supported by mathematical formulation.

a.       Object Oriented programming paradigm was the most successful and popular, why not use the OOP features in DBMS.

b.      One approach adopted by industry/academia towards OODBMSs.

c.       Relational Extensions.

1.      Enjoy the benefits of both worlds.

2.      Stick to the basic relational model.

3.      Have the OO features like complex objects, UDTs etc

Design Techniques

In mid 80s, Stonebraker presented different proposals.

The Extended Type System for a OODBMS should allow:-

Definition of new data types.

Definition of new operations on newly defined data types.

Implementation of access methods from the hard disk and ram.

Optimized Query Processing for queries on new data types.

 Relational vs OO DBMS

It’s a huge debate in favor of both.

Caution: Be careful while comparing. Most of the objectives can be attained by using Relational and Extending Relational model.

Extensions Techniques in RDBMSs

Different steps adopted by different DBMSs, like

  1.     Support for variable length undefined data values. 
  2.     Using this support, generalized user-defined data types can be represented.

Like Oracle supported RAW, LONG and LONGRAW (up to 65535 bytes).

Sybase supported TEXT and IMAGE up to 2GB, and also others.


These features were partial support for storing complex data.

Another type of extension is TEXT and BULK (binary) supported by user-defined procedures that can operate on these types.

Such facilities were mainly used to capture non-text data, like voice, medical charts, finger prints.

Then user-defined keywords are associated with user-defined procedures.

These keywords and the specification of the procedures are stored in tables.

 The user calls the procedure and table is referred to resolve the call.

For example, consider the function.    

integer  dist(d1, d2)

The possible usage of this function may include. Select dist(x, y) from T.

Select from T where dist(x, y) = expr

One approach adopted by industry/academia towards OODBMSs

Relational Extensions

General Approaches/Tactics

Two Case Studies

 POSTGRES

Extended RDM DBMS.

Developed at UC Berkeley in mid 80s by Prof Stonebraker and his group.

Commercialized as ILLUSTRA.

Extended INGRES (a RDBMS) to support OO features.

Basic idea in POSTGRES as to introduce minimum changes in the Codds original relational model to achieve the objective.

Advantage is the continuity with the previous product (INGRES) and provision of OO features in the new product.

DESIGN AND ARCHITECTURE OF PG

Design objectives of POSTRGRES declared by Stonebraker were:

To Provide better support for complex objects.

Support for UDT, operators and access methods.

Design objectives of POSTRGRES declared by Stonebraker were:

Facilities for active databases, and inferencing.

PG provided extensions to define ADTs, that can be used as types attributes.

Conventional dot notation for referencing,

EMP.Name


POSTQUEL type fields can contain sequence of data manipulation command.

Procedure type can contain procedures written in General Purpose High Level Language.

Fields of type POSTQUEL and procedure can be executed using EXECUTE command.

For example: EMP(name, age, salary, hobbies, dept) where hobbies is of POSTQUEL type.

This field can store queries to retrieve hobbies from other relations, like:

EXECUTE (EMP.hobbies)    WHERE EMP.Name = “Kamran”

 Data Definition in POSTGRES

QUEL was provided in INGRES.

POSTQUEL was provided in POSTGRES.

Most of QUEL commands are included in POSTQUEL.

 However, many extensions have been included, like:

Complex objects.

UDTs and their access methods.

Time varying data (snapshots and historical data).

Iterative queries.

Alerters, triggers and rules

 Built-in Data Types include:

Integer, Floating Point, Fixed Length Character String.

Unbounded Varying Length Arrays with Arbitrary Dimensions.

POSTQUEL

Procedure

 Other features

Conventional dot notation for referencing, EMP.Name

POSTQUEL type fields can contain sequence of data manipulation command.

Procedure type can contain procedures written in General Purpose High Level Language. 

Fields of type POSTQUEL and procedure can be executed using EXECUTE command.

For example: EMP(name, age, salary, hobbies, dept) where hobbies is of POSTQUEL  type.

This field can store queries to retrieve hobbies from other relations, like:

EXECUTE (EMP.hobbies)    WHE‘E EMP.Nae = “Kamran”

 Complex Objects

Type POSTQUEL can store shared complex objects and multiple representations of data.

It can contain sequence of commands to fetch data from other relations that represent sub objects.    

 Simple Objects

                                    Polygon (id, name, other attributes)

                                    Circle (id, name, other attributes)

                                    Line (id, name, other attributes)

 Complex object can be defined as

                                     Create Object (name= char[10], obj=postquel)

Object type of postquel can store data manipulation commands and can access other objects which are considered sub objects. For example the complex object name is “a” Object (a) and obj type is postquel which is used to store multiple commands.

                                    Retrieve (Polygon.All) where (Polygon.id= 10)

                                    Retrieve (Circle.All) where (Circle.id= 40)

Another object is named as “b”

Retrieve (Line.All) where(Line.id= 25)

                                    Retrieve (Polygon.All) where (Polygon.id= 10)

Multiple representations help in placing appropriate data structure for caching, on the other hand maintaining relational structure for ease of access.

Transformations are supported by defining a procedure stored in database.

Create OBJECT (name = char[10], obj = postquel, display =  cproc)

 

cproc is a procedure written in C language that displays the object. But storing procedure for every object is wastage; solution is to store in separate relation and object is passed as parameter.

Object a and object b are sharing their sub objects as well.

           Procedures also have multiple representations like as they stored in the hard disk as well they are to be  stored in the RAM while processing. So appropriate data structure will be required for caching, on the other hand maintaining relational structure for ease of access.

                 Representing data in RAM for the user point of view is different. After accessing the data from the hard disk in the form of a table, it is to be displayed. While creating an object which is complex object if we store its display procedure as part of the value of that object, it will be the wastage of the storage space. The solution is to store it in separate relation and object is passed as parameter.

Create Object (name = char [10], obj = postquel, display = procDisplay)

procDisplay is a procedure written in any General purpose programming language.

 Time varying data, historical data

            We use Online transaction processing applications uses current data. They can be   designed to access and capture the current business state.  Normal queries access current     data.

            Historical data can be accessed through timestamp, which was provided by Postgres at     that time.

Memory Hierarchy is used by Postgres

Main Memory

Used for the data which is fetched, processed and manipulated in RAM.

Secondary Memory

The data which is permanently stored on the disk, which is called persistent data. current data is stored   in secondary memory.

Tertiary Memory (Optical Disk)

Historical data is stored on optical disk.

This distribution of data in all three types of storage is transparent from the user. The user is retrieving  data for OLTP from hard disk. If the historical data is required then database will fetch data from the  tertiary disk.

Version Controlling is also sported by Postgres

Version can be created from a relation or a snapshot.

Snapshot is a state of database tables for a specific period of time like 3 or 5 months and so on. The snapshot will be saved on secondary storage. Snapshot is a timestamp based data and updates cannot be applied on snapshots.

Updates in version will not be reflected in relations.  On the other hand, the updates in relation will be conditionally visible in versions.

Version creation example

                        NEW VERSION empver from EMP

Version changes into relation

                        MERGE empver into EMP

Iterative Queries

Postgres also provide the facility of iterative queries. The concept of iterative queries is similar to sub queries. A query is embedded within the body of another query or it is repeated multiple times. Iterative query will be executed repeatedly until its condition is true.

Emp(name char[20], mgr char[20]

RETRIEVE * into SUBORDS

(E.name, E.mgr) from E in EMP,

S in SUBORDS where

E.Name = “Ali” or E.mgr=S.name

Iterative queries can also be used with

            Append

            Delete

            Execute

            Replace

            Retrieve into

Alerters and Triggers

            Alerters and Triggers depicts an Active Database. An active database is an operational database which triggers a function or procedure when a certain event occurs. This is achieved by implementing Alerters and Triggers in Postgres.

                        Retrieve always (EMP.all)

                        Where EMP.name=”Nasir”

            The use of Always command with Retrieve command , it becomes trigger in Postgres. Whenever in a record there is a change in record “Nasir”, the trigger will be fired or executed.

Trigger in an update Query

Delete always DEPT dname where count(tmp.name by DEPT.name)=0

            By using Alterters and Triggers if we update or delete any tuple in a relation, the Trigger or Alerter will become the part of that schema.

            We have to define that Alerter or Trigger, anything which is defined in the schema, that will be managed by the DBMS itself.  Whenever there is a change or update, the Triggers will be fired automatically.

            If we say that a Data Model is semantically rich it means it has the capability to store more semantics more meanings of the real world modeling into the schema.  Anything which is saved as part of schema will be automatically managed by the DBMS more accurately.

            Forward chaining is supported by ALWAYS, any update invokes the chain of alerters, triggers and commands.

            It also supports KB (knowledge based) features in the DBMS. Forward chaining is a feature of AI.

Compilation and Fast Path

            All the SQL commands are interpreted, parsed and then they are executed.  Firstly the query will be converted into relational algebra and relational calculus and then that expression will be executed. It will take time and resources and size of intermediate tables, in terms of cost.

            In Postgres the query will be converted as command, which will be compiled and the execution plan will be saved as command in database. This command will be executed each time without recompilation again as it is preprocess already. A Command will be stored in system catalogue like CODE (id, owner, command). Code indicates that this is required to be compiled, id is to locate the command address, command is the instruction which will be compiled and saved.

2nd approach is to improve the Command compilation which is called Fast Path.

         Fast Path will directly move the execution control of the computer at the desired id of the command and execute it.

Postgres Conclusion

        Postgres is the first extended Relational DBMS. The objective of that database is to build a layer on top of the existing RDM to get the benefits of both the technologies. It is also provided with the features of relational DBMS like integrity rules, superset of SQL,

Provided OO features like

                        Complex objects

                        Triggers

                        Alerters

                        Versions

                        Historical data

                        Inference rules

                        Compiled Queries

                        Fast Path

Post a Comment

28 Comments

  1. Very important rare information

    ReplyDelete
  2. advantages of database systems are obvious

    ReplyDelete