1.
Advanced applications required support for
persistent data storage, new features
2.
Limitations in RDM cannot be changed as they
are using the standards laid down by Codd. and supported by mathematical
formulation.
a. Object
Oriented programming paradigm was the most successful and popular, why not use
the OOP features in DBMS.
b. One
approach adopted by industry/academia towards OODBMSs.
c. Relational
Extensions.
1. Enjoy
the benefits of both worlds.
2. Stick to
the basic relational model.
3. Have the
OO features like complex objects, UDTs etc
Design
Techniques
In mid 80s,
Stonebraker presented different proposals.
The Extended
Type System for a OODBMS should allow:-
Definition of
new data types.
Definition of
new operations on newly defined data types.
Implementation
of access methods from the hard disk and ram.
Optimized Query
Processing for queries on new data types.
Relational vs OO DBMS
It’s a huge
debate in favor of both.
Caution: Be careful while comparing. Most of the
objectives can be attained by using Relational and Extending Relational model.
Extensions
Techniques in RDBMSs
Different steps
adopted by different DBMSs, like
- Support for
variable length undefined data values.
- Using this
support, generalized user-defined data types can be represented.
Like Oracle supported RAW, LONG and LONGRAW (up
to 65535 bytes).
Sybase supported TEXT and IMAGE up to 2GB, and also others.
These features were partial support for storing complex data.
Another type of
extension is TEXT and BULK (binary) supported by user-defined procedures that
can operate on these types.
Such facilities
were mainly used to capture non-text data, like voice, medical charts, finger
prints.
Then
user-defined keywords are associated with user-defined procedures.
These keywords
and the specification of the procedures are stored in tables.
For example,
consider the function.
integer dist(d1, d2)
The possible
usage of this function may include. Select dist(x, y) from T.
Select from T
where dist(x, y) = expr
One approach
adopted by industry/academia towards OODBMSs
Relational
Extensions
General
Approaches/Tactics
Two Case
Studies
Extended RDM
DBMS.
Developed at UC
Berkeley in mid 80s by Prof Stonebraker and his group.
Commercialized
as ILLUSTRA.
Extended INGRES
(a RDBMS) to support OO features.
Basic idea in POSTGRES as to introduce minimum
changes in the Codds original relational model to achieve the objective.
Advantage is the continuity with the previous
product (INGRES) and provision of OO features in the new product.
DESIGN AND ARCHITECTURE OF PG
Design
objectives of POSTRGRES declared by Stonebraker were:
To Provide
better support for complex objects.
Support for
UDT, operators and access methods.
Design
objectives of POSTRGRES declared by Stonebraker were:
Facilities for
active databases, and inferencing.
PG provided
extensions to define ADTs, that can be used as types attributes.
Conventional
dot notation for referencing,
EMP.Name
POSTQUEL type fields can contain sequence of data manipulation command.
Procedure type
can contain procedures written in General Purpose High Level Language.
Fields of type
POSTQUEL and procedure can be executed using EXECUTE command.
For example:
EMP(name, age, salary, hobbies, dept) where hobbies is of POSTQUEL type.
This field can store queries to retrieve hobbies from other relations, like:
EXECUTE
(EMP.hobbies) WHERE EMP.Name = “Kamran”
QUEL was
provided in INGRES.
POSTQUEL was
provided in POSTGRES.
Most of QUEL
commands are included in POSTQUEL.
Complex
objects.
UDTs and their
access methods.
Time varying
data (snapshots and historical data).
Iterative
queries.
Alerters,
triggers and rules
Integer,
Floating Point, Fixed Length Character String.
Unbounded
Varying Length Arrays with Arbitrary Dimensions.
POSTQUEL
Procedure
Conventional
dot notation for referencing, EMP.Name
POSTQUEL type
fields can contain sequence of data manipulation command.
Procedure type
can contain procedures written in General Purpose High Level Language.
Fields of type
POSTQUEL and procedure can be executed using EXECUTE command.
For example: EMP(name, age, salary, hobbies,
dept) where hobbies is of POSTQUEL type.
This field can
store queries to retrieve hobbies from other relations, like:
EXECUTE
(EMP.hobbies) WHE‘E EMP.Nae = “Kamran”
Type POSTQUEL
can store shared complex objects and multiple representations of data.
It can contain sequence of commands to fetch
data from other relations that represent sub objects.
Polygon
(id, name, other attributes)
Circle
(id, name, other attributes)
Line
(id, name, other attributes)
Create Object (name= char[10], obj=postquel)
Object type of postquel can store data manipulation commands and can access other objects which are considered sub objects. For example the complex object name is “a” Object (a) and obj type is postquel which is used to store multiple commands.
Retrieve (Polygon.All) where (Polygon.id= 10)
Retrieve
(Circle.All) where (Circle.id= 40)
Another object
is named as “b”
Retrieve (Line.All) where(Line.id= 25)
Retrieve (Polygon.All) where (Polygon.id= 10)
Multiple representations help in placing appropriate data structure for caching, on the other hand maintaining relational structure for ease of access.
Transformations
are supported by defining a procedure stored in database.
Create OBJECT (name = char[10], obj = postquel, display = cproc)
cproc is a procedure written in C language that
displays the object. But storing procedure for every object is wastage;
solution is to store in separate relation and object is passed as parameter.
Object a and object b are sharing their sub objects as well.
Procedures also have multiple
representations like as they stored in the hard disk as well they are to
be stored in the RAM while processing. So appropriate data structure
will be required for caching, on the other hand maintaining relational
structure for ease of access.
Representing data in RAM for the user point of view is different. After accessing the data from the hard disk in the form of a table, it is to be displayed. While creating an object which is complex object if we store its display procedure as part of the value of that object, it will be the wastage of the storage space. The solution is to store it in separate relation and object is passed as parameter.
Create Object
(name = char [10], obj = postquel, display = procDisplay)
procDisplay is a procedure written in any
General purpose programming language.
We use Online transaction processing
applications uses current data. They can be designed
to access and capture the current business state. Normal queries
access current data.
Historical data can be accessed
through timestamp, which was provided by Postgres at that time.
Memory Hierarchy is used by Postgres
Main Memory
Used for the
data which is fetched, processed and manipulated in RAM.
Secondary Memory
The data which
is permanently stored on the disk, which is called persistent data. current
data is stored in secondary memory.
Tertiary Memory (Optical Disk)
Historical data
is stored on optical disk.
This
distribution of data in all three types of storage is transparent from the
user. The user is retrieving data for OLTP from hard disk. If the
historical data is required then database will fetch data from the tertiary disk.
Version Controlling is also sported by Postgres
Version can be
created from a relation or a snapshot.
Snapshot is a
state of database tables for a specific period of time like 3 or 5 months and
so on. The snapshot will be saved on secondary storage. Snapshot is a timestamp
based data and updates cannot be applied on snapshots.
Updates in
version will not be reflected in relations. On the other hand, the
updates in relation will be conditionally visible in versions.
Version creation example
NEW
VERSION empver from EMP
Version changes
into relation
MERGE
empver into EMP
Iterative
Queries
Postgres also
provide the facility of iterative queries. The concept of iterative queries is
similar to sub queries. A query is embedded within the body of another query or
it is repeated multiple times. Iterative query will be executed repeatedly
until its condition is true.
Emp(name
char[20], mgr char[20]
RETRIEVE * into SUBORDS
(E.name, E.mgr)
from E in EMP,
S in SUBORDS where
E.Name = “Ali”
or E.mgr=S.name
Iterative
queries can also be used with
Append
Delete
Execute
Replace
Retrieve
into
Alerters and Triggers
Alerters and Triggers depicts an
Active Database. An active database is an operational database which triggers a
function or procedure when a certain event occurs. This is achieved by
implementing Alerters and Triggers in Postgres.
Retrieve always (EMP.all)
Where EMP.name=”Nasir”
The use of Always command with Retrieve command , it becomes trigger in Postgres. Whenever in a record there is a change in record “Nasir”, the trigger will be fired or executed.
Trigger in an update Query
Delete always
DEPT dname where count(tmp.name by DEPT.name)=0
By using Alterters and Triggers if we update or delete any tuple in a relation, the Trigger or Alerter will become the part of that schema.
We have to define that Alerter or
Trigger, anything which is defined in the schema, that will be managed by the
DBMS itself. Whenever there is a change or update, the Triggers will
be fired automatically.
If we say that a Data Model is
semantically rich it means it has the capability to store more semantics more
meanings of the real world modeling into the schema. Anything which
is saved as part of schema will be automatically managed by the DBMS more
accurately.
Forward chaining is supported by
ALWAYS, any update invokes the chain of alerters, triggers and commands.
It also supports KB (knowledge
based) features in the DBMS. Forward chaining is a feature of AI.
Compilation and
Fast Path
All the SQL commands are
interpreted, parsed and then they are executed. Firstly the query
will be converted into relational algebra and relational calculus and then that
expression will be executed. It will take time and resources and size of
intermediate tables, in terms of cost.
In Postgres the query will be
converted as command, which will be compiled and the execution plan will be
saved as command in database. This command will be executed each time without
recompilation again as it is preprocess already. A Command will be stored in
system catalogue like CODE (id, owner, command). Code indicates that this is
required to be compiled, id is to locate the command address, command is the
instruction which will be compiled and saved.
2nd approach is to improve the Command compilation which is called Fast Path.
Fast Path will directly move the execution control of
the computer at the desired id of the command and execute it.
Postgres Conclusion
Postgres is the first extended Relational DBMS. The
objective of that database is to build a layer on top of the existing RDM to
get the benefits of both the technologies. It is also provided with the
features of relational DBMS like integrity rules, superset of SQL,
Provided OO features like
Complex
objects
Triggers
Alerters
Versions
Historical
data
Inference
rules
Compiled
Queries
Fast
Path

28 Comments
Informative
ReplyDeleteUseful
ReplyDeleteYes
DeleteSuperb
ReplyDeleteGood sharing of knowledge
ReplyDeleteInformative
ReplyDeleteVery important rare information
ReplyDeleteVery useful 👌
ReplyDeleteabsolutely technical
ReplyDeleteappreciated
ReplyDeletelovely information
ReplyDeleteadvantages of database systems are obvious
ReplyDeletevery informative
ReplyDeleteout standing information
ReplyDeleteabsolutely
DeleteVery good
ReplyDeleteExcellent information
ReplyDeleteSuperb
ReplyDeleteVery informative
ReplyDeleteGood work
ReplyDeletevery useful information
ReplyDeleteFantastic
ReplyDeleteResearch work
ReplyDeleteSuperb
ReplyDeleteSupporting materials
ReplyDeleteGood
ReplyDeleteImpressive details
ReplyDeleteVery well explained 👏
ReplyDelete