Three prototype systems
Various approaches are used to explore these problems and to solve them; most importantly, three prototype systems are proposed. Building these systems has yielded valuable experience and raised challenges that call for refining the existing methodologies.
The first prototype, OWLIR, is an example of a system that takes ordinary text documents as input, annotates them with semantic web markup, swangles the results and indexes them in a custom information retrieval system.
OWLIR can then be queried through a custom query interface designed to accept free text as well as structured data items. The second prototype, Swangler, annotates RDF documents encoded in XML with additional RDF statements that attach swangle terms. These terms can be indexed easily by Google and other standard search engines. Once such documents are on the web, search engines discover and index them, and they can be retrieved using text-based queries, swangle terms and XML.
OWLIR
OWLIR is an implemented system for retrieving documents that contain both free text and semantic markup in RDF, DAML+OIL or OWL. It has been demonstrated working with two local information retrieval systems, HAIRCUT and WONDIR. OWLIR is also used to explore hybrid information retrieval and the general issues that arise in the implemented system.
Inference System
OWLIR uses the metadata information added during text extraction to infer additional semantic relations. These relations are used to identify the scope of the search and to provide more relevant responses. The reasoning functionality of OWLIR is built on DAMLJessKB, which facilitates reading and interpreting DAML+OIL files and allows reasoning over the ontology hierarchy.
Retrieval times for free-text documents and for documents that incorporate both text and markup are comparable. Retrieval effectiveness increases when the indexed representation of a document includes its semantic markup. Additional performance benefits accrue when inference is performed over the semantic markup prior to indexing.
Documents in RDF and OWL currently form a web that runs parallel to the web of HTML documents, and both are essential parts of the semantic web. A standardized way is still required to embed RDF and OWL markup in documents and to reference it in a way that carries meaning. Semantic web documents (SWDs) reference one another in the same pattern that HTML documents do. RDF documents can be discovered and indexed by internet search engines such as Google. However, well-established systems like Google analyze semantic web documents as plain text files, which causes several problems. First, the XML namespace mechanism poses a problem for these search engines. Second, their tokenizing rules are designed for natural language and cause problems with XML documents. Lastly, these engines take no advantage of the semantic nature of the markup. To address this, we use the swangling technique to enrich SWDs by adding RDF statements that carry swangle terms.
In OWLIR, each swangle term encodes a triple or some of its components. For instance, a single RDF triple generates seven combinations of subject, predicate and object. A simple interface is then used to retrieve documents: the user provides triples, the triples are swangled, and the resulting terms are used to compose a query.
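The seven combinations mentioned above can be sketched as follows. This is a minimal illustration, not the encoding used by OWLIR or Swangler: the wildcard marker, the `sw` prefix and the truncated SHA-1 digest are assumptions chosen only to produce tokens that an ordinary text tokenizer would keep whole.

```python
import hashlib
from itertools import product

WILDCARD = "*"  # hypothetical placeholder for a component left out of the pattern


def swangle_patterns(subject, predicate, obj):
    """Return the seven patterns that keep at least one component of the triple."""
    parts = (subject, predicate, obj)
    patterns = []
    for keep in product([True, False], repeat=3):
        if not any(keep):
            continue  # the all-wildcard pattern carries no information
        patterns.append(tuple(p if k else WILDCARD for p, k in zip(parts, keep)))
    return patterns


def swangle_term(pattern):
    """Encode one pattern as a single alphanumeric token (illustrative encoding)."""
    digest = hashlib.sha1(" ".join(pattern).encode("utf-8")).hexdigest()
    return "sw" + digest[:16]


def swangle(subject, predicate, obj):
    """Return the swangle terms for one triple, one term per pattern."""
    return [swangle_term(p) for p in swangle_patterns(subject, predicate, obj)]
```

The same function can serve both sides of retrieval: terms produced from a document's triples are added to the indexed text, and terms produced from a user's triple are joined into the query string.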
Swoogle
The semantic web consists of documents encoded in RDF. Semantic web documents appear to need a specialized indexing and retrieval engine, because they are designed to be understood and processed by machines, and a search engine has to process them accordingly. The search engines currently available cannot interpret the meaning of these documents, since they rely on natural language processing that is not up to the task.
Swoogle is a prototype internet indexing and retrieval engine developed to process semantic web documents encoded in RDF and OWL. The system is intended to support users as well as software agents and services. Its human users are semantic web researchers and developers engaged in retrieving, exploring and querying metadata about a collection of automatically discovered RDF documents. Software APIs will serve programs that need SWDs matching certain descriptions, such as SWDs containing certain terms, SWDs similar to other SWDs, or SWDs using certain classes or properties.
The system consists of a database that stores metadata about SWDs, web crawlers that identify new and updated SWDs, components that compute document metadata and semantic relationships between SWDs, an n-gram based indexing and retrieval engine, a simple interface for querying the system, and agent-based and web service APIs that provide useful services.
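As a rough illustration of n-gram based indexing, the sketch below builds an inverted index over character trigrams and ranks documents by how many query trigrams they share. The class name, the choice of n = 3 and the overlap-count scoring are assumptions for this example; the source does not describe how the actual engine is implemented.

```python
from collections import defaultdict


def char_ngrams(text, n=3):
    """Return the set of lowercase character n-grams in the text."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


class NGramIndex:
    """Hypothetical inverted index mapping each n-gram to the documents containing it."""

    def __init__(self, n=3):
        self.n = n
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for gram in char_ngrams(text, self.n):
            self.postings[gram].add(doc_id)

    def search(self, query):
        """Return document ids ordered by the number of query n-grams they share."""
        scores = defaultdict(int)
        for gram in char_ngrams(query, self.n):
            for doc_id in self.postings.get(gram, ()):
                scores[doc_id] += 1
        return sorted(scores, key=lambda d: (-scores[d], d))
```

One reason n-grams suit this setting is that they do not depend on word tokenization, so a query such as `person` can still match a qualified name like `foaf:Person`.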
Metadata is used to compute the SWD rank. This rank, modeled on the PageRank concept, measures the importance or popularity of a semantic web document, and the retrieval engine uses it to order the returned results. The algorithm takes advantage of the fact that the graph formed by SWDs has a richer set of relations than the graph formed by hypertext documents.
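One way to picture a PageRank-style measure over a graph with several kinds of relations is a weighted power iteration, sketched below. This is not Swoogle's ranking algorithm: the relation names, their weights and the damping factor are assumptions made for illustration.

```python
def swd_rank(links, weights, damping=0.85, iterations=50):
    """Weighted PageRank sketch.

    links: list of (source, target, relation) edges between documents.
    weights: mapping from relation name to a non-negative weight (hypothetical values).
    """
    nodes = sorted({n for s, t, _ in links for n in (s, t)})
    out_weight = {n: 0.0 for n in nodes}
    for s, _, rel in links:
        out_weight[s] += weights[rel]

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by documents with no weighted outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for s, t, rel in links:
            if out_weight[s] > 0.0:
                # Each edge passes on a share of the source's rank proportional to its weight.
                new_rank[t] += damping * rank[s] * weights[rel] / out_weight[s]
        rank = new_rank
    return rank
```

With uniform weights this reduces to ordinary PageRank; giving one relation a larger weight than another lets that kind of link transfer more rank, which is one way a richer set of relations could be exploited.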
We have identified three broad uses of Swoogle and similar retrieval systems: finding appropriate ontologies, finding instance data, and studying the structure of the semantic web.
Typically, an RDF editor allows the user to load an ontology, which is then used to make assertions. Locating the right ontology is a very hard task. This difficulty contributes to the proliferation of ontologies, because some developers ignore existing ones and prefer to write their own. With Swoogle, the user can query for ontologies that contain specified terms anywhere in the document, that contain them as classes or properties, or that are about a specified term as determined by the IR engine. A ranking algorithm is used to rank the returned ontologies. The use of Swoogle should ease the burden of marking up data and contribute to the emergence of canonical ontologies.
The semantic web enables the integration of distributed information, but first the necessary information has to be found. Swoogle lets the user query for all instance data about a specific class or subject. The triples of the resulting SWDs can then be loaded into a knowledge base (KB) for further processing.
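The last step, loading returned triples into a knowledge base and querying them, might look like the sketch below. The `TripleStore` class and the sample triples are hypothetical; a real KB would typically also apply inference, which is omitted here.

```python
class TripleStore:
    """Hypothetical in-memory knowledge base holding (subject, predicate, object) triples."""

    def __init__(self):
        self.triples = set()

    def load(self, triples):
        """Add the triples of one retrieved document to the store."""
        self.triples.update(triples)

    def match(self, subject=None, predicate=None, obj=None):
        """Return the triples matching the pattern; None acts as a wildcard."""
        return sorted(
            t for t in self.triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Usage: merge triples from two retrieved documents, then ask for all instances of a class.
kb = TripleStore()
kb.load([("ex:alice", "rdf:type", "foaf:Person"), ("ex:alice", "foaf:name", "Alice")])
kb.load([("ex:bob", "rdf:type", "foaf:Person")])
people = kb.match(predicate="rdf:type", obj="foaf:Person")
```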
Swoogle also computes and provides metadata and structural information about the semantic web: how it is connected, which documents refer to a given ontology, which ontologies a given document refers to, and what kinds of relations exist between documents.