Three prototype systems

The problems, and the approaches to solving them, have been explored through three prototype systems. Working with these systems has provided valuable experience and has raised challenges that help refine the existing methodologies.

The first prototype, OWLIR, is an example of a system that takes ordinary text documents as input, annotates them with semantic web markup, swangles the results, and indexes them in a custom information retrieval system. OWLIR is queried through a custom interface that accepts free text, structured data items, or both.

The second prototype, Swangler, annotates RDF documents encoded in XML with additional RDF statements that attach swangle terms. These terms can be indexed by Google and other standard search engines. Once the annotated documents are on the web, search engines can discover and index them, and the documents can be retrieved with queries that contain text, XML and swangle terms.

The third prototype, Swoogle, is a crawler-based indexing and retrieval system for RDF documents. It discovers RDF documents, processes them, and populates a database with metadata about them. It also inserts them into a special version of the HAIRCUT information retrieval engine, which uses character n-grams as indexing terms.

OWLIR

OWLIR is an implemented system for retrieving documents that contain both free text and semantic markup in RDF, DAML+OIL or OWL. It has been demonstrated working with two local information retrieval systems, HAIRCUT and WONDIR. OWLIR is also used to explore hybrid information retrieval and the general issues that arise in the implemented system.

Inference System

OWLIR uses the metadata added during text extraction to infer additional semantic relations. These relations are used to determine the scope of a search and to provide more relevant responses. OWLIR builds its reasoning functionality on DAMLJessKB, which facilitates reading and interpreting DAML+OIL document files and allows reasoning over the ontology hierarchy.

Retrieval time for documents that combine free text and markup is comparable to that of free-text-only retrieval. Including semantic markup in the representation of an indexed document increases retrieval effectiveness. Performance improves further when inference is performed over the semantic markup before the document is indexed.

Swangler

At present the semantic web, in the form of RDF and OWL documents, is essentially a web universe parallel to the web of HTML documents. A standard way is still needed to embed RDF and OWL markup in documents, or to reference it, in a manner that carries meaning. Semantic web documents (SWDs) reference one another in the same way that HTML documents do. Internet search engines such as Google can discover and index RDF documents, but these systems treat semantic web documents as plain text files, which causes several problems. First, such search engines do not understand the XML namespace mechanism. Second, their tokenizing rules are designed for natural languages and work poorly on XML documents. Lastly, they do not take advantage of the semantic nature of the markup. To address this, we use the swangling technique to enrich SWDs with additional RDF statements that attach swangle terms.

As in OWLIR, each swangle term encodes either a complete triple or a triple in which only some of the components are specified. For instance, a single RDF triple generates seven terms, one for each non-empty combination of its subject, predicate and object. Documents are subsequently retrieved through a simple interface: the user provides triples, these are swangled, and the resulting terms are used to compose a query.
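
The seven-term expansion can be illustrated with a short sketch. This is not Swangler's actual encoding: the wildcard symbol, the use of SHA-1, the `sw` prefix and the function name `swangle_terms` are all assumptions made for illustration. The only property taken from the text is that one triple yields seven terms, each a single alphanumeric token that an ordinary search engine can index.

```python
import hashlib
from itertools import product

WILDCARD = "*"  # assumed placeholder for an unspecified component


def swangle_terms(subject, predicate, obj):
    """Return seven index terms for one RDF triple.

    Each term stands for the triple with a different non-empty subset of
    its components kept; the remaining components become wildcards.
    """
    triple = (subject, predicate, obj)
    terms = []
    for keep_mask in product((True, False), repeat=3):
        if not any(keep_mask):
            continue  # skip the pattern where every component is a wildcard
        pattern = [
            part if keep else WILDCARD
            for part, keep in zip(triple, keep_mask)
        ]
        # Hash the pattern so the term is one opaque alphanumeric token
        # that natural-language tokenizers will not split apart.
        digest = hashlib.sha1(" ".join(pattern).encode("utf-8")).hexdigest()
        terms.append("sw" + digest[:16])
    return terms


if __name__ == "__main__":
    for term in swangle_terms("ex:paper1", "dc:creator", "ex:alice"):
        print(term)
```

To compose a query under the same assumptions, the user's triple (with any unknown components left as wildcards) would be hashed in the same way, and the resulting term submitted as an ordinary keyword.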

Swoogle

The semantic web consists of documents encoded in RDF. A specialized indexing and retrieval engine is apparently needed for these documents: they are intended to be understood and processed by machines, so a search engine has to be designed specifically to process them. The search engines currently available cannot interpret the meaning of documents, because natural language processing is not yet up to the task.

Swoogle was developed as a prototype internet indexing and retrieval engine for semantic web documents encoded in RDF and OWL. The system is intended to support human users as well as software agents and services. The human users are expected to be semantic web researchers and developers who want to retrieve, explore and query a collection of metadata about automatically discovered RDF documents. Software APIs will serve programs that need to find SWDs matching certain descriptions, for example SWDs that contain certain terms, that are similar to other SWDs, or that use certain classes or properties.

The system consists of a database that stores metadata about the SWDs, web crawlers that identify new and updated SWDs, and components that compute document metadata and the relationships among these SWDs. It also includes an n-gram based indexing and retrieval engine, a simple interface for querying the system, and agent-based and web service APIs that provide useful services.
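
As a rough illustration of n-gram based indexing, the sketch below builds a tiny inverted index over character n-grams and ranks documents by the number of query n-grams they share. It is a toy stand-in and not the HAIRCUT engine; the function names, the choice of n = 5 and the overlap-count scoring are assumptions.

```python
from collections import defaultdict


def char_ngrams(text, n=5):
    """Split text into overlapping character n-grams (lowercased)."""
    normalized = " ".join(text.lower().split())
    if len(normalized) <= n:
        return [normalized] if normalized else []
    return [normalized[i:i + n] for i in range(len(normalized) - n + 1)]


def build_index(documents, n=5):
    """Map each n-gram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for gram in char_ngrams(text, n):
            index[gram].add(doc_id)
    return index


def search(index, query, n=5):
    """Return document ids ordered by the number of shared query n-grams."""
    scores = defaultdict(int)
    for gram in set(char_ngrams(query, n)):
        for doc_id in index.get(gram, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    docs = {
        "d1": "foaf:Person foaf:name Alice",
        "d2": "dc:creator dc:title Semantic Web",
    }
    print(search(build_index(docs), "foaf:name"))
```

Because the index terms are character sequences rather than words, the engine does not depend on tokenizing rules designed for natural language, which is one reason such an approach suits markup-heavy documents.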

One of the metadata properties computed for each SWD is its rank. The rank is modelled on the PageRank concept and measures the importance or popularity of a semantic web document. The retrieval engine uses this measure to order the results it returns. The algorithm takes advantage of the fact that the graph formed by SWDs has a richer set of relations than the graph formed by hypertext documents.
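
The general idea of a link-based rank can be sketched with a plain PageRank-style power iteration. This is the generic algorithm, not Swoogle's own ranking, which according to the text also exploits the richer relations between SWDs. The function name, the damping factor of 0.85, the iteration count and the sample document names are assumptions.

```python
def rank_documents(links, damping=0.85, iterations=50):
    """Compute a PageRank-style score for every document in a link graph.

    links maps a document id to the list of documents it references.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    count = len(nodes)
    rank = {node: 1.0 / count for node in nodes}

    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / count for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # Pass this document's rank on to the documents it references.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A document with no outgoing references spreads its rank evenly.
                share = damping * rank[node] / count
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    graph = {
        "doc1.rdf": ["people.owl"],
        "doc2.rdf": ["people.owl"],
        "people.owl": [],
    }
    ranks = rank_documents(graph)
    for doc_id in sorted(ranks, key=ranks.get, reverse=True):
        print(doc_id, round(ranks[doc_id], 3))
```

In this small graph the ontology referenced by both instance documents receives the highest score, which matches the intuition that widely used ontologies should appear first in the results.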

We have identified three broad uses for Swoogle and similar retrieval systems: finding appropriate ontologies, finding instance data, and studying the structure of the semantic web.

Typically, an RDF editor allows the user to load an ontology and then use it to make assertions. Locating the right ontology, however, is very hard. This difficulty has contributed to the proliferation of ontologies, because some developers skip the search and prefer to write their own. With Swoogle, the user can query for ontologies that contain specified terms. The query can be refined to find ontologies in which the specified terms are used as classes or properties, or ontologies that are about a specified term, as determined by the IR engine. The returned ontologies are ordered by the ranking algorithm. Using Swoogle should ease the burden of marking up data and contribute to the emergence of canonical ontologies.

The semantic web is meant to enable the integration of distributed information, but first the necessary information has to be found. Swoogle lets the user query for all instance data about a specified class or on a specified subject. The triples of the resulting SWDs can then be loaded into a knowledge base (KB) for further processing.

Swoogle also provides computed metadata and structural information about the semantic web: how it is connected, which documents refer to which ontologies, which ontologies are referenced by which documents, and what kinds of relations exist between documents.
