Most of the atom is empty space. Most of the Semantic Web is meta data. WWW is becoming the Winding Way Web. :-)

FACET ANALYSIS AND SEMANTIC WEB : Musings of a student of Ranganathan

F.J. DEVADASON

devadason_f_j@yahoo.com

TO THE MEMORY OF TWO OF MY BEST TEACHERS OF CLASSIFICATION

DR. S.R. RANGANATHAN
DR. G. BHATTACHARYYA


(Both enjoyed humor, anecdotes and stories. The former enjoyed my Thirunelvaeli Thamilzh too,
and so I have tried to provide some).

Machinery or no machinery, for retrieval classification is a necessity, not just superficial classification but depth classification.
                                                                                        S.R. Ranganathan.

Note:
Classification here stands for Faceted Classification such as the Colon Classification.
Ranganathan's Colon Classification had SEVEN facets, not FIVE:
[Basic Facet], [Personality Facet], [Matter Facet], [Energy Facet], [Space Facet], [Time Facet] and [Common Isolate Facet]

CONTENTS
Note About This Write-Up
PROLOGUE
  Purpose 1: Is Our Paper "Not Relevant"?
  Purpose 2: Pondering Over the Semantic Web, Intrigued by the Course of its Development
The Paper: Faceted Indexing Based System for Organizing and Accessing Internet Resources
EPILOGUE

NOTE ABOUT THIS WRITE-UP

This is a first draft, written in a casual style, and so it contains several repetitions and statements which you may just ignore if you do not follow them. However, after reading the whole lot and the paper linked to this write-up, you will definitely follow what I am trying to convey. I hope to edit and iron these out at a later time. If I get feedback that I should be a bit more serious and re-write this paper properly according to the rigid rules of scientific paper writing, with citations and a bibliography and a review of recent works and so on, then I will do so later. Those of you who know Thamilzh (Thamizh, Tamil) may play as background music the song "ettaNaa irunthaa ettu ooru en paattu kaetkum." You may have to register to download it from:
http://music.cooltoad.com/music/song.php?id=100300 . This is the theme song for this write-up! I am sure you will enjoy it. The statements in transliterated Thamilzh are for enjoyment and do not convey any significant meaning, so others may simply ignore them. But I am sure they will force some soft movement of the lips leading to a softened smile (Punmuruval pookka vaikkum). They did the same to Dr. Ranganathan too.

By the way, do you know that the famous "Wall-Picture" principle is an old saying (proverb) in Thamilzh? (Chuvarintich chiththiram yelutha mudiyaathu). And that its derivative, the "Cow-Calf principle", is another common proverb in Thamilzh?

PROLOGUE

Purpose 1: Is Our Paper "Not Relevant"?
One of the purposes of this write-up is simply to find out whether our paper entitled "Faceted Indexing Based System for Organizing and Accessing Internet Resources", published in Knowledge Organization: Journal of the International Society for Knowledge Organization, Vol. 29 (2002), No. 2, p. 65-77 [reproduced in its entirety below], is really NOT RELEVANT, as stated in a bibliography entitled "Putting Facets on the Web: An Annotated Bibliography", Oct. 2003 <http://www.miskatonic.org/library/facet-biblio.html>.

Normally when one compiles a bibliography, one compiles it according to certain criteria, be it a subject or topic or whatever, and includes only those items that are found relevant. If something is found "not relevant" then it is simply ignored and not included in the bibliography. But this bibliography, for some reason, has put in a heading "NOT RELEVANT" and included our paper and another one. To me it appears to be cheeky! If someone wants to criticize our paper they have every right to write to the Editor of the Journal. The criticism would be sent to the authors, and the authors' reply would also be published in the "Letters to the Editor" section. Unilaterally branding a paper as "NOT RELEVANT" without giving the authors an opportunity to reply is actually academic dishonesty :-) :-). If our paper is partially relevant, it could be stated so in the annotation below the citation, along with the objective of the compilation and how that objective is not met, or only partially met, by our paper. If it is totally "not relevant", then the search criteria used by the compiler should not have retrieved our paper at all; if they did, then the conduct of the search and the selection of the paper is simply flawed! You select a "not relevant" paper and then list it as "not relevant"? That is not fair! Also, when a paper is published in a journal, the paper generally does not reproduce all the information necessary to understand it from scratch. A certain amount of scholarship is expected of the readers, at least an understanding of the papers cited.

Our paper is not a tutorial on Facet Analysis! If anyone wants a tutorial, there are quite a few available, most of them formed by copiously copying from Ranganathan's works or from works that copied his works. Some are formed by a generous selection from Ranganathan and a limited selection from the CRG (the CRG itself is an offshoot of Ranganathan's teaching) and from his students in the UK. I have not come across any significant development beyond what Ranganathan has said, except a few additions to the Categories (facets), as Prof. Vickery made a long time ago. If there has been any addition to the principles of facet sequence or array isolate sequence, or any enhancement thereto, please let me know. I would be grateful. Our paper does not copy and reproduce Ranganathan's work; it is a further development of POPSI and the DSIS, which are based on Ranganathan's facet analysis. The origin is Ranganathan's paper "Subject Headings and Facet Analysis", published in the Journal of Documentation (1964), Vol. 20, p. 109-119. The fact is that though Ranganathan's facets are generally understood to be just the five facets (PMEST), he in actuality had seven facets! :-) They are [Basic Facet], [Personality Facet], [Matter Facet], [Energy Facet], [Space Facet], [Time Facet] and [Common Isolate Facet] (the anteriorising and the posteriorising types). Never mind if you do not understand, because you have to read his works. [Strange that it is NOT FIVE but SEVEN. How this escaped the analytical minds of the stalwarts who say our paper is "not relevant" is a wonder indeed! From now onwards everyone will say SEVEN and not FIVE. But there are some points for argument too]. In our paper, we have managed with just four Elementary Categories, [Discipline], [Entity], [Property] and [Action], and a concept called [Modifier], as enunciated by Bhattacharyya in his works on POPSI. To generate different types of organizing sequences we have the concepts of [Base] and [Core]. Well, if you read both Bhattacharyya's papers on POPSI and my FID/CR Report No. 21, you will understand it all. When one writes a paper for a journal, the space is limited.

I would like to put all our documents on this site whenever I find time. Though the origin of POPSI was Ranganathan's paper "Subject Headings and Facet Analysis", there was already, in the 1969 DRTC Annual Seminar 7, a paper entitled "Postulate based subject headings for a dictionary catalogue system" by Prof. Bhattacharyya and Prof. Neelameghan. But this Seminar volume is the old cyclostyled one, fit to become an antique, and perhaps available in some library here. If you really want to understand a concept you have to trace its origin and get at the original documents! One caution though: deductive reasoning alone does not work well for Facet Analysis and its understanding! (Library Science is a social science and so is Library Classification). (Cheththaal thaan chudukaadu therium).

Our paper is not concerned with how to put a faceted classification scheme on the web. We are not "putting any facet on the web". We are not designing any faceted classification scheme and storing it anywhere. We are showing how faceted indexing can produce an organizing, classificatory effect, and that it can be used for organizing and accessing any resource, including web resources. Our paper is not concerned with how an information resource is identified and its structure explicated. The concern is not whether the structure of the information resource is described using Dublin Core, or the fantastic templates developed by the ROADS project in the UK <http://www.ukoln.ac.uk/metadata/roads/what/> (forgotten by many, and the project has now ceased to exist), or the WordStar-reminiscent HTML, or the verbose XML, or RDF, etc.

The concern of the paper is how a facet-analyzed subject heading (a structured subject heading assigned on the basis of the theory of POPSI (POstulate based Permuted Subject Index) and elaborated as the Deep Structure Indexing System, which indicates how the different types of index displays could be generated by computer: "Computerized Deep Structure Indexing System. FID/CR Report No. 21. FID/CR Secretariat, Frankfurt. 1986") could be used to provide an expression that is meaningful and has the capacity to produce an organizing sequence when sorted alphabetically, resulting in the organization of web resources, and how the same expression could be used in a retrieval environment. The paper is only about subject retrieval, not retrieval based on other data elements such as the name of the creator or the language of the resource. A detailed discussion of the formation of the index files for retrieval is not presented, because the system uses the age-old technique of inverted index files (index-sequential files) and such a discussion would be trivial. Almost all information systems (Google, Yahoo, ...) use inverted index files.
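For readers who have not met the inverted index before, here is a minimal sketch in Python; the documents, headings and field choice are invented for illustration only, and the paper itself does not prescribe any particular implementation.

    # A minimal sketch of an inverted index over structured subject headings.
    # The documents and headings are invented for illustration only.
    from collections import defaultdict

    resources = {
        "doc1": "Agriculture, Rice, Disease, Fungal Disease, Control",
        "doc2": "Agriculture, Rice, Harvesting",
        "doc3": "Medicine, Lung, Disease, Tuberculosis, Treatment",
    }

    # Build the inverted index: each term points to the set of documents containing it.
    inverted_index = defaultdict(set)
    for doc_id, heading in resources.items():
        for term in heading.split(","):
            inverted_index[term.strip().lower()].add(doc_id)

    # Retrieval is a simple lookup and intersection of posting sets.
    print(sorted(inverted_index["disease"]))                           # ['doc1', 'doc3']
    print(sorted(inverted_index["rice"] & inverted_index["disease"]))  # ['doc1']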

I am reproducing our 2002 paper, as such, below for you all to see and read. I may set up a blog to help you give your comments. Of course you would have to register your name in order to be able to add your comments. For the time being send me e-mails. I will reproduce the "good" ones :-).

Purpose 2: Pondering Over the Semantic Web, Intrigued by the Course of its Development

The second purpose is to ponder, in a lighter vein, some factors in the development of the information scenario leading to the web, and to wonder what it would take to get to the semantic web if the stalwarts stick to the "stay the course" policy in its development. Well, I am intrigued by the fact that the visionaries did not envision the need for the "semantics" of the web early enough, and do not realize that real evolution and development could be achieved only if a historical and developmental study of information systems is carried out diligently to get at and recognize the factors necessary for the evolution to be carried forward.

I am reminded of phrases I heard at the IIT Madras Computer Centre, such as "you IBM 370 assembly programmers will always revolve around the 16 registers" (some IIT Madras Computer Centre friends may remember simple assembly language macros developed to do this without using up another base register), "a good systems programmer is not necessarily going to be a good systems administrator or system designer", "you higher level programmers will always think in terms of 'if then else' and miss the niceties of the quantum jump that is necessary to recognize the connection between the seemingly unconnected", and so on :-).

Well, there was the ARPANET, and ASCII won over all others such as EBCDIC and similar codes, and communication between computers became a possibility in spite of different Operating Systems. There was Apple and "hypertext", which was developed as an easy page-flipping and display technique with the added ability to jump to related pages and sections of text put on the computer as an information resource, using embedded links (Librarians called directive links "See" references and related page/section links "See also" references). It was basically a text displaying technique, with no database methodology at all. There was Gopher, and the Browser followed, and experts (non-Librarians) who perhaps had no exposure to information system design but had used enough word processing took a fascination to the WordStar-like, text-displaying hypertext and anointed it as the standard for putting information on the net. Users, having used word processors, took a fascination to this, and the web grew in a cooperative manner, but it grew wildly! I am at a loss to think of anyone who actually designed the web consciously. If there is one, please let me know. I would be grateful. Well, there were and are several types of information systems, and visibly there were/are INIS, AGRIS, MEDLARS, Chemical Abstracts, Engineering Index, and a few in Computer Science too. Why the designs of these have never been taken into cognizance is a mystery, and it is sad.

Had a Librarian been involved in the development of the web, he would have realized that it was going to be a cooperative, global database and would have tried to set up the correct guidelines and standards. This is because Librarians have been involved in designing at least bibliographic databases. From the sixties onwards librarians knew that there are data elements identified by Data Names (a name for the data: what a beautiful, meaningful (semantically rich), simple label compared to the semantically confusing label "meta data") (some people are always interested in expressing simple things in high-sounding, semantically poor terms to hijack the ideas and make them their own; more examples may be found below), that most of them have variable length, and that some of them can repeat in the description of an information resource. Using this knowledge, Librarians developed Machine Readable Catalog standards ("United States Standard for Information Exchange", Journal of Library Automation, Vol. 1, 1968).

Librarians' involvement in information systems design is well known; consider the CAN/SDI and CAN/OLE systems of NRC, Canada, for instance. Librarians did contribute to the development of international cooperative bibliographic information systems. For instance, if we take the FAO's AGRIS, Librarians knew that there should be a well-defined "data element directory" (Librarians called it the AGRIS Cataloging Manual) giving the Data Names, their definitions, what should form the data element for a particular data name (including sub-elements), where a particular data element is to be found in the document described, and how it is to be recorded, along with examples. As the system was being developed cooperatively by member countries, and as there could be records in different languages, the data names were identified with unique codes (tags) to cut across the language barrier, and in the records the data name itself is replaced by its code. This saved a lot of space in the records too and made them non-verbose, unlike the XML of today. Also, to help in easily ascertaining documents for inclusion and categorizing them, there was a Categorization/Classification scheme with codes for the topics, though it was not a faceted classification scheme displaying full hierarchy. Apart from these, there was a vocabulary control tool called the "AGROVOC thesaurus", displaying up to seven levels of hierarchy of the terms to be used as index terms. The index terms themselves are to be selected from AGROVOC and recorded according to a "standard syntax" to further categorize the documents coextensively, completely, semantically, and in a consistent manner. Several standard templates for the different types of documents were also developed, and the Librarians in all the participating countries were trained to ensure consistency.
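As a purely hypothetical illustration of this idea (the tags and data names below are invented and are not the actual AGRIS tags), a coded record keeps the data names out of the record itself, so it stays compact and language independent:

    # Hypothetical tagged bibliographic record: numeric tags stand in for data names,
    # which are defined once in a data element directory (cataloging manual).
    data_element_directory = {
        "100": "Title",
        "200": "Author",             # repeatable
        "300": "Language of text",
        "400": "Index term",         # repeatable, drawn from the controlled vocabulary
    }

    record = [
        ("100", "Integrated control of rice blast"),
        ("200", "Devi, K."),
        ("200", "Kumar, S."),
        ("300", "en"),
        ("400", "ORYZA SATIVA"),
        ("400", "PLANT DISEASES"),
    ]

    # Rendering the record for a reader substitutes the data name for the tag.
    for tag, value in record:
        print(f"{data_element_directory[tag]}: {value}")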

Librarians knew well that to be successful, at least three important tools are needed for an information system of the type they were handling. They are:

  1. Data element directory (Cataloging Manual)
  2. Classification Scheme (unfortunately in the West only the monolithic, enumerative, voluminous variety was known, unlike in India where a classification scheme was almost synonymous with a faceted classification scheme) for categorization of the documents; and
  3. Thesaurus (vocabulary control tool) for consistent indexing (assigning index terms), which compensated for the pitfalls of the enumerative classification scheme in not being capable of representing the subject of the document coextensively and completely, and brought in the faceted model, as it was a compilation of uni-terms standing for uni-concepts. [Prof. Fugmann has cautioned about the pitfalls of using a thesaurus alone for information retrieval in his paper "The glamour and the misery of the thesaurus approach", published in International Classification, 1974, Vol. 1, p. 77-]


The web has grown without any of these tools (though Deep Web Databases use them) and now attempts are being made to rectify the web to make it more useful and processable :-). Perhaps the same approach would help achieve the goals at least partially.

  1. A universal directory of data elements giving data names for the elements in different languages, each identified by a specific tag or code, along with sub-elements, definitions, illustrative examples, etc. It is doubtful whether these data elements could be given in any hierarchy beyond showing the sub-elements of an element, because hierarchy is "purpose oriented" and there is no absolute universal hierarchy. The data names would not be repeated in front of and behind the data element, in full conformity with the KISS principle, a sort of derivation from the "Law of Parsimony" which Ranganathan quoted time and again in his books and articles on Cataloguing.
  2. A universal scheme comprising a collection of Classauri (a Classaurus is a faceted scheme of terms indicating hierarchy, enriched with synonyms; first enunciated in 1982), developed for several disciplines and subjects / "domains" :-), in different languages or language independent and economical, fitted with codes (class codes) for each of the terms just as in a faceted classification. Software can easily use the code to pick up the terms in the universal classification scheme and translate them into the desired language of the user. The code is the key, because the same term may fall into different hierarchies, necessitating a lookup of all the hierarchies in which the term occurs to decipher which one is appropriate and make sense of it. It would be wise to use class codes, following the KISS principle, rather than force logico-algorithmic formalisms (see the sketch just after this list).
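A minimal sketch of that last point, with invented codes and terms (not taken from any actual classaurus): a hierarchy-reflecting class code lets software recover both the chain of superordinates and the preferred term in any language.

    # Invented classaurus fragment: each isolate carries a hierarchy-reflecting class
    # code together with its preferred term and synonyms in more than one language.
    classaurus = {
        "J":    {"en": ["Agriculture"],    "fr": ["Agriculture"]},
        "J3":   {"en": ["Food crops"],     "fr": ["Cultures vivrieres"]},
        "J381": {"en": ["Rice", "Paddy"],  "fr": ["Riz"]},
    }

    def broader_chain(code, lang="en"):
        """Recover the hierarchy of a code by walking its prefixes."""
        return [classaurus[code[:i]][lang][0]
                for i in range(1, len(code) + 1) if code[:i] in classaurus]

    print(broader_chain("J381"))        # ['Agriculture', 'Food crops', 'Rice']
    print(classaurus["J381"]["fr"][0])  # 'Riz' (same code, different language)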


 
When we were discussing the classaurus, I opposed Dr. G. Bhattacharyya when he wanted a class code fixed to each of the terms (isolates: a peculiar term, but well understood by Librarians who know the Colon Classification, and we take it for granted that people understand this word; it just stands for an idea represented by a term in a faceted classification scheme), because I thought that there was no need for any class code for the terms in the classaurus. And it would not make it that different from a faceted classification schedule, except that it would also have synonyms. But he was not convinced and wrote his paper for the Augsburg Conference (4th International Study Conference on Classification Research, Augsburg, Edited by I. Dahlberg. Indeks Verlag, Frankfurt, 1982. p. 139-) that way. Well, I thought, it is a vocabulary control tool and not a classification-code-assigning tool. In the Library, when books are returned they are to be put back in the organizing sequence. Similarly, the organizing sequence is to be restored every time it is disturbed. To mechanize the restoration of the sequence and make it simple and mechanical for the library clerks, the code is necessary. They need not have been educated in English-medium schools, as they need not read the titles to determine their place on the shelves. It is enough if they can read numbers. But in a computer environment there is no restoration of the disturbed holy sequence. The documents are there always, unless there is a hard disk crash. So I went on to develop the alphabetical classaurus and said that, to update it, there is no need for any class code to denote the position of the terms. Now I realize that it may be necessary to identify each isolate term by a unique, hierarchy-reflecting code so that it becomes universal, piercing the language barrier armor with ease (terminology influenced by current events). To avoid codes you have to beat around the bush with many logical derivations and deductions of inclusiveness, implication and so on. Wish you good luck with your winding ways. (WWW stands for the Winding Ways of the Web!)

Yet another interesting factor is the development of DBMSs of various types, independent of the bibliographic databases of the Library field. There were the Hierarchical DBMS, the Network DBMS and the famous Relational DBMS. Most of them used fixed fields (fixed-length data), and the idea of repeating data elements got incorporated quite late. Librarians recognized variable length and repeatability of data a long time ago. The bibliographic-type databases and the DBMS types existed separately and grew separately, each not recognizing the existence and the niceties of the other. Maybe it is better to keep them that way instead of mixing them and creating an omnibus solution.

The inverted index and the post-coordination technique of search (Librarians called it "coordinate indexing", "post-coordinate retrieval" and so on; others preferred to call it "Boolean search" based on inverted indexes, though whether Boole himself would have allowed his name to be used this way is doubtful) emerged as the panacea for information systems.

Well, Librarians knew very well, from the day "coordinate indexing" as a concept and as a method was founded by Mortimer Taube in 1951, that post-coordinate indexing would result in many pitfalls and in time wasted scanning the retrieved results to discover the relevant ones. There have been several papers published in the Library literature, even some entitled "Pitfalls of the post-coordinate index", "Why post-coordination fails" and so on. Pre-coordinate vs. post-coordinate indexing studies have been a regular part of the Library Science curriculum for the past 30 years. In fact, those who have now found out that the search engines are not retrieving as they should, and who advocate other forms and winding solutions, should go through the texts and research articles taught in Library Science schools so that they evolve better solutions. I understand that faceted-classification-type schemes are being reinvented with new names and slightly altered representations, and called by high-sounding, awe-inspiring names like Ontologies!
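To make the pitfall concrete, here is a small hypothetical example (documents and terms invented): post-coordinating the uniterms "library" and "school" cannot tell a document on library schools from one on school libraries, whereas a pre-coordinated heading preserves the relation between the terms.

    # Hypothetical "false drop" in post-coordinate (Boolean) searching: the uniterm
    # index loses the relationship between terms that a pre-coordinated heading keeps.
    postings = {
        "library": {"doc_a", "doc_b"},
        "school":  {"doc_a", "doc_b"},
    }
    precoordinated_headings = {
        "doc_a": "Education, Library School, Curriculum",       # about library schools
        "doc_b": "Education, School, Library, Administration",  # about school libraries
    }

    # Post-coordinate search retrieves both documents; one of them is a false drop.
    print(sorted(postings["library"] & postings["school"]))   # ['doc_a', 'doc_b']

    # The pre-coordinated heading keeps the compound intact and filters exactly.
    print([d for d, h in precoordinated_headings.items() if "Library School" in h])   # ['doc_a']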

I did see two papers addressing this. One is Prof. Dagobert Soergel's "The rise of ontologies or the reinvention of classification" <http://www.dsoergel.com/cv/B70.pdf>, published in the Journal of the American Society for Information Science, 1999, Oct., Vol. 50(12), p. 1119-1120. How I wish it were published again in the new "Journal of Ontology" too, and in several other computer science journals, and a copy sent to each and every one working on the semantic web. (I would celebrate such an event with Sivakasi crackers).

The other is Prof. Marcia J. Bates's "After the Dot-Bomb: Getting Web Information Retrieval Right This Time", published in First Monday <http://www.hastingsresearch.com/net/08-net-information-retrieval.shtml>. I also found a small note that asks a pertinent question: do we have to invest in pre-coordinate indexing, which was thrown away for the cheap coordinate indexing? <http://www.websearchguide.ca/netblog/archives/003621.html>. A full article, "Metadata--Think outside the docs!" by Bob Doyle, is available at <http://www.econtentmag.com/Articles/ArticleReader.aspx?ArticleID=7947>. If I meet these authors, Prof. Soergel, Prof. Marcia J. Bates and Prof. Bob Doyle, I will give them each a mouthful of sugar. This is the way Thamilzh people tell you how pleased they are with what someone has said. (I am giving the Thirunelvaeli Thamilzh expression for this in transliterated form: "Avunga vayilae ainthu aru cheeni allippodanum").


Perhaps a well-researched scientific paper like Prof. T.D. Wilson's "The Nonsense of Knowledge Management" <http://informationr.net/ir/8-1/paper144.html> would throw much light on this. And I wish that it too gets published in the Journal of Ontology. Is the representation of a classification scheme such a great research concern? I am surprised. Any hierarchy could be represented as a multi-linked tree structure and finally as a binary tree. To represent the hierarchy in "word / text" form (instead of showing the hierarchy by notation (code) or by indentation), simply keep (repeat) all the super-ordinates! To me it appears to be a case of "catching the horse by the tail having let go of the bridle" (Thumbai vittu vittu vaalaip piditha maathiri :-)). While looking at the semantic web development proposing XML, RDF, RDF Schema, DAML+OIL, OWL, "Semantic Cloud", Purgatory, etc., I do not know why I am reminded of the ancient labyrinths.
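A minimal sketch of the point above about keeping all the super-ordinates, with an invented hierarchy: repeating the superordinates reproduces the hierarchy in plain words, with no notation and no indentation.

    # Invented hierarchy; each term points to its immediate superordinate (None at the top).
    hierarchy = {
        "Rice": "Food crops",
        "Food crops": "Agriculture",
        "Agriculture": None,
    }

    def text_form(term):
        """Express a term together with all of its superordinates, in words."""
        chain = []
        while term is not None:
            chain.append(term)
            term = hierarchy[term]
        return " / ".join(reversed(chain))

    print(text_form("Rice"))   # Agriculture / Food crops / Rice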

If you load too much on a thin framework it is bound to lose balance and collapse. Patchwork to keep up the flimsy framework, however ingenious it may be, will not hold for a long time. If you start adding more snow to an already built small snowman and try to make it a big snowman, you will end up getting a heap of a pyramid instead -- of course a big one :-).

King Nebuchadnezzar was really lucky to have had the Prophet Daniel. This "Agent" told the King what he had dreamt and gave its interpretation. How I wish I had an "Agent" like Daniel, and also like Joseph, who not only interpreted the dream for the Pharaoh but also managed and executed what had to be done. How nice it would be if the "Agent" automatically identified my "information needs", searched for the required information, evaluated and verified that the information retrieved is not a hoax investment request :-), executed proper instructions to carry out tasks that are high priority, and so on. But taking care of the "Agent" would be my first priority, and I would never be able to sleep, because I would have to watch over him with wide-open eyes so that he is not kidnapped or corrupted in any way!! Then I would become a glorified watchman! Perhaps when your head / ear phone is connected to a handheld, it will automatically analyze your brain waves, find out what you have in mind (what you dream about), find answers / resources, and even act on your behalf, giving instructions to other appropriate trustworthy agents, and then flash to you what actions have been taken. But if everything you dream about is acted upon by the super agent, you may get interesting lawsuits too :). Oh well, there would be a series of further developments to help the glorious watchman. Anti-Agent-Hijacking Agent suites would appear. "Protect Your Agent -- use our Armor Software" ads will flash. But software for clandestinely corrupting your agent will emerge too. "How to seduce an agent" would become the main topic of research for some groups. Double agents will start lurking somewhere in the network. Some of these "dark agents" will be constantly moving agents. You may have to move your stuff and your agents constantly on the web to avoid the crosswires. There will be Satanic agents and Angelic agents and so on. You may have to scan the entire web constantly to find out where your agent attacker (Satan) is hiding. Money will be gathered from frightened users and channeled into top-class research, and the possibilities would be immense.

I am not sure, if we have an agent doing too many things, how efficient the agent would be at each of the different tasks. We do have Heart Specialists and Cancer Specialists. If we have one doctor who is a specialist in everything, then he becomes a general practitioner! How efficient an all-doing software agent would be is quite interesting to imagine. So we would have to have several agents, each for specific domains and tasks, and another super agent to select the appropriate one (the Database of Databases of Librarians). Whether each of these expert agents will process the whole web for each fragment of the query submitted is difficult to imagine. Oh, I forgot: where are the "expert systems" now? What are they doing? Are they the same as agents? Will the semantic web too follow suit? Is the Winding Way Web (WWW) technology there to spin money through new enhanced browsers, web editors and ontology builders, and to generate excitement and revenue for a group? Time will tell.

It would be better for the textual bibliographic database type and the DBMS database types to have a peaceful mutual coexistence, and not to try to develop agents that would do all the mixing up and messing up. It would be better to solve the semantics of one type of information resource first and then tackle the other.

It must be kept in mind that Ranganathan had a clear vision of what a document is, of its parts, and also of the different planes wherein the work of classification has to be carried out. These are to be kept in mind while developing tools and techniques to manage them accordingly. If we mix them, then we may have difficulty, as happened when HTML tags were developed to play two or more roles: codes for identifying the structure of documents, codes for the physical display of the text, and codes for identifying the intellectual content of the documents. Now there are attempts to separate these. What I wish to say is that some analysis of the types and variety of resources, their characteristics and the way to manage them would have to be done before any other resource description work is started.

I am not concerned about the DBMS-type databases on the web. They will exist, and the deep web will also exist, as many companies' survival depends on them.

The question is "Can we make the text documents more meaningful --- just the intellectual content of them to begin with ? ". The structured subject expression / POPSI /DSIS heading (may be if I call it "Logico Semantic Domain Expression" [LSDE] it would attract the terminology savvy web designers) which is proposed to index web resources in the following paper, has some use at least in certain respects.

Consider that each of the articles in an encyclopedia, which is arranged alphabetically by the titles of the articles, is fitted with the structured subject expression (Logico Semantic Domain Expression: this phrase has started gaining acceptance in my mind too :-)) derived according to the method proposed in our paper. If you sort the articles according to this structured subject expression, then what you produce is a logical organization of the articles, fit to form a textbook as it stands! This is what Bhattacharyya tried to show in his paper "POPSI: Its fundamentals ...", published in Library Science with a Slant to Documentation, Vol. 16, March 1979. (Never heard of this journal? If you do not read it, you will miss a lot on Facet Analysis).

Yes, the articles would be arranged in a classified / organized sequence (a hierarchic and semantically related sequence). Imagine you have fitted each of the resources on the web with such a structured subject expression in the header; if you then sort them on this header, you have produced an organized web! Imagine that each of the sections, sub-sections, paragraphs, sub-paragraphs and each semantic unit of all information resources on the web has been assigned such a structured subject expression; then information units on the same minute topic could be gathered, cutting across several resources, and consolidated. I do not advocate sorting and reloading the whole web. Oh no! Far from it. While I cannot approve of, or even imagine, agents processing the same web document / data again and again for each query, travelling the whole World Wide Web and processing it and processing it, how could I advocate sorting the whole web? I won't. The solution of Librarians would do the trick! More on this later. Well, the design and development of a classaurus, a faceted classification scheme enriched with synonyms (the so-called ontology), would be effortless and would be a cooperative affair, as it could be built from the structured subject expressions assigned to individual information units. But to cut across the language barrier, you may have to use class numbers reflecting the hierarchy. If class codes (notation) are to be assigned universally, then such a tool may have to be managed and updated by a central agency. In a cooperative information system like AGRIS a few of the things are done by the central agency, especially those relating to updating the Categorization scheme, the data element directory (Cataloging Manual) and the vocabulary control tool. It is also interesting that AGRIS had plans for a Level II, wherein documents found to have high value would be formed into a full-text (classics) database, and so on. (AGRIS happens to be a bibliographic reference retrieval system backed up by full-text supply.) AGRIS was a Decentralized Input, Centralized Processing and Decentralized Utilization model Information System.
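As a small hypothetical demonstration of the organizing effect described in the last two paragraphs (the headings below are invented, not taken from the paper): alphabetical sorting on the structured subject expression files the general before the specific and collocates related resources.

    # Hypothetical web resources, each fitted with a structured subject expression
    # in its header; sorting on that expression yields a classified, organized sequence.
    resources = [
        ("page1.html", "Agriculture, Rice, Disease, Fungal Disease, Control"),
        ("page2.html", "Agriculture, Rice"),
        ("page3.html", "Agriculture, Rice, Disease"),
        ("page4.html", "Agriculture, Wheat, Harvesting"),
    ]

    # The general topic files before its specializations, and all aspects of a topic
    # stay together, producing the classified sequence described above.
    for url, heading in sorted(resources, key=lambda r: r[1]):
        print(f"{heading:55} {url}")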

[Please refer to the papers: 1) G. Bhattacharyya: Classaurus: Its fundamentals, design and use. In Universal classification, subject analysis and ordering systems: Proceedings of the 4th International Study Conference on Classification Research, Augsburg, edited by I. Dahlberg. Indeks Verlag, Frankfurt, 1982, p. 139-. 2) F.J. Devadason: Online Construction of Alphabetic Classaurus: A Vocabulary Control and Indexing Tool. Information Processing and Management, Vol. 21, 1985, No. 1, p. 11-26. 3) G. Bhattacharyya: A General Theory of SIL (Subject Indexing Language), POPSI and Classaurus: Results of current classification research in India. Paper presented at the International Classification Research Forum, organized by SIG/CR of the American Society for Information Science, Minneapolis, October 1979. Please study also at least one information system such as AGRIS, and the global model of such an information system.]


Here is our paper published in 2002! You may read the Epilogue given below afterwards.
Click to get "Faceted Indexing Based System for Organizing and Accessing Internet Resources".
==========================================================================

EPILOGUE

The web is an information system, not just a computer system. Without information the web is an empty network. Consult those who have handled information for decades. I have heard the phrase "Content is King", but those who have managed the Mighty King have been left out. Brainstorming with a good, representative set of persons involved with information (communications specialists, librarians, bibliographic information system design experts, deep web system designers, DBMS and IM stalwarts, and the new wave of web processing language experts) would be needed to give the web a new direction, based on a thorough analysis of the types and variety of resources, their characteristics, and how the different professions have dealt with them, so as to develop a solid foundation. The DBMS-type web resources can have a header giving the record structure of their database, allowing any agent to understand and formulate a search query according to the search system as explicated in the record structure / database structure, submit the query to the database system, and wait for the answers. The answers could then be taken by the agent and further processed. The record structure could be the age-old, easy COBOL Data Division structure: the hierarchy could be indicated by the level numbers, and the data names could be defined using the same structure, declared as variable or fixed length, each along with its Universal Code! (Ah, what an abhorrent idea in the days of OOS and OOA!) Well, a COBOL-like structure would indeed be liked by all. The web is to be built up by common people who may not even care to know how to indent "if then else" properly. Even if user-friendly design tools are made available, it would be more creative to make the users participate in "developing" the web, not just growing the web in terms of size. (Maattai thanni kaattaththaan mudium.)
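A sketch of that header idea, with invented data names and codes (nothing here is taken from an actual standard): a level-numbered record description, in the spirit of a COBOL Data Division, carries the hierarchy in the level numbers alone and can be read mechanically by an agent.

    # Invented level-numbered record description in the spirit of a COBOL Data Division:
    # the level numbers alone convey the hierarchy of the data elements.
    record_structure = [
        (1, "BIBLIOGRAPHIC-RECORD", ""),
        (2, "TITLE",     "variable length, code 100"),
        (2, "AUTHOR",    "variable length, repeatable, code 200"),
        (2, "IMPRINT",   ""),
        (3, "PUBLISHER", "variable length, code 310"),
        (3, "YEAR",      "fixed length 4, code 320"),
    ]

    # Rendering the structure: indentation is derived from the level numbers,
    # which is all an agent needs to reconstruct the hierarchy.
    for level, name, note in record_structure:
        line = "  " * (level - 1) + f"{level:02d} {name}"
        print(line + (f"  ({note})" if note else ""))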

FINAL WORD:

As a final word I wish to state that the Library Profession is a noble profession. The profession as such has not engaged in any business to make money. I am sad to see that the contributions of such a profession, which has helped everyone to overcome their intellectual weaknesses / knowledge gaps, are being kidnapped, hijacked, for the glory of a few, without acknowledgement and recognition. Librarians and Library schools should start research projects to develop better techniques of organizing the web. Monolithic enumerative classification schemes will not help. Independent of what the so-called web standard builders are doing, Librarians and Library schools should carry on their research without worrying about the winding ways of the web standard developers.

Please make copies of Prof. Dagobert Soergel's paper "The rise of ontologies or the reinvention of classification" <www.dsoergel.com/cv/B70.pdf>, published in the Journal of the American Society for Information Science, 1999, Oct., Vol. 50(12), p. 1119-1120, as well as Prof. Marcia J. Bates's "After the Dot-Bomb: Getting Web Information Retrieval Right This Time", published in First Monday <http://www.hastingsresearch.com/net/08-net-information-retrieval.shtml>, and circulate them to all the Web experts and researchers.

If anyone can prepare a paper following the model of Prof. T.D. Wilson's, please do so, and try to publish it in computer science and web science journals.

"Like Library Science, Web science is evolving as a Social Science dealing basically with information. To force it with nuts and bolts to keep it as a hard science may retard this evolution".         

To ignore the Basic Laws (Law of Parsimony, Law of Symmetry, Law of Impartiality, etc.), applicable to intellectual work in all areas of knowledge; the Fundamental Laws (the Five Laws of Library Science), applicable to a discipline dealing with information; the Canons (Canons of Cataloguing, Canons of Classification, etc.), applicable to a branch of Library Science; and the Principles (Principles for Helpful Sequence of Array Isolates in Classification, etc.; "array" here does not refer to the allocation of an array in memory, but to a set of coordinate ideas in a hierarchy), which together with the Tools and Techniques developed to handle Information constitute the theoretical foundation, would lead to reinventing the wheel after spending a lot of manpower, money, resources, time and energy.

I welcome your comments and suggestions.

24 Feb, 2007

E-mail: devadason_f_j@yahoo.com


