Intellectual activity, knowledge, information, data... An attempt to define it in an applicable way

Konstantin M Golubev, Tatiana A Golubeva
General Knowledge Machine Research Group
Kiev, Ukraine, E-mail: gkm-ekp at users.sf.net

Knowledge Management Discussion Paper
By Konstantin M Golubev

Revised 9-Dec-1999, 1-Sep-2008

"We are drowning in information but starved for knowledge"
(John Naisbitt, author of Megatrends)

Preface

If you have no problems, just enjoy your life! You do not need any papers on intellectual activity, knowledge, and so on. This rather complex paper is intended for those who have problems, want to solve them, and do not know how. It may also be helpful for those whose business is to help others solve their problems and find opportunities.

What is intellectual activity?

As we know, the main tool that humans use to survive is the intellect. Without intellect, all other tools are useless and can even do harm instead of good. Let us try to understand what human intellectual activity is and how to make it fruitful.

"He possesses two out of the three qualities necessary for the ideal detective. He has the power of observation and that of deduction. He is only wanting in knowledge, and that may come in time." (Mr Sherlock Holmes)

THE SIGN OF FOUR, p.91. Sir Arthur Conan Doyle. The Penguin Complete Sherlock Holmes. With a preface by Christopher Morley. Penguin Books, 1981.

We would like to mention that the Sherlock Holmes stories were written by Sir Arthur Conan Doyle to illustrate the methods of intellectual activity of brilliant experts such as Dr. Joseph Bell of the Edinburgh Infirmary (see the preface by Christopher Morley). Sir Arthur Conan Doyle was a known expert too. Therefore we believe that he described intellectual activity correctly.

Following Mr Sherlock Holmes, we can identify the following steps of intellectual activity:

1. Observation (getting information and data)
2. Producing propositions based on the knowledge
3. Selection and verification of the most appropriate propositions
4. Memorizing (converting data to information and creation of a new knowledge item)

1st step. Observation

The first step is always the collection of information and data. Without it, all other steps are senseless. We propose to treat as 'data' everything that can be perceived by a person: text, sound, pictures, multimedia, etc. The part of the data that is directly connected to the knowledge a person possesses we propose to call 'information'. It is the part that is really involved in solving a problem. For example, imagine that you are listening to a very interesting lecture and the speaker sometimes uses a language that you do not understand (though many others do). That part of the lecture is simply 'data' for you, just like music, although it may be 'information' for others. The part of the lecture given in a language you know may contain valuable information that helps you. Obviously, this part varies with the experience of the particular person: what is valuable for one person may be useless for another.

2nd step. Producing propositions based on the knowledge

Famous experts in Artificial Intelligence (AI), Allen Newell and Herbert Simon, developers of the General Problem Solver, proposed defining memory elements as rules called 'productions' of the form 'If Situation Then Action'.

Taking this definition into account, we propose to define knowledge elements as three-part stable memory patterns. Each pattern contains:

1. Description of a problem (data, memorized at the time of a problem's solving)
2. Name of a problem (should be unique text)
3. Description of a problem's solution (the actions needed for a verification of a solution applicability and for a problem's solving) and expected result
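The three-part pattern listed above could be represented as a small record type. The following is only a minimal sketch: the names `KnowledgeElement` and `KnowledgeBase`, the choice of a set of idea texts for the description, and the example data are all illustrative assumptions, not part of the original proposal.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeElement:
    """One three-part memory pattern (hypothetical representation)."""
    description: frozenset[str]   # part 1: data describing the problem
    name: str                     # part 2: unique name of the problem
    solution: str                 # part 3: actions to verify and solve
    expected_result: str          # part 3, continued: expected result


class KnowledgeBase:
    """Stores elements and enforces that problem names are unique."""

    def __init__(self) -> None:
        self._by_name: dict[str, KnowledgeElement] = {}

    def add(self, element: KnowledgeElement) -> None:
        if element.name in self._by_name:
            raise ValueError(f"duplicate problem name: {element.name}")
        self._by_name[element.name] = element

    def get(self, name: str) -> KnowledgeElement:
        return self._by_name[name]


kb = KnowledgeBase()
kb.add(KnowledgeElement(
    description=frozenset({"The color is green.", "The surface is wet."}),
    name="Algae on wall",  # made-up example data
    solution="Check for moisture; clean and dry the surface.",
    expected_result="The surface stays clean.",
))
```

Rejecting a duplicate name in `add` is one simple way to honour the requirement that the name of a problem be unique.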

How to describe a problem and its solution?

People, as a rule, use words to describe their problems. There are very many words, and the number of their combinations is countless. It may seem impossible to find sense in such a huge amount of data. Fortunately, what really matters is ideas. Words are just like the clothes that people wear: the same person can wear different garments and remain the same personality. Therefore we believe that a description can be transformed into a set of standard ideas. We propose to define an "idea's text" as a standard text that unequivocally and directly defines a specific aspect of a situation, i.e. represents a stable structure in the right part of the brain, which is responsible for working with images of the world. We think that intellectual activity is based on ideas as images of the world, not on the specific words representing them. This text should not include any superfluous words, and the words included should always be used in their proper sense. For example, people may say: "It looks so green to me"; "I think it's greenish stuff"; "It reminds me of fresh grass." The idea's text should be: "The color is green." Note that people can express the same idea with completely different words.
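The mapping from free wording to a standard idea's text can be sketched as a lookup against a hand-built table. This is a deliberately naive illustration: the table `IDEA_CUES`, the function `to_ideas` and the cue words are assumptions made for the example, and in practice such a table would be prepared by a human expert.

```python
# Hypothetical, hand-built table: each standard idea's text is listed with
# cue words that people may use to express it.
IDEA_CUES = {
    "The color is green.": {"green", "greenish", "grass"},
    "The color is red.": {"red", "reddish", "blood"},
}


def to_ideas(utterance: str) -> set[str]:
    """Return the standard idea texts whose cue words occur in the utterance."""
    words = {w.strip(".,;:!?\"'").lower() for w in utterance.split()}
    return {idea for idea, cues in IDEA_CUES.items() if words & cues}


phrases = [
    "It looks so green to me",
    "I think it's greenish stuff",
    "It reminds me of fresh grass.",
]
ideas = [to_ideas(p) for p in phrases]
for phrase, found in zip(phrases, ideas):
    print(phrase, "->", found)
```

All three phrasings reduce to the same single idea, while an utterance containing none of the cue words yields an empty set.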

And what could the number of such ideas be?

The famous American psychologist Mr Cattell, in his work "Universal Index of Source Traits", attempted to propose a list of items for describing the features of a human personality. The preliminary list included 4,550 different items used by many authors. After synonyms were excluded, it turned out that only 171 were left. We have always obtained the same result in our own experience (medicine, art, banking, business, etc.). The number of ideas used to describe problems and their solutions in a particular area of knowledge that can be learned by one person has never exceeded several hundred. For example, we have found that Oriental Acupuncture (a medical theory) is based on fewer than 500 ideas, and yet it is a great body of knowledge. It seems that this limit on the number of ideas may be a constraint of the individual human brain. Bear in mind that the human brain is a multi-floor building and what we see is only the upper floor. We do think, though, that there are areas which are complex by nature (for example, the World Health Organization's total list of possible diseases exceeds several tens of thousands of items, far beyond the capacity of any single expert).

We should note that only humans have the ability to define ideas and to exclude synonyms. It is a highly creative intellectual activity, and the people developing e-knowledge bases should be respected accordingly. The results of their activity may exceed expectations.

An expert produces propositions based on his own knowledge:

"As a rule, when I have heard some slight indication of the course of events, I am able to guide myself by the thousands of other similar cases which occur to my memory." (Mr Sherlock Holmes)

THE RED-HEADED LEAGUE, p.176

3rd step. Selection and verification of the most appropriate propositions

"I could only say what was the balance of probability. I did not at all expect to be so accurate." (Mr Sherlock Holmes)

THE SIGN OF FOUR, p.93

"For example, observation shows me that you have been to the Wigmore Street Post-Office this morning, but deduction lets me know that when there you dispatched a telegram... The rest is deduction... Why, of course I knew that you had not written a letter, since I sat opposite to you all morning. I see also in your open desk that you have a sheet of stamps and a thick bundle of postcards. What could you go into the post-office for, then, but to send a wire? Eliminate all other factors, and the one which remains must be the truth." (Mr Sherlock Holmes)

THE SIGN OF FOUR, p.91

4th step. Memorizing (converting data to information and creation of a new knowledge item)

If the problem was solved (proven success or proven failure), a new memory element should appear, comprising the problem's description, the problem's name and the description of the problem's solution. Data is transformed into information, and any new problem with a similar description will cause a possible solution proposition to appear, based on this knowledge element.
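How a memorized element might later produce a proposition for a similar problem can be sketched as follows. The text does not say how similarity of descriptions is measured; the Jaccard overlap of idea sets, the 0.5 threshold, the function names and the example data are all assumptions chosen for illustration.

```python
def similarity(a: set[str], b: set[str]) -> float:
    """Jaccard overlap between two sets of ideas (0.0 to 1.0); an assumed measure."""
    return len(a & b) / len(a | b) if a | b else 0.0


memory: list[dict] = []  # each item holds a description, a name and a solution


def memorize(description: set[str], name: str, solution: str) -> None:
    """Step 4: store a solved problem as a new knowledge element."""
    memory.append({"description": description, "name": name, "solution": solution})


def propose(observed: set[str], threshold: float = 0.5) -> list[str]:
    """Step 2: names of remembered problems similar to the observation, best first."""
    scored = [(similarity(observed, m["description"]), m["name"]) for m in memory]
    return [name for score, name in sorted(scored, reverse=True) if score >= threshold]


# Made-up example data.
memorize({"engine does not start", "lights are dim"}, "Flat battery",
         "Charge or replace the battery.")
memorize({"engine does not start", "fuel gauge shows empty"}, "No fuel", "Refuel.")

proposals = propose({"engine does not start", "lights are dim", "it is cold"})
print(proposals)  # ['Flat battery']
```

Here the observation shares two of three ideas with "Flat battery" (overlap about 0.67) and only one of four with "No fuel" (0.25), so only the first is proposed for verification.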

Data Processing, Information and Knowledge Management

The following are quotations from Datamation, June 1998.

PROFIT AND VALUE

By Dan Richman

Here's the pitch: Gain corporate intuition, an organizational response to new situations, that lets your company meet and defeat a competitive challenge. The miracle of knowledge management electronically harnesses the know-how and experience of every employee.

...

Using the Dataware II Knowledge Management Suite from Dataware Technologies, AnswerThink created a hierarchy--also known in KM jargon as a taxonomy--of its partners' knowledge. This huge outline includes topics, subtopics, and sub-sub-topics. AnswerThink also identified who knew the most about which topics by scanning e-mails and by identifying who had read or written most prolifically on those topics. The taxonomy and links to the items it indexes (such as e-mail messages, white papers, presentations, checklists, and memos) is accessible over the firm's extranet.

...

What is knowledge management?

Knowledge management is "a set of practices that includes identifying and mapping intellectual assets within organizations, generating new knowledge for competitive advantage, making vast amounts of corporate information accessible, sharing best practices, and applying management strategies and technology that support all of the above." --CAP Ventures, http://www.capv.com/index.html

It is "a business activity with two primary aspects:

Treating the knowledge component of business activities as an explicit concern of business reflected in strategy, policy, and practice at all levels of the organization;

Making a direct connection between an organization's intellectual assets--both explicit (recorded) and tacit (personal know-how)--and positive business results."

--"Knowledge at Work," an on-line publication, http://www.knowledge-at-work.com

It is "a business practice that refers to the concept of harnessing information and knowledge, and making it effortlessly available to all employees to help them do their jobs more effectively." --Doculabs http://www.doculabs.com

...

"Giving context and revealing who knows what are the two most important aspects of knowledge management," says Geoffrey Bock, a senior consultant with the Patricia Seybold Group in Boston. Bock says KM products can be divided into several categories.

One category is text search-and-retrieval engines, a well-established technology that is being expanded to encompass relational data, HTML pages, and other forms of information. This software usually includes filtering and notification features, which are designed to reduce information overload. It lets organizations create their own hierarchies of knowledge, Bock says. Included in this category are products from Excalibur Technologies, Fulcrum, Sovereign Hill, and Verity.

A second type of KM software undertakes the task of organizing knowledge into hierarchies, like those at the Yahoo! site, through which users can browse. Companies in this group include Perspecta, Plumtree Software, and Wise Wire.

A third type of KM software is groupware, such as Lotus Notes.

Finally, some software is designed to interview workers to determine what they do, and how, why, and with what information they do it. It then captures that expertise and makes it available organization-wide. An example is Hyperknowledge.

...

Building a knowledge base

At Broderbund Software of Novato, Calif., internal and Web-based Inference and similar products create casebases from unstructured data, much as database management systems create databases from structured data. These casebases house 7,000 reported problems and solutions for 700 products, says Jim Wilmott, Broderbund's product-support manager. Half of all users' problems are resolved by using the company's Web-based question-and-answer casebase, and nearly three-quarters of the users prefer on-line help to a free phone call, Wilmott says.

The following are quotations from Fulcrum White Paper. Fulcrum is one of the leading suppliers of knowledge management products (www.fulcrum.com).

...

In Enterprise Information Retrieval (1996) the Gartner Group notes that:

Information is being created in today’s enterprises at a rate that staggers the imagination of all but IT professionals. Some is being stored in individual "silos" with access only by those directly involved in the function such as departmental staff. Other information is stored in thousands of linear miles of file cabinets in offices, shared departmental areas, libraries and corporate repositories. There is little representation of currency (timeliness) or other indicator of value…(and information often goes) unsighted because those in need are unaware of its existence…

...

The Knowledge Management Brief Knew Language for New Leverage (1996) concludes:

(There is an) overwhelming necessity for adopting a corporate knowledge management strategy that has been created by the pervasiveness of computers and the huge amounts of information they help us create – and, specifically, the networked organization that makes global access and widespread sharing possible. Too much information is almost as bad as not enough. You have to identify what’s relevant, important, and effective.

...

While a recent Document Software Strategies Analysis by CAP Ventures (1996) states that:

As the pace of our move towards an information economy accelerates, however, we’ve developed a greater appreciation of knowledge as a business asset… And as companies re-engineer processes and infrastructures, it’s become increasingly important to provide the "new" organization with knowledge that enables them to truly do more with leaner resources. We’re moving towards a real acceptance of knowledge as intellectual capital, and away from the traditional view of knowledge management as a decision support activity.

...

Knowledge organizations have been characterized as enterprises in which the key asset is knowledge. Their competitive advantage comes from having and effectively using knowledge. Examples include the law office, accounting firm, marketing firm, software company, most government agencies, universities, the military, and significant parts of most manufacturing companies, whether they make cookies or cars.

...

Dataquest has forecast that the level of spending on developing knowledge management delivery capabilities will increase from $US 410 million in 1994 to $US 4.5 billion in 1999. By 2005, more than 50 million jobs will belong to knowledge workers (US Department of Labor).

...

Industry leaders such as Dow Chemical Co., Coca-Cola Co., Monsanto and major services organizations such as Ernst & Young, Coopers & Lybrand and Andersen Consulting have created a new role within their organizations: the chief knowledge officer. The knowledge officer works to "create a knowledge management infrastructure, build a knowledge culture and make it pay off."

...

The Fulcrum Knowledge Network™ solution has been developed to dramatically improve the way an organization manages information resources. Users can conduct single, unified searches across multiple information sources, including Lotus Notes, Microsoft Exchange, Web sites, file systems, and document management systems.

Simply stated, the Fulcrum Knowledge Network is an integrated suite that intelligently collects the information users want, from any location, and presents it in a way so that users can take immediate action. The result is a more complete and precise information set for your users, thus paving the way for the creation of actionable knowledge.

Montague Institute White Paper (www.montague.com)

What is a knowledge base?

Many companies are using a knowledge base to capture and deploy their intellectual capital. For example, at consulting firm Booz, Allen, and Hamilton, a knowledge base contains the following kinds of information:

Searchable database with links to job histories, resumes, etc.;

Listing of calendar items, business news, and personnel information;

Informative and interactive training materials;

"Forums" or discussion groups;

Links to departmental Web pages;

Searchable database of consulting specialties

Ideas and examples for clients and external groups (e.g. the media).

Why are knowledge bases important?

Knowledge bases are key to creating, preserving, and deploying intellectual capital -- the know how of employees and the databases, reports, and other intellectual assets they produce. The information they contain is:

less expensive to store and disseminate than paper reports;

easier to use for problem solving and decision making than individual computer databases;

easier to search than physical libraries.

Knowledge bases add value to intellectual assets, mostly by making it easier for people to collaborate and get the information they need to solve business problems -- quickly and efficiently. They are important to:

investors (and CFO’s), since intellectual capital is a big factor in determining future earnings and cash flows;

human resources people because they want to develop and capture the skills and know how of employees;

managers and CIO’s because they can increase the return on investments in new information technology.

What is knowledge base publishing?

Knowledge base publishing is a term we use to describe the process of creating, maintaining, and "promoting" a knowledge base. It's a combination of print and electronic formats as well as a new system of relationships among authors and readers.

While traditional publishing is linear and segmented, knowledge base publishing is weblike and interconnected. The difference is most striking when different departments load existing documents onto a corporate Web site, when they use electronic press releases instead of hard copy releases, or when they give customers access to internal databases, such as ordering and shipping information.

New skills are needed

To minimize the amount of time spent in assembling and updating information in a knowledge base as well as maximize its effectiveness, new skills are needed, such as:

identifying and classifying knowledge repositories;

"mapping" business relationships and designing information flows to support them;

structuring source material so that it can be used in a variety of formats for different kinds of users;

cultivating a collaborative culture and designing incentives to promote information sharing;

designing navigation aids that make information easy to locate;

recruiting and leading an interdisciplinary publishing team;

recognizing and capitalizing on new opportunities for revenue generation.

Verity Knowledge Suite White Paper (www.verity.com)

Since its pioneering work for the intelligence community in the late 1980s, Verity has led the market in precision of information retrieval. The Verity Search Engine and related corporate product, Information Server, go well beyond basic Boolean Full Text retrieval. Verity's core search products include the following advanced knowledge retrieval capabilities:

Advanced query expansion and disambiguation tools, including linguistic stemming and thesaurus expansion.

Custom thesaurus creation.

Natural language query input.

Automatic highlighting of search terms and linguistic and thesaurus equivalents.

Combined metadata and full text search: Verity search products maintain user-definable metadata (such as author, title and date) fields. Document indexing, classification, and retrieval can make simultaneous use of full-text and metadata search.

Advanced query navigation. Verity provides advanced query navigation facilities making the most intelligent use of query results to aid in further narrowing search results. Advanced query navigation facilities include clustering (grouping related documents together), relevancy ranking, documents summarization, and query by example.

Rich query language, including 30 query operators, including proprietary fuzzy search technology enabling more precise retrieval.

Dataware White Paper (www.dataware.com)

Executives in large organizations know that they must develop better techniques to manage knowledge, which is increasingly becoming their greatest asset. Organizations currently create and maintain knowledge in isolated systems targeted at specific workgroups. For users outside of the workgroup, that knowledge is virtually invisible. Their only options are to spend time looking for it, recreate it, or do their job without it. Each of these options has a price: time, energy and bad decisions.

Innovative organizations are examining how they can better manage their intellectual capital. This emerging field, called knowledge management, addresses the broad process of locating, organizing, transferring, and more efficiently using the information and expertise within an organization.

...

With this type of return on investment, the market for knowledge management tools is growing and many vendors of information-oriented products are introducing new knowledge management products or re-labeling their existing products as knowledge management systems. Vendors of all manner of tools from intranet development tools to document management systems to search engines are calling their products knowledge management systems, without regard to what that means. Without new technologies designed to implement the revolutionary changes in the way knowledge workers create, communicate and manage knowledge, a knowledge management system has little chance of improving enterprise knowledge sharing.

...

The promise of technologies aimed at knowledge management is that they will help organizations use the knowledge they have more efficiently without changing the tools they currently use to create and process it. This is the promise, but unfortunately what many software vendors tout as knowledge management systems are only existing information retrieval engines, groupware systems or document management systems with a new marketing tagline. What executives really need are new technologies designed to implement the revolutionary changes in the way knowledge workers create, communicate and manage knowledge.

...

While corporate knowledge silos and the barriers they often erect contribute to a perceived lack of information, often referred to as infofamine, most knowledge workers increasingly have access to too much information, often called infoglut. The Internet has led to a deluge of information, but most of it is not useful for any given task. Most people have used Internet search engines to look up information on a specific topic, only to have it return thousands or tens of thousands of hits, most of them irrelevant.

Another contributor to infoglut is the overuse of email. While email is undeniably a useful communications medium, it also provides the means to drown people with "just-in-case" information. Most people only need information on a "just-in-time" basis—the right information at the time that they need it.

...

No single technology fills all the criteria required of a knowledge management system, because knowledge management is not solely about technology. It is a multi-disciplinary field that draws on aspects of information science, interpersonal communications, organizational learning, cognitive science, motivation, training, publishing and business process analysis.

...

Document management systems are repositories of important corporate documents and are therefore important stores of explicit knowledge. They are also valuable tools for creating and processing complex documents, such as new drug applications in pharmaceutical companies. Document management systems excel at controlling the process of document creation, processing and review.

Some companies are approaching enterprise knowledge management based on document management. However, many have found that the bulk of knowledge workers resist using highly structured document management processes for all of their document creation and management tasks. Most users do not participate directly in complex document creation and therefore do not realize enough value from those systems to make an investment in learning how to use them. Therefore, document management systems are important knowledge silos that must be integrated into the corporate knowledge infrastructure, but are not used by most organizations as the basis for a complete knowledge management system.

...

Information retrieval technology, whether it be in the form of corporate text repositories or intranet search facilities, exists in many organizations as a knowledge silo containing legacy information. Information retrieval vendors continue to be concerned with satisfying the needs of information seekers and have added features such as relevancy ranking, natural language querying, summarization and others that have increased the speed and precision of finding information.

A number of information retrieval vendors have even repackaged their existing products as knowledge management technology without expanding their functionality beyond searching. Yet to search is not in itself a knowledge management application, but rather a function within such an application. Specialized repositories will continue to exist, but will evolve into more complex applications and not just search-enabled text repositories.

Text repositories can be better integrated into the enterprise knowledge management system when they adhere to open standards, such as SGML. Proprietary APIs work to isolate the knowledge contained in the text repository from the knowledge management system, usually resulting in the loss of knowledge about the knowledge. Often this metadata is more important than the original knowledge asset itself.

...

As a device-independent markup language for content, HTML is critical to knowledge management applications. It provides the richness of presentation and structure of the original content but allows access through standard web browsers. Ultimately, the web will evolve richer implementations of SGML, which has its roots in traditional publishing, such as the proposed XML (eXtensible Markup Language) standard, which adds information about the structure of the content to the formatting information present in HTML. This metadata, or the knowledge of the structure of the content, is often more valuable than the content itself. To unleash the power of publishing management systems to deliver targeted, styled and branded content, knowledge management systems must preserve the original presentation and the original structure of the information.

...

Many organizations use help-desk technology to respond to both internal and external requests for information.

...

Brainstorming tools help inspire creative thinking and convert tacit into explicit knowledge. These end user applications help categorize, organize and identify knowledge resources and are therefore useful knowledge creation tools.

...

Organizations are creating data warehouses and arming their business managers with data mining tools to optimize existing and discover new relationships between customers, suppliers and internal processes. Used primarily by business managers, leading organizations are now broadening their use since everyone in a knowledge-based organization needs to make decisions based on increasingly complex sets of data.

...

The goal of a knowledge warehouse—the core component of the knowledge management system—is to preserve the creation and processing functionality inherent in knowledge silos, while offering all users access to the knowledge contained in the silos. In addition, a knowledge warehouse allows users to submit valuable knowledge even when they are not frequent contributors and therefore do not work through an established knowledge silo. This eliminates the need for all end users in the organization to install and maintain complex client software for all of the application silos.

...

Internet information seekers are familiar with entering a seemingly precise query, only to be presented with thousands or tens of thousands of hits, with no easy way to navigate them. To solve this exasperating problem, search results should be clustered or categorized using the knowledge map categories. This enables the user to quickly drill down to or mine the most relevant knowledge assets without having to learn complex query languages. No one search method is best for all people at all times. Whether users prefer to browse or search the knowledge warehouse, knowledge assets should also be clustered by other methods including physical system source, content type, author or other metadata fields.
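As an illustration of the clustering described in the passage above (not the vendor's implementation), search hits can be grouped by any metadata field, for instance a knowledge map category. The hit records, the field layout and the function `cluster_by` are hypothetical.

```python
from collections import defaultdict

# Hypothetical hits: (title, knowledge-map category, relevance score).
hits = [
    ("Q3 campaign plan", "Marketing", 0.91),
    ("Server migration checklist", "IT", 0.84),
    ("Brand guidelines", "Marketing", 0.77),
    ("Backup policy", "IT", 0.65),
]


def cluster_by(hits, key_index):
    """Group hits by one metadata field, keeping each group sorted by relevance."""
    groups = defaultdict(list)
    for hit in hits:
        groups[hit[key_index]].append(hit)
    return {
        key: sorted(group, key=lambda h: h[2], reverse=True)
        for key, group in groups.items()
    }


by_category = cluster_by(hits, key_index=1)
for category, group in by_category.items():
    print(category, [title for title, _, _ in group])
```

Passing a different `key_index` (for an author or content-type column, say) would give the alternative groupings the passage mentions.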

...

Finding "who knows what" in an organization has always been a time-intensive process. A knowledge management system allows users to quickly access peoples’ skills and areas of expertise through an integrated knowledge directory. The knowledge directory should allow queries by taxonomy area (for example, who are the experts on marketing?) and return a list of experts ranked by experience.
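A query against such a knowledge directory might look like the sketch below. The directory contents, the use of years of experience as the ranking measure, and the function `experts_on` are assumptions made for the example; the quoted paper does not specify them.

```python
# Hypothetical directory entries: person -> {taxonomy area: years of experience}.
directory = {
    "Alice": {"marketing": 12, "publishing": 3},
    "Bob": {"marketing": 4},
    "Carol": {"databases": 9},
}


def experts_on(area: str) -> list[tuple[str, int]]:
    """Return (person, years) pairs for one taxonomy area, most experienced first."""
    found = [(person, skills[area])
             for person, skills in directory.items() if area in skills]
    return sorted(found, key=lambda pair: pair[1], reverse=True)


print(experts_on("marketing"))  # [('Alice', 12), ('Bob', 4)]
```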

...

Many techniques exist for categorizing knowledge, ranging from manual, human-centric approaches to completely automated processes based on artificial intelligence methods. While fully manual processes are time and labor intensive, fully automated approaches do not yield accurate enough results. However, a categorization server that automates a first-level classification of knowledge assets by using knowledge map categories saves a good portion of the labor required to fully classify information. The organization can then incorporate the final classification as part of an editorial or content management process.

...

A knowledge management system that leaves content management up to end users quickly succumbs to "information pollution." Successful knowledge management implementations appoint knowledge managers or content editors whose job is to evangelize knowledge management processes and to validate and edit content in their area of expertise. Without a content manager to ensure that information is categorized appropriately and that the content is useful and understandable, users quickly begin to have difficulty finding what they are looking for. The system soon overflows with knowledge assets of questionable value.

...

Chief Information Officers, Chief Knowledge Officers, and other knowledge professionals are finding that vendors of intranet development environments, document management systems, information retrieval engines, relational and object databases, electronic publishing systems, groupware, workflow, push technologies, agents, and other technologies are now presenting their products as knowledge management systems. These technologies are well suited to creating, processing and managing particular knowledge assets; however, they rarely meet the need of unifying all of an organization’s knowledge.

...

Individuals are the source of two types of knowledge: tacit and explicit. Information technology has traditionally focused on explicit knowledge, or knowledge that you can codify and transmit in a package, such as a spreadsheet. Tacit knowledge is personal, context-specific and is difficult to transmit. Types of tacit knowledge include hands-on skills, special know-how, intuitions, and the like. As Michael Polanyi, the first to distinguish tacit from explicit knowledge, stated "We can know more than we can tell." Knowing how to effectively perform a job means understanding both types of knowledge.

...

The organization gains only limited benefit from knowledge isolated within an individual; to realize the full value of a knowledge asset it must be transferred from one individual to another. Ikujiro Nonaka and Hirotaka Takeuchi describe four different modes of knowledge conversion in The Knowledge Creating Company. Although the four processes have been widely referred to, their names have varied in different representations of Nonaka and Takeuchi’s work. They will be referred to here as: socialization, capture, dissemination and internalization.

Socialization is the process of sharing experiences and is often done through observation, imitation and practice. It occurs in apprenticeships and at conferences, as well as at the water cooler. Capture is concerned with articulating tacit knowledge and turning it into an explicit form, for example, writing a report on what you learned at a workshop. When you copy and distribute the report, you convert knowledge from one explicit form to another, and dissemination takes place. Internalization is the process of experiencing knowledge through an explicit source. For example, you read a report about the workshop, mentally put yourself in the situation and combine that experience with previous experiences.

...

This small knowledge management group cannot effect enterprise-wide changes by itself. Content managers or knowledge editors are needed to manage the capture and classification of knowledge to guard against information pollution. They are typically spread throughout an organization and spend some part of their job framing and structuring knowledge. Tom Davenport has remarked that
"In the rosy future I envision, categorization and organization of knowledge will be a core competence for every firm. This will require strategic thinking about what knowledge is important; development of a knowledge vocabulary (and a thesaurus to accommodate near misses); prolific creation of indices, search tools and navigation aids; and constant refinement and pruning of knowledge categories. Knowledge editors will have to combine sources and add context to transform information into knowledge."

Haley Corporation (www.haley.com)

Reasoning with Case Bases

...

Unlike Remind, which provides only decision tree-based CBR, and unlike CBR Express, which provides only nearest-neighbor text-based CBR, The Easy Reasoner supports both extensively.

CBR Express' nearest-neighbor indexing is based on nominal and ordinal database fields with additional support for text-valued fields. Conceptually, CBR Express maps a text-valued field into a set of morphological features. These morphological features represent the occurrence of contiguous sequences of 3 letters within the text in a field. Thus, CBR Express builds an N-dimensional feature space of large dimensionality in which nearest-neighbor retrievals are performed.

CBR Express' ability to retrieve related cases based on text matching is based on the statistical properties of the co-occurrence of 3-1 grams. The performance of CBR Express in a domain is a probabilistic function of these statistical properties. One result worth noting is that 3-1 grams give CBR Express better recall in the face of misspelling or morphological variations, but this comes at a definite cost in terms of precision. In short, CBR Express does pretty well, but frequently presents morphologically related cases that are of no semantic relevance.
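
The trade-off described above can be illustrated with a minimal sketch. It is not the CBR Express implementation: it assumes that a text field is reduced to the set of its 3-letter sequences and that two texts are compared by the overlap of these sets (Jaccard similarity). The function names and the sample cases are hypothetical.

```python
def trigrams(text):
    """Set of contiguous 3-character sequences (letters and spaces, lowercased)."""
    cleaned = "".join(ch for ch in text.lower() if ch.isalpha() or ch == " ")
    return {cleaned[i:i + 3] for i in range(len(cleaned) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two trigram sets (assumed measure, 0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical case base and a misspelled query.
cases = [
    "printer jams on tray two",
    "painter james called",
    "network cable unplugged",
]
query = "printr jam"

# Exact word matching finds nothing, because "printr" and "jam" are not words of any case.
word_hits = [c for c in cases if set(query.split()) & set(c.split())]

# Trigram matching still ranks the printer case first (better recall) ...
ranked = sorted(cases, key=lambda c: similarity(query, c), reverse=True)

# ... but the unrelated "painter james" case also gets a non-zero score (lower precision).
noise_score = similarity(query, "painter james called")
```

The misspelled query shares no whole word with any case, yet the trigram overlap still puts the printer case on top. The same mechanism gives a positive score to a case that is related only by spelling, which is the loss of precision mentioned in the quotation.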

The Easy Reasoner improves on each of the CBR Express capabilities. The Easy Reasoner goes beyond CBR Express by supporting arbitrary N-M grams rather than just 3-1 grams. This can dramatically reduce the dimensionality of similarity space along with the size of the index files with a resulting increase in retrieval performance. The Easy Reasoner also supports word indexing as an alternative to N-M indexing. The Easy Reasoner also supports stop lists and normalizes most English affixations and inflections so as to delimit and normalize the indexing vocabulary or the derivative text from which N-M grams are derived. This further improves recall and does so at lower cost in precision.
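
A rough sketch of these two indexing options follows. The quotation does not define "N-M grams"; the sketch assumes that N is the length of a gram and M is the step between the starts of successive grams, so that "3-1 grams" are overlapping 3-letter sequences. The stop list and the suffix stripping are deliberately crude and hypothetical, and they do not reproduce The Easy Reasoner.

```python
def nm_grams(text, n=3, m=1):
    """Character grams of length n taken every m characters (assumed reading of N-M)."""
    s = text.lower()
    return [s[i:i + n] for i in range(0, len(s) - n + 1, m)]


STOP_WORDS = {"the", "a", "is", "on", "of"}  # hypothetical stop list
SUFFIXES = ("ing", "ed", "s")                # crude, illustrative suffix stripping


def index_terms(text):
    """Word indexing: drop stop words and strip a few common English suffixes."""
    terms = []
    for word in text.lower().split():
        if word in STOP_WORDS:
            continue
        for suffix in SUFFIXES:
            # Strip at most one suffix and keep a stem of at least 3 letters.
            if word.endswith(suffix) and len(word) - len(suffix) >= 3:
                word = word[:-len(suffix)]
                break
        terms.append(word)
    return terms


text = "the printer is jamming on trays"
dense = nm_grams(text, n=3, m=1)   # one gram per character position
sparse = nm_grams(text, n=4, m=2)  # longer grams, taken every second position
terms = index_terms(text)
```

With a larger step the same text yields fewer grams, which is one way the index can become smaller. Word indexing with normalization maps "jams" and "jam" to the same term, so recall improves without admitting every text that merely shares letter sequences.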

Decision making.
J.W. Sandker, Leusden, The Netherlands 1998/99 Janwillem@bikerider.com

Example: Suppose you are marketing a particular product and trying to decide how much advertising to do. There are two choices: advertising intensively and earning a large profit, or advertising modestly and selling fewer products.
Advertising: Intensive advertising costs $10,000, while simple advertising costs $1,000, so 9:1 will do.
Profit: The profit with intensive advertising is probably $30,000, and with simple advertising it will be $10,000, so 3:1.
Strengths - Advertising costs: The advertising costs are one third of the maximum profit to be expected, so a strength of x1 versus x3. - Profit: However, the chance that the profit of $30,000 will be realized is estimated at 60%, so the strength is x6 in comparison to the 'advertising costs', which gets x10 (or x3 and x5). - These strengths can be entered under the '+' buttons or calculated and entered directly as x3 and x3 (calculated as x(1+5) and x(3+3)).
Conclusion: The evaluation results in advertising on a small scale. The profit is then estimated at $10,000, and the sales department will spend $1,000 on advertising.
Remark: The basis of this example is the type of argument that is called 'uncertainty' in the literature of decision-making. This means that an argument is not clearly known and must be estimated; in other words, one has to guess about a future outcome. Although this example does not elaborate on it, the estimate must be made by examining behaviour in similar past cases and extrapolating it to the future. Supplementary interviews with some customers, if possible, will further substantiate the guess.
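
The conclusion of this example can be cross-checked with a plain expected-value calculation. This is not the weighting of arguments used in the example; it is a different, simpler technique, shown here only as a sketch. It rests on assumptions that the example does not state: the quoted profits are counted before advertising costs, the profit of the simple campaign is certain, and the intensive campaign earns nothing if the 60% estimate fails. The function name is hypothetical.

```python
def expected_net_profit(profit, probability, advertising_cost):
    """Expected profit minus advertising cost; a failed estimate is assumed to earn zero."""
    return probability * profit - advertising_cost


# Figures from the example; the certainty of the simple campaign is an assumption.
intensive = expected_net_profit(profit=30_000, probability=0.60, advertising_cost=10_000)
simple = expected_net_profit(profit=10_000, probability=1.00, advertising_cost=1_000)

choice = "simple" if simple > intensive else "intensive"
```

Under these assumptions the simple campaign comes out slightly ahead, which agrees with the conclusion above. A different estimate of the chance of success could reverse the result, so the estimate itself remains the critical guess.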

Knowledge management

There have been many attempts to define it.

One part is connected with general asset management theory: if you have assets, you should manage them, keep warehouses for them, and so on. This leads to the development of a super-database containing all possible kinds of data sources, with a super search and retrieval engine accessed through a universal type of client software, mostly an Internet browser. These tools are intended for knowledge consumers.

The other part is connected with knowledge ecology, virtual team development and communities of practice, which provide for knowledge exteriorization and opportunities for learning. This part is for knowledge producers.

We think there should also be a part whose goal is to pass a significant share of intellectual work to machines. Why? The ability of a person to learn, and therefore to apply knowledge, is very limited (remember school, university etc.). The amount of knowledge keeps growing, but who will learn and apply it? Intranet/Internet and the emerging e-knowledge systems methodology may allow all people to learn in an adaptive way and to use a great amount of knowledge immediately, without first learning it. We think the advantages are obvious; that is why our research group is developing e-knowledge publishing. It is working, and you may test it at http://gkm-ekp.sf.net

An attempt to apply definitions made in the paper

We think that any external data source (the main object of IT) contains the following parts:

I. Data

Data is everything that can be perceived by a man: text, sound, pictures, multimedia. We believe that the main task toward which Information Technologies (IT) are now oriented is data management. All kinds of hardware and software are well suited to data capture and distribution. But who needs this data? If it is collected regardless of the people who use it, it is a senseless activity. As a rule, consumers of pure data are people who are trying to find new regularities in unstructured sets of facts: researchers, managers involved in data mining etc. Ordinary users do not need data as it is and cannot do such highly skilled work as data mining.

II. Information

Information is that part of data which is directly connected to the knowledge possessed by the perceiving man. This part is really involved in problem solving. Obviously, this part varies depending on the experience of the particular person: what is valuable for one person may be useless for another.

What users usually expect from IT is information that helps them solve their problems. For example, if you want to fly by plane, you would like to get a flight schedule. But the data you get from computers becomes information only when it is relevant to your experience, to your own knowledge. If you do not know what an airplane is, a flight schedule will be useless data for you. Therefore the main difficulty for users is how to find data that will become information. That is why search and categorization systems are an extremely important part of IT. Users are not very interested in where the needed data resides - in databases, document management systems, HTML files etc. They would prefer a single data access point to all sources, including human experts. We see from the quotations above that understanding of this fact leads many companies to implement Knowledge Management (KM) systems. But it is a rather complicated task to find information with search engines that are based on morphological retrieval indexes and not on searching for ideas. Try, for example, to use Internet search engines to find a definition of an airplane if you do not know what it is. Categorization, as in Yahoo, is easier to use, but it is too rigid and relatively poor for a good description of data sources.

III. Explicit knowledge (external)

From our point of view, this is the constant and most valuable part of a data source, not depending on any person. And it is the part from which we learn.

We have already proposed to define knowledge elements as 3-part stable memory patterns:

1. Description of a problem (data memorized at the time the problem was solved)

2. Name of a problem (should be a unique text)

3. Description of a problem's solution (the actions needed to verify that the solution is applicable and to solve the problem) and the expected result

There is 1st level knowledge (concrete), which is developed while solving concrete problems, and 2nd level knowledge (abstract), which is developed on the basis of concrete knowledge and includes typical situations and solutions, general rules etc.
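
The structure of a knowledge element described above can be sketched as a small data type. This is only an illustration under our own assumptions: the class names, the field names and the storage keyed by the unique name are hypothetical and do not describe how any existing product stores its knowledge.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    CONCRETE = 1  # 1st level: developed while solving a concrete problem
    ABSTRACT = 2  # 2nd level: typical situations, general rules etc.


@dataclass(frozen=True)
class KnowledgeElement:
    name: str             # part 2: unique text naming the problem
    description: str      # part 1: data memorized when the problem was solved
    solution: str         # part 3: actions to verify applicability and to solve
    expected_result: str  # part 3: what should happen if the solution applies
    level: Level = Level.CONCRETE


class KnowledgeBase:
    """Stores elements by name and enforces that each name is unique."""

    def __init__(self):
        self._by_name = {}

    def add(self, element):
        if element.name in self._by_name:
            raise ValueError("problem name must be unique: " + element.name)
        self._by_name[element.name] = element

    def get(self, name):
        return self._by_name[name]


kb = KnowledgeBase()
element = KnowledgeElement(
    name="Flat car battery",
    description="Engine silent, no dashboard lights",
    solution="Check the battery voltage; charge or replace the battery",
    expected_result="Engine starts",
)
kb.add(element)
```

Keeping the name as the key makes the requirement of a unique problem name explicit: an attempt to add a second element with the same name is rejected.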

IV. Individual knowledge (tacit)

Types of tacit knowledge include hands-on skills, special know-how, intuitions, and the like. Michael Polanyi, the first to distinguish tacit from explicit knowledge, stated "We can know more than we can tell."

Certainly, almost any kind of knowledge is initially tacit. All intellectual activity of a person takes place in the subconscious, and therefore does not need words. Words appear at the level of consciousness (co-knowledge), which is many times poorer than the subconscious.

People often think that any question-and-answer dialogue is based on knowledge, and therefore that knowledge frequently loses its freshness. From our point of view, knowledge cannot become stale; data and information can. When you ask a question, it does not mean that you need to use the knowledge of other people. In many cases you need information in order to apply your own knowledge. The knowledge of a man is a history of his own successes and failures - what part of it do you want to ignore? What part will never be applied? Knowledge is tightly connected with our image of personality - what part of a personality is unneeded?

Knowledge, as we think, always appears as a result of a man solving an important problem. We do not regard as knowledge the images that are placed in transitory memory, as when some object happens to be in our field of attention. They are not stable and have no effect on the further activity of a person. In reality, we treat them as data or information. So if someone asks: "Do you know where my last report is?", and you see it on your table and reply: "Please look on my table." - it does not mean that a knowledge exchange took place; it is an information exchange. Messages of this kind really do lose their relevance.

The Problem

Any kind of data will be useless if a man has very few knowledge resources of his own. Therefore people need to learn in order to work effectively. In many cases IT can help: for example, you can use distance learning. But evidently this is a very time-consuming activity and, therefore, the amount of knowledge learned is extremely small compared to all existing knowledge.

There is a great amount of applicable knowledge in the world. Before using it, a man must learn it. To learn it all is a task far beyond the capabilities of any man. We are becoming richer in knowledge every day but cannot use our treasures. Odd, isn't it?

Possible solution

We see the following solution. There is a need to develop a machine that is able to accept explicit (external) knowledge found in printed sources (books, articles, databases) as knowledge elements (the 3-part stable memory patterns defined earlier) and to transform it into machine-simulated tacit knowledge in the form of intellectual activity support systems. These e-knowledge systems should be used both for adaptive learning and for distance on-line consulting.

They should assist a man during the 4 steps of intellectual activity:

1. Observation (getting information and data)

2. Producing propositions, based on the knowledge

3. Selection and verification of the most appropriate propositions

4. Memorizing (converting data to information and creation of new knowledge)
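
The four steps above can be sketched as a small program. This is a simplified illustration and not the General Knowledge Machine itself: it assumes that the description of a problem can be reduced to a set of observable signs and that propositions are ranked by the number of matching signs. All names and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Element:
    name: str         # unique name of the problem
    signs: frozenset  # description: signs observed when the problem was solved
    solution: str     # actions to take and the expected result


def observe(raw_data, knowledge):
    """Step 1: keep the part of the data that connects to existing knowledge."""
    known_signs = set().union(*(e.signs for e in knowledge))
    return set(raw_data) & known_signs


def propose(information, knowledge):
    """Step 2: every element whose description shares a sign with the information."""
    return [e for e in knowledge if e.signs & information]


def select(propositions, information):
    """Step 3: rank by overlap; the verification itself stays with the person."""
    return max(propositions, key=lambda e: len(e.signs & information), default=None)


def memorize(knowledge, name, information, solution):
    """Step 4: store the solved problem as a new knowledge element."""
    knowledge.append(Element(name, frozenset(information), solution))


knowledge = [
    Element("flat battery", frozenset({"no lights", "engine silent"}),
            "charge or replace the battery"),
    Element("empty tank", frozenset({"engine cranks", "fuel gauge at zero"}),
            "refuel"),
]
raw = ["engine silent", "no lights", "it is raining"]

information = observe(raw, knowledge)  # "it is raining" remains mere data
best = select(propose(information, knowledge), information)
memorize(knowledge, "flat battery after a cold night", information, best.solution)
```

In the sketch the rain stays data, because no knowledge element refers to it, while the two matching signs become information. The person still has to verify the proposed solution before acting on it.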

People's access to these systems may be provided through the Internet/Intranet. Since these machines have no human restrictions on the volume of knowledge, it will be possible to input all existing knowledge into them, and all people will be able to use it immediately for adaptive learning and distance on-line consulting.

All explicit knowledge should be converted by knowledge presentation developers into 3-part knowledge elements based on ideas. This must be respected as extremely valuable work. You will see that thick manuals and extensive knowledge bases built on knowledge exchanges will be transformed into a small list of several hundred ideas.

This plan is not a dream. It is a description of a new approach to knowledge presentation called Electronic Knowledge Publishing, developed by the General Knowledge Machine Research Group since 1986. Detailed information about this approach and the products already developed may be found at the Internet addresses:

http://gkm-ekp.sf.net , http://gkm-ekp.berlios.de

Comparison of AI Expert Systems and E-knowledge Systems Methodology

Expert Systems	E-knowledge Systems
Intended to replace human experts	Intended to assist human intellect
Based primarily on mathematics	Based on neurophysiology, psychology, knowledge management theory and mathematics
It is practically impossible to transform external knowledge sources directly into expert systems	It is a further advancement of traditional publishing; external knowledge sources (books, articles etc) may be transformed into e-knowledge systems easily
Based on the decision rules concept	Based on the general knowledge concept
The more complex an expert system is, the worse it works	The more complex an e-knowledge system is, the better it works
Development has many stages and is very expensive	Development has one stage and is relatively inexpensive
It is relatively hard work to incorporate an expert system into other information systems due to the sequential nature of data input and output	An e-knowledge system may be easily incorporated into any kind of information system due to support of a wide range of data input and output sources
It is practically impossible to use expert systems for learning, because they are not based on human knowledge	It may be easily used as a front end for Distance Learning information systems, providing the ability of Adaptive Learning, based on the Just-In-Time Knowledge concept
May not be used for new knowledge creation	May be used for new knowledge creation

You are welcome to see knowledge that really helps you.

General Knowledge Machine Research Group Team