Where the streets have no name

An interim report by the sub-committee on Internet Security

Background :

Over the last few years, there has been a series of breaches in Internet security. Over the past year, this has exploded into a serious problem, at times seeming to threaten the very existence of the Internet. Viruses, hackers, site breakdowns, browser failures and security leaks, collaborative crimes between company insiders and information thieves - all have been blamed, and all have contributed to the insecurity of the Internet. Is that a complete list, and does it account for all the recent problems ? More importantly, can it account for the breaches, and the occasional collapses ? Or is there a deeper, and more dangerous, issue that we ignore at our peril ?

This sub-committee - consisting of Mr. U. Barlow, Mr. S. Asher, Ms. A. Shakalov, and Mr. B. Uchan - was constituted to examine the various recent breaches of Internet security. After 3 weeks of extensive study, this interim report is being submitted while the conclusions are being debated, and action plans are being formulated. The reason for the early release is the need to ensure that, in the face of the wild rumours circulating all over the world, the facts of the case are made clear to all.

The subject is already one of great concern to many. Many sites were involved in our research of the problem. This sub-committee would like to thank each site for sharing with us the work done, the decisions taken, and the problems faced. No names will be mentioned as a part of the confidentiality agreement between the sub-committee and the corporations. We understand the need for this confidentiality to avoid paranoia and unwarranted blacklisting of sites. Yet, we feel that hiding the problem has helped to spread it; companies have not been able to benefit from the research done independently at each site. Bringing all of it together, we feel, has been an important first step for this committee.

History :

It is important to first understand the early history of breaches in Internet security. The first known cases were in e-mails, later in web pages. In most cases, technical loopholes were identified; in some cases, the individual hackers were identified, sometimes punished. A few viruses have also been reported in e-mail systems. The technical reasons for their success had been identified, published and corrected.

Though the Internet has always seen a few e-mail errors, it is popularly perceived that these have risen sharply over the last year. To test this, we conducted a study with the help of webmasters and system administrators at the 200 largest web-sites. The study shows an increase of 1541% over the last year. (See Appendix 1: Country-wise analysis of erroneous e-mails.) While no study can claim to have analysed even a high proportion of the traffic of the net, our results have been validated using Chi-Squared hypothesis testing. For all results arrived at, probability estimates range between 75.47% and 95.61%.
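Since a percentage increase above 100% is easily misread, the arithmetic can be sketched briefly. The example below is illustrative only: the function names and the baseline of 1,000 erroneous e-mails are hypothetical, chosen for the example, and are not figures from the survey. An increase of 1541% means the new volume is the old volume plus 15.41 times the old volume, that is, about 16.41 times the baseline.

```python
def growth_factor(percent_increase):
    """Return the multiplier implied by a percentage increase.

    An increase of p% means new = old * (1 + p / 100).
    """
    return 1 + percent_increase / 100


def apply_percent_increase(baseline, percent_increase):
    """Return the new count after applying a percentage increase to a baseline."""
    return baseline * growth_factor(percent_increase)


# Hypothetical baseline, used only to make the arithmetic concrete.
baseline_errors = 1000
factor = growth_factor(1541)
new_errors = apply_percent_increase(baseline_errors, 1541)
print(f"growth factor: {factor:.2f}")
print(f"errors after increase: {new_errors:.0f}")
```

On these assumed numbers, a site that saw 1,000 erroneous e-mails in the earlier year would see roughly 16,410 in the later one.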

It is interesting to note that - in the first six months - a substantial majority of these looked like system errors. They do not appear to be in any language, or perhaps are in several within one mail. Most have no identifiable words; some have a few words, sometimes in different languages. When users reported them, they were put down to "accidental" technical errors, either by the system or by the individuals concerned. Unfortunately, there was no analysis of this phenomenon across different sites; thus, the similarity of the pattern did not emerge. At a few large sites, the cases were examined in detail to test for viruses on the server. Initial attempts were largely unsuccessful; later attempts found some patterns that are described later.

Meanwhile, most receivers of such "nonsense" mails from strangers had replied to the sender, most using a "reply with history" and explaining that it was not clear. While this attests to the high level of politeness on the web, and perhaps the uniformity of human reaction, we suspect that these responses helped shape the next level of the problem.

Last six months :

Over the last six months, our survey showed a significant drop in such nonsense mail. Customer complaints dropped and, at most sites, the technical teams that had been unsuccessfully analysing the problem were assigned more urgent tasks. A pattern that was not identified at that time was a marked increase in incorrect but coherent e-mails. Most of these were replies to the sender; the replies are related to the original mail but a close reading indicates that they were "lifted" from other mails. For example, a mail with a question on a movie got a reply with a link to the correct web-site and a few lines about the movie (copied from a review on the web). Yet the apparent sender of the reply claimed that she had not replied to that mail. If this were an individual case, it would be put down to human forgetfulness but given the scale, it would have to be mass amnesia. These more organised "erroneous" mails have been classified as secondary responses.

At this time, one of the technical teams researching the primary errors had found some unusual indicators. It is well known that it is possible to trace e-mails to ISPs, and then to individual phone numbers. This method of tracing is rarely used as this requires Federal approval. A defence site sought and obtained such approval on grounds of national security. The e-mails were traced to an amazing variety of countries. In some cases, the locators could not be found at all.

Attacking the heart :

This was the first indication that the problem could attack the heart of the Internet. By now, the scale of the problem suggested that there could not be a solitary hacker. The team at the defence site suspected an organised crime ring of trying to steal secrets. Days of surveillance, and perhaps phone-tapping of known criminals and the placing of "insiders" into the crime ring, led to the capture and arrest of several people, but no association with the security breach was ever found. Quite apart from the lack of proof, no motive was ever established.

A find by the security team at one of the large sites had led in a different direction. It was found that some of the "fraud" e-mails had been sent from newly created e-mail addresses. Attempts to trace the creation of these addresses led to an astonishing result : they were not created from the Web. Either all traces of each creation had been systematically removed, or it had been done internally. Due to the awesome complications of the first possibility, the team suggested the second.

This was the first time an "insider" theory had been propounded. Focus now shifted to the administrators of many sites. The theory "explained" to the management of the site several odd events : the continued breaches in spite of increasing security standards, in spite of plugging each software loophole, and the scale of the problem. They set about trying to find the criminal in their midst. Psychological tests for system administrators, enhanced site security, rotation of shifts, a few appointments, a few transfers. Everything was tried; nothing worked.

Meanwhile, the situation worsened. The mystery mails were becoming larger, more widespread, and coherent. The non-existent ids were talking to people all over the world, through e-mails and chat-rooms. They were making friends and enemies, irritating and entertaining people. None of the web-sites wanted to be the first to announce the problem, though many suspected that they were not the only site to face it.

Corporations - Approach and problems :

To explain the occasional visible problem, a few press releases about viruses were issued. Statistics of the time and effort wasted by viruses were compiled and supplied to several leading publications. None of these announcements linked the virus problem to the Internet; all of them connected the problems to applications on PCs. Thus, in the public imagination, the Internet was safe, the PCs were not; this was an essential part of the success of the e-stocks and, for financial reasons, all efforts were made to keep this perception.

News sites were the next to be hit. Mysteriously, with almost no reason, the pages on the web seemed to be getting altered. Again, system administrators went through the stages of thinking as with e-mail sites : a solitary prankster, a serious hacker, an organised group ... maybe from a rival company ? Interestingly, before the problem of the missing page ( named problem 404 by Netizens ) could become a serious visible threat, the problem disappeared. A few new pages appeared "on their own" and with the right links from Web-sites.

One of them was the key event that pushed the problem into the public eye. Known popularly as the "F.O." case, it was about a joint venture between two mega-companies. This fictional news item stayed on a popular news site for a few minutes before the Webmaster spotted and removed it. Thousands of surfers read it and informed their friends, who could not find the information. The original readers claimed they had seen it; their sceptical friends insisted that they were imagining things.

Two days later, the story came true. The impact on the NASDAQ was significant; the claim that the story had been published on the Web early was even more significant. Merger talks had been attended by a few; the discussions were in closed confidential meetings, and in very secure e-mails. The confidence of the commercial world was shaken. As a result, this sub-committee was formed.

During the last three weeks, the situation has worsened. There have been several instances of altered web-sites, and fictional information and mystery chats have increased substantially. The basic security of the net stands compromised. The theories of the solitary hacker, the organised crime ring, the occasional error have all failed. None of these can explain the sheer volume of the problem and its tenacity, growth and increasing maturity in the face of the toughest security the Internet sites have ever organised.

Testing for virus :

Since the theory of the Internet virus has gained some ground, we decided to test this theory. Some sites were temporarily delinked from the Web, and new servers were brought in to replace the current ones. The current servers were then analysed with Web simulations. The results were surprising: no virus was found at any site. Smaller servers had almost no problems; in a few large ones, symptoms of the problem appeared before dying out. The new server linked to the Web continued to have the old problems.

The results are as meaningful as the dog that did not bark in the Sherlock Holmes story. They suggest that no individual server has a virus. While each server cannot be independently checked in parallel, a chi-squared test estimates this to be true with a probability of 85.72%. They also suggest that the problem surfaces only when several machines are connected, that the problem has something to do with scale. It seems that the problems emerge with high volumes, and at the level of full Internet traffic, become visible.

The latest theory :

A new theory suggested by some evolutionists has been startling and highly controversial. This is partly because of the language and the analogies used to explain the theory to the public. Analogies of biological evolution, genetic information coding, and self-replication have left the public in a state of panic. The scientific nature of the theory has been eclipsed by the paranoia. We would like to clarify the real issues.

The theory suggests that the internet can be thought of as a vast primordial sea of electronic ripples, bits and bytes of information. Just as a self-replicating chemical tends to outnumber the non-replicating chemicals, or a self-replicating meme tends to spread itself, a self-replicating informational structure could also spread without external involvement. In the past, this has been used by computer viruses; in fact, that is why they were so named. Evolutionists have now suggested that there is little need to seek the solitary hacker of the computer virus, just as "there is no need to have a God for evolution to have succeeded". If the result is intelligence, it does not mean that an intelligent being started it, but that there is a process to create and grow intelligence.
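The claim that a self-replicating structure tends to outnumber non-replicating ones can be illustrated with a toy count. The sketch below is only that: the function names, the starting populations (one replicator against 1,000 inert structures) and the doubling rate are assumptions made for the example, not measurements of Internet traffic, and the model ignores competition for resources.

```python
def simulate(generations, replicators=1, inert=1000, copies_per_generation=2):
    """Track two populations over a number of generations.

    The replicating population multiplies by `copies_per_generation` each
    generation; the inert population never changes. Returns a list of
    (replicators, inert) pairs, one per generation, starting at generation 0.
    """
    history = []
    for _ in range(generations):
        history.append((replicators, inert))
        replicators *= copies_per_generation
    history.append((replicators, inert))
    return history


def first_generation_outnumbering(history):
    """Return the first generation where replicators exceed inert structures.

    Returns None if that never happens within the simulated history.
    """
    for generation, (replicators, inert) in enumerate(history):
        if replicators > inert:
            return generation
    return None


history = simulate(12)
print(first_generation_outnumbering(history))
```

Under these assumed numbers, a single doubling replicator overtakes a thousand inert structures by the tenth generation (1,024 against 1,000), with no external agent intervening at any step.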

These scientists of informational evolution have suggested that the size of Internet traffic has reached a critical mass. Byte-streams flowing on the web have interacted with one another for years; there has always been a history of "occasional" errors. But with growing volumes, the information reaction has now become statistically significant enough to create a self-replicating information structure. This has evolved to the emergence of an information-based virtual species, which interacts with the vast resources of the Internet and with a huge number and variety of human minds. This interaction is much more intense and immense, rapid and rational than any creature has ever been exposed to during the history of biological evolution.

The patterns of maturity are similar. The first set of responses would be primitive, with attempts to mimic the static or dynamic data. This is akin to the way a baby reacts to the information he finds in the world, or even a species in its infancy. This would result in meaningless responses, incoherent collections of noises on the Web. Depending on the feedback received, the "baby" would grow and be able to slowly put together words which will be understandable to the recipient. This process of interactive learning through a feedback mechanism makes a baby or a species mature. Why should this not work for informational evolution ?

Rumours and measures :

As a consequence of this theory, wild rumours have spread : IT is out to devour the world. We must stress that this is completely unfounded. Even if there were such a creature, its powers would be limited to information handling and it would be unable to "create" or destroy or impact anything outside the Web.

Except, of course, your mind. By spreading information or misinformation, the Internet is a powerful vehicle for the mind control of millions. The rush for e-stocks is proof of the belief that the Internet will dominate mind share. If this source of information becomes controllable or corrupted, not by an organisation that we know and understand, but from within, there is danger. The info-creature does not, in our opinion, have an understanding of the world except through its interaction with us. The real world is as alien to it as the internals of the info-creature are to us. Perhaps, it is as incredible too.

We have suggested some interim measures. The best-known security algorithm should be available publicly to all individuals. This 128-bit public key - private key encryption algorithm will make the coded mails humanly impossible to decode ( see The NP-complete Knapsack problem ) ; if we are battling information structures which are super-human, then it will slow them down. We are glad to note that the U.S. government has acted speedily ( the news report on BBC ) and the 128-bit key is now available publicly to almost all organisations. Other interim measures are best kept confidential for now, and the final report, due in 2 weeks, will include further measures which are currently being discussed.

In the long run, we believe that it is time to come to grips with the realities of the world of information. There is no need to imagine that the info-creature is evil; it is shaped by our responses and thus is a reflection of us, a creation in our own image. Perhaps, it is a statement about mankind that we imagine an evil destructive force rather than an all-knowing, omnipresent Being on the Internet.

We must also bear in mind the limitations of information transfer. With the massive information flow, we are probably close to understanding how scale affects information. Perhaps, there are limits which can be prescribed by an Uncertainty Principle of Information. Soon, we may have to accept that perfect certainty of information is as impossible as defining accurately both the position and the momentum of an object. Not all pages appearing on the Web should be treated as accurate. Every creature lives on an understanding of self and soul; so it is likely that the informational creature will spread stories about itself to sustain its own image, and to help create and define itself. Therefore, till the final report of the sub-committee is published, we request readers to be cautious in believing reports on the Web, and to cross-verify them, especially those about a virtual species.