The Basileiad Library at Manor College.
Information Literacy course

The Internet and the World Wide Web.

The internet grew out of ARPANET, a network developed by the U.S. Department of Defense in 1969 as a means of sharing information. It was soon adopted by scientists and researchers, who could share their findings with others around the world working on the same research topic. Once computers became more affordable and their popularity exploded, the general public discovered the internet, and it became the everyday tool it is today.

Today's internet is a chaotic confusion of web sites, many of them unreliable or disreputable. Partly because of this, two new internets are being created. The U.S. Government has created the Next Generation Internet (NGI), which is intended to work thousands of times faster and to exclude commercial content. The other, Internet2, is being created by academic scientists and researchers: it is to be extremely powerful, use much higher bandwidths and be accessible, at least initially, only to colleges and universities. It will use new technologies that enable performance not possible on the ordinary internet.

The internet we currently use is home to millions of web sites, ranging from simple blogs (web logs, often personal diaries) to elaborate web sites offering a wealth of information. These web sites host billions of web pages, which makes it difficult to find accurate information. The World Wide Web is in fact only part of the internet, and much of it cannot be searched. In spite of all its pitfalls, the internet is the most popular source of information among students.

A web site's address is called a URL (Uniform Resource Locator). The last part of the site name in the URL indicates the kind of site: most commercial web sites end in .com, government web sites in .gov, colleges and universities in .edu, organisations in .org, and the military in .mil. There are many others.
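As a rough illustration, the short Python sketch below reads the ending of a site name and reports the kind of site it suggests. The function name describe_site and the example addresses are invented for this illustration, and the table only covers the endings listed above.

```python
from urllib.parse import urlparse

# Endings and meanings taken from the paragraph above; the list is not exhaustive.
SUFFIX_MEANINGS = {
    "com": "commercial",
    "gov": "government",
    "edu": "college or university",
    "org": "organisation",
    "mil": "military",
}


def describe_site(url):
    """Return the kind of site suggested by the last part of the site name."""
    host = urlparse(url).hostname or ""
    suffix = host.rsplit(".", 1)[-1].lower()
    return SUFFIX_MEANINGS.get(suffix, "other")


# Hypothetical addresses, used only to show the idea.
for address in ("https://www.example.edu/library", "http://www.example.com"):
    print(address, "->", describe_site(address))
```

An ending only suggests who runs a site. It says nothing about whether the information on the site is accurate.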

To find information on the internet, you can use a directory, a search engine or a meta-search engine. These work in different ways: a directory is a list of useful web sites compiled by people; a search engine compiles an index of the words found on web pages and looks up the keywords you enter in that index; a meta-search engine sends your keywords to a number of search engines at once. Of these, directories are the most reliable. You can find examples of all three on the library homepage, under Find Internet Sites 'Search Engines'.
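The difference between a search engine and a meta-search engine can be sketched in a few lines of Python. This is a simplified illustration, not how any real search engine is built: the pages, addresses and function names are all invented, and real indexes are far more elaborate.

```python
from collections import defaultdict


def build_index(pages):
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in text.lower().split():
            index[word].add(address)
    return index


def search(index, query):
    """Return the pages that contain every keyword in the query."""
    keywords = query.lower().split()
    if not keywords:
        return set()
    results = set(index.get(keywords[0], set()))
    for word in keywords[1:]:
        results &= index.get(word, set())
    return results


def meta_search(indexes, query):
    """Send the same query to several indexes and combine the hits."""
    combined = set()
    for index in indexes:
        combined |= search(index, query)
    return combined


# Two hypothetical search engines, each with its own small index.
engine_a = build_index({
    "a.example/owls": "barn owls hunt at night",
    "a.example/cats": "cats sleep at night",
})
engine_b = build_index({
    "b.example/birds": "owls and hawks hunt mice",
})

print(search(engine_a, "owls hunt"))
print(meta_search([engine_a, engine_b], "owls hunt"))
```

The single search engine only finds pages in its own index, while the meta-search also picks up the page that only the second engine knows about.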

Another form of directory can be found on the library homepage under Find Internet Sites 'By Subject/Reference'. These are web sites that are considered useful for particular subjects. The advantage of these sites is that they have already been evaluated.

Because there are no criteria for creating a web page, a great deal of the information on the World Wide Web is not accurate. In addition, there is no guarantee that the results of your search are current.

Every search engine uses a different algorithm to sort the results of its search of the World Wide Web and to rank the findings. Most search engines use link analysis, where a keyword close to a link (say, 'book' next to a link to Amazon.com) results in the linked web page being placed high on the list of relevant hits for that keyword. Google's founders developed a patented link-ranking method called PageRank, which gives high priority to web sites that are frequently linked to by other web sites.
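The idea of ranking a page by the links that point to it can be shown with a small Python sketch. This is a simplified version of the published PageRank idea and not Google's actual ranking. The site names, the damping value of 0.85 and the number of rounds are assumptions chosen for the example.

```python
def rank_pages(links, damping=0.85, rounds=50):
    """Score each page by the links pointing to it (simplified PageRank).

    `links` maps every page to the list of pages it links to. Every
    linked page must also appear as a key.
    """
    pages = list(links)
    count = len(pages)
    score = {page: 1.0 / count for page in pages}
    for _ in range(rounds):
        # Every page starts each round with a small base score.
        new_score = {page: (1.0 - damping) / count for page in pages}
        for page, targets in links.items():
            if targets:
                # A page shares its score equally among the pages it links to.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new_score[target] += share
            else:
                # A page with no links shares its score with every page.
                for target in pages:
                    new_score[target] += damping * score[page] / count
        score = new_score
    return score


# A hypothetical group of four pages and the links between them.
links = {
    "home": ["news", "shop"],
    "news": ["home"],
    "shop": ["home"],
    "blog": ["home"],
}
scores = rank_pages(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Here "home" comes out on top because three other pages link to it, and "blog" comes last because nothing links to it.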

Because people rarely go past the second page of hits, companies try to be listed on the first two pages. The problem with link-based ranking algorithms is that they can be manipulated by web site owners. It is possible to hire an SEO (Search Engine Optimizer) who will show you how to get your site placed high on the list of hits. Sometimes people will hack into a web site and place links to their own web site in white letters on a white background, so that visitors do not see the links but search engines such as Google still count them. There is a lot of competition among search engines to develop a better algorithm. Teoma uses an extension of an algorithm developed by IBM in which related terms are included, allowing you to expand your search based on its suggestions. Most search engines don't exclude duplicate pages, which can inflate the number of hits.

Not all web pages are accessible through searching. A web master may use HTML tags that tell search engines' robots crawling the web not to index a page. Computers often can't search a web site composed solely of images, which is why search engines offer a separate search area for images. Proprietary web sites, which require a password, cannot be searched by search engines. The very large segment of the web that cannot be searched is called the 'invisible web'.
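One common way for a web master to keep a page out of search results is a "robots" meta tag in the page's HTML. The Python sketch below shows how a well-behaved robot might check for it. The class and function names are invented for this illustration, and real robots follow further rules that are not shown here.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the instructions from any <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def may_index(html):
    """Return False when the page asks robots not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" not in finder.directives


# Two hypothetical pages: one asks not to be indexed, one does not.
hidden_page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
open_page = "<html><head><title>Open page</title></head></html>"

print(may_index(hidden_page))
print(may_index(open_page))
```

A page marked this way still exists and can be visited directly. It simply does not appear in the search engine's index.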

The next module will give you some hints about searching the internet.
