NetSpeak

Articles and definitions relating to the Internet.

Articles

The World Wide Web

The World-Wide Web (WWW) project began in 1992 at the Geneva-based European Centre for Nuclear Research (CERN), and its commercial possibilities were quickly recognised.

WWW, or the Web, can be thought of as a collection of documents resident on thousands of servers around the world. Each document, written in the HyperText Mark-up Language (HTML), can contain text, images, sounds and video. More importantly, it can contain links to documents held on other machines accessible through the Internet, creating what is usually called "hypertext".

These links appear as hotspots in the document, and can be a word, phrase or even an image. Selecting the hotspot automatically connects your computer with the one referenced in the underlying hypertext document, and loads the hypertext document held there onto your machine. As the user, you need know nothing about where the machines are, or how the connection is made.
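
To make this concrete, here is a minimal sketch of how such a hotspot is written in HTML; the address and wording are invented for illustration, but the A (anchor) tag is the standard mechanism:

  <P>For more details, see the
  <A HREF="http://www.example.com/report.html">annual report</A>
  held on a server in Geneva.</P>

The words "annual report" appear as the hotspot; the HREF address behind them tells the browser which machine to contact and which document to fetch.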

Since these hypertext links often take you to other documents rich in hotspots, the ensemble can be thought of as a huge hypertext system spanning the world, a kind of global version of a Microsoft Windows Help file.

To access WWW you need a browser: these programs let you read hypertext documents, view any built-in images and activate hotspots.

Two of the most important browsers are Netscape and Microsoft Internet Explorer.

Browsers can also be used to link to file transfer protocol (FTP) servers (where files are stored), gophers (where information can be found using a menu structure) and wide area information servers (WAISs, which allow free-text searches).

Businesses can benefit from the Internet and its related services in several ways:

Advanced E-mail features include attachments, which enable spreadsheets, documents, images and presentations to be distributed with your message.

FTP can also be used to obtain free updates from leading software companies.

Remote access allows technicians to control computers from anywhere in the world, so a single, centralised support department can correct software faults throughout an organisation.

How it all works and why nobody runs it

The Internet is not an online service, but a collaborative collection of networks that adhere to certain basic standards when exchanging information among themselves.

Each Internet service provider pays for the cost of the connections it runs, and makes a profit from the charges to its subscribers for their use of them. Economies of scale and continuing advances in technology make these connections cheap, even for intercontinental traffic.

Extensive connectivity among major Internet providers means that information is sent around the Internet efficiently, traversing only a few separate networks.

Unsurprisingly no one body controls it. Although various standards need to be adhered to by Internet service providers, there are no Internet police who check and enforce them. The system is self-policing; if any organisation strays from collective standards, it loses the benefits of universal connectivity - which is the whole point of becoming part of the Internet in the first place.

There are bodies that carry out central functions for the Internet, such as the InterNIC (http://www.internic.net) which, among other things, registers companies that are connected to the Internet, and the Internet Society (gopher.isoc.org). The society has various engineering committees that help make technical recommendations for the future development of the Internet, but none of these has the power to force a particular direction or action on the Internet community.

Much of the information available over the Internet is held on university computers and is managed by public-spirited individuals. Similarly, many of the Internet indexing and cataloguing services available - Archie, Gophers, World Wide Web servers - have been set up and run by university departments. Manufacturers often effectively sponsor an Internet site by providing the computers, and the huge amounts of storage they require, free of charge. For example, the important UK FTP site at Imperial College (src.doc.ic.ac.uk) is sponsored by Sun Microsystems, which receives due acknowledgement each time you log on.

Internet Providers

There are hundreds of Internet providers and it is difficult to choose between them. Most, however, rely upon the services of a select few. These include Demon Internet Services, EUnet GB, Pipex (owned by UUnet, which is itself partly owned by Microsoft) and BT. These companies, together with Ukerna (the UK Education and Research Networking Association, the body that runs the UK academic network Janet), formed Linx, the London Internet Exchange. This is a neutral point of interconnection where they can exchange data among themselves directly.

Hops

The Internet consists of many interconnected links. Cyber road maps show this structure as a hierarchical tree, with local Internet service providers connecting to national connectivity providers. For users, a more natural way of conceiving the Internet is as a series of hops between your computer and the site you are trying to reach. Different parts of the same message - that is, different packets - may take different paths, which vary from moment to moment according to the status of the intervening network.

Whatever path is taken, it consists of sections between computers that decide how to pass on the packets (the routers). These sections are commonly called hops. How many hops there are between your computer and the destination depends upon the end-points, the time of day and other factors. It can be as low as three or four, or may extend into the 20s or even 30s.

The number of hops required to reach the destination matters: the more there are, the longer the transit time, and the more chance that the packet will get lost. Indeed, if the number of hops is too large, the system may simply give up, since for reasons of efficiency packets are generally discarded automatically once they have passed through a predetermined number of hops.

How best to get your company name on the Net

An early problem of registering Internet names for companies was that sharp individuals in the US were registering the names of major corporations. These companies then found themselves unable to use their main brand online. There is now a formal requirement in the US that a name requested for use on the Internet must not infringe the intellectual property of any third party and will not be used for any unlawful purpose (full details at ftp://rs.internic.net/policy/internic/internic-domain-4.txt).

In the UK, things have always been simpler. A considerable amount of material has appeared online that will prove useful for companies considering making an application for their first or subsequent Internet names.

The main .uk country domain is divided up into various sub-domains: .ac.uk (the academic world); .gov.uk (for government bodies); .mod.uk (Ministry of Defence); .net.uk (network suppliers); .org.uk (general organisations that do not fall into any other category); .nhs.uk (for NHS organisations); .sch.uk (for schools); and .co.uk as the main commercial domain. See http://www.nic.uk/ for details.

There are formal sub-domains within these (called neutral sub-domains), such as .music.co.uk, .law.co.uk, .tv.co.uk, .radio.co.uk and .internet.co.uk (for Internet suppliers); see http://www.britain.eu.net/naming-co/other-domains.html.

EUnet GB administers the .co.uk sub-domain. However, the final decision as to whether a name will be accepted or not is down to an informal committee, called the naming-co list. It has voting and non-voting members. The former are essentially the UK Internet suppliers who have their own international connections. See http://www.britain.eu.net/naming-co/members.html.

The main page with information on how to register company names in this sub-domain is at http://www.britain.eu.net/naming-co/. Perhaps the most important information here is the hints and tips on what names will be accepted by the naming committee (at http://www.britain.eu.net/naming-co/tips.html).

Remember that the name should reflect the company applying for it, and that two- and three-letter names will not in general be accepted (unless the company is very well known, like BT or HP).

Names that are already registered will obviously not be given out again, and nor will a new name be given to a company that does not intend to stop using its old one. Requests for a large number of similar names for related organisations are frowned upon; such names should instead be created at the next level down. One easy way of finding out if a name has already been registered is to use the search engine at http://www.britain.eu.net/naming-co/whois-form.html, which will look through all the registered names in the .co.uk sub-domain. Also useful is the list at http://www.hensa.ac.uk/uksites/co/index.html, which lets you look through a consolidated list of company names in this sub-domain.

Finally, to keep up-to-date with who has applied recently, see http://http.demon.net/external/networks/ncprov.html, which shows the various company names that have been applied for by each of the Internet suppliers.

The normal procedure is to go via your Internet provider in order to register your name, although you can also apply to EUnet GB directly using the form at http://www.britain.eu.net/naming-co/user-form.html.

If all this sounds too complicated to bother with, you might want to visit the URLs at http://www.britain.eu.net/naming-co/count-couk/count.html and http://www.britain.eu.net/naming-co/count-couk/history.txt. These show how the growth of the .co.uk sub-domain is more or less exponential. In other words, if you don't register soon, you may well find yourself left behind in the great business stampede to the Internet.

Creating A Web Site

Quite how you establish a World Wide Web site is perhaps the least important aspect of the whole process. A company might choose to develop the skills of people in-house (for example in the marketing and IT departments), exploiting the fact that setting up HTML pages is extremely simple (though hard to do well). The pages will then be placed on a server run as part of the company's network and connected to the Internet through a firewall, or isolated from it completely for total security and corporate peace of mind.

Alternatively, the work can be farmed out to specialists who will write pages (rather as advertising agencies produce copy for marketing purposes) and then arrange for them to be held on a Web server where space can be rented. This has the advantage that there are no security worries, and design is left to experts. The downside is that you will obviously pay a premium for these services and there is no opportunity to learn from the process of creating pages - which means forgoing the chance to develop skills that in the future all companies will need to understand if not practise.

More crucial to the success of a Web site is the content. It is content that draws people to pages and holds them there. Design can help or hinder, but never acts as a substitute.

The three most important aspects of Web site content are that it needs to be appropriate to its intended audience, genuinely valuable (not marketing fluff) and constantly changing. You cannot just place anything on a Web page and hope that somebody finds it interesting: you need to have a clear idea of what kind of visitor you want to attract (generally this will be the same as your typical customer profile). In fact, even more than with conventional marketing, online activities need to be very highly targeted so that visitors can tell at a glance whether it is worth their while lingering and exploring further.

Assuming that they decide to do so, it will then be the richness of the online content that determines whether they stay long and what their overall impressions will be. There is nothing worse than a site that promises much and delivers little. Word will soon get out on the Internet grapevine (and in a sense the Internet is all grapevine) that the pages in question are not worth visiting, and the site will languish.

Finally, assuming that visitors find your pages of genuine interest, you need to ensure that there is at least some element that changes on a regular basis so as to draw people back. There is so much competition online that Web sites must fight for their audiences every day.

Assuming this battle is won, and visitors find the site interesting and worth returning to, you will have created a powerful online marketing tool. All the time that people view your site they are imbibing your corporate messages (either implicit or explicit). If what they have seen there is useful, impressive or entertaining, they will leave with an enhanced opinion of the site's brand and owner.

But there is another, rather novel benefit. In creating a site which meets the criteria mentioned above, you will effectively be putting together an online periodical, targeted at a particular sector (in fact the same as that served by your company). What you gain is an online readership - a readership, moreover, that, if big enough, might even be sold to online advertisers in the form of links to their Web pages. Every company setting up a WWW site thus adds a second business - Internet publishing - to its traditional activities.

How companies can set up stalls on the Internet

Once a company has decided to use the Internet for promotional or sales purposes by creating some kind of publicly-accessible site - whether FTP, Gopher, telnet or World Wide Web - it is faced with the problem of letting potential visitors know about its existence. For one of the novel aspects of the Internet is that its users are active rather than passive: it is they who decide to go to the company rather than waiting for the company to come to them, as with conventional marketing.

An obvious place to start in the process of informing people about a new site is to join the great list of commercial sites at Yahoo (http://www.yahoo.com/Business/). This offers perhaps the most comprehensive list of companies on the Internet, and has an extremely large number of visitors who use it as a jumping-off point.

The entries in Yahoo are strictly informational, with little scope for imaginative design, subtle approaches or more comprehensive material. To meet this clear need for a forum where companies on the Internet can inform and attract potential visitors to their sites, a new kind of Web site has evolved, generally called an Internet shopping mall by analogy with the physical equivalents that are such a feature of the US.

The pre-eminent source of information in this area is called, appropriately enough, The Internet Mall. It was begun in February last year as an E-mail document with just 34 companies offering various items for sale over the Internet. Today, it has over 1,000 cybershops, with tens joining each day, and the main document is almost a megabyte in size; if your E-mail system can cope, it can be retrieved by sending the message send fullmail to the address taylor@netcom.com. It might be more advisable to retrieve instead one of the fortnightly updates: use the message send mall passed to the same address. The full list can also be retrieved by FTP from ftp://ftp.netcom.com/pub/Gu/Guides/Internet.Mall.

But perhaps the best way to access the information is through its hypertext incarnation at the URL http://www.mecklerweb.com/imall/. This very impressive site offers various ways of approaching the material: a thematic organisation (based around mall 'floors') and a search engine. There is also information (at http://www.mecklerweb.com/imall/howto.htm) on how to add your own company to the list.

It costs nothing to join the Internet Mall since the Internet Shopping Network sponsors the project. The latter is part of the US cable TV company Home Shopping Network, which had sales of $1.2 billion in 1993. The Internet Shopping Network can be found at the URL http://shop.internet.net/, and claims to be the world's largest Internet shopping mall with 600 companies (presumably the original Internet Mall is not included since it is more of an information source than a commercial venture).

Nor is the Internet Shopping Network the only Internet mall to be owned by a major company with plenty of experience in online selling. NetMarket (at http://www.netmarket.com/) is part of CUC International, which sells goods at discount to its 30 million members using conventional means, and obviously hopes to do the same in cyberspace. Particularly interesting are the pilots of the advanced interactive shopping services (follow the Business Solutions link on the home page above).

Alongside these giants there are many other smaller Internet malls. A comprehensive list can be found at the URL http://www.yahoo.com/Business/Corporations/Shopping_Centers/. In the UK, for example, there are sites at Apollo UK (http://apollo.co.uk/) and MarketNet (http://mkn.co.uk/). Also of note is Downtown Anywhere (http://www.awa.com/) which offers both commercial and other information in a form that extends the basic metaphor of the US mall to include additional areas in a virtual city.

A Web page has been set up to aid the submission of new Web sites to the main global listings and search engines such as Yahoo, EInet Galaxy, Lycos, Web Crawler and so on. It allows you to cut and paste your URL to the various submission forms. It can be found at http://www.cen.uiuc.edu/~banister/submit-it/ or http://submit-it.permalink.com/submit-it/.

A comprehensive list of those pages willing to consider new URLs for inclusion in their holdings can be found at http://www.homecom.com/global/pointers.html. The list includes HomeCom global Village (http://www.homecom.com/global/gc_entry.html); a graphically based "Starting Point" for new site listings (http://www.stpt.com/); The World Wide Yellow Pages (http://www.yellow.com/); Open Market (http://www.directory.net/dir/submit.cgi); Entrepreneurs on the Web (http://sashimi.wwa.com/~notime/eotw/EOTW.html); The Centre of the 'Comp-Uni-Verse' for Net Surfers (http://netcenter.com/yellows.html); BizWeb (http://www.bizweb.com/InfoForm/infoform.html); and Net Search (http://www.ais.net/netsearch/).

Why everyone on the Net should know about HTML

Although most of the essential components of the Internet have been in existence for nearly 25 years, it is only in the last two or three that the Net's use in business has become widespread.

In part this has been caused by the increasing appreciation of how E-mail can simplify and extend all kinds of business contacts. However, this corporate awakening has been largely due to the introduction and extraordinarily rapid take-up of the World Wide Web. Even before the exciting latest developments of Java and VRML, the Web offered a business medium with immediate impact, interactivity and a sense of immersion hitherto lacking in Internet services. Its applications for marketing and sales purposes, particularly the former, have been so obvious that there has been surprisingly little resistance within corporate structures to at least experimenting in this new arena.

And so it is that some sense of how Web pages are put together - what is and is not possible - is increasingly becoming a prerequisite for modern managers.

If you use a Web browser that offers a bookmark/hotlist feature to store your favourite Internet sites, this is effectively your own home page, although represented on screen in a slightly unconventional form. In fact it is a trivial matter to export a set of bookmarks and turn them into a fully-fledged and even more accessible home page that can be loaded as the default every time you run your browser.
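
As a sketch of what such an exported page might look like, a simple HTML file of links is all that is required (the sites chosen here are merely placeholders):

  <HTML>
  <HEAD><TITLE>My hotlist</TITLE></HEAD>
  <BODY>
  <H1>Favourite sites</H1>
  <UL>
  <LI><A HREF="http://www.yahoo.com/">Yahoo</A>
  <LI><A HREF="http://www.internic.net/">InterNIC</A>
  <LI><A HREF="http://www.mecklerweb.com/imall/">The Internet Mall</A>
  </UL>
  </BODY>
  </HTML>

Save this as a local file and set it as the browser's start-up page, and your bookmarks load automatically every time the browser is run.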

For those who remain sceptical about the relevance of Web page creation in the ordinary business context, there is now another reason why some knowledge of HTML is likely to become an indispensable skill for the modern manager.

The use of internal TCP/IP networks - the so-called "intranets" - is already a major trend, especially in the US. There, such internal E-mail, news and Web systems are fast turning the long-promised groupware/Executive Information Systems into reality. Where intranets flourish, personal pages are usually encouraged as a way of allowing staff to participate and to add a human touch.

A by-product of this is that those unable to write their own HTML pages will be denying themselves the opportunity to stake a personal claim in this new, very public corporate space. Adding your home page gives you the chance to present to both peers and superiors a honed self-image that can act as a kind of online CV in permanent application for that next promotion.

It is surely not too fanciful to predict that in the future the ability to knock up a solid Web page will be just as useful as good memo or report writing skills are today.

How to get the best from your Net site pages

Writing World Wide Web pages is very easy, but as with so much else on the Internet, understanding the underlying principles can be an enormous help in getting the most from the tools available and avoiding the various pitfalls. For example, few are aware that the HyperText Markup Language (HTML) used for creating Web pages derives part of its name and much of its underlying philosophy from the world of the Standard Generalised Markup Language (SGML).

SGML is about defining the logical structure of documents in a formal and self-consistent way (for an excellent introduction see Readme.1st, SGML for Writers and Editors, £30.47, ISBN 0-13-432717-9). Although this might appear to be a fairly dry, academic exercise, it has considerable benefits. It allows you to take an SGML document and use it in many different contexts. For example, it may be displayed on a screen, printed on a dot matrix or laser printer, or even converted into Braille. Because SGML codifies the structure of the document, it allows each logical element of the document to adjust itself according to the final medium, without the need for further user intervention.

HTML is what is known as an application of SGML. This means it uses the conventions and ideas of SGML to define the basic structural elements of its documents. HTML is an extremely simple application of SGML, but there are two very important properties that must always be remembered when using it to create Web pages.

First, HTML is not a language for describing the appearance of a page, however much it might seem to be. Instead, it describes the underlying structure of that page. In this it differs radically from the more familiar desktop publishing programs. These explicitly define the size and position of all the components of a page.

Because HTML describes the structure, not the appearance, when creating a Web page you do not know how exactly it will appear on the screen of the person viewing it. To understand why this is so, it is worth examining what happens when a Web page is requested and retrieved.

When you use a browser such as Netscape to access a Web page, say the one at http://www.gm.com/index.htm, a request is sent across the Internet to the Web server at that address. Assuming that the requested page exists and is freely available (some require passwords before they are sent), the server returns that page to the browser over the Internet.

The page itself consists of nothing but a text file, written according to the rules of HTML. Within this document there may be references to multimedia elements (graphics, sounds etc), in which case these are sent separately. When the main HTML document arrives at the browser, it is processed in a simple but important way. The HTML structural markers are located (following SGML conventions, they are all written between angled brackets <>), and then converted to an on-screen representation. However - and this is crucial to remember - it is the browser that chooses what form this will take.

For example, some of the commonest structural elements within an HTML document are various levels of headings. When creating the HTML file, you simply specify that certain phrases are first or second level headings, etc. You are not able to specify the size of headings or a certain typeface, since these are determined by the settings of the browser that processes your HTML document. These settings can often be altered by the user.
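
A short fragment illustrates the point; the wording is invented, and how each heading actually appears is left entirely to the browser:

  <H1>Annual results</H1>
  <P>Turnover rose sharply this year.</P>
  <H2>Regional breakdown</H2>
  <P>Europe accounted for half of all sales.</P>

The H1 and H2 tags say only that these lines are first- and second-level headings; one browser may render them in large bold type, another in plain text with extra spacing.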

The fact that HTML is an SGML application therefore changes the way you must think about the Web pages you create. Remember that what you see on your screen is not what other users will necessarily get, and that special and "clever" effects will almost certainly be lost on some browsers. For HTML documents, logical structure, not dramatic layout, is paramount, and simple but effective design is the order of the day.

How to survive the data deluge from the Internet

The culmination of the incredible progress made by the various Web search engines has been the recent appearance of services offering full-text searches of a very large proportion of the Internet, with all of the Web promised for some time in the near future.

The more you use these search engines, the more you become aware of the yawning gap between what is available globally and what is available locally. The problem is exacerbated by the sheer richness of the Internet and the fact that most of what you find is free. The temptation to download a file or a page is almost irresistible.

The result is plain to see on anyone's hard disc: tens if not hundreds of Mbytes of Internet-derived data and programs. For anyone who uses the Internet regularly the challenge is to manage this local data flood in the same way that the search engines are helping people cope on the global scale.

Fortunately there are solutions available that give you much of the power of an Alta Vista or Open Text search through the contents of your hard disc. Although these have not been designed specifically for Internet users, they lend themselves very readily to the task.

There are two programs for PCs running Windows: AskSam, which costs £99.95 from Guildsoft (01752) 895100, and ZyIndex, costing £395 from ZyLab UK (01235) 861681.

Both allow you to carry out full-text searches of groups of documents and other text files that have previously been indexed using this software. Both programs offer a good range of features, including Boolean searches (using AND, NOT, OR, etc), proximity searches (finding two words within a certain distance of each other) and fuzzy searching (where near-equivalents can be found for a given search word).

Response times are good for both: just a few seconds to search through several Mbytes of data. However, the approach taken is quite different in each case, and which you choose depends to an extent on your working environment.

For example, AskSam creates an entirely new file containing the data and the index, roughly the same size as the files themselves. This has the virtue that you can carry copies of this indexed version to other machines running AskSam.

ZyIndex, in contrast, creates an index that is separate from the files themselves. This has the advantage that the index is smaller than that of AskSam, and allows data spread across a network to be indexed more efficiently.

The downside is that once words and phrases have been located, you need to load the file's native application (e.g. Microsoft Word) to view it in its original form. (ZyIndex can only display rather crude ASCII excerpts.)

For the Macintosh there is the program On Location, £99 from ESP (01628) 23453. This offers fewer text retrieval facilities than the PC programs, but also serves as a more general file-search tool.

These free-text search engines solve one part of the problem of the data deluge - how to find things - but leave untouched another aspect. Using a 28.8Kbit/s modem it is quite possible to download tens of Mbytes of files from the Internet in a day, and with leased lines much more can be obtained.

Even with Gbyte hard discs, such a stream of files can soon fill up the most ample storage. Moreover, many programs (such as the latest versions of Netscape Navigator) are so large they will not fit on a floppy disc (and dividing programs across floppies is not simple).

Happily, a new generation of low-cost removable discs has arrived to meet this need. Products such as Iomega's Zip drive (available for the PC and Macintosh, priced at about £108+VAT) make it possible to store 100Mbytes of data on a disc costing under £10. And if that isn't enough, the Iomega Jaz drive (£229+VAT) offers no less than one Gbyte on a removable medium (£52), which should keep even the most voracious of downloaders happy. Iomega is on (0800) 973194.

Creating a World Wide Web for every tongue

Few people have tried to add multilingual capabilities to their browsers. A good reference point for users interested in this area is the site on multilingualism and the Internet (http://wwli.com/library/localize.html).

Perhaps not surprisingly, Microsoft's Internet Explorer is well thought-out as far as international use is concerned. For all its faults, Microsoft is a company that is keenly aware of the importance of localisation for software. Internet Explorer is available in nine languages (that is, with menus changed appropriately), and it is also easy to add extra character sets to view multilingual pages.

This is done by downloading the appropriate font file (http://www.microsoft.com/msdownload/ieadd/03.htm); the choices are Simplified Chinese, Traditional Chinese, Japanese, Korean and Pan-European. Running the file causes it to be installed and the relevant changes for Internet Explorer to be made automatically. Thereafter, a pop-up list of available character sets appears in the bottom right-hand corner of the browser window. When you encounter a page using a character set other than Latin-1 you can simply select from this list to refresh the page.
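
A page can also declare its own character set, so that a capable browser can switch automatically rather than waiting for the reader to use the pop-up list. A hedged sketch for a Japanese-language page, using the standard META declaration in the document head, might look like this:

  <HEAD>
  <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift_JIS">
  <TITLE>Japanese-language page</TITLE>
  </HEAD>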

Adding this capability to Netscape is much harder, and reflects this young company's relative inexperience in dealing with international markets. First, you need to find the relevant fonts yourself (in practice, the simplest solution is to use those provided by Microsoft). Once these have been installed, you must then activate them in Netscape. This is done from the Options menu, choosing General Preferences and Fonts. For each of the encodings you specify the font that you have added. Then, to use this encoding for a Web page, you will need to go to the drop-down list available on the Document Encoding entry on the Options menu.

None of this is very intuitive; worse is the fact that for Japanese font capabilities you have to edit an entry in the Windows registry - the software equivalent of brain surgery, and about as risky. If you want full foreign language capabilities for Netscape, it may be easier to buy the plug-in (http://www.accentsoft.com/) called Navigate with an Accent from Accent Software. This adds a new drop-down list of character sets alongside the main menu buttons. An evaluation copy (http://www.accentsoft.com/download/dleng.htm) is available. Unfortunately the add-in disables important features such as plug-ins, frames and Java.

Accent produces its own standalone browser (based on the original Mosaic). This too adopts a drop-down list of language options, though strangely Chinese is absent. Accent does, however, offer both Arabic and Hebrew, something that neither Internet Explorer nor Netscape is capable of. An evaluation copy is available from the URL given above. Another product in the Accent range that can be downloaded from there in a trial version is Accent Publisher. This addresses the other side of the multilingual problem: creating Web pages with character sets other than Latin-1.

With Accent Publisher, you can design pages in most European languages plus Arabic and Hebrew (floating keymaps let you use a QWERTY pad to enter non-Latin characters) and then convert them to HTML files automatically. More advanced features such as tables are supported. Also notable is the ability to swap among 21 languages (including Arabic, Greek, Hebrew, Russian and Turkish) for the menus.

Another browser product based on the original Mosaic is Tango, from Alis Technologies (http://www.alis.com/), whose site was mentioned last week as a useful starting point for exploring Internet multilingual issues. An evaluation copy (http://www.alis.com/internet_products/try_form.en.html) can be downloaded. Tango can display no less than 90 languages, including Arabic, Chinese, Greek, Hebrew, Korean, Russian and Thai. The interface can be switched to any of 19 languages. The corresponding creation software called Tango Creator lets you compose HTML pages in 90 languages using character sets other than Latin-1, and supports tables and frames.

Why the Web will be the font of all corporate data

Several surveys in the US have indicated that the majority of larger companies there have already implemented, or are implementing, intranets. The UK is still a little behind in this area, but as with the Internet in general it is probably further advanced than any other country outside North America.

Since even the most basic of intranets - one where you replace costly and inefficient paper-based communication systems within companies by equivalent TCP/IP network technologies - is so simple to grasp and so compelling as an idea, it is easy to forget its intrinsic limitations.

Much of the intranet's enormous potential derives from the use of Web browsers as the common front-end to corporate information. These are easy to use and platform-independent, making rollout across a company straightforward both in terms of training and development. But where simple intranets fall down is at the back-end.

Normally an internal Web system will run off one or more Web servers; this will therefore mean that all of the information to be made available on the intranet must first be transferred (and possibly translated) to the store of HTML documents that are served across the network. This is relatively straightforward for text documents, but for anything more complex - in particular for the kind of information held in corporate databases or financial systems - the simple intranet approach is not enough.

As a result, there is now a growing interest in the marriage of the conceptual simplicity of Web front-ends with the rich complexity of corporate databases, mediated by the TCP/IP-based intranets.

The vision driving the very many disparate approaches being developed is to employ the Web browser as the universal client that will allow anyone to access any information held on a company's heterogeneous array of data servers. This would avoid the need for new, proprietary software on every desktop or costly re-training.

In many respects, this coming together of Web and database represents the second generation of the Internet and intranets in business. On the external Web such systems are paramount for electronic commerce, and are still thin on the ground as a consequence of the work required to set them up.

There, typically, a customer will access a database of product information via a Web browser, and place an order. The customer and order details would then be entered into another database and fed from there into the vendor's fulfilment system. Both sides of the equation therefore require tight integration of the Web front-end and the database back-end.

The importance of this integration of Web and database is even more crucial for intranets.

Corporate databases in all their forms represent a unique and therefore highly valuable store of information about customers, markets, divisions and future trends. Enabling people to get at these easily, and to drill down in myriad ways to find other, possibly unsuspected kinds of data deposits could offer major benefits in terms of a company's day-to-day running and longer-term planning.

Because the Web, intranets and their associated standards are so new, and because databases are so much an established part of corporate computing, there remain many thorny problems to be resolved before this golden age of internal information retrieval dawns. Although it is quite possible to lash up quick fixes using more or less any of the many development tools that are currently available, the strategic importance of this area means these must be superseded by solutions that are properly thought-out, reliable and fully scalable.

Why small is beautiful on an intranet

Scalability is one of the main reasons why the TCP/IP protocol is used for business intranets. The same basic elements can be used for a multi-national company employing hundreds of thousands of people as for a small office network: at no point is it necessary to switch technologies as the number of users increases, and there is no need for bridges between different networks.

Another great advantage of intranets is that they are very easy to set up, although maintenance may be another matter. This means that they can be rolled out not just centrally, but locally too, with each department, office or even worker able to create and run their own personal servers, which will typically be Web-based.

Clearly, money is an important issue: if many sites are being set up, the overall expense can soon mount up. This means that traditional heavy-duty Internet or corporate intranet solutions such as Sun Sparc systems running Solaris are ruled out for widespread local use. One solution that might well find favour is to use a PC running the Linux operating system and the Apache Web server, both of which can be downloaded for free.

Although some managers may be nervous about entrusting their data to free software, they can be comforted by the knowledge that more than 40% of all Internet Web servers use this software, according to the definitive Netcraft survey. Another obvious candidate is Windows 95: Microsoft has brought out its aptly named Personal Web Server, which is free, while Netscape has its Fast Track Server (£220), and there are many other freeware and shareware programs for Windows 95.

However, as anyone who has used the platform for a while will know, Windows 95 is hardly robust enough for this kind of role. Windows NT is far more appropriate, albeit more expensive. This is particularly the case since Microsoft has chosen to impose an arbitrary limit on the maximum number of simultaneous TCP/IP connections on its cheaper Windows NT Workstation product, which more or less forces you to use Windows NT Server if you are likely to exceed this limit. The benefit of moving up to NT Server is that you automatically get Microsoft's Internet Information Server (IIS) for free. This is an impressive piece of software, now in its third release, and may well be justification enough for swallowing the price of NT Server.

An alternative is Netscape's Fast Track server, which also runs under NT. A particular benefit of this product is the way it employs a Web browser as the main administration tool, rather than a standalone program as with Microsoft's IIS. Indeed, this is likely to become the standard way of administering many functions, and Netscape deserves much of the credit for pioneering this approach. Alongside these high-profile products, there are two others that are worth noting: Purveyor Encrypt (£599 from Process Software) and Website (which costs £365 from O'Reilly & Associates). Both are notable for the excellence of their documentation.

Nor do they suffer in comparison with Microsoft's and Netscape's products. Like them, Purveyor and Website offer secure transactions (increasingly important even for small intranets), as well as easy access to programming interfaces. Where Microsoft and Netscape have ISapi and NSapi, Purveyor has ISapi too (which Process helped Microsoft develop) and Website has WSapi. All four products have sophisticated ways of integrating with back-end databases.

Version 2.0 of the Wingate program is now available. This software allows entire networks to be connected to the Internet through a single Windows 95/NT machine, which acts as a proxy server.

How to make money from your Web site

One of the initial promises of commercial Web sites was to generate revenues from the huge and global audiences the Internet offers. Three revenue models were available: subscriptions, advertising and commerce. Subscriptions, almost without exception, have failed miserably: Internet users seem unwilling to pay for something they have by now come to think of as free. E-commerce is still very small-scale, though recent forecasts have become very optimistic in the light of technological developments (the agreement of the SET standard, and so on).

This has left advertising as the primary means of making money with Web sites, notably through the use of banner ads. These are block-shaped advertisements, typically placed at the top and bottom of Web pages, with links to the advertiser's site. Practically all of the most popular sites employ them, and earn millions of dollars as a result. For those companies that have established a Web site and built up a more modest readership, it has hitherto not proved possible to convert this audience into money. The main problem is creating a viable sales infrastructure: setting up a Web site is one thing, managing an ad sales force to go with it, quite another.
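
In HTML terms, a banner ad is nothing more exotic than an image wrapped inside a link to the advertiser's site; a sketch, with invented addresses and the commonly used 468-by-60 pixel size, might read:

  <A HREF="http://www.example-advertiser.com/">
  <IMG SRC="banners/example-advertiser.gif" WIDTH=468 HEIGHT=60
       ALT="Visit Example Advertiser"></A>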

This problem has led to the rise of services that sell advertising space on others' Web sites. There is much to be said for this idea. For the Web site owner, it obviates the problem of running an ad sales team, and even small sites can gain revenues by being bundled with others to form attractive packages of sites - advertising networks, as they are called. For the advertiser, there is the advantage of a single purchase, along with better targeting: with large advertising networks it is possible to put together tailor-made groupings of sites.

Moreover, the technique employed by most services - whereby the advertisements are held on a central server and accessed by URL references to them in the participating sites' Web pages - means that obtaining statistical information about visitors is much easier. Although this is a very new area, there are already hundreds of companies offering these services. The best resource is the excellent collection of links at Web Site Banner Advertising (http://www.ca-probate.com/comm_net.htm). As well as companies offering to sell advertising space on a company's Web site on a commission basis, listed here are also those that hold ads centrally and pay according to hits. Other variants include companies that use an auction technique (for example AdBot http://www.adbot.com/) and even host the entire site themselves (Intercity Oz Network http://interoz.com/network/).

There are a few relatively well-established services such as DoubleClick (http://www.doubleclick.net/), SIMWeb ( http://www.simweb.com/, part of Softbank) and WebRep (http://www.webrep.com/). The quality of their current clients is perhaps the best guarantee that they are offering a serious service generating real money. However, they only deal with very large sites. Of the newer entrants, it is probably safe to assume that schemes involving well-known companies such as Infoseek (http://info.infoseek.com/network/info.html), British Sky Broadcasting and the UK publisher EMAP (http://www.webwidemedia.com/ ) will be less problematic than working with unheard-of start-ups.

But in general the watchword for this whole new area is caution. Few of these companies have any track-record that can be examined, and it will take many months - perhaps years - before this interesting idea has been refined into something that will allow ordinary corporate Web sites to generate money in this way. Until then, you may prefer to adopt a rather safer strategy. Rather than worrying about whether the people selling space on your pages are just making money out of you, you can simply exchange banner ads with other sites. The relatively well-established Link Exchange (http://www.linkexchange.com/) has been organising this for some time, and uses a system of ad banner credits. It costs nothing to join and there are no charges. There is a UK-based operation called LinkSwap UK (http://www.ukwebmarketing.com/linkswap-uk/) that is also free, and even simpler.

New tools to create the perfect Web site

NetObjects Fusion is notable for the power of its design tool; you can place elements on a Web page with pixel-level accuracy. This means that designers can fine-tune their pages and be reasonably sure that what appears in the visitor's browser will look very similar to their intentions. However, there is a price to be paid for this very unWeb-like behaviour: the huge number of tables and small graphical elements the program uses to pad out space.
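
The kind of markup such a tool generates to pin elements in place can be sketched roughly as follows: a borderless table with fixed cell widths, padded out with tiny transparent "spacer" images (the file names here are invented):

  <TABLE BORDER=0 CELLSPACING=0 CELLPADDING=0>
  <TR>
    <TD WIDTH=120><IMG SRC="spacer.gif" WIDTH=120 HEIGHT=1></TD>
    <TD WIDTH=360>Body text begins exactly 120 pixels from the left margin.</TD>
  </TR>
  </TABLE>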

HTML purists will probably be shocked if they examine the HTML code generated, and they will certainly find it very hard to edit by hand. However, if such matters do not worry you, and design considerations are paramount, then Fusion is probably the best tool. It also offers standard site management facilities such as link checking, and a very visual, though fairly limited, approach to hooking up with back-end databases.

Microsoft's FrontPage 97 is more orthodox, and it is much easier to edit the HTML code directly, not least because FrontPage produces code that is automatically indented (examples of this can be seen at the site http://dialspace.dial.pipex.com/glyn.moody/, created using FrontPage).

Microsoft also offers other, complementary tools for the Web site development process. For example, you can use Visual Source Safe to manage simultaneous development, while the new Visual InterDev is a powerful tool for linking Web pages to back-end databases and creating Active Server Pages, which generate HTML on the fly.

Corel's WebMaster Suite provides a similar range of tools for editing pages (Web.Designer), managing a site (Web.SiteManager) and database integration (Web.Data, which employs a very simple nine-step approach), and trial versions can be downloaded. In some respects Corel's tools are even better than Microsoft's. For example, as well as indenting HTML source, the suite also colour-codes the links within it according to whether they have been checked, are broken, and so on.

SoftQuad's Web site tools represent in many respects the antipodes to NetObjects' Fusion. Where the latter is particularly concerned with appearance, the former concentrates on content. One consequence of this is that HoTMetal Pro 3, SoftQuad's Web page editor, is now looking distinctly old-fashioned with its non-WYSIWYG approach and its limited design capabilities.

However, its site tool, called Information Manager, offers a number of interesting features. For example, unlike Fusion's simple hierarchical representation, or the more flexible approaches of FrontPage and WebMaster, Information Manager uses what it calls a Cyberbolic display of elements employing a spherical geometry to give you an ingeniously compact view of your site.

Perhaps even more impressive is SoftQuad's intranet product, HoTMetal Intranet Publisher (HiP). This offers the same basic features as the Internet product, but adds the extremely powerful ability to add user-defined extensions to HTML to allow different views to be produced from the same basic HTML code (so that accounts and production can pull out different relevant information from the same document, for example).

It does this by employing cascading style sheets; HiP is the only Web site tool to support this new standard. Also provided is a server-side tool for monitoring Web site usage and sending users notifications of changes to pages of interest. However, HiP's lack of database tools is a serious weakness in the current product - one that is apparently being addressed in the next release.

How the Web is going to turn up everywhere

Microsoft is making a fascinating attempt to integrate Internet functionality directly into the Windows operating system: in effect, the Web browser becomes the interface to the entire computing environment. Although bold - not least for the rigour with which it has been carried out - this move is by no means unprecedented.

For example, last year Microsoft introduced NT Web Admin to allow Windows NT environments to be administered using a Web interface. Similarly, it has been possible for some time to control both Microsoft's Internet Information Server and the cut-down Personal Web Server using just a Web browser.

There are two big advantages of this approach. First, and most obvious, is the ability to exercise control at a distance: if the target system is connected to a TCP/IP network it can be manipulated from anywhere. More subtly, Web interfaces draw on the intuitive nature of the browser. One of the reasons why the World Wide Web has taken off so dramatically, particularly in business, is that it requires only the most minimal training. As some cynics have put it, a browser is software even a Chief Executive can use.

The credit for this shift towards using the Web for general interface purposes must be given to Netscape. When it launched its Web servers it adopted the then-new technique of administering them using a standard Web browser (which accesses the server on a non-standard port number). Microsoft's adoption of this technique sets the seal on the idea. The beauty of the Web approach, along with its essential simplicity, is that it can be applied to almost any field. For example, Pipex has created a service whereby its Internet subscribers can read their E-mail using a Web interface - which means that it can be read anywhere in the world that an Internet connection is available.

More ambitious is the attempt by a consortium including Microsoft, Cisco, Compaq and Intel to create a complete Web-based approach to managing all aspects of networks, and from any location. An exemplary site devoted to the initiative has been created, with details of the basic ideas and the elements involved, including an illuminating Web-based demo and full details of the new HyperMedia Management Protocol.

Nor is this all promises; BMC has already added support for this proposed standard to its Patrolwatch Management Suite. Similarly, Cisco has employed a Web interface for its ClickStart management software for some time. Other network hardware devices are also being drawn into this approach. For example, Hewlett-Packard has developed Web JetAdmin for managing its JetDirect printers, while IBM has created a Java-based Network Printer Manager. Even CD-ROM units can now be controlled from a distance through a Web browser.

The Web interface can also be applied to software, as the server administration tools described above show. But so powerful is the approach that it can be used with any kind of application. For example, both Softquad and Netiva have come up with databases that are accessed purely through Web interfaces. And the use of Java means that potentially any kind of functionality can be added to a Web page while retaining the basic metaphor.

This Web-based approach could become even more central to computing if the ambitious vision underlying the next version of the TCP/IP protocols, IPv6, is realised. IPv6 was designed in part to allow IP addresses to be allocated to just about every electrical object on the planet - from light bulbs and toasters up; what could be more natural than accessing them from a distance via a Web interface?

Make a spectacle of your Web site

Web site development is no longer a job for amateurs and enthusiasts, writes Brian Clegg. As delegates at the Web 98 Design and Development Conference, held in Boston in September, can vouch, doing a professional job requires knowledge of strategy, usability, information and visual design, and programming.

The first consideration in setting up a Web site is the server software needed to host it. A Web site is, in effect, a simple database. The server uses the hypertext transfer protocol (HTTP) to deliver the specific Web pages requested by many browsers simultaneously.

But delivering pages efficiently is no longer enough: a Web server must be able to execute programs to retrieve data, format pages on request and give a page life.

Depending on the server's abilities, this external program might be written in languages such as Perl, C++ or Visual Basic. In these, an application communicates with the Web server using a Web technology called the common gateway interface (CGI).
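
From the page's point of view, all that is needed is to name the external program the server should run; a hedged fragment (the script path and field names are invented) shows a form handing its contents to a CGI program:

  <FORM METHOD="POST" ACTION="/cgi-bin/search.pl">
  Search for: <INPUT TYPE="text" NAME="query">
  <INPUT TYPE="submit" VALUE="Search">
  </FORM>

When the form is submitted, the server runs search.pl, passes it the field contents, and returns to the browser whatever HTML the program writes out.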

Microsoft has provided an extra twist by adding ISapi - an interface that allows external programs to plug more directly into the server.

Operating system
The next thing to consider is the operating system on which the Web server will run. If the platform is Windows NT, the obvious choice is Microsoft's Internet Information Server, which is free, fast and feature-rich.

Page functionality can be added by writing directly to ISapi, using any development environment that can build a dynamic link library, or by scripting Active Server Pages using Perl or the popular browser scripting languages. If you want to provide access to a wide range of data, try OLE-DB, Microsoft's latest object interface for databases.
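
An Active Server Page is essentially an ordinary HTML file with server-side script embedded between <% and %> markers, which the server executes before the page is sent. A minimal sketch (the greeting logic is invented) might be:

  <HTML>
  <BODY>
  <% If Hour(Now) < 12 Then %>
    <P>Good morning.</P>
  <% Else %>
    <P>Good afternoon.</P>
  <% End If %>
  <P>This page was generated at <%= Now %>.</P>
  </BODY>
  </HTML>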

Even more development options are available with O'Reilly's Website Professional, which adds CGI and its own proprietary interface. However, it costs about £500, has less general functionality and is probably better suited to smaller enterprises.

If the platform is Unix, the most powerful choice of Web server is Netscape's Enterprise Server. Here, development interfaces are provided for NSapi (Netscape's equivalent of ISapi), for CGI and for the Java language. Enterprise Server also has a huge range of connectivity options, making it an excellent choice for fronting up a database. But it isn't cheap - about $1,200 (£750).

One budget option is Apache. It is limited to CGI but has add-on modules to extend it - and it's free. If you have a free choice, Internet Information Server probably has the edge at the moment, with Enterprise Server coming a close second.

Once the server is chosen, the site needs to be built. Hypertext markup language (HTML), which is used to describe Web pages, is plain text.
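
Because it is plain text, a complete if minimal page can be typed into any editor; a sketch (the wording is invented) looks like this:

  <HTML>
  <HEAD><TITLE>A minimal page</TITLE></HEAD>
  <BODY>
  <H1>Welcome</H1>
  <P>This page was written by hand in a plain text editor.</P>
  </BODY>
  </HTML>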

However, developing a site with a text editor such as Notepad in Windows is tedious and risky. Most developers use a visual design tool, which acts like a word processor for the Web.

Simple pages can be created using actual word processors such as Microsoft Word and Lotus Wordpro, which generate HTML from an ordinary document. Developers can also use free Web editors such as Netscape's Composer which is included in the Communicator Suite, or Microsoft's Frontpage Express which comes with Windows 98 or Internet Explorer 4.

For professional sites, though, products such as Softquad Hotmetal Pro, Microsoft Frontpage, Adobe Pagemill and the unusual free-format Netobjects Fusion offer a much wider range of features.

These typically allow Web designers to preview pages in multiple browsers. They offer high-end facilities such as style sheets, and support for scripting. There are also software packages that manage the Web site, map its layout and check for dead ends, and upload pages to the live site from a test environment.

Style sheets
As the tools have matured, so has the underlying language. HTML has been revised and extended. Most notably, it now offers cascading style sheets, providing a mechanism to fix items on a Web page and set standard styles across a site.
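
A hedged sketch of a style sheet embedded in a page's head (the choice of typeface and colour is arbitrary) shows how a standard style can be set once and applied to every heading and paragraph:

  <STYLE TYPE="text/css">
    H1 { font-family: Arial, sans-serif; color: navy }
    P  { font-family: "Times New Roman", serif; margin-left: 2em }
  </STYLE>

In a full site, the same rules would normally be held in a single style sheet file and linked from every page, so that changing one file restyles the whole site.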

At the same time, other extensions enhance the presentation of data within Web pages (XML), and make possible pages that can be tailored to an individual or change on the fly.

Traditionally, scripting ran on the server. This meant it was independent of the functionality that the end-user's Web browser had to offer. But increasingly, scripting has moved to the browser.

Both major manufacturers support the de facto standard Javascript, a scripting language whose syntax resembles Java's, while VBScript, derived from Visual Basic, is available in Microsoft's browser or through a Netscape plug-in.
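
A minimal sketch of browser-side scripting (the message is invented) is simply a SCRIPT block embedded in the page; the HTML comment markers keep older browsers from displaying the code as text:

  <SCRIPT LANGUAGE="JavaScript">
  <!--
  document.write("This line was written by the browser, not the server.");
  // -->
  </SCRIPT>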

One development that has been around for a while without achieving the penetration initially expected is "push". This technology enables users to subscribe to "channels" - special subsets of sites with active information. By contrast, another advance - hybrids of the Web and TV - is already under test.

Few environments change more quickly than the Web, and consequently Web site development remains a frantically moving world.

Definitions

Active-X

The surprisingly wide support that Java has generated derives in part from manufacturers excited by the possibility of software-on-demand, perhaps sold on a per-use basis and delivered directly to your machine over the Internet.

Microsoft has responded vigorously to this and has come up with its own Java-like approach. Recognising that there was a demand for 'componentware', Microsoft has plucked a fairly technical aspect of its programming products from obscurity and promoted it to linchpin of its new Internet strategy.

ActiveX is the latest incarnation of OCX, which itself derives from OLE's Component Object Model and evolved from VBXs (Visual Basic custom controls) found in Microsoft Visual Basic. These are software elements - components - that can be used in a variety of projects.

ActiveX controls are an outgrowth of this software recycling. They add extra features, the ability to work in both 16- and 32-bit environments, and greater portability than that offered by VBXs. This last property has allowed Microsoft to recast them as its platform-independent Java-killer, with the added bonus that this builds on the highly popular and well-understood technology of VBXs.

The distinction between ActiveX components and Java is that the former are compiled, so the appropriate binary for the target computer has to be downloaded when required. Java is transmitted as platform-neutral byte-code and compiled or interpreted on demand.

Microsoft's Visual J++ mirrors its Visual C++ tools and includes a way of turning Java applets into ActiveX controls. Using Visual Basic scripts as the glue that binds all these ActiveX elements together, Microsoft has managed to extend the functionality offered by Java.

Microsoft has made available a free software development kit (SDK) for use with its new Visual InterDev Web application development system. The Design-Time ActiveX Control SDK aids the creation of server-side components for use in Active Server Pages.

Why Microsoft has made ActiveX take a back seat

The most interesting aspect of the UK launch of Internet Explorer 4.0 was not what was said but what was omitted. During the surprisingly amateurish two-hour presentation the ActiveX approach was referred to neither directly nor indirectly. ActiveX was introduced back in December 1995 as Microsoft's answer to the then relatively new Java applets. Rather than taking a chance on new and untried technology, so Microsoft's argument went, far better to go with ActiveX, a new incarnation of OCXes.

According to Microsoft, ActiveX could match all the exciting new interactive and dynamic features offered by downloaded Java applets in Web browsers, and required no investment in new languages or techniques. The fact that OCXes were strictly for the Windows platform would be addressed by extending the technology to the Macintosh and Unix at some point in the future.

With its usual superb marketing, Microsoft was able to convince the Internet world that there was therefore a real rival to Java applets, and that Java's apparent raison d'être - to extend Web browsers - had disappeared. Having come up with an approach that was clearly reactive and tactical, the company built on the initial success of the idea and made it central first to its Internet strategy and later to its entire computing strategy.

ActiveX controls migrated from the client side to the server, notably with the Active Server Pages approach, and formed the basis of Microsoft's DCom component technology and fledgling transaction processing model. But while these important shifts were going on behind the scenes, the most visible manifestation of ActiveX remained on the client side.

This makes Microsoft's retreat all the more dramatic: dramatic, but perhaps inevitable. ActiveX's security model is fundamentally flawed. Whereas Java applets are restricted in terms of the operations they can carry out once they have been downloaded to a client, ActiveX controls can do anything their creators desire.

Microsoft's response to this unacceptable situation is Authenticode. This employs digital certificates to ensure that ActiveX controls are not tampered with as they pass across the network, and to provide a sure way of establishing who wrote them. The idea is that if a control demonstrably comes from a trusted source - a major software house, say - then it can be left to operate freely on the user's system. But the trouble with Authenticode is that it places the burden on the user: he or she must decide whether a digital certificate provides enough assurance to allow the corresponding control full access to a system.

This is of course unrealistic for most users who are not experts in digital certificates and simply want to get on with their work. Moreover, if rogue ActiveX controls are accepted and cause damage, there is no guarantee that the certified software publisher still exists, much less is within any useful jurisdiction where it can be sued or prosecuted.

In other words, for all practical purposes, ActiveX controls are useless on the open Internet, since they simply cannot be trusted. Microsoft has tacitly recognised this with the introduction of security zones in Internet Explorer 4.0. Using these it is possible to set defaults in terms of accepting or rejecting ActiveX controls, according to whether they originate on the Internet or from within an intranet.

Security zones mean that ActiveX controls can still be employed within a corporate intranet, since their origin and capabilities are presumably known. And this in its turn means that Microsoft's server-side ActiveX strategy - and with it Active Server Pages, DCom and Transaction Server - is still viable. But it does mean that ActiveX controls are unlikely to be used for public Web design.

Instead, Microsoft is pinning its hopes on Dynamic HTML, which made an appearance at the Internet Explorer 4.0 launch, and which will presumably now take over as Microsoft's latest anti-applet technology.


ActiveX Web Database Programming (£27.49, ISBN 1-861000-46-4) presents an excellent practical introduction to Microsoft's alternative middleware technologies.

Address classes/IP address

The Internet uses a 32-bit address scheme, called IPv4 (Internet Protocol version 4), to define hosts on the global network. This 32-bit address is usually written as four eight-bit numbers in the decimal form 123.45.67.89, where each of the four elements is less than 256. These Internet addresses, of which there are theoretically 4,294,967,296 (though in practice there are fewer because blocks of numbers are reserved for special purposes), are split up into various classes, each of which has important characteristics. Moreover, the way in which blocks were allocated to users (especially in the early days of the Internet) means that there are now relatively few of these addresses left.

Class A IP addresses are those whose first element runs from 1 to 127. Because the other 24 bits can be freely assigned by the holder of a Class A address, this gives a network with a potential 16,777,216 Internet addresses. Examples of these lucky organisations include IBM, which has the Class A address 9.0.0.0, and MIT, which has 18.0.0.0.

It was quickly realised that the entire address space would soon be exhausted if many Class A addresses were handed out, and so Class B numbers became the default.

Class B numbers run from 128.0.0.0 through 128.1.0.0 to 191.255.0.0, giving 65,536 addresses for each of the 16,000 or so networks available.

However, such has been the growth of the Internet that even Class B addresses are in short supply. It is now the policy to give out Class C numbers, which run from 192.0.0.0 through 192.0.1.0 to 223.255.255.0, and each of which has 256 addresses.

Organisations requiring more than this number may need to take several Class C networks.

The remaining IP addresses, those from 224.0.0.0 up to 255.255.255.255 are divided into two more classes: D and E. However these are reserved for special purposes, and are not allocated to individual networks on the Internet.
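As a rough illustration of how an address's class follows from its first element, the short Python sketch below classifies a dotted-quad address using the ranges described above. It is a hypothetical example rather than part of any standard library; the split between Class D (multicast) and Class E (experimental) at 239/240 follows the usual convention, since the text above does not spell it out.

    def ip_class(address):
        """Return the class (A-E) of a dotted-quad IPv4 address,
        based purely on the value of its first element."""
        first = int(address.split(".")[0])
        if 1 <= first <= 127:
            return "A"      # 16,777,216 addresses per network
        if 128 <= first <= 191:
            return "B"      # 65,536 addresses per network
        if 192 <= first <= 223:
            return "C"      # 256 addresses per network
        if 224 <= first <= 239:
            return "D"      # reserved (multicast)
        return "E"          # reserved (experimental)

    print(ip_class("9.0.0.0"))    # "A" - IBM's Class A block
    print(ip_class("18.0.0.0"))   # "A" - MIT's Class A block
    print(ip_class("192.0.1.0"))  # "C"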

The solution to the shortage is relatively simple: to increase the address size and with it the number of possible Internet nodes. This is the approach adopted with what is called IPv6 or IPng (for Next Generation). What is surprising is the scale of the extension that has been adopted. Instead of the current 32-bit system, IPng will use no fewer than 128 bits for addresses. This does not give a mere four-fold increase: in fact, according to the excellent introduction to the whole subject of IPng at the URL http://playground.sun.com/pub/ipng/html/INET-IPng-Paper.html, an address length of 128 bits implies 340,282,366,920,938,463,463,374,607,431,768,211,456 (3.4x10^38) nodes. Since such large numbers are difficult to grasp, the same source puts things in context by pointing out that this figure represents 665,570,793,348,866,943,898,599 (6.7x10^23) possible nodes for every square metre of the planet.
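The arithmetic is easy to verify. The snippet below is an illustrative calculation only; the figure of roughly 5.1 x 10^14 square metres for the Earth's surface is an assumption introduced here for the per-square-metre comparison, not taken from the source above.

    total_addresses = 2 ** 128          # 128-bit address space
    print(total_addresses)
    # 340282366920938463463374607431768211456  (about 3.4 x 10^38)

    earth_surface_m2 = 5.1e14           # assumed Earth surface area in square metres
    print(total_addresses / earth_surface_m2)
    # roughly 6.7e+23 addresses for every square metre of the planet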

This extraordinary number does not simply represent some rather excessive caution on the part of the IETF working group that drew up the new standard embodied in RFC1752, which is available from ftp://ds.internic.net/rfc/rfc1752.txt . It hints at something altogether grander and more exciting.

For with such huge numbers available it is possible to look beyond allocating Internet addresses to every computer. IP numbers could be given to every computer peripheral; to every piece of business equipment in an office - fax machines, photocopiers, telephones (assuming they had some kind of networkable digital control element that could become part of this super-Internet).

And beyond this lies the integration of an even broader range of digitally-controlled electrical devices into a massive total network girdling the world. Included could be transport systems (most cars already have as many chips inside them as a desktop PC) and even semi-intelligent domestic appliances. The day when individual light bulbs can be accessed and controlled over the Internet is perhaps not so far off - and already implicit in IPng.

Although the physical infrastructure can expand almost endlessly, some of the logical elements have limits in their current incarnations. Perhaps the most serious of these has to do with routing tables.

These determine how the individual data packets should be routed over the multiply-connected Internet. As the latter grows and changes on a daily basis, so these tables need to be updated - sometimes several times a day.

As well as resolving issues of address space - finding enough Internet addresses for new users is now becoming a problem - IPv6 adopts a more efficient approach to routing that avoids the need for massive tables and their frequent update.

Leaving aside the fact that IPv6 has not yet been implemented - and will require some years before it is fully rolled out - there remain broader challenges for the Internet. For example, Internet phones, along with other kinds of multimedia traffic, generate huge quantities of data packets. The current Internet infrastructure and pricing system is simply not designed to cope with this flood, which is likely to clog the system increasingly in the future.

It may, for example, be necessary to implement new pricing schemes for Internet connections whereby you pay for the volume of data transmitted, rather than a flat rate. Another possibility is that Internet users, particularly companies who are beginning to depend on their connections, will demand Quality of Service (QoS) guarantees from their ISPs - and be prepared to pay for them.

Here, too, IPv6 should help. It supports the new Resource ReSerVation Protocol (RSVP), which attempts to negotiate a certain QoS on the Internet. Once this is in place users will be able to expect the same kind of reliability from their Internet suppliers that they do from telecom companies or energy utilities. In a few years' time it will be unthinkable for companies to work with an ISP unable to offer these kinds of guarantees. Although this is still a way off, now is the time to start talking to Internet connectivity suppliers about their plans in this area.

Further information can be found at Digital's site at http://www.digital.com/info/ipvt/ , with a noteworthy main white paper at http://www.digital.com/info/ipv6/white-paper.html, one of the best non-technical introductions to the subject.

ADSL

The recent features on ISDN described the advantages of this by-now rather old technology. Interestingly, the potential throughput - 64Kbit/s on each of two channels that can be combined to give 128Kbit/s without compression - is beginning to look less impressive. The new generation of 56Kbit/s modems running over ordinary telephone lines are not so far behind, though ISDN does have other advantages.

Independently of these modem technologies, ISDN is unlikely to remain the speed champion for long. As a previous Net Speak explained, a new kind of modem working with cable TV networks offers the promise of Mbit/s download speeds in the not-too-distant future. Trials are already underway in the UK, and there is talk of launching cable modem services next year.

Needless to say this cable modem challenge is not being taken lying down by the telecom companies of the world. Almost as if by magic, their engineers have discovered that they can push data down a telephone line at not just the current 28.8 Kbit/s, or the coming 56 Kbit/s, but at a stunning and rather convenient multi-Mbit/s speed.

This is accomplished by using higher frequency transmissions over conventional telephone lines. There are limitations of distance from user to exchange - typically 10-18 kilofeet, as the jargon has it - and not every exchange or area will be able to offer the technology. But the new service - generally known as Asymmetric Digital Subscriber Line (ADSL) - promises cable modem speeds down an ordinary phone line. It is asymmetric because upload speeds are "only" a few hundred Kbit/s - more than enough for most people's needs.

Microsoft has a white paper about ADSL.

Agents

There are more than one million Web sites according to the Netcraft survey, with many more being added each day. As well as the new sites that are cropping up, established locations are being updated, often on shorter and shorter time-scales as constantly refreshed content and form becomes a paramount factor in attracting and keeping visitors.

The rise of the Internet search engines has been one response to this data deluge. Initially they seemed almost unbelievably powerful tools that placed millions of Web pages within the user's grasp, and allowed unimaginably large quantities of data to be sifted in seconds.

But now, as the Net continues to expand, even search engine results have become unmanageable. Instead of reducing data to information, the listings of hundreds or even thousands of hits across the Internet provide simply a first winnowing.

It is clear that two new elements are needed to help users in their struggle against this flood of facts. First, a way of conducting searches automatically, without needing to specify every time what you are looking for. And, secondly, more intelligence applied to the filtering process to produce usable results.

The new class of programs designed to offer these two elements is called agent software. Although grandiose claims have been made for agents, so far their incarnations have been simple and disappointing. But this will undoubtedly change, and probably soon - not least because the Internet would otherwise drown in its own content.

The AgentNews Webletter (http://www.cs.umbc.edu/agents/agentnews/) is devoted to the fashionable subject of software agents.

Agents are not limited to simple browsing but can help to find the best prices on offer from online merchants. One of the leaders in this field was Jango from Netbot at http://www.netbot.com/ , now owned by the search engine Excite at http://www.excite.com/. The shopping section at http://jango.excite.com/cf/index.html now offers Jango-enhanced searches for various categories, including Computers and Software, Movies, and Games & Toys.

Alexa

If you hate the way browsers lead you around the Internet by the nose, you'll welcome a new helper which aims not only to take you where you want to go but also to make things more interesting along the way.

Anyone who has spent any time browsing through the sections of Net Speak and articles in magazines and books online will no doubt have come across links that no longer lead to the Web pages cited. Many of the resources referred to have not just moved, but disappeared completely.

This is a fundamental problem with the Internet: it is not like a traditional library gradually gaining more titles, but a huge, organic entity that changes constantly with time, losing sites and pages as well as gaining them. The idea of trying to capture and save these previous incarnations may seem utterly unrealistic, and yet this is precisely what Internet Archive is trying to do.

The man behind the Archive is Brewster Kahle, who developed the Wide Area Information Server search system. He has set up an ancillary company called Alexa, which aims to use the Internet Archive (which it largely generates) for commercial purposes. (More information is available about both the company and Brewster Kahle.)

Alexa is also the name of a product (currently for Windows 95/NT). It is a kind of browser helper: it runs alongside Navigator or Internet Explorer, and monitors which sites you visit. On the basis of where you are and where you have been, it offers suggestions about other sites that you might find relevant and interesting.

It does this in part by drawing on its database of where other people who have visited similar sites have gone: what it calls usage paths. A corollary of this is that it keeps a record of where you have been, though the company insists this information is kept private.

Novel navigation
There are full details of the technology, and a good article on some of the information that has been gleaned about the Web in general.

The aim of Alexa is twofold: to offer a novel way to navigate through the World Wide Web, and to create a kind of consensus about which sites are worth visiting on the basis of what Internet users think and do, not according to some self-appointed arbiter's judgements. Alexa offers other benefits. As well as providing suggested locations to visit, it tells you about the site you are currently viewing: who runs it, how big it is etc.

Get into the archive
More interestingly, Alexa allows you to hook into the Internet Archive: when older material relevant to the site you are viewing is available, this is signalled on the Alexa toolbar that appears with your browser. You can then view previous versions of Web pages. There is also a facility to search through the electronic version of the Encyclopaedia Britannica (or at least some of it) and an online dictionary and thesaurus.

Alexa is free: the company aims to generate revenue by a new kind of online advertising. In the pop-up menus offering suggestions of where to go next small banner ads (http://www.alexa.com/company/advertise.html) can appear. I found the suggestions offered by Alexa stimulating, not least because they were frequently unexpected, and nearly always worthwhile. Too often search engines take you to pages that are irrelevant or dull. If enough people start using Alexa seriously - and so feeding in their use patterns to the system - it could represent a genuinely innovative way of using the Internet.

America Online (AOL)

AOL's dramatic acquisition of Netscape for $4.2bn (£2.6bn) is deeply ironic. For the start of Netscape's decline can perhaps be traced to the moment in March 1996 when AOL chose Microsoft's Internet Explorer, rather than Netscape's Navigator, as its default browser.

As Netscape lost market share in the browser sector, it also began to lose focus, proclaiming first the central importance of its server products, and then of its Netcenter portal.

It was this corporate schizophrenia that allowed AOL to step in with an offer that effectively split off the portal from the software, with Sun taking over the latter.

This "strategic alliance" will allow Sun to sell and co-develop the Netscape line of software, and guarantees the company $500m of business from AOL in terms of systems and services. America Online will receive more than $350m from Sun over the next three years.

AOL's acquisition of Netcenter certainly makes sense. It turns AOL into the undisputed leader in terms of Internet advertising and gives it a large number of new visitors at a stroke (since AOL's core market is outside business, where Netcenter's strengths lie).

With AOL's backing, Netcenter is now in a position to challenge even Yahoo, something that seemed unlikely before the deal. Similarly, Microsoft will have to work even harder to establish its own portal.

Whether AOL will switch to Netscape Navigator as its bundled browser is less clear. Although this might seem an obvious move, doing so would cost AOL the preferential placing it receives in Windows - Microsoft's trump card, which was the key to gaining the AOL deal. AOL may prefer to wait - not least because it will be easier to integrate later versions of Navigator that can be designed with this in mind.

If the browser is not bundled immediately, and the Netscape servers are being farmed out to Sun, the entire software side of the deal reveals itself as rather pointless for AOL - and problematic. Quite how the agreement with Sun will pan out remains to be seen; in particular, the product overlap between Sun's and Netscape's server lines will not be easy to manage.

Collateral benefit

However, irrespective of how things unravel on the software side, the deal will certainly change the Internet landscape dramatically and have a considerable impact on most players and sectors.

Microsoft is clearly affected adversely in terms of competitive pressures, since the deal represents the formation of a new AOL-Sun axis, with Netscape as the glue. None the less, there may be a collateral benefit in that the formation of a strong rival would seem to weaken the threat that Microsoft will operate as a monopoly (though it leaves open the question of whether it has already acted as one).

Sun does well out of the deal. It picks up guaranteed business, gets to control - and so neutralise - an erstwhile competitor, and cosies up to an emerging power in the online world. Perhaps even more important is the boost the deal will give to Java.

AOL is committed to support Java, including the imminent Java 2 and PersonalJava. The latter is very interesting, because it opens up the possibility of AOL-branded devices such as mobile phones, pagers and personal digital assistants.

Alternative approach

Clear losers in the deal are the other portals, especially the second-tier ones. Yahoo will doubtless survive (perhaps through alliances or acquisitions), but there may not be room for many smaller players. Another group that may not be too happy is the open source movement. Netscape was one of the most important evangelists for this alternative approach, and although the Mozilla project is safe (since open source code cannot be revoked), AOL may well be less committed ideologically.

The other main loser is Netscape, its employees and users. Although the name may linger on, the company now straddles uncomfortably two markets and two masters.

To see the first Internet company, once such an exciting and innovative player, disappear in this way is sad. Its passing represents if not the beginning of the end, certainly the end of the beginning for the Internet world.

Apache

Besides the Mosaic browser, the National Center for Supercomputing Applications (NCSA) has another claim to fame on the Internet: it also created the NCSA httpd Web server.

Just as Mosaic played a crucial role on the client side in popularising the idea of graphical navigation of the World Wide Web, so the NCSA httpd Web server was one of the key programs in providing a practical demonstration on the server side of just what could be achieved. The "d" in "httpd" refers to the daemon, or continuous Unix process, that runs the HTTP service used by the server to supply Web documents to the client browser.

Like Mosaic, the NCSA httpd was (and still is) free. However, also like Mosaic, it suffered from a number of bugs of varying severity. To solve these problems, and to improve overall performance, a new Web server was developed, taking as its starting point some of the fixes - patches - to the NCSA httpd software. Because of its origins as a "patchy" server it was dubbed Apache, a name it retains to this day.

Despite its less-than-glamorous origins, Apache has been a phenomenal success. According to the Netcraft survey of Web servers, over 40% of sites are on machines that are running Apache, a market share far ahead of any commercial server program. Apache has the advantage of offering full-power encryption - even outside the US. Apache is explicitly designed for Unix platforms, and one of the most popular of these is Linux, not least because like Apache it can be obtained free of charge.

Apache's home page is at http://www.apache.org/, with a very full FAQ at http://www.apache.org/docs/misc/FAQ.html and the main documentation at http://www.apache.org/docs/. A version for Windows is under development; see http://www.apache.org/docs/windows.html.

Application Programming Interfaces

Now that the Internet is becoming more integrated into the rest of corporate computing it may well be that the larger ensemble created begins to lose some of the very qualities that made the Internet such a success in the first place. Platform-independence is a case in point.

One of the current key areas of Internet development is how to plug the Internet into non-TCP/IP elements - specifically databases, perhaps using some kind of middleware. This means that the platform-independent techniques that lie at the heart of the Internet must be supplemented with others, some of which do depend on the platform in question.

For example, in the database arena, the use of Application Programming Interfaces is proving to be an important part of linking Internet Web servers with the heterogeneous components of the corporate computing matrix.

An API is essentially a published set of functions that are made available to third-party programs to enable them to carry out actions. APIs provide a common standard that can be used by many disparate programs without the need for special patches to be created on a case-by-case basis. However, APIs must be defined - usually by a leading player with enough clout to make them viable in the marketplace - which means that they are arbitrary to a certain extent, and tied to that defining manufacturer.

In the Internet field, two of the most important APIs that are starting to be widely supported are the Netscape Server API (NSAPI) and Microsoft's Internet Server API (ISAPI). Although end-users need not worry about the details of such APIs, they are likely to become more aware of their presence in the future.
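To make the idea concrete, the sketch below is a deliberately simplified, hypothetical example in Python; it does not correspond to NSAPI or ISAPI themselves, which are C interfaces. It shows the essence of a server-extension API: the server publishes the functions and calling conventions, and third-party code supplies handlers that plug into them.

    # Hypothetical "published" interface: the server promises to call
    # any registered handler with a request dictionary and to send back
    # whatever string the handler returns.
    _handlers = {}

    def register_handler(path, func):
        """Part of the published API: third-party code calls this."""
        _handlers[path] = func

    def dispatch(request):
        """Called internally by the server for each incoming request."""
        handler = _handlers.get(request["path"])
        if handler is None:
            return "404 Not Found"
        return handler(request)

    # A third-party extension written against the published API only.
    def stock_report(request):
        return "Stock levels for " + request.get("query", "all items")

    register_handler("/stock", stock_report)
    print(dispatch({"path": "/stock", "query": "widgets"}))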

Archie

Archie was the first attempt to solve the Internet's biggest problem: the lack of a central directory. Begun at McGill University in Canada, the Archie project (the name comes from the fact that it has to do with file archives) provides users with a way to search for files on the Internet. It is indispensable when you are looking for a particular program among the 50 Gbytes of publicly-accessible software held on some sites.

Information about where files can be found is held on a number of Archie servers located around the world. These update their lists by contacting major FTP sites in turn and retrieving information on the directories and files held.

To find a particular file you should ideally use an Archie client program resident on your desktop machine or elsewhere on the company network. There are such clients for all major platforms, including Microsoft Windows. You simply enter the name of the file you are looking for, send the request to an Archie server (the main one for the UK is archie.doc.ic.ac.uk) and wait for the response. This will generally consist of hundreds of locations throughout the world where that particular file is held. You would then use FTP to retrieve the file. If you don't have an Archie client on your system, it is also possible to use the telnet facility to log into an Archie server (log in as archie) to carry out the searches directly.

There is an E-mail version of Archie that complements the FTPmailers well: you send an E-mail message to a special Archie site (e.g. archie@archie.hensa.ac.uk). For example, to locate pkz204g.exe you would send the message find pkz204g.exe to the address archie@archie.hensa.ac.uk. You will eventually receive a list of locations and a directory entry for the file. You will then need to extract the name of the site, the relevant directory and the name of the file (already known). You then send an E-mail message to ftpmail@doc.ic.ac.uk such as:

    open teseo.unipg.it
    binary
    uuencode
    chdir pub/stat/jse/software/misc/
    get pkz204g.exe
    end
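If you wanted to automate such requests, the short Python sketch below shows one way of sending the Archie query by mail. It is illustrative only: the sender address and the outgoing mail host name are placeholders assumed here, and in practice any ordinary mail client does the job just as well.

    import smtplib
    from email.message import EmailMessage

    # Build the one-line Archie request described above.
    msg = EmailMessage()
    msg["From"] = "user@example.co.uk"          # assumed sender address
    msg["To"] = "archie@archie.hensa.ac.uk"
    msg["Subject"] = "archie request"
    msg.set_content("find pkz204g.exe")

    # "mailhost" is a placeholder for your own outgoing mail server.
    with smtplib.SMTP("mailhost") as smtp:
        smtp.send_message(msg)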

Archie is a rather crude instrument; it is difficult to refine its searches unless you have a mastery of regular expressions (if you don't, you might like to see the pages at http://venus.ubishops.ca/course/regex.html, which give a good explanation of the subject).

Rather better in some ways is Shase - the Shareware Search Engine. Its home page is at http://www.fagg.uni-lj.si/SHASE/ - which, as the .si domain indicates, is located in Slovenia (at the University of Ljubljana). A UK mirror can be found at http://shase.doc.ic.ac.uk/SHASE.

Shase offers the possibility of carrying out searches for files in a more sophisticated way than is possible with Archie. Its index covers over 110,000 files totalling 13.6Gb. An interesting list of these archives (which includes CICA, SIMTEL, Info-Mac and Microsoft) can be found at http://dolphin.doc.ic.ac.uk/DB-SHARE/statistics.html.

Useful too is the list of 100 new files at each site. Selecting one of these files takes you to a page with a list of possible FTP locations: as well as providing hot links to the site, there is a useful statistic that indicates how often a test program was able to access the directory in question. This indicator of how easy it is to log in to leading FTP servers is probably unique.

Asynchronous Transfer Mode (ATM)

In the context of broadband communications, Asynchronous Transfer Mode (ATM) is one of the most important technologies for the future. Broadband simply means able to transmit large quantities of data (over 1.5 Mbits per second, according to the official definition).

As anyone who has used the Internet for a while knows, sooner or later you hit against transmission speed limitations - either locally, in the connection to your computer, or in terms of the size of the data pipe that your Internet provider uses to connect to the rest of the Internet (especially the size of the transatlantic connection, often a critical bottleneck).

For this reason many see ATM as offering one of the best ways of upgrading the global Internet infrastructure to provide bigger data pipes from which faster local feeds can be taken.

ATM has nothing to do with the physical side of these connections, but is all about how the data is packaged and transmitted. ATM employs packets of fixed size (rather perversely chosen to be 53 bytes - 48 bytes of data plus a five-byte header carrying routing information). Once data has been chopped up into these packets, the latter are then transmitted across the physical network in question asynchronously (hence the name). That is, the sender and receiver do not have to be rigidly synchronised before or during the transmission.

The great advantage of ATM, apart from its ability to offer high data throughput reliably, is that its very simplicity means that it can cope with any kind and mixture of data and run over any kind of network. This flexibility makes it a good match for the similarly minimalist Internet, which is defined by little and which can be used in almost any situation.
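The cell structure is simple enough to sketch. The Python fragment below is purely illustrative - real ATM headers carry virtual path and channel identifiers and other fields, which are reduced here to a bare sequence number - but it shows how a block of data is chopped into 48-byte payloads, each preceded by a 5-byte header, giving the 53-byte cells described above.

    def to_cells(data):
        """Split data into 53-byte ATM-style cells: 5-byte header + 48-byte payload."""
        cells = []
        for seq, offset in enumerate(range(0, len(data), 48)):
            payload = data[offset:offset + 48].ljust(48, b"\x00")  # pad the final cell
            header = seq.to_bytes(5, "big")   # stand-in for the real 5-byte header
            cells.append(header + payload)
        return cells

    cells = to_cells(b"Some application data to be carried across an ATM network.")
    print(len(cells), "cells of", len(cells[0]), "bytes each")  # 2 cells of 53 bytes each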

A tutorial on ATM can be found at http://juggler.lanl.gov/lanp/atm.tutorial.html, which introduces all the main concepts. Even more information can be found in the ATM FAQ directory at ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/. As well as the very detailed three-part FAQ itself (contained in the files ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/ATM-FAQ1.txt, ATM-FAQ2.txt and ATM-FAQ3.txt) it also contains a file devoted to ATM-acronyms at ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/ATM-Acronyms.txt and more about the Fibre Distributed Data Interface than you could ever imagine at ftp://ftp.funet.fi/pub/networking/technology/atm/FAQ/FDDI-FAQ1.txt.

A list of ATM sites can be found at http://www.cl.cam.ac.uk/users/cm213/Project/atm.html. It has good links to other locations with ATM information, ATM testbeds, and ATM product vendors.

The ATM Forum is a non-profit organisation bringing together most of the main players in this market. Its home page can be found at http://www.atmforum.com/. From here there are links to its newsletter (at http://www.atmforum.com/atmforum/atm_newsletter.html), called 53 Bytes, the basic cell-size used for ATM transmission, a list of its members and other useful information. The Internet Engineering Task Force has a group looking at IP protocols running over ATM; details can be found at http://www.ietf.cnri.reston.va.us/html.charters/ipatm-charter.html.

Acceptable Use Policy (AUP)

The acronym stands for Acceptable Use Policy, and referred originally to a short but important document drawn up by those running the NSFnet, the first backbone of the Internet.

It defined very basic rules governing who could and could not join the Internet (which necessarily meant using the NSFnet, since the latter tied everything else together), and for years was the main limiting factor on employment of the Internet for commercial purposes.

Its opening statement is unequivocal: "NSFnet Backbone services are provided to support open research and education in and between US research and instructional institutions, plus research arms of for-profit firms when engaged in open scholarly communication and research. Use for other purposes is not acceptable."

The full document can be found at ftp://nic.merit.edu/acceptable.use.policies/nsfnet.txt, although this is now only of historical interest. NSFnet has more or less shrivelled away.

There are other AUPs apart from that of the NSF: for example, the subsidiary academic networks frequently impose constraints similar to the original NSFnet statement. Even commercial suppliers usually have some kind of AUP, usually limiting people to legal activities.

Avatars

One of the most exciting applications of Virtual Reality Modelling Language (VRML) is in the creation of virtual worlds through which users can move, for example to allow information about data hierarchies and relationships to be conveyed in a simple visual way. A completely different use of VRML involves the fashioning of shared virtual environments. Here the emphasis is on interaction with other users, extending the other forms of online communication currently employed such as Internet chat or Internet telephony.

Since a crucial aspect of these worlds is their three-dimensional nature, it follows that some kind of virtual corporeal presence is required in them if the overall metaphor is to be preserved. This has led to the growth of an entirely new online element: avatars. An avatar is the incarnation or form that you take in one of these virtual worlds. Typically it will have some three-dimensional characteristics, a front and a back, for example, so that other participants in these worlds can move around you. The form might be minimalist - a floating head, or a simple object - or a complex piece of three-dimensional graphics crafting that is a work of digital art.

Although the word itself comes ultimately from the Hindu religion, the first use of the term avatar in a computing context is generally traced back to Neal Stephenson's seminal cyberpunk novel Snow Crash. In the world described there, avatars inhabit a huge and rich virtual domain called the Metaverse, where all kinds of activities and transactions are conducted. Current implementations are rather cruder, but may one day evolve into an important business and entertainment medium.

Auctions

For those interested in auctions, there are more and more sites springing up where you can make your bids over the Internet.

A natural approach to a new medium like the Internet is to try to transfer pre-existing business activities online, as well as inventing wholly new ones. A case in point is the auction: since the Internet is essentially about communication, it obviously lends itself very naturally to the process of receiving information about lots and making bids for them against others.

It should therefore come as no surprise that this field is flourishing, as Yahoo's long list of online auctions sites testifies.

However, it is important to distinguish between two quite different kinds of auctions, since the way they work and their relevance for business are quite different.

Merchant

The first kind might be called the merchant auction. Here the online site auctions goods that come from manufacturers and that are frequently old stock that is being sold off cheaply. The online auction is a good way for both the seller and the buyer to agree a price efficiently.

The relation of the merchant auction to its conventional counterparts is made clear by the affiliation of some of the leading sites in this area. For example First Auction is part of Internet Shopping Network, which in turn is part of the huge US-based Home Shopping Network.

Similarly, SurplusAuction is an arm of Egghead stores, U-bid a division of Creative Computers, and Webauction part of Microwarehouse - all companies that sell computer equipment in the ordinary way. For them, such merchant auctions are simply a natural extension of their activities.

Other players in this area include Onsale, Dealdeal and FairMarket. There are also some Europe-based merchant auction houses, like Quixell (there is also a German version) and Online Auctions UK.

Given the large and expanding list of auction houses, it is obviously difficult to track them all. An interesting service in this context is Lycos' AuctionConnect, which searches other merchant auctions for certain items then notifies users.

The other major class of auctions is those between individuals: what might be called personal auctions. Probably the most famous firm here is Ebay, which is notable for being one of the few publicly-traded Internet companies to turn a profit. Ebay makes its money from charging sellers an initial insertion fee and commission on sold items.

Of course, personal auctions are inherently more risky than merchant auctions, since buyers are dealing with private individuals, not firms, and it is generally much harder to obtain information about their trustworthiness. This will tend to make personal auctions of less interest to companies. Interestingly, the main way of safeguarding users who engage in personal auctions is through other users' comments on sellers.

Protection

A more formal approach to protecting users is offered by Auction Universe which sells a kind of insurance policy called BidSafe. Auction Universe is a subsidiary of Times Mirror, which owns seven major US newspapers, and these personal auction sites are a natural outgrowth of the classified advertising that sustains local newspapers.

A dedicated Classifieds site, Classifieds2000, has added auctions. Classifieds2000 is a division of Excite, which is not alone in adding this service to its portal. Yahoo has started its own personal auctions area, and Lycos has promised to follow suit.

It may well be that this re-invention of classified ads online will prove to be as important a source of revenue for Web sites as banner advertising now is.

Banner sizes

The commonest form of Web advertising is through the use of images with promotional messages placed on a Web page. Given that many people do not scroll all the way through to the end of a document, the prime position is "above the fold", in the initial screen displayed to visitors when they reach a site. In particular, ads are frequently found at the top of Web pages to ensure that they are the first thing seen (unless inline images have been turned off).

These banner ads, as they are generally called, have sprung up in a completely uncontrolled way; not surprisingly, given the Internet's general lack of supervisory bodies. As a result, the ads tend to be designed to fit in with the overall form of the Web page on which they appear. This means that there are currently hundreds of slightly different shapes and sizes employed for banners.

For the user this is not a problem, but it is for companies such as Microsoft that wish to place the same advertisement on hundreds of sites. Rather than being designed for one or two standard sizes, the promotional image must be tweaked to fit the demands of particular pages.

This is clearly extremely inefficient from the advertiser's viewpoint, and indicative of the immaturity of the Web advertising market.

To combat this, the Internet Advertising Bureau and the Coalition for Advertising Supported Information and Entertainment have drawn up some standard sizes for banner advertisements. Unfortunately, to date there has been little effort to enforce them, and so the current banner size anarchy continues.
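By way of illustration, the Python fragment below checks an image's dimensions against a handful of standard banner sizes. The sizes listed are examples commonly cited from the IAB/CASIE recommendations, quoted from memory rather than from this document, so the authoritative list should be checked before relying on them.

    # A few banner sizes (width x height in pixels) commonly cited from the
    # IAB/CASIE recommendations -- illustrative, not exhaustive or authoritative.
    STANDARD_SIZES = {
        (468, 60): "full banner",
        (234, 60): "half banner",
        (125, 125): "square button",
        (88, 31): "micro button",
    }

    def check_banner(width, height):
        name = STANDARD_SIZES.get((width, height))
        if name:
            return "OK: standard %s" % name
        return "Non-standard size %dx%d - will need tweaking per site" % (width, height)

    print(check_banner(468, 60))   # OK: standard full banner
    print(check_banner(460, 55))   # Non-standard size 460x55 - will need tweaking per site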

BinHex

The file extension .hqx refers to the BinHex format, commonly-encountered in the context of Macintoshes, and occasionally seen elsewhere on the Internet too.

Macintosh files are unusual in that they can consist of two parts, called the data fork and resource fork. This structure is used for programming convenience, and is part of the independent approach mentioned above as being characteristic of the Macintosh world.

One problem that the BinHex format solves is how to convey both parts from one computer to another. This is not a trivial operation, and BinHex is now perhaps the most widely-used means of combining the two forks into a single entity.

But BinHex has another side, one which means that it is of more general interest. As well as combining the data and resource forks into a single file, BinHex also converts the eight-bit binary code into a form that uses only ordinary printable ASCII characters. In this respect, BinHex is very similar to the UUencoding and MIME schemes, which also take binary files and convert them into plain ASCII. As a consequence, like files produced using UUencoding or MIME, BinHex files are bigger than the original binary form.

Because BinHex represents an alternative to MIME or UUencoding (at least as far as its conversion of binary to ASCII goes) it is sometimes encountered outside the Macintosh world. For example the Windows version of the Eudora E-mail package offers BinHex as well as the more usual MIME when sending binary attachments with messages.
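The general principle of trading binary compactness for mail-safety is easy to demonstrate. The Python snippet below uses base64 - the encoding underlying MIME attachments - as an analogy, since BinHex itself uses a different character set, its own framing and a degree of run-length compression; the point is simply that the ASCII form ends up larger than the binary original.

    import base64, os

    binary = os.urandom(3000)            # stand-in for an arbitrary binary file
    ascii_form = base64.b64encode(binary)

    print(len(binary), "bytes of binary data")
    print(len(ascii_form), "bytes once converted to a mail-safe ASCII form")
    # base64 grows data by roughly a third; BinHex behaves similarly in spirit.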

Bolero

Even though Bill Gates must envy the name, Software AG is hardly familiar in computing circles. The firm is probably best known for its database Adabas and its fourth-generation language Natural, neither of which is a product people are likely to be passionate about.

Software AG has won a certain fame (some would say notoriety) through its work with Microsoft to port the Distributed Component Object Model (DCom) technology to other platforms. Many have seen this as another ploy by Microsoft to fend off accusations that DCom is limited to the Windows platform, but without having to support rival operating systems directly. But Software AG's strategy has long needed a complete overhaul. The firm's current products are all rather wedded to older programming models, while new-fangled concepts such as the Internet arise only tangentially.

Bold leap
Rather than a series of incremental updates, Software AG has opted for a single bold leap into this new market. Its Bolero product - the name derives from Business Objects Language Environment - is extremely ambitious. A good white paper on the subject is at www.softwareag.com/corporat/solutions/Bolero/papers/boltwp.htm. In a sense, Software AG has taken its expertise in fourth generation languages and applied it to Java and the Internet. Bolero the product consists of an object-oriented language, also called Bolero, closely modelled on Java, which is used to create component-based business applications. The end result is server-side Java byte-code, even though no Java programming is employed.

Platform-friendly
The advantage of this approach is that Bolero's output can run on any platform which has a Java virtual machine available: a clever way to obtain cross-platform capability.

The development environment of Bolero comes with graphical user interface front-ends, compilers, wizards, editors, debuggers and other programming tools. The rest of Bolero, called the Application Server, consists of heavyweight modules that handle things like long transactions (see Net Speak) and links to databases.

Even though DCom is the main way of communicating between Bolero applications and existing software (Corba's Internet Inter-Orb Protocol will be added later), Bolero components can be either Com objects or Java Beans. Software AG is further hedging its bets through the ability to switch in other virtual machines if Java fails to catch on as the universal platform.

Bolero is still in beta, and therefore remains more promise than reality at the moment, though the demos look interesting. None the less, it is potentially significant in a number of ways. First, the use of the Java virtual machine through the output of Java byte-code is an approach that other software manufacturers will doubtless be interested in adopting.

Second, it is worth noting that Bolero will be written in Java, making it perhaps one of the largest and most complex Java projects to date. Assuming that Bolero is finished and works, its mere existence will be yet another significant demonstration of the capabilities of Java as a serious programming environment for the enterprise.

Finally, Bolero represents an important statement from Software AG, which is more or less betting the company's future on this product. Its next project is proof that it is not content to sit back from here on. Since EntireX, the port of DCom to non-Windows platforms, represented an updating of Software AG's older Entire middleware range, and Bolero is a kind of Net-enabled Natural with bells and whistles, it is logical that the next move will be a revamp of the Adabas software range.

The key new ingredient here is XML, which will lie at the heart of the successor to Adabas: yet another indication of how this language is moving to the centre of corporate computing.

Broadband Sites

When the high-speed cable access company @Home bought the portal Excite in 1999, the business logic was not immediately obvious, but the move has since proved prescient. What @Home is hoping to create with Excite is a broadband portal - an entry point for those with fast connections - and now it seems that everyone is doing the same.

Yahoo aims to join the broadband portal club by following up its $4.5bn deal to acquire Geocities with its recent purchase of Broadcast.com for $6bn. Broadcast.com describes itself as "the leading aggregator and broadcaster of streaming media programming on the Web".

Snap

Another portal, c|net's Snap, has already set up a trial of its broadband service. AOL, on the other hand, has so far only announced what it calls "a premium upgrade" to its service, which will be available to users with high-speed connections in the autumn of 1999.

Another company with a broadband site in the offing, this time from the world of entertainment, is Warner Bros. Its Entertaindom site is described as "the Internet's first vertical entertainment portal," whatever that means.

In a sense, this move to broadband is a logical development of a process that began when the Internet passed from text-only pages to graphics, and then to sounds and limited video. The thinking behind this latest "advance" - albeit one that is hardly indispensable for most users - is that it will allow audiences to be served with new premium content, ideally paid-for.

There are two kinds envisaged: "lean forward" and "lean back". The former embraces factual and interactive content, while the latter refers to more passive, entertainment-based forms. Alongside this new content, there will also be new kinds of ads. For example, it will be possible to replace simple animated banners with fully-fledged videos.

In other words, for both content and advertising, the broadband Internet represents a convergence of the Web with TV and cinema - something that makes the concept highly popular with Old Media companies, which are still struggling to come to terms with the Internet and its implications.

Alongside the (relatively) established Internet players that are attempting to extend their offerings to include broadband services, there are many new companies hoping to establish themselves early in this area.

For example, the video compression company Duck, which has developed various broadband technologies, has also set up what it hopes will be an entire network of broadband sites called On2. Currently there are just three of these, focusing on film, travel and education.

Another firm, FasTV, offers a collection of TV and other video programming that is searchable by keyword or phrase, while Quokka is concentrating on "immersive" digital sports entertainment.

User appetite

One possible indication of future user appetite for these and other broadband services is the rise of the MP3 compression technology for digital music downloads. In a sense, MP3 allows listeners to obtain all the benefits of broadband without needing special equipment.

And for any remaining sceptics about the importance of the MP3 phenomenon, Microsoft's recent launch of its own version called Windows Media Technologies 4 should be proof enough that this is now a serious market.

Of course, whether or not this vision of a brave new broadband world is realised depends not just on firms creating the content for it, but even more on whether users will have fast enough access to make these services commercially viable.

The Last Mile

Although it is relatively straightforward to provide very high-speed connections between continents and cities, say, the real difficulty is reaching the individual end-user. This has been called the "last mile" problem: finding technologies that are fast and cheap enough, and which can be rolled out to millions and eventually billions of users.

One enterprising idea solves the last mile problem very neatly by using existing power lines. The leader in this field is the UK-based Nor.Web with its Digital Powerline technology (home page at http://www.nor.webdpl.com/index.htm ), which provides speeds of 1 mbps.

This is such a radical approach that many questions remain open about its viability for widespread use (see http://www.nor.webdpl.com/press/990413rebuttal.htm), though trials are under way in the UK, Germany and Sweden and elsewhere (more at http://www.nor.webdpl.com/casestudy.htm).

An approach that is less innovative - but more likely to come to fruition - is to use terrestrial radio links. This is essentially a high-speed extension of the mobile phone systems used around the world.

Unfortunately, the ITU, which was overseeing the negotiations for the global IMT-2000 standard, has since put forward a rather unsatisfactory compromise solution that does not create a single, global approach, but one involving multiple access technologies (see http://www.itu.int/newsroom/press/releases/1999/99-04.htm). This is likely to make IMT-2000 equipment more expensive, and progress in the provision of broadband access in this way slower.

Struggles
Although the world's users have lost out with this failure to agree on a single technology, one group must be delighted. These are the companies planning broadband wireless access via satellite.

This very costly technology is suffering by association through the struggles of the satellite phone company Iridium (http://www.iridium.com/), which not only had to ask for a waiver of loan conditions, but lost both its chief executive and chief financial officer in Q2 1999 (see the press releases at http://www.iridium.com/english/inside/media/press_index.html).

One company is already offering Internet access via satellite, although not at broadband speeds. Telstra's service (see http://www.telstra.com/press/yr98/dec98/98122203.htm) uses Inmarsat. Another is Cyberstar (www.cyberstar.com/), which does offer the Internet at broadband speeds.

Other major Internet satellite projects are under development. Spaceway (http://www.hns.com/spaceway/spaceway.htm) is a $1.4bn investment by Hughes Electronics Corporation (http://www.hns.com/news/pressrel/corporat/p031799.htm), and aims to be ready in the US by 2002.

Skybridge (see http://www.skybridgesatellite.com/), a consortium led by the French company Alcatel, also hopes to be up and running by 2002. It uses what it calls "bent-pipe" architecture whereby signals are always routed down to ground, not inter-satellite, and requires some 80 low earth orbiting (Leo) satellites (see http://www.skybridgesatellite.com/system/cont_41.htm).

The cost for this will be about $6.1bn, and the result will be a bit-rate of up to 20 mbps on the forward link, and 2 mbps on the return link (for details, see http://www.skybridgesatellite.com/system/cont_43.htm).

Teledesic (http://www.teledesic.com/) is even more ambitious, offering up to 64 mbps on the downlink and up to 2 mbps on the uplink.

This $9bn project, funded in part by such techno-luminaries as Bill Gates, Craig McCaw and Prince Alwaleed Bin Talal of Saudi Arabia, will require no fewer than 288 Leo satellites.

There are some good explanations of this project at http://www.teledesic.com/technology.html, and details of its innovative but untried space-based network at http://www.teledesic.com/tech/details.html.

Broadband by Cable

The recent $58bn (£36bn) bid from AT&T for the MediaOne Group represents not just a consolidation of the US cable TV market, but is further evidence that the broadband revolution is spreading to most high-tech sectors.

This is demonstrated by some of the names involved in the complicated negotiations surrounding the deal. As well as AT&T, the highly-diversified electronic media company Comcast was interested in expanding through the purchase of MediaOne. To that end, it had talks with both Microsoft and AOL - two other companies increasingly active in the broadband world - to seek support for its bid.

Horse-trading

But it was AT&T that won MediaOne, in a deal whereby AT&T and Comcast would exchange some cable customers and money (further info). An important side-effect of the high-level horse-trading that led to this result is that AT&T will use Microsoft's Windows CE for its set-top boxes, and Microsoft will invest $5bn in AT&T (further info).

The absorption of MediaOne by AT&T is also likely to spur AOL to make some big deals in the broadband sector soon, and to encourage Sun to put fresh momentum behind the use of Java in set-top boxes.

AT&T's purchase of MediaOne joins its earlier acquisition of the cable company TCI for $48bn (see www.att.com/press/0698/980624.cha.html). AT&T is hoping to create through these massive purchases a leading position as a supplier of broadband access via cable.

The idea is that the cables currently carrying mostly TV signals will in the future provide multimedia Internet services, along with TV and telephony.

For this to happen, users need a key piece of hardware: a dedicated set-top box or a cable modem. The cable modem works just like a conventional telephone modem - it connects to a PC to allow computer data to be sent over the cable medium.

In the US, the cable modem standard is called Docsis 1.1, which is under the aegis of the industry group CableLabs. There are now three certified cable modems, from 3Com, Toshiba and Thomson Consumer Electronics. Further info.

Unfortunately, in Europe there is a completely different standard for cable delivery called DVB/Davic. There is an interesting paper about the rivalry between the two camps, which suggests DVB is better for set-top boxes, while Docsis is preferable for cable modems attached to PCs (though, as its URL indicates, the document comes with certain biases). CableLabs has a set-top initiative called OpenCable.

The DVB/Davic standard has been adopted by the Euromodem project, which aims to produce a standard cable modem for Europe. See the specification.

Promoted

Euromodem is being promoted by the European Cable Communications Association, which has set up EuroCableLabs. See background on the latter.

Just to confuse matters, there is another EuroCableLabs, and information about its aims can be found at http://www.eurocablelabs.org/labsbgd.htm. Needless to say, there is some argument over the common name.

There is information on cable activity in Europe, and other details of cable companies in the UK.

Broadband by ADSL

The use of cable to deliver broadband Internet connections has a number of advantages. The technology is relatively mature, several cable modems are available and broadband services are offered by many cable providers (in the US, at least). But cable also suffers from some drawbacks.

In addition to the standards rift that is opening up between Europe and the US, there is the technical problem that, as more cable users in a given locality take up such broadband services, the speed for each user tends to drop since they are sharing a common connection for that zone. Perhaps more seriously, cable is available only in certain areas.

Telephone lines, the main rival to cable in the field of terrestrial broadband supply, certainly do not suffer from this last problem. This is partly what makes the idea of using the telephone for broadband delivery so attractive: given that there are currently 800 million telephone lines in the world, soon to grow to a billion, the potential is clearly huge.

The technology that hopes to tap into this market is Asymmetric Digital Subscriber Line (ADSL). It allows extremely fast Internet access to be provided over conventional copper telephone lines. The maximum download speed is about 8 megabits per second (mbps), and the maximum upload speed about 1mbps (hence the asymmetric part of ADSL), depending on the particular conditions of the line in question. Just as cable systems need cable modems, so ADSL requires an ADSL modem. The beauty of ADSL is that the same telephone line can be used for voice calls and broadband Internet services.
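
To put these figures in perspective, here is a small, purely illustrative calculation (the numbers are nominal line rates, ignoring protocol overheads and real-world line conditions) of how long a 10-Mbyte file would take to arrive over a 33.6 kbit/s modem, a single ISDN channel and a full-rate ADSL downstream link.

    # Rough transfer-time comparison; nominal line rates only, ignoring
    # protocol overhead and line quality (illustrative assumptions).
    FILE_SIZE_BITS = 10 * 1024 * 1024 * 8      # a 10-Mbyte file

    links_kbps = {
        "33.6k modem": 33.6,
        "ISDN (single channel)": 64,
        "ADSL downstream (max)": 8000,
    }

    for name, kbps in links_kbps.items():
        seconds = FILE_SIZE_BITS / (kbps * 1000)
        print(f"{name:22s} {seconds / 60:6.1f} minutes")

Even allowing for the usual gap between theory and practice, the difference is between a few seconds and the best part of an hour.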

The main industry body in this area is the ADSL Forum. Unlike the rather secretive cable modem group, the ADSL Forum has put together many good documents on the technology, including a basic FAQ, a technical FAQ, a glossary, background papers, market information and news releases.

Beyond ADSL there is very high-speed Digital Subscriber Line (VDSL) which can offer up to 52mbps. There is a FAQ on the subject. As the main ADSL FAQ explains, there is a slight split in the ADSL world regarding the modulation systems employed, but it seems unlikely that this will be a serious impediment to the uptake of ADSL.

One development likely to aid the spread of this technology is the agreement of the G.Lite standard, a mass-market version of ADSL, which has been drawn up by the Universal ADSL Working Group. The group has a home page, and there is also information about the body.

G.Lite eliminates the need for a special signal "splitter" inside the home to separate voice and data, provides 1.5 mbps downstream and 384 kbps upstream, and is interoperable with full ADSL. See the FAQ.

Information about ADSL trials can be found at trials matrix and point-topic.com. In the UK, of course, everything depends on BT, which owns the critical last mile of wiring to the home. Although BT is running ADSL trials, many feel that it is dragging its heels so as to protect its investment in the much older ISDN technology.

In this context Oftel recently issued a consultation document called Access to bandwidth: bringing higher bandwidth services to the consumer, which looks at the whole area of how broadband services can be supplied using the telephone network in the UK. BT's response is available on their site. The latter has a good section summarising the various alternative technologies.

The European Commission has also issued a paper that touches on many issues in the area of broadband services. It is available online.

Browsers

Like many great breakthroughs, the World Wide Web browser Mosaic was put together almost by chance, and certainly not as the result of a carefully-planned project to produce what some have called the "killer app" of the Internet. Its author was Marc Andreessen, a research student working at the University of Illinois. Apparently he wanted a browser for the then-new World Wide Web, and so knocked up one of his own. The result amazed not just his supervisors, but also the millions of people who have since downloaded the freeware product from the FTP directory at the NCSA (ftp://ftp.ncsa.uiuc.edu/Web/Mosaic). Although more glamorous browsers are now available, notably Netscape, Mosaic remains a standard by which such software is judged and is a key part of Internet history.

The 32-bit version of Mosaic (mos*.exe) can be found at ftp://ftp.demon.co.uk/pub/ibmpc/winsock/apps/wmosaic/. An advantage of Mosaic is that it remains free to all. Netscape (at http://home.netscape.com/) is not. Unfortunately, development of Mosaic ceased only four years after it all started.

The aftermath of the battle of the browsers

Now that the dust has settled a little since the launch of version 3.0 of Microsoft's Internet Explorer and Netscape's Navigator (in September 1996), it seems clear that Microsoft has caught up on the browser front in technical terms and, through a series of very adroit moves, succeeded in out-manoeuvring Netscape in marketing terms. This does not mean that the Internet war is over: many important issues remain to be decided. But it has put the pressure on Netscape to respond with something dramatic for its next generation of products.

In one respect, Netscape has done that with the recently announced strategy for 1997. For it effectively cedes the general Internet browser market to Microsoft, largely because Navigator costs money, and Internet Explorer doesn't. Netscape has shrewdly chosen to concentrate on the highly lucrative corporate market, and in particular on intranets.

Netscape is perhaps lucky that its only choice happens to be the best thing it could have done anyway. The uptake of intranets has been, if anything, even more spectacular than that of the Internet; not least because the advantages of the former are so obvious. Even better for Netscape, those advantages have recently been quantified by the independent market research company IDC, which found that the return on intranet investment was, on average, over 1000%. This effectively makes intranets the best investment in business today.

Signalling this strategic shift is Netscape's bold move to turn its flagship product Navigator into just a relatively small part of a whole suite of intranet clients, called Netscape Communicator. Navigator 4.0 will be the primary interface for moving around the corporate LAN, but allied to it are a number of other components which together go to make up a complete groupware solution.

Given that Netscape bought the groupware company Collabra back in September 1995, it comes as no surprise to find a client called Netscape Collabra which offers standard threaded newsgroup discussions. But Communicator also offers an enhanced e-mail client (Messenger); a new HTML editor (Composer - replacing the unloved Navigator Gold); and an audio conferencing client supporting the new H.323 standard (Conference). There is also a professional edition that adds a central administration tool (AutoAdmin) and a calendaring and scheduling tool (Calendar).

Matching these are a series of servers which comprise SuiteSpot 3.0 (for full details of these and the client components, see Netscape's customarily polished documentation). The main novelty here is the Media Server, for publishing streamed audio files, and the use of intelligent agents (how intelligent remains to be seen).

Also important is the native support for Microsoft Office file formats. This is part of Netscape's other sensible if equally necessary move: recognising Microsoft's place in the Internet/intranet scheme of things. Rather cheekily, Netscape has dubbed this new policy "embrace and integrate" - a pointed reference to Microsoft's own more arrogant "embrace and extend" approach. Among the elements embraced are Windows 95 and Windows NT through tighter integration; OLE/COM and ActiveX (though there seems to be some hedging about how total the support for ActiveX will be, and when it will appear); and Microsoft Office and BackOffice.

IT managers implementing intranet strategies will applaud this partial rapprochement. They will also doubtless be pleased that through these announcements Netscape is providing a totally open, component-based groupware solution. One important effect of the Netscape announcements (due to be implemented early next year) is that the focus shifts from the browser to the server side. The battle here is likely to be complex and drawn out. Writing a good browser is a relatively simple task; creating ten or so enterprise-level servers is a mammoth undertaking. Netscape has not yet come out with final versions of all the elements, and Microsoft is even further behind (though its high-end server project, code-named Normandy, is gathering pace).

The Browser Watch site at http://browserwatch.iworld.com/news.html collects rumours and sightings sent in by Internet users, and its pages are updated daily. Reports include rarely encountered products such as Netsurfer for Openstep and the original NCSA Mosaic. The most notable feature of a new browser from Norway called Opera (see http://traviata.nta.no/) is its size: at less than 1Mbyte it is probably a tenth of what the next versions of Internet Explorer and Navigator will weigh in at. The page at http://browserwatch.iworld.com/activex.html lists just about all available ActiveX controls; similarly, http://browserwatch.iworld.com/plug-in.html is a list of Netscape plug-ins. Alternatives to the dominant Microsoft and Netscape products can be found at http://browserwatch.iworld.com/browsers.html.

Microsoft talks serious money on the Internet

Although Microsoft is currently (as of November 1996) behind Netscape as far as high-end Internet/intranet servers are concerned, with the launch of Merchant Server it seems to be ahead in one particular area: fully-integrated electronic commerce solutions. Microsoft's success in reaching this market first is due in no small part to its acquisition of the company eShop (http://www.microsoft.com/ecommerce/pressrel.htm) earlier this year. The fact that Netscape was also trying to buy the company is an indication of the importance of the technologies it had developed.

Microsoft's Merchant Server is part of the Normandy project, but is closely integrated with the BackOffice range of products. In fact it sits as middleware between Microsoft's Internet Information Server and any ODBC 2.5-compliant back-end database such as SQL Server. A trial version can be downloaded (the file is 13 Mbytes in size), and to run it you will need an NT Server system with 64 Mbytes of RAM. The product has a home page, a FAQ and a very useful White Paper.

The Merchant Server's front-end offers no surprises, being based on a standard model employed by eShop and other pioneer online commerce sites. Pages of information about products are generated on the fly from the database; the software comes with some ready-made store templates that provide the overall structure and ambience. One element that Microsoft is at pains to emphasise is the scope for offering a personalised shopping experience. For example, the pages generated could be tailored on a visitor-by-visitor basis; it is also possible to offer personalised promotions.

Secure payments over the Internet have been available for some time, but Merchant Server goes further through tighter integration with the external financial system. Using the vPOS system from VeriFone, it is possible to take credit card details from a purchaser which have been sent from any browser supporting secure standards such as SSL, and forward that information to a variety of financial institutions. These will then validate the credit card request, carry out the back-end transactions, and return a confirmation. For each bank or other financial company there will be a corresponding module used with Merchant Server. One of the options is a module for CyberCash's micropayments system.

As part of the complete billing process, there are also various tax modules (including one for European VAT); it is possible to use any currency. Shipping and inventory management are also included. For users, there is also a kind of electronic wallet utility (available as an ActiveX control or Netscape plug-in) that saves having to enter your credit card details every time.

This whole order pipeline, as Microsoft calls it, is modular, so third-party components can be swapped in at various points. Also notable is the fact that the entire control process of the Merchant Server is effected via a Web interface. This is an approach that Netscape pioneered, and is likely to become increasingly the standard way of controlling all Internet/intranet software.

As well as being the first mainstream server product to offer such an integrated electronic commerce solution, Merchant Server is also notable for its pricing. There is a basic charge of around £9000 for each computer that the server runs on, together with a further 'per shop' cost of about £3000 - surely the highest price for a single Microsoft product. There are already a number of sites employing Merchant Server for real-life transactions. UK users include Tesco and Shoppers Universe. A list of international users can be found, including Microsoft's own store.

What users can expect as Explorer goes forth

In the struggle between Netscape and Microsoft for the hearts, minds and desktops of Internet users, release 4 of their respective browsers promises to be particularly important. Internet Explorer 3 showed that Microsoft had caught up with Netscape in terms of basic Internet functions, so the next iteration is crucial for both companies. Netscape needs to demonstrate that it has not lost the initiative, and Microsoft needs to show that it can not only match but trump its young rival.

Microsoft joins the battle at a slight disadvantage: as usual, its products are later than originally promised. Moreover, the available code for Internet Explorer 4 (IE4) is a Platform Preview - a pre-beta, in other words - whereas Netscape's Communicator browser is currently nearing the end of its testing. IE4 can be downloaded - but note that this is an 11 to 20 Mbyte file. For those who prefer to read about IE4, rather than risk running what Microsoft emphasises is not yet stable software, there is an excellent general introduction.

Superficially, IE 4 is not much different from IE 3. Wisely, perhaps, Microsoft chose to adopt its new Windows look from release 3, putting additional pressure on Netscape's product which at a stroke was made to look a little long in the tooth as far as user interface is concerned. One change that is not obvious is the Autocomplete feature: as you type in a URL that you have visited before, IE 4 completes the rest of it.

Also not immediately clear is the power of the new Search button on the tool bar. When you press this, an extra frame appears in the browser with six search engines: Infoseek, Lycos, Excite, Yahoo, Altavista and Hotbot. Once you have selected one of these, the results appear in the new frame. Clicking on a link brings up the site in the right-hand frame, while retaining the other links on the left.

Potentially one of the most interesting features of IE4 is hidden away on the Programs tab of Options on the View menu: Microsoft Wallet. This is a facility that will enable you to buy goods over the Internet using credit cards without having to fill in card or address details each time. Instead, the Wallet will transmit all the information safely to the merchant server. Also hidden away is the thumbnails feature for bookmarked sites. When you view sites that you have bookmarked (stored in a folder called Favourites by Microsoft) there is a thumbnail option that shows what the pages look like - handy for reminding you exactly what they refer to.

As well as supporting the usual Java applets and ActiveX controls, IE 4 offers a scripting engine - which allows greater interactivity to be added to Web pages - and support for Dynamic HTML. There are a number of ancillary programs designed to work with IE 4's browser. For example, there is FrontPad, a cut-down but serviceable version of Microsoft's FrontPage HTML editor; Outlook Express, a new e-mail and Usenet client; NetMeeting, for audio and videoconferencing; and NetShow, which offers streaming audio and video in a single file.

Also worth noting is the new Windows Address book. This employs the increasingly standard LDAP protocol to allow white pages directories on the Internet and intranets to be searched for information about online users. The above indicates the breadth and richness of the changes to Internet Explorer in this latest version. But in many ways these pale in comparison to an even more ambitious goal Microsoft hopes to achieve with IE 4: the complete integration of the Internet with the desktop.

Microsoft's marriage of the Internet and desktop

Some of the new elements in Microsoft's Internet Explorer 4 Web browser are essentially only incremental improvements to previous releases, but the same cannot be said about the other aspect of IE4, which represents a fundamental shift in Microsoft's entire operating system philosophy. For IE4 is not only the tool you use for browsing the World Wide Web, but has also been fully integrated with Windows Explorer to allow you to browse your PC in exactly the same way.

A trivial example of this unification is that it is now possible to open folders and files with a single rather than double mouse-click - just like ordinary hotspots in browsers. More profound is the presence of an item called The Internet as one of the fundamental elements of the Desktop as displayed in Windows Explorer (along with My Computer and Network Neighbourhood). That is, the Internet becomes simply an extension of your hard disc, with pages that you have visited indicated as a hierarchical arrangement of HTML files.

Moreover, when you select one of these pages, it appears in the right-hand pane of Windows Explorer, as a fully active Web page. This means that you can click on hotspots within the document and you will be taken to the corresponding Web page (assuming you have a live connection to the Internet, or the page in question is cached) which will be displayed within Windows Explorer once more.

This equivalence between internal and external storage space also works the other way. Just as the Internet is turned into a huge hard disc, so your hard discs become Web pages. Using a new Web view option, hard discs and their contents appear on a special Web page within Explorer whose HTML you can examine and modify using a Wizard. You can also create internal bookmarks: not to Internet pages, but to files on your hard disc that you can jump to as you would with external Web pages. As Microsoft's example shows, this could be a real boon for corporate IT departments, enabling them to set up PCs for non-technical users where explanations are not hidden away in help files, but found in the file listing itself (as a background HTML document).

Even the underlying desktop on your screen is an HTML file. As such, it can have not only live hotspots but also ActiveX controls too. This Active Desktop, as Microsoft calls it, is another major innovation for the Windows interface. For by embedding the appropriate ActiveX controls it is possible to obtain constantly updated information from the Internet which is then displayed automatically on your desktop - the ultimate in integrated push technologies. This new style of desktop push is also a feature of Netscape's Netcaster technology, part of its Communicator product, which will be discussed in a future column.

Other aspects of Microsoft's integrated Web and desktop include the new Taskbar, reminiscent of the icon bar in IE3 and above, and the Find option on the Start button. As well as searching for files through your hard disc, you can search for information out on the Internet and even for people using LDAP white pages servers. I was very impressed with IE4 - not just for what it did, but that Microsoft had done it at all. The extent to which it has managed to meld Web and desktop is extraordinary, and is eloquent testimony to just how serious Microsoft is about the Internet.

However, my admiration is tempered with considerable apprehension. As I have pointed out several times before, ActiveX controls are fundamentally insecure. Unlike Java applets, they are not confined within a safe 'sandbox', and the Authenticode scheme used by Microsoft essentially means that they are potentially viruses with birth certificates. Imagine, then, the devastating power a rogue ActiveX control could have in the context of IE4, where there is no distinction between Web and desktop or corporate network. IE4 will rightly be a great hit with end-users, but corporate IT managers will need to control its use extremely tightly.

Why users must resist Microsoft's advances

The software giant is slowly but surely increasing its influence in different spheres - which should set the alarm bells ringing.

The legal battle between Microsoft and the US Department of Justice has shown Microsoft quibbling, to say the least. Its interpretation of the judge's initial order forbidding it from forcing computer manufacturers to bundle Internet Explorer with Windows 95 has been similarly selective: it now offers what are effectively non-viable versions of Windows 95, with Internet Explorer and various ancillary but essential files removed.

Not content with this less-than-conciliatory approach, Microsoft has gone on to attack not only the so-called 'special master' appointed by the judge to carry out further research into the technical and legal issues surrounding this case, but also the judge himself. Of course Microsoft has a perfect right to defend itself. But what is interesting is its attitude to the whole legal process. This matters because whatever power Microsoft wields today is likely to pale into insignificance compared to the control it will soon have.

In the context of operating systems, Microsoft now has few real competitors. Windows 95 rules the desktop, and Windows NT is fast making inroads into the fragmented Unix market.

Cause for concern
Microsoft's impending dominance in all operating system sectors is not in itself cause for concern. If the history of computing shows anything, it is that once-powerful players can be reduced to a secondary role by the arrival of new technology in a matter of years (the case of IBM springs to mind). What is far more preoccupying is Microsoft's use of its technology - particularly Internet technology - to gain control of something much more fundamental: infrastructure.

Even though the Microsoft Network (MSN) has been a damp squib, it has taught the company a very important lesson about the difference between providing and exploiting online access. As MSN becomes increasingly marginalised, Microsoft is replacing it with a series of standalone Web services that already are among the key players in their fields.

For example, Expedia is one of the top online travel services, CarPoint is a major site for online car sales, and there is an imminent house-selling service, Boardwalk. Also worth noting are an online mall called Microsoft Plaza and a growing collection of local listings magazines known as Sidewalk.

Aggressive approach
Microsoft is also working more subtly to integrate itself into the fabric of business and education. It is setting up electronic filing for the US legal system, an architecture for linking banks and, with hardware partners, is trying to take over the running of California State University's computing network.

Given the $9bn (£5.6bn) cash reserves Microsoft has to realise these and other ambitions, its aggressive approach to the current court case is worrying. In particular, it raises the question of how the company might act if it ended up dominating the US's and perhaps the world's commercial, professional and educational infrastructure.

It also raises the question of what can be done to counterbalance such an unhealthy concentration of power. In its escalating attempts to Balkanise the world of Java - and to turn it into another Unix - Microsoft has provided the clearest indication that it is here that it sees its most serious rival. Java is not perfect, but it may offer the only alternative to Microsoft. A programming environment that is truly platform-independent would level the playing field and give other software houses and content providers a chance. Companies might like to think about deploying it more widely - while they still have the choice.

The Phoney Browser peace

Those with good memories may recall when browser updates seemed to appear every week, and when duration was measured in Web years, which ran 10 times faster than the more mundane variety. No longer: Web years are no more than double the normal kind, and new browser versions are very infrequent.

For example, it is now over a year since version 4 of Netscape Navigator and Internet Explorer first appeared. Internet Explorer 5 has finally made its appearance as a pre-beta version (July 1998), while Netscape Navigator 5 is still some months off. Nothing could illustrate more strikingly how browsers have moved away from the centre of the Internet stage.

Of course, they remain pivotal to the way the Internet - and, increasingly, business - works, but as a result they have become more workaday, the development cycles more leisurely and the improvements more incremental.

For instance, Internet Explorer 5 looks almost identical to version 4. One of the few differences between the two is the slight clicking noise that is produced when you select an option from the icon bar or follow a hypertext link. Less obvious is the new face of the Organise Favourites dialogue box. This comes up as a browser window written using Dynamic HTML, even though unusually it has no menus or other such screen furniture.

The biggest changes from Internet Explorer 4 are in the area of Dynamic HTML support. The company seems to be moving to make the language one of the main ways of creating and controlling objects on the desktop. Similarly, XML capabilities have been enhanced in Internet Explorer 5.

For the first hints of how all these new features might be applied, there is an overview of Internet Explorer 5, and more detailed information is also available.

If the development process of Internet Explorer has become largely hidden, with little obvious to show for the new engineering that lies beneath the browser surface - much of it tied to the Windows platform - Netscape is adopting a different approach for its next major browser release.

Netscape has boldly chosen to throw in its lot with the open source camp. This means making the development process of Navigator largely public. The results are at the Mozilla site where the source code can be downloaded, and the current status of the various building blocks of the browser reviewed.

However, quite how Netscape will pull all these together remains to be seen. In the meantime, it is about to issue version 4.5 of its browser. This will have new features including the use of the Alexa navigation system discussed elsewhere.

This lull might suggest that the browser wars are over. Certainly, in terms of market share things are settling down to a rough parity. With Netscape offering its browser free of charge, it is likely that it will staunch the ruinous flow of users towards Internet Explorer.

Similarly, now that a US Court of Appeals has overturned the initial decision regarding the bundling of Internet Explorer with Windows 95, it can be taken for granted that Internet functions, and Internet Explorer, will be part of the various Windows operating systems.

But this does not mean browsers have reached the end of their evolutionary path. In terms of user interface and basic HTML features perhaps they have, but the dynamic version, and particularly the imminent arrival of XML in the Web mainstream, will have great repercussions on browsers.

Microsoft has consistently led here, with limited XML support in Internet Explorer 4 now enhanced in version 5. Netscape has done important early work on the XML application Resource Description Framework, and needs to turn this now into new navigation features in the next version of its browser. So while the first browser battle may have ended in an uneasy truce, there can be little doubt that both parties are girding their loins for the next stage of the continuing war.

Cable modems

A natural instinct for Internet users is to want faster access, especially as new multimedia features such as animated gifs, Shockwave files and Java applets become increasingly common. For companies with leased lines, this comes down to cost: you can have any speed up to tens of megabits per second if you are prepared to pay for it. But for those using dial-up access, there are natural limits imposed by the ordinary telephone network that is generally used to reach the Internet Service Provider (ISP).

Modems pushed beyond 28.8 Kbit/s to 33.6 Kbit/s in the summer of 1996, with the possibility of 56 Kbit/s being dangled by Rockwell. ISDN offers 64 Kbit/s (128 Kbit/s if two lines are combined) by using a purely digital telephone connection. There is, however, another alternative that, in the US at least, is passing from pious hope into everyday reality. Instead of using the main telephone network, the idea is to employ cable television as an alternative infrastructure for accessing the Internet.

Like the telephone, cable TV offers a huge wiring system that can be pressed into service for carrying digital data, with the bonus that it was built for high-bandwidth transmission from the start. A new device, the cable modem, is required to translate from computer digital signals into a form that is acceptable to the cable network. On offer are download speeds in excess of 1 Mbit/s; upload speeds are more modest - generally tens of kilobytes per second - but fast enough for most end-users not trying to run Web sites using this method.

Cacheing

The main constraining factor for almost all Internet activity is bandwidth, the quantity of information that can be sent over a given connection. By far the easiest way to speed up the transport of data is not to send it over the Internet at all.

This may sound rather paradoxical, but is based on the observation that many of the things that you do on the Internet you have done before (that is, you often visit the same sites, or step back to view the same pages). The idea behind Internet cacheing is that by storing certain kinds of Internet information it is possible to retrieve much of your online requirements locally, thus avoiding the constraints of bandwidth.

Perhaps the best-known form of Internet cacheing occurs within leading Web browsers like Netscape. There, pages that you visit are cached for the session so that you can easily move back to one that you have visited without needing to reload it from the distant site.

Similarly, images can be cached between sessions, again allowing pages to be retrieved more quickly (since only the text need be collected, with the cached images dropped in from the local store). Microsoft's new Internet Explorer, found in the Windows Plus! kit for Windows 95, goes even further, and can cache entire Web pages between sessions, providing a kind of offline WWW reader.

Internet providers sometimes make use of caches in order to improve the speed of the service that they offer. For example, they may set up a proxy server to hold all or most of the files that have been accessed; when one of those files is requested again by a user, it comes from the cache rather than from the Internet site.
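
As a minimal sketch of the idea (the function name and cache structure here are invented for illustration, not taken from any real proxy product), a cache can be as simple as a dictionary keyed by URL: the fetch routine consults the local store first and only goes out over the network on a miss.

    import urllib.request

    # Toy cache keyed by URL: repeat requests are served from the local
    # store instead of travelling over the bandwidth-limited Internet.
    _cache = {}

    def cached_fetch(url):
        if url in _cache:                          # cache hit: no network traffic
            return _cache[url]
        with urllib.request.urlopen(url) as resp:  # cache miss: fetch it once
            data = resp.read()
        _cache[url] = data                         # keep it for next time
        return data

A real proxy server would also have to decide when a stored copy is too old to serve, but the principle is exactly this.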

Cache poisoning

The Domain Name System (DNS) lies at the very heart of the Internet. Without it, we would be reduced to entering addresses in the form 123.45.67.89 instead of as www.bloggs.co.uk. It has also ensured the recent rapid expansion of the Internet and in particular its extraordinary uptake in the business sector.

It is therefore surprising that this fundamental element of the Internet's infrastructure remains almost completely defenceless in the face of subversion. This was proved when the DNS files were intentionally corrupted so as to redirect traffic meant for one site - the main US registry for Internet names - to another run by an unauthorised rival.

The technique used was one known by the dramatic name of cache poisoning. This exploits the fact that the DNS service assumes that participating servers are well intentioned; after all, it is in their best interests that the domain name system be as efficient as possible.

DNS servers therefore accept without question certain kinds of information fed to them by other domain name system servers. Generally this information, stored in their caches for future use, helps speed subsequent conversions between the numerical and domain-based addresses.

However, as the attack on InterNIC showed, it is possible for the cache to be fed false data: to be "poisoned". If the poisoned caches belong to sufficiently important DNS servers, this can cause wholesale redirection of Internet traffic.
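
The weakness is easy to caricature in code. The sketch below is entirely hypothetical - it is not based on any real DNS server software - but it shows the essential flaw: a resolver cache that stores every additional record an answering server chooses to include, without checking that the sender is authoritative for those names.

    # Hypothetical, naive resolver cache: it trusts whatever it is sent.
    cache = {}   # maps hostname -> numeric IP address

    def handle_reply(answer, extra_records):
        """answer: the (hostname, ip) pair actually asked for;
        extra_records: unrequested (hostname, ip) pairs piggy-backed
        on the reply by the answering server."""
        name, ip = answer
        cache[name] = ip
        for name, ip in extra_records:
            # No check that the sender is authoritative for these names:
            # a malicious server can "poison" the cache with bogus entries
            # and silently redirect later traffic.
            cache[name] = ip

    # A poisoned reply: the attacker answers the real question, but also
    # slips in a false mapping for somebody else's domain.
    handle_reply(("evil.example.net", "10.0.0.1"),
                 [("www.internic.net", "10.0.0.1")])
    print(cache["www.internic.net"])   # traffic now goes to the attacker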

To solve this problem, several updates to DNS incorporating security and authentication have been proposed, but not yet implemented.

Cascading Style Sheets (CSS)

One solution to the ever-increasing demands of Web page design is a new HTML standard called Cascading Style Sheets (CSS). The official Web stylesheets home page is at http://www.w3.org/pub/WWW/Style/, hosted by the World Wide Web Consortium, and itself employing stylesheets. This has a useful list of HTML editors that support this new feature. The site also maintains the official CSS recommendations at http://www.w3.org/pub/WWW/TR/REC-CSS1. Perhaps the best overall starting place for exploring the world of stylesheets is Microsoft's page on the subject at http://www.microsoft.com/opentype/css/default.htm (also using stylesheets). Microsoft has been at the forefront of supporting CSS in its Internet Explorer, starting with version 3, whereas it is only with version 4 of Navigator that Netscape caught up.

A graphic demonstration of what style sheets are can be found at http://www.microsoft.com/truetype/css/gallery/spec1.htm. A list of other sites employing stylesheets can be found at http://www.microsoft.com/truetype/css/gallery/cssinf7.htm.

Also worth looking at is High Five at http://www.highfive.com/, a site about design with back issues at http://www.highfive.com/core/backissues.html; a must for any serious designer.

Case sensitivity

One of the many confusing aspects of the Internet is the extent to which elements within URLs can be altered when you enter them. Clearly, nothing major can be changed - no spaces added or letters substituted. But there remains the question of case, and which parts of an Internet address can and cannot be changed from upper to lower case, or vice versa.

In fact the rule is quite straightforward: you can change the actual address of the machine that you are accessing in any way, but you cannot safely fiddle with the directory structure that follows it. Thus http://www.ibm.com/ will work just as well as HTTP://WWW.IBM.COM/ or even HtTp://WwW.IbM.cOm/ .

However, the same is not true of the elements that follow this part, which must be entered exactly according to the information given. The reason for this stems from the underlying structure of the URL.

The first part - the domain name - is simply an alternative way of expressing the 32-bit address that is generally written as four decimal numbers, each between 0 and 255, to give a general shape of 123.45.67.89. When messages and requests are routed over the Internet, the domain name is converted to the numeric equivalent, and the exact form of the name as far as case is concerned is irrelevant.

However, the remaining part represents the directory structure of the site being accessed. The commonest operating system found on these remains Unix, which is case-sensitive when it comes to directories and file-names. For sites running Unix there is therefore a world of difference between http://www.abc.co.uk/pub/ and http://www.abc.co.uk/PUB/, even between http://www.abc.co.uk/pub/read.me and http://www.abc.co.uk/pub/Read.me.
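
A short sketch using Python's standard urllib shows the rule in practice (the helper name is my own): the machine address can safely be folded to lower case, but the path is passed through untouched.

    from urllib.parse import urlsplit, urlunsplit

    def normalise_url(url):
        """Lower-case the scheme and host name (always safe), but leave
        the path alone - on a Unix server its case is significant."""
        parts = urlsplit(url)
        return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                           parts.path, parts.query, parts.fragment))

    print(normalise_url("HtTp://WwW.IbM.cOm/"))        # http://www.ibm.com/
    print(normalise_url("http://www.abc.co.uk/PUB/"))  # the PUB/ path is kept as-is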

Castanet

Two of the most important innovations on the Internet during 1996 have been Java and PointCast. In some ways they are very similar: both are about transmitting information over a network - the Internet or intranets - to provide functionality. Java has the advantage that it transmits fully-fledged programs, while PointCast offers the added bonus of constant updates.

Given the plaudits which Java and PointCast have won separately, it is therefore not such a surprise that an approach which puts together the best of both to offer constantly updated programs delivered over networks should meet with equal acclaim. The fact that the company doing so, Marimba, was founded by several key members of the original Java development team only adds to its credentials.

After months of rumours Marimba has released beta versions of its first product, Castanet. As the basic PointCast approach dictates, there is a transmitter and a tuner (respectively the server and client elements); a Visual Basic-like development tool for this new medium, called Bongo, has also been written. They can all be downloaded from Marimba's home page at http://www.marimba.com.

Two official books about Marimba products have been published; both have been written in co-operation with the company's developers. They are guides to the application distribution technology Castanet (£37.50, ISBN 1-57521-255-2) and the Java interface design environment Bongo (£37.50, ISBN 1-57521-254-4).

In keeping with the broadcasting metaphor, information is transmitted and received on various channels; a transmitter (that is, Castanet server) may offer several channels. These channels are effectively distinct programs (written in Java initially, but not limited to this language) that are sent from the transmitter to receiver. They are stored on the client machine and run by the tuner, and receive updates of both data (for example a share service might display the latest prices) as well as of the basic program itself. This is one of the great advances of Castanet, particularly with respect to Java: it allows updates to be carried out on the fly and incrementally. This avoids having to download large applets again just to get the latest version.

The tuner adopts a tabbed sheet approach, familiar from Windows 95. A configuration sheet allows you to set how often channels are updated. A Hot sheet provides some of the available transmitters. Marimba is currently providing an open transmitter with developer channels at http://trans.havefun.com:5282/. The Listing tab is the way to access transmitters in general, though for this you need to know their address; typically a port is also specified for Castanet transmitters.

Once a transmitter is located, you can obtain a listing of all its channels. These are displayed as hanging off a directory-type tree of the main transmitter addresses, and contain a short description about the nature of each channel. Double-clicking on a new channel causes the associated application to be downloaded and then run. These applets can be quite large - several megabytes - but as mentioned above, downloading them is a one-off operation.

Among well-known names included on the Hot spots tab are Talk.com, HotWired's Castanet-based chat channel (at http://trans.talk.com; this address can be entered directly in the Listing tab of Castanet), the Excite search engine (at http://trans.excite.com), and the Java directory Earthweb (at http://trans.earthweb.com). As you might expect, Marimba itself has a number of interesting channels (at http://trans.marimba.com), including an online tutorial that uses the Marimba tuner as a proxy server to an ordinary browser, serving up standard HTML documents (this requires a slight manual modification to the set-up of your browser).

Encouragingly there are also already a couple of entries from UK users. One is at http://trans.totem.co.uk. Particularly impressive is the other, the KMI/Stadium project (at http://kmi.open.ac.uk/stadium) which is an ambitious mock-up of a distance learning project using Internet channels to broadcast lectures worldwide. The application of this kind of idea in a business context is obvious. Although it is early days yet for this technology it is clear that it adds greatly to the possibilities of Internet and intranet information provision. Both Microsoft and Netscape have expressed interest in the idea, and it will be interesting to see how this area develops.

Cello

In 1994, there were two main Microsoft Windows browsers for the World Wide Web: Mosaic and Cello. The history of Mosaic since then has been dramatic: a constant flow of updates, the licensing of the program to a wide range of companies (including at one point Microsoft) and the appearance of Netscape, Mosaic's intellectual progeny, written by most of the team of programmers that devised and created the original product.

Against the blaze of publicity that Mosaic has enjoyed, Cello has almost vanished from sight. Things have not been helped by the fact that a number of commercial companies, such as Frontier Technologies Corporation and Booklink Technologies (now part of America Online), have also leapt into the browser market with products that offer alternative ways of accessing the Web; shifting the emphasis on to what is new and innovative rather than old and established.

And yet Cello has always had its loyal band of users. If the program was rather less showy in its capabilities compared with Mosaic, it was also much simpler to set up and easier to use. It also came with very full built-in help from the start - something that Mosaic lacked, although it does now at least have links to online help from its home site (an approach pioneered by Netscape). Cello is available from many sites, including its home site at ftp://ftp.law.cornell.edu/pub/LII/Cello/cello.zip.

Certification

The public key encryption techniques used in the Pretty Good Privacy program and elsewhere (for example in the Netscape browser) solve several problems to do with security. Most obviously they provide a means of sending a message to someone in a completely secure fashion, without needing to exchange secret keys beforehand (like conventional encryption). This is done using two keys: a private one (which is kept secret) and a public one (published for others to use).

Public key encryption also allows you to be sure that a given message came from the person holding a certain private key, since only the private key could have been used to generate the encrypted form of the message that the public key unlocks.
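
The principle is easy to demonstrate with a modern library. The sketch below uses the third-party Python 'cryptography' package purely as an illustration - it is not what PGP or Netscape actually use internally - to show a message being signed with a private key and checked with the matching public key.

    # Illustrative digital signature: sign with the private key,
    # verify with the public one (third-party 'cryptography' package).
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()       # this half can be published

    message = b"Please transfer 100 pounds to account 12345"
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)
    signature = private_key.sign(message, pss, hashes.SHA256())

    # Anyone holding the public key can check the signature; verify()
    # raises an exception if the message or signature has been altered.
    public_key.verify(signature, message, pss, hashes.SHA256())
    print("signature checks out")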

However, on its own this approach cannot guarantee that the person who has the private key is who he or she claims to be. Imagine the situation where somebody publishes a public key falsely claiming to be someone else: encrypted messages will then indeed appear to come from the claimed person (because the published key unlocks the messages), but it is the matching of the key with the individual that is untrue.

The way round this is to use what is called certification. It is possible for a trusted third-party organisation to add its seal of approval to certify that a given public key did indeed come from the person who claims it as his or her own.

Of course, this creates a situation where you need some trustable authority for certifying keys, and it has been suggested that independent bodies such as the Post Office, banks or similar might fulfil this role.

Chain E-mail

There are two basic forms of Internet chain E-mail. The first is the "good luck" chain letter where you are invited to forward a self-perpetuating E-mail message that is supposed to bring you and the people to whom you send it good luck. It explicitly tells you to send no money, but does threaten you with "bad luck" if you fail to pass on the E-mail. So much for primitive superstitions being banished by high technology.

The other form of chain letter is more insidious, and probably illegal in most countries where the Internet is widely used. It specifically requires you to send money to people at the head of the chain. The nominal logic of such pyramid schemes is, of course, familiar: that by doing so and passing on the message it is a mathematical "certainty" that in due course you will receive huge sums from those lower in the pyramid.

If for no other reason, chain letters and pyramid schemes should be studiously avoided because they are a complete waste of the Internet's valuable bandwidth. Fortunately, on the rare occasions when they rear their heads in public Usenet newsgroups they are quickly shot down by the vast majority of intelligent Internet users, who have a rather more useful message to pass on.

Client-server

Many of the Internet's services operate using the client server model, which is widely employed in business for corporate databases and other applications. The basic idea is to split up a function into two separate elements: the client, which instigates actions, and the server, which responds to them. The client and the server are usually physically distant, and connected by some form of network (such as the Internet).

The advantage of the client-server model is that most of the processing can be done where data and processing power are concentrated - for example, a mini or mainframe with extensive storage. Only the information requested by the client is sent back over the network, thus minimising the traffic. This contrasts with the situation where the processing is always done locally, requiring larger amounts of data to be sent and greater processing power to be available on the local machine.

Archie file-searching, Gophers and the Wide Area Information Servers (WAIS) also use the same model. Less obvious is the fact that the familiar Web browsers are themselves examples of WWW clients.
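
The division of labour is easy to see in a toy example. The sketch below (the port number and the 'stock' data are invented) runs a miniature server in one thread and a client in another: the server owns the data and does the work, while only the small question and the small answer cross the network.

    import socket, threading, time

    STOCK = {"widgets": 42, "gadgets": 7}     # the data lives with the server

    def serve():
        srv = socket.socket()
        srv.bind(("localhost", 5151))         # arbitrary port for the sketch
        srv.listen(1)
        conn, _ = srv.accept()
        item = conn.recv(1024).decode()       # the client's request
        conn.sendall(str(STOCK.get(item, 0)).encode())  # only the answer travels
        conn.close(); srv.close()

    threading.Thread(target=serve, daemon=True).start()
    time.sleep(0.5)                           # give the server a moment to start

    # The client instigates the action and merely displays the result.
    cli = socket.socket()
    cli.connect(("localhost", 5151))
    cli.sendall(b"widgets")
    print("widgets in stock:", cli.recv(1024).decode())
    cli.close()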

Common Gateway Interface

Although the World Wide Web is an immensely impressive achievement, the underlying technology is extremely simple. When you use a Web client like Netscape or Internet Explorer, you send a simple request using the HyperText Transfer Protocol (HTTP) to a Web server which then sends you back a page written using the HyperText Mark-up Language (HTML). For obtaining information, or just browsing, this is fine; but if you need something more complex, the limitations of this approach soon become apparent.

To get round the problem, a way has been devised whereby ancillary programs can be run at the Web site to add any kind of extra functionality to the basic features of the WWW server. For example, the forms that appear in many Web pages send information back to the server which is then processed using one of these programs - perhaps to search through a database, or to take an order for a product and process it. Another common use is to invoke different responses when different parts of a clickable image are selected.

The Common Gateway Interface (CGI) describes the way the Web server and the external program interact. You can often detect the presence of programs using the CGI by the appearance of the directory /cgi-bin/ somewhere in the URL.

This indicates that when you access a page containing this path you are probably running a program, sometimes called a CGI script. Such scripts are often written in Perl, but other languages like Visual Basic and Java can also be used depending on the circumstances.
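
A CGI script need be no more than a program that writes an HTTP header and some HTML to its standard output. The sketch below is in Python rather than the more traditional Perl, simply for consistency with the other examples here; the script name (greet.py) and the form field ('name') are invented.

    #!/usr/bin/env python3
    # A hypothetical /cgi-bin/greet.py: the Web server runs this program
    # and returns whatever it prints to the visitor's browser.
    import cgi

    form = cgi.FieldStorage()                  # data sent from an HTML form
    name = form.getvalue("name", "visitor")    # invented field name

    print("Content-Type: text/html")           # the HTTP header first...
    print()                                    # ...then a blank line...
    print(f"<html><body><h1>Hello, {name}!</h1></body></html>")   # ...then the page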

Compression

No matter how fast your Internet connection, or how big the hard disc on your server, they are never fast or big enough. One way to increase both the effective connection speed and hard disc capacity is to use compression.

Essentially compression techniques store a file in a way that exploits structures and repetitions that exist within all but the most random of digital files. The greater the structure within a file (for example a graphics file might contain distinct areas of one particular colour) the greater the possible compression.
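
The point about structure is easy to demonstrate with Python's built-in zlib module: highly repetitive data shrinks dramatically, while data with no structure - here simulated with random bytes - barely shrinks at all.

    import os, zlib

    structured = b"the quick brown fox " * 500   # plenty of repetition
    random_ish = os.urandom(len(structured))     # nothing for zlib to exploit

    for label, data in (("structured", structured), ("random", random_ish)):
        packed = zlib.compress(data)
        print(f"{label:10s} {len(data):6d} -> {len(packed):6d} bytes "
              f"({100 * len(packed) / len(data):.0f}% of original)")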

As a result, wherever you go on the Internet you are almost certain to encounter compression. If you are using a dial-up connection, then your modem will probably offer some kind of compression on the fly, where files sent and received are squeezed for transmission over the telephone line and unsqueezed at the other end.

Similarly, files held on FTP or Web servers are usually stored in a compressed format, both to save space there and to reduce the download time for visitors (thus freeing up the connection for others). This is evident from the file extension employed. Typically these are .Z and .gz for Unix files, .zip for PC programs (less-common formats include .lzh, .arj and .arc), and .sea for the Macintosh.

The latter is unusual in that it is a self-extracting format: to retrieve the file it contains, you do not need any additional software to carry out the decompression; simply running the program causes the compressed file to be reconstituted automatically. There are also facilities in PKZIP and WinZip that convert a .zip file to a self-extracting .exe file.

Connectionless protocols

The experience of using the World Wide Web is such a smooth and fluid one that the nature of the underlying technology is a little surprising. For when you are accessing a site, the connection remains in place between your browser and the server only as long as it takes to download the file (or files if there are inline graphics, for example) to your desktop machine. The rest of the time - while you read the text or admire the images - the server forgets about you completely.

Indeed, when you click on a link in a page, in general the site holding that document has no memory of you retrieving it whatsoever. For this reason, text files called browser cookies are sometimes employed to store an access history on the client machine that can be retrieved by a server.

Technically speaking, the HyperText Transfer Protocol (HTTP), which handles the communications between the Web server and its client (the browser), is said to be connectionless, since there is no permanent link between the two sites.

This makes the Web very efficient in terms of bandwidth since it uses the Internet only at the moment when there is something to download (or upload if you are filling in a form). Other protocols - telnet and FTP, for example - maintain a connection between client and server whatever you are doing.

Even these, of course, are only virtual connections. They are unlikely to have any fixed physical path through the Internet's many constituent networks given the way in which all data is converted into packets that are sent independently and then re-assembled on arrival.
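
The 'ask, answer, forget' character of HTTP is visible if you speak the protocol by hand. In this sketch a single HTTP/1.0 request is sent over an ordinary socket to www.example.com (any plain-HTTP server would do); once the page has been returned the server simply closes the connection and retains no memory of the exchange.

    import socket

    # One complete HTTP/1.0 exchange: connect, ask, read the answer,
    # then the server drops the connection - nothing persists.
    host = "www.example.com"
    sock = socket.create_connection((host, 80))
    sock.sendall(f"GET / HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())

    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:              # an empty read means the server has hung up
            break
        response += chunk
    sock.close()

    print(response.split(b"\r\n")[0].decode())   # e.g. "HTTP/1.0 200 OK"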

Cookie file

One of the problems of downloading executable files from the Internet (even assuming that they are virus-free) is that when you install them they start spraying components hither and thither until you end up with a hard disk full of unmemorably-named files.

Netscape is no exception, and many people must have wondered what on earth one called cookie.txt was for. Surprisingly, this is not just some feeble programmer's joke, but is a moderately well established fact of Internet life.

The cookie file is used to store information between Web sessions (normally, temporarily cached information is lost when the browser is shut down). This information is available to Web sites that may be configurable in some way, and which need to retrieve previously-defined parameters. One of the best examples of just such a configurable site is the home page of the Microsoft Network (at http://www.msn.com/). One of the options from this page is to customise it in various ways: adding favourite links, search engines, news stories and even screen colours.

Obviously the site itself cannot store all this information for every visitor. Instead, this customisation is held in a cookie file from where it can be retrieved by the browser and sent to the web server the next time you log on.
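
The mechanics can be shown with Python's standard http.cookies module (the preference names below are invented): the server sends its chosen settings as Set-Cookie headers, the browser writes them to its cookie file, and they come back as a Cookie header on the next visit.

    from http.cookies import SimpleCookie

    # What the server sends: preferences it wants remembered.
    outgoing = SimpleCookie()
    outgoing["colour"] = "green"               # invented preference names
    outgoing["news_section"] = "technology"
    print(outgoing.output())                   # the Set-Cookie: lines for the browser

    # What the browser stores locally and returns on the next visit.
    returned = SimpleCookie()
    returned.load("colour=green; news_section=technology")
    print(returned["colour"].value)            # the server reads back 'green'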

Unfortunately, with typical Microsoft presumption, instead of placing the file cookie.jar in the main Internet Explorer directory (as would be natural), the company has elected to hide it away in the main Windows directory, cluttering it up even more.

Corba and IIOP

One of the most important aspects of the Internet - and of intranets - is platform independence. There is a real possibility of mix and match, where elements of one manufacturer can be integrated seamlessly with those of another. Add to this the distributed nature of the Internet/intranet and you have the foundation for a component-based approach to computing that allows pieces to be plugged together wherever they are located on a network.

Against this background of interest a pre-existing approach is finding itself increasingly in the limelight after years of obscurity. Called Common Object Request Broker Architecture (Corba), this object management protocol has the backing of almost the entire computing industry. In addition to describing how distributed elements should be constructed so that they can work together in this seamless way, Corba also defines how the process should work across the Internet.

Known as the Internet Inter-ORB Protocol (IIOP) - Corba seems blighted with very forgettable names - this area has received a significant boost through Netscape's espousal of it in its Open Networking Environment. The success of IIOP and Corba would seem assured if Microsoft were not pushing its own Distributed Component Object Model approach as an alternative, even though it is nominally a member of the Object Management Group which oversees the development of Corba and its related standards.

Crawlers

Web robots, spiders, crawlers and worms conduct methodical searches through the whole of Web space, trying to find all URLs that are connected. As an example of the scale of the problem of indexing the WWW, Lycos, which can be found at http://lycos.cs.cmu.edu, has a database of over 1.89 million URLs.

Searching is simple; just enter a word or short phrase and Lycos will find matches or near matches among its holdings. The match does not have to be the site name; it can be hidden fairly deeply within the site's contents.

The worm at http://www.cs.colorado.edu/hme/mcbryan/WWWW.html also has a large database including information on gophers.

The Web crawler at http://webcrawler.cs.washington.edu/WebCrawler/WebQuery.html has around 70,000 documents referenced in a 50Mbyte index. It knows of a further 700,000 unvisited documents in a database of 40Mbytes.

Other, smaller databases of URLs include:

RBSE at http://rbse.jsc.nasa.gov/eichmann/urlsearch.html

Nikos at http://www.rns.com/cgi-bin/nikos

Jumpstation Robot at http://www.stir.ac.uk/sjbin/js

WWW Wanderer at http://www.mit.edu:8001/cgi/wandex.

babyOIL at http://www.dstc.edu.au:80/babyOIL/.

A good consolidated list of Web search engines can be found at http://cuiwww.unige.ch/meta-index.html.

More information and theoretical details on Web robots can be found at http://web.nexor.co.uk/mak/doc/robots/robots.html.
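
At heart, every one of these robots is the same loop: fetch a page, pull out its links, add any new ones to the queue, and repeat. The sketch below uses only the Python standard library and a crude regular expression for link extraction, with an arbitrary page limit; a real robot would also use a proper HTML parser, respect robot exclusion files and pace its requests.

    import re
    import urllib.request
    from urllib.parse import urljoin
    from collections import deque

    def crawl(start_url, max_pages=10):
        """Breadth-first walk of Web space: fetch, extract links, repeat."""
        seen, queue = {start_url}, deque([start_url])
        while queue and len(seen) <= max_pages:
            url = queue.popleft()
            try:
                with urllib.request.urlopen(url, timeout=10) as resp:
                    html = resp.read().decode("utf-8", errors="replace")
            except OSError:
                continue                        # unreachable page: skip it
            # Crude link extraction - good enough for a sketch.
            for href in re.findall(r'href="([^"#]+)"', html, re.IGNORECASE):
                link = urljoin(url, href)
                if link.startswith("http") and link not in seen:
                    seen.add(link)
                    queue.append(link)
        return seen

    print(len(crawl("http://www.example.com/")), "URLs discovered")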

Credit card security

It is accepted online folk wisdom that you never send credit card details over the Internet. The reasoning behind this is that ordinary Internet messages are more like postcards than letters, with anyone being able to read them who takes the trouble to do so.

While this is certainly true, it is hardly worth the effort required for this kind of credit card fraud, and such thefts are more or less unheard-of. Clearly, though, not being able to use a credit card for purchasing online because of these fears would render the Internet useless for day-to-day commercial transactions. Fortunately, there are now a number of completely secure methods of sending such details across the Internet. All of them use the public key encryption techniques described elsewhere.

The basic TCP/IP standard that underlies the Internet says nothing about the security of the data it transmits. Although TCP/IP is generally called a 'packet'-based system, in truth information is sent more like a series of postcards that can be read by anyone who takes the trouble to do so. This inability to guarantee secure financial transactions has perhaps been the biggest obstacle to the routine use of the Internet for all business activities.

Of course, technology companies have not been slow to spot this need and to satisfy it. Indeed, the problem has been that there are not one but two highly suitable solutions. The first comes from Netscape Communications Corporation, the company that burst upon the Internet scene at the end of 1994 with the slick and powerful Netscape Navigator browser.

Built into this product was a security standard called Secure Sockets Layer (SSL) - see the URL http://www.netscape.com/newsref/std/SSL.html for more details. When the Netscape browser connects to a Netsite Web server, the SSL standard allows any data that passes between the two to be encrypted, using almost identical techniques to those employed by the Pretty Good Privacy system. Applied fully, these techniques are - according to current knowledge at least - unbreakable for all practical purposes. Perhaps just as importantly for ordinary users, they are implemented in a completely transparent fashion so that you are unaware that they are in place.

Or almost unaware: for Netscape has instituted a rather neat visual element that indicates the security or otherwise of the transaction: in normal, non-secure transfer, there is a small broken key in the bottom left-hand corner of the screen. As you pass to secure mode, this key magically joins itself together. To see this in action, try accessing the URL http://www.virtualvin.com/ with a Netscape browser.
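For those curious about what happens beneath the browser's broken-key icon, the following minimal sketch uses Python's standard ssl module (which speaks TLS, the modern successor to SSL) to open an encrypted channel to a Web server; the host name is purely illustrative.

```python
# A minimal sketch of setting up an encrypted channel with a secure Web
# server using Python's ssl module. The host name is illustrative; any
# HTTPS-capable server would do. Certificate checking is handled by the
# default context, and the negotiated cipher shows encryption is in place.
import socket
import ssl

host = "www.example.com"                  # illustrative host name
context = ssl.create_default_context()    # verifies the server certificate

with socket.create_connection((host, 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
        print("Protocol:", tls_sock.version())  # e.g. 'TLSv1.3'
        print("Cipher:  ", tls_sock.cipher())   # negotiated cipher suite
        # From here on, anything sent over tls_sock is encrypted in transit.
        tls_sock.sendall(b"GET / HTTP/1.0\r\nHost: " + host.encode() + b"\r\n\r\n")
        print(tls_sock.recv(200))
```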

SSL has the advantage that it exists now, and comes as standard with what is almost certainly the most popular Web browser. Its rival, called Secure HyperText Transfer Protocol (S-HTTP - see http://www.eit.com/projects/s-http/), is backed by the NCSA (where Mosaic was developed) and the CommerceNet organisation (http://www.commerce.net/); it also seemed to be favoured by the WWW Consortium (but see http://www.w3.org/hypertext/WWW/Security/News/950303_Statement.html). It works in essentially the same way as SSL in that transactions are encrypted using the techniques also employed by PGP, without the user needing to worry about the details.

Until recently it looked as if the Internet world was faced with the prospect of a highly-damaging split as software vendors and commercial users lined up with one of the two opposing camps. Although working in similar ways, a browser set up for one could not work securely with servers using the other.

However, rather unexpectedly, all parties have shown tremendous good sense and got together to work on a common standard embracing both security schemes. This is possible because they work at different levels. SSL creates secure channels across the Internet at a very fundamental level (and therefore is not just restricted to Web operations, but can also be used for secure mail, news, FTP, telnet and Gopher operations), whereas S-HTTP, as its name suggests, is bound up with the HTTP protocol that ties the World Wide Web together.

A variant of S-HTTP has been developed by a UK company, MarketNet, whose product Workhorse uses SSL to encrypt E-mail containing financial information (see URL http://193.118.187.105/help/system/horse).

Apart from the fact that the two main camps have come together, thus avoiding any direct conflict, the success of this new standard seems assured by the fact that the move has been sanctified by four major online players - IBM, America Online, CompuServe and Prodigy - all of whom have invested (along with Netscape) in the company Terisa (see http://www.terisa.com/), which will produce the necessary toolkit for the new standard.

Unfortunately this cosy situation has now been disturbed by Microsoft's announcement of its Secure Transaction Technology (STT), developed jointly with Visa. The latter had originally joined with MasterCard to draw up a standard for the transmission of credit card details over the Internet, but the STT announcement sees Visa and Microsoft ranged against Mastercard, Netscape and many other major players in this market - and the waters muddied once more.

Another way to protect credit card details during transfer across the Internet is to avoid the latter completely. Instead, you set up an account with a company on the Internet (effectively a cyberbank) and use funds held there to pay for goods sold by companies or individuals who also have accounts. At the end of the month you are billed for your purchases, while vendors are credited with the appropriate sums. The most-developed scheme along these lines is the one run by First Virtual (see http://www.fv.com:80/info/). A similar system can be found at http://www.checkfree.com/.

Since the status of such cyberbanks is a little uncertain from a regulatory point of view, other companies have chosen to take a related but different approach. Instead of holding funds for you which can then be spent, they maintain instead direct links to banks, arranging for monies to be transferred from your account to that of the vendor. In this way no credit card or account details are passed over the open Internet, just orders to buy and confirmations of the sale. Among the schemes in various stages of development are those run by Cybercash (at http://www.cybercash.com/), NetChex (http://www.netchex.com/), Netbill (http://www.ini.cmu.edu/netbill/) and NetCheque (http://nii-server.isi.edu/info/NetCheque/).

NetCheque will also be offering something called NetCash, which represents the final stage in the evolution of online payment systems. Here there is no recourse to credit cards or matching orders of purchase and sale: instead a completely digital equivalent of money is employed. This is achieved using techniques very similar to those behind the encryption tool Pretty Good Privacy (and the SSL and S-HTTP standards). It allows messages with a value guaranteed by an ordinary bank to be sent safely, anonymously but non-repudiably over the Internet.

Link, the UK's largest shared cash machine network, has put its directory on the World Wide Web. It can be accessed at http://www.link.co.uk/LINK/, and has a search engine that can be used to find cash machines local to a given address.

The pioneer and leader in this area is Digicash, whose pages at http://www.digicash.com/ have more background on the subject. Also worth noting is the Mondex consortium that, although primarily concerned with electronic money on a card, also has plans to allow its digital currency to be used over the Internet (see http://www.mondex.com/mondex/net.htm). Such schemes may seem the stuff of science fiction rather than business at the moment, but it is likely that they, rather than any of the others discussed above, will form the basis of Internet commerce in the not-too-distant future.

Several research projects are investigating ways of pulling together these security technologies and making them easy to use. The European Community project for the development of a "Secure Electronic Marketplace for Europe" (Semper) has a home page at http://www.darmstadt.gmd.de/~semper/, although all the links here seem to be out of date. A better starting point is http://www.digicash.com/products/projects/semper.html.

Also based in Europe is the E2S project - End-to-End Security on the Internet (home page at http://www.ansa.co.uk/E2S/index.html). A US equivalent is at http://www.llnl.gov/fstc/projects/commerce/index.shtml. The link at http://www.llnl.gov/fstc/projects/commerce/payments.html is good for background information.

The biggest impetus to conducting commerce over the Internet will come from agreements between Mastercard and Visa (see http://www.mastercard.com/set/set.htm and http://www.visa.com/cgi-bin/vee/sf/set/intro.html?2+0 for both the business and technical aspects). Originally Mastercard and Netscape offered their own standard, SEPP, with Visa and Microsoft offering a rival one. Netscape's announcement can be found at http://home.netscape.com/newsref/pr/newsrelease91.html and Microsoft's electronic retail strategy at http://www.microsoft.com/advtech/eretail/faq.htm.

Cybersitter

One of the brakes on the uptake of the Internet in businesses has been a worry that staff may abuse it. Along with the fear of general time-wasting, there is the more serious concern that the Internet may be used for undesirable activities such as downloading pornographic material. Obviously this is part of a much larger issue, that of who ultimately controls the Net, and who can or should apply any kind of censorship. The extent to which these questions are unresolved was highlighted recently by the strange case of the disappearing CompuServe Usenet newsgroups.

According to CompuServe, German prosecutors informed the company that making available 200 of the more extreme newsgroups in Germany would have left CompuServe and its employees liable to legal action.

Unable to block Usenet feeds in one country only, CompuServe pulled all 200 areas from its global service, provoking a furious debate in the US, where this was seen as a gross infringement of personal liberties. What makes the matter even more curious is that the German authorities deny that they demanded this action, and say that CompuServe overreacted to their initial comments.

Clearly many issues need to be clarified in this area. But there is a growing feeling - among Internet users, at least - that legislation is not the way to tackle people's concerns about who can access what on the Internet. The alternative is to control Net access at the user's rather than the supplier's end.

For example, the new AOL service allows several accounts to be set up, with one acting as the overall arbiter of what the others can access. It is therefore possible to block access to all newsgroups, to those containing certain key words or to specified areas. It is also possible to block all binary downloads.

Of course, if you know what you are doing, it is possible to circumvent these restrictions, but this feature is aimed at parents who are concerned about what their children access.

To serve this and related markets, a new class of software designed to offer control over what users do online has sprung up. Because of the way they work, many such programs could also be used in businesses by IT managers worried about rogue use.

For example, Net Nanny (at http://www.netnanny.com/netnanny/) allows you to block words, phrases, sites and content, all according to a user-defined dictionary. This applies to all Internet activity, whether it is on the Web, in Usenet newsgroups or at FTP sites, and can include E-mail.

It is also possible to block the transmission of sensitive information, and the software provides a full audit trail of attempted accesses to blocked sites. A corporate version that works across local area networks is under development.

Products working in very similar ways are Cybersitter (at http://www.solidoak.com/cybersit.html) and The Internet Filter (at http://www.xmission.com/~seer/jdksoftware/netfilt.html). More directly relevant to corporates is Cyber Sentry (at http://www.microsys.com/cybers/default.html). This maintains a list of permitted Internet addresses, and records all network activity. It even allows access to a set of Net addresses after work hours, allowing limited recreational use by employees in their own time.

WebTrack (at http://www.webster.com/) is a complete monitoring, filtering and management server, designed to run as a proxy behind a firewall. A complete HTML log of all corporate accesses is maintained, and daily server logs can be exported to Excel for reporting and analysis.

WebTrack is Unix-based, unlike the programs above. These all run under Windows (with some Macintosh versions under development), and free, limited-function demo versions are available for most of them.

Denial of service

One aspect of the Internet that is often hard for businesses to accept is that sites are either public or private: there is no half-way house whereby the site is visible and yet cannot be reached in some way.

This means any public Web site, for example, is exposed to all kinds of probes from anyone on the Net. The most familiar attack involves trying to exploit weaknesses in the configuration of the software running on sites to gain control, but recently another has become more common: denial of service. Here the idea is not to break into a site or network, but to block access to it, often by abusing the logic of the software itself.

For example, denial of service attacks can work by causing programs to crash or get into states that cannot be resolved, such as waiting for confirmations that will never come. Just as ordinary cracking is foiled by applying patches to the relevant code to close any loopholes, so denial of service attacks are fought by tightening up the underlying logic of the programs involved.
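By way of illustration, the sketch below shows the simplest kind of logic-tightening: a toy server (the port number is arbitrary) that refuses to wait indefinitely for clients which never finish what they started.

```python
# A minimal sketch of the logic-tightening described above: a server that
# never waits indefinitely for data that may not arrive. The port number is
# arbitrary; real servers combine this with OS-level protections.
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("", 8080))          # illustrative port
server.listen(5)

while True:
    conn, addr = server.accept()
    conn.settimeout(5.0)         # never wait more than 5 seconds for a client
    try:
        request = conn.recv(1024)            # a stalled client raises timeout
        conn.sendall(b"HTTP/1.0 200 OK\r\n\r\nhello\r\n")
    except socket.timeout:
        pass                     # drop clients that never send their request
    finally:
        conn.close()             # free the slot for the next visitor
```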

Denial of service attacks are generally carried out for the hell of it. When crackers find that a site - ideally a high-profile one - has omitted to configure software correctly, or failed to apply the latest patch, the temptation is to exploit that weakness as a demonstration of supposed superior computing skills. But in an age when Net connectivity is becoming indispensable, the use of such an attack as a weapon of illicit commercial warfare cannot be far off, and may already be here.

Diffie-Hellman

Encryption is crucial for the application of the Internet to business. Without it, communications are not secure, and the most important ingredient of electronic commerce - trust - is missing. Fortunately there are now many different encryption approaches that can be applied in electronic commerce applications. But because so much hinges on this area, encryption has become something of a battlefield.

There have been disputes about cryptographic patents (between the top company in this area, RSA, and the inventor of PGP), and there is the continuing argument about whether industrial-strength encryption should be exported from the US. In this context, the passing of the Diffie-Hellman encryption system into the public domain assumes a certain importance. Although few will have heard of this approach, it is in fact one of the earliest encryption systems and therefore one of the most-tested and best-understood.

If few have heard of Diffie-Hellman, even fewer will care about the mathematical details, which depend on the properties of very large prime numbers, as do many encryption techniques. In practical terms, Diffie-Hellman allows a secret number to be established by two parties using public communications. This number can then form the basis of conventional secret key encryption. The fact that the technique is now freely available should lead to a renaissance in its use and more low-cost electronic commerce solutions.
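The exchange itself can be sketched in a few lines. The numbers below are deliberately tiny, purely to show the principle; real implementations use primes hundreds of digits long.

```python
# Diffie-Hellman key agreement with deliberately tiny numbers. Both parties
# agree publicly on a prime p and a generator g, exchange only A and B in
# public, yet end up with the same secret number.
p, g = 23, 5                  # public values (toy-sized)

a = 6                         # Alice's private value
b = 15                        # Bob's private value

A = pow(g, a, p)              # Alice sends A = g^a mod p  -> 8
B = pow(g, b, p)              # Bob sends   B = g^b mod p  -> 19

alice_secret = pow(B, a, p)   # Alice computes B^a mod p
bob_secret = pow(A, b, p)     # Bob computes   A^b mod p

assert alice_secret == bob_secret == 2   # the shared secret, never transmitted
```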

Digital certificates

Public key or asymmetric cryptography works using two encryption keys. One of these is kept secret (the private key), while the other is made public (by posting to Web sites, E-mailing to potential users etc). Information encrypted with one key can only be decrypted with the other. So if information is encrypted using an individual's public key, you can be sure (at least as far as is currently known) that only the person with the corresponding private key can read that message.
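The principle can be demonstrated with a toy key pair, using the small textbook numbers below; real keys are vastly larger, but the one-key-encrypts, other-key-decrypts property is the same.

```python
# Public key encryption in miniature, using deliberately tiny RSA-style keys.
# The public key (n, e) can be published freely; only the holder of the
# private exponent d can reverse the operation.
n, e = 3233, 17        # public key  (n = 61 * 53)
d = 2753               # private key, kept secret

message = 65                       # a message encoded as a number < n
ciphertext = pow(message, e, n)    # anyone can encrypt with the public key
plaintext = pow(ciphertext, d, n)  # only the private key recovers it

print(ciphertext)   # 2790
print(plaintext)    # 65
```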

Public key cryptography works so well that it has been adopted for just about every situation where secure communications are required. RSA, the company that dominates this area, provides details. But public key techniques on their own are not enough to guarantee completely the secure transfer of information.

For even though the encryption techniques themselves are secure, there is the issue of identity. When you send a message to someone using their public key, you can be sure that only the owner of that key can read it. However, you cannot be sure that the key really belongs to the person you wish to communicate with. It would be easy for somebody to impersonate an individual online, and make available his or her own public key. Then any messages you sent using this key would be read by the impostor.

To get around this fundamental problem, the concept of digital certificates has been devised. The idea is simple: to associate an identity with a given public key. The certificate is provided by a certification authority. The authority is responsible for establishing that a given public key does indeed belong with a given individual. This can be done at various levels, ranging from little more than a simple confirmation that the person concerned has claimed the key, up to higher levels where individuals must present themselves in person with personal documents to prove their identity along with their public key.

As well as details about a public key and the individual associated with it, a digital certificate also includes information about who issued it and when it expires. The official standard is called X.509 v3, for which full details and a simple diagrammatic representation are available online.

The other crucial element of a digital certificate is that it is signed by the certification authority. A probabilistically unique number is generated (using a one-way hash function) from the information about the individual, public key and authority, and then encrypted using the private key of the authority. The resulting number is added to the other entries to form the certificate.

The signature allows prospective users to check that the certificate has not been tampered with. If it has, the result of the hash function will be different from the original. The original can be obtained by applying the certification authority's public key to the encrypted hash value appended to the certificate. It is not possible to forge the signature, since the authority's private key is required to produce it.
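Putting the pieces together, the toy sketch below signs an invented set of certificate fields with a miniature key pair and then verifies the result; the hash (SHA-256 here) and field names are purely illustrative.

```python
# A toy illustration of signing and checking a certificate, following the
# steps described above. The certificate fields are invented, and the tiny
# RSA-style key pair stands in for the authority's real key.
from hashlib import sha256

n, e = 3233, 17      # authority's public key (toy-sized)
d = 2753             # authority's private key

certificate = "CN=Jane Bloggs; public-key=...; issuer=Example CA; expires=1999-12-31"

digest = int(sha256(certificate.encode()).hexdigest(), 16) % n  # one-way hash
signature = pow(digest, d, n)          # only the authority can produce this

# Anyone can verify: recompute the hash and compare it with the value
# recovered from the signature using the authority's public key.
recovered = pow(signature, e, n)
assert recovered == int(sha256(certificate.encode()).hexdigest(), 16) % n
```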

This does pose another problem: how do you know the public key supposedly of the authority really belongs to it? The answer is that the authority in turn has a digital certificate that allows its public key to be verified, which requires another certifying authority. In fact, the whole certification authority approach pre-supposes a hierarchy of authorities, with a central one sitting at the apex of each authentication pyramid. Among the leading commercial authorities are VeriSign and GTE.

Using certificates for Internet transactions

The beauty of digital certificates is that although the mathematical techniques that underlie them are complicated, using them is simple. Indeed, most of the time people will not be aware they are operational at all: everything is handled automatically by the relevant software.

A good example of this is offered by what is probably the most important public application of certificates: electronic commerce using credit or debit cards. One of the key considerations here is security, ensuring that card details remain hidden from third parties at all times. But in fact there are two other issues that also need to be addressed: privacy and non-repudiability.

Privacy means that in addition to keeping card details hidden from potential eavesdroppers, they should also, ideally, remain unknown to the vendor. All that the latter requires is confirmation from the company issuing the card that the requested payment will be made.

In this respect the widely-used SSL protocols are unsatisfactory: although they do indeed protect the card details in transit, they disclose them to the merchants, who might use them for fraudulent purposes. Another requirement not met by SSL is the need for the electronic equivalent of physical receipts. That is, it is important to be able to demonstrate that a sale was made and agreed to by both parties.

The new Secure Electronic Transaction (Set) protocols, recently finalised by the main developers, Visa and Mastercard, offer all three benefits, and make use of digital certificates extensively. Three parties are involved: in addition to the purchaser and vendor, there is also something known as a payment gateway, which acts as a gateway through to existing financial systems.

All three parties require digital certificates to use Set. These will be obtained from a Certificate Authority (CA), as explained above. They enable the three parties to obtain and trust the public keys necessary for encryption. The use of public keys also provides non-repudiability: if messages can be decrypted using a given public key you can be sure that they were encrypted with the corresponding private key. Since only one person should have access to this key, any message must have come from that person, and so is binding.

Privacy is provided by using the payment gateway as an intermediary: card details are sent by the purchaser to the gateway, which contacts the card issuer to obtain authorisation. The gateway then confirms the authorisation to the vendor, which never sees card details. Both purchaser and vendor can trust the gateway because it has a digital certificate, which is guaranteed by a Certificate Authority.
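The sketch below is not the actual Set message format, merely an illustration of the flow just described: the card details travel only as far as the payment gateway, while the vendor sees nothing more than a signed authorisation reference.

```python
# Not the actual Set message formats -- just a sketch of the flow described
# above, showing that card details go only to the payment gateway, while the
# vendor sees nothing more than an authorisation reference.

def purchaser_places_order(card_number, amount):
    # In Set the card details would be encrypted with the gateway's public
    # key (taken from its certificate) so the vendor cannot read them.
    payment_slip = {"card": card_number, "amount": amount}
    order = {"items": ["widget"], "amount": amount}
    return order, payment_slip

def gateway_authorises(payment_slip):
    # The gateway contacts the card issuer over existing financial networks.
    return {"authorised": True, "reference": "AUTH-0001",
            "amount": payment_slip["amount"]}

def vendor_fulfils(order, authorisation):
    # The vendor trusts the gateway's authorisation; it never sees the card
    # number itself.
    assert authorisation["authorised"]
    return f"Shipping {order['items']} against {authorisation['reference']}"

order, slip = purchaser_places_order("4111 1111 1111 1111", 49.99)
auth = gateway_authorises(slip)
print(vendor_fulfils(order, auth))
```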

The open Internet is an obvious arena for the use of digital certificates. Potentially you are dealing with complete strangers who may be located the other side of the globe, so the need for some kind of authorisation is all the more pressing. Digital certificates can also be applied to E-mail to ensure full confidentiality, for example using the standard called S/Mime. Certificates may also offer a solution to the growing problem of junk E-mail: a company might choose to accept E-mail only from correspondents with certificates that can be checked first.

Another important use of certificates is on intranets, where they can be employed to establish permissions for accessing various levels of sensitive information. One advantage of using certificates here is that it is not necessary to obtain them from an external CA. Instead, a certificate server (for example from Netscape) is added to the intranet to handle certification and verification.

An obvious extension of employing certificates to determine access privileges is over an extranet. Here, though, external Certificate Authorities will probably be involved since each participating company's intranet may well have its own certificate server whose trustworthiness will need to be confirmed by an independent authority.

Digital Signatures

One of the ideas put forward for addressing the perennial problem of knowing when it is safe to download and run programs from the Internet is that of digital signatures, an approach Microsoft is suggesting.

The idea is that if you know with absolute security who something came from, you can gauge the likelihood of its being hazardous for your computer. So if you can be sure a program comes from a major software firm, say, you might be rather happier running it than if it comes from an undergraduate at a Bulgarian university.

To achieve this certainty, a digital signature is used. Although meaningless in itself - it is just a string of bits - it can be used to show that the software it refers to must have come from a certain person. This is because signatures employ the same principles as encryption programs like Pretty Good Privacy: using the public key of a person or entity it is possible to check that the signature could only have been produced by that person or entity from the program it comes with.

The technique usually involves what is called a hash function that takes the program and derives from it a unique number. This number can then be encrypted with the private key of the program's owner.

If you download the program, you can use the purported owner's public key to recover the original hash value from the signature, and then check it against the output of the same hash function applied to the program. If they match, the program must have come from its claimed sender. If not, you should steer clear.
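As an illustration, the sketch below uses the third-party Python 'cryptography' package (any library offering RSA signatures would serve equally well); the key pair and program bytes are invented, but the sign-then-verify steps mirror the description above.

```python
# A sketch of checking a downloaded program against its digital signature,
# using the third-party 'cryptography' package (an assumption -- other
# signature libraries work similarly). The key pair and program are invented.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# The publisher generates a key pair once and distributes the public half.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

program = b"\x7fELF...the bytes of the downloaded program..."

# Publisher: hash the program and encrypt the hash with the private key.
signature = private_key.sign(program, padding.PKCS1v15(), hashes.SHA256())

# Recipient: verify the signature using the purported publisher's public key.
try:
    public_key.verify(signature, program, padding.PKCS1v15(), hashes.SHA256())
    print("Signature checks out -- the program came from the key's owner.")
except InvalidSignature:
    print("Signature mismatch -- steer clear.")
```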

Directory-enabled networks

It is ironic that the very things that join people together - networks - should be so unconnected when it comes to detailed control of resources. Networks tend to exist almost in isolation, as separate entities that other core elements are plugged into. The Directory-Enabled Networks initiative aims to change that by integrating networks tightly with directories holding information about users and resources.

With such a standard in place, it will be possible to manage heterogeneous networks centrally, and from any location. But more importantly, it will be possible to provide network services on a completely individual basis. Using a directory, network attributes for users can be set according to their individual needs.

Currently, network bandwidth allocation, for example, tends to be on an all-or-nothing basis. With directory-enabled networks, different departments can be granted greater or lesser rights to corporate bandwidth. Similarly, key individuals can be given enhanced access privileges, perhaps for the duration of a special project.

Once this kind of fine-grained control is possible, other advanced network capabilities such as volume-based differential pricing and Quality of Service guarantees are likely to become more attractive.

The two main players driving the initiative are Microsoft and Cisco. Many other companies have also joined the movement, and it has now been passed over to the independent Desktop Management Task Force.

Distribution and Replication Protocol

Every Internet user waiting for Web pages or files to download knows that you cannot have too much bandwidth. Unfortunately normal Internet use is not particularly efficient as regards bandwidth utilisation, which means that the network's overall performance is even less than it could be.

An example of this inefficiency involves the updating of files. When information or data is updated it is nearly always the case that all of the files in each logical unit are sent over the Internet. Since many applications only change a small percentage of their constituent parts, this is manifestly extremely wasteful.

This was what made Marimba's approach with its Castanet product so interesting. There, Java programs could be updated incrementally, so that only those elements that had changed since the last download were re-sent. This meant great savings in bandwidth utilisation.

Marimba went on to extend its techniques to any programming language. Now it has gone even further, and proposed its general techniques to the World Wide Web Consortium as an open, non-proprietary standard for updating data, files and content. It is called the Distribution and Replication Protocol, and is designed as an extension of HTTP.

This eminently sensible proposal has the backing of most of the industry, except for Microsoft, which is probably concerned about losing control over what will doubtless prove a key element of software distribution in the future. To muddy the waters even further, another company, Novadigm, is claiming that elements of the proposed standard "may infringe its property rights".

Document Style Semantics and Specification Language

HTML is used for creating documents for display in a Web browser. Strictly speaking the markup used in a Web page is about the underlying structure, rather than its presentation, but for years HTML has been used as a way of creating a particular effect on the page.

The new Extensible Markup Language (XML) is much closer to HTML's original roots in the Standard Generalised Markup Language (SGML). This is purely about structure, and XML too says nothing about how any documents produced from its files will look. This task is left to a separate style sheet.

Style sheets have become more familiar with the support for cascading style sheets provided by the recent versions of Internet Explorer and Navigator. Style sheets allow advanced design elements to be created without the current HTML tricks, and for a single house style to be applied to different Web documents through the use of a separate style sheet file.

In the SGML world, style sheets are created in the Document Style Semantics and Specification Language, or DSSSL. As its rather intimidating name indicates, this is a very rigorous approach which allows SGML documents to be processed in a variety of powerful ways, including the arbitrary re-arrangement of elements.

Given XML's SGML heritage, it is perhaps not surprising that the emerging standard for XML style sheets, called the Extensible Stylesheet Language, draws heavily on DSSSL. Although this means that Extensible Stylesheet Language is unlikely to become widely used, it holds out the promise of some powerful applications when combined with XML.

Document Type Definition

Hypertext markup language (HTML) is sometimes described as a subset of the standard generalised markup language (SGML): that is, people understand HTML as a kind of cut-down version of SGML.

In fact, HTML is an application of SGML: an example of how the completely general SGML can define a particular document structure, in this case that of a Web page. That document structure is encapsulated in the document type definition (DTD). SGML is the language used to craft the DTD: it is a meta-language which can be used to create practical mark-up languages like HTML.

The full definition is never seen in HTML documents, although sometimes there is a vestigial reference found as a first line, such as <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">. This essentially provides a cross-reference to the DTD used for the document.

How complicated DTDs are depends on how much structure you want to build into a document. The DTDs for HTML have become progressively more complex as features have been added; that for HTML 4.0 runs to tens of pages. DTDs contain abstract details about the markup that is permitted and what structure it has - which elements can be placed within others. They also specify the characters that can be used in a document following that definition.

One of the main functions of a DTD is to let software check automatically whether a given document obeys all the relevant rules. Given the number of proprietary extensions to HTML it is probably just as well that such rigour has rarely been applied to Web pages.
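For the curious, the sketch below shows such automatic checking in miniature, using the third-party lxml package and a toy DTD and documents invented for the purpose.

```python
# A sketch of the automatic checking a DTD makes possible, using the
# third-party lxml package (an assumption -- other SGML/XML tools offer the
# same facility). The DTD and documents here are toy examples.
from io import StringIO
from lxml import etree

dtd = etree.DTD(StringIO("""
<!ELEMENT note (to, body)>
<!ELEMENT to   (#PCDATA)>
<!ELEMENT body (#PCDATA)>
"""))

good = etree.XML("<note><to>Bloggs</to><body>Hello</body></note>")
bad = etree.XML("<note><body>Missing recipient</body></note>")

print(dtd.validate(good))   # True  -- obeys the declared structure
print(dtd.validate(bad))    # False -- the 'to' element is missing
print(dtd.error_log.filter_from_errors())   # explains what went wrong
```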

Domain Name System

The Domain Name System, or DNS, lies at the heart of the Internet's approach to addresses. When you send a message to bloggs@acme.co.uk or use the Web browser to download the home page from WWW site www.ibm.com, your message and request for data must somehow find their way through the tangled mesh of networks that make up the Internet to precisely the right host at the right site.

This is achieved by converting a host name such as www.ibm.com into a numerical Internet address of the form 192.147.13.12, a process known as performing a host look-up.
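Performing a host look-up is a one-line affair in most programming languages; the Python sketch below uses the standard socket module, and the address returned today will not necessarily match the example figures quoted here.

```python
# A host look-up in one line: the Domain Name System converts a name into
# the numerical address the network actually routes on. The address printed
# will be whatever the name resolves to at the time of running.
import socket

print(socket.gethostbyname("www.ibm.com"))   # a dotted-quad address such as 192.147.13.12
```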

Ideally there would be a central registry where the Internet sites with names such as www.ibm.com were held. But with millions of users joining the Internet every month, this is simply not possible.

Instead a hierarchical naming system is used whereby there are registries for local areas or domains, each small enough to be kept up to date. For example, within a company there may be a database of individual users, and within a country or group of countries there will be a database of companies.

Then, if an address is required for a site that lies outside the home country, a request can be sent to the database in the destination country. This, in turn, may send on the request to a sub-database handling a smaller region or a company, which can provide the address in question to the original enquirer.

Sub-domains

The main domains are familiar to anyone who has entered a few URLs: in the US, there is .com for the commercial domain, .edu for the academic world and .gov for government bodies. In the UK, the equivalents are .co.uk, .ac.uk and .gov.uk, where the country domain .uk is necessary because the default is always the US.

Nonetheless, the .us domain does exist, and is being used more and more by those who are conscious that the proportion of US sites as a percentage of the whole is diminishing steadily. Many that use the .us domain also adopt the useful policy of putting their state as the next sub-domain. For example, ca.us is California, fl.us is Florida and so on.

The rest of the online world seems to fall into three camps: those that broadly follow the US sub-domains, such as Australia, which uses com.au and edu.au; those that follow the UK's approach, such as South Africa (ac.za and co.za); and those which dispense with sub-domains altogether.

For example, Germany tends to adopt domain names that are descriptive, such as bauphysik.architektur.uni-kassel.de.

Many other countries take this approach.

The new domain names

As more businesses take to the Internet the subject of domain names is becoming critical. The basic problem is that the DNS system - used for converting human-friendly names like www.ibm.com into IP addresses - was never designed to cope with the millions of names or complex legal issues such as trademarks that are both facts of life on today's Internet. As a result, the current structure can no longer cope, and it is clear that changes must be made.

The first domain name registry, run in the US by InterNIC, has grown into a more distributed organisation. Alongside the main registry (http://rs.internic.net/rs-internic.html, now administered by the company Network Solutions), there are other bodies, for example the European centre RIPE (http://www.ripe.net/), which co-ordinate name allocation in other regions. At a more local level are the national registries (http://www.uninett.no/navn/domreg.html) - in the UK there is Nominet (http://www.nic.uk/). Each of these registries is responsible for allocating the domain names within its particular area.

As well as national domains such as .uk, and .fr, there are general domains, including .com, .net and .org. While the former are functioning well, with new local sub-domains created where necessary - for example, there is now one called police.uk in the UK domain alongside the traditional co.uk and ac.uk etc - the .com domain is showing signs of strain. Originally designed for companies within the US, .com has become a global domain.

As a result of the increased demand, most of the memorable names have been allocated and - more seriously, perhaps - many thorny issues relating to trademarks are beginning to arise. For example, a company with a valid trademark in one region might register the name in the .com domain and thereby block another company's equally valid use of the same trademark elsewhere.

This has also led to the rise of what is often called trademark extortion: the registration of well-known trademarks as domain names in the hope of selling them to their rightful owners for a suitably large sum. It has therefore been clear for some time that the general top-level domains - those not within national domains - needed expanding, and the procedures for arbitrating disputes revising.

The main Internet body involved in this area is the IANA (Internet Assigned Numbers Authority), since it is responsible for handing out the corresponding IP numbers (into which the domains are converted by the world's DNS servers). The IANA therefore requested another venerable Internet body, the Internet Society, to create a committee to draw up suggestions for an overhaul of general domain names.

Called by the rather uninspired name of the International Ad Hoc Committee, this drew on some of the key Internet and international trade organisations. As well as the IANA and the Internet Society, there were representatives of the Internet Architecture Board (which oversees the more technical aspects of the Internet, such as the IETF working groups) and the US Government's Federal Networking Council (FNC, perhaps best-known for the image it created of the Internet's US backbone structure).

The other members came from the International Telecommunication Union (which uses the rarely-used .int domain), the International Trademark Association and the World Intellectual Property Organisation. The IAHC has now produced its proposals and, perhaps surprisingly, will be implementing them almost immediately. The consequences and longer-term implications of this new initiative are potentially very important for businesses, and this is very much the beginning of a process that promises to be both long and painful.

The proposals from the IAHC for new domain names and other changes to the whole area of the DNS system can be found online. There are three main areas: domain names, registries and the resolution of trademark disputes. The IAHC has come up with seven new generic top-level domains.

The new domains (as well as the older generic ones at a later date) will be allocated by up to 28 new registrars. This number is arrived at by allowing up to four registrars in each of seven regions in the world. These new registrars will not supersede the current national registries but offer an alternative to them, and outside their national hierarchies. Rather bizarrely, the four registrars in each region will be chosen by lottery, subject to the fulfilment of certain conditions relating to business and technical practices.

One important innovation of these new registrars is that they will be competitive: potential registrants will be able to pick and choose among them - on the basis of price, for example. One of the main criticisms of the current system for the .com domain has been the monopoly held by Network Solutions. The presence of multiple registrars around the world means that it should be possible to reduce the cost of registration considerably. If this system works well, the plan is to add even more registrars, at the rate of twenty to thirty a year.

However, this competitive system means that some way of unifying the multiple registrations must be available. This will be achieved through the creation of a shared repository, run by an independent and neutral third party. There will also be a further independent and neutral third party to run the master DNS server: this is necessary so that the new domains are integrated into the current Internet system. Without it, requests to sites using these new domains could not be routed.

On the trademark front, one proposal is for the creation of trademark-specific domain name spaces. These include a new sub-domain, tm.int, for international trademark registration, as well as national versions (such as .tm.uk). France already has such a sub-domain. On the important issue of resolution of trademark disputes, the proposals include a clause in the application form for domains that binds the applicant to agree to participate in either online mediation under the rules of the Arbitration and Mediation Centre of WIPO, or else in a binding expedited arbitration under these rules.

Leaving aside important questions of how these proposed dispute resolution procedures will work in practice - and especially how they will fit with national and international law in this area - there are some more general issues surrounding these IAHC proposals. First, there is the question of how the names were arrived at: the suggested new domains seem arbitrary and are certainly heavily-biased towards the anglophone world.

Secondly, they are being rushed through, without consultation, in a way that is bound to make their implementation much harder - see, for example, Nominet's comments. Finally, even before the proposals have been implemented, the IAHC has a lawsuit to fight. All in all, it seems certain that much more work will need to be done in this area - and that companies will find themselves struggling with problems of domain names for some time to come.

Enter the weird domain of the Internet names

US presidential special adviser Ira Magaziner has created a furore by coming up with his own proposals for the Net naming system (January 1998).

The new domain name system proposed by the Internet Council of Registrars, announced over a year ago, was to have come into force around now.

Under this scheme, new domains ending in .firm, .store, .web, .arts, .rec, .info and .nom would have been added to the national domains such as .uk and .fr, and the generic domains .com, .net and .edu, which are widely treated as international in scope, even though strictly speaking they apply only to the US.

At the same time, some 28 registrars were to have been created, each able to offer any of these domains, as well as the pre-existing generic domains. The new domain system has not been activated, even though the contract with Network Solutions, the company that has a monopoly on handing out generic domain names, comes to an end in September. The delay is partly due to the US government's taking exception to the proposal to move ultimate control of the Internet's name space from the US to Geneva. In a move to head off this development, president Bill Clinton's special adviser on the Internet, Ira Magaziner, has drawn up his own proposals.

These were based on comments sent in by interested parties last year, partly as a result of their unhappiness with the council's approach. The draft of Magaziner's plan is at www.ntia.doc.gov/ntiahome/domainname/dnsdrft.htm

A splitting headache
Magaziner's scheme is fairly complicated. It involves splitting Network Solutions into two nominally independent parts, one providing registrar services (allocating names) and the other running the registry (the central database holding names and the corresponding IP addresses).

In addition, more or less anyone would be able to become a registrar and hand out names, while a small band of new registries would be created, each dealing with one of up to five new top-level domains (as yet unspecified).

However, the details are still very unclear; the Magaziner document asks for further input about many of its ideas, and leaves open key questions such as what the new top-level domains would be, who would be allowed to run the database registries, and how copyright clashes would be handled.

As the wealth of information at the Internet council's Web site indicates, the old plan may have been unpopular with some people, but at least it was complete and detailed. Some of Magaziner's ideas may have merit - though the general approach is very US-biased - but the manner in which the US government is trying to impose them is little short of disastrous.

Currently, it is not clear whether the council's proposals will simply fade away, whether they will be partially subsumed by whatever ultimately comes out of Magaziner's proposals, or even whether there will be a kind of Internet schism, with some DNS look-ups supporting the new council domains, and others following some new body set up by Magaziner.

An Internet council press release of 15 January 1998 cited March as the date for initiating its new domain names. A press release of 30 January, responding to the release of Magaziner's draft, seems to indicate that the Internet council is intending to press ahead.

Trial of strength
The Internet has already had a foretaste of how this might happen. At the end of January 1998, just as the Magaziner proposals were being finalised, the current root of the DNS system was bypassed, and a mirror system used instead. This was not the action of some crazed cracker, but a deliberate show of strength by Jon Postel, who is in charge of the Internet Assigned Numbers Authority, the official body that hands out blocks of IP addresses.

The authority is working closely with the group behind the council system but would lose its current role under the Magaziner plan. The fact that Postel was able to hijack the entire routing system of the Internet hints at just what might happen if mavericks decide to implement the council proposals regardless of the US government's attempts to steamroll its own plan through.

In March 1999, after a lot of talk and little action over the previous two years, the new body charged with sorting out the mess, the Internet Corporation for Assigned Names and Numbers (Icann), finally got round to issuing proposed guidelines on how it intends to license new Internet domain name registrars.

As a result of this incredibly long and wearisome process, most users have probably assumed that either nothing much will ever happen, or if it does, it won't make much difference to the way they use the Internet for business. Unfortunately nothing could be further from the truth. For hidden within the draft guidelines from Icann are some potentially damaging proposals that could fundamentally change the way people use the Internet.

These proposals arise out of the white paper in which, as part of the process of privatising the DNS registry system, the US government asked the World Intellectual Property Organisation (Wipo) to "develop recommendations for a uniform approach to resolving trademark/domain name disputes involving cyberpiracy" among other things.

Although "ICANN's full consideration of the Wipo recommendations should await the final Wipo report", as the draft guidelines say, "many of the Wipo recommendations appear to serve the goals of the accreditation process," and ICANN seems likely to follow the final Wipo recommendations.

The background to these recommendations can be found online, as can the draft itself.

DNS round robin

For most people, the front door of commercial sites is represented by the domain name: www.ibm.com etc. But as far as the computers are concerned, what really counts is the IP address: 204.146.17.33 in the case of www.ibm.com. The technology used to map the domain name on to the IP address is the Domain Name System (DNS). The fact that most users are unaware of its operation is a testament to how efficiently it functions generally. Indeed, it is only when name look-ups fail that the system manifests itself.

Another situation where the DNS is working magic is particularly relevant to electronic commerce sites. Clearly, it is important never to turn away visitors, since they are all potential purchasers. This means that commerce sites must be able to scale to cope with visitor numbers.

Adding the hardware is simple enough, but a problem arises with the configuration. Since each machine must have a unique IP address, it would seem that different domain names must be used. This is a disaster in marketing terms, since users would never remember to rotate among them to ensure an even load at the site in question.

Fortunately this is not necessary, since DNS offers a facility called round robin. Instead of holding just one IP address to be associated with a domain name, several can be listed. Each of these is supplied in turn when a look-up is made, ensuring traffic is distributed evenly among the machines at a site, though not with any account taken of either machine power or current load.
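The effect is easy to observe: a look-up returns the full list of addresses registered against the name, as the Python sketch below shows (the host name is illustrative, and a single-address site will simply return a list of one).

```python
# Round-robin DNS in action: a single name can map to several addresses, and
# a look-up returns the whole list. Successive queries rotate the order, so
# traffic is spread across the machines. The host name is illustrative.
import socket

name, aliases, addresses = socket.gethostbyname_ex("www.example.com")
print(addresses)        # e.g. ['204.146.17.33', '204.146.17.34', ...]

# A client typically just takes the first entry, so the rotation performed
# by the DNS server is what balances the load.
print(addresses[0])
```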

Domain squatting

The domain name system was designed to allow numerical Internet addresses in the form 123.45.67.89 to be replaced with the more easily remembered bloggs.co.uk. But as has happened so often in recent years, what began as a neat engineering shortcut has grown into a pivotal facet of the Internet for business use.

Now this address mnemonic has become nothing less than one of the key elements of a company's Internet presence. As a result, domain names have passed from rather unremarkable synonyms for 32-bit IP addresses into hot intellectual properties with a quantifiable value.

And, of course, once money starts to enter the equation, there are always sharp operators prepared to create and exploit the market. The world of Internet names is no exception, and something known as domain squatting has become a familiar if rather regrettable aspect of domain name selection.

The practice involves the registration of domain names that are likely to be sought after by other users; typical targets will be major organisations and their brands along with common words. The theory (and sometimes practice) is that when the company decides it wants this domain it has to pay the squatter a considerable sum to give up the name.

Unfortunately, with the confused state of law on the use and abuse of trademarks online, it is not clear what recourse firms have against these speculators. Until the legal situation has been clarified - and the domain name approach extended to resolve the shortage of "good" names - such cyber squatters are likely to remain part of the business Internet scene.

Hidden within the draft guidelines from Icann (discussed above) are some potentially damaging proposals. The most worrying lies in the reasonable-sounding paragraphs dealing with a new dispute procedure (paragraphs 139 onwards). The key point is that "each domain name applicant would, in the domain name registration agreement, be required to submit to the procedure if a claim is initiated against it by a third party". The idea is to simplify the process of winning back trademarks that have been wrongly registered by what are often called cyber-squatters.

Although companies will applaud this aim, they might want to consider the strong criticism of the proposed way this will be done voiced by a certain Michael Froomkin. An academic specialising in Internet law, Professor Froomkin is not only an expert in this whole area, but possesses the further inarguable advantage of being one of the group of experts Wipo called upon to help formulate the draft recommendations.

So when Froomkin writes: "It seems transparent to me the result of such a scheme [the new arbitration system proposed by Wipo] - at least in the hands of minimally competent trademark counsel - is to create an enormous opportunity for brutally effective blackmail on the part of trademark holders against ordinary people", his bleak vision needs to be taken seriously.

Froomkin has put together a long and closely-argued analysis of the Wipo recommendations, complete with comments and proposals. It should be read by anyone in business who cares about the free and efficient running of the Internet. This is because the threat to "ordinary people" that Froomkin identifies actually applies just as much to companies, which would find their ability to register innovative and legitimate domain names greatly hampered.

Unfortunately the Wipo consultation process has been very low key to the point of invisibility, and it is already too late to provide input on its latest ideas (though you could still try).

However, Icann may still be open to comments on its guidelines (March 1999), which will draw on the final Wipo proposals, at the address comment-guidelines@icann.org.

Downsampling

Visitors are obviously crucial to nearly every business site on the World Wide Web. But once electronic commerce is added to the mix - in the form of online sales, say - they become its very life-blood. Maximising the time they spend at a site - and their purchases - is a central concern of those running such sites.

Fortunately, companies using Web servers as their chosen tool have a very powerful weapon at their disposal - one that marketing departments in conventional sales environments would pay dearly to have. For from the time that a visitor enters a site, every move they make is recorded in a visitor log.

Information about how shoppers use physical stores can be obtained only through painstaking and expensive market research; but the visitor log contains all this for every single visitor to a Web site.

Hidden within this log, then, is marketing gold: the common patterns and overall trends that can tell a company how to spend its resources more effectively. The trick, of course, is extracting it.

Here, the company offering online purchases finds itself in a rather novel situation: unlike its peers in the ordinary world of shopping, it has too much data. A busy site can generate millions of log entries every day, making analysis a daunting task.

To get round this problem, the technique of downsampling can be employed. Rather than trying to analyse the entire log, only those visits that began in a given fraction of every hour are used, and the results then extrapolated.
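The idea can be sketched in a few lines of Python; the log format below is invented for illustration, and the sampling window (the first six minutes of each hour) is arbitrary.

```python
# A sketch of downsampling a Web server log: analyse only the visits that
# began in the first six minutes of each hour (a tenth of the traffic) and
# scale the results back up. The log format here is invented for illustration.
import csv
from datetime import datetime

SAMPLE_FRACTION = 0.1          # first 6 minutes of every hour

def estimated_visit_count(log_path):
    sampled = 0
    with open(log_path, newline="") as log:
        for row in csv.reader(log):        # e.g. "1999-03-01 14:03:12,/basket"
            started = datetime.strptime(row[0], "%Y-%m-%d %H:%M:%S")
            if started.minute < 60 * SAMPLE_FRACTION:   # falls in the sampled slice
                sampled += 1
    return sampled / SAMPLE_FRACTION                    # extrapolate to the full log

# print(estimated_visit_count("visits.log"))   # estimated total visits
```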

Dynamic Addressing

To be truly "on" the Internet, you need to be identifiable by a unique Internet address of the form 123.45.67.89. This address is needed so that other nodes on the Internet - World Wide Web sites, FTP servers etc - can route to you the various information and data that you request. It is normally given to you either directly by your Internet supplier (e.g. Demon or City Scape) if you have a personal account, or else by your network manager in the case of a corporate connection where blocks of addresses have been allocated to the company in question.

However, it is not absolutely necessary that you have the same Internet address every time you log on as far as most Internet services are concerned. For example, an FTP server needs to know your address only at the moment that it sends a file, and is indifferent to what it was last week.

Many Internet providers and network managers take advantage of this fact to allocate addresses dynamically: each time that you log on (either to the Internet supplier, or to the corporate network) you are assigned an Internet address for that session from a pool of them.

The advantage is clear: instead of needing enough Internet addresses for every possible user, only enough to cover the demand for the maximum simultaneous number of users are required.

Whether you have a dynamic address or a static one (that is, the same for all Internet sessions) only really surfaces when it comes to setting up the TCP/IP functionality that hooks you into the Internet, as with the Windows 95 software.

Using the dial-up networking feature of Windows 95 you can set up multiple SLIP or PPP Internet addresses (though SLIP functionality has to be added separately: PPP is the default protocol throughout). Both of these can employ either static or dynamic addressing. Moreover, unlike the original TCP/IP set-up process that requires you to restart your PC every time that you make a change, this dial-up Internet connectivity is active immediately.

Although setting up this new TCP/IP functionality is much easier than the alternative approach, it is still likely to be beyond most users (not least because the jargon used is so unfamiliar). Recognising this, Microsoft has written what it calls an Internet Set-up Wizard (similar to the other Wizards now found throughout its software) that guides you through the process step-by-step using a series of screens with simple questions. Once you have answered all the questions, the dial-up Internet connection is created automatically.

The Internet Set-up Wizard is supplied as part of the Internet Jumpstart Kit, itself found on the Windows Plus! CD-ROM, sold separately, which adds various kinds of features to the basic Windows 95 product. Also of note here is the fact that the Windows Plus! disc contains Microsoft's Web browser, called Internet Explorer.

Based on the original Mosaic browser, which has been licensed by Microsoft, IE possesses a number of interesting features. For example, URLs (which are called shortcuts in Windows 95-speak) can be dragged from a Web page directly on to the desktop: double-clicking on them later causes an Internet connection to be made, Internet Explorer to be loaded and the relevant page accessed without further intervention.

Perhaps most fascinating of all, though, is another option of the Internet Set-up Wizard. As well as allowing you to use your own Internet supplier (by giving the telephone number, login name, password etc.) you can also choose to access the Internet via the Microsoft Network. This area signals perhaps the biggest change in Microsoft's strategy.

When you choose this option, an account is created for you with the Microsoft Network that allows you to access it not in the normal way, but with a full TCP/IP connection, using the built-in TCP/IP functionality of Windows 95. This lets you access all of the usual areas of the Microsoft Network, but also allows you to jump anywhere on the Internet (indeed, you can use the Microsoft Network and Internet tools simultaneously).

So, in effect, the Microsoft Network becomes just another part of the Internet, rather than a separate online service with limited gateways to it (the original plan). This enormously important shift of emphasis means two things: first, that Microsoft has realised that it cannot compete with the Internet, but must work with it; and secondly, that as a result the general rush to the Internet by companies offering services and the public using them is about to accelerate.

Dynamic Host Configuration Protocol

Despite the unplanned topology of the rag-bag collection of networks that go to make up the Internet, data packets do arrive (usually) at their intended destination, thanks to the addressing scheme employed. However, there are a couple of problems with the use of these addresses - numbers of the form 123.45.67.89 - which are assigned to the millions of Internet nodes.

First, the scheme rather presupposes that an address, once allocated, remains assigned to a particular computer that sits on a particular network. Unfortunately, the increasing use of mobile computers means that users may often log in to different parts of a network or even entirely different networks. Re-configuring the DNS tables to take cognisance of this fact soon becomes an administrative nightmare.

The second problem is perhaps rather deeper. The growth in the popularity of the Internet and the way that blocks of addresses are typically released means that there can often be a shortage of Internet addresses within a company.

The way round both of these problems is to use the Dynamic Host Configuration Protocol (DHCP) to allocate Internet addresses on the fly - dynamically - and temporarily. When someone logs on to any network supporting DHCP, they can be allocated an IP number for the duration of the session. This ensures that data packets are routed to them correctly wherever they are.

Moreover, the fact that the address is only 'leased' to them means that it can be used by another user when the first has finished. In this way a limited pool of Internet addresses can serve a far larger number of users who log on for only part of the time.
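A minimal sketch of the leasing idea itself (not of the DHCP protocol exchange, which involves broadcast discovery, offers and acknowledgements) might look like this in TypeScript; the addresses, lease period and class names are all invented for illustration.

// Sketch of address leasing only - not a real DHCP implementation.
interface Lease {
  address: string;   // e.g. "10.0.0.2"
  client: string;    // an identifier for the machine
  expires: number;   // timestamp (ms) after which the lease lapses
}

class LeasePool {
  private free: string[];
  private leases = new Map<string, Lease>();

  constructor(addresses: string[], private leaseMs: number) {
    this.free = [...addresses];
  }

  // Hand out an address for the duration of a session.
  acquire(client: string, now = Date.now()): string | null {
    this.reclaim(now);                        // return any expired leases first
    const address = this.free.shift();
    if (address === undefined) return null;   // pool exhausted
    this.leases.set(client, { address, client, expires: now + this.leaseMs });
    return address;
  }

  // Put expired addresses back into the pool so other users can reuse them.
  private reclaim(now: number): void {
    for (const [client, lease] of this.leases) {
      if (lease.expires <= now) {
        this.leases.delete(client);
        this.free.push(lease.address);
      }
    }
  }
}

const pool = new LeasePool(["10.0.0.2", "10.0.0.3", "10.0.0.4"], 60 * 60 * 1000);
console.log(pool.acquire("client-A"));  // "10.0.0.2"

The point is simply that a handful of addresses can serve many occasional users over time, which is exactly the economy DHCP exploits.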

Dynamic HTML

One noticeable trend over the last year has been the move towards interactive Web pages. What were once static presentations of words, images and occasionally sounds are now turning into dynamic multimedia experiences. But this new format is bought at a fairly steep price.

In order to achieve a chameleon-like ability to respond in this way, Web page designers are forced to take one of two routes. One involves generating the varying contents and design of the Web page at the Web server. This has the advantage that there is plenty of processing horsepower, but the disadvantage that interaction with the user involves sending information backwards and forwards across the Internet - with all that this implies in terms of low bandwidth and slow response.

An alternative approach is to use a Java applet or ActiveX control to provide advanced multimedia features in the page. But this too means sending a large number of bytes over the Internet connection - in this case all at once initially instead of spread out over the session.

To get round these problems, a new lightweight kind of interactivity is being developed. In this approach, the processing is carried out at the client end to avoid the delays as data is passed over the Internet; but instead of requiring custom applets or controls to be sent first, the necessary functions are built into the Web client as extensions to HTML and scripting.

Through the use of additional elements within the HTML code, Web pages can change in pre-programmed ways, or respond to user input, immediately and directly, without wasting time or Internet resources.

The World Wide Web has been a continuing story of the triumph of ingenuity over the limitations of the basic underlying HTML technology. Perhaps the most dramatic instance of this has been the way that tables, formalised in HTML 3.2, have been deployed in an attempt to give designers the kind of pixel-level control they are used to elsewhere.

HTML 4, now at the draft stage (November 1997), promises to change all that. It offers the most radical extension of the basic HTML idea since its inception, providing total control over both the end-appearance and the underlying structure. Once HTML 4 is finalised, it promises to transform the Web far more radically than anything we have seen before.

Alongside incremental changes such as better support for international character sets (including right-to-left scripts) and improved accessibility for those with disabilities, HTML 4's most important innovation has to do with the way the basic elements of an HTML page are regarded and handled. Hitherto, HTML tags have simply been ways of describing and displaying elements of a document: headings, lists, paragraphs etc. HTML 4 instead regards them as a set of hierarchical objects. This Document Object Model (Dom), still only a rough sketch at the moment, is not just for the purposes of taxonomy: by turning HTML elements into objects, Dom allows them to be manipulated with great flexibility.

This means, for example, that it is possible to redefine how headings will look in a page, not just as it is loaded, but afterwards too. That is, Web pages no longer need to refer back to the Web server in order to change: they can interact with the user by employing Dom and code within the browser. Because of this ability to change and to respond, the name Dynamic HTML (DHTML) has been given to this new kind of Web page. Its full power is realised when Dom is coupled with scripting languages like Javascript or VBScript.
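As a concrete illustration, here is a sketch of the kind of script a page might carry, written in TypeScript (which compiles to the Javascript a browser runs) and using the standard DOM API for clarity rather than either vendor's 1997 object model; the element id is invented.

// Sketch only: restyle a heading in response to a click, entirely client-side.
const heading = document.getElementById("main-heading");

if (heading) {
  heading.addEventListener("click", () => {
    // The heading is just an object whose properties can be changed
    // after the page has loaded - no request goes back to the server.
    heading.style.color = "crimson";
    heading.style.fontSize = "2em";
    heading.textContent = "This heading was redefined in the browser";
  });
}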

Using these lightweight scripting languages, it is possible to create complex Web pages that possess, for many practical purposes, the full power of ordinary programs. For example, they can respond to events - pressing keys, clicking mouse buttons - redraw their screens and carry out computations just like conventional software. If this sounds too good to be true, it is. Although both Netscape and Microsoft agree that DHTML is the way forward, their implementations of it in version 4 of their browsers differ greatly. This means that, at the moment, developers are faced with the usual difficult choice: either to follow the Netscape or the Microsoft line.

Netscape has a basic DHTML resources page, more detailed technical information and demos.

Microsoft has DHTML pages with heavier stuff and yet more demos.

There is already a pair of excellent books on the different flavours of DHTML: Instant Netscape DHTML (£22.99, ISBN 1-861001-19-3) and Instant IE4 DHTML (£22.99, ISBN 1-861000-68-5).

For those who want to begin coding now, some tools that support DHTML are available. These include Microsoft's Front Page 98, HoTMetaL Pro 4 and Dreamweaver from Macromedia, all with trial downloads. Macromedia also has some good DHTML resources.

Will DHTML end the demand for Java?

In many ways DHTML is an extension of the scripting languages already widespread on the Web. The revolutionary part of HTML 4/DHTML is that it allows scripting languages to access anything on the Web page via all the hidden tags not normally seen by users.

Hitherto scripting was limited by when it could be used and what it could be applied to. A good example of its current application is in the validation of input: when forms request information from users, Javascript or JScript (Microsoft's version of Javascript) can be used to check that dates are entered properly, or that E-mail addresses contain the "@" character, for instance.
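A sketch of that kind of check is shown below; the form and field names are invented, and the standard DOM API is used rather than either browser's 1997-era extensions.

// Illustrative client-side validation; "#order-form" and the field names are assumptions.
const form = document.querySelector<HTMLFormElement>("#order-form");

if (form) {
  form.addEventListener("submit", (event) => {
    const email = (form.elements.namedItem("email") as HTMLInputElement).value;
    const date = (form.elements.namedItem("delivery-date") as HTMLInputElement).value;

    // The classic checks: an E-mail address must contain "@", a date must parse.
    if (!email.includes("@") || Number.isNaN(Date.parse(date))) {
      event.preventDefault();              // stop the form being submitted
      alert("Please check the E-mail address and delivery date.");
    }
  });
}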

With DHTML, Javascript can be invoked at any time, and alter anything on the page. You no longer need to click on objects on a Web page: just passing a mouse cursor over them can trigger events. And using DHTML it is not only possible to change things like fonts and layouts - including the use of superposed and shifting layers - but even the displayed text of a Web document (though the original held on the server remains unaffected).
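For instance, a brief sketch with an invented element id shows how merely moving the pointer can rewrite the displayed text, all within the browser.

// Sketch: no click is needed - passing the mouse over a paragraph rewrites it.
const note = document.getElementById("price-note");

if (note) {
  note.addEventListener("mouseover", () => {
    note.textContent = "Special offer: order today for free delivery.";
  });
  note.addEventListener("mouseout", () => {
    note.textContent = "Move the mouse here for today's offer.";
  });
}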

What is really new about all of this is that there is no reference back to the server for the underlying programming logic. All of this happens on the client, using the scripting code in the Web page and the DHTML engine present in compliant browsers.

For businesses with intranets, this means that browsers can take on more complex roles, and provide richer front-ends. Whereas currently they act as a fairly easy-to-use but limited window onto corporate data, with DHTML they can start to interact with that data without needing to call auxiliary programs on either the client or server. This reduces the traffic on the corporate network and also simplifies the installation and maintenance of users' computers since updated versions of the scripting code can be downloaded along with the Web pages.

In this respect, DHTML begins to look suspiciously like Java. It can create highly-functional programs that are downloaded from a server to run on the client, and it will be completely cross-platform (provided there are DHTML-compliant browsers available).

It is this parallelism that has attracted Microsoft's interest. The ActiveX approach has revealed itself as fundamentally unsuitable as a general rival to Java technology on the desktop (though it still has its place on the server side). This would leave Java as the undisputed victor in the thin client stakes were it not for DHTML. Microsoft is now taking the view that what users want is not a completely new approach like Java but an extension of the very widely-used HTML.

This argument has many merits. HTML, it is true, has proved fundamental to the success of the new Internet/intranet/extranet model. HTML is very lightweight and extremely easy to employ, both for developers and users. And it is now under the aegis of the World Wide Web Consortium, a truly independent (and international) organisation.

But there are also some flaws in Microsoft's approach. Although HTML is very easy to code, DHTML is not. It requires both an understanding of a scripting language like Javascript as well as the new Document Object Model of HTML 4. Putting these two together creates something that is almost as complex as Java, but without the latter's rigour.

And while it is certainly true that many of the early uses of Java to provide advanced interfaces can now be achieved by DHTML, Java has moved on enormously since then. In fact these fairly trivial applets are becoming more and more rare as people realise that they contribute very little other than slowing down the user experience.

As DHTML becomes the standard solution for this kind of interactive Web screen, Java will continue to flourish elsewhere: as a cross-platform solution, a general approach for reducing the costs of writing and maintaining software, and in entirely new domains - smartcards, embedded systems etc. - that DHTML, for all its power, will never reach.

DHTML Editors

The most dramatic new element of HTML 4 is doubtless the implicit support for Dynamic HTML (DHTML), which allows pixel-level placement of elements, stacked layers, animation and wide-ranging interactivity. But this new power has a high price. In order to use all of the dynamic language's new effects considerable programming is required: basic HTML may have been limited, but at least it was simple to write.

Clearly this creates an opportunity for software houses to come up with Web page editors that allow designers to exploit the power of DHTML while shielding them from some or all of its complexity.

The previous generation of HTML editors is represented by SoftQuad's Hotmetal Pro 4 ( http://www.softquad.com/products/hotmetal/ ). Although this is a good tool for creating rigorous Web pages (its roots lie in the world of Standard Generalised Markup Language, of which HTML is an application), and offers support for cascading style sheets, users will need to add advanced DHTML effects by hand.

Front page freedom
Microsoft's Front Page 98, by contrast, goes out of its way to make the addition of these as easy as possible. For example, to add animations and page transition effects you select from pull-down menus, and all the code is generated automatically and added at the appropriate place in the Web page.

Although real Web professionals will find this approach limiting, for those who simply want to create dramatic Web sites with the minimum of work, it is a good solution, and the latest revision of Front Page offers other refinements too. Front Page is naturally geared towards taking advantage of the particular DHTML effects offered by Microsoft's Internet Explorer 4. mBed's Interactor (www.mbed.com/try.html) is rather more flexible in that you can create dynamic pages aimed specifically at Internet Explorer 4 or Netscape's Communicator, Java-based browsers or Navigator with a special mBed plug-in.

Although the code generated depends on the end-environment selected, the process of constructing the dynamic page is the same. Various elements such as images, sounds, buttons, transition effects and paths are added to the basic HTML page. These all have properties that can be inspected (rather like controls in Visual Basic), and can be linked together by choosing from drop-down lists to create simple programs.

This technique allows a wide variety of dynamic effects to be created without the need for hand-coding. However, once again there is a trade-off between ease-of-use and power. In particular, it is hard to edit the underlying code once it has been generated.

If you want rather more control over this while still able to take advantage of automatic code generation, a better bet might be Macromedia's Dreamweaver. This exposes the details of the properties, behaviour and interaction between the various DHTML elements on a page more explicitly, and so allows you to customise the effects to a greater extent.

Dream machine
Dreamweaver offers a good balance between automatically-generated effects and the ability to change HTML pages by hand. But for the real power user the best tool is probably ExperTelligence's Webberactive 4.0.

This offers little in the way of automatic code generation, but compensates by providing perhaps the most complete support for creating complex dynamic Web pages by hand. For example, along with all the usual facilities for inserting HTML tags, anchors, images, tables and frames, it has a powerful tool called Tag Assistant.

This not only shows all of the attributes for each tag, but allows them to be altered directly - again, rather like changing the properties of a Visual Basic control. You can choose which document type definition (see last week's Net Speak) to follow, and the list of available tags changes accordingly. Stylesheets can also be applied, or created separately. Impressive too is a similar facility that offers the available methods, properties and events for relevant page elements when writing scripts. All in all, Webberactive makes writing DHTML pages from scratch considerably easier.

EDI - Electronic Data Interchange

Security fears have hindered the use of EDI on the Net but new encryption techniques are ensuring users' safety

Described elsewhere is how Internet technologies can be used within a company to provide a complete groupware/EIS solution using a standard TCP/IP network. Exactly the same approach can be used in electronic data interchange (EDI), an area currently the domain of proprietary solutions constituting a small and rather fragmented market.

Despite its vague name, EDI is really about document interchange, and special kinds of documents at that: those involved in commercial transactions such as orders and invoices. EDI is designed to save costs and increase efficiency through faster turnaround times.

A good background document to EDI can be found at http://www.imaginet.co.uk/edi/feature4.htm#ABC, while fuller information is at http://www.ecworld.org/, part of the large holdings of the EDI World Institute (at http://www.ediwi.ca:6900/welcome.html).

The two main standards are X12, mostly used in the US (details from http://www.premenos.com/standards/X12/index/setindex.html) and the more recent global standard EDIFACT (EDI for Administration, Commerce and Transport, available from http://www.premenos.com/unedifact/).

Traditionally EDI has been conducted over special Value-Added Networks, or Vans. This was necessary to offer the required security (vital when financial transactions were involved). However, the downside has been the appearance of a number of rival Vans, resulting in compatibility problems. For example, for a company and one of its suppliers to use EDI both would have to agree on a common Van.

It is against this background that the Internet offers the perfect solution. It is a global, open standard, and so choosing it is a vendor-neutral decision. Most large companies are already attached to the Internet, and so adding EDI functionality does not require any additional communication infrastructure to be created (though obviously software will be required).

There are two main routes for Internet EDI: E-mail, which uses the fact that documents are involved, and FTP, which transfers them as binary files.

The benefits of using the Internet for EDI seem so compelling - no extra costs from using Vans, which traditionally have charged highly for their specialised services; instant, worldwide compatibility; and the ability to integrate EDI into other corporate network functions - that it might be asked why it has not become more common.

Of course, the principal problem of Internet EDI is security: as an open system, the Internet is not exactly watertight, and so sensitive documents cannot be sent over it directly. However, encryption techniques are now well-established for Internet services, and so wrapping up EDI documents appropriately is a straightforward task.

Nor is adding the other EDI paraphernalia that hard: there is both an Internet draft ( http://www.va.gov/publ/standard/edifaq/index.htm) and RFC text (RFC 1767 at ftp://ds.internic.net/rfc/rfc1767.txt) that spell out how to do this.

Moreover, there is even a commercial Internet EDI product already available, called Templar ( http://www.templar.net/). This has been written by Premenos, perhaps the leader in this field. As a couple of the URLs above show, Premenos also offers useful EDI resources ( http://www.premenos.com/).

Complementing these, there is an excellent independent UK site covering all things EDI, set up by Jim Smith at http://www.ibmpcug.co.uk/~jws/index.html. Also worth noting is the IETF-EDI mailing list, which has been established as a forum for discussing methods of operating EDI transactions over the Internet. To subscribe, send the message sub ietf-edi YourFirstName YourSecondName to listserv@byu.edu.

Elliptic curve cryptography

Public key cryptography lies at the heart of almost every secure Internet transaction. And yet it is not used to encrypt the messages themselves, but the encryption keys that protect them. The messages employ symmetric encryption: a single secret key both encodes and decodes the message. Public key encryption is used to send this important element to the recipient, since direct transmission of the symmetric key with the message would obviously nullify security.

This curious two-step process arises out of the fact that a price has to be paid for the wonder of asymmetric, public key encryption: it is extremely computation-intensive, and so it is not feasible to encode the entire message by this means. This dual technique has worked well so far, but there are two problems that need to be addressed soon. The first is that computer hardware continues to fall in price and increase in speed, making it necessary to strengthen the encryption employed to maintain the same relative security. This in turn means longer keys, and even longer processing times.
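A rough sketch of this two-step, hybrid arrangement is given below in TypeScript, assuming a recent version of Node and its built-in crypto module, and using modern algorithm names; padding choices, key management and error handling are all glossed over.

// Hybrid encryption sketch: the bulk of the message uses a fast symmetric
// cipher, and only the small session key goes through the slow public key step.
import * as crypto from "node:crypto";

const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});

function encrypt(message: Buffer) {
  const sessionKey = crypto.randomBytes(32);            // one-off symmetric key
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv("aes-256-cbc", sessionKey, iv);
  const body = Buffer.concat([cipher.update(message), cipher.final()]);
  // Only these 32 bytes are protected by the expensive public key operation.
  const wrappedKey = crypto.publicEncrypt(publicKey, sessionKey);
  return { body, iv, wrappedKey };
}

function decrypt({ body, iv, wrappedKey }: ReturnType<typeof encrypt>) {
  const sessionKey = crypto.privateDecrypt(privateKey, wrappedKey);
  const decipher = crypto.createDecipheriv("aes-256-cbc", sessionKey, iv);
  return Buffer.concat([decipher.update(body), decipher.final()]);
}

console.log(decrypt(encrypt(Buffer.from("Order 144 widgets"))).toString());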

The other problem is that it is not possible to apply current public key encryption systems in situations where there is very little processing power: in smartcards, for example. A solution may be offered by a new public key encryption system called elliptic curve cryptography. Like the currently-used Rivest-Shamir-Adleman (RSA) technique, which employs number theory, elliptic curve cryptography is also based on obscure mathematics, but requires far less processing power. The downside is that it is not yet as well-tested as RSA's approach.

Electronic Business

Software companies have had to take great pains to keep up with the development of Internet technologies within business. For some years corporate re-engineering was all the rage, and software companies were happy to encourage this trend in the hope of some useful sales along the way.

Ironically, these same companies have more recently had to reinvent themselves in order to cope with the uptake of Internet technologies within business. As with the earlier spate of re-engineering, the process has often been difficult, and not all companies have managed the transition gracefully.

One company that has managed to do so, despite its late start, is Microsoft. Since 1995 or so the company has passed from Olympian indifference through amateurish first efforts to an extremely shrewd understanding of the Internet's implications and potential - and of how to work the Net, in all senses.

Refocusing
It has achieved this by refocusing the entire company on this area, and making the Internet central to just about everything it does. In the process, it has become a paradigm of Internet re-engineering.

But other software companies have not been so lucky as to have a powerful leader capable of recognising and remedying major blunders through swift and decisive action. Instead, they have had to work out piecemeal - and often slowly and painfully - their own accommodation with the new Internet world.

Rather ironically, IBM's task was easier than it might have been. After losing the leadership of the software sector to Microsoft, the Internet represented as much an opportunity as a threat.

Recognising the shift in business caused by Internet technologies, IBM has cleverly taken the occasion to unify much of its rather sprawling product line under the catchy banner of e-business. It even invented its own 'e' symbol based on the by-now familiar @.

Of course, this has more to do with marketing than technology, but the e-business brand, with an optimistically long list of products, has allowed IBM to update most of the company in a fairly organic way.

Its Lotus subsidiary has also been surprisingly successful in this sphere. It has weathered the seemingly fatal challenge represented by open intranet software to its proprietary groupware Notes, by employing a clever migration policy, notably through the Domino product line.

Another company whose core business was undermined by the arrival of the Internet, Novell, has fared less well. It took too long to recognise that TCP/IP would supersede its own SPX/IPX protocols. With Intranetware it started to move seriously towards Internet protocols, but that process is still not complete. Still, Netware 5 should remedy that.

However, the inroads made into its market by Windows NT, which has always provided good TCP/IP support, may now be too deep to halt.

Keeping aloof
Oracle is an interesting example of a company that has been able to stay aloof from all these great technology battles, since databases can work as adjuncts to more or less any approach. However, the company has wisely chosen to bolster its position through a series of ancillary products that are designed for the Internet world.

These include an application server, commerce server, video server and proxy server.

A common thread to all these re-engineered companies is Java.

IBM has the ambitious San Francisco project; Novell has an increasing commitment to Java - not to mention one of Java's godfathers, Eric Schmidt, running the company - while Oracle has become the leading proponent of Java-based network computers.

Electronic Commerce

The first generation of business Internet applications were little more than online poster sites. A basic Web server provided information about a company and its products, but apart from wandering around the usually rather thin holdings, there was not much a visitor could actually do. The next generation added connectivity to back-end databases, which in turn permitted Web pages to be generated on the fly. In this way product information could be completely up-to-the-minute and customised for each visitor.

But even these sites were relatively primitive. They certainly did not mirror the full reality of how a company functions. In this respect, the latest releases of the leading electronic commerce software packages are a real advance and represent perhaps the first third-generation business Internet applications. Their complexity stems from the richness of the processes they seek to model. As a result, although the Web server forms the portal to the electronic commerce site, it is now a relatively minor part of the whole solution.

With these products, the simple Web-database tie-up of second generation Internet software has become one involving databases for customers, products, inventory, accounts and so on. Many of these may well be further linked to other legacy systems. At the front-end of the site, alongside the dynamic generation of customised pages (either using standard languages such as Javascript, or through proprietary server-side scripting tools) there are important merchandising techniques that must be accommodated.

These include intelligent cross-selling and multiple discount schemes (depending on the buyer's history). Similarly, some kind of electronic shopping trolley to hold pending purchases needs to be provided. Once a sale has been closed, the ordering process must be set in train, involving inventory controls (checking that goods are still available and automatically notifying purchasing managers if stocks are low), sending confirmations (perhaps by E-mail) and invoices or receipts (by post, if necessary).
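The shopping trolley at the heart of this is, at bottom, a fairly simple data structure with hooks for merchandising rules. A minimal TypeScript sketch follows, with invented products, prices and an invented loyalty-discount rule standing in for the multiple discount schemes mentioned above.

// Illustrative shopping trolley with a simple history-based discount hook.
interface Item { sku: string; name: string; unitPrice: number; quantity: number }

class Trolley {
  private items: Item[] = [];

  constructor(private previousOrders: number) {}   // buyer's purchase history

  add(item: Item): void {
    this.items.push(item);
  }

  // Returning customers get a small percentage off - an invented rule,
  // not a real product's pricing logic.
  total(): number {
    const gross = this.items.reduce((sum, i) => sum + i.unitPrice * i.quantity, 0);
    const discount = this.previousOrders >= 5 ? 0.05 : 0;
    return gross * (1 - discount);
  }
}

const trolley = new Trolley(7);
trolley.add({ sku: "A100", name: "Widget", unitPrice: 12.5, quantity: 4 });
console.log(trolley.total());   // 47.5 after the 5% loyalty discount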

At some point, monies must change hands. This implies the use of secure transactions (via SSL) at the very least, and may also mean offering the newly-established SET protocol to allow completely secure credit card transactions. One option to ease the process for users is to create an electronic wallet that resides on the client machine and stores information such as credit card details and home and office delivery addresses.

And as well as tying in with the vendor's existing accounting system, electronic commerce software may well have to cope with multiple taxation schemes (depending on where the supplier and purchaser are located). Another complex issue is shipping: purchasers will expect various options, and the pricing for these will probably change according to their location.

Clearly, then, electronic commerce solutions meeting most or all of these requirements are extremely complex programs, and represent a major advance beyond previous Internet offerings. What is also interesting is how many of the leading products are now quite reasonably priced, and also available for Windows NT. That is, online commerce may soon be an option for every company, even relatively small ones.

In fact, even for those organisations that would never contemplate selling their products on the open Internet (perhaps because they serve only a local market, and shipping elsewhere would be prohibitively expensive), this software may still become a vital tool. The reason for this is the rise of extranets. These secure linked intranets mean that business-to-business selling (perhaps using the Open Buying on the Internet standard, discussed recently) is likely to become as important as sales to the end-user.

Indeed, it is significant that one major electronic commerce player, Actra (home page at http://www.actracorp.com/) - formerly a joint venture with electronic data interchange specialist GE Information Systems, now wholly owned by Netscape - is concentrating almost exclusively on this sector.

Saving money to make more dosh

There's no need for major investment to get a commercial Web site going: there are a host of cheap new products out there to help.

Once upon a time, setting up a Web site that sold goods or services would typically mean having to invest hundreds of thousands of pounds in bespoke software and an expensive Unix machine. Today the total cost for an out-of-the-box solution running Windows NT can be a few thousand pounds with no loss in quality.

O'Reilly's Website Professional 2.0 costs just £499, yet the range of software provided is astonishing. In addition to the basic Web server, there is a built-in search engine facility, a site-mapping program, the Home Site HTML editor, Perl 5 and even the programming language Python.

For electronic commerce, there is support for Java server applets, Microsoft's Active Server Pages, and another server-side scripting technology called iHTML. This employs non-standard HTML tags that are processed by the iHTML engine: because they reside purely on the server, and generate ordinary HTML, they work with any Web browser.

They form the basis of the bundled merchant server application: this is a powerful solution, but does mean that, as with Website, you have to be prepared to code to add advanced features. However, Website can be recommended for its Swiss army knife approach to site creation, its exemplary manuals and unbeatable value. A trial version can be downloaded from http://website.ora.com/wspro2/demo_form_frame.html.

For those looking for more help when building commerce sites, iCat's Electronic Commerce Suite 3.01 is as good a place to start as any. This takes a completely different approach from Website, one built around store templates. The idea is to shield users from the underlying HTML. Like Website, iCat too uses its own server-side language, called iCat Carbo Command language, to build complex catalogues from databases.

iCat's greatest strength is on the merchandising side: for example, it is ideal for cross-selling and creating discounted offers. However, not everyone will take to its rigid Web-based approach: while this makes it easy to get up and running, anything out of the ordinary means plunging into the Carbo Command language. Luckily, the manuals are good.

The professional version of iCat's product running under NT costs £6,995. Another solution taking a very similar approach is Intershop Online 2.0. Like iCat, the basic unit is the template into which data is dropped. Unfortunately the demo version allows little of the underlying technology to be observed, so it is hard to judge how malleable it is. Intershop Online for NT costs $4,995 (£3,122).

The firms mentioned so far are relatively small players, a reflection of the immaturity of the electronic commerce sector. This makes IBM's Net.commerce 2.0 all the more interesting. The installation process is a rather complicated business, but the end result is remarkably usable.

This is not least because IBM too has adopted a Web-based approach, and has topped everything off with a dash of Java for creating templates, to great effect.

One product that is still to make it on to Windows NT is Oracle's Internet Commerce Server: this should be available sometime next year. The other major player in the NT electronic commerce sector is, of course, Microsoft, and its rather unusual solution, the Siteserver bundle.

Microsoft aims to cash in through E-commerce

E-commerce will really take off in 1998. At least, that's what market research reports say. Microsoft must certainly be hoping this is true. With the Commerce Edition of its Site Server product (home page at www.microsoft.com/siteserver/commerce/default.asp), it has come out with its richest offering in the E-commerce field yet.

The Commerce Edition takes an agglutinative approach: you must have the standard version of Site Server installed first before adding E-commerce extras.

The product aims to serve three markets: internal company sales; sales to the public; and business-to-business sales. Internal sales are largely about authorisation processes, and the Commerce Edition is designed to address this. For external sales, a major addition is an advertising module. Microsoft has once again engulfed an entire fledgling application market by bundling the necessary software with other products.

Fine control
The Ad Server allows fine control over the ad size and placement, and gives plenty of management information about customer response. Other advanced online shopping features include intelligent cross-sell. This allows goods to be offered on the basis of what the individual customer has bought now or in the past, or on what customers with similar tastes have purchased.

Far less obvious, but in many ways far more crucial, is the fact that the selling process is handled by Microsoft's Transaction Server. This works in conjunction with the Order Processing Pipeline, already present in version 2 of the Commerce Edition of Site Server.

Extremely clever
The Order Processing Pipeline's visual representation of the various elements of the purchasing process is extremely clever. But version 3 goes even further. Because each of the elements in the pipeline is an object, its interaction and the overall control of the ordering process can be left to the Transaction Server.

In this way, the integrity of the purchasing transaction is automatically guaranteed without the need for further programming. Such transactional checks are vital for serious E-commerce applications, and the marriage of Site Server with Transaction Server is a shrewd move.

The transactional capabilities of Commerce Edition are important, but they are not Site Server's most innovative element. Alongside the Order Processing Pipeline, Microsoft has introduced the Commerce Interchange Pipeline. This is an extension of the Order Pipeline to business-to-business commerce. It allows firms working together across virtual private networks to automate the exchange of information, and to use Transaction Server to ensure integrity.

This is a very important move for Microsoft. It takes it squarely into the extranet domain that Netscape has hitherto both defined and dominated. What makes Microsoft's solution more impressive is that it is explicitly designed to use extensible markup language (XML) as one of the possible data exchange formats.

True to form
The idea is that whatever the form of the databases used by firms taking part in an extranet, the information they hold can be converted to XML files without losing any of the databases' structure. Also, one database structure can be mapped automatically on to a different one using XML as the supplier-independent translation medium.
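A sketch of the underlying idea - turning a database record into supplier-neutral XML that a partner's system can then map into its own structure - is shown below; the record layout and tag names are invented, and character escaping is omitted.

// Illustrative only: serialise an order record as XML for exchange between
// trading partners, whatever their internal database layouts look like.
interface OrderRecord { orderId: string; buyer: string; sku: string; quantity: number }

function toXml(order: OrderRecord): string {
  return [
    "<order>",
    `  <id>${order.orderId}</id>`,
    `  <buyer>${order.buyer}</buyer>`,
    `  <line sku="${order.sku}" quantity="${order.quantity}"/>`,
    "</order>",
  ].join("\n");
}

console.log(toXml({ orderId: "PO-981", buyer: "Acme Ltd", sku: "A100", quantity: 12 }));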

The importance of the Commerce Edition of Site Server 3 is not just that it offers many powerful tools for conducting online commerce. Rather, it shows that Microsoft has addressed perhaps the biggest gap in its portfolio - extranets - and leap-frogged the competition by embracing the future standard in this area: XML.

Security alerts

Manufacturers are rushing out fixes to a minor flaw in a key encryption standard. None the less, electronic commerce could yet fall prey to a much greater danger. After many years of scepticism, business finally seems to accept that not only can the Internet be used by companies, but that it will be an important medium for commerce itself. But a recent event has highlighted the extreme fragility of the entire E-commerce edifice.

The announcement by a researcher at Bell Labs that he had found a flaw in the Public Key Cryptography Standard number 1 (PKCS 1) sounded about as exciting as the discovery of a kind of beetle by entomologists. But the implications of this apparently academic exercise in cracking codes are important.

To understand what PKCS 1 is, and why it matters, we need to revisit the world of public key encryption as applied to commerce. The crucial advantage of public key cryptography is that it is not necessary to exchange a private key for the purposes of encryption beforehand.

Instead, it is possible to use two keys, one private and one public.

Secure protocol
This cryptological breakthrough allowed Netscape to come up with the secure sockets layer protocol. This enables secure channels to be opened between Internet clients and servers.

Secure sockets layer, in its turn, has become the basis for practically all today’s E-commerce.

The flaw in PKCS 1 matters, because secure sockets layer uses this standard. A way of cracking messages sent using PKCS 1 means a way of reading nominally secure messages sent over a secure sockets layer Internet connection. Credit card details, confidential information and all the rest are as a result - theoretically - liable to electronic eavesdropping.

RSA is the company that owns the patents on these cryptography methods. They host a number of pages on the topic: an index page, a good question-and-answer document, and a more technical explanation.

In practice the problem is not enormous, because the researcher’s technique consists of sending literally millions or billions of special messages to a server, and using the errors the latter returns to decipher an encoded text. In practical circumstances, it is highly likely that such an attack would be noticed well before it was successful.

Even though the risk is largely theoretical many software manufacturers have rushed out fixes. For example Netscape (notice how just about every product is affected), Microsoft and C2.

Others, such as IBM and its subsidiary Lotus, have downplayed the affair. RSA itself will be coming out with a revised version of the cryptography standard protected using something called Optimal Asymmetric Encryption Padding.

Trust is the key
The reason for the concern is that e-commerce will succeed only if the majority of users trust the underlying security. Even though the flaw in the cryptography standard is more apparent than real, the worry is that it might shake confidence.

In fact the ramifications of this incident go much further. As described above, the cryptography standard vulnerability filters down through the use of secure sockets layer to affect most secure servers. But underneath the cryptography standard is something even more fundamental: the public key techniques themselves.

Remarkably, there are only two basic ways of doing public key encryption: one based on prime numbers; the other on what are known as elliptic curves. Both derive their strength from the fact that brute force attacks against them would theoretically take millions of years.

But if one day some shortcut techniques were discovered (or have already been, secretly), the entire e-commerce house of cards would come tumbling down. And that would not be so easy to fix as the current minor problem.

Microsoft's real goal: E-commerce

Conventional operating systems are on the way out, but Microsoft is ready for the next big thing - online commerce.

Whatever the final outcome of the current legal action by the US Department of Justice against Microsoft, it will soon be more or less irrelevant. Assuming that the case is not thrown out completely, the impact of a judgement involving Windows 98 or even NT5 will be undermined by the diminishing relative importance of conventional operating systems.

This may occur through attrition by the Open Source movement in general, and Linux in particular, or it may be because the main focus for software will move on to embedded systems, hybrid communication-computer units and other new digital devices - all areas where Microsoft has yet to gain a significant presence.

But the Seattle-based giant did not become the world's most valuable company through blindly clinging to one market-leading product line. If the firm has a watchword it is adaptability, as its successful U-turn on the Internet in December 1995 proved.

In fact the new, post-Windows Microsoft is already taking shape, out on the Internet, in the form of substantial if largely unsung successes in the E-commerce sphere. This group of sites is potentially so important that they deserve to be tracked regularly by anyone with an interest in online commerce.

Thinking ahead

The first sites appeared almost two years ago, and were already ahead of their time. Expedia and Microsoft Investor provided intermediation services - respectively in travel and finance - that were far more sophisticated than the other product-based offerings of better-known E-commerce sites such as Amazon.com and CDNow.

Where the latter essentially had huge databases of things, which they sold, the Microsoft sites offered relevant information that linked in to services and products provided by third parties.

Microsoft has since added Carpoint, which, alongside background information about particular models, sells them by putting potential buyers in touch with the nearest dealers holding models of interest and linking to relevant used car classifieds. More recently it has created a new Web site to act as a mediator between house buyers and sellers with Home Advisor.

Together with these monothematic sites, there are also a couple of more general E-commerce projects. Sidewalk offers local information, and related classified advertising, while The Microsoft Plaza is an online mall, with merchant sites for different categories.

Cross-promotion

As you might expect, the Plaza lists Microsoft's own E-commerce sites where possible. But such cross-promotion is also omnipresent on the single-theme sites. In fact, in E-commerce, as for the Internet, Metcalfe's law applies: the value of a network increases as the square of the number of nodes. The more E-commerce sites Microsoft has, the more cross-promotion it can run.

Of course, media firms do this all the time. What is novel is the instant linking that the Internet provides, allowing visitors to be carried straight to where they can buy at precisely the moment they are most likely to do so.

Microsoft's E-commerce sites may be young, but they are already very successful. For example, Carpoint drives $200m (£122m) of sales a month, while Expedia has weekly sales of $5m.

And Microsoft has barely begun. With the exception of Sidewalk, all the sites are US-centred. But the essence of the Net - to say nothing of Microsoft's ambition - is global, and Expedia, at least, is poised to offer its services elsewhere.

Given the trend for one online site to become dominant in a given sector, Microsoft may well soon become the main conduit for entire sectors of E-commerce.

As a result, these sites may be the beginning of a new Microsoft bigger than anything so far. Let's hope the Department of Justice is watching them.

E-mail

E-mail represents a cultural shift away from working methods based around telephone calls, letters and face-to-face meetings. E-mail combines the immediacy of speech with the convenience of the written word. Like letters, E-mail (nearly) always arrives; no online engaged tones, no typing tag. But as with telephone conversation, there is a tendency in E-mail to react immediately, to write - as you would speak - without thinking too much about the words or overall form.

This leads to the biggest problem of E-mail: the fact that while it reads as a transcript of speech, with all the benefits of spontaneity and informality that implies, it lacks the vital ancillary clues usually accompanying conversation. In particular, the tone of voice and non-verbal signals sent by facial expressions or body language are missing. All too often this generates misunderstandings, rash responses and the escalation of E-mail to what is called flaming: raw outpourings of emotion rather than a reasoned reply. If there is even a faint possibility that your words will be misunderstood by somebody, they almost certainly will be. Writers of E-mail should read what they have written from the standpoint of their hardest critic, and wait before sending the message to give themselves a chance to re-read what has been written. What seemed witty at the time of writing may look pretty foolish when considered more objectively.

Comments can be added to make your written intent explicit, or smileys can be used. See the section Smileys.

How to retrieve binary files by Internet E-mail

The standard for Internet E-mail specifies messages composed of alphanumeric characters that can be represented by seven bits of a byte, whereas binary files use eight. An encoding technique is therefore used whereby three eight-bit characters are put together and re-written as four six-bit characters. The latter are admissible within an E-mail message, and can be sent over the Internet.

On reception they are converted back to the original group of three bytes by reversing the process. The intermediate encoding is quite arbitrary, provided a common standard is agreed. The most common one is uuencoding.
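As an illustration of the repacking step, here is a short TypeScript sketch; the classic uuencode mapping adds 32 to each six-bit value, and the length byte at the start of each line and the begin/end framing of a real uuencoded file are left out.

// Sketch of the 3-bytes-to-4-characters step at the heart of uuencoding.
// Real uuencode also prefixes each line with a length character and wraps
// the output in "begin"/"end" lines, which are omitted here.
function encodeTriple(b1: number, b2: number, b3: number): string {
  const sixBitGroups = [
    b1 >> 2,
    ((b1 & 0x03) << 4) | (b2 >> 4),
    ((b2 & 0x0f) << 2) | (b3 >> 6),
    b3 & 0x3f,
  ];
  // Each six-bit value (0-63) is shifted into the printable range by adding 32.
  return sixBitGroups.map((v) => String.fromCharCode(v + 32)).join("");
}

console.log(encodeTriple(0x43, 0x61, 0x74));   // the three bytes of "Cat" become "0V%T"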

So, to retrieve the binary file pkz204g.exe held at the site ftp.tex.ac.uk you would send the message:

connect ftp.tex.ac.uk
binary
uuencode
chdir ctan/tex-archive/tools/pkzip/
get pkz204g.exe
quit

Two new commands have been added: binary indicates that a binary file is to be transferred, and uuencode tells the FTPmail server to send the file on to you as an E-mail message using the uuencoding technique. The request is sent in the usual way to ftpmail@doc.ic.ac.uk.

The utility uudecode is available on most UNIX systems. PC users will need a standalone program such as Wincode or Winzip that does the decoding.

When sending encoded files FTPmail servers usually break them up into messages of 64,000 characters or less. On receipt you must then rename each message sequentially: file1.uue, file2.uue, file3.uue, etc. The uudecode utility is intelligent enough to find subsequent files after the first one.

Digging for Data at the Internet's Core

Many mailing lists have a secondary function as a store for information. Searches of these archives - at least of those run using the Listserv software - can be carried out by E-mail. The document that is sent when you join a mailing list will contain details of the searchable archive, if any. Normally you will have to obtain the basic reference guide first.

To retrieve the guide send the message info database to the address listserv@irlean.ucd.ie (or any other Listserv mailing list address that you know of or have joined). The main syntax is quite straightforward. As an example, if you were using Netscape and found difficulties with the URLs that you entered, leading to a DNS look-up failure, you might hope that others on the Netscape mailing list had had similar problems and perhaps even solutions. To find out, you would send the message

// Database Search DD=Question
//Question DD *
Search DNS OR 'lookup failure' in Netscape
Index
/*

to the address listserv@irlean.ucd.ie. This requests that the Listserv program searches for either of the terms 'DNS' or 'lookup failure' in the archive of postings to the Netscape list. The Index command simply asks for a list of the matching items. In due course you will get back something like:

Item#  Date     Time  Recs  Subject
------ -------- ----- ----- -------
000106 94/11/20 21:17 162   New Netscape beta ...
000728 94/01/23 06:08 26    Re:DNS lookup failure...(followup)(solved..sor+

etc.

To view item number 728 you would send the following command:

// Database Search DD=Question
//Question DD *
Search DNS OR 'lookup failure' in Netscape
Print all of 728
/*

to the same address. The only difference is the 'Print all of' command.

How to access the World Wide Web using Internet E-mail

At first sight it might appear a hopeless task to try to use E-mail to access the World Wide Web. After all, one of its most important facets is the hypertextual nature of its structure. And yet there is indeed a site that will send back to you Web documents upon receipt of your e-mail request. As usual, no charge is made for this ingenious service.

The syntax could hardly be easier: you simply send a message of the form

send http://www.tardis.ed.ac.uk/~paola/inetuk/providers.html

to the address listserv@mail.w3.org. That is, you just place the complete URL of the Web page you are interested in (including the http://) after the word send. In the example above, you would then receive back a document that begins

UK Service Providers

UK SERVICE PROVIDERS

This page was last updated on 25 Apr 1995. There are also automatically generated summary[1] and detailed[2] lists from when they were posted to Usenet on 25 Apr 1995. Send additions and corrections to inetuk@arcglade.demon.co.uk[3]. See the section on the required entry format[4] before submitting. Go back to the Internet index[5].

etc.

As you look through the e-mail document you will notice numbers in square brackets: these correspond to the hypertext links (the hotspots) that exist in the original Web page. At the bottom of the E-mail message there are 'footnotes' corresponding to these numbers which contain precisely the URL referenced by that hotspot. It is therefore possible, by sending another E-mail to the listserv@mail.w3.org address, to follow any of these links that exist in the original hypertext document.

For example, say you wanted to find out more about Compulink Information eXchange's WWW page, which has reference [94] in the above document. Using the corresponding footnote, you would therefore send the following message (derived from the URL given in footnote 94):

send http://www.compulink.co.uk/CIXemo.html

to listserv@mail.w3.org. This would then return text beginning

Overview of CIX (Compulink Information Exchange)

Overview of CIX - Europe's largest computer conferencing system

CIX (pronounced 'kicks') is a highly popular, British based, conferencing system, providing access to hundreds of conferences etc.

In this way, any Web page can be retrieved, and any link within it (including those to FTP and Gopher sites, though obviously not to interactive telnet addresses). Although the multimedia element is missing from the messages returned, this technique does at least allow you to access the content of Web pages in a remarkably simple way. If you wish to find out more about this service, send the message

www

to listserv@mail.w3.org as before.

How to track someone down on the Internet

One of the principal achievements of the Internet has been the bringing-together of more than 30 million people in a seamless web of personal communication that embraces one-to-one E-mail, one-to-many Usenet newsgroups and many-to-many interactions like Internet Relay Chat. It is therefore somewhat ironic that there is no kind of centralised directory for this huge collection of users where you can find out how to contact one of them. Although the Internet probably gives the general user direct access to more information than any other system previously invented, reliably finding out something as apparently simple as an e-mail address is almost impossible.

This is not because there are no Internet white pages - directories of users and their Internet locations: it is more that the many attempts to map out who uses the Internet are woefully limited, covering particular areas well, but little outside those domains.

A good example of an ambitious but so far unsuccessful attempt to create a central source of information is provided by the X.500 service. This highly-formalised system does indeed provide much in-depth information about people held on its servers around the world, but the coverage offered is only a small proportion of the total user population. X.500 is mainly found in Europe, particularly in academic establishments, where its rigour seems to appeal. More about the system including ways of accessing it can be found at http://www.earn.net/gnrt/x500.html.

If you are looking for someone based in the US, a better system to try is Whois. Like X.500, this adopts a client-server model with a distributed database; unfortunately, also like X.500, it has been taken up mainly in universities. Although the information that it holds is rather skimpier than that found in the X.500 directories, Whois is easier to use. A list of Whois databases is available from ftp://sipb.mit.edu/pub/whois/whois-servers.list.

The Whois system has at least attained a certain prominence in the Internet world as one of the main white pages; the same cannot be said about the Computing Services Office (CSO) servers and their gnomically-named client Ph (short for phone) which is used only very sporadically. Once more, this service is mainly found at academic establishments.

In the absence of a centralised database of users, a number of alternative approaches have been developed. For example, the Netfind program asks you to specify likely sites or type of organisation and then carries out searches using a list of possible candidates that it generates from that input. This makes Netfind a convenient tool if you have an approximate idea of where the person you are looking for is located. It can be accessed at telnet://monolith.cc.ic.ac.uk, logging in as netfind. There is also a World Wide Web gateway at http://alpha.acast.nova.edu/netfind.html.

Knowbot Information Service, also known as KIS or just Knowbot, adopts an interesting technique that shows what might one day be achieved in terms of drawing together all these disparate threads. It is not so much a directory service in its own right as a front-end to various pre-existing databases, including X.500 and Whois. Its big advantage is that it will interrogate those sources automatically, which saves you logging on separately; its disadvantage is that it carries out its search blindly and unstoppably, and cannot be fine-tuned by further intervention from the user. Knowbot can be accessed at telnet://info.cnri.reston.va.us; no login is required, but you are asked to leave your E-mail address in its visitor's book.

Junk E-mail

Spamming (see separate section) has been around for a while. Junk E-mail has taken longer to arrive because it is logistically harder to set up: it requires hundreds of thousands of E-mail addresses to be gathered (either by hand, or using software), while spams are trivial to carry out. But where spams are effectively washed away in the deluge of messages found in most Usenet newsgroups, junk E-mail by contrast pollutes one of the most personal aspects of using the Internet: your E-mail in-tray.

E-mail represents one of the most direct ways of reaching somebody - even more direct than letters or telephone calls. Junk E-mail is therefore a particularly personal affront in terms of wasting time and abusing a service.

These new junk E-mailers gather their lists from many sources, principally Usenet newsgroups. It is also possible that other mailing lists and Web sites requiring registration sell on their lists, though they should give you the option to block this. Since E-mail addresses must generally be public to be useful, it is hard to stop them falling into the hands of the unscrupulous.

In terms of response, you can try replying to junk E-mail with a message asking where they obtained your name. If this is not bounced by the system (as it often is), you might then pass a few idle moments sending a few more such messages. If junk E-mailers become flooded with junk replies from enough of their victims, it becomes difficult for them to function.

If your mail is bounced, you can try writing to the postmaster at the address the junk E-mail originated from (for example postmaster@abc.com if the sender's E-mail address ends abc.com). Every site must have such a postmaster, so it should generally be possible to reach the offending company in this way. However, there are various tricks that junk E-mailers can use to hide their identity further. In this case your only option is to direct your comments to any advertisers carried in the junk E-mail, pointing out how counterproductive their approach is. Unfortunately all of these actions require yet more of your time - something that has already been wasted enough by this new and regrettable Internet plague.

Industry acts to clear the Net of junk E-mail

Junk E-mail has become one of the most annoying problems of the Internet. It is therefore worthwhile keeping track of the latest developments in this area, not least in the hope of a solution. A good summary of the issues from a business point of view has been put together by the Internet Mail Consortium. There are also general junk E-mail/spam resources.

One way of fighting junk E-mail is to use existing legislation, notably that regulating pyramid schemes and other frauds. Recently, the US Federal Trade Commission and similar bodies around the world have started to act against such schemes. New legislation is also under consideration in the US. One bill has been promoted by an anti-spam organisation called the Coalition Against Unsolicited Commercial E-mail, and details can be found at http://www.cauce.org/why.html#smith.

Another anti-junk E-mail site offers good analysis of this and other similar bills.

Self-regulation
An alternative approach is self-regulation, and this has been adopted by the Canadian Direct Marketing Association, which forbids its members from employing junk E-mail without the user's consent. Of course, the problem with this and similar schemes is that it has no effect on the more dubious junk E-mail merchants who are not members of any such associations, and who will continue regardless.

Unfortunately, it is unlikely that legislation will succeed in controlling them either: on the Net, it is only too easy to move outside jurisdictions that have anti-spam laws. Similarly, some of the more extreme online vigilante action - whereby Internet service providers that harbour junk E-mailers are subjected to various forms of harassment - will ultimately only drive spammers off-shore. What is needed is a technological solution that can be applied by the user.

One obvious approach is to apply filters to E-mail. This can be done by the service provider, using lists of known offenders, or the more sophisticated system run by the Mail Abuse Prevention System. Or it can be done locally, on the user's machine. Most advanced E-mail programs offer a facility whereby E-mail can be pre-sorted and even automatically deleted according to the sender, subject heading and contents.
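
As a rough illustration of the kind of rules such local filtering applies, the sketch below classifies messages by sender domain and subject line; the blocked domains and suspect phrases are invented for the example, and any real filter would need rather more subtlety.

# Minimal sketch of local junk-mail filtering; the rules below are invented for illustration.
BLOCKED_DOMAINS = {"bulkmail.example", "offers.example"}   # hypothetical offenders
SUSPECT_PHRASES = ("make money fast", "free offer", "!!!")

def classify(message):
    """Return 'junk' or 'inbox' for a message given as a dict of header fields."""
    sender = message.get("From", "").lower()
    subject = message.get("Subject", "").lower()
    # Rule 1: anything from a known offending domain goes straight to the junk folder.
    if any(sender.endswith("@" + domain) for domain in BLOCKED_DOMAINS):
        return "junk"
    # Rule 2: subject lines containing suspect phrases are treated the same way.
    if any(phrase in subject for phrase in SUSPECT_PHRASES):
        return "junk"
    return "inbox"

print(classify({"From": "promo@offers.example", "Subject": "Free offer!!!"}))   # junk
print(classify({"From": "colleague@abc.com", "Subject": "Meeting agenda"}))     # inbox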

Cloak and dagger
Unfortunately, junk E-mailers are only too adept at hiding their intentions, at least in subject lines. They also use fake E-mail addresses to cloak the origin of these messages.

None the less, it is possible to dig out addresses with a little work, or by using programs such as Spamhater, which not only finds originators, but also has template letters of varying anger for firing off to the offenders. But reacting to junk E-mail is a waste of the user's time, whether in setting up filtering rules or adding new domains to be blocked.

What is needed is a way of checking the credentials of the sender automatically; digital certificates are one solution. Once these become widespread - as they will for authentication and authorisation on intranets and extranets - they can be used to establish with certainty the provenance of E-mail.

Users might choose to accept only those certificates already known to them, or to accept any certificate from a reputable certification agency that promises to repudiate certificates of anyone who indulges in spamming. This would get around the problem of off-shore junk E-mails, as they would still need certificates from well-known agencies.
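
A minimal sketch of that second approach, assuming a simplified certificate record: the check below accepts mail only when the sender's certificate comes from a trusted agency and has not been repudiated. The agency names and the record format are invented for illustration.

# Illustrative only: accept mail whose sender certificate was issued by a trusted
# certification agency and has not been repudiated. Names here are hypothetical.
TRUSTED_AGENCIES = {"Example Certification Agency", "Another Reputable CA"}

def accept_sender(certificate):
    """certificate is a dict with 'issuer' and 'repudiated' fields (simplified format)."""
    if certificate.get("repudiated"):            # the agency has withdrawn the certificate
        return False
    return certificate.get("issuer") in TRUSTED_AGENCIES

print(accept_sender({"issuer": "Example Certification Agency", "repudiated": False}))  # True
print(accept_sender({"issuer": "Bulk Mailer CA", "repudiated": False}))                # False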

Free E-mail

Free E-mail functions as an extension of the dominant economic model used on the Internet whereby Web site content, for example, is provided free in return for viewing and possibly responding to banner advertising. Similarly with free E-mail, accounts are provided on the basis that various forms of advertising will be displayed along with the mail messages you receive.

The pioneer in this field was Juno, which employs proprietary software and a collection of dedicated dial-up points. Users do not pay even for the phone call to download their mail, but this rules out use outside the US since there are currently no overseas dial-up points. Its main rival, Hotmail, takes a different approach: it displays E-mail messages as Web pages. This has a number of advantages. First, you do not need proprietary software to view your mail: any browser will do.

Secondly, and perhaps most importantly, you can retrieve your E-mail from anywhere in the world that you can connect to the Internet - which also means it is available for non-US users. The third advantage (for Hotmail, at least) is that the advertising which pays for the service can be displayed along with the E-mail message just as it would within a conventional Web page.

Irrelevant advertising
One downside is the fact that to enrol for Hotmail you must give various kinds of personal information, including your income. This is so the adverts placed within the Web pages displaying your E-mail can be targeted. This is a good idea in principle, since it means users are not troubled with the kind of irrelevant advertising that bedevils the world of junk E-mail.

None the less, users are rightly chary about giving up intimate personal details of this kind - whatever the privacy guarantees offered. Of course, once an idea takes off, others are quick to emulate. An interesting development among these second-generation free E-mail services is they tend to be used as adjuncts to existing Web sites.

For example, for the user directory Four11 a free E-mail service was a natural addition. The fit was also good when the search engine Yahoo bought Four11. In an attempt to establish itself as a one-stop shop on the Internet, Yahoo has constantly been adding ancillary services, and free E-mail was an obvious candidate.

Less intrusive
So obvious that one of Yahoo's main rivals, Excite, has done the same. The advantage of using Yahoo's and Excite's services is that the information required from you is less intrusive, and users may prefer to take this route rather than divulging more details to Hotmail. As well as the benefits of zero cost and accessibility from anywhere in the world, it is worth mentioning one other motive for setting-up an extra E-mail account.

Since all of these services depend critically on their users reading their E-mail - so that adverts are seen, and advertisers are happy to pay - companies offering free E-mail tend to be among the most vigilant when it comes to fighting junk E-mail. They try to filter out the most infamous offenders, and often offer extra tools to allow users to sort their mail still further.

As a result, free E-mail services may well be the best for public use on the Internet - in Usenet postings, say, or for the many free services or trial software schemes that require an E-mail address. Rather than giving out your main E-mail details - and running the risk of having your inbox flooded with junk - you could use a secondary account just for this purpose. Your personal address could then be reserved for the use of close business associates, friends, family etc.

Dangers of E-mail

Readers will need no encomiums on E-mail, or on the benefits that can accrue from its use in business. But it is against this background of E-mail as a key business tool that a warning note is required. For E-mail, as currently employed, could turn out to be a business's worst - yet secret - enemy.

To see how this might be so, consider the US Department of Justice anti-trust action against Microsoft. The case itself has been made almost unbearable through the tedious - but undoubtedly purposive - pedantry of Microsoft's lawyers as they question every statement made by government witnesses.

Fortunately, these long stretches of legalistic nit-picking have been punctuated by moments of high drama when one side or the other has, as if from nowhere, pulled out old and highly embarrassing E-mails that undermine what the witnesses have just said. Aside from the schadenfreude of seeing the computer industry's powerful and rich tripped up, it is interesting that none of the leading companies seems immune from this error.

And if some of the shrewdest operators in the computing world have remained blind to the threat such incriminating messages represent, then it seems almost certain that most other companies using E-mail also harbour similar legal time-bombs just waiting to explode in some future court case.

Convenience

It is not hard to see how this situation has arisen. E-mail is so convenient that once an intranet is in place electronic messages quickly become the norm. E-mail is simple to create, and to send to as many as hundreds of recipients. Worse, it is easy to store.

The falling price of hard discs means that every worker can store every message they have ever sent or received - just in case. Moreover, there is also likely to be a consolidated corporate store of messages.

And these stores are just what a lawyer or police officer with a suitable court order will be looking for. Applying a decent intranet search engine to such a collection of messages will make it easy to locate any combination of incriminating words (such as "polluted" and "Java", say). What would have been impossible 10 years ago - searching through millions of documents by hand - can now be accomplished in a second or two thanks to intranet technology.

So companies need to consider carefully which E-mail messages are kept, and for how long. There is a strong argument that the default should be for all E-mail to be deleted as soon as possible unless there is a good reason - or legal requirement - for keeping it.

And certainly the decision to keep E-mail should be taken centrally, with input from company lawyers, and not left to individuals.
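
A policy of that kind is straightforward to automate. The sketch below is purely illustrative - the 90-day retention period, the message-store layout and the legal-hold flag are all assumptions - but it shows the shape of a "delete by default" rule.

from datetime import datetime, timedelta

RETENTION_PERIOD = timedelta(days=90)     # assumed period; set centrally, not per user

def apply_policy(message_store, now=None):
    """Return only the messages that survive the retention policy."""
    now = now or datetime.utcnow()
    kept = []
    for msg in message_store:
        if msg.get("legal_hold"):                          # explicit reason to keep it
            kept.append(msg)
        elif now - msg["received"] <= RETENTION_PERIOD:    # still within the period
            kept.append(msg)
        # anything else is dropped by default
    return kept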

Ironically, XML will only make matters worse. It allows files in proprietary formats to be turned into text files just like E-mail - and makes searching through them even simpler. Drawing up an E-mail policy now will make it easier to deal with these later challenges.

Encryption key length

The main encryption technologies all have at their heart the public key technique that allows an encryption key to be sent in an open way (typically with the message itself) without jeopardising the security of the encrypted contents.

This key consists of an extremely large number used in the mathematics that makes the public key approach possible. The size of that number (expressed as the number of binary digits required to describe it) is known as the key length. The bigger the key size, the larger the number, and the more secure the encrypted message.
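
To get a feel for what key length means, the snippet below simply prints the number of possible keys at a few lengths; the figures are plain arithmetic (each extra bit doubles the search space) rather than a claim about any particular product.

# Each additional bit of key length doubles the number of possible keys.
for bits in (40, 56, 128):
    print(f"{bits}-bit key: {2 ** bits:,} possible keys")

# 40-bit key: 1,099,511,627,776 possible keys
# 56-bit key: 72,057,594,037,927,936 possible keys
# 128-bit key: 340,282,366,920,938,463,463,374,607,431,768,211,456 possible keys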

The recent news that a message encrypted with the Netscape browser in its secure mode using these techniques had been cracked is a consequence of the fact that the key length employed in non-US versions of the software is relatively small.

This in turn is a result of US export laws that forbid the general sale abroad of products with full-power encryption (since such systems are classed as munitions, of all things). The much larger key length in the US version of Netscape remains, for the moment, beyond the computational reach of any known computer, but unfortunately also unavailable to UK users. There is some hope that the recent success in cracking the smaller key length will allow at least somewhat more powerful versions to be exported.

Further information on cryptography and the main encryption tool on the Internet, Pretty Good Privacy (PGP) can be found at http://draco.centreline.com:8080/~franl/crypto.html. The cryptography page (at http://draco.centreline.com:8080/~franl/crypto/) has links to background information on the subject, and to digital cash. The PGP link (at http://draco.centreline.com:8080/~franl/pgp/) has many good links to other pages, including one explaining where you can get PGP (at http://draco.centreline.com:8080/~franl/pgp/where-to-get-pgp.html), a complex subject because of bizarre US munitions laws that forbid the export of elements contained within PGP products, even though these are now freely available around the world. Even if they weren't, they could easily be written from scratch - see the URL http://draco.centerline.com:8080/~franl/rsa-guts.html for an amusing demonstration of this.

Related to this issue is the plight of PGP's author, Phil Zimmerman. Because copies of PGP are now found outside the US, Zimmerman is under investigation for an alleged breach of the US export laws. More about this and his European Legal Defence fund can be found at http://draco.centerline.com:8080/~franl/pgp/phil-defense-fund-europe.html.

ECMAScript

One of the many confusing aspects of the Internet for newcomers is the strange dynamics that rule its operation. For example, there is no governing organisation that establishes correct practices, much less any structure for enforcing them - and yet there are indeed standards on the Internet: it is simply that they evolve by general agreement, and their acceptance is equally consensual.

Nobody is forced to use the Domain Name System, or HTTP; but without the former your Internet sites will be invisible, and without the latter they will not be accessible by tens of millions of Web clients. For this reason, the issue of standards has become a kind of Holy Grail for companies selling Internet and intranet products, an indispensable ingredient in the marketing mix.

This manifests itself in curious ways. For example, both Microsoft and Netscape (the two main players in the standards game as they are on the Internet) habitually trumpet the fact that they submit their approaches to the Internet Engineering Task Force or World Wide Web Consortium as if that by itself were enough.

More recently, there has been an even more extreme manifestation of the standards bug. Companies are now turning over their products to independent bodies in an attempt to demonstrate once and for all that they represent "real", open standards.

Microsoft's ActiveX is under the auspices of the Open Group, Sun is trying to establish Java in a similar way, and now there is ECMAScript. This is the new incarnation of Netscape's Javascript (and Microsoft's JScript) as a "true" standard, under the auspices of Ecma, originally the European Computer Manufacturers Association.

Enfopol

One of the collateral benefits of the current Internet share frenzy is that the Net is no longer regarded, as it once was, simply as a playground for terrorists and pornographers. There is an increasing consensus that it is emerging as the central medium not just for global communications, but for worldwide commerce, too. However, governments are acutely aware that the rise of this supranational medium poses a unique threat to their old powers of surveillance and control.

In fact, there are already operations that aim to undermine the Internet's power. It will probably come as something of a shock to most readers to learn that "within Europe, all e-mail, telephone and fax communications are routinely intercepted by the United States National Security Agency, transferring all target information from the European mainland via the strategic hub of London then by satellite to Fort Meade in Maryland via the crucial hub at Menwith Hill in the North York Moors of the UK".

This statement can be found in the executive summary of an extremely detailed and ultimately rather dispiriting report on political control prepared by the Omega Foundation in Manchester and presented to the European Parliament's Scientific and Technical Options Assessment (STOA) panel. The full report can be found online.

As if this were not bad enough, there are plans to extend this surveillance to include all European Internet transmissions. The plans are known by the suitably Orwellian name of Enfopol 98, and there is a site devoted to the subject. A time-line detailing how the Enfopol 98 plans came about and have been progressing towards realisation is available.

The original Enfopol 98 draft resolution in German, and an early revision in English, are the most explicit about what the Enfopol plans entail, while the most recent version has adopted a more guarded formulation.

However, as the explanation of the current situation details, one of the requirements of the Enfopol legislation is for "Internet Service Providers to set up high-security interception interfaces inside their premises. These interception interfaces would have to be installed in a high-security zone to which only security-cleared and vetted employees could have access." This would allow law-enforcement agencies to gain access to any Internet communication, and aims to provide them with all the relevant IP addresses, passwords and e-mail addresses of any Internet session in more or less real time.

The Telepolis site, which houses the Enfopol papers, is to be lauded for making these important documents available and for providing useful commentary on them. This is in stark contrast to the European Parliament's site which is so labyrinthine that it is almost impossible to track down any of the relevant papers, despite their enormous importance to the public and business.

One of the few explicit references to the Enfopol plans (officially known as "Lawful interception of telecommunications") can be found here, recording that the draft resolutions were adopted by the European Parliament on 7 May 1999.

Given both the US origins of the entire Enfopol project, and the ways in which the US has already abused other such eavesdropping systems for commercial advantage, it seems extraordinary that such important and damaging resolutions should have been passed by the European Parliament without input from business or public debate.

Extranet

The idea behind extranets - that the corporate intranets of business partners can be linked tightly together to create a secure network allowing closer co-operation - is clearly appealing.

However, the size of the challenge facing those who wish to create such an extranet should not be underestimated. It implies a complete review of the intranet technologies used by all the companies concerned.

It is significant that one of the few extranets to have been announced is Netscape's: only a company right at the forefront of Internet technologies stands much chance of realising the extranet vision at this early stage.

Alongside the logistical problems of connecting different networks together is a more subtle issue that needs to be addressed. Even after the basic extranet plumbing is in place, there is the question of how the data that flows across it should be structured.

For simple applications, such as accessing internal Web sites or LDAP directories, this is not a problem; but for more complex applications, such as intercompany sales and supply, the issue of data format becomes paramount.

Fortunately, an initiative begun last year by the Internet Purchasing Roundtable may provide a ready-made solution. The initial impetus for what is now known as the Open Buying on the Internet (OBI) standard was to ease the purchase of goods across the Internet in the business-to-business arena.

Specifically, it was aimed at high-volume, low-cost purchases, especially those in the area sometimes known as MRO (maintenance, repair and operation).

Because unit cost is low, it is important to hold infrastructure costs down. This, in turn, implies a widely adopted, open standard with many players, so that competition keeps margins tight. OBI was created with the backing of major companies, such as Ford, and financial institutions, such as American Express, which supported the initial work.

The basic idea behind OBI is relatively straightforward. A corporate requisitioner accesses a supplier's system using a Web browser across a network to view a catalogue of supplies. After choosing the required items, the electronic billing is handled automatically. However, OBI goes into rather more detail than this superficial analysis might suggest.

For example, it specifies that the catalogue seen by each buyer must be tailored specifically for them, perhaps generated dynamically from a supplier's back-end database. It must have search facilities and a list of frequently ordered items to expedite the selection process.

A user profile database holds information about the buyer, such as authorisation limits, billing and shipping preferences, to supply defaults in on-screen forms.

A receipt of order must be generated by the process to allow full audit, and many different payment options have to be supported, including bulk invoices and cheques, EDI invoices and electronic transfer methods.
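
Purely as an illustration of the kind of structured data such a system passes between requisitioner and supplier, the sketch below builds a minimal order record, supplying defaults from a user profile and enforcing an authorisation limit. The field names are hypothetical and are not taken from the OBI specification itself.

# Hypothetical order record; field names and values are invented for illustration.
user_profile = {
    "buyer": "j.smith",
    "authorisation_limit": 500.00,            # maximum order value this buyer may place
    "shipping": "Head office, London",
    "billing": "Purchase ledger, account 1234",
}

def build_order(profile, items):
    """items is a list of (description, quantity, unit_price) tuples."""
    total = sum(quantity * unit_price for (_, quantity, unit_price) in items)
    if total > profile["authorisation_limit"]:
        raise ValueError("order exceeds the buyer's authorisation limit")
    return {
        "buyer": profile["buyer"],
        "ship_to": profile["shipping"],        # defaults drawn from the profile
        "bill_to": profile["billing"],
        "items": items,
        "total": total,
    }

order = build_order(user_profile, [("laser printer toner", 2, 39.95)])
print(order["total"])                          # 79.9 - a receipt would be generated for audit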

Security is a key element of OBI: contract prices must remain confidential, and all transactions must be protected through encryption. Digital certificates are also vital for establishing the identity of the parties.

Sensibly, OBI is based on current Internet standards, such as HTML for content display, SSL for secure Internet communications, X.509 for digital certificates and SET for credit card transactions.

But the standard looks to the future too: it specifies that OBI applications must support international use and not be dependent on US-only technology. OBI has its own home page, and there are excellent and extremely full explanations of the standard available from the OBI Library.

Although complex for such low-level transactions, OBI seems likely to catch on, not least because Microsoft, Netscape (through its part-ownership of E-commerce company Actra) and Oracle have all said they will be supporting it in future products.

Moreover, as extranets are implemented, companies will find that they need precisely the structures defined by OBI to maximise the benefit they obtain from them.

FAQ

Frequently Asked Questions (FAQs) are documents that attempt to encapsulate fundamental knowledge about the Internet in various areas. They are not official publications in any sense, but have been put together by public-spirited Internet users as an aid to newcomers. They grew out of the Usenet newsgroups where there is a natural tendency for people just joining to ask the same basic questions. Usenet FAQs are posted periodically to relevant newsgroups, and can also be obtained from reference sites by FTP.

To obtain a list of available documents from the main Usenet FAQ site at the Massachusetts Institute of Technology, send the following message to ftpmail@doc.ic.ac.uk:

connect rtfm.mit.edu
chdir pub/usenet-by-group/
dir
quit

In due course a list of over 1,000 FAQs will be E-mailed to you. To receive a particular FAQ, say the one posted periodically to the newsgroup alt.winsock, send the following message to ftpmail@doc.ic.ac.uk:

connect rtfm.mit.edu
chdir pub/usenet-by-group/alt.winsock
dir
quit

This retrieves a list of the various files held in the pub/usenet-by-group/alt.winsock directory. You need this list to give the correct name of the FAQ file you want to retrieve. The FTPmail message sent in response to the above commands contains entries such as:

-rw-rw-r--14 root 3 109099 Nov 3 01:34 comp.protocols.tcp-ip.ibmpc_Frequently_Asked_Questions_(FAQ),_part_1_of_3
-rw-rw-r--14 root 3 87289 Nov 3 01:34 comp.protocols.tcp-ip.ibmpc_Frequently_Asked_Questions_(FAQ),_part_2_of_3

These unwieldy names can then be used to retrieve the FAQ in question by sending messages such as the following to ftpmail@doc.ic.ac.uk:

connect rtfm.mit.edu
chdir pub/usenet-by-group/alt.winsock
get comp.protocols.tcp-ip.ibmpc_Frequently_Asked_Questions_(FAQ),_part_1_of_3
quit

FAQs are not limited to technical or computing areas, but exist for most subjects.

A better way to obtain FAQs is from http://www.cis.ohio-state.edu/hypertext/faq/usenet. A Web front-end has been added to help you move around.

A list of more or less all the anonymous FTP servers in existence is found at http://hoohoo.ncsa.uluc.edu:80/ftp-interface.html. The list includes a short description of what can be found there and hotspot linking for instant access.

File Transfer Protocol

One of the central features of the Internet is the ability to transfer files between computers connected to it. The standard that handles this is the File Transfer Protocol (FTP). Note that ftp in lower case forms the beginning of many Internet addresses.

Many sites allow so-called anonymous FTP and offer huge stores of free or shareware programs.

Anonymous FTP means you do not have to be a registered user of a site to access it. Instead, when the log-on prompt appears, enter the word anonymous or ftp as a generic user-name. A password is usually requested: Internet etiquette requires you to give your full E-mail address so that those running the anonymous FTP site have a record of users.

Once admitted to such a system - depending upon local off-peak hours and the number of anonymous visitors permitted - you will be faced with a directory structure. To see its topmost layer, type dir. Each site is unique, but usually there is a README file (read_me, index.txt, or similar) which gives information about the directory structure. There is generally a directory called /pub, a public directory containing downloadable files.
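
The same anonymous log-on sequence can be scripted. The sketch below uses Python's standard ftplib module to log in as anonymous (giving an E-mail address as the password, as etiquette requires), list the /pub directory and fetch a README; the host name and file name are placeholders rather than a real site.

from ftplib import FTP

with FTP("ftp.example.org") as ftp:                          # placeholder host
    ftp.login(user="anonymous", passwd="your.name@abc.com")  # E-mail address as password
    ftp.cwd("/pub")                                          # the usual public directory
    print(ftp.nlst())                                        # equivalent of typing dir
    with open("README", "wb") as out:                        # download the site's README
        ftp.retrbinary("RETR README", out.write)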

Imperial College's anonymous FTP site (ftp://src.doc.ic.ac.uk) has more than 30 Gbytes of free software. Of particular interest is src.doc.ic.ac.uk/computing/systems/ibmpc, which contains DOS, win3, win95, nt and simtel sub-directories. Each contains an index file (in zipped and lst forms) which is worth downloading and marking up off-line. The same shareware files are available from ftp.informatik.uni_muenchen.de

Microsoft's anonymous site is ftp://ftp.microsoft.com. Instead of organising files into sub-directories, Microsoft puts several thousand files in one directory, so using a browser to view the directory is not efficient. Enter the name of the file in the URL, or use an FTP application.

Chinese software is available from ftp://ifcss.org/pub/software/.

It is possible to retrieve FTP-accessible files using any basic E-mail system that can send and receive messages over the Internet. Some sites accept requests for FTP files via E-mail. These they retrieve, using the standard FTP function, and then forward to the source E-mail address of the original request, all free of charge.

The requests are sent in the form of a list of simple commands that tell the FTPmail site where the file is held and what it is called. A typical set of instructions would be as follows:

connect ftp.src.doc.ic.ac.uk
chdir computing/systems/ibmpc/win95
list
get index.txt
quit

In the above example, the FTPmail site is being asked to connect to src.doc.ic.ac.uk, change directory and get the file index.txt. The quit command marks the end of the sequence. To see a full list of the possible commands available, send the message help to ftpmail@doc.ic.ac.uk. This is also the address to which you send your request for a file. Note that each command should be on a new line, obtained by pressing the Return/Enter key; you can leave the subject line blank or input an identifying name to help you recognise the file on its return. The FTPmail server will then process your request, fetch the file and send it to you. Within a few hours, or perhaps a day, the requested file should appear in your mailbox.
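
Since a request is just a plain-text message with one command per line, it is easy to generate programmatically; the short sketch below builds the message body, leaving the actual sending to whatever mail program you normally use.

def ftpmail_request(host, directory, filename):
    """Build the body of an FTPmail request, one command per line."""
    commands = [
        f"connect {host}",
        f"chdir {directory}",
        f"get {filename}",
        "quit",
    ]
    return "\n".join(commands)

body = ftpmail_request("ftp.src.doc.ic.ac.uk",
                       "computing/systems/ibmpc/win95",
                       "index.txt")
print(body)    # paste into a message addressed to ftpmail@doc.ic.ac.uk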

Files available by FTP will often be given in the following format:

ftp://ftp.demon.co.uk/pub/archives/uk-Internet-list/inetuk.lng

(technically known as a Uniform Resource Locator, or URL), and also without the initial ftp://. To use the FTPmail facility you must break this up into its component parts, for example:

connect ftp.demon.co.uk
chdir pub/archives/uk-Internet-list/
get inetuk.lng
quit

Eventually you will receive a list of Internet providers with details of who provides what and for how much.

Note that the files used in the examples are in ASCII, or pure text, format; binary files cannot be downloaded this way.

Firefly

Microsoft's announcement that it is to buy the company Firefly (April 1998) might at first seem surprising. Firefly is a typical Internet start-up, with products in the new and as-yet rather unproven area of online privacy: see the privacy resources.

It has developed a scheme it calls Passport, which enables users to create, free of charge, a profile for use online. When used at Passport-enabled Web sites, it allows personalised information or services to be offered automatically while giving users control over what personal information is conveyed to those sites.

Exploit information
On the supplier side, Firefly has a number of products designed to exploit the data held in Passport profiles: Passport Office, Passport Network Hub and Catalog Navigator.

These are all very interesting ideas, and certainly privacy is likely to become an increasingly important area as E-commerce takes off. But on its own it is hardly reason enough for Microsoft to spend a sizeable sum on acquiring the whole company. Indeed, Microsoft already has considerable expertise in this area, as its privacy and profiling submission to the World-Wide Web Consortium indicates.

Firefly is of interest not so much for its present products as for the work it is doing on an important future World Wide Web Consortium standard called the Platform for Privacy Preferences, known as P3P. This represents the generalisation and extension of a privacy proposal put together by Firefly and Netscape, called the Open Profiling Standard. However, the platform is much more ambitious, and has the backing of both Microsoft and Netscape.

It is the relatively long-standing (by Internet standards, at least) relationship between Firefly and Netscape that makes Microsoft's acquisition more interesting.

Stranglehold
Through its purchase of Firefly, Microsoft gains a stranglehold on an emerging technology and removes one of Netscape's allies. For, however diligent Firefly may be in continuing to work with Netscape, the latter will hardly be happy about revealing its secrets to a part of the Microsoft empire.

But Microsoft's move goes even further than this. Firefly was also one of the prime movers of the Information & Content Exchange (ICE) proposal. This is a way to allow the automatic but controlled exchange of business information. For example, one Web site may wish to draw in constantly changing data from other information providers: ICE offers a framework for doing this.

What is noteworthy about the proposal is that it is based on the new Extensible Markup Language (XML). Put this together with Microsoft's work on Extensible Style Language, its submission on XML-Data (and Net Speak) and XML-based E-mail threading, and you have a situation whereby the company almost totally dominates what many regard as the biggest advance since HTML.

The acquisition of Firefly, with its expertise in the Platform for Privacy Preferences - yet another major application of XML - sees Netscape cut off from a partner and potentially isolated from the single most important Internet development in recent years.

Firewalls

There are business advantages to be gained by a company connecting to the Internet, but unfortunately there is also a downside to be considered, one that stems from the very nature of this global network.

The Internet is essentially democratic in that all points on it are equal: there are no centres. A related property is its symmetry: when you connect a computer to the Internet, the Internet also connects to you, and while the connection is in place any of its tens of millions of users can access your machine - and probably those networked to it - in a variety of ways.

Clearly for businesses with confidential information stored on machines attached to their corporate networks, this is a worrying prospect. But of course it is not a new problem - it is implicit in the original design of the Internet - and there is now a well-established defence.

This generally goes by the name of the firewall, by analogy with the physical obstacles that are placed to halt the advance of fire within a building. Similarly, Internet firewalls are designed to block the spread of the digital fire that licks around networks in the form of unauthorised access. When potential intruders attempt to gain entry to a corporate system attached to the Internet, the firewall stops them safely outside the internal network. At the same time, suitably configured, firewalls will still allow company users to access the Internet (were this blocked, you might as well disconnect from the Internet completely).

At its simplest, the firewall is created by a separate computer that acts as a kind of cyberbouncer, rejecting unauthorised accesses. Unauthorised in this context might mean that an access comes from the wrong Internet address (or, more usually, not from one of those held in a file of acceptable Internet addresses), uses the wrong Internet port (which is generally associated with a particular Internet service such as FTP or telnet, and which may have known weaknesses), or a combination of these.

These packet filters (also known as screening routers) work by examining individual TCP/IP packets for basic network information. Alongside them are the application gateways that allow Internet services and their users to be controlled more directly. An example of this approach is the proxy server running on a firewall computer which retrieves information on behalf of an internal user, and then transfers the results of the query to the originator. In this way, only the proxy machine is visible to the outside world, and can be well-armed against attempts to subvert it.
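
As a purely illustrative sketch of the packet-filtering idea, the fragment below checks the basic network information in each packet (source address and destination port) against a small rule set; the addresses and ports are invented for the example, and a real screening router does a great deal more.

ALLOWED_SOURCES = {"192.0.2.10", "192.0.2.11"}   # file of acceptable Internet addresses
ALLOWED_PORTS = {21, 80}                          # FTP and Web traffic only

def filter_packet(packet):
    """packet is a dict with 'src' and 'dst_port' fields (greatly simplified)."""
    if packet["src"] not in ALLOWED_SOURCES:
        return "drop"                             # wrong Internet address
    if packet["dst_port"] not in ALLOWED_PORTS:
        return "drop"                             # wrong Internet port
    return "pass"

print(filter_packet({"src": "192.0.2.10", "dst_port": 80}))    # pass
print(filter_packet({"src": "203.0.113.5", "dst_port": 23}))   # drop (telnet from outside)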

As the above indicates, the principle of the firewall is almost trivial: it is the details of the implementation that are crucial. For this reason there is no substitute for practical help and advice from practising firewall administrators, and one of the best places to obtain both is the Firewalls mailing list. To subscribe to the digest form, send the message

subscribe firewalls-digest

to the address majordomo@greatcircle.com.

    Now is a very good time to join: after several months of near-inactivity there has been a flurry of activity on this list, with several digests being sent out every day. For those who wish to catch up on some of the background to the subject, compressed back issues are available by FTP from the URL ftp://ftp.greatcircle.com/pub/firewalls/digest/.

    Also highly recommended are two books on the subject. The first, Firewalls and Internet Security by Cheswick and Bellovin (price £20.95, ISBN 0-201-63357-4) is the accepted classic in this area. As well as much hands-on detail, it is notable for containing a gripping tale of how the authors fought off the attempts of an intruder to enter their system, complete with logs of the epic battle. More sedate, but even more practical is Internet Firewalls and Network Security by Siyan and Hare (£32.49, ISBN 1-56205-437-6).

    Among the well-known names already in this area are Sun (http://www.sun.com:80/sunexpress/europe/catalog/uk_english/parts/FIR-121-B.html) and Digital (http://www.digital.com:80/info/Customer-Update/950601004.txt.html). Less familiar in this country, but well established in the US are ANS+CORE (http://www.ans.net/Products?interLock/InterLockBrochure.html) and BBN Planet (URL at http://www.bbnplanet.com/doc/spatrol/spchart2.html) part of one of the very first companies to work on the Internet.

    Other firms include Border Software (http://www.border.com/contents.html); Checkpoint (http://www3.checkpoint.com/intro-ver.2.0.html); Harris (http://www.hcsc.com/trusted/cyberguard_bulletin.html); Livingston (http://www.livingston.com/); Trusted Information (http://www.tis.com/docs/Products/gauntlet.html); Secure Computing (http://www.sidewinder.com/sidewinder.html); Raptor (http://www.raptor.com/prodinfo/ds/eagle/eagle.html); and Virtual Open Network (http://www.v-one.com/product/swall/oview.html).

One of the best firewall resources is British, and can be found at http://www.zeuros.co.uk/firewall/. There are useful white papers about the basic concepts at http://www.zeuros.co.uk/firewall/mustread.htm. Particularly good is the Firewalls FAQ at http://www.zeuros.co.uk/firewall/mirror/www.v-one.com/pubs/fw-faq/faq.htm.

    Fonts

    Languages

One of the great attractions of the World Wide Web for businesses is its global reach. But this apparent universality is something of an illusion: it is, indeed, easy to reach customers in other lands, but it is very hard to speak to them in their own language. Most people probably know that some "foreign" characters can be created in a Web page using numeric codes - for example &#233; produces an e-acute. It is also possible to enter entities such as &eacute; to produce the same effect. All of these generally refer to the basic character set formally known as ISO-8859-1 or ISO Latin-1 (see http://babel.alis.com:8080/codage/iso8859/jeuxiso.htm).

    It is generally assumed that Web pages are irredeemably wedded to this character set. But in fact HTML consists simply of a stream of bytes; what these bytes represent is arbitrary. One obvious solution to the problem of representing other character sets is therefore simply to re-define what those bytes mean. For example, instead of ISO-8859-1 (Latin-1) the ISO-8859-2 set could be employed. This gives all of the "normal" Latin characters, but replaces some of the Western accents with those required for Central and East European languages such as Czech, Hungarian, Polish and Slovenian.

To work in any of these languages, you need to have installed a font that offers the extra characters, and to have the ability to change to this font in your browser. How to do this depends on the Web browser: next week's feature will give some practical details. For the Cyrillic alphabet there is the ISO-8859-5 set. However, for historical reasons most Web sites have chosen a rival encoding, known as KOI8-R; there is also another Windows encoding called Win1251.

There are similar problems with other languages. For example, Japanese Web pages employ one of three systems: JIS, Shift-JIS and EUC. This is over and above the complication that Chinese and Japanese Kanji characters cannot be encompassed within the 8-bit space used by ISO-8859. Instead a two-byte representation is used - requiring yet more intelligence in the software so that these can be converted correctly.

Arabic and Hebrew present another challenge to the browser: the representation of characters that are written right to left. There is also the issue of context: Arabic characters can take one of four forms depending on whether they stand alone, or are found at the beginning, end or middle of a word, and may also require ligatures to join them to surrounding letters.

To try to define a standard format for all character sets, Unicode (http://www.unicode.org/), or the Universal Character Set (UCS), has been drawn up. The idea here is to replace the single-byte representation used by ASCII with a general two-byte coding. Because such two-byte encodings can be problematic for many programs, two related encoding schemes (ftp://ds.internic.net/rfc/rfc2044.txt) have been drawn up: the UCS transformation formats (UTFs) UTF-7 and UTF-8. These employ single bytes, and have the property that they preserve the 7-bit ASCII character encoding.
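
A quick way to see the ASCII-preserving property is to encode a few strings and inspect the bytes; the snippet below uses Python's built-in codecs.

# ASCII characters keep their single-byte values under UTF-8; accented and
# non-Latin characters become multi-byte sequences.
for text in ("Web", "café", "Привет"):
    encoded = text.encode("utf-8")
    print(text, "->", len(text), "characters,", len(encoded), "bytes:", encoded.hex(" "))

# "Web" encodes to the same three bytes as plain 7-bit ASCII (57 65 62);
# "café" needs five bytes because é becomes the two-byte sequence c3 a9.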

Next week's feature will show how all this theory is used in practice, and how multilingual Web pages are now relatively simple to create. But it is worth flagging up some other areas that still pose major challenges to the Web as currently constituted. For example, although it is relatively straightforward to create multilingual Web pages, the HTML code itself remains in ASCII. Alongside complex proposals such as the Extended Reference Concrete Syntax (http://www.sgmlopen.org/sgml/docs/ercs/ercs-home.html), there are simpler ones (http://www.alis.com:8085/ietf/html/draft-ietf-html-i18n.txt) involving the use of a language attribute with HTML elements.

    URLs too are currently limited to ASCII characters, but ideally would allow users to specify addresses in their own language and script. One proposal is to employ the UTF-8 encoding to offer this while providing backward compatibility. Yet another challenge is to cope with forms input in different languages. For good introductions to this whole area, see Multilingual World Wide Web (http://www.ebt.com/docs/multling.html), Internationalisation of HTML (http://www.w3.org/pub/WWW/International/francois.yergeau.html) and the excellent site at Babel (http://babel.alis.com:8080/).

Few people in the anglophone world worry much about these issues, and probably equally thin on the ground are those who have tried to add multilingual capabilities to their browsers. A good reference point for users interested in multilingualism and the Internet is http://wwli.com/library/localize.html.

Perhaps not surprisingly, Microsoft's Internet Explorer is well thought-out as far as international use is concerned. For all its faults, Microsoft is a company that is keenly aware of the importance of localisation for software. Internet Explorer is available in 9 languages (that is, with menus changed appropriately), and it is also easy to add extra character sets to view multilingual pages.

This is done by downloading the appropriate font file (http://www.microsoft.com/msdownload/ieadd/03.htm); the choices are Simplified Chinese, Traditional Chinese, Japanese, Korean and Pan-European. Running the file causes it to be installed and the relevant changes for Internet Explorer to be made automatically. Thereafter, there is a pop-up list of available character sets in the bottom right-hand corner of the browser window. When you encounter a page using a character set other than Latin-1, you can simply select from this list to refresh the page.

Adding this capability to Netscape is much harder, and reflects this young company's relative inexperience in dealing with international markets. First, you need to find the relevant fonts yourself (in practice, the simplest solution is to use those provided by Microsoft). Once these have been installed, you must then activate them in Netscape. This is done from the Options menu, by choosing General Preferences and then Fonts. For each of the encodings you specify the font that you have added. Then, to use this encoding for a Web page, you need to go to the drop-down list available under the Document Encoding entry on the Options menu.

    None of this is very intuitive; worse is the fact that for Japanese font capabilities you have to edit an entry in the Windows registry - the software equivalent of brain surgery, and about as risky. If you want full foreign language capabilities for Netscape, it may be easier to buy the plug-in (http://www.accentsoft.com/) called Navigate with an Accent from Accent Software. This adds a new drop-down list of character sets alongside the main menu buttons. An evaluation copy is available from http://www.accentsoft.com/download/dleng.htm. Unfortunately the add-in disables important features such as plug-ins, frames and Java.

    Accent produces its own standalone browser (based on the original Mosaic). This too adopts a drop-down list of language options, though strangely Chinese is absent. Accent does, however, offer both Arabic and Hebrew, something that neither Internet Explorer nor Netscape is capable of. An evaluation copy is available from the URL given above. Another product in the Accent range that can be downloaded from there in a trial version is Accent Publisher. This addresses the other side of the multilingual problem: creating Web pages with character sets other than Latin-1.

With Accent Publisher, you can design a page in most European languages plus Arabic and Hebrew (floating keymaps let you use a QWERTY pad to enter non-Latin characters) and then convert it to HTML files automatically. More advanced features such as tables are supported. Also notable is the ability to swap among 21 languages (including Arabic, Greek, Hebrew, Russian and Turkish) for the menus.

    Another browser product based on the original Mosaic is Tango, from Alis Technologies, whose site was mentioned last week as a useful starting point for exploring Internet multilingual issues. An evaluation copy can be downloaded. Tango can display no less than 90 languages, including Arabic, Chinese, Greek, Hebrew, Korean, Russian and Thai. The interface can be switched to any of 19 languages. The corresponding creation software called Tango Creator lets you compose HTML pages in 90 languages using character sets other than Latin-1, and supports tables and frames.

    The changing (type)faces of the Internet

    Advances in font design for the Web are ushering in a new era of Net publishing that puts the emphasis on corporate identity.

Design is a crucial issue for businesses today. For example, it is commonplace for companies to adopt a distinctive house style when it comes to typefaces for information about themselves and their products, and corporate designers naturally expect to have full control over how materials will look.

    On the Web, by contrast, it is the consumer rather than the supplier who determines the onscreen appearance. This arises from the way in which HTML works, whereby overall structure is transmitted from the Web server and then given a local form by the Web browser. For example, within an HTML document there are various kinds of headings, but the exact details of how each of these is implemented - typeface, size, weight - cannot be guaranteed.

    One solution to this problem is that adopted by Microsoft. It has introduced a special HTML tag that allows Web page designers to specify what typeface should be used if available. So <FONT FACE = "Arial Black, Courier New"> specifies two possible typefaces for the text it refers to.

    One difficulty with this approach is that it is browser-specific. For example, not even the latest version of Netscape supports this syntax. A more formal approach to the problem is offered by the idea of style sheets.

These are a way of specifying the overall design elements of an HTML file, including things such as how the various levels of headings will appear, colours of body text etc. They are analogous to the style sheets found in many word-processing and desktop publishing packages. There are a couple of rival approaches to style sheets, of which Cascading Style Sheets (CSS1, see http://www.computerweekly.co.uk/gwfeat/gwspeak/css1.html) is the most important.

    The agreement of an official HTML style sheet standard is good news because it will avoid the proliferation of local HTML dialects that had started to occur. However, apart from the fact that the idea is so new that few browsers currently support the feature, there is a more fundamental problem.

    Style sheets specify what typefaces should be used if they are available. However, if a particular font is not on the client system it will be substituted with a default - and completely change the original design. To avoid this it is necessary to provide some kind of font delivery mechanism whereby any fonts that are needed in a page are sent with it (rather as Java applets are sent to add extra functions).

    This is such an obviously sensible approach that two rival standards were proposed to implement it, one from Adobe, Netscape and Apple, and the other from Microsoft. Miraculously the Internet spirit of co-operation seems to have prevailed again, and a joint standard called Opentype (http://www.microsoft.com/truetype/fontpack/pr3.htm) has been announced with support from all parties.

    On its own, this kind of font embedding would not be enough: Web pages would be grossly inflated and become unusable for most kinds of users. Two other components of the Opentype idea are therefore crucially important. The first is called subsetting, and means that not every character of a font set needs to be sent. For example, a font may include irrelevant characters; by omitting these from the embedded font that is sent with the Web page the overall size is kept within reasonable limits.

    Size is further reduced by employing a special kind of font compression technology. Developed by Agfa, Microtype Express is a lossless, on-the-fly compression technology that can compress a font in seconds and decompress it in far less than that.

    Given the very broad industry support for the Opentype standard, and the long-standing need of companies for precisely this kind of control, it is likely that a new era in Web publishing is about to begin. In the not-too-distant future businesses can start to treat the Net as a medium for marketing and creating brand-awareness that matches any other for power and flexibility and goes beyond them all in terms of reach and cost-effectiveness.

    Frame relay

    Frame relay, like Asynchronous Transfer Mode (ATM), is an advanced data communications technology that is starting to become more common in corporate environments. It is sometimes seen as a rival to ATM, but in reality it is a complementary solution.

    ATM is optimised for use with networks capable of tens of megabits per second or more and is good for carrying data-intensive applications such as multimedia.

    It employs fixed-length packets of data that can be quickly switched. Frame relay, by contrast, uses variable-length frames. This means it is more efficient for slower applications where the overhead of ATM can be a penalty. Frame relay is related to the older X.25 packet-switching technology, which used to be common among companies, but again it is more efficient.

    Its other advantage is that it tends to cost less than a leased line of equivalent speed. It uses virtual circuits that may exist only for the duration of the connection, rather like the Net, although frame relay also uses permanent virtual circuits.

    Frame relay is well suited to traffic that comes in bursts, where dedicated lines would not be filled to their capacity all the time.

    Internet traffic and communications between intranets often follow this pattern. Therefore, frame relay is a good solution for users connecting to Internet providers at higher speeds and for joining up intranet islands.

    Free Internet Services

    One of the Internet's most unsettling aspects for business users is the way it seems to fly in the face of conventional economics. Its basic operational characteristics, whereby connections to the other side of the globe cost the same as those to the other side of the road, are confusing enough. But even worse for many is the constant recrudescence of the "free" idea.

    Alongside the more formally constituted Free Software Foundation, there is the burgeoning open source movement, to say nothing of the various less-than-legal manifestations of a more general attempt to implement the "information wants to be free" idea by posting copyright material online.

    But it is not just the hippy/hacker fringe that is infected with the idea of freedom: practically all content is freely available, the main browsers are now all free, free e-mail services are commonplace, and other free Web-based services such as calendaring are being devised.

    Further proof that the business rules are being turned upside-down is provided by the launch of a US company called Free PC. As its home page explains, 10,000 qualifying participants in this scheme will be given a new PC and Internet connection completely free for two years. To qualify, applicants must provide extensive demographic details so that targeted Web-based advertising can be sent to them. In fact the ads will actually reside on the PC's disc: no less than 2 Gbytes will be allocated to multimedia displays from the scheme's sponsors. This will avoid long download delays for the ads - which are always present on-screen when users are online.

    Free-PC is yet another venture from the ever-fertile IdeaLab. It represents the logical extrapolation of a number of other free US services that provide free Internet connections, but no hardware.

    These services all generate their revenue by selling targeted advertising. Rather remarkably, it has proved possible in the UK to provide similar free Internet services without necessarily imposing any advertising. This is because of differences in the way that the telecoms industry functions in the two countries. In the US, local calls are frequently free, but in the UK they rarely are - a fact that has probably acted as a considerable brake on the uptake of the Internet both here and on the continent, where the call charge structure is similar. However, because of the way the deregulated telecoms industry works, it is possible for companies offering a free Internet service, say, to generate income from interconnect fees.

    This has led to an extraordinary spate of free Internet services in the UK, many of them with no advertising (unlike in the US, where it represents the only available revenue). Even BT has joined the club.

    The most famous of these is Dixon's Freeserve which now boasts over one million users. Dixons can make money in a number of ways: the interconnect fees, online advertising and online sales. The latter is perhaps the most interesting, since retailers are now able to remove a major barrier to the spread of e-commerce - the need for an Internet connection - by providing it free. Proof that this model is spreading is provided by the news that Tesconet, originally a paid-for Internet service, is to be made available free. Other, originally paid-for Internet services that now offer a free service option include UKOnline and VirginNet.

    It will presumably soon be easy even for the smallest online merchants to offer a free Internet account from dedicated Internet service providers serving this market. Another manifestation of the Internet's "free" culture is even more radical. The US e-commerce company Onsale sells all its goods at wholesale prices - for zero profit, that is - and aims to make money from the advertising it carries. This idea of turning an entire product catalogue into just a marketing tool will send shivers down the spines of many business people brought up with more conventional approaches.

    Ghostscript

Ghostscript, as you might guess from the name, is intimately related to PostScript, and in a sense is the Internet's home-grown version of it.

    PostScript is Adobe's page description language for documents (see http://www.adobe.com/prodindex/postscript/overview.html ). That is, instead of describing a printed page of words and images in terms of the dots that computer printers typically place on paper, it creates those letters and graphics by defining them mathematically as a series of lines, curves and shapes, placed together in a certain way.

    This has a number of advantages. First, the description is typically very compact: for example, to describe a large solid circle in PostScript is trivial, but a conventional graphics file holding the positions of every dot required to create it would be sizeable. Perhaps more importantly, PostScript descriptions are platform-independent: this means that the same PostScript file can be printed out from any machine without the need for further modifications to take account of the fact that the dots used to realise the image will be different in each case.
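
To make the compactness point concrete, the sketch below writes a minimal PostScript file that paints a filled circle and compares its size with the storage an uncompressed bitmap covering the same area would need; the coordinates are arbitrary and the bitmap figure is a rough, illustrative calculation.

# A filled circle described in PostScript takes a few dozen bytes; storing the
# same area as an uncompressed bitmap takes tens of kilobytes.
postscript = b"%!PS\n300 400 100 0 360 arc fill\nshowpage\n"   # centre (300,400), radius 100

with open("circle.ps", "wb") as f:
    f.write(postscript)

print("PostScript description:", len(postscript), "bytes")

# Rough comparison: a 200 x 200 pixel greyscale bitmap at one byte per pixel.
print("Equivalent uncompressed bitmap:", 200 * 200, "bytes")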

    Ghostscript is the online community's response to this powerful but proprietary product, a clone that can process PostScript files to produce the correct output, but one which is available free to end users.

Ghostscript's home page is at http://www.cs.wisc.edu/~ghost/index.html. Here you can download the latest version of Ghostscript (which is an interpreter for PostScript) and Ghostview, an X11 previewer for PostScript files. GSView (at http://www.cs.wisc.edu/~ghost/gsview/index.html) is a graphical interface for Ghostscript, and can be used as a helper application for browsers. This allows PostScript files to be displayed in the browser.

A first guide to PostScript can be found at http://www.gkss.de/W3/PS/postscript.htm. General resources can be found from http://www.geocities.com/SiliconValley/5682/psotscript.html.

    Global Network Navigator (GNN)

    One of the most interesting developments on the Internet has been the appearance of commercial operations offering free information. Obviously this is not pure philanthropy: rather, it is a kind of sponsorship that gets these companies' names before the vast Internet public.

    One of the first firms to do this is O'Reilly & Associates. Computer Weekly readers will know it for its series of technical books on Unix and network issues. The Global Network Navigator (GNN), its World Wide Web server (best accessed by UK users through the mirror-site http://src.doc.ic.ac.uk/gnn/), has many other jumping-off points for further exploration of the Internet, and is open to all, though you are asked to register.

    At the heart of the GNN is the catalogue of Internet sites found in Ed Krol's classic The Whole Internet book, converted into a searchable hypertext page with hotspots.

Particularly interesting is a list of the top 50 sites. A separate Business Pages section contains various commercial services, while NetNews details some of the latest Internet developments. Other sections include a Help Desk for Internet beginners, and online 'exhibitions' that look at particular topics.

    Nearer home, and on a slightly smaller scale, there is the NewsWire server (http://info.learned.co.uk/) from the UK company Learned Information. As well as providing subscription details, this lets you read features from the current and previous issues of NewsWire. This often carries interesting and unusual stories from a nicely Euro-centric viewpoint. There is a Gopher at info.learned.co.uk

    Finally, the Perl server at http://www.cis.ufl.edu/perl/ is a WWW server with serious information. It may not tell you everything about the Unix junkie's favourite scripting language, but you will find links to manuals, FAQs, FTP, other Perl sites and quotations from the Perl god Larry Wall.

    GNU General Public Licence

    A far more sophisticated alternative to shareware has become an important part of Net culture. This is the GNU General Public Licence, which arose out of the larger GNU project (GNU stands, recursively, for "GNU's Not Unix"; the project aims to create a complete, freely available version of Unix).

    The essence of the GNU General Public Licence is that it tries to safeguard everyone's right to use a program by setting out the minimum freedoms you are guaranteed, unlike most licences, which do the opposite and strictly limit what you can do by stating the maximum you are allowed.

    In particular, the licence specifies that you can distribute copies of any software subject to the GNU licence (and charge for this service if you wish); that you receive the source code or can obtain it if you want it; that you can change the software or use pieces of it in new free programs; and that you know you can do all these things. The other important element of the GNU Public Licence is that there is no warranty for any software distributed according to its terms.

    The full text of the GNU Public Licence version 2 can be found at http://www.cs.utah.edu/csinfo/texinfo/emacs19/emacs_3.html#SEC3.

    Glue Languages

    The Internet has had a dramatic impact on many aspects of computing, but one that has been little remarked upon concerns programming.

    The new connected world of computers has led to a shift away from traditional heavy-duty programming languages such as C/C++ towards lightweight tools. Most of these are generally called scripting languages, and the reasons for their adoption vary. For example, Javascript is the scripting language par excellence for Web pages, while on the server side VBScript is popular because of the large installed base of Windows NT systems.

    Also important on the server side is Perl, which has established itself as the language of preference when creating CGI scripts. Other languages popular among Internet users include Tcl/Tk and Python.

    Scripting languages are becoming more widely used in part because the latest generation of Web programmers is growing up.

    For them, lightweight languages that can be learnt and applied quickly to solve immediate problems are more suitable than complex languages like C, which can take months to master. Such scripting tools also reflect a new approach to software projects: rather than creating huge programs from scratch, the tendency is to assemble pre-existing elements. In this respect, scripts form the glue that binds these components together, so these new programming tools are sometimes referred to as glue languages.

    The use of smaller programming elements in this way is part of a more general move towards component-based development, for example with JavaBeans or Microsoft's Component Object Model (COM) architecture.

    Rise and rise of glue languages

    Greatly simplified, the history of computer hardware can be viewed as one of progressively smaller and more personal units linked together ever more widely. After the mainframe with dumb terminals came minis offering a client/server architecture, to be followed by local area networks of peered PCs and the currently-evolving global Internet of IP-enabled devices.

    This evolution has gone hand-in-hand with the development of the interface for each hardware generation. After batch processing came character-based command lines for real-time control, followed by desktop graphical user interfaces culminating in the present browser-based, application-independent Webtop approach.

    And in the world of programming, there has been a parallel movement. Machine code and assembly language were followed by higher-level languages such as Cobol, Fortran, C, C++ and Java, and these in turn by today's scripting languages.

    Such scripts frequently make use of software code written in other languages. In this respect they act as a way of hooking up a broad range of pre-existing elements - hence their other name of 'glue languages'.

    In part the rise of scripting is down to economics. Programming is an expensive activity, and the ability to employ tools that can be learnt quickly, and applied widely and easily, together with the possibility of re-using existing code, is a compelling combination.

    Moreover, the emphasis is increasingly on getting more value out of current systems and the data they hold, rather than writing entirely new ones. This has put the focus on connecting to and working with often disparate collections of legacy resources - an area where scripting excels. But alongside such considerations, the Internet is also proving a major force in the current rapid uptake of scripting tools.

    By their very nature, the Internet and related technologies such as intranets and extranets are heterogeneous. Portability is therefore a key requirement of tools that can work in these environments, something that scripting tools typically offer.

    The shifting collection of platforms and applications found in IP-based distributed systems also means tools must be very adaptable. Scripting languages are typically highly extensible, allowing them to be modified quickly to meet the demands of a particular situation.

    Alongside these general considerations, there are also several specific reasons why scripting tools are spreading rapidly in the Internet world. For example, Perl is almost routinely used where the Common Gateway Interface approach is employed to link Web sites to back-end database and other software. And because Perl excels in string manipulation, it is a natural tool for processing HTML files, which are just text.

    Microsoft, too, has contributed to the popularity of Internet scripting, particularly on the server side.

    Its widely-used Active Server Pages technology employs scripting to generate Web pages on the fly from back-end databases and other sources. One reason why ASP has taken off is that it is scripting-neutral: you can use Perl, Microsoft's VBScript (a cut-down version of Visual Basic) or Javascript, among others.

    Netscape's introduction of Javascript in the Navigator 2 browser, its formalisation as ECMA-Script, and Microsoft's support for the latter, have all helped to ensure that scripting is also very common on the client side.

    Finally, the increasing acceptability of open source software like Apache or Linux has meant that free scripting languages such as Perl, Tcl and Python - all of which are open source - are now viable options in a corporate context.

    The Active Scripting organisation aims to promote scripting further through the development of open source ports of Microsoft's ASP technology to other platforms such as Apache and Mozilla. The fact that even new commercial scripting languages, like Rebol, are being released for free download seems likely to help spread the scripting word further.

    Gophers

    Gopher is the name given to a way of finding your way round the Internet using a simple menu-driven approach.

    Gophers start off with a list of top-level categories - e.g. arts, business, etc - each of which takes you to a further list of subsidiary options - the business option might lead to finance, economics, taxation, etc - until at last you reach the detailed information itself. Connections are handled automatically with text files being displayed on your screen and binary files being downloaded to your computer.

    The name gopher was chosen partly because, like its human counterpart, it can 'go for' things and partly because the animal gopher was the mascot of the University of Minnesota where the system was developed.

    Gophers consist of a client and a server: you use the client to ask for information held by the Gopher server that contains the index.

    The easiest way to use a gopher is to have a Gopher client resident on your system - you need more than just an E-mail connection for this. Then, to search for information, call up your local Gopher client (either by typing something like gopher at the command line, or by running a Gopher application such as HGopher or WinGopher under Microsoft Windows), and connect to a Gopher server over the Internet.

    Because Gophers are merely a way of structuring references to other information, it is also possible for companies to use them for disseminating information about themselves, by providing a menu-driven interface to marketing and sales documents, for example.

    You can save time and energy by choosing an appropriate Gopher before beginning to search for particular information. One of the best general Gophers is the Gopher Jewels implementation. This was an ASCII file containing interesting Gopher sites; it has now turned into a Gopher site itself, with links to all the 2,200 or so entries to be found at the end of its nested menus. Among the mirror sites is gopher.csv.warwick.ac.uk.

    If you choose option number 9 on the main menu, and then option 4 on the sub-menu, you will be taken to the Gopher Jewels section. From there you can either explore specific subjects or carry out searches in a variety of ways.

    If you want a little mystery, try http://kuhttp.cc.ukans.edu/cwis/organizations/kucla/uroulette/uroulette.html, which takes you to URouLette and connects your computer to a randomly chosen URL.

    Gopher exists in two forms, the basic Gopher and Gopher+ (pronounced 'Gopher plus'). Among the more advanced features of Gopher+ is the ability to request information about a Gopher menu item before you download it. This might include what kind of file it is (for example, a sound file or executable), who created it and when it was last updated.

    Gopher+ servers are also able to store the same file in different ways: for example, words as a text file or a PostScript file, graphics as both .gif and .jpg files. If you use a Gopher+ compliant client you can set it up in such a way that the choice among these alternatives is made for you automatically according to your preferences.

    Since the Gopher+ standard is backward-compatible with the vanilla variety you only see the extra features if your client software is capable of handling them. For everyone else, a Gopher+ machine looks exactly the same as the older kind.

    One of the richest selections of material available from one site can be found at the University of California gopher://peg.cwis.uci.edu:7000/. Here you can find links to other Gophers, telnet services and various kinds of White Pages. This site calls itself the Peripatetic Eclectic Gopher (PEG). PEG's main menu includes a number of fairly academic areas such as biology, the humanities, physics and philosophy, but also some general resources that business users will find well worth exploring.

    The link to Electronic Journals connects to the Electronic Newsstand, with tables of contents and sample articles from business magazines such as The Economist, Business Week and Inc.

    The menu entry headed Virtual Reference Desk leads to the Internet Mall, a searchable acronyms dictionary, the CIA World Factbook, information organised according to subject - including very good lists of economics and business information - the US National Trade Databank, stock market reports, Roget's Thesaurus and worldwide weather.

    A list of gophers can be found at http://www.infohiway.com/gof/index.html, while at http://www.infohiway.com/index2.html there is a list of FTP and Web servers.

    How to Gopher it using E-mail on the Internet

    To begin an E-mail Gopher search, you send the message help to the address gopher@earn.net. In the subject line you put the address of the Gopher where you wish to start. For example, using the address of the 'Mother Gopher' at gopher.tc.umn.edu in the subject line, you would receive a message beginning:

    Mail this file back to gopher with an X before the menu items that you want. If you don't mark any items, gopher will send all of them.
    1. Information About Gopher/
    2. Computer Information/
    3. Discussion Groups/
    4. Fun & Games/
    5. Internet file servers (ftp) sites/
    6. Libraries/
    7. News/
    8. Other Gopher and Information Servers/
    etc.

    This is the main Gopher menu. To retrieve one of these items, say number 7, you would send back to gopher@earn.net the following message (the subject line can be left blank):

    Split=64K bytes/message <- For text, bin, HQX messages (0= No split)
    Menu=100 items/message <- For menus and query responses (0=No split)
    Name=News
    Numb=7
    type=1
    Port=70
    Path=1/News
    Host=gopher.tc.umn.edu

    In addition to the first two lines (which essentially specify how the reply is to be packaged), a block of lines taken from the second half of the returned E-mail (not shown) has been added: this tells the E-mail Gopher which menu item you wish to retrieve.

    You can also obtain this item by sending back the whole message with an 'X' placed in front of the appropriate menu number.

    In due course you will receive the following sub-menu, corresponding to item 7 in the main menu:

    1. AMInews Ski Reports/
    2. Cornell Chronicle (Weekly)/
    3. French Language Press Review/
    4. IT Connection (University of Minnesota)/
    5. Minnesota Daily/
    6. NASA News/
    7. National Weather Service Forecasts/
    etc.

    To retrieve item 6, say, you would send to gopher@earn.net the following message:

    Split=64K bytes/message <- For text, bin, HQX messages (0 = No split)
    Menu=100 items/message <- For menus and query responses (0 = No split)
    Name=NASA News
    Numb=6
    type=0
    Port=79
    Path=nasanews
    Host=space.mit.edu

    This has the same bipartite structure as the other request you sent: the format information, followed by the details of which menu item you wish to retrieve.

    After a little while you should then receive the text corresponding to item 6, beginning:

    nasanews: "space" Fri Apr 7 01:46:44 1995
    MIT Centre for Space Research
    This NasaNews service is brought to you by the Microwave Subnode of NASA's Planetary Data. etc.

    Obviously any menu item could be chosen at any point in the above, and any Gopher could have been chosen for the subject line of the first message.

    Graphics

    One of the most important elements of the HTML pages that underlie the World Wide Web is graphics. These can be of two kinds: those that are pulled in only when you click on a link to them (and therefore exist as a separate element in the hypermedia web) and those that are inline images.

    Links to the latter are embedded in the HTML coding in such a way that they are generally loaded automatically. There is a very important subclass of inline images that is becoming increasingly common, and which appears on most of the more advanced Web sites.

    These images are called transparent, and they are readily recognisable by the way they seem to float in the page, with the background colour of the Web browser meeting the outline of the images exactly. Contrast this with a non-transparent image where there is a definite boundary, which is usually rectangular, that separates the image region from the rest of the surrounding page.

    A good example of a transparent image can be found in the top-left hand corner of Silicon Graphics' Home Page at http://www.sgi.com/. Here the company's logo seems to be an embossed element on the page, but is in fact a transparent image.

    Transparent images derive their name and their effect from the fact that it is the image's background colour (outside the central area) that is rendered transparent, and thus allows the underlying Web browser background to show through. Various graphics editing tools exist that can be used to create this effect.

    Graphical file formats

    One of the most important shifts that has taken place on the Internet in recent years has been that from an essentially text-based medium to one that routinely uses multimedia elements. Among these, visual components are by far the most common, for the simple reason that no special equipment is required (unlike sound on a PC, for example). The ineluctable rise of the World Wide Web browser, with its inherently graphical approach, has also contributed.

    And so it is that graphical file formats have moved closer to centre stage. One manifestation of this was the recent brouhaha over the .gif format. The Graphics Interchange Format was devised by CompuServe, but used older proprietary compression technologies - hence the undignified saga of accusations and counter-accusations between CompuServe and Unisys when it came to placing the blame for the sudden attempt to impose licensing fees for these.

    The important thing to note about .gif files is that they are lossless: that is, although they succeed in reducing the size of a raw graphical image, no data is lost in the process. This makes .gif files ideal for higher-quality images.

    However, even though they are compressed, .gif files can still be large and hence take a long time to download over slower connections. This explains in part the popularity of the main alternative graphical format, .jpg. Named after the Joint Photographic Experts Group that devised it, .jpg files are generally smaller than the corresponding .gif files. However, information is lost in the compression process, and so the quality tends to be lower.

    Fractal compression techniques (http://www.iterated.co.uk/info) provide higher compression ratios and images that zoom better than JPEG. They require plug-ins for browsers such as Netscape. Fractal compression is asymmetrical - it takes more processing power to compress than to decompress. This makes it ideal for multimedia use, as the receiver of the material does not need to be as powerful as the creator.

    Each format, then, has its advantages and disadvantages, and the appropriate one should be used according to the particular situation.

    Gutenberg Project

    The Gutenberg Project is another of those seemingly crazy, altruistic exercises that in part help to define the unique spirit of the Internet. The Project's basic plan is to convert first hundreds, then thousands and perhaps one day millions of books into a simple ASCII format, so that anyone can download them for free from the Internet, or acquire them for the cost of a floppy disc. The name derives from the Gutenberg revolution of the 1400s, when the spread of new printing technology brought the cost of a book down by a factor of several hundred, allowing a whole new class of readers to emerge. The aim is to reduce the effective cost of a book by the same ratio again, and perhaps enfranchise another great swathe of the population.

    Starting with a modest conversion rate of 10 books a year in 1991, and reaching around 100 in 1994, the project hopes that by the year 2001 it will have a library of some 10,000 E-texts. Practically everything is the work of dedicated amateurs who painstakingly enter volumes for no reason other than a desire to see the Project flourish and the love of reading spread.

    Already most of the obvious titles such as The Bible and Shakespeare have been entered. Copyright remains a problem, but happily there is a generous supply of worthwhile books that are in the public domain. One of its most recent feats shows the ambitions of the Project: Volume 1 of the classic 1911 edition of the Encyclopaedia Britannica is now available online, part of something to be called the Project Gutenberg Encyclopaedia, which will grow to include all of the text. It can be downloaded from ftp://mrcnext.cso.uiuc.edu/pub/etext/etext95/pge0112.txt; but note that the file is over 8 Mbytes in size.

    H.323

    Using the Internet as a means for telephony, with all that this implies for costs, is clearly a fascinating idea. There is now a very wide range of Internet telephony products available. But in a sense, this is the problem: the many competing products mostly follow their own proprietary paths. Clearly, the whole point of Internet telephony is nullified if you can only contact a small subset of those active in this area - itself a fraction of the total Internet population.

    What is needed is a standard, and it is becoming increasingly likely that it will be one called by the rather unmemorable name of H.323. This has been drawn up by the International Telecommunication Union (ITU), and applies more broadly than to just Internet telephony. As the standard's summary states, H.323 "describes terminals, equipment, and services for multimedia communication over Local Area Networks (LAN) which do not provide a guaranteed Quality of Service. H.323 terminals and equipment may carry real-time voice, data, and video, or any combination, including videotelephony."

    The key phrase here is the absence of Quality of Service: this means that some of the digitised packets sent over the network (the Internet in this case) may get lost, and H.323 knows how to cope with this. One of the first Internet telephony products to support H.323 was from Intel; now Microsoft has come out with a revised version of its NetMeeting, while Netscape too has announced that it will follow the standard. With this kind of backing, the future of H.323 looks rosy.

    Hacker

    Although the Internet originated from work carried out for the US Department of Defence (a principal concern of which was designing a network system that could withstand a nuclear attack, one of the reasons for the Internet's robustness to this day), it soon developed into a mainly academic medium.

    As well as providing university researchers (mainly in the US, but later throughout Western Europe and further afield) with a means of communicating, it became closely bound up with the prevailing student ethos, particularly among those working in the computing fields. One of the consequences of this was that the early enthusiasm among this group for programming and related areas soon became central to the Internet culture too. (It was already tightly woven into that of computers and communications.)

    Among these young adepts, this activity was known as hacking. As Hackers, Steven Levy's highly readable 1984 description of this world, recounts, "hacking" had none of the negative connotations it so often does today, and the term "hacker" was a badge of honour.

    According to this view, hacking refers to any skilful activity to do with computers, and has no sense of the illegal or illicit. This aspect is covered by the term "cracker".

    However, the increasingly widespread pejorative use of the term hacker has muddied the waters: it is not always clear whether an Internet hacker is to be admired or not. For this reason the unambiguous term cracker is preferred among Internet purists for those whose intentions are less laudable than their abilities.

    HDML

    Even though HTML is just an application of the more general and powerful SGML, its extraordinary success has meant that it has become an example others have tried to emulate. One of the latest manifestations of this is the hand-held device markup language (HDML).

    This takes as its starting-point the observation that HTML, for all its power, is essentially wedded to a navigational model that presupposes a decent-sized screen and a fair amount of memory and computing power. None of these is available in the world of hand-held devices. These are not the not-so-small computers running Windows CE, but devices with very small displays and limited bandwidth.

    The idea of HDML is not to cram all the content of a Web page into a format that can be displayed by these units - this is clearly impossible - but to come up with a way of displaying appropriately formatted information on these kinds of devices.

    In particular, HDML offers a navigational model that is suitable for these screens. This is built on the idea of cards, rather like Apple's HyperCard, with the concepts of next and previous corresponding to the forward and back buttons on a browser.

    HDML has a surprising number of big names behind it, including AT&T, Mitsubishi, Sun and Tandem. Its originators claim that more than 2,000 applications have already been created using it. Several US mobile phone companies offer HDML services and both parcel delivery firms Federal Express and UPS provide information in this format.

    Hits

    In 1994, a 'hit' in the context of the Internet would have meant a successful search result, probably using a WAIS system. Today that meaning is still widely found, but another has usurped it as the primary connotation - and become one of the hottest current discussion topics online.

    This type of hit refers to some kind of visit to a Web page: typically a site-owner would claim to get so many thousands of hits a day (or hour if they are a top spot). Hits matter because they are the nearest thing that the Internet (and in particular the World Wide Web) has to readership levels. And these are crucially important for attracting advertisers (now well-established as the preferred way of paying for free Internet sites) and extracting healthy ad revenues from them.

    Although everyone agrees hits are important, nobody can agree quite so well on what constitutes a hit. For example, many sites count every Web page access attempt as a hit. This clearly overstates the number of visitors in many ways, notably by including failed and repeated attempts by one person to access the same page. The other extreme is to count only each Internet address, but here there is the problem of multiple users of one account (for example in a company), to say nothing of the fact that no further information is obtained.

    It will probably be some time before the hit is pinned down satisfactorily, but its central importance in establishing who is looking at what when they visit online sites means that, at some point and in some way, a generally accepted definition will evolve.

    HTML

    The HyperText Markup Language (HTML) lies at the heart of the World Wide Web, the multimedia hypertext system that has become the most popular way of providing and accessing information over the Internet.

    Web documents offer an amazing variety of features, including the ability to employ forms, drop-down menus and area-sensitive clickable images. Remarkably, the underlying HTML document is extremely simple and can be produced by nearly any text editor or word-processor.

    Any Web page can be viewed and saved locally. You can then view the file produced with a word processor. This is a good way of learning about features that interest you.

    HTML consists of a few tags that are placed in a text file in order to indicate to Web browsers how the text is to be displayed. For example, the HTML command <B> turns any text it frames into boldface. So if you type the line:

    The last word of this sentence is in <B>bold</B>

    in an HTML document, it will produce:

    The last word of this sentence is in bold

    in a Web browser, with the final word displayed in boldface.

    This use of the angled brackets and pairs of commands (one initiating, the other cancelling) is found throughout HTML.

    Text elements

    Just as the title of the page is enclosed within the <TITLE></TITLE> tags, so the main body is within the <BODY></BODY> pair (with the same "cancelling" forward slash in the second of them).

    The first thing to place in the body is generally a heading, which appears in the first line of the page. Headings are normally displayed in a larger typeface. There are six heading tags to choose from, <H1>,...,<H6>, each with its cancelling twin </H1>,...,</H6>. The heading text is placed between them.

    You can then enter the main text, broken into paragraphs using the <P> tag, one of the few that does not require a matching tag to cancel its effect. Line breaks within the HTML file have no effect on its appearance in a Web browser, which is determined wholly by the tags within it. This means you can use them to create files that are more readable in their raw state. Putting these elements together leads to an HTML file along the lines of:

    <HTML>
    <TITLE>A Minimalist HTML document</TITLE>
    <BODY>
    <H1>My home page</H1>
    <H2>First Document</H2>
    Text for section one.
    <P>
    Note that the line breaks within the HTML document have no effect on its final appearance.
    </BODY>
    </HTML>

    You can change the character of the text using <B> for bold and <I> for italic. For example, in the document above you could replace the line:

    Note that the line breaks within the HTML document have no effect on its final appearance.

    with:

    Note that the line breaks within the HTML document have <B>no</B> effect on its final appearance.

    to embolden the word "no".

    Another useful element that is easy to add is the list. There are two basic kinds, ordered and unordered.

    The former numbers each entry, while the latter uses simple bullets. The syntax is straightforward: first, you place the tag for the list type that you want - either <OL> for an ordered list, or <UL> for an unordered list - then each element of the list is prefixed by <LI>.

    Note that <LI> is like <P> in that it does not require cancelling, whereas both <OL> and <UL> have their corresponding closing tags </OL> and </UL>.

    A simple unordered list, which could be placed anywhere within the body of the HTML document, would therefore take the form:

    <UL>
    <LI>My first point
    <LI>My second point
    </UL>

    Although there are many refinements that can be added to these HTML elements, the basic usage described above is more than adequate to create a perfectly viable home page, albeit a slightly dull one.

    [For more on HTML look at http://www.computerweekly.co.uk/gwfeat/180496.html articles by Glyn Moody.]

    HTML 4.0

    The hypertext markup language (HTML) not only sparked the rapid uptake of the Web for business purposes, but also arguably drove the wider acceptance of the Internet in firms. Indeed, without HTML, the Internet would be little more than a global E-mail system for most companies.

    This central importance lends a particular weight to each HTML standard, and makes the formal release of the most recent, version 4.0, a significant event. The full standard can be downloaded in various formats. HTML 4.0's relationship to previous standards and to current Web practice is complex. The first official release was version 2.0, published as RFC1866. Unfortunately, by the time it emerged, the real world of Web publishing had moved on. In particular, the sharpening contest between Netscape and Microsoft led both companies to introduce unofficial extensions to the HTML standard.

    HTML 3.0 was meant to address this threat of fragmentation, but proved too ambitious in terms of achieving all parties' agreement on what should be included. As a result, it was never implemented.

    The next official release after 2.0 was 3.2, which included fewer new features than 3.0. Against this background HTML 4.0 was designed to accomplish a number of objectives. First, it had to bring the official standard closer to the reality of what users are doing on the Web.

    Second, it needed to establish a firm foundation for future developments. And third, it had to try to incorporate at least some of the long-planned extensions that HTML 3.0 had put forward. So, for example, HTML 4.0 formalises the use of frames for the first time. The standard also builds on the increasingly widely used tables, offering features such as the independent scrolling of table bodies (with table heads and feet fixed), new styles for cell dividers and body fills, and support for grouping columns.
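
    As a brief sketch of how these pieces fit together (the headings and figures here are invented for illustration), an HTML 4.0 table can group its columns and separate its head, body and foot:

    <TABLE BORDER="1" RULES="groups">
    <COLGROUP>
    <COLGROUP SPAN="2">
    <THEAD>
    <TR><TH>Region</TH><TH>1997</TH><TH>1998</TH></TR>
    </THEAD>
    <TFOOT>
    <TR><TD>Total</TD><TD>300</TD><TD>420</TD></TR>
    </TFOOT>
    <TBODY>
    <TR><TD>UK</TD><TD>100</TD><TD>150</TD></TR>
    <TR><TD>US</TD><TD>200</TD><TD>270</TD></TR>
    </TBODY>
    </TABLE>

    The RULES="groups" attribute draws cell dividers only between the column and row groups, and a browser that supports the feature can keep the <THEAD> and <TFOOT> sections fixed while scrolling the <TBODY> independently.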

    Some new buttons for forms are introduced, as are rather more important internationalisation features. These have long been a glaring omission: the anglocentric bias of HTML was clearly inappropriate in a medium that aspires to become truly global.

    HTML 4.0 not only adds support for other character sets (following the suggestions put forward in RFC2070), but also tackles the tricky question of writing direction. It is now possible to incorporate both left-to-right and right-to-left writing systems on Web pages, and even to mix them in the same line.
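
    A small sketch of the sort of markup this allows (the language codes and text are illustrative):

    <P LANG="ar" DIR="rtl">A paragraph marked up to run from right to left.</P>
    <P>An English sentence containing an embedded <SPAN LANG="he" DIR="rtl">right-to-left phrase</SPAN> in the same line.</P>

    The LANG attribute identifies the language of the enclosed text and DIR sets its writing direction, either for a whole block or for a short run within a line.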

    Some of the most important aspects of HTML 4.0 are the least obvious, because they flow by implication rather than explicitly from the new standard. For example, there is a general move away from using attributes to change particular HTML elements - such as altering the typeface or typesize of a heading or body-text - to employing style sheets.

    These allow full control over the appearance of a document while keeping form and content quite separate; something of a religious issue for HTML purists. There is also a new <SPAN> tag that allows HTML elements to be grouped together: this is particularly useful when taken in conjunction with some newly-defined intrinsic events. These are things like mouse-clicks or the gain and loss of focus for HTML elements.

    Combined with scripting languages, intrinsic events allow new levels of interactivity to be added to HTML 4.0 pages.

    It is now officially possible to attach scripts to actions such as passing a mouse cursor over a hyperlink: for example, a new style sheet could be applied when this occurs, causing the hypertext link to change colour and font. These new features, coupled with the emerging Document Object Model formalise what is called Dynamic HTML, perhaps the most important advance in Web design since the introduction of tables.
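
    As a simple sketch of the kind of thing this makes possible (the class names and script are illustrative only, and assume a browser that supports Dynamic HTML):

    <HTML>
    <HEAD>
    <TITLE>Intrinsic events</TITLE>
    <STYLE TYPE="text/css">
    .quiet { color: black }
    .loud { color: red; font-weight: bold }
    </STYLE>
    </HEAD>
    <BODY>
    <P>Move the mouse over
    <SPAN CLASS="quiet" onmouseover="this.className='loud'" onmouseout="this.className='quiet'">this phrase</SPAN>
    and watch it change style.</P>
    </BODY>
    </HTML>

    Here the <SPAN> tag groups a run of text, the style sheet defines two alternative appearances, and the onmouseover and onmouseout intrinsic events switch between them as the cursor passes over the text.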

    Frames

    The essence of the World Wide Web is non-linearity. Hypertext, by definition, lets you leap from one part of a collection of documents to other locations, determined only by what links have been embedded, and where you wish to go.

    However, in one respect, the Web page remains a prisoner of its linear origins in text. Although it may contain wonderful multimedia elements, each of which can be a hotspot taking you to many other pages, the basic structure of the Web page on the screen is remarkably conservative. What you see is essentially monolithic, a single page with fixed areas on it.

    The introduction of tables in HTML version 3 (through implementations from Netscape and others) alters the appearance of the page, not its underlying structure. This is what makes the new Web page element that Netscape calls "frames" so interesting. It represents a genuine step beyond what is currently possible (and along the way introduces yet another non-standard HTML element).

    Frames let you divide up your Web page into completely independent areas, each of which may have hot links that pull in new elements to that area, leaving the others unchanged. So, for the first time you can choose to place different combinations of hypertext elements on screen as you navigate through the pages.

    A more conventional but highly useful way of employing frames is to fix one down one side or along the bottom to act as an on-screen index. Some commercial sites have exploited frames to create advertising banners that do not go away when you move through the main document, making them something of a two-edged sword for users.
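
    As a rough sketch (the file names here are invented), the following document defines a narrow index frame down the left-hand side and a larger main area that it controls:

    <HTML>
    <HEAD><TITLE>A framed page</TITLE></HEAD>
    <FRAMESET COLS="25%,75%">
    <FRAME SRC="index.html" NAME="contents">
    <FRAME SRC="welcome.html" NAME="main">
    <NOFRAMES>
    <BODY>This site uses frames, so a frame-capable browser is needed to view it.</BODY>
    </NOFRAMES>
    </FRAMESET>
    </HTML>

    Links in index.html of the form <A HREF="page2.html" TARGET="main"> then load new documents into the main area while the index stays put.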

    A multi-purpose form for your pages.

    What are forms?

    Forms enable you to collect information from people viewing your Web pages in a structured way and have it automatically mailed back to you.

    How do I set one up?

    1. Include the HTML code below in your page.
    2. Replace index in the ACTION section with the subject line you wish to see on the emails.
    3. Replace johnc@ukonline.co.uk with your email address to send the email to yourself (or all your data will go to John!).
    4. Replace Manual with something to tell you where the email came from.
    5. Put up to ten fields onto your form. The information collected from them is mailed to the address you gave above. There are a number of possible types of input field; the most commonly used are shown below:
      <INPUT TYPE = "text" NAME = "field0"> - used for basic text fields.
      <INPUT TYPE = "hidden" NAME = "to" VALUE = "johnc@ukonline.co.uk"> - used for hidden fields that pass data between pages.
      <INPUT TYPE = "password" NAME = "mypassword"> - used for password fields.
      <INPUT TYPE = "Submit" NAME = "Send"> - a button to send/submit the form.
      <INPUT TYPE = "Reset" NAME = "Reset"> - a button to clear mistakes.
      Each field must have a unique name - call the first one field0, the second one field1, and so on, up to field9 if you want that many fields.
    6. Replace the text 'Heading for Field' with whatever you want to call each field. This is
      what will appear on the page next to the field.
    7. Replace the word "send" in the NAME section of the Submit line if you want the button that sends you the information to carry different text.
    8. Replace the word "reset" in the NAME section of the Reset line if you want the button that clears all the fields to carry different text.

    What appears on my page?

    Have a look at the example HTML below. If you view the source of a page that uses it, it's fairly easy to work out what's going on.

    Example HTML:

    <FORM ACTION="/public-cgi/user_form/index" METHOD="POST">
    <INPUT TYPE = "hidden" NAME = "to" VALUE = "johnc@ukonline.co.uk">
    <INPUT TYPE = "hidden" NAME = "from" VALUE = "Manual">
    Heading for field:
    <INPUT TYPE ="text" NAME="field0" SIZE ="100"> <BR>
    <INPUT TYPE = "submit" VALUE = "Submit" NAME = "Send">
    <INPUT TYPE = "reset" VALUE = "Reset" NAME = "Reset">
    </FORM>

    An autoresponder for your pages

    Add a gadget that emails information to your visitors.

    What is it?

    The auto-responder allows you to get a person's email address and then email them a file from your web space.

    How do I set it up?

    1. Include the HTML code below in your page.
    2. In line 2, replace john/example.txt with username/filename. This is the file that will be emailed to them. (Remember to include a signature in the file - it won't be added automatically.) If your file's not in the top level directory, you'll have to give the full pathname.
    3. In line 3, replace exampledoc with the subject you want to appear on the email.
    4. In line 4, replace john with the sender you want to appear on the email.
    5. You can, of course, change the text on line 5 to read anything you like.

    That's it!

    What appears on my page?

    The text "Please enter your email address:", followed by a box for the address and a button to submit it.

    Example HTML:

    <FORM ACTION = "/public-cgi/autorespond" METHOD = "post">
    <INPUT TYPE = "hidden" NAME = "send" VALUE = "john/example.txt">
    <INPUT TYPE = "hidden" NAME = "subject" VALUE = "exampledoc">
    <INPUT TYPE = "hidden" NAME = "from" VALUE = "john">
    Please enter your email address:
    <INPUT TYPE = "text" NAME = "to"> <BR>
    <INPUT TYPE = "submit" NAME = "Send me the file">
    </FORM>

    A guestbook for your pages

    What's a guestbook?

    This form allows users to make comments that are then displayed on your Web page for all to see.

    How do I set it up?

    1. Copy the example HTML below into your page
    2. Change the text in line 1 to read whatever you like
    3. In line 4, change "john/guestbook.html" to the full name of the page where you want the results to appear - normally the same page you're putting the form on.
    4. Do the same for line 5 (yes, I know it seems redundant, but believe me, it makes it much faster and more secure!).

    That's it!

    Guestbook entries are added after the <!--NEXT ENTRY--> line in your page source.

    Example HTML

    <b>Use this form to add your entry to my guestbook:</b> <br>
    <FORM METHOD="post" ACTION="http://web.ukonline.co.uk/public-cgi/guestbook">
    <INPUT TYPE="hidden" NAME="URL"
    VALUE="http://web.ukonline.co.uk/Members/john/guestbook.html">
    <INPUT TYPE="hidden" NAME="page" VALUE="john/guestbook.html">
    Name: <INPUT TYPE="text" NAME="name" VALUE="" SIZE=20>
    Email: <INPUT TYPE="text" NAME="email" VALUE="" SIZE=20>
    Comment: <TEXTAREA NAME="comment" rows=5 cols=60> </TEXTAREA>
    Where are you? <INPUT TYPE="text" NAME="location" VALUE="" SIZE=20>
    <INPUT TYPE="submit" value="Okay">
    </FORM>
    <!--NEXT ENTRY-->

    Stylesheets/CSS1

    Style sheets are a way of complementing the structural information provided by HTML, offering a concise means of imparting design characteristics. There are currently two main style sheet approaches, DSSSL and CSS1. The former stands for Document Style Semantics and Specification Language, and is an ISO standard.

    Although DSSSL has various virtues, it is CSS1 - Cascading Style Sheets Level 1 - that is likely to have the biggest impact on the Internet. This is largely because it is backed by the official World Wide Web Consortium and all the major players in the Internet market.

    The style information contained in a style sheet can either be applied to an entire document, or to just parts of it. In the former case it can be stored as an external file, perhaps shared by several Web pages; this is done using the <LINK> tag, which points to the external style sheet.

    Alternatively it can be embedded within the HTML document itself. In this case, each mini-style sheet - and there may be several within a document - uses the <STYLE> tag. Style properties can be applied to blocks - for example paragraphs and lists - or to individual aspects such as bold, italic or emphasised text.

    Elements that can be controlled include font size, font weight, font family, letter spacing and word spacing. As the name implies, Cascading Style Sheets can be applied in layers, so that various style properties are overlaid on top of others, replacing them wholly or in part.
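
    As a brief sketch (the file name and property values are illustrative), a style sheet can either be pulled in from an external file with a line such as the following in the document's <HEAD>:

    <LINK REL="stylesheet" TYPE="text/css" HREF="house-style.css">

    or embedded in the document itself:

    <STYLE TYPE="text/css">
    BODY { font-family: serif }
    H1 { font-size: 18pt; letter-spacing: 0.2em }
    P.note { font-style: italic; word-spacing: 0.5em }
    </STYLE>

    A paragraph written as <P CLASS="note"> then picks up the italic, widely spaced style, while ordinary text inherits the settings given for BODY.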

    HTML Programs

    It is one of the World Wide Web's many interesting features that even the most complex of Web pages can be created on the humblest of desktop PCs. In fact all you need is a simple text editor like Windows Notepad to produce the HyperText Markup Language (HTML) documents that underpin the WWW. Notepad may do, but this has not stopped people from seeking to devise more powerful HTML editors. The overwhelming majority of these have been written for Microsoft Windows, not surprisingly given its dominance of the PC software sector.

    Three quite distinct approaches are used: word processor templates, standalone programs and extensions to browsers. The first, building on the fact that HTML files are essentially text, employs pre-existing word processors (mostly Microsoft Word, but also WordPerfect) to provide the main editing functionality. The word processors are customised through document templates that provide both the basic HTML document structure and also modify the on-screen appearance (usually through the addition of button bars and drop-down menus) to provide easy access to the HTML elements (such as lists, images and links to a Web page).

    Several such templates have been produced, including CU_HTML (available from ftp://emwac.ed.ac.uk/mirrors/indiana/winword/cu_html.zip), GT_HTML (http://www.gatech.edu/word_html/release.htm) and HTML Author (http://www.salford.ac.uk/docs/depts/iti/staff/gsc/htmlauth/summary.html), all of which are free. CU_HTML has the advantage that it will work with both Word 2 and Word 6 (the other two will only work with Word 6), while HTML Author offers more features (including the ability to create simple tables) and a better interface.

    Microsoft, no less, has also taken this route with its Internet Assistant, available from ftp://ftp.microsoft.com/deskapps/word/winword-public/ia/wordia.exe - note that this is a big file, over 1Mb for the 16-bit version (Word 6 for Windows 3.1x) and over 2Mb for Word 7. What is particularly remarkable about this program is not just that it is free, but that included with the basic template is a fully-functional Web browser. Moreover, this browser can be used as a standalone program: just run the file iwia.exe found in the Internet subdirectory of Winword on your PC and you will be able to access Web, Gopher and FTP sites.

    In the face of the competition from these free programs it was a somewhat foolhardy decision on Quarterdeck's part to charge £99 for its WebAuthor template, a product which really does little to justify its price-tag (see http://www.qdeck.com/beta/WebAuthor-highlights.html for more information). Perhaps a better move would have been to adopt the alternative approach of a standalone HTML editor which allows more advanced features to be added.

    Once again, there are a number of good freeware programs in this area, including HTML Writer (at http://lal.cs.byu.edu/people/nosack/get_copy.html) and HTML Assistant (from ftp://src.doc.ic.ac.uk/computing/systems/ibmpc/windows3/misc/htmlasst.zip, available as a supported commercial program too). The HoTMetal program also comes in free and supported versions: see the URL http://www.sq.com/ for more details.

    More advanced is WebEdit, which can be obtained from its home page at http://wwwnt.thegroup.net/webedit/webedit.htm. As well as its very clear design, WebEdit is notable for supporting advanced HTML options like tables and the extensions introduced by Netscape. WebEdit is shareware, and costs $99.95 to register.

    Another product which is a step beyond the simpler HTML editors is Live Markup. Its innovative interface goes much of the way to offering a fully WYSIWYG editing environment for HTML - no mean achievement given that this is almost a contradiction in terms: HTML does not define the exact appearance of a Web page, only its underlying structure which is then realised by the Web browser that views it. Although slow in its current version, the $99 Live Markup is worth examining as an indication of how HTML editors are likely to develop; it can be downloaded for evaluation from http://www.mediatec.com/mediatech/download.html.

    Netscape Gold has included a basic editor since version 1.2, but not all features of HTML 3 are covered and it is a little pedestrian on longer documents.

    Help on HTML

    As you would expect, help on HTML is available at several sites. An excellent list of guides to HTML can be found at http://union.ncsa.uiuc.edu/HyperNews/get/www/html/guides.html. One particularly good guide is at http://info.med.yale.edu/caim/StyleManual_Top.HTML, while another, written by the inventor of the World Wide Web, the Briton Tim Berners-Lee, is at http://www.w3.org/hypertext/WWW/Provider/Style/Overview.html.

    Extensions to HTML introduced by Netscape (see http://home.netscape.com/home/services_docs/html-extensions.html for a full definition of these) are open to abuse. Netscape has done more than any other product to turn the Web into some kind of hallucinogenic online theme-park full of the gaudiest and most bizarre designs imaginable.

    Anyone who doubts this should visit http://www.europa.com/~yyz/netbin/netscape_hos.html. Justly called the Netscape Hall of Shame, this presents the very worst offenders against good taste and public morals who, through their reckless use of Netscape HTML extensions, show how not to put together a World Wide Web site. Look, loathe and learn.

    Starting to get graphical with Web pages

    One of the most striking aspects of the World Wide Web is its graphical nature. Indeed, although the World Wide Web attracted much interest after it was devised by Tim Berners-Lee in 1989, its current steep growth curve can be dated from the appearance in 1993 of the graphical browser Mosaic.

    Given the dramatic effect inline graphics have had on the WWW, and how important an element of Web pages they can be, it is surprising how easy they are to insert. There is one basic command, which takes the form:
    <IMG SRC="one.jpg">
    This places the file one.jpg at the point in the HTML document where the tag is found. This assumes that the graphic is in the main Web root directory. A graphics file elsewhere can be used simply by giving its full path (or URL if it is located on another machine on the Internet).

    There are several refinements that can be added. For example, with this default form, the bottom of the image is aligned with surrounding text. However, you may well want the image to be placed differently. This is done by specifying the alignment explicitly using ALIGN within the tag:
    <IMG SRC="one.jpg" ALIGN=MIDDLE>

    Another variation is to change the size of the image as displayed within the page. This is achieved by specifying the number of vertical and horizontal pixels it should occupy on the screen:
    <IMG SRC="one.jpg" ALIGN=MIDDLE HEIGHT=50 WIDTH=50>

    If the ratio of the HEIGHT and WIDTH elements differs from that of the image itself, the latter is stretched in the relevant direction. Since the size of the graphics file does not change when it is displayed in scaled-down form (it is the browser that does the scaling), it is seldom worth using this scaling for anything but very slight changes. As the user will pay (in download time) for the graphic whatever size it is displayed at, it is better to fit the graphic to the space beforehand if possible (or to use HEIGHT and WIDTH only for scaling up rather than down).

    There is a final subtlety involving graphics images that is generally accepted as good practice. If you add an element of the form ALT="a .jpg image" to the tag, the words "a .jpg image" will appear in the graphic's stead when it is not possible to display the image itself. Obvious applications of this are when visitors to a site are using text-based browsers such as Lynx, or when visually impaired users employ devices that convert text into sound, say, where graphical elements without such text would make the pages unusable.

    Moreover, there is another important class of users who will benefit from the use of ALT. Many people prefer to navigate around the Web with inline graphics turned off. This allows them to move through pages very rapidly without waiting for the sometimes large graphics files to be downloaded. The more types of users you cater for with your Web page, the more popular and successful it is likely to be.

    Putting all these elements together in the standard HTML containers gives a minimal Web page of the following form:

    <HTML>
    <TITLE> Graphics </TITLE>
    <BODY>
    <H1> Graphics in Web pages </H1>
    This is text <IMG SRC="one.jpg"> at the bottom.
    This changes the image size <IMG SRC="one.jpg" ALIGN=MIDDLE HEIGHT=50 WIDTH=50>
    This copes with text-only browsers by displaying the words <IMG SRC="one.jpg" ALIGN=TOP HEIGHT=50 WIDTH=150 ALT="a .jpg image">
    </BODY>
    </HTML>

    HTTP 1.1

    The hypertext transport protocol (HTTP) is so fundamental to the World Wide Web - it mediates every connection between Web client and server - that it is all but invisible apart from its presence in URLs. And yet ironically this crucial element of the Internet is responsible for much of the current slowdown in response.

    Of course, this is hardly the fault of Tim Berners-Lee, who drew up HTTP. It does what it was designed to do very efficiently: move packets across the network. But the pattern of use found today is so far removed from the original intention that HTTP is now beginning to show the strain. HTTP 1.1 is designed to address some of these problems.

    For example, every URL that is retrieved initiates a completely new request by the client to the server, even if it refers to the same Internet server. Setting up and tearing down these connections is wasteful, so the revised HTTP specification allows what are called persistent connections.

    Once a connection is made, data can be buffered up so that it is sent in the most efficient way possible, leading to faster throughput of information and faster Web page downloads.

    Another advance over the original HTTP concerns the way requests are handled. Under version 1.0, a client would wait for the response to each request before sending the next. With HTTP 1.1, it can send further requests down the same connection without waiting for the responses to earlier ones, a process called pipelining.

    Although apparently minor changes, these refinements should go some way to speeding up the Web and the Internet, but only once HTTP 1.1 is widely implemented in software.
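
    As a rough illustration (the host and file names here are invented), under HTTP 1.1 a browser might send several requests down one persistent connection without waiting for the individual responses, along these lines:

    GET /index.html HTTP/1.1
    Host: www.example.com

    GET /logo.gif HTTP/1.1
    Host: www.example.com

    GET /style.css HTTP/1.1
    Host: www.example.com
    Connection: close

    The server returns the three responses in the same order over the same connection, and the final Connection: close header asks it to shut the connection down once the last response has been sent.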

    HyperText Transfer Protocol

    Anyone who uses World Wide Web browsers such as Netscape or Mosaic soon becomes familiar with the four letters 'http'. All Web site addresses (URLs, or Uniform Resource Locators) begin with them, as in http://www.ukonline.co.uk/. These initial letters change according to the type of service you access.

    So, for example, if you want to connect to an FTP site with your Web browser, you would enter something like ftp://ftp.microsoft.com/.

    In this case 'ftp' stands for File Transfer Protocol, and refers to the set of rules (or protocols) used by the two computers in question - your client and the distant server you wish to access - to negotiate the admission to the FTP site in the first place and the subsequent transfer of files from there to your machine.

    In a precisely analogous way, 'http' stands for HyperText Transfer Protocol, and defines how your Web client program contacts the Web server whose address comes after it (the www.ukonline.co.uk part of http://www.ukonline.co.uk/), and how data is passed between the two.

    A common source of confusion in this context is the almost universal use of HTML - HyperText Mark-up Language - alongside HTTP. In fact they serve quite different functions.

    HTTP is about how contact is made and maintained between the two sites - how messages are sent, with no reference to what is sent. HTML, on the other hand, defines the structure of the hypertext documents that are retrieved - what the messages are - with no reference to how they are passed around.

    Hypervideo

    One of the key elements of the World Wide Web is the hypertext link. Although this idea was not new when the Web offered it, it was the Web that made it easy to implement and that demonstrated so convincingly how useful the ability to jump from document to document could be.

    The XLink application of XML extends the range of possibilities as far as hypertext links are concerned, but it still applies only to static objects in a document and does not attempt to embed hyperlinks in elements that exist in time. This may be remedied by IBM's Chinese research laboratory, which has devised a hypervideo system with the rather unfortunate name of Hot Video, and has written creation and viewing tools for it. These allow hot spots to be embedded within popular video formats such as AVI, MPG and MOV.

    Users can play Hot Video content from local discs, CD-ROMs and servers, as well as view content over the Internet using a special Hot Video plug-in for Netscape Navigator or Internet Explorer.

    HVML

    Even though many high-speed corporate connections are now being installed, the telephone is still a key element of the Internet's infrastructure. The Internet is also wedded to the telephone through Internet phones that allow conventional voice messages to be digitised and sent over global wiring for the cost of a local call.

    But one important facet of the Internet-telephone interface has so far remained untouched: accessing Web pages, say, using an ordinary phone without a computer.

    The idea is that the most important elements of hypertext markup language (HTML) pages are text and links; the former can clearly be converted into audio, while selecting hyperlinks is no different from using the already omnipresent touch-tone menus employed by many voicemail systems.

    All that is required is some standard way of converting information on HTML pages into output that can be channelled down the phone. A company called Stylus has extended work in the area of interactive voice response systems (such as touch-tone banking and fax-on-demand) to come up with the hypervoice markup language (HVML).

    These are extra tags added within an HTML document, transparent to most browsers, but enabling suitable software to take the content they refer to and make it available to users calling in with an ordinary phone, but no computer, to special servers.

    The HVML tags include ones for playing a pre-recorded file, prompting a caller for touch-tone input, speaking information from the HTML page, and following a hypertext link.

    ICQ

    One of the ironies of the Internet is that a medium with potentially tens of millions of simultaneous users has almost no sense of this huge online community. The two main forms of communication - E-mail and Usenet newsgroups - are not real-time, and Internet Relay Chat, which is real-time, is unknown in mainstream business circles.

    Various programs have allowed real-time conversations across the Internet, for example WS_Chat and Wintalk. Both of these allowed messages to be typed into a small window which would then appear on the recipient's screen. But before you could use such software, you needed to know both the recipient's Internet address and when they were online. These requirements meant that chat programs remained niche products with small user bases.

    Messenger
    AOL got round both problems with its AOL Instant Messenger service (see www.aol.com/aim/home.html), since user names could replace Internet addresses, and the central AOL servers could maintain lists of who was online at any given time. The disadvantage, of course, was that only members of AOL's online service could use this system.

    More recently, Netscape offered AOL's approach to anyone on the Internet using its browsers (http://home.netscape.com/newsref/pr/newsrelease511.html), but in the meantime an Israeli start-up company called Mirabilis has colonised this sector in the most dramatic manner conceivable. The firm claims that its ICQ program has more than 13 million users, with an astonishing 60,000 signing up every day. ICQ is available free for most platforms, including Java (see www.icq.com/productsdesc.html); there is also a groupware version for intranet use (see www.icq.com/groupware/). A cluttered home page (at www.icq.com/icqhomepage.html) gives some idea of the amazing culture that has grown up around this program.

    After you have downloaded the program and signed up to the service, you can create contact lists of friends and colleagues (sometimes called "buddy lists") with whom you wish to chat. The ICQ server monitors who is online when you are, and indicates this in your ICQ window. It is then a simple matter to initiate a chat session with one of your contacts, through further windows that appear (including one that shows the entire history of the conversation, useful for keeping copies). You can also send files, URLs, contact lists and E-mail. ICQ offers an E-mail service as well as free home pages.

    It is possible to search ICQ servers for particular users, or to investigate general groups of people with common interests. One powerful extension of ICQ is that it can be linked with Net telephony: that is, you can use ICQ to alert you when certain users are online, and then call them up using one of the many IP telephony products.

    Eminently usable
    As the level of users suggests, ICQ is something of a cult program. Despite this, and its rather gaudy user interface, it is eminently usable for business purposes.

    For example, with sales people on the road ICQ would be a valuable way for managers and colleagues to find out when they are logged on - and perhaps contact them cheaply using Net telephony, even if they are abroad. You can also create small closed groups of users, called Virtual Private Online Networks - see http://www.icq.com/create-network.html. ICQ will work behind corporate firewalls, including those using proxy systems, as the excellent resources for the subject at www.icq.com/firewall/firewallhelp.html explain.

    Alongside its vast user base, ICQ's most notable feature is that it provides Mirabilis with no revenue stream. The software is provided as a rather curious time-limited free beta, and there is currently no advertising. Quite why this company without profits or income, and with huge and constantly growing costs, was recently bought by no less than AOL - which already has its own buddy-list service - for $287m (£179m) is discussed below.

    Portal mania grips the Net

    The race is on as Web companies aim to bag the most users through portal sites.
    America Online's (AOL) decision to pay $287m (£179m) for the company Mirabilis (see press release at www.icq.com/press_release26.html), described above, was not some desperate attempt to take its rival out of the chat market. AOL was really buying the ICQ program's claimed 13 million users.

    Aside from their number, what made these so attractive was their geographical distribution: more than 60% are outside the US. With the purchase of Mirabilis, AOL was increasing its global Internet reach dramatically. That reach is important because the race to generate concentrations of users - through the creation of portal sites - is hotting up.

    Even though the costs of adding new content and services, and of buying complementary or rival sites, are enormous, all the major players have plunged into this cyber-maelstrom almost without worrying about the consequences.

    Valuations
    They are able to do this in part because the US stock market is placing truly incredible valuations on these key sites. Even though very young, with few tangible assets and never having turned in a significant profit, portals are now valued in billions of dollars.

    For example, according to the stock market (June 1998), Yahoo is worth about $6.85bn, Excite $1.94bn, Lycos $1.27bn, and Infoseek $950m. In fact portals are now so attractive to investors that Netscape, the original Internet company, is trying to redefine itself in part as a portal.

    This aping goes further. The new version of Netscape's Netcenter home page ( http://home.netscape.com/) looks almost identical to Yahoo ( www.yahoo.com/), Lycos ( www-english.lycos.com/), Snap ( www.snap.com/), and AOL ( www.aol.com/).

    Microsoft, too, is trying to join the portal club with its new Start service, although the beta version (at http://home.microsoft.com/) doesn't adopt the Yahoo approach.

    This portal frenzy, and the accompanying stock market madness, are driven by two important trends: one happening now, and one that most people involved with the Internet business believe will occur at some point in the future.

    The first trend is the move by old media - press, TV, cable, and film - to get into new media as quickly as possible. There seems to have been a sudden realisation that the Net is not just a fad, but will revolutionise all aspects of every medium.

    Established media companies are only too conscious that they are being wrong-footed by smaller, swifter competitors, and are terrified that they have left things too late. As a result, there is a rush to create major alliances with the main online players - the portals.

    Perhaps the most dramatic meeting of old and new media was the stake Disney took in Infoseek (see http://info.infoseek.com/doc/PressReleases/magic/press_release.html).

    Fledgling
    Other deals have already taken place - for example, NBC with the fledgling portal Snap (see www.snap.com/main/help/item/0,11,-8032,00.html) - and many more will doubtless follow.

    The other, future trend is in many ways the justification both for the crazy valuations placed on these portals, and for the rash of morganatic media marriages. Most people believe E-commerce will be huge, and that portals will be one of the main channels for such purchases.

    Certainly the pace at which E-commerce initiatives are coming through is encouraging, and the technical issues such as security have largely been addressed.

    But it is hard not to see current stock market valuations as excessive, even against this positive background. In many cases, flotations of Net firms look to be highly opportunistic, riding a wave of investor enthusiasm that will surely evaporate as losses mount and huge returns prove slow in arriving.

    The danger is that there will then follow a backlash, with massive drops in share prices, a knock-on effect on the stock exchanges, and a general disenchantment with all things Internet.

    Infoseek

    As portals have moved to the heart of business Internet activity, it has mainly been the big names - Yahoo, Excite, Netcenter etc - that have made the news.

    One company notable by its absence from all this frenzy is Infoseek which, on the basis of its visitor levels and turnover, must be considered something of a second-division player.

    But Infoseek has been up to some interesting things, as a visit to the old URL soon shows. This takes you to the slightly modified address - and a very different site from previous versions.

    This new portal with the simple and memorable name of Go is a highly ambitious reworking of the increasingly marginalised Infoseek search engine and is backed by Disney.

    Now, alongside the basic and advanced search capabilities (the latter with many powerful search options), top news stories, the weather and several other typical portal features, there are the by-now familiar channels (called centres here).

    Among the 18 channels available, there are ones devoted to business, computers, the Internet and news.

    Go also offers several novelties. Each channel/centre adopts a kind of subsidiary portal design, and sports three (virtual) tabbed pages, one of which is visible. The other two provide parallel information, specifically Web sites that are relevant to the sub-portal area currently viewed, and related community chat areas.

    But the real significance of Go lies not in its interesting and generally excellent design (the underlying HTML code is well worth studying as an example of how to create complex but clear Web pages using tables). Rather, it is the business issues surrounding the appearance of Go that are key.

    Go is a joint venture of Infoseek with Disney. Disney currently owns 43% of Infoseek, and has options to acquire a majority stake.

    As part of the deal, Disney handed over to Infoseek another Internet company, Starwave, which it had bought in 1998. Starwave is one of the many companies set up by Paul Allen, the other Microsoft founder, in 1993, and it has created several sites in conjunction with more traditional media companies.

    Among the most important of these is the ABCNews site, one of the big news offerings on the Web, and widely used by other sites as the featured news service. Netscape's Netcenter is one example, which means that there is now a growing relationship between AOL - Netscape's imminent owner - and Disney, through its stake in Infoseek.

    The Go portal represents the first in a major new wave of consolidation in the Internet industry. It is particularly important because it is a classic marriage of old media (with plenty of assets, both financial and intellectual) and new media (with plenty of Net experience).

    Both sides desire each other desperately: old media needs new media so as not to get left behind in the stampede to the Web, and is itself needed because its financial health is rather better grounded than that of the rich but fragile Net upstarts.

    Further proof that a major re-structuring is under way is provided by the acquisition of Excite by @Home.

    The latter is a relatively new company providing fast Internet access via cable. It is partly owned by TCI, which in turn is being acquired by AT&T - another old-time behemoth very keen to re-invent itself in the image of the Internet.

    The interesting question is now how the intertwined relationships among AOL, Amazon, Yahoo, Lycos, Microsoft, IBM, News Corporation, Bertelsmann and all the other new and old media giants (plus a few telecoms ones) will evolve into distinct, competing blocs over the next 12 months.

    One thing is certain: it's going to be a busy year for the portals, which will undoubtedly form the focus of these new forces.

    Inline images

    The original World Wide Web browsers were purely text-based, and offered a basic means for following hypertext links (one such browser, Lynx, is still encountered today). Perhaps the crucial advance beyond those path-breaking early versions was the addition of graphical elements.

    There are actually two quite distinct types of images found in such multimedia-enabled programs. The first are those images that are displayed as separate entities; more interesting - not least from a design viewpoint - are graphical files that are integrated into the page, so-called inline images.

    Although these are of necessity sent separately from the ASCII HTML file that defines the overall structure of the Web page, they seem to be embedded within it (though exactly how they will appear on the screen depends critically on the capabilities of the browser, as is the case with nearly all attributes of a World Wide Web creation).

    Browsers like Netscape or Mosaic that are able to cope with such inline graphics process them first (since they are almost always sent as a compressed .gif or .jpg file) before displaying them at the appropriate point in the on-screen representation of the HTML document.

    Other graphical formats (for example .bmp or .tif) require the specification of a helper application that can similarly process and display the file. However, in this case the image is placed in a separate window, and thus is not integrated into the fabric of the Web page. One of the latest developments in the Web arena has been the extension of the inline idea to other, more complex file formats such as VRML.

    Intermercials

    Web advertising remains the main way of generating money on the Internet. Given this importance, it is not surprising that advertising agencies are starting to create variations on this basic theme. The simplest approach is to place advertising messages at strategic points in a document, usually the top and sometimes the bottom. These banner ads normally take the form of rectangular areas with static messages and links to advertisers' sites or short animations.

    But visitors to Web sites can pass quickly over these formats. Indeed, the commoner such banner ads become, the more reflexive is the user's act of scrolling down a page to reach the editorial. As a result, a different approach is being developed. Rather than placing advertisements within the space of a Web document being viewed - in the page - the idea is to place them in the dimension of time.

    That is, as the visitor passes through a Web site, advertisements interpose themselves between the editorial pages. The viewer has to wait the pre-defined number of seconds before the ad gives way to the next editorial page. Because these advertisements take place in the self-created gaps of the viewing experience - the interstices - they are generally known as interstitials, or even Intermercials (an abbreviation of interstitial commercials). Clearly this forced viewing is good news for advertisers. But equally it goes against the whole dynamic of the Web, and if abused may prove counter-productive.

    Internet Connections

    The simplest Internet connection is that provided by E-mail. On its own, E-mail is restrictive. To be "on the net" you need a true Internet address. These are expressed numerically as four number sets, such as 199.100.128.64; the equivalent, character-based form is the kind that appears in an E-mail address such as jane.smith@abc.industries.co.uk, where the part after the @ corresponds to such a numerical address.
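
    As a rough illustration of how the character-based form maps onto the numerical one, the short Java sketch below resolves a host name into its numeric address; the host name shown is simply an example, and any registered name would do.

        import java.net.InetAddress;

        public class Lookup {
            public static void main(String[] args) throws Exception {
                // Ask the Internet's naming system for the numerical address
                // that lies behind a character-based name.
                InetAddress address = InetAddress.getByName("www.internic.net");
                System.out.println(address.getHostName() + " = " + address.getHostAddress());
            }
        }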

    You can also use other Internet facilities through dial-up online services, such as FTP (file transfer protocol), gopher, archie, telnet and the World Wide Web.

    Many companies set up a central server to act as a gateway. To use the Internet in this way the company requires some kind of full Internet connection, and the system administrator has to set up accounts for end-users within the organisation.

    Full control - and your own Internet address with its numerical equivalent - is available when you have access to some kind of transmission control protocol/Internet protocol (TCP/IP) link, which can be provided over a company network, or over a dial-up line using a modem. TCP/IP underpins the whole Internet and is basically a set of standards that determine how information is transmitted and received. It is also the name of the implementations of those standards.

    If you have a TCP/IP link to your desktop computer, you can then be truly "on" the Internet, have full access to all of its facilities, and can run advanced Internet programs that understand the TCP/IP protocols on your machine.

    When you obtain a full Internet connection, you will be allocated one or more numbers that make up one form of your Internet address - something along the lines of 129.34.139.4 (this is one of IBM's many addresses). A friendlier form, along the lines of bloggs.co.uk, is also provided. These names are registered centrally to catch potential abuses, such as intentionally registering someone else's name. It is possible for UK businesses to register names in the US (to give an address such as bloggs.com). This means that although the UK committee will probably stop your competitors from filching your company name in the UK, there may be nothing to stop them doing so in the US, where the central registry will probably never have heard of either company. The thorny subject of trademarks and Internet names remains unresolved (see http://www.fenwick.com/newpub/sma-trade.html for a useful summary of the situation in the US).

    At the main InterNIC name registry, free registration has been replaced by a $100 charge for two years (see http://rs0.internic.net/announcements/fee-policy.html). In the UK the former gentleman's club has been replaced by a non-profit-making organisation called Nominet, which now registers the .uk domain. The charge is £100 for two years' registration. See http://www.nic.uk/press.html for details. A new company called NomiNation is offering a two-year registration using the uk.com sub-domain (instead of .co.uk, .plc.uk or .ltd.uk) for £45. Details at http://www.nomination.uk.com/. It may be possible in the future to create new top-level domains: IBM, for example, could use .ibm instead of .com, and individual brands could also be registered. To join the debate and mailing list, send the message subscribe to the address newdom-request@iiia.org. Previous postings are held as a series of linked Web pages at http://www.iiia.org/lists/newdom/. Other useful documents on this topic are:-

    http://www.iiia.org/itld/rfcs.html which contains all the RFCs dealing with Internet names

    http://www.iiia.org/draft/draft-postel-ianaitld-admin-01.txt which presents the overall framework for extensions to the current registries.

    http://www.iiia.org/draft/draft-higgs-tld-cat-01.txt offers some proposals on what form the new domains may take, based upon the International Trademark Schedule of Goods. Examples include .rope, .clothes, .handtool, .meat, and .cyborg: the last for manufacturers of surgical and medical equipment, apparently.

    Registration in the UK is about £100 and can be done through the main Internet providers:- Demon: sales@demon.net, EUnet GB: sales@britain.eu.net and Pipex: sales@pipex.com.

    Internet Message Access Protocol

    In the section POP3 v. SMTP the distinction between Simple Mail Transfer Protocol (SMTP) and Post Office Protocol 3 (POP3) is discussed. The former is used for the sending of E-mail messages across the Internet, which pass from system to system until they arrive at their end-destinations, while the latter is designed to allow users to retrieve E-mail from a POP3 server, which acts as a kind of poste restante.

    POP3 therefore allows you to download your E-mail at any time, and from anywhere, whereas SMTP can only send to a given address, and delivers as soon as it can.

    Although POP3 is a convenient adjunct to SMTP, it is very limited. All you can really do is download a message.

    More complex manipulation of the E-mail held at the electronic post office is not possible. For this reason the Internet Message Access Protocol (IMAP) was developed. It builds on the ideas of POP3, but offers a far wider range of features. These include the ability to create, delete, and rename mailboxes held on the IMAP server; check for new messages; permanently remove old ones; search through the messages; and fetch message attributes, texts or portions of texts.

    Messages are accessed through IMAP by the use of numbers. These are either message sequence numbers (where each message's relative position to the first message is used) or other, unique identifiers that have been assigned.

    The basic document for IMAP is RFC1730.
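
    As an illustration of the difference in approach, the sketch below uses the JavaMail API - one way, though not the only way, of reaching an IMAP server from Java - to open a mailbox that stays on the server and inspect messages in place. The server name, user and password are placeholders, not a real account.

        import java.util.Properties;
        import javax.mail.Folder;
        import javax.mail.Message;
        import javax.mail.Session;
        import javax.mail.Store;

        public class ImapCheck {
            public static void main(String[] args) throws Exception {
                // Connect to a hypothetical IMAP server; the host, user and
                // password below are illustrative placeholders.
                Session session = Session.getDefaultInstance(new Properties(), null);
                Store store = session.getStore("imap");
                store.connect("mail.example.com", "jane.smith", "secret");

                // Open the server-side INBOX mailbox read-only and check for messages.
                Folder inbox = store.getFolder("INBOX");
                inbox.open(Folder.READ_ONLY);
                System.out.println("Messages waiting: " + inbox.getMessageCount());

                // Messages are addressed by sequence number, starting at 1.
                Message first = inbox.getMessage(1);
                System.out.println("First subject: " + first.getSubject());

                inbox.close(false);
                store.close();
            }
        }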

    Internet Relay Chat

    Internet Relay Chat, or IRC, is one of those areas of the Internet that provoke strong feelings. For some, it represents the point at which the millions of Internet users can come together in the most unmediated way; for others, it is the biggest electronic waste of time ever invented, as proved by the thousands of helpless IRCers who spend most of their waking hours inhabiting this twilight world.

    The idea (first thought up in Finland) is simple: to use the Internet as a means of communication, not just from one person to another, but as a way of passing typed messages in real time among a group of people. This is achieved using the standard client-server architecture: an IRC server relays the messages sent by IRC clients to the other participants, who may be anywhere on the Internet and connected via other IRC servers (which exchange the messages among themselves before passing them on to their users).

    Obviously it would not be feasible to pass all messages to all users: instead, IRC is divided up into channels. Each channel has a nominal topic, and is controlled by the person who creates it (new channels are created very easily). Typically there will be a few dozen people joined to a channel, with several thousand simultaneous channels.

    Just to complicate matters, there are not one but two IRC systems, both working in exactly the same way, but consisting of separate groups of IRC servers. The main one is called EFNet, and is larger but renowned for its political divisions and consequent temporary fragmentation; the other is called the Undernet and at least theoretically offers a slightly stricter operational framework.

    Internet Users

    For a market as important as the Internet, remarkably little is known about its users. While it is true that there is an increasing number of surveys and market research reports, it is still quite hard to draw all these together to form an overall picture of the online world. This makes the CyberAtlas site all the more valuable. It has been put together by I/Pro, a company specialising in the field of Web measurement. Aside from the odd discreet plug for the company's own services, the site is pleasantly free of advertising.

    Its basic navigation metaphor is extremely simple, using points of the compass on the image of the opening page (though there seems to be no route in for text-only browsers - a rather basic blunder to make). The site itself is very shallow structurally, so there is no danger of getting lost in ever-deeper levels.

    Most of the links are self-explanatory. So News leads to the latest news in the field of Web measurement, while Market Size pulls together figures from many sources to provide a fascinating overall view of this area. Other obvious links are Demographics, Geographics and Usage Patterns.

    There are market research figures for browsers, modems, and servers. Other ways of looking at the online world are in terms of advertising, electronic commerce, intranet activity, and overall site building costs. One of the best links is to other related resources. It is also worth emphasising that all of the above-mentioned pages have links through to the original sources for their figures, allowing further research to be carried out at the original site.

    ISDN

    As most people know, to connect a computer to an ordinary telephone line requires a modem. This stands for MOdulator/DEModulator, and refers to the process of converting the computer's digital 1s and 0s into audio tones (the modulation) that can be sent across the telephone network and then decoded (demodulated) at the other end. Clearly this is an inefficient process, and ideally you would like to be able to send digital signals directly across a phone connection.

    This is precisely what the ISDN (Integrated Services Digital Network) system offers. ISDN allows you to send digital data across special lines to other computers without the need for conversion. Aside from obviating this need at both ends, another benefit is speed: ISDN lines usually run at 64Kbps, more than twice the raw speed of a V.34 modem.

    Moreover, ISDN lines typically come in pairs that can be combined to double the throughput to 128Kbps, although most service providers are not yet able to offer this service.

    ISDN is not a new technology: it has hovered on the sidelines of the computing world for many years. It was originally developed with a view to allowing a wide range of high-throughput services - hence its name. However, it is increasingly attractive as a means of offering fast Internet access.

    More and more suppliers in the US are now starting ISDN services, with those in the UK following suit.

    The biggest barrier to the take-up of ISDN remains BT's high cost of installing the lines. In this respect, the UK is well behind other European countries such as Germany, where installation costs have fallen, leading to a sharp rise in the number of ISDN lines installed. Until recently, BT's charge for installing an ISDN line with 2B+D channels was £400, while annual rental was £336. This was far higher than in countries like France and Germany (both around £80 for installation, and lower than BT for rental), and has doubtless been the main reason for the slow take-up of ISDN in this country compared to those countries.

    BT re-organised the price structure in September 1996. Although this reduced the first-year cost by over £100, the customer has to pay in advance, as the new price includes a number of calls and standing charges in the start-up cost (ISDN calls are charged at the same rate as analogue calls, as they are in most European countries). It is hard to escape the conclusion that BT does not want to push this service, since it makes the pricing information very difficult to work out.

    Then Oftel intervened. There was some hope that it would force BT to offer real reductions rather than the window-dressing of the new package. But instead, the revised charges that emerged from consultations between BT and Oftel actually increased the overall cost for the frequent user - the only real novelty of BT's pricing, and the one most likely to appeal to Internet users, who typically spend more time logged on than they do on voice calls.

    As of October 1996 the prices for ISDN-2 (two lines) are as follows:-

    Connection charge: £199

    Quarterly charge (for the first 24 months): £132.75, which includes £105 worth of calls

    Annual standing charge (after 24 months): £230

    Local, national and international calls are charged at the same rate as standard lines. International data connections are priced differently. For example, per minute: Hong Kong 86.5p (day-time), 69.3p (evenings), 63.8p (week-end); USA 25.2p, 23.9p and 22.2p.

    In terms of performance this compares favourably with leased-line rates, which start at 64Kbps. However, a leased line offers permanent connectivity, whereas ISDN (like POTS) is on-demand. The dream of all Internet users must be to possess either a T1 or T3 line. Sometimes the letter E is used in Europe instead of T, with slightly different values: E1 is 2Mb/s and E3 34Mb/s, while T1 offers 1.5Mb/s and T3 45Mb/s. Understandably, E3/T3 connections are rare at the moment, although there will probably come a time when everyone has one.

    If BT is the continuing bad news about real-life ISDN for business Internet users, fortunately just about everything else is good news. For example, Demon Internet offers ISDN connectivity to the Internet for the same basic £10 a month it charges for its ordinary national dial-up service - very good value compared to much previous (and some current) ISDN pricing, which ran to thousands of pounds a year. And Demon Internet is not even the cheapest: Global Internet offers nationwide ISDN access to the Internet for just £7.50 a month. Moreover, even consumer-oriented services like UK Online offer country-wide ISDN, which shows how general ISDN is becoming.

    On the software front the biggest breakthrough, in Europe at least, has been the arrival of the German-led cross-platform standard Common ISDN API - CAPI - now at version 2.0. Basically it offers a standard software interface that lets software control ISDN hardware. This allows generic comms programs such as Procomm Plus to support any compliant ISDN hardware and drivers. An older standard sometimes still encountered is WinISDN.

    As far as hardware is concerned, there are three main ways of connecting to the Internet using an ISDN connection: external devices - usually known as terminal adapters, or TAs - hooked up to a serial or parallel port; routers; and internal cards, the cheapest route.

    Two years ago, ISDN cards for PCs cost around £1000; today there are several for just a few hundred pounds. Perhaps not surprisingly, it is a German manufacturer, Teles, that offers about the cheapest product; its plug and play card supports all the main European standards, comes with a wide range of useful software, and costs just £169. I was able to connect at 64Kbit/s to the three ISPs mentioned above without problems with this card. Another European company, Chase Research, has launched the new NetChaser TA to complement its ISDN-PC internal card.

    Worth mentioning is Microsoft's growing support for ISDN which is likely to push along these emerging standards even faster. It has devoted a whole section of its Web site to this area. As well as downloading ISDN hardware drivers you can even order ISDN connections online - at least you can if you are in North America or France (with Germany to be added soon).

    Why ISDN is (finally) perfect for business

    For many years, the Integrated Services Digital Network (ISDN) telephone technology has been a solution in search of a problem. There are increasing signs that the Internet may well be that problem, and that ISDN can offer the perfect fast and low-cost dial-up solution for companies.

    When ISDN first appeared over a decade ago, it was an exotic beast; it was based on the then strange idea that all information should be transmitted digitally over the telephone network, rather than as smoothly varying analogue signals. Today, when most of the telephone infrastructure is built around digital technology, it is the analogue last leg to your desktop that is the anomaly.

    Indeed, we find ourselves in the crazy situation where digital data from a computer is converted into analogue signals (using a modem) to be transmitted to the local exchange where it is then converted into a digital format for transmission over the telephone network. ISDN simply cuts out the intervening analogue step - with obvious benefits in efficiency and throughput. Moreover, no new wiring is required.

    The basic kind of ISDN, generally known as Basic Rate Interface, or BRI, offers three distinct channels in this datastream. These are conventionally called 2B+D, and refer to one channel (D) used for signalling, and two bearer channels (2B) which carry the data. The D channel has a capacity of 16 Kbit/s, while each data channel offers 64 Kbit/s. That is, each B channel has over double the throughput of a standard 28.8 Kbit/s V.34 modem. Because the signal is purely digital, there are almost no line noise problems typically found with high-speed analogue modems.

    As well as this far greater throughput, the distinguishing characteristic of ISDN is the presence of the D channel. This acts as a control channel, and coupled with the intelligence residing in a computer allows many advanced services to be offered. Perhaps most importantly, the D channel allows ISDN connections to be made almost instantly. Those connecting to ISPs via ordinary dial-up connections will be only too familiar with the long period of negotiation between modems at each end. With ISDN this generally takes less than a second.

    As a consequence, ISDN is perfect for obtaining quick and frequent access to the Internet. The fast connection and disconnection times (known as set-up and tear-down in the ISDN community) mean that it appears as if you have a permanent link, but you pay only for the time that you are connected - the best of both worlds.

    Other tricks that are possible using the D channel and PC intelligence include the combination of the two B channels to give you an effective throughput of 128 Kbit/s; the possibility of making and receiving calls on one B channel while the other is already in use; and the ability to use a series of numbers with a single ISDN line, allocating each to a different preset function (multiple telephone handsets, data, fax etc.).

    Given these and other advantages, ISDN represents a natural progression from ordinary analogue telephone connections. However, until recently, the obstacles to accessing the Internet in this way were so great as to render it impractical. The first problem was that there were many different and largely incompatible implementations of ISDN. By some miracle, Europe has managed to act in concert and devise a standard form of ISDN, known as Euro-ISDN.

    Ironically the US is much further behind in this respect, and ISDN services there have a dizzying array of standards and options. For a full explanation of this and all other aspects of ISDN see the excellent book Using ISDN (£36.99, ISBN 0-7897-0843). There is also an extremely comprehensive set of links relating to ISDN.

    The second problem was the absence of support for ISDN within general software, and there were few dedicated programs. The introduction of standards like WinISDN and CAPI has changed that, as has the ISDN support in Windows 95 and NT. Similarly, until recently there was no agreed way to connect to an ISP using ISDN, whereas today there is synchronous PPP. Last, and by no means least, prices for ISDN equipment were prohibitive. During 1996 prices have continued to drop dramatically, and there is now no reason why the golden age of ISDN in business should not at last begin.

    Intranet

    The new World Wide Web - inside your company

    One of the most obvious trends during 1996 has been the rise to predominance of the World Wide Web for business purposes, particularly marketing and sales, and there seems little doubt that this growth will continue. But alongside the expansion in the use of these public sites - that is, those that are expressly-designed to be accessed by anyone on the Internet - there is a new and interesting use of the Web purely internally within a company, often with no direct links to the outside world - Intranets.

    At first this might seem a strange thing to do, since part of the World Wide Web's power and appeal is that it is so effortlessly worldwide: anyone can access a public site from anywhere. But companies are beginning to realise that the Web is not just a trendy alternative to conventional marketing media, but a complete information delivery system. Moreover, it is one that possesses a number of advantages over proprietary solutions that attempt to offer the same facilities.

    For example, the Web is an open system that is completely independent of platform. You can mix and match any operating system and any hardware that supports the TCP/IP protocols underlying the Internet (and hence WWW). Similarly, it is completely scalable: you can start from a minimalist Web where the HTML files reside on the same machine as the Web browser and progress incrementally to millions of connected machines and thousands of distributed databases - as the Internet itself proves.

    Another advantage of internal Web systems over proprietary groupware and executive information systems (EIS) is cost: even Netscape costs only $39 for the supported commercial version, and there are other good browsers that are free. Ease-of-use is also a major benefit: few people have any difficulty using Web browsers with little or no training; moreover, the nature of the hypertext systems means that navigation is truly a point-and-click affair, and that multimedia elements are integral and active components rather than redundant frills tacked-on afterwards.

    The flexibility that has made the World Wide Web so popular for commercial applications means that it can be applied to almost any department and to any task. An obvious application is internal communications, replacing telephone directories, company organisation charts, corporate newsletters, annual reports, general circulation memos, noticeboards and job postings. The seamless integration of Web clients and servers hanging off intercontinental networks means that multinational companies can bring together staff in a way hitherto impossible.

    Alongside company-wide Webs, there are more local applications. Product development teams can set up stores of information that are shared among the relevant participants, ensuring that the appropriate parties are informed in a timely manner and working together efficiently. Internal Webs are an ideal way of gathering and distributing all kinds of marketing information, and allow physically separated departments to work together on common projects (which can involve graphics, audio files for radio work and even complete videos for television or cinema).

    Sales teams can similarly obtain up-to-the-minute information on company products and services, pricing, competitive data and client records, and can access these at any time of the day or night when out on the road by dialling into the corporate network using the TCP/IP protocols and a browser on their portable. Financial departments can use Web resources for quickly distributing time-sensitive information such as key indicators or budget allocations, and for obtaining managers' regular forecasts for analysis and consolidation. The new security features of Web servers mean that even the most confidential information can now be sent safely over open corporate networks.

    In fact, combined with standard Internet e-mail for one-to-one communication and newsgroup servers for many-to-many discussion groups, there is little that internal Webs and ancillary Internet software cannot now accomplish in a way that is far easier and certainly much cheaper than current solutions involving groupware or EIS.

    'Intranet'

    In a way, the name given to the Internet is highly confusing. It derives from the part of the TCP/IP suite of protocols that define its operation: IP stands for Internet Protocol, and refers to the way in which data packets are routed between networks.

    In fact, before the Internet, internets were nothing special, and despite the overwhelming importance of the Internet, internets still perform vital functions - though they probably tend not to be called internets for fear of confusion.

    Almost the opposite has happened with the word 'intranet'. The term is a neat adaptation of the word 'Internet' to apply to TCP/IP-based networks that exist entirely within a company - with no external portions. Just as the Internet supports multiple services, so intranets are not just Web-based, but can offer E-mail, FTP, telnet and much more.

    In some ways, intranets can be thought of as hidden zones of the Internet, either entirely disjoint from it, or perhaps attached at one or two points by a firewall that acts as a one-way mirror, allowing intranet users to see out, but no one to see in - in theory, at least.

    Obviously, any company can set up an intranet - or even more than one, if there are several disconnected internal networks running the TCP/IP protocols (if they are connected then they just merge into a larger, single intranet).

    For this reason, intranets should, strictly speaking, have a lower case first letter, since they represent a general class, rather than an upper case, as with the Internet, which is unique, and refers to the global TCP/IP network that started it all - and also gave rise to all this hair-splitting.

    Real-life internal Internets

    For Hewlett-Packard (HP), such an internal Internet system is central to its whole way of working: it has over 1,400 servers, an e-mail system that handles 1.5 million messages a day, private newsgroups and routine internal transfers of files using FTP.

    Among the 200 or so home pages that exist on its internal Web is one set up by the sales office in Seattle. Since this group of employees has responsibility for the major local customer Boeing, the home page lists all those working on the account, not just in Seattle, but in HP offices around the world. There are details of the current projects as well as links out through a firewall to Boeing's home page so that those involved have an up-to-date picture of their customer's activities.

    Another home page is run by HP's European personnel group. Jobs across the whole of European operations are listed, allowing employees easily to keep abreast of promotion opportunities at many sites and departments. One sales force uses a home page for product information, Q&As and news about product updates. There are also a few personal Web documents. More generally, HP uses the internal Web to share organisational charts, mission statements, executive speeches, employee newsletters and the personnel policy manual.

    Although the example of Hewlett-Packard shows how effortlessly Internet technologies can cope with enormous systems (the total monthly volume of data passed over the internal network is 5 trillion bytes), this certainly does not mean that they are only suitable for globe-spanning companies. An equally successful example of how these techniques can be applied comes from the other end of the spectrum.

    Zeneca Engineering is the corporate engineering design and build function of Zeneca, a company born out of ICI by demerger. The Zeneca Engineering Web (ZEW - which of course has its ZEWkeeper) links around 150 users, and is unusual in that it is completely serverless. Instead, a shared network drive is used, and Web pages are pulled in to Netscape simply by opening them as a local file.

    The benefits of this approach are that it is extremely easy to set up and maintain, can be implemented across any local area network (not just one running the TCP/IP protocols), and yet can also contain links out into the external WWW (where full Internet connectivity is provided). The downside is that advanced Web options like forms and clickable images are not available.

    The main application of ZEW is called Corporate Memory, a store of engineering information. It acts as a kind of expert system for non-specialised engineers, but unlike dedicated systems is both easy to set up and use (since it employs the familiar Netscape interface). Possible future plans include evolution into a collection of pointers to information on the 'real' World Wide Web, placed there by engineering equipment suppliers - a transition that is easy to accomplish because of the way HTML elements can be added and deleted incrementally. Other applications include distributing general information bulletins - for example, on corporate IT strategy.

    More case studies of internal Web systems, this time in the US at Eli Lilly and Sandia National Labs, can be found at http://home.netscape.com/comprod/at_work/customer_profiles/index.html, which describes some uses of Netscape products (unsurprisingly, given the URL). As you might expect from a company which is attempting to position itself as an Internet trendsetter, Netscape is well to the fore in evangelising about the use of this technology internally, and has produced an interesting paper on the subject which can be found at http://home.netscape.com/comprod/at-work/white_paper/index.html.

    IP Multicast

    Two noticeable trends on the Internet in recent years have been the upsurge in multimedia traffic, and the rise of push services such as PointCast. Both of these place great loads on the Internet's infrastructure. This is largely because of the extremely inefficient way these heavy loads are sent, which in turn arises from the use of Internet techniques that were never designed to cope with this kind of situation.

    For example, there are currently two basic ways of sending information across the Internet. The first is unicast. In this, packets are sent from one server to one client. If more than one client wishes to receive the data, the packets must be retransmitted separately. This is clearly inefficient, especially if services like PointCast are involved: the same information is being sent out over the Internet many millions of times, clogging up the network unnecessarily.

    The main alternative has been broadcast. Here a single transmission from a server is sent to every client out on the Internet: much more efficient for the server, but hideously inefficient for everyone else. The new multicast aims to combine elements of unicast and broadcast. Packets are sent out only once by the server, as with broadcasting, but these are then only forwarded to the network segments that have clients who wish to receive them.

    This is achieved by registering users on sub-networks: if all users on a particular network segment unsubscribe from a multicast service, that part of the Internet will no longer receive the multicast packets, thus saving bandwidth.
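
    The subscription mechanism can be seen in miniature in Java's standard multicast support: joining a group address tells the network that this segment wants the packets, and leaving it tells the network to stop forwarding them. The group address and port below are placeholders chosen purely for illustration.

        import java.net.DatagramPacket;
        import java.net.InetAddress;
        import java.net.MulticastSocket;

        public class MulticastListener {
            public static void main(String[] args) throws Exception {
                // 239.1.2.3 and port 5000 are illustrative values only.
                InetAddress group = InetAddress.getByName("239.1.2.3");
                MulticastSocket socket = new MulticastSocket(5000);

                // Joining the group is the "subscription": the local network now
                // knows that this segment wants to receive the multicast packets.
                socket.joinGroup(group);

                byte[] buffer = new byte[1024];
                DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
                socket.receive(packet);  // blocks until a multicast packet arrives
                System.out.println(new String(packet.getData(), 0, packet.getLength()));

                // Leaving the group is the "unsubscription"; if no clients on this
                // segment remain joined, the packets stop being forwarded here.
                socket.leaveGroup(group);
                socket.close();
            }
        }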

    Java

    One of the most impressive aspects of the World Wide Web is the fact that you can click on hotspots within a hypertext document to navigate through what is sometimes rather grandly called Webspace. This gives a sense of interactivity that is actually rather misleading.

    For the range of options open to you from a normal Web browser is extremely limited: you can move from document to document, view various files (if you have the right helper programs set up correctly), and enter information in on-screen forms in order to obtain simple customised responses. But in terms of real interactivity - where the page you are viewing responds in real-time in an infinitely-extendible way - this is pretty poor stuff.

    The main problem is that although very easy to use, the HTML language is limited and develops very slowly (even HTML 3 is not yet universally used or even supported). This means that it is not possible to add custom features to a Web page because the current HTML language simply cannot support anything too complex.

    The new HotJava Web browser from Sun gets round this by downloading the extra functionality required for a given page in the form of specially-written helper programs, called applets, as and when they are needed, from the server holding that page. In other words, the Web page automatically creates any extra features that the browser will require. To initiate these downloads from a Web page requires the addition of a single new HTML tag, <APP>, which is ignored by other browsers unable to use applets; in this way the Web page is still accessible by the rest of the Internet population, even if the full range of its features is not.

    Applets are written in a language called Java, similar in many ways to C++, but with particular properties developed specially for the task of extending Web browsers dynamically. One important issue that its design addresses is that of security: the potential of an automatically downloaded executable to wreak havoc on a system is obviously high, but Sun insists that its multi-level approach solves the problem (see http://java.sun.com/1.0alpha3/doc/misc/SystemSecurityHelp.html for more on this).
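
    For the curious, a complete applet can be very small indeed. The sketch below (the class name is illustrative) is compiled to a .class file, placed on the server alongside the page, and referenced from the page so that a Java-aware browser downloads and runs it.

        import java.applet.Applet;
        import java.awt.Graphics;

        // A minimal applet: the browser downloads the compiled class from the
        // Web server and runs it inside the page it came from.
        public class HelloApplet extends Applet {
            public void paint(Graphics g) {
                // Draw directly into the rectangle the browser allocates to the applet.
                g.drawString("Hello from a Java applet", 20, 20);
            }
        }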

    HotJava is available for Sun Solaris 2.3, 2.4 and 2.5 SPARC-based machines, and for Windows NT 3.5 or Windows 95 (see Java's home page at http://java.sun.com/ for downloading instructions). In due course versions will also appear for other platforms such as the Apple Macintosh. Also well worth visiting is the Java repository at http://www.gamelan.com/. JavaWorld magazine is found at http://www.javaworld.com/.

    IBM's "Ultimate Resource for Java Developers" at http://www.ibm.com/java/ has more than 100,000 "distinct pieces of Java information". Although much is heavily biased towards IBM, there is a lot of free and useful software. There are sections devoted to news ( http://www.ibm.com/java/news/ ), standalone applications ( http://www.ibm.com/java/apps/applications.html ) and applets ( http://www.ibm.com/java/apps/applets.html ). It is worth trying the Mapuccio applet ( http://www.ibm.com/java/apps/showcase-apps.html ), which creates maps of Web sites and of Java hierarchies. Java tools ( http://www.ibm.com/java/tools/ ) also links to Java development kits, as well as plugging IBM's VisualAge range. Book lists and white papers can be found at http://www.ibm.com/java/education/.

    A list of Java applets to run and examine can be found at http://java.sun.com/applets/index.html. There are two kinds of Java applets: those designed to run on Netscape and those designed for Sun's own HotJava browser. For non-programmers, the site http://www.noware.com.au/index2.htm helps you generate Java applets without coding directly.

    At the moment the Java applets that are available are fairly trivial demos of some of the language's capabilities. They include things like running tickertape displays of live share prices; on-screen animations; three-dimensional objects that can be moved in real-time by simply dragging them on-screen with the mouse pointer; and simulations that evolve as you input data. A good range of applets can be found at http://java.sun.com/1.0alpha3/doc/misc/SystemSecurityHelp.html and http://java.sun.com/contest/results.html (the results of a small competition to develop applets that show the potential of Java).

    The hope of adherents of the Javanese way is that this Web-mediated approach will re-introduce a level playing-field, giving other software manufacturers a second chance to colonise the desktop. Java's methods chime well with the idea of component-based software, built up in a modular fashion. Java also fits in with the rather more fanciful ideas of Internet terminals: supposed sub-$500 (£325) units with no local storage, which pull all their software from the network (Intranet or Internet).

    For companies, Java potentially offers yet more. In-house software projects can be split up into smaller and more manageable elements, delivered over the corporate Intranet to a heterogeneous mix of platforms, and developed incrementally as the need arises. Java itself will allow anyone with C++ skills to create applets, while the associated JavaScript (see http://home.netscape.com/comprod/products/navigator/version_2.0/script/script_info/index.html for an introduction) means that even those with limited programming skills will be able to deploy the full Java applets in advanced interactive Web applications.

    However much companies like Sun, IBM and Netscape might wish it, the success of Java is, of course, no more certain than that of Microsoft's Internet strategy. For example, in the face of the obvious appeal of Java and JavaScript, Microsoft is offering its own component approach using Active-X, together with the new Visual Basic (VB) Script. The latter has the advantage of being supported by more programmers than any other language (more than three million VB developers, claims Microsoft). Against this, Active-X controls (formerly OCXs) have been mostly used on the Windows platforms, whereas Java was designed to be platform-independent from the start.

    Crucial to the success of Java is Netscape and its browser. The 32-bit Windows version of Netscape comes with built-in Java support. Recognising this, even Microsoft has licensed Java for its Internet Explorer browser. Equally, through plug-ins from NCompass (see http://www.excite.sfu.ca/) and Object Power (at http://www.opower.com/), Netscape can support the OCXs that form such an important element of Microsoft's counter-attack.

    Books on Java include Teach Yourself Java in 21 days (£37.50, ISBN 1-57521-030-4). This is aimed at those who want a step-by-step guide to learning the language. It is written by Laura Lemay, whose book on HTML has been much praised, and Charles Perkins who writes the last third of the book covering more technical issues.

    Also worth noting is Active Java (£21.95, ISBN 0-201-40370-6). Although rather thin, with a dull layout and no CD-ROM, it is written by Adam Freeman and Darrel Ince and has some exceptionally clear explanations, which manage to show how Java works without going into complex details.

    The electronic magazines (E-zines) JavaWorld, put out by IDG (http://www.javaworld.com/), and Javology (http://www.magnastar.com/javology/) are rich sources and valuable points of reference. Javology is concise, written with a real love for the subject, and has a simple but effective design. Everything is available from its opening page: the five or so main news stories, updated weekly, and the regular columns. Back issues are also online.

    Java-L is an independent mailing list devoted to Java. To join, send the message sub java-l YourFirstName YourSecondName to listserver@vm.ege.edu.tr.

    When will Java get down to business?

    Java has been such a constant theme of the Internet and intranet world during the last year that it is easy to get swept along by the enthusiasm of its supporters. But as any hard-headed manager will point out, for all the excitement it has generated, Java's concrete achievements - both in terms of real-life use and commercial products - are rather thinner on the ground. One of the important steps made by Java has been the delivery of server-side technology. Originally code-named Jeeves, Sun's Java server indicates how the use of servlets will allow Web pages to be generated on-the-fly using Java technology.
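
    A servlet is, in effect, the server-side counterpart of an applet: a small Java class loaded by the Web server rather than the browser. The sketch below uses the standard servlet classes to show the general shape of one that builds an HTML page on-the-fly for each request; the class name and page text are illustrative only.

        import java.io.IOException;
        import java.io.PrintWriter;
        import javax.servlet.http.HttpServlet;
        import javax.servlet.http.HttpServletRequest;
        import javax.servlet.http.HttpServletResponse;

        // A servlet runs inside the Web server and generates the HTML response
        // afresh for each request, instead of serving a static file.
        public class HelloServlet extends HttpServlet {
            public void doGet(HttpServletRequest request, HttpServletResponse response)
                    throws IOException {
                response.setContentType("text/html");
                PrintWriter out = response.getWriter();
                out.println("<HTML><BODY>");
                out.println("<P>Page generated at " + new java.util.Date() + "</P>");
                out.println("</BODY></HTML>");
            }
        }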

    Although apparently an incremental change, the release of the Java Development Kit 1.1 is actually significant because this updates the basics of the Java language, and addresses many of the more fundamental omissions in the original release. For example, the latest version of Java now offers Java Database Connectivity (JDBC) for hooking up to SQL databases using Java - indispensable if Java is to be taken seriously within the enterprise.
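
    In outline, JDBC code looks much the same whatever database sits behind it. The fragment below is a sketch only: the JDBC-ODBC bridge driver shipped with the JDK is used for illustration, and the data source name, table and columns are stand-ins for whatever a real installation would provide.

        import java.sql.Connection;
        import java.sql.DriverManager;
        import java.sql.ResultSet;
        import java.sql.Statement;

        public class JdbcSketch {
            public static void main(String[] args) throws Exception {
                // Load a driver and connect; the URL, user and password are placeholders.
                Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
                Connection conn = DriverManager.getConnection("jdbc:odbc:sales", "user", "password");

                // Run an ordinary SQL query and walk through the results.
                Statement stmt = conn.createStatement();
                ResultSet rs = stmt.executeQuery("SELECT name, total FROM orders");
                while (rs.next()) {
                    System.out.println(rs.getString("name") + ": " + rs.getInt("total"));
                }

                rs.close();
                stmt.close();
                conn.close();
            }
        }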

    More interesting, perhaps, is the greatly increased support for various aspects of security. It is becoming apparent that elements such as encryption, digital signatures and certificates are all indispensable for Internet and even intranet use, and Java now supports these. It also allows code-signing - the ability to attach a unique personal identifier to a Java applet, say. This, of course, is the preferred technique of Microsoft with its ActiveX technology.

    The advantage is that if you decide to trust the signed applet you can allow it far greater freedom on your system: Java applets until now have been restricted to operate within a safe 'sandbox' - the idea being that this stops malicious code from wreaking havoc on files. If you are sure through a digital signature that the code comes from someone trustworthy, you may decide to allow the applet to access files etc - greatly increasing its power. The downside is that such code-signing ultimately says nothing about whether the coded applet is dangerous or not - just that you can be sure it comes from the person who claims to be its author.

    Another interesting addition is JavaBeans. This allows Java applets to become fully-fledged objects and, as such, to communicate and interact with other objects such as ActiveX and OpenDoc components, which may be distributed across a network. JavaBeans essentially fills in some missing details to bring Java up to par in this area. In addition to the JavaBeans API, Sun has also been very busy defining other APIs to extend Java's reach. These include those for Java Management, the Java Card API for smartcards and JavaTel for computer telephony. In addition to the APIs there is also the full-blown JavaOS.
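
    At its simplest, a bean is just a serialisable Java class that exposes its state through get/set methods and announces changes to any interested components. The example below is purely illustrative - the class name and property are invented - but it shows the standard pattern using the java.beans classes.

        import java.beans.PropertyChangeListener;
        import java.beans.PropertyChangeSupport;
        import java.io.Serializable;

        // A simple bean: state is exposed through get/set methods, and changes
        // to the bound "symbol" property are broadcast to registered listeners.
        public class TickerBean implements Serializable {
            private String symbol = "ACME";   // illustrative default value
            private final PropertyChangeSupport changes = new PropertyChangeSupport(this);

            public String getSymbol() {
                return symbol;
            }

            public void setSymbol(String newSymbol) {
                String old = this.symbol;
                this.symbol = newSymbol;
                // Tell other components (visual builders, other beans) about the change.
                changes.firePropertyChange("symbol", old, newSymbol);
            }

            public void addPropertyChangeListener(PropertyChangeListener l) {
                changes.addPropertyChangeListener(l);
            }

            public void removePropertyChangeListener(PropertyChangeListener l) {
                changes.removePropertyChangeListener(l);
            }
        }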

    But in a way all this high-powered programming activity only underlines the basic problem with Java: it remains mostly theory and little practice. Given that the language has been around for over a year, Java software remains worryingly thin on the ground. Even Sun has come out with only a few things, like its Java development tool Java Workshop, plus what it symptomatically calls the "1.0 preBeta 2" of its HotJava Web browser.

    Put bluntly, Java cannot remain a brilliant and promising idea for ever. If it does not begin to generate real business uses soon it risks being sidelined by the ever-adroit Microsoft. The latter's ActiveX technology may be technically inferior, but it is already being pushed out hard into the real world where being first counts much more than being best.

    What's going on under the surface of Java?

    Microsoft may be criticising Java for not being good enough as a platform, but in fact the language is improving all the time.

    The case for Java has not been helped by Sun's rather risible attempt to have itself accepted as an independent international standards body for the language. But despite this there are increasing signs of a kind of Java iceberg: little visible on the surface, but a lot happening below.

    For example, the Java Solutions Guide is evidence that real Java applications are starting to come through in quantity. Perhaps more importantly for the long term, there are also a number of market reports charting the surprisingly high level of Java activity within corporations.

    Interestingly, this support seems to be both top-down and bottom-up. Management sees Java as solving a number of strategic problems, while programmers like it for its design. C/C++ is never going to disappear, but it might well turn into the Cobol of the 21st century: a highly paid but rather despised skill.

    One of the major criticisms of Java remains its poor performance. The increasing use of just-in-time compilers is already solving the worst problems here, certainly in the context of applets, where speed is not that crucial.

    In those circumstances where speed does matter, the solution will probably be to compile the Java programs to native instructions rather than virtual machine bytecode. Although such instructions will not be portable, cross-platform support can be supplied simply by compiling the Java programs for each platform in turn.

    Now that Java is going beyond applets, the other main criticism is that it lacks advanced features. This is being addressed through the use of supplementary application programming interfaces, which are being defined at an accelerating rate.

    One of the fascinating recent developments in this area is the definition of versions of Java tailored for different markets. These include Personal Java for networkable devices in home, office and mobile use and Embedded Java for embedded devices.

    The other major shift becoming clear is that as Java moves away from applets to applications it is growing into a full component-based architecture, mostly through the definition of JavaBeans and the accompanying Beans Development Kit.

    Recent developments include the Infobus for exchanging data between beans and the draft of the next version of the JavaBeans standard, code-named Glasgow.

    Against this background of increasing maturity, the main threat to Java's success seems to be that of schism. Netscape added its own extensions to the basic Java classes with its Internet Foundation Classes, now merged with Sun's work on the official Java Foundation Classes. Microsoft has produced its equivalent Application Foundation Classes.

    This in itself is not so serious - the classes are platform-independent and have been praised for their quality - but is symptomatic of Microsoft's go-it-alone approach.

    More worrying, of course, is the release of its Windows-specific J/Direct extensions to Java, which allow direct calls to Win32 application programming interfaces through the virtual machine in Internet Explorer 4. The fact that Java will be split into Windows and non-Windows versions may be fine for Microsoft but is not so good for the rest of the burgeoning Java world.

    Several top-rank computer companies are engaged in the struggles over who controls key elements of the language. Hewlett-Packard has created its own flavour of Java designed for embedded devices. Microsoft has licensed this divergent Java for the Windows CE operating system (see http://www.hp.com/pressrel/mar98/20mar98b.html ).

    Microsoft's direction has been to gain better 32-bit Windows performance, first with J/Direct technology ( http://www.microsoft.com/java/resource/jdirect.htm ), and then with the more sophisticated Windows Foundation Classes ( http://www.microsoft.com/visualj/techmat/feature/wfc/unification.htm ).

    Sun's response is a technology called Java Plug-in, available free from http://www.javasoft.com/products/plugin/index.html. This plug-in replaces the Java virtual machines in both Microsoft Internet Explorer and Netscape's Communicator, ensuring that programs written to take advantage of all the advanced features of Java will run - and that non-standard extensions do not. In this way Sun creates a uniform platform for Java software and counters the increasingly fragmented and variable world of current Java Virtual Machines. Unfortunately the practice is not so smooth: when you first load one of these pages you are greeted with a message to the effect that you need to download either a plug-in (for Netscape) or an ActiveX control (for Internet Explorer). These weigh in at 8 Mbytes, which is a strong incentive not to bother.

    Advice in Java-speak for forward-thinking firms

    Finding a Java book among the hundreds available that offers some strategic insight into this key technology is a daunting task. Making Sense of Java (£19.99, ISBN 013-263-2942) is a good place to start: its subtitle is "A guide for managers and the rest of us". Aside from a few unexplained acronyms it generally succeeds in explaining what Java is all about and why it matters to businesses, without delving into the gory details.

    One of the problems with books about Java is that the people who know most about it tend to forget what it is like for those who know nothing, and make unwarranted assumptions about background knowledge in the books they write. This might be a reason to consider the titles Java for Dummies (£23.99, ISBN 1-56884-641-X) and Java Programming for Dummies (£28.99, ISBN 1-56884-995-8).

    Whether you like the jokey, highly American style of the writing is a matter of personal preference; at least they start from the premise that you need to have everything explained. The first title is perhaps rather too lightweight - it also covers Javascript, which has little to do with Java - while the second has flashes of inspiration, but is a little patchy.

    Rather better is Java Now (£15.49, ISBN 1-884133-30-4) which is cheap and genuinely helpful in its explanations. Also good is Java by Example (£32.99, ISBN 0-7897-0814-0), which is very sensibly written and rather fuller in its coverage. Both of these titles concentrate on writing Java applets for use in HTML pages, rather than on standalone applications.

    Java 1.1 Programming in 24 Hours (£23.50, ISBN 1-57521-270-6) is once again rather jokey in its style, but otherwise quite gentle in its pace. The less ambitious-sounding Java 1.1 in 21 Days (£37.50, ISBN 1-57521-142-4) is written partly by Laura Lemay, less a famous author now than a publishing phenomenon. Indeed, as well as this title, there is her Java Starter Kit (£38.25, ISBN 1-57521-077-0) which comes bundled with Symantec's useful Café Lite development tool (as do many other Java titles) and Laura Lemay's Java 1.1 Interactive Course (£46.14, ISBN 1-57169-083-2).

    The last of these is the best. Not only does it offer the possibility of accessing online teaching resources (Q & A and personal tutors), but it provides far more information than the other versions through its excellent appendices. One of the biggest problems with learning and understanding Java is that there are so many inter-related elements that the beginner can become confused. Placing all this information in one location helps a great deal.

    In fact, the absence of such a resource is about the only thing missing from the otherwise excellent Beginning Java (£36.99, ISBN 1-861000-27-8). This is a fat book - more than 1,000 pages - and it provides more explanation than any other title discussed here. It deals with advanced topics such as hooking up to SQL databases and, even though it is a little obsessive about covering every tiny detail, it does so with great clarity.

    If you want something more digestible, that adopts a very different approach, you might try How to Program in Java (£27.50, ISBN 1-56276-478-0). This uses metaphors and comparisons to get the points across, has a good glossary and is in colour throughout. It is not as trivial as it sounds: like many Java development tools, it uses colour to help differentiate between different elements in the language.

    Java Developer's Resources (£22.99, ISBN 0-13-570789-7) does the same in black and white, though not quite so neatly. It too is fairly idiosyncratic, and Apple Macintosh users may prefer it since it uses screen shots from this machine.

    Finally, two more advanced titles. Exploring Java (£18.50, ISBN 1-56592-184-4) assumes you have some knowledge of C/C++, though it is not essential. This is an intelligently written book, and is recommended if you want a faster pace.

    Also good in this respect is The Java Handbook (£21.99, ISBN 0-07-882199-1). This is of note because it is written by one of the founders of Java. It is hard to argue with an author when he can write "when we designed Java" and yet explain the basic ideas so clearly.

    Web leaders decide to stick with Javascript

    With the high-profile dip in Netscape's fortunes (March 1998) it is only too easy to overlook the company's considerable achievements. As well as more or less creating the commercial browser sector and introducing Web page features such as tables and frames, it also drew up the de facto standard for secure transmission of data across the Internet, SSL.

    But there is one other area where Netscape has succeeded in defining an entire new field: Web scripting. Almost without anyone noticing, the Javascript language that the company introduced with Navigator 2 has become the undisputed victor in the scripting stakes.

    Anecdotal support for this can be found in the most unlikely place: Microsoft's own Web site. This uses scripting extensively, but almost without exception it is Javascript, not VB Script, the cut-down version of Visual Basic that was meant to be Microsoft's knockout riposte to Javascript.

    It is possible to quantify Javascript's conquest of the Web rather more exactly. The Hotbot search engine, which uses the powerful Inktomi engine, has a Super Search facility (link on home page).

    As well as allowing sophisticated searches using standard AND and NOT operators, the Super Search option also enables searches restricted by time and place to be carried out. These can be found on other search engines, but to my knowledge no other site offers the ability to search quite so easily for the presence of multimedia elements, Java applets or scripting languages in Web pages.

    Using this facility, I found about 58,000 pages using VB Script, and 3,108,000 pages using Javascript, which demonstrates pretty conclusively that VB Script for Internet Web pages is of almost no importance. However, it is worth emphasising that it is probably more widely used for server-side scripting in the context of Microsoft's Active Server Pages.

    Javascript's dominance is likely to be consolidated by three developments. The first is the elevation of Netscape's proprietary technology to an independent standard. Javascript is now officially ECMA Script - as defined by the European Computer Manufacturers' Association.

    Another move makes ECMA Script the official scripting language for XML's new Extensible Style Language (XSL). And if further proof of Javascript's triumph were needed, the XSL proposal comes from a consortium led by none other than Microsoft. Against this background of almost universal use of Javascript, it is surprising how few tools there are to help in the creation of scripts. Although an ordinary text editor is sufficient, the structured nature of the language means that dedicated development environments can aid programmers enormously.

    Appropriately enough, Netscape itself was one of the first to come up with a Javascript tool. Details of its excellent Visual Javascript, along with a trial download, can be obtained from Netscape's site. Microsoft has a Javascript development environment of sorts, available as part of its Visual InterDev 1.0 suite, but nothing comparable to those available for Visual Basic or Visual C++.

    One of the few other Javascript tools is NetObject's Script Builder 2.0, which is able to work with several scripting languages. A trial version is available.

    JavaBeans

    Although one aspect of the current battle between Microsoft and the rest of the Internet industry is sometimes portrayed as ActiveX versus Java, these are not strictly speaking equivalent technologies. Java is a language, used for writing applications or applets, while ActiveX consists of components that can be used together to create more complex entities. Because of the security measures built into Java - one of its strong points - it is not possible for Java applets to interact with either the Web page they occur in or other applets found there.

    To get round these limitations, and to allow Java applets to be extended into a full component architecture, JavaSoft has devised what it calls the JavaBeans specifications - continuing an increasingly tiresome coffee theme (for Americans, Java is a drink rather than a place) which is almost completely lost on the rest of the world.

    JavaBeans may be anything from simple on-screen elements such as buttons to large applications. But as well as allowing more complex software to be built, JavaBeans will also enable non-programmers to create Java-based applications by plugging together pre-written components. This will be achieved using development environments of the kind popularised by Microsoft with its Visual Basic programming language - which is where ActiveX has its roots.

    Indeed, the JavaBeans project effectively aims to marry the ease of Visual Basic's component approach with the power of Java's platform-independence and built-in networking. Other more technical aspects include the creation of bridges between JavaBeans and other component architectures such as ActiveX and OpenDoc.
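
    In practice a bean is simply an ordinary Java class that follows a handful of conventions - a no-argument constructor, get/set methods for each property, and serialisability - so that visual tools can discover it and wire it up. The following is a minimal, hypothetical example:

        import java.io.Serializable;

        // A trivial bean: builder tools discover the 'caption' property
        // automatically from the getCaption/setCaption naming convention.
        public class CaptionBean implements Serializable {
            private String caption = "Hello";

            public CaptionBean() {
            }

            public String getCaption() {
                return caption;
            }

            public void setCaption(String caption) {
                this.caption = caption;
            }
        }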

    JavaBeans in 21 Days (£27.95, ISBN 1-57521-316-8) is a good practical introduction to creating JavaBeans, the Java-based component architecture.

    Java Virtual Machine

    When Java first appeared in 1995 it was announced with the slogan "write once, run anywhere". This idea of code that could be written once and then run on any platform seemed to promise an end to barren discussions about which machines and operating systems were better, so users would be able to concentrate on choosing the best applications rather than worrying about backing the wrong product.

    Java insulates the program from the underlying hardware or operating system because Java programs are compiled into bytecodes: processing instructions for a virtual chip, the Java Virtual Machine.

    By implementing a Java Virtual Machine on a particular hardware and software platform, the same bytecodes can be run whatever lies underneath.

    The cost of this feature is that an extra step is required to convert the bytecodes into a form that will run on the actual hardware and software.
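
    The point can be made with the smallest of sketches: the source file below is compiled once into bytecodes, and the resulting class file then runs unchanged on any machine that provides a Java Virtual Machine.

        // Hello.java - compiled once with:    javac Hello.java
        // producing Hello.class, a file of platform-neutral bytecodes.
        // The identical Hello.class is then run on any machine with a JVM:
        //     java Hello
        public class Hello {
            public static void main(String[] args) {
                System.out.println("Same bytecodes, any platform");
            }
        }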

    Formulating this vision has proved easier than realising it, and some cynics claim that the hope of "write once, run anywhere" has turned out to be "write once, debug everywhere".

    Java has nevertheless flourished despite this failure to deliver completely on one of its much-hyped benefits - probably because it has moved on from its early, rather simplistic vision.

    JEPI

    It is becoming increasingly accepted that the issues surrounding online electronic commerce have now essentially been resolved. In fact, part of the difficulty of moving forward in this area is simply that there are now too many different solutions to the problem of secure payment transactions across the Internet.

    This lack of a common standard might be seen as a major obstacle to developing this market. But a moment's reflection will indicate that precisely the same fragmentation is present in the general economic marketplace. That is, there are very many different solutions to paying for goods and services. They include techniques as diverse as cash, cheques, credit cards and barter.

    What happens in the standard economic model is that among these various possibilities the buyer and seller agree on a common transaction method. In a similar way, the many alternative and rival electronic commerce payment methods can be regarded as a range of options for the buyer and seller to choose among.

    The widely-supported Joint Electronic Payments Initiative (JEPI), drawn up by the World Wide Web Consortium and CommerceNet, aims to offer the framework for precisely this kind of negotiation to take place across the Internet. It does not aim to set which of the alternatives is the best.

    One of the most interesting theoretical aspects of JEPI is that the whole process of negotiation between the buyer and seller - or equivalently between server and client - can be effected completely automatically, with both parties only becoming involved in the last stage of the process: approval of the choice made by the two computer systems.

    Jini

    The Java programming language and environment was originally designed to control devices in the home. When this market failed to develop, Java was re-positioned as an Internet language and platform. This created an exciting new way of writing programs that could run across platforms, and that could be delivered across the network as they were needed. But in a way Java remains something of a half-way house. Even though Sun has been extending the range of applications that can employ Java, the emphasis is still squarely on some kind of computer that can run a Java Virtual Machine to process Java code. And yet the vast majority of objects with chips in them are not computers, but peripherals. For the Java idea to be taken to its logical conclusion, a way needs to be developed to allow even the humblest of peripherals to participate in this vision of a networked world.

    That way is called Jini (pronounced "genie"), which is Sun's attempt to generalise the Java approach to embrace almost any kind of electrical or electronic appliance. The idea is simple: such devices are endowed with just enough microprocessor power to run a form of Java and to be connected to a network; using the Jini framework, they can then announce themselves on the network and make use of other Jini-enabled resources.

    For the end-user, the result should be a true plug-and-play world where any Jini device can simply be added to the network and left to set itself up automatically. This would apply equally to traditional computer peripherals such as printers and disc drives, and to new classes of wireless or consumer devices - a return to Java's roots.

    Kerberos

    Client-server technology lies at the heart of the Internet, which means that the latter also suffers from the basic problems of this approach. A case in point involves the issues of authentication and authorisation. When a client requests some service from a server, a basic question is whether the user has permission to access those facilities. If the service is the World Wide Web, this usually amounts to controlling which pages are available and to whom. More generally, though, it is necessary to specify a complex set of permissions for each user or class of users.

    One technique employed for client/server systems is to use the Kerberos program, named after the three-headed guard dog of hell in Greek mythology - with authentication, authorisation and accounting forming the corresponding software triple. Kerberos works by using a secure server that holds users' secret passwords. When a prospective client requests a service from a server, the user first contacts the Kerberos authenticating server.

    This sends what is called a ticket - a permission - back to the user, encrypted using various keys. Encryption ensures that only the authorised user will be able to access the correct server. The ticket is then sent to the server as proof that the user is indeed permitted to receive the requested service. Although important in the US, Kerberos is not available in Europe since it employs encryption techniques the US Government deems munitions and hence unexportable. However, there is a home-grown version called Sesame. In some respects this is superior to the original Kerberos, though it is more of a development tool than a final product.
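
    The ticket exchange described above can be sketched in outline as follows. The classes here are invented purely to illustrate the flow and are not a real Kerberos library; a genuine implementation would, among other things, encrypt the ticket and check timestamps.

        // A purely conceptual sketch of the Kerberos ticket flow; the types
        // below are illustrative stand-ins, not an actual API.
        public class KerberosSketch {

            static class Ticket {
                final String user;
                final String service;
                Ticket(String user, String service) {
                    this.user = user;
                    this.service = service;
                }
            }

            static class AuthenticationServer {
                // 1. The user proves who they are (via their secret password)
                //    and asks for a ticket for a particular service.
                Ticket requestTicket(String user, String password, String service) {
                    // ...verify the password against the secure database, then
                    //    issue a ticket (in reality encrypted with various keys).
                    return new Ticket(user, service);
                }
            }

            static class FileServer {
                // 2. The service checks the ticket before granting access.
                boolean accept(Ticket t) {
                    return "fileserver".equals(t.service);
                }
            }

            public static void main(String[] args) {
                AuthenticationServer kdc = new AuthenticationServer();
                Ticket ticket = kdc.requestTicket("alice", "secret", "fileserver");
                System.out.println("Access granted: " + new FileServer().accept(ticket));
            }
        }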

    Key escrow

    Governments around the world are struggling to come to terms with the power of cryptography that is available to Internet users. Their various security organisations are frightened by the prospect of effectively unbreakable encryption. Strangely, for a government that has otherwise been among the most liberal in matters of personal freedom, the US administration has one of the most repressive approaches, including a ban on the export of products employing so-called strong cryptography.

    As a nominal concession to the US software industry, the current administration came up with a modification to this ban, whereby moderately strong encryption could be exported provided a copy of the key used by every program was filed with what was called an escrow agent in the US. This key escrow would mean security organisations could obtain the keys if they needed them.

    Understandably, businesses are not happy at the thought of their supposedly private keys being held by third parties, but this has not stopped other governments from flirting with the same idea. The underlying problem, of course, is that the power vested with the escrow agents is extremely great. If the security of one of them is compromised and the secret keys of companies stolen, this would jeopardise the integrity of almost every aspect of a company's online activities, so central are private keys now to the functioning of businesses on the Internet.

    Kleiner Perkins Caufield & Byers (KPCB)

    Take-overs and tie-ups are likely to sweep through the Internet industry. An obvious example was the purchase of Excite by @Home, followed by another major fusion when Yahoo bought Geocities.

    At first glance these mega-mergers involving companies worth (on paper) billions of dollars might seem to be purely products of market forces, a recognition of the fact that the Internet world is beginning to mature, and that fragmentation is giving way to consolidation. But closer examination reveals that there are in fact some interesting hidden dynamics to these deals - ones which have important implications for the future development of the Internet.

    For both Excite and @Home have as a major shareholder the venture capitalist firm Kleiner Perkins Caufield & Byers (KPCB), and it seems that the marriage took place in part at the prompting of KPCB.

    In fact, KPCB is quite open about its efforts to promote co-operation between its investments, which it regards as what it calls a "keiretsu". As the company explains, "the term 'keiretsu' describes modern Japanese networks of companies linked by mutual obligation," and the KPCB version now embraces more than 175 companies and thousands of executives in this way.

    Among the complete roster of firms that go to make up this clan, notable names include @Home, Amazon.com, AOL, Excite, Marimba, Netscape and VeriSign.

    KPCB is also one of the main driving forces behind Java, through its Java Fund, using investments from companies such as Compaq, IBM, Netscape, Novell, Oracle and Sun. So far, none of its clients in this area has hit the big time in the same way as the other Internet investments. But in terms of impact on this sector, KPCB through its Java Fund probably plays a role as important as anyone other than Sun itself.

    Just as the @Home-Excite deal was partly driven by KPCB (which may also have had a hand in the AOL-Netscape merger), so the other major recent fusion, of Yahoo and Geocities, represents the formalisation of links within another major Internet clan. But where KPCB merely borrows a Japanese term for its strategy, the other keiretsu is centred on a company that is actually Japanese.

    That company is Softbank, which invests both directly in Internet outfits and indirectly through its venture capital arm. The latter's portfolio is particularly strong in two areas.

    The first is electronic commerce. Major names here include Cybercash, E*Trade and VeriSign. The other sector in which Softbank has some top-flight investments is content, where both Geocities and Yahoo figure prominently.

    Other companies where Softbank has invested and that have been mentioned before in Getting Wired include BackWeb, Digimarc, Firefly, Intervista and Pointcast.

    The existence of these KPCB and Softbank clans, albeit very loose and informal in their association, constitutes an important hidden feature of the Internet landscape.

    Anyone who wants to explore the likely pattern of mergers and take-overs in the future could do worse than to consider the implicit partnerships already present among the members of these two key groupings.

    Linux

    The Unix-compatible operating system Linux is perhaps the Internet's best-kept secret. A robust, secure and constantly evolving product, Linux can match any of the commercial Unix implementations and trumps them all in one respect: it is free.

    This means Linux is a highly attractive option for businesses that would prefer to use a Unix-based system but balk at the fees for commercial versions.

    Another great advantage of Linux is that it is designed to run on PCs (more or less anything from a 386 with 4 Mbytes of Ram up), and so the hardware side is cheap too. Given the abundance of good TCP/IP freeware, this must make Linux the least expensive way of setting up an Internet or intranet server.

    As you might expect, Linux is available from many sites on the Internet. However, there is a problem that must be overcome because of the way Linux has grown.

    Originally a personal project of the Finnish student Linus Torvalds, Linux is now run by a band of volunteers who refine it and add to its capabilities. This means that the many hundreds of component parts of Linux are in a constant state of flux, with upgrades and bug fixes appearing every day.

    To get round this, various people have put together what are known as Linux distributions. These are essentially complete packages of Linux elements that are relatively easy to set up.

    Some of the main distributions are held in the great Linux directory at the Imperial College SunSite (at ftp://sunsite.doc.ic.ac.uk/pub/computing/systems/unix/Linux/). There you will find mirrors of the main US archives (such as the one run at ftp://sunsite.doc.ic.ac.uk/pub/computing/systems/unix/Linux/sunsite.unc-mirror/distributions/) as well as many commercial distributions.

    The latter charge not for the Linux files themselves but for the effort put into packaging them as easy-to-use distributions. Even so, many of these are available free of charge over the Internet. However, the cost of downloading them over normal telephone lines may be prohibitive.

    For this reason, Linux distributions are also available on cheap CD-ROMs. For example, the Linux-FT set (Lasermoon 01329 834944) lets you run Linux from a CD-ROM drive, installing files progressively. The cost is £64.95 for six CDs including mirrors of the main archives.

    Other popular sets are Slackware (very widely available, for example in the five-CD Linux Developer's Resource put together by InfoMagic, £17.50 from Lasermoon), Red Hat (see http://www.redhat.com/), Debian (http://www.debian.org/) and Yggdrasil (http://www.yggdrasil.com/). Full details of distributions can be found at ftp://sunsite.doc.ic.ac.uk/pub/computing/systems/unix/Linux/sunsite.unc-mirror/docs/HOWTO/Distribution-HOWTO.

    Even though these distributions make it easier to get hold of all the files you need, installation is still somewhat involved. Of course copious documentation for Linux is available online (see ftp://ftp.ox.ac.uk/pub/linux/LDP_WWW/linux.html), but you will probably want some hard copy help too.

    Dr Linux, a 1,200-page book, covers every aspect of Linux installation (£28.95, from Lasermoon). Linux Unleashed (£39.50, ISBN 0-672-30908-4) and Using Linux Special Edition (£56.49, ISBN 0-7897-0742-X) offer useful help on installing the distributions (Slackware and Red Hat respectively) they come with. O'Reilly & Associates offers the Red Hat distribution on CD-ROM together with installation instructions (£18.50, ISBN 1-56592-171-2), designed to be used in conjunction with the book Running Linux (£18.50, ISBN 1-56592-100-3).

    From little acorns do large oak trees grow

    It is rather remarkable that more than 40 per cent of Internet Web servers are using the free software Apache. This is largely a testimony to the power of that system rather than any simple desire to save money. Similarly, rising use of the Unix clone Linux is down to the fact that it matches commercial products in just about every respect - and yet costs nothing, comes with the source code, and may be modified in any way. Linux began some six years ago as a private project of the Finnish student Linus Torvalds, and has grown into a vast collaborative venture that involves hundreds of volunteers working across the Internet.

    Partly because of the way it is created, Linux is an operating system that is probably the best-equipped in terms of core Internet functions, and which also offers support for the latest communications technologies such as IPv6 and ATM. It is a natural match for Apache, and it is likely that many sites using this Web server are also running Linux, which becomes, in effect, the Internet's operating system.

    Linux is available in various packages, called distributions. Currently the most popular distributions are Slackware, Debian, Red Hat, FT-Linux and Linux Pro.

    The Linux project has concentrated almost exclusively on producing the kernel of an operating system. Many of the ancillary tools - for example C and C++ compilers - have come from the complementary GNU (GNU's Not Unix) project. Combining these two movements has led to one of the richest and most robust development environments around. Indeed, since Linux is so cheap, and can run on just about any hardware, it is very widely used by programmers in companies - often without the knowledge of their management.

    Until recently, Linux has been signally lacking in desktop applications, which has precluded any wider rollout within corporates. However, the growth of the large Linux user base - conservative estimates put this at around two million - means that commercial companies have now stepped in to meet this need. For example, Applixware (£370) offers a full office suite of applications - word processor, spreadsheet, presentation graphics and e-mail - that is both powerful and fully integrated.

    Another suite that will be available for Linux is StarOffice (at http://www.stardiv.de/staroffice/produktinfos/so31_linux.html, only in German). But perhaps the most significant development in this area has been the work of Caldera.

    The entry-level Caldera product is called OpenLinux Base, and comes with a number of software packages, including the Apache server, Netscape Navigator 2, FTP and Usenet programs, PERL and CGI support. OpenLinux Base is available now for £55, and later this year OpenLinux Standard will be launched.

    Additional features include StarOffice, Netscape Navigator Gold 3 and the Netscape FastTrack Web server. The top-of-the-line OpenLinux Server will offer a five-user licence for Novell Groupwise as well as extended Netware capabilities.

    Alongside these basic products, Caldera also offers a number of applications. The Caldera Solutions CD contains Linux versions of leading software such as Adabas D, CorelDraw, and the Caldera Office Suite (also available separately for £259), which includes WordPerfect, the NExS spreadsheet and e-mail. Caldera has also come out with a Linux port of Sun's WABI (£139), which allows you to run Windows 3.1 applications directly. All Caldera products and the Applixware suite can be obtained in the UK from Starstream Communications. The current burgeoning of Linux desktop applications could well mean that the Internet's operating system may soon become a serious option for ordinary business users too.

    Cut operating system cost with Linux books

    The unique virtues of the free Unix clone Linux have been discussed several times in these columns, emphasising how this collaborative effort of public-spirited programmers around the world (and across the Internet) offers businesses the chance to save considerable sums of money without compromising on quality.

    One of the many advantages of Linux is that it is so easy to try out: all you need is a PC - even one that is underpowered for Windows 95 will be ample for Linux - and the software. For those with a fast Internet connection, the software can be downloaded from various sites. But even if you do have such a link, it may well be better to take the alternative route of buying a Linux book. Doing so will not only give you a fat tome with detailed help on setting it up, but also the software on a CD-ROM, since including this costs the publishers almost nothing.

    In fact, so cheap are CDs now that the third edition of Using Linux (£56.49, ISBN 0-7897-1132-X) comes with no less than three of them, one each for the three leading distributions of Linux (a distribution is a self-sufficient collection of Linux files, together with an installation program).

    These are Red Hat and Slackware, and a cut-down version of OpenLinux from Caldera. As well as useful introductions to installing the first two of these, Using Linux is particularly good on Linux and the Internet. There are several chapters on TCP/IP and Internet services (including details of how to set up and run the Apache Web server), and even an electronic version of another book, Running A Perfect Web Site With Apache.

    Another massive volume covering all aspects of Linux is Red Hat Linux Unleashed (£46.95, ISBN 0-672-30962-9). However, this comes with only the Red Hat release, and lacks information on Apache, a fairly serious omission for business users, since running this Web server is likely to be one of the main incentives for turning to Linux.

    Linux Secrets (£45.99, ISBN 1-56884-798-X) comes only with the Slackware distribution, which is beginning to fall out of favour with users since it is rather old-fashioned in its approach to installation, and also has nothing on Apache, so it might seem out of the running. But its contents are so hands-on, and so obviously written from deep knowledge of Linux, that I would none the less recommend it to anyone who wants a more personal, and perhaps ultimately more helpful, book than the two blockbusters discussed above.

    Linux, The Complete Reference (£29.95, ISBN 0-07-882189-4) comes with a distribution from Caldera (though not an up-to-date one), but the fact that the name of the originator of Linux, Linus Torvalds, is consistently misspelled does not inspire confidence. The book does have several good chapters on the Internet, and tells you more about setting up a Gopher server than you'll probably ever want to know.

    Some other titles well worth considering, even though they come without CDs, are those that have been written by key players in the Linux movement. For example, one of the authors of Running Linux (£18.50, ISBN 1-56592-151-8) is Matt Welsh, who co-ordinates the massive Linux Documentation Project. However, the book is probably not the best for beginners, since it often assumes some knowledge. Rather better is Linux Network Administrator's Guide (£18.50, ISBN 1-56592-087-2), which is an excellent introduction to TCP/IP and the Internet as well as to using Linux in these areas.

    Two other more specialised books worth noting for more advanced users are Linux Multimedia Guide (£24.50, ISBN 1-56592-219-0) and Linux in a Nutshell (£14.95, ISBN 1-56592-167-4). Finally, for those who really want to understand Linux in depth, there is Inside Linux ($22, ISBN 0-916151-74-3, available from the SSC Online Catalog), a low-level explanation of how it works; and Operating Systems, Design and Implementation (£26.95, ISBN 0-13-638677-6), the second edition of the book which inspired the Linux project in the first place.

    Linux (almost) moves into the mainstream

    As previous features have explained, despite the fact that the Linux operating system kernel is highly robust, available on just about every hardware platform and is free, it has remained something of a brave choice for corporate users.

    Linux is hard to install and use, there is no formalised support and there are no applications written for it: such was the litany of accusations against it. But something of a sea-change has taken place, and the obstacles to installing Linux systems in companies are steadily falling.

    Increasingly, Linux is being talked about as a serious rival to Windows NT - especially as NT 5 (Windows 2000) moves further into the future, and more doubts are expressed about its eventual stability, an area where Linux excels (a comparison is available).

    This is not just wishful thinking on the part of Linux advocates. It is true that figures on the spread of Linux are hard to come by. Alongside market research estimates, there are only a few objective indicators such as the IOS++ counter. Through polling more than 700,000 Web servers in Europe, this has come up with a market share of 27% for Linux in that sphere, well ahead of Windows 95/NT (24%) and nearly double that of Solaris (15%) [September 1998].

    But the best evidence that Linux is catching on within companies is provided in a more striking, if indirect, way. No less than four of the top database suppliers - Oracle, IBM, Computer Associates and Informix - have announced that they will be porting their products to Linux. Clearly, they would not be wasting their valuable programming resources if the demand were not there.

    Indeed, one of the main criticisms against Linux has been the absence of just such a presence from the top-rank business software houses. The entrance of giants like Oracle suddenly confers a new legitimacy on the operating system built and maintained by hackers across the Internet.

    A further significant straw in the wind was IBM's announcement that it would be bundling another Open Source program, the Web server Apache, with its WebSphere Application Server.

    Again, it is significant that a top player like IBM should give its imprimatur to a program which, though widely used (running 52% of the Web according to the Netcraft survey), has often been installed covertly.

    Other big names jumping on the accelerating Linux bandwagon include Corel, which sells a range of Linux-powered PCs called Netwinder, and even Intel.

    The latter's future support for Linux on the Pentium and the 64-bit Merced chip promises to be yet another reason why NT 5 will face increasing competition from the Open Source operating system.

    Of course, Linux's journey into the corporate IT department is not yet plain sailing. Although the code is pretty irreproachable, support remains an issue. As well as Caldera, which has offered corporations Linux support for some time, another supplier, Red Hat, has started to provide this key service.

    And IBM's example with Apache shows that it is not impossible that hardware vendors could start doing the same with PCs.

    The final obstacle to Linux uptake remains its weakness on the desktop.

    Although the arrival of the big databases will boost it further in the server market, as a client there are still too few top-class applications.

    An honourable exception is the Gimp, a program offering Photoshop-like image manipulation capabilities that is completely free.

    Finally, one of the biggest omissions in the Linux portfolio is a decent graphical user interface (GUI) - indispensable if the operating system is to compete with the polished front-end of Windows. In fact Linux has two candidates that both promise to match Microsoft's GUI: Gnome and KDE.

    Unfortunately there is a fierce ideological battle going on between their adherents. If Linux is really going to become a serious business operating system its supporters will need to learn to leave behind such fruitless internecine conflicts.

    Long transactions

    The early days of Internet commerce proceeded rather innocently. If you wanted to buy something online, you would typically enter your credit card details into an on-screen form. These details would then - if you were lucky - be encrypted as they were sent over the Internet, along with information about the item being bought. At the other end, they would be taken down by hand, and the order fulfilled, again manually.

    Things moved on, with manual actions being replaced by integration between Web servers, and purchase and perhaps inventory databases. But with this automation came a new danger: that things might get stuck half-way, that overall transactions might not be fully completed, leaving datasets hanging and systems in ambiguous states.

    More recently there has been a shift to a transactional approach, whereby the overall integrity of complete transactions is controlled by dedicated software: transaction processors. More advanced versions are designed to handle operations that may also be distributed across physically separate systems.

    But as Web commerce systems have become yet more complex, so the need arises to cater for even more sparse transactions. Among these are the so-called long transactions, those taking place over extended periods, weeks even, and involving many separate and often quite heterogeneous acts.

    For example, an E-mail order might set in train many complex Web server and database operations; to roll back the entire process involves not just undoing the server-side operations, but also sending an E-mail to cancel the one that initiated the whole process.

    Loopback

    Every individual node on the Internet has its own address. This takes the form 123.45.67.89, where each of the four numbers lies between 0 and 255. Not all addresses are available, even among those that have not yet been allocated, since some are reserved for special purposes such as MBone multicasting. Another of these reserved addresses, 127.0.0.1, is highly unusual in that it is not allocated to any particular Internet node because it can be used by all of them for a very specific purpose: to refer to themselves.

    That is, this loopback address as it is usually known, refers to the local machine employing it. Any messages sent from the local machine to this loopback address are simply routed straight back without ever leaving it.

    This might appear a singularly pointless thing to do, but it does have its uses. For example, if you are running a Web server and Web browser on the same machine, you can access the former with the latter by entering the URL http://127.0.0.1/. This saves you needing to know the real Internet number assigned to that machine; indeed using this loopback address you can dispense with a registered address altogether.

    Another use is indicated by some of the offline Web readers discussed in this week's main feature. There, an ordinary Web browser may be modified so that it sends its requests for Web pages not out to the Internet, but to the offline Web reader program residing on the same machine. Here, though, there is one important addition: the use of an alternative port number along with the loopback address.
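
    As a rough sketch of the idea, the following Java fragment fetches a page from a server running on the same machine; the request never leaves the computer, and the port number (8080 here, purely as an example) selects which local program answers.

        import java.io.BufferedReader;
        import java.io.InputStreamReader;
        import java.net.URL;

        public class LoopbackDemo {
            public static void main(String[] args) throws Exception {
                // Fetch a page from a server running on this same machine;
                // 127.0.0.1 always means "myself", and 8080 is just an example port.
                URL local = new URL("http://127.0.0.1:8080/index.html");
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(local.openStream()));
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
                in.close();
            }
        }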

    Lotus ESuite

    The network computer (NC) has remained one of the more spectacular examples in recent years of marketing hype. Announced with great fanfare nearly two years ago, products that use thin clients to pull down Java programs from a central server have signally failed to materialise.

    Various NC machines have appeared, but the real key - fully functional NC software - has been conspicuous by its absence. This makes the release of the first beta of Lotus's ESuite collection of Java applications all the more significant. For the first time users can begin to evaluate whether the NC theory is viable in practice.

    The Dev Pack Preview of ESuite may be downloaded, and there are also numerous online demos and white papers on the subject of ESuite and NCs at the same address.

    Lotus ESuite Dev Pack (for developers) includes a word processor, spreadsheet, graphics and presentation software, and a project scheduler. The full ESuite Workplace (designed for end-users) acts as a container for these applications, and adds other functions (details at the above URL).

    Each Dev Pack applet (consisting of several megabytes of Java code) is downloaded from a Web server and then displayed in a Java-enabled Web browser. Along the bottom of each applet area is what Lotus calls the Info Center, which is like the menu bar in a conventional application.

    The Info Center has three main elements: the action bar, which displays top-level commands; pop-ups, like drop-down menus, only in the other direction; and panels, which are similar to dialogue boxes. Data is entered into each application just like any conventional program, but since you are typing into a Java applet inside a Web browser, the sensation is rather curious. Surprisingly, perhaps, the response seems no different from when using ordinary software. There is no flicker as the screen is redrawn, and overall the approach works well.

    Similarly, the pop-ups and panels allow ready access to all the main features of the programs; they have been well thought out, and make using different programs from within a browser simple.

    It is noticeable, though, that the range of options in the ESuite applications is quite limited compared with, say, Office 97. This is unsurprising, given the considerable difference in program size. Whether users will feel the lack of some of the more arcane features available in standalone applications will depend very much on individual situations, which is why the release of the Dev Pack for testing in real business environments is so valuable.

    However, I doubt that many people will abandon their current programs to embrace this novel way of creating documents, not least because ESuite requires users to learn new command structures and working techniques, something that is always a brake on the uptake of new products.

    None the less, I was very impressed with ESuite, simply because it is here and it works. At a stroke Lotus has demonstrated that the NC idea is at least an option. Whether it becomes a mainstream choice is another matter. Moreover, ESuite should encourage other software houses to produce their own NC applications.

    ESuite is also important because it demonstrates once and for all that Java is not restricted to trivial applets that create a few visual effects in browsers. Examining ESuite in detail shows that Java can match any of the design subtleties found in ordinary programs. Even speed does not seem to be a problem any more.

    In addition, ESuite is an emphatic practical demonstration of just how successful Lotus has been in transforming itself from an IBM division locked into a proprietary architecture - its Notes groupware - into a dynamic Internet company.

    Lotus's mastery of Java as demonstrated by ESuite shows it is now fully capable of leading-edge programming. This, taken with the adroit shoe-horning of TCP/IP technologies into Notes, sets the seal on Lotus as one of the key players in this sector.

    Lucent

    It hardly seems possible that a technology company capitalised at $116bn (£72.5bn), with annual turnover of $30bn, could be relatively unknown to the average IT manager. And yet the company in question is far less famous than one of its divisions.

    Bell Labs is known the world over for its key contributions in many spheres of modern technology, but probably relatively few are aware that it is now part of the company Lucent, or what the latter might produce.

    Both Lucent's lack of visibility and huge future potential stem from its origins. It arose from the splitting up of the giant AT&T in 1995, when it spun off the ill-fated NCR and the new company Lucent (which included AT&T's Bell Labs) as part of a radical restructuring.

    Firmly rooted

    Lucent's roots, therefore, are firmly in the telecoms world, and this is reflected in many of the products it offers for network operators, enterprises, consumers and the microelectronics sector - see the general overview. But what is interesting is the way in which the company is now moving rapidly into the Internet world.

    Of course, this is hardly surprising. The Internet represents a natural outgrowth of Lucent's telecoms-based expertise, as well as one of the most vibrant business sectors. And Lucent's strong research background means it is well placed to move straight to the forefront of this market, with innovative solutions that others may find hard to match.

    Information about its enterprise division, where much of this transformation is taking place, can be found online, along with details about its products.

    Areas naturally allied to the Internet, such as data networking and Internet telephony, are being joined by other sectors whose products are adding important Internet features.

    These include the Internet protocol (IP) exchange systems, the IP multimedia conference projector, Internet messaging and the Internet call centre.

    In the past year Lucent's share price has risen steadily as the company has begun to establish itself in these new markets, and as its overall strategy has become clearer.

    Alongside its considerable human and research resources, it has one other great advantage: money. It has about $1bn cash available. And while this may pale into insignificance compared to the $14bn of cash reserves revealed in Microsoft's last annual report, it is certainly enough for some serious acquisitions. [June 1998]

    Shopping spree

    Lucent has already dipped its toe into these waters (for example with the purchase of the IP management specialist company Quadritek), and is widely expected to try to boost its position in the market through something more spectacular.

    Possible candidates include companies like Ascend, 3Com and even Cisco.

    The latter, at least, seems unlikely simply because it would give Lucent a stranglehold over the routing and switching market - something US anti-trust regulators would probably view askance.

    But two things do seem certain: that Lucent is going to move out of the shadows of its current obscurity into the spotlight of greater public recognition, and that it is destined to become one of the top players in the Internet arena.

    Mailing lists

    Although electronic mailing lists may seem unglamorous compared to the dizzy multimedia delights of World Wide Web pages, they represent one of the most important and popular ways of disseminating information online, not least because they are accessible to anyone with the simplest of Internet connections - e-mail.

    As one of this week's news stories indicated, there are now an incredible number of such mailing lists - over 23,000 of them, on every subject imaginable. All such e-mail discussion groups function in approximately the same way: messages are sent by mailing list subscribers to a central point where they are then sent out to everyone else on the list (though the list owner or moderator may vet them before passing them on - the amount of this varies from list to list).

    Joining, leaving and interacting with the mailing list are generally effected by sending a few simple commands to a special e-mail address. This conveys the commands to the software that runs the list. In the vast majority of cases, this is completely automatic, though a few are run manually, which is obviously far more time-consuming and impractical for larger memberships.

    There are several kinds of automated mailing lists in common use, differentiated by the software that handles the day-to-day running. The commands for these are similar but often with subtle differences. Some software offers additional features not found on others.

    Perhaps the most popular mailing list software is that known as Listserv (short for 'list server'); other common varieties are Listproc ('list processing') and Majordomo, while more unusual are lists run using the Mailbase or Mailserve software.

    See the section in Sightseeing on mail lists.

    Metadata

    Metadata is data about data. More concretely, in the context of Web pages, it is information about the data held on the page. In contrast to the visible data that is meant to be viewed by visitors to the Web site, metadata generally remains hidden. Traditionally, it is added using HTML's <META> tag: information placed in <META> tags is not displayed by browsers, but can be used by programs designed to look for it.

    One example of metadata can be found in Microsoft's Site Server 3. This allows the creation of custom site vocabularies that can be used to describe Web pages according to predefined schemes. <META> tags are perhaps most commonly used by search engines. As the number of Web pages continues to grow, so the task of providing useful information in response to queries to search engines gets harder.

    One way of trying to winnow the huge number of hits for a typical query is to use information contained within <META> tags. Ideally, this metadata picks out key words and concepts to help search engines sort and rank pages. Sadly, like all technologies, such metadata techniques have also been subverted, through what is called spamdexing.

    By adding spurious metadata information, unscrupulous Web site operators attempt to obtain better or even totally inappropriate positions in lists of search hits.
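
    As a rough illustration of how software - whether a legitimate search engine or a tool checking for spamdexing - can pick out this hidden information, the sketch below scans a fragment of HTML for <META> tags using a simple regular expression (via the modern java.util.regex classes). A real indexer would use a proper HTML parser, and the page content shown is invented for the example.

        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        public class MetaScanner {
            public static void main(String[] args) {
                // A fragment of HTML as it might be fetched from a Web page.
                String page = "<HTML><HEAD>"
                        + "<META NAME=\"keywords\" CONTENT=\"java, intranet, servlets\">"
                        + "<META NAME=\"description\" CONTENT=\"NetSpeak glossary\">"
                        + "</HEAD><BODY>...</BODY></HTML>";

                // Pull out the NAME and CONTENT attributes of each <META> tag.
                Pattern meta = Pattern.compile(
                        "<META\\s+NAME=\"([^\"]*)\"\\s+CONTENT=\"([^\"]*)\">",
                        Pattern.CASE_INSENSITIVE);
                Matcher m = meta.matcher(page);
                while (m.find()) {
                    System.out.println(m.group(1) + " -> " + m.group(2));
                }
            }
        }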

    A crude approach
    The <META> tag is a very crude approach, and offers only the most basic of aids to search engines. The new extensible markup language (XML) should be considerably more useful. By allowing customised XML vocabularies to be defined for each site, it will be possible to categorise all of the data there. For example, by referring to the tags that surround it, a search engine will know whether the word 'net' has to do with the Internet, fishing or profits.

    But in a sense, XML is too powerful, and goes too far in enabling sites to define their own vocabularies. Imagine a search engine trying to build an index based on its visits to many such sites. Potentially, each of these will have defined its own XML tags to provide metadata about the contents.

    The search engine will therefore be faced with the daunting task of comparing disparate vocabularies and working out which pages belong together, based on the similarity of the underlying meaning. To get round this problem a specialised XML application has been devised. Called the Resource Description Framework (RDF), it is designed to allow Web resources to employ metadata in a consistent way.

    RDF has two main advantages. First, it will encourage Web sites to employ standard XML sets for metadata purposes, rather than just making up their own. An example might be the so-called Dublin Core set, which provides basic metadata information about a document, including the creator, subject matter, date created and so on.

    If this fairly simple metadata set became widely adopted, search engines could collate basic information very easily (because the XML tags would be standardised).

    Semantics
    The second advantage of RDF is that, even if a new XML set of tags is created - because the Dublin Core metadata set is inappropriate, for example - there is at least a standard way of describing them.

    This is an important point: XML on its own is just a system for conveying structured information using markup: a syntax; there is no sense of what that markup refers to. RDF brings an extra ingredient - technically known as semantics - that adds meaning to the tag set. This sense of meaning is precisely what search engines require.

    Assuming RDF catches on - and with Netscape supporting it in future products, this seems likely - there will be much activity in terms of defining such RDF sets (known as schemata).

    For example, industries may decide to define their own set of 'official' metadata tags, enshrined in RDF, for use in their members' Web sites. Alongside these there will, of course, be thousands of 'unofficial' systems, some of which will be widely used, while others may exist on just one Web site.

    Meta tag infringement

    One of the most disconcerting aspects of the Internet for companies is the lack of well-defined legal precedents. The problem is that technology is moving ahead so fast that the law is having difficulty keeping up. But, in the case of so-called "meta tag infringement", the courts are starting to take action.

    Meta tags are special HTML markers - the elements of a Web page file that are enclosed by angled brackets - which provide meta-information about a page. That is, rather than specifying structure, as <BODY> or <H1> do (indicating body text and a top-level heading, respectively), the <META> tag is used to convey ancillary information, usually about the general contents, history or creation of the page. For example, typical uses include information about the software used to create the page, or about its creator.

    Another use that has arisen is to try to influence search engines. As a previous Netspeak explained, the practice of spamdexing often involves placing certain popular keywords within <META> tags; then, when the page is indexed by a search engine, these hidden words will cause the page to be placed higher in the ranked list of hits returned by search engines to users making queries.
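
    The following Python sketch shows, in outline, how an indexer might pull keywords out of a page's <META> tags; real search engines are considerably more sophisticated (and more suspicious), and the sample page is invented.

        # Outline of reading <META name="keywords"> content from a page.
        # The sample page below is invented for illustration.
        from html.parser import HTMLParser

        class MetaKeywordParser(HTMLParser):
            def __init__(self):
                super().__init__()
                self.keywords = []

            def handle_starttag(self, tag, attrs):
                attrs = dict(attrs)
                if tag == "meta" and attrs.get("name", "").lower() == "keywords":
                    content = attrs.get("content", "")
                    self.keywords += [k.strip() for k in content.split(",") if k.strip()]

        page = '<html><head><meta name="keywords" content="holidays, flights, hotels"></head></html>'
        parser = MetaKeywordParser()
        parser.feed(page)
        print(parser.keywords)   # ['holidays', 'flights', 'hotels']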

    A variant of this approach is to place trademarks in meta tags. Then, when users search for these words or phrases, the parasitic pages containing them in their <META> tags will also be returned, even though their content may have nothing to do with the search term. However, US courts are moving to ban this subtle kind of trademark infringement, and so make at least one legal facet of the Internet a little clearer.

    MSIX

    One of the fundamental properties of the Internet is that it does not matter how far data travels, or how much of it there is: the cost to the user is the same. This feature is about to cause a revolution in telecoms, as conventional telephony operators are forced to reduce international prices to reflect the existence of an alternative channel and pricing model.

    It is ironic, then, that at precisely the moment when this novel economic approach is about to have a major impact on everyday business life, there are serious suggestions that it should be replaced with a more conventional one. The reason is simple: even with the huge increase in available bandwidth the demands placed upon the Internet are threatening to choke it. In particular, new multimedia applications, notably those involving video streams, are simply not practical with the current system of flat-rate pricing.

    The IPv6 protocol for Internet routing already supports something called Quality of Service. Using this, it is possible to specify that certain data streams - for example for real-time videoconferencing - should be given priority on the network. But the corollary of this is that such priority streams should cost more (otherwise everybody would use them). This in its turn requires a way of measuring how much bandwidth is being used by a given application.

    Just such a technique is offered by the Metered Services Information Exchange (MSIX) protocol, which provides a convenient way of communicating this service-use information to the access providers and carriers that will charge for these new premium services.

    MIME

    The disadvantage of E-mail is that it is restricted to short lines of seven-bit ASCII text rather than the full eight bits used in binary files. To send documents in languages that use accents or diacritical marks, or objects such as images, sounds or general binary files, the files must first be encoded - converted to standard ASCII. Uuencoding has long reigned supreme in this area, but there is now an increasingly important alternative that forms part of a larger standard called Multipurpose Internet Mail Extensions (MIME), originally defined in RFC (Request for Comment) 1341.

    As its name suggests, MIME is about extending the basic E-mail functionality as defined in the Internet E-mail standard RFC822 - available from http://ds.internic.net/rfc/rfc822.txt; MIME itself is described in RFC1521 (http://ds.internic.net/rfc/rfc1521.txt).

    One of the ways MIME allows non-ASCII files to be transmitted in an E-mail message employs the same technique as uuencoding: that is, taking groups of three bytes and rewriting them as four six-bit numbers that can be represented by alphanumeric codes capable of transmission in standard E-mail (though not with the same codes as uuencoding). But in addition to this basic encoding scheme MIME adds a formal structure to its E-mail messages that is absent in uuencoded messages.
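
    The three-bytes-into-four-characters scheme described here is what MIME calls Base64 encoding, and it can be seen at work in a few lines of Python (the sample bytes are arbitrary):

        # Three arbitrary binary bytes become four ASCII characters under Base64,
        # the encoding scheme MIME uses alongside its message structure.
        import base64

        raw = bytes([0xFF, 0x00, 0xAB])
        encoded = base64.b64encode(raw)          # b'/wCr' - safe for 7-bit E-mail
        print(encoded, base64.b64decode(encoded) == raw)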

    The purpose of this structure - which basically consists of information in the header at the beginning of the E-mail, plus section boundaries where appropriate - is to enable more complex documents to be sent, and to permit MIME-compliant E-mail software to respond intelligently to them.

    As far as complex documents are concerned, the possibilities include multipart messages with several parts of the same kind; messages with alternative versions of the same content (for example an unformatted ASCII text version alongside one with various kinds of formatting present); and even a mixture of completely different kinds of elements (text, images, sounds, videos etc.).

    The most obvious application of the ability of MIME-compliant software to process messages intelligently is in the extraction of embedded binary files automatically upon arrival. This saves the recipient from going through the time-consuming procedure of using a decoding program to retrieve them. It is also possible to configure MIME programs to run appropriate viewers for these files - for example, a graphics program for images, a player for sounds or videos.

    Adding binary files with MIME-compliant E-mail programs is just as easy: with programs such as Netscape, Pegasus, Eudora or Mail-it for the PC, for example, you simply select the file you wish to include and the software does the rest - encoding binary files and creating the appropriate headers and section boundaries. This simplicity for both sender and recipient means that business files like formatted or DTP documents, spreadsheets, presentations, sounds files and even videos can be routinely exchanged via Internet E-mail, just as many local area network mail systems already permit.
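
    As a rough sketch of the structure involved, the following Python fragment builds a MIME message with a text part and a binary attachment; the addresses, file name and data are invented, and real mail programs add further refinements.

        # Sketch of a multipart MIME message: headers, a text part and an encoded
        # binary attachment. Addresses, file name and data are invented.
        from email.mime.multipart import MIMEMultipart
        from email.mime.text import MIMEText
        from email.mime.application import MIMEApplication

        msg = MIMEMultipart()
        msg["From"] = "sender@example.com"
        msg["To"] = "recipient@example.com"
        msg["Subject"] = "Quarterly figures"

        msg.attach(MIMEText("The spreadsheet is attached.", "plain"))

        attachment = MIMEApplication(b"\x00\x01 binary spreadsheet data", Name="figures.xls")
        attachment["Content-Disposition"] = 'attachment; filename="figures.xls"'
        msg.attach(attachment)

        # The boundary markers and content headers are generated automatically.
        print(msg.as_string()[:400])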

    However, amidst all this impressively invisible technology lurks a hidden menace. As was pointed out in a recent Net Speak on the subject of the scare over the fictitious Good Times E-mail virus, it is not possible to transmit viral programs in ordinary, plain text E-mail. It would, though, be possible to encode such a program using MIME; if the recipient's E-mail software were set up to extract and run such programs automatically, it could unwittingly unleash the hidden virus. Similarly, a word-processing document containing a macro that deleted or corrupted files could also be sent and received using MIME. A mail program supporting the MIME standard could then extract the document, load it into a word-processor and so run the malevolent macro. Such hypothetical cases show that along with MIME's indubitable benefits come pitfalls too, and these will need to be managed appropriately within a corporate context.

    MIME types

    MIME types, as the name suggests, describe the different kinds of file - text, images, audio etc. - whose transmission was part of the original impulse behind the MIME encoding scheme. However, this idea has been extended to provide a way of describing the various kinds of multimedia file-types that are encountered on the Web.

    MIME types consist of two parts: a general format (such as text, image or application), and a more specific name (such as gif, jpg or mac-binhex40). These are written together with a forward slash between them: image/gif, text/html etc.

    Such MIME types are crucially important because they can be used when files are sent over the Internet from a Web server in response to a Web client's request, to specify exactly what kind of information is arriving. This, in its turn, allows the Web client to know how it should deal with this data stream.

    For example, if the MIME type is text/html it can be processed as a Web document; if, however, it is application/mac-binhex40 it will almost certainly require an external viewer or helper application in order for it to be dealt with (one that knows what to do with Macintosh BinHex files in this case).

    These MIME types are most often encountered by users when configuring a Web client. Generally it is possible to assign different actions to each MIME type or, indeed, to create entirely new MIME types.
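
    A small Python sketch illustrates the idea: guess the MIME type of a file and look up an action for it. The guess_type call is part of Python's standard library; the handler descriptions are invented for illustration.

        # Map MIME types to actions, much as a Web client's configuration does.
        # The handler descriptions are illustrative only.
        import mimetypes

        handlers = {
            "text/html": "render in the browser window",
            "image/gif": "display inline",
            "application/mac-binhex40": "hand off to a BinHex helper application",
        }

        for filename in ("report.html", "logo.gif", "archive.hqx"):
            mime_type, _ = mimetypes.guess_type(filename)
            action = handlers.get(mime_type, "save to disk")
            print(filename, "->", mime_type, "->", action)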

    MHTML, short for "MIME encapsulation of aggregate documents, such as HTML", is an approach to sending inline graphics, sounds and Java applets as E-mail attachments along with HTML code. MHTML certainly opens up interesting possibilities, including a new way of providing push services. But it will also be wide open to abuse in the form of junk mail carrying multi-megabyte graphics and video.

    Micro-payments

    Given the huge size of the Internet market - most estimates place the total user population around the 50 million mark - it is perhaps surprising how few financial transactions are taking place among its users. The basic problem stems from a fundamental mismatch between what is easy to sell over the Internet, and how you pay for it.

    Because customers may be anywhere in the world, it does not make sense to sell large objects with high shipping costs. Instead, the Internet lends itself to smaller items or - ideally - ones that do not exist as physical objects at all. The latter would include all kinds of information that can be supplied digitally for almost no cost. But there is very little information for which users are prepared to pay very large sums. This means that the prices for such natural Internet goods tend to be low.

    As prices of items drop, so the cost of processing the transaction rises as a proportion. This means that with ordinary payment systems such as credit cards there is a counterbalancing pressure against selling very cheap items on the Internet. The solution to this dilemma has been known in principle for some time. The answer is to devise a transaction mechanism that allows small payments - even very small payments such as fractions of a penny.

    These so-called micro-payment systems would then allow a totally new kind of online commerce to flourish - so the theory goes - whereby Web sites could charge on a per-page basis, for example. Although the cost to each user would be low (and the transaction costs yet lower) the cumulative sums generated might make such an approach viable.

    The problem lies with the current online payment mechanisms. Although it is now quite easy to use credit cards to pay for goods and services - employing Netscape's SSL or the more general SET protocols to ensure that credit card details are protected in transit - the processing costs involved mean that it is not possible to deal with very small sums. The same is true of the various rival systems of digital cash.

    The theoretical solution to these problems has been known for some time. It requires a mechanism to allow transactions to be carried out across the Internet in a secure manner, but able to cope with sums involving fractions of a penny. Once such an infrastructure is in place, very low-cost subscription models become viable.

    In fact the ramifications of these micropayments are much wider. As well as allowing publishers to charge on a per-page basis, for example, they could lead to the creation of interesting hybrid revenue models. Web publications could carry advertising that would be optional, where an ad-free version would require a small payment. Equally, readers might themselves be paid to click on banner ads, receiving a micropayment for every advertiser's Web site they visited. Another use would be to gather detailed demographics from visitors, who would receive more money as they revealed more detailed information about themselves.

    Nor are these models limited to the publishing industry. An obvious application would be the sale or rent of Java applets/JavaBeans, ActiveX controls and other similar software components on a per-use basis. A tiny charge could be made every time such elements were downloaded or embedded in users' Web pages. More generally, software publishers could sell extra features in this way, or start charging small fees for upgrades. Shareware authors could similarly generate revenues from even the trial downloads of their products.

    Another interesting possibility is to use micropayments as a way of creating priority channels. For example, busy sites could give visitors the option to pay a small sum in order to be upgraded to the fast track; those who did not wish to pay would be served afterwards. This could be applied to accessing popular pages, downloading programs or retrieving search results. More generally, such micropayments could be used to fund new Internet infrastructure through a kind of online toll system.

    Micropayments would also have an enormous impact on intranets. They would allow, for example, software to be metered, so that licences could reflect usage accurately. Internal micropayment currencies could be used to charge back costs to departments on the basis of actual use of resources rather than estimated allocations. Micropayments would also offer a very fine-grained tracking system to establish which resources are being used most and by whom.

    On extranets, micropayments would allow customers to be rewarded for participation in market research, and could be used to encourage timely delivery of documentation by suppliers and retailers, with various kinds of bonus - or penalty - schemes directly related to targeted extranet use. Clearly, then, the arrival of a viable micropayment system will change the way the Internet, intranets and extranets are used dramatically - probably more than any other business innovation since the creation of the Internet. Next week I will discuss what may well be the first viable example of this revolutionary new approach.

    Microsoft Internet Explorer 5.0

    It seems almost unbelievable that Microsoft's Internet Explorer 5 (IE5) arrived some 18 months after version 4 - a clear indication of how the pace in this area has slowed down from the heady early days when new browser versions appeared nearly every month. It is also over six months since Netscape's last major upgrade of its browser, to version 4.5.

    IE5 is notable principally for what it omits. For example, two of the most questionable elements of IE4 - the Active Desktop and push-based channels - have gone.

    Active Desktop - which turned the entire screen into an active HTML page - is not mentioned at all, and Microsoft's push channels have been relegated to the Favorites (bookmarks) folder. The additions are generally incremental, and for the most part are unlikely to be particularly crucial for business users.

    For example, a new toolbar called Radio has been added. This allows convenient access to streaming multimedia files, which are becoming increasingly popular. The search facility has been improved through options to look for people, businesses or maps, as well as Web pages. Similarly, the History listing of previously-viewed pages can now be viewed in various ways: by date, by site, by most-visited and by order visited on the current day. There is also a useful search facility for the History listing.

    Like Communicator 4.5, IE5 now offers a related links option, with a short listing of similar sites provided by Alexa, the company also used by Netscape. One more dubious addition is the extension of auto-complete - whereby IE5 suggests the rest of an entry - to forms and even passwords, though this feature can be turned off.

    IE5 adopts a fairly modular format that allows users (or IT managers) to install only the components they require. This is just as well, as the full package is dangerously bloated. Among these components are a new Outlook Express e-mail and newsgroups client (notable for its tight linking to the Web-based Hotmail service, owned by Microsoft); a Java Virtual Machine that is compliant with the latest court order; Microsoft Wallet (for online transactions - though this is another Microsoft standard that has singularly failed to catch on); and very full multilingual support.

    There are also numerous multimedia components, including Microsoft's Media Player plus codecs (see this week's Netspeak), Macromedia's Shockwave and Flash Player, and support for the new Vector Markup Language (VML).

    This is an XML application, devised by Microsoft, Macromedia, HP, Autodesk and Visio, that allows two-dimensional objects to be displayed by browsers. So far, Microsoft is the only company to support VML, and IE5 seems to be the only program that can understand it.

    VML has been submitted to the World-Wide Web Consortium as a proposed standard, but the W3C is working on a broader standard called Scalable Vector Graphics, so it seems unlikely that VML will catch on. However, it will presumably be relatively simple to convert from VML to full SVG.

    This move will be made easier by one of IE5's more important but less obvious features: extensive support for XML. IE5 can not only cope with raw XML - you can drag and drop XML documents to IE5 and they will be displayed, albeit in a rather rough and ready form - but also with the associated Extensible Stylesheet Language (XSL). That is, if an XML file has an associated XSL file, IE5 can format the former accordingly, producing neat HTML output.

    This XML and XSL capability is probably one of the main reasons why business users might want to try IE5. However, the way in which they and other Web standards have been implemented in IE5 has drawn some fire, notably from the Web Standards Group. The WSG points out that Netscape's forthcoming browser copes better in some respects.

    But IE5 is here now, is stable and fast, and is clearly the best browser currently available. As a result, the pressure is on Netscape to come out with something better, and soon, if it wants to stand a chance in the browser race.

    Microsoft Internet Information Server

    In the ongoing battle between Microsoft and Netscape, more attention has been focused on the client side. And yet it is the servers that are in some ways more crucial, at least as far as corporate computing is concerned. Indeed, whoever wins the server battle will ultimately be able to set the agenda for the browser side too.

    This has made Microsoft's weak showing in the server arena surprising. For example, when its mainstream Internet Information Server was launched 18 months ago, it offered only the most basic functions. And even with version 3, the only notable addition over previous releases was support for Active Server Pages. Moreover, the package stayed rather lonely: in comparison to the rich array of Internet servers offered by Netscape, Microsoft's range was pitifully thin. Meanwhile, the supposedly high-end Commercial Internet System has continued to languish in obscurity, and certainly represents no threat to Netscape. Few people even know about it, let alone use it.

    But with version 4, Microsoft has at last shown serious signs of turning its attention to the server side. The beta may be downloaded from Microsoft's Web site, and there are links to basic background information. The changes to the basic Internet Information Server (which offers Web, FTP and Gopher services) itself are fairly slight. Support has been added for HTTP 1.1, Web farms and multiple Web sites on a single IP address, allowing firms to run several departmental intranet sites or provide hosting services for multiple public Internet sites.

    But the most dramatic difference of this version from previous releases is in the software that now comes bundled with the main Internet Information Server. There is a fully-fledged Network News Transfer Protocol news server, a Simple Mail Transfer Protocol server for E-mail, and an updated version of the relatively unknown Index Server search engine.

    Certificate services have been added at last, allowing this crucial security element to be deployed with Internet Information Server. There is site mapping and site monitoring software to allow administrators to measure and control Web site activity.

    There is also an overall management module called the Microsoft Management Console. This uses what are known as snap-ins to allow all Internet/intranet services that are installed to be controlled from this one point. In the left-hand pane is an inverted tree representation of the services as folders, and the right-hand pane presents details of the selected folder.

    The Management Console can also be used to control what is perhaps the single most important addition to Internet Information Server: Microsoft's Transaction Server. Hitherto the Transaction Server was a standalone product that had to be bought separately. Now it is one element of a huge set of Internet/intranet software that comes standard with version 4.

    Remarkably, they will all be bundled with NT Server for free. There could be no clearer signal that Microsoft sees Windows NT as its key product. By adopting this value-added approach, Microsoft hopes to make the choice of Windows NT almost inevitable for companies looking for a complete Internet/intranet solution.

    Certainly the free bundle has an impressive set of features. In fact, it offers the richest and most wide-ranging collection of capabilities Microsoft has ever put together in a single package. The fact that it will be included for free with Windows NT represents an extraordinary upgrade for this operating system.

    Of course, the real price you pay for this cornucopia is being firmly locked into the Windows NT platform, something many companies are still reluctant to contemplate. Whether or not this latest release is enough to overcome this resistance remains to be seen. It will put pressure on Netscape's complete server line, now seriously under threat for perhaps the first time.

    Microsoft NT 4

    Almost exactly a year after the media extravaganza that greeted the arrival of Windows 95, the launch of Microsoft Windows NT 4 presented some interesting parallels to that product. As with Windows 95, the most striking thing about NT 4 is its user interface, which is identical to that of the earlier operating system.

    More interesting is what goes on under the bonnet. Windows NT Server always had close ties with TCP/IP - indeed, one of the reasons for its recent large gains in market share can probably be put down to the fact that it happened to be offering the right protocols for the Internet and intranets out of the box. Shrewdly, Microsoft has built on this solid foundation by adding numerous elements that make it an almost inevitable choice for intranets, especially those in smaller companies or within departments.

    For example, Microsoft's Internet Information Server (IIS) now comes as standard; indeed, installation is part of the overall set-up process, which makes setting up a Web server as natural as choosing screen drivers. Moreover, IIS's security is tightly integrated into that employed by Windows NT in general. Also supplied is FrontPage, for Web site creation and management.

    More advanced is the inclusion of a graphical DNS server. Hitherto, companies using Windows NT were forced either to buy the NT Resource Kit, which came with a DNS server on the accompanying CD-ROM (not very highly thought-of, though), or else use one from a third-party. Now a neat graphical representation lets network administrators set up and control this crucial area (though you will still need to know the ins and outs of this arcane subject).

    One addition easily overlooked is the new Point-to-Point Tunnelling Protocol (PPTP). Not much has been made of this, but potentially it could transform the way companies work. Essentially PPTP enables the Internet to be used as a secure bridge between networks and computers. Since these may be anywhere in the world, this allows the creation of a secure, virtual private network at a very low cost (provided there is an Internet PoP nearby).

    To use PPTP, a computer first hooks into the Internet. On top of this TCP/IP link, a PPTP connection is created by "dialling-up" the IP number (or domain name) of the distant computer. Once the connection has been accepted by the latter, the first computer can access files as if it were linked directly. An obvious application of this would be to link intranet islands - in subsidiaries, for example - to create a unified global network, but without the need for expensive leased links.

    Another feature even less apparent is the inclusion of support for Distributed Component Object Model (DCOM). Microsoft's DCOM is essentially a home-grown rival to the OMG-supported Object Request Broker (ORB). It is part of Microsoft's plans to make its ActiveX technology central to the way programs and the Internet work (DCOM will be included in Windows 95 at some point). At the moment both CORBA and DCOM are largely unproven, and the battle for this key area is shaping up along familiar lines: Microsoft with its proprietary standard and huge Windows user and developer base on the one side, and everyone else following an open, joint standard - but without the market share - on the other.

    On the whole, NT 4 is an extremely impressive product. The only sour note in NT 4's introduction has been sounded by the licence for NT Workstation. In an obvious attempt to encourage people to buy the server version, Microsoft has limited the number of simultaneous in-bound TCP/IP connections to 10 (originally this was in the code, now it is simply a legal restriction). Of course, Microsoft has a perfect right to impose whatever condition it likes (just as users have the right not to buy the product), but for a company desperate to prove how open it is, this is a strange move.

    One, moreover, that seems to have backfired. Along with some rather heavy-handed propaganda, Netscape has come out with a Windows 95 version of its FastTrack Web server. There are no TCP/IP limits for Windows 95, so this undercuts NT Workstation completely. Since the personal Web server is likely to become an important element of intranet working, I would not be surprised if Microsoft reversed its position here - and if Netscape cut the price of FastTrack from its current $295 (£189) to turn the screw further.

    Microsoft Office 97

    The battle between Microsoft and Netscape for the hearts and minds of Internet users has so far been acted out in the obvious arenas of Web browsers and servers. As has been noted before, in the area of more general Internet/intranet servers - for news, certificates, proxies etc. - Microsoft is further behind, but catching up fast. In another key sector, that of desktop applications, Microsoft is of course completely dominant. But hitherto, the programs that go to make up Microsoft Office for Windows 95 have been little affected by the Internet tidal wave. Now, with the release of Office 97, this omission has been rectified, and all of the software has been worked-over to bring it within Microsoft's larger Internet strategy.

    For example, the overall user interface employed is that found in Internet Explorer 3. Its distinguishing feature is the way in which icons are highlighted as the cursor passes over them - rather like some Web pages. This interface is also almost certainly the one that will be employed with Internet Explorer 4, which will be even more tightly integrated into the overall Windows 95 system.

    Another manifestation of Microsoft's desire to tie everything together - and of an important shift in its strategy - can be found in the Help menus for each of the constituent products (Word, Excel, Access, PowerPoint and Outlook - the new mail and scheduling program replacing Windows 95's unloved Exchange client). So, along with the usual links to local documentation, there are links to 'Microsoft on the Web' - ten distinct Microsoft Web pages, accessible by anyone, and replacing similar links to the proprietary MSN conferences in earlier versions of Office. Some of these pages are common to all the programs. For example, there is the Microsoft Office home page; a Best of the Web site; a Search the Web page and, of course, Microsoft's home page.

    Other sites are specific to individual Office programs. For example, there are product news pages for Word, Excel, Access, PowerPoint and Outlook. There are also links to FAQs, free downloads and online support. Another feature common to all of the Office applications (with the exception of Outlook) is the possibility of including a Web toolbar with the standard browser functions of moving forward and back through links, returning to a home page (MSN, by default), and visiting favourite sites (defaults include yet more Microsoft sites).

    All of these Web pages are displayed in your default browser, which is automatically called up when necessary. This is also true for another interesting new feature of Office 97: embedded URLs. All of the programs allow you to create live URLs within documents. In Word, the program will automatically convert any URL you type to a live hyperlink; in Excel and PowerPoint, you can embed active hyperlinks at any point, and in Access you can create special URL columns in database tables. The use of hypertext links goes even further. As well as creating live connections to sites out on the Internet or within an intranet, you can also link to points within ordinary Office documents.

    All of the Office programs allow you to post information to the Web in some form or another. For Word, this is simply a matter of turning a document into an HTML file. For Excel, you can either publish a spreadsheet as a table, or, more interestingly, employ a Web form Wizard to use a spreadsheet as a form for submitting information to an online database. You can also pull live feeds from Internet sites into spreadsheets.

    Access lets you create static pages (you can publish tables, forms, queries and reports) or dynamic ones that query a Web server database, while PowerPoint files can be turned into online presentations. Office 97 is a fascinating example of how the Internet model has started to affect quite radically the key desktop applications, and how the basic metaphor of hypertext is entering the business mainstream. Microsoft's Office 97 is doubtless only the first of many to be affected, and it will be interesting to see how its rivals respond.

    Microsoft Open Source Software

    The courtroom battle between Microsoft and the US Department of Justice presented many unedifying spectacles but few real insights into the Seattle giant. However, these are precisely what have emerged by another route: the appearance of two internal Microsoft memos by Vinod Valloppillil (jointly with Josh Cohen for the second).

    Dubbed Halloween 1 and Halloween 2 because of the timing of their appearance, they deal with open source software (OSS). Together with Microsoft's official response to their leak, they provide fascinating reading.

    Microsoft has stated that these memos are not "an official Microsoft position or road map".

    However, the fact that top bosses such as Jim Allchin, Paul Maritz and Nathan Myrhvold are all mentioned in the text is significant. After all, nobody who wants to progress in a company would submit memos to senior management espousing a point of view that would be regarded as completely wrong-headed.

    This makes the various comments about open source in general, and Linux in particular, extraordinary. According to the first memo, open source "poses a direct, short-term revenue and platform threat to Microsoft, particularly in the server space". The writer notes that "OSS advocates are making a progressively credible argument" that open source is "at least as robust - if not more - than commercial alternatives".

    One reason for studying these papers is that they are so thorough in explaining what exactly open source is and does (Halloween 2 concentrates on Linux), and are ultimately so positive about its achievements and future potential. The fact that they have clearly been written by astute and insightful Microsoft engineers, and not some low-level marketing executive, ironically turns them into one of the best weapons that anyone wishing to convince management to try open source solutions could wish for.

    In particular, the authors grasp a key characteristic of open source software: "There is a direct correlation between the size of the project that OSS can tackle and the growth of the Internet," they say.

    Evangelisation

    In other words, the past and future success of open source is intimately bound up with the rise of the Internet (which is why Getting Wired has been writing about Linux since early 1995). Moreover, "OSS evangelisation scales with the size of the Internet much faster than our own evangelisation efforts appear to scale," as Valloppillil points out. That is, the bigger the Internet gets, the greater the disparity between the number of OSS programmers and Microsoft's developer community.

    This matters, because the more developers open source software has, the faster bugs are found in software, and the more robust programs become. By contrast, Microsoft's traditional development model does not scale well for larger projects, as the constantly delayed launch of Windows 2000 indicates. In fact, OSS should produce better software faster in the future, as more potential helpers join the Internet.

    Culture shock

    One way out would be for Microsoft to attempt to adopt OSS practices in-house. Even assuming it could survive the culture shock (and if anyone can pull it off, Bill Gates can) it still could not hope to match the constantly swelling numbers of programmers online willing to give their time and expertise to OSS projects for nothing.

    More sinister would be the attempt to "de-commoditise protocols and applications", as Valloppillil puts it. This is a polite way of saying that open Internet protocols would be turned proprietary by tying them into Microsoft's other products. The company used this technique when it decided to "grow the polluted Java market", to use its own memorable phrase.

    But to pollute the underlying protocols means to destroy the Internet, which depends fundamentally on a single set of open standards to connect heterogeneous systems. This, perhaps, is the most troubling aspect of the Halloween memos: it reveals a possible course of action for Microsoft that is so cynical as to make the current anti-trust allegations look trivial in comparison.

    Internet could lead to Microsoft's downfall

    When the US Department of Justice, in 1998, presented its witnesses in the anti-trust action against Microsoft, the general tone of the proceedings was one of extreme boredom enlivened by occasional moments of drama.

    In 1999, when Microsoft presented its side of the story, almost the reverse became true: every day seemed to bring new revelations - not least of Microsoft's extraordinarily poorly-prepared defence.

    These culminated in a highly-damaging series of videos. Prepared by the company to demonstrate certain technical points, they had in fact largely shown the opposite (see www.washingtonpost.com/wp-srv/business/longterm/microsoft/archive.htm). In the light of this unexpected turn of events, many people began to discuss the likelihood of Microsoft losing.

    However, even if this does happen, Microsoft probably stands a better chance in the US court of appeal. Moreover, even if at the end of all these legal tussles it did lose definitively, it is not clear what kind of remedy could sensibly be imposed. But it may well be that there is another, quite different threat to Microsoft's power, one born of the Internet, and against which no court in the world can offer redress.

    I am not referring here to the rise of new Internet-based technologies. Despite its very late start in this area - the company was hardly serious about the Internet until the end of 1995 - Microsoft has done extraordinarily well in re-inventing itself in the light of this radically different approach to computing. Indeed, it has not only caught up with the pioneers like Netscape, but demonstrably overtaken them in the degree to which it has placed Internet technologies like dynamic HTML and XML at the heart of its main product lines.

    But the Internet is not just a set of abstract technologies. It now represents a completely new way of doing things - and of generating money. In particular, there is a feeling in the world's stock markets that Internet companies - those offering new Internet technologies, products or services - are special.

    Global brands

    Special in the sense that this is a unique moment when the rules of business are being rewritten, and when new global brands and enterprises - the Yahoos, AOLs and Amazon.coms - can be created by almost anyone with a good idea and the determination to implement it. Special in that multibillion dollar empires can be built from nothing in a few years (Yahoo is worth $25bn (£15bn) just three years after its Initial Public Offering).

    This is the real threat to Microsoft. For much of the company's power derives from the constant influx of highly skilled high-tech personnel. As is well-known, these programmers, sales people and marketing experts are rewarded not so much with stellar salaries - in fact, Microsoft is unexceptional in this respect - as with generous options to buy Microsoft shares at attractive prices. Given the astonishing rise of Microsoft's share price over the past few years, these stock options are highly-prized - and highly effective in terms of attracting new talent.

    However, almost with every passing day the Internet economy grows more powerful - and hence more seductive to those with ideas and initiative. As a result, it seems likely that Microsoft (and other traditional high-tech companies) will find it increasingly difficult to attract the very best people into their ranks.

    After all, why be a small cog in a huge (Microsoft) machine, albeit with some nice stock options, when you can run your own rapidly-growing outfit, complete with major stock holdings, with the added excitement of being part of the Internet revolution? Soon, the only people that will want to work for Microsoft will be those lacking either the initiative or the ambition to join the Internet economy directly - hardly the ideal workforce.

    Even a massive collapse of the Internet stock market (not impossible) is unlikely to help the company. Just as its share price has risen partly through the perception that Microsoft is now an Internet company, so it is likely to be dragged down by any broader industry setback. And a falling share price will make all those share options suddenly rather less attractive.

    Microsoft's Site Server

    Microsoft took an early lead in the electronic commerce market through its acquisition of the company e-Shop in mid-1996, and the appearance of its Merchant Server 1.0 (entry cost £12,000) at the end of that year.

    But since then, things have moved on apace. Online commerce products are much more sophisticated and much cheaper. Microsoft's response is called Commerce Server (c. November 1997), and it comes as part of Site Server Enterprise Edition, which has other elements as well.

    These include a Personalisation System that lets users add personalised elements to a Web site, and an Internet Locator Server enabling chat or real-time communications among its visitors. There is also a Site Analyst program to provide a visual representation of the elements of a Web site, using technology licensed from Xerox. Alongside this is Usage Analyst for analysing visitor log files, as well as a content replication system, and a copy of Visual Interdev, Microsoft's Web-database development tool.

    A detailed analysis by Glyn Moody in the first issue of Corporate Internet Briefing, which examines the Internet strategies of both Microsoft and Netscape (more information and a sample copy from cib@rbi.co.u), discusses the full ramifications of this interesting bundling for Microsoft's Internet products.

    At the heart of Site Server Enterprise Edition is Commerce Server (there is also a variant of Site Server without the latter, which costs £1,200 against the Enterprise version's £3,800 price-tag).

    This works with any database that complies with the Open Database Connectivity (ODBC) standard, but the installation instructions are skewed toward Microsoft's SQL Server product. Provided you adopt the pure Microsoft solution, the installation notes offer step-by-step guidance. Once installed, Commerce Server comes with Microsoft's customary polished online help.

    Commerce Server is built around Active Server Pages - which are ideal for this kind of template-based electronic commerce - and ActiveX controls. In addition to the controls routinely supplied with Active Server Pages, there are several more dedicated Commerce Server Objects. These are all server-side programs, and are therefore not subject to the ActiveX security problems.

    A clever graphical representation of the overall order pipeline is employed. The individual software components that handle things such as pricing, orders, shipping and tax are displayed as a string of objects that can be examined and manipulated on-screen.

    Microsoft supplies two ways to do this: either via a purely Web-based interface (employed in Commerce Server for administration purposes) or through a standalone application supporting component drag-and-drop. Although there is plenty of innovative technology buried within Commerce Server, the product itself is not exceptional. There are only four shop templates supplied, which means firms will almost certainly have to design their own.

    Moreover, the order pipeline approach, while very neat in theory, means in practice some fairly complex coding if custom elements need to be introduced. In other words, Commerce Server is a powerful solution for those who are prepared to program, but is rather limited for companies which just want an easy but flexible solution.

    One other feature is worth noting. Commerce Server supports Microsoft Wallet. This consists of software elements added to the customer's browser either as ActiveX controls or Netscape plug-ins, which allow, for example, credit card details to be entered once and then passed safely across to Wallet-enabled commerce sites. Other payment methods will also be supported.

    A good introduction to how this fits into Microsoft's overall online commerce strategy can be found on the company's Web site.

    Microsoft Site Server 3.0

    Site Server, Microsoft's high-end Web server solution, has followed a development trajectory typical for the company. The first version was fairly basic. And while Site Server 2 was a major upgrade that introduced a number of innovative ideas, it still felt like a work in progress rather than a finished solution. Now, with Site Server 3, the full extent of Microsoft's plans in this area become apparent.

    As with the previous version, there are two incarnations: the basic product considered here, and the Commerce Edition. The home page is at www.microsoft.com/siteserver/default.asp, with links to information and trial downloads. Above all, Site Server 3 is a package bound up with other Microsoft products. To install it, you need NT Server 4, NT Server 4 Service Pack 3, NT Server 4 Option Pack and Internet Explorer 4.

    Site Server can be regarded as a further upgrade to various Internet/intranet elements in Windows NT and its various patches and add-ons. As well as extending the basic Web server, it builds on the Index Server search engine added with the Option Pack.

    Customisation

    The basic search engine and interfaces are the same. But Site Server's version includes the ability to search through ODBC-standard databases - important to firms wishing to create a centralised catalogue of all their information - and wizards that make the customisation of search catalogues fairly straightforward.

    Analysis functions introduced in Site Server 2 have been augmented considerably. There are now 17 Web server log formats supported, including Netscape, Apache, Lotus Domino and Real Audio. Moreover, the hyperbolic representation (licensed from Xerox) has been extended to use colour-coding to mark the busiest routes within Web sites. You can even create short animations to show how various kinds of data use change with time.

    Web site personalisation now lets the user produce tailored content and E-mailings. A push server is included, and a wizard helps generate the Channel Definition Format files used in Microsoft's push technology. An interesting feature is the addition of multicasting support.

    Conventional push technology is wasteful of bandwidth, sending the same content many times over the network. Multicasting allows messages to be sent just once, and for interested clients to subscribe to and receive content of interest. To cope with this the user's machine needs a multicast delivery agent (supplied with Site Server), and intranet routers may also need to be upgraded to support multicasting.
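
    At the network level, joining a multicast group is a matter of a few socket options, as the following Python sketch shows; the group address and port are illustrative, and this is the generic mechanism rather than Microsoft's particular delivery agent.

        # Generic multicast subscriber: the group address and port are examples.
        import socket
        import struct

        GROUP, PORT = "224.1.1.1", 5007

        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))

        # Ask to join the group; routers then forward one copy of each message.
        mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)

        data, sender = sock.recvfrom(1024)
        print(sender, data)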

    Facing the future

    Another forward-looking feature is the ability to add meta tags to Web content. One of the biggest problems with intranet (and Internet) searches is that conventional indexes lack intelligence when it comes to sorting through the hits. By adding data using HTML's <META> tag, it is possible to incorporate additional information into Web documents.

    Site Server provides a tool that allows the creation of custom site vocabularies. These are then employed by users to describe the Web pages they write. Using the <META> tag is something of a hack: the real solution would be to create some fully fledged extensible markup language pages with custom tags to mark up all data in a rigorous way.

    All-in-all, Site Server 3 is a considerable enhancement of version 2. It underlines Microsoft's continuing attempt to make its products ever-broader in their reach, and more tightly integrated into the operating system. This means that it represents an excellent product for companies already committed to the Windows way, and an even bigger wrench for those who are not.

    Microsoft Active Server Pages

    Microsoft has never really blown its own trumpet when it comes to Active Server Pages (ASP) technology. And yet almost imperceptibly, this technique of creating HTML pages on the fly using server-side scripting to pull in customised content from back-end databases is fast becoming the de facto standard for dynamic Web-page generation.
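
    ASP pages themselves are written as VBScript or JScript embedded in HTML, but the underlying pattern can be sketched in a few lines of Python: a server-side script queries a back-end database and assembles the HTML on the fly. The database file, table and column names below are invented for illustration.

        # Not ASP syntax - just the same server-side pattern in Python:
        # pull rows from a database and build the HTML page on the fly.
        # Database file, table and column names are invented.
        import sqlite3

        def product_page(db_path="catalogue.db"):
            conn = sqlite3.connect(db_path)
            rows = conn.execute("SELECT name, price FROM products ORDER BY name").fetchall()
            conn.close()

            items = "\n".join("<li>%s: £%.2f</li>" % (name, price) for name, price in rows)
            return "<html><body><h1>Product list</h1><ul>\n%s\n</ul></body></html>" % items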

    Even today, the company remains strangely coy about this growing success. Microsoft resources on ASP are thin and hard to find online (you have to register to see the round-up of what is available).

    Microsoft has also handed over the running of the main ASP mailing list to an outside company, home of the best ASP FAQ.

    Blossoming

    Other independent ASP organisations that testify to the blossoming of third-party activity include the ASP Guild International and the ASP Alliance.

    Perhaps the best site overall for ASP resources is that at http://www.activeserverpages.com/. Another, at http://www.tcp-ip.com/, is also worth visiting, and has more links. Similarly, there are numerous ASP resources listed at Chilisoft.

    Chilisoft's real significance is not just as a repository of ASP information. For some time Chilisoft has been porting Microsoft's ASP technology to non-Windows platforms.

    ASP is already available for Netscape Web servers running on Unix, but the fact that it is currently working on a further port, to Apache on NT and Solaris, possesses an importance that is of quite a different order.

    Until recently, the rapid spread of ASP technologies has, of necessity, meant the use of Windows NT as the server platform, since initially this was the only one for which ASP was available. Chilisoft's ports have given companies the choice of deploying ASP on Unix, but only if they were running Netscape's Web software.

    This entails an extra cost over and above that of the Chilisoft program, for Netscape's Web server is not free, unlike Microsoft's Internet Information Server (IIS), which comes bundled with Windows NT 4.

    But the Chilisoft port to Apache means that users will be able to employ the leading Web server (by a large and constantly growing margin over IIS), which costs nothing. In fact, they can enjoy all the undoubted benefits of ASP technology with what is widely regarded as the most robust and reliable Web server, and without using Windows NT (though Chilisoft will be bringing out ASP for the NT version of Apache, too).

    Even worse for Microsoft, there is an open source initiative called ActiveScripting (home page at www.activescripting.org/) that is working on OpenASP, a free version of ASP for Apache Web servers running on Windows NT, Sun Solaris and Linux.

    Meanwhile, Halcyon Software has developed a Java-based approach called Instant ASP that works on any hardware platform with a Web server supporting Java servlets (such as Apache).

    Bonus

    These latest moves effectively wipe out one of the big incentives to deploying Windows NT - the fact that ASP comes as a bonus. Ironically Microsoft may find that having created and established this important technology in order to bolster Windows NT, it will now prove an even stronger reason to use Apache on non-NT platforms.

    The importance of ASP is likely to increase yet further as databases start to employ XML as their standard storage format. ASP is a perfect way of taking such XML-based information and applying XSL stylesheets to output the standard HTML that will be used for viewing.
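
    As a sketch of that pipeline, the following Python fragment uses the third-party lxml library to apply an XSL stylesheet to a scrap of XML and produce plain HTML; the element names and the stylesheet itself are invented for illustration.

        # Apply an XSL stylesheet to XML and emit HTML (element names invented).
        # Requires the third-party lxml library.
        from lxml import etree

        xml_doc = etree.XML("<prices><item name='Widget'>9.99</item></prices>")
        xslt_doc = etree.XML("""
        <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
          <xsl:template match="/prices">
            <html><body>
              <xsl:for-each select="item">
                <p><xsl:value-of select="@name"/>: <xsl:value-of select="."/></p>
              </xsl:for-each>
            </body></html>
          </xsl:template>
        </xsl:stylesheet>""")

        transform = etree.XSLT(xslt_doc)
        print(str(transform(xml_doc)))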

    One day all advanced Web pages may well be created using some descendant of ASP, but it is unlikely that most of them will be hosted on a Windows-based server as they are today.

    Microsoft Visual Tools

    Given Microsoft's wholesale conversion to the Internet it was only a matter of time before its software tools were revised in this light. The result is Visual Tools 97, a bundle that includes Visual J++ 1.1 (for Java), Visual C++ 5, Visual Basic 5, Visual FoxPro 5 and Visual InterDev.

    Some of the changes, such as the increasing unification of the development environments, are cosmetic (though still welcome). Others, such as the claimed increased speed of Visual C++, are to do with performance. But alongside these more technical issues there are other aspects that throw some interesting light on Microsoft's Internet strategy.

    For example, Visual Basic 5 allows you to create several kinds of programming object, including an ActiveX control, ActiveX .exe, ActiveX .dll, ActiveX Document .exe and ActiveX Document .dll. In other words, the basic model of VB 5 - the use of controls to create programs - has been extended to embrace ActiveX. In addition, there are new controls aimed specifically at Internet program development, including one for handling HTTP and FTP transfers, and a Winsock control for managing TCP and UDP.

    These low-level additions aside, the latest release of Visual Basic is notable in that it represents the final stage of its transformation from a very simple hacker's tool, used for knocking together often rather bloated programs, into a mainstream corporate development environment. In particular, the ease with which ActiveX controls can be created will doubtless lead to a flood of them, both within and outside companies. Some will be well-written, and others not. Most will be useful in varying degrees, but a small and worrying minority will have more malevolent intentions, and we can expect security incidents involving ActiveX controls to become more of an issue in the future.

    Perhaps the most interesting element of Visual Tools 97 is Visual InterDev. It provides a Web site management tool, and is intended to be used alongside other Microsoft products such as the FrontPage Editor (included with InterDev) and Visual SourceSafe (for version control). It also comes with a couple of simple multimedia programs: Image Composer for manipulating images, and Music Producer for creating music tracks very easily. The former is particularly interesting as Microsoft's first product to enter the arena currently dominated by Adobe Photoshop.

    The heart of Visual InterDev lies in its database management tools. As well as offering a view of all the files contained within a Web site project, the program also allows you to manage database resources. The SQL Query Designer enables you to construct SQL queries visually (in a way that will be familiar to users of Microsoft's Access database). This will work with any database for which there are ODBC drivers. But many of the more advanced features of Visual InterDev - notably the very simple and direct control of SQL databases - will currently only work with Microsoft's own SQL Server 6.5.

    As a previous column indicated, the power of the Visual InterDev environment arises from its ability to create Active Server Pages (ASPs), notably using design-time ActiveX controls. ASPs are HTML documents with server-side scripting and embedded controls that allow Web pages to be created on the fly before being sent to the browser. Such an approach is likely to transform not only Internet Web site construction, say in online commerce applications, but also those used in intranets and extranets.

    For example, using ASPs it will be possible to send different information over an intranet according to who is requesting it - whether the Finance Director or an entry clerk. Similarly, ASPs will allow the information offered over extranets to be controlled very finely and with greater security, since confidential information remains locked in a database until it is called up, rather than placed in a static Web page where it may be accessed in error.

    Microsoft Windows 98

    Microsoft may have woven its latest operating system around the Web, but has the software giant produced one innovation too many?

    Three years ago saw an unprecedented marketing campaign for the launch of Windows 95. Most attention was directed at the new interface, but one of the product's true innovations was the inclusion of Internet functions as standard.

    In many ways, Windows 95 was a turning point for the uptake of the Internet outside the academic circles where it had hitherto been confined. You no longer needed to download extra software to get online, since it came as standard with the operating system.

    Similarly the inclusion in Windows 95 of the albeit weedy Internet Explorer 1, a hastily knocked-together rewrite of code licensed from the original Mosaic browser, signalled an important milestone in the Web browser's steady rise to its current central role.

    Downplaying
    Against this background the launch of Windows 98 has been a curiously muted affair. Microsoft has almost been downplaying its significance, insisting that firms, for example, should move to Windows NT instead.

    And where before Internet capabilities were added more or less as an afterthought (as the constantly changing Windows 95 betas proved), the Internet is absolutely central to almost every aspect of Windows 98.

    This is most visible, of course, in the much-bruited integration of Internet Explorer 4 with the entire operating system's graphical front-end. Paradoxically, this extremely important move is also the least interesting facet of Windows 98, simply because it has already been available to Windows 95 users through the installation of the standalone Internet Explorer 4 product.

    The features that this approach brings - the Active Desktop, pseudo-push channels - have been discussed in previous columns.

    Whether you like working this way is a matter of taste.

    Certainly the logic of some kind of integration is pretty compelling, but at times you can't help feeling that Microsoft has almost gone too far and made things too complicated with its implementation.

    Similarly, those who have suffered the worst excesses of push technology - screens that are constantly flickering and swooping with multimedia elements - may find the new features of Windows 98's user interface disconcertingly familiar. One more welcome use of Web technology is the re-writing of all Help files in HTML.

    The other additions to Windows' Net functions are fairly minor. For example, Dial-up Networking now includes an ISDN configuration wizard, reflecting the gradual uptake of this old technology.

    More importantly for firms, Windows 98 supports Microsoft's virtual private networks technology.

    One novelty is WebTV. When used in conjunction with a PC tuner-card, this allows ancillary information to be offered along with TV programmes. Those without the card can use WebTV to download listings information via the Internet (in the US, at least).

    WebTV on its own is unlikely to be relevant for business users.

    But, more generally, Windows 98 has support for data transmitted during the Vertical Blanking Interval - the dead-time between successive TV frames. This should theoretically allow an impressive 10 Kbytes per second download speed - almost at ISDN levels, though the data flow is one-way only.

    Windows Update
    Finally, Windows 98 offers an Internet-based feature that could become widely adopted by other software suppliers too. Called Windows Update, this allows software updates to be carried out automatically via a web page (only accessible from within Windows 98).

    ActiveX controls are downloaded to inspect a system and evaluate which of the current upgrades are applicable. Microsoft emphasises that no information is sent back to the company during this process.

    After the user has chosen which of these upgrades to apply, new software and drivers are then sent directly over the Net to the machine and installed automatically.

    IT managers will be relieved to know that this feature can also be turned off to prevent users from upgrading willy-nilly.

    Middleware

    One obvious way of getting at legacy information is to convert the commonly used 3270 data streams directly into HTML files. These can then be viewed using any browser across a corporate intranet. This approach has already been implemented by a number of companies, including IBM with its CICS Internet gateway (see http://www.hursley.ibm.com/cics/saints/main.html), Information Builders with its Web3270 product (at http://www.ibi.com/releases/web3270.htm) and Simware's Salvo (at http://www.simware.com/salvo/).

    Clearly, though, this is a relatively crude solution, and exploits none of the potential power of linking back-end databases tightly with front-end Web servers. That power is being unlocked by a new generation of software that is often called middleware, despite the vagueness of the term. Essentially these products sit between the Web and the database servers, translating the general requests (which originate ultimately from a Web client somewhere) into formal database queries. There are therefore two general issues to be addressed by this middleware: how you interface with the Web server, and how you hook up with the back-end database.

    The quick-and-dirty solution to the former challenge is also the most obvious. The Common Gateway Interface (CGI) was tacked onto the early Web server specifications to allow requests to be channelled to other applications in just this way, and many middleware products adopt this approach. However, CGI is very limited: it was certainly never intended for very high data throughput or for many simultaneous requests (and has a number of other more technical limitations). This makes it unsuited for precisely those kinds of applications where the Web-data connection is so compelling - direct, if limited, public access to corporate databases, and online commerce.
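
    A minimal CGI-style handler, sketched here in Python, makes the point: the Web server starts a fresh process for every request, hands over the query string in an environment variable and reads the generated HTML from standard output. The in-memory table below is a stand-in for a corporate database, invented for the example.

        #!/usr/bin/env python3
        # Minimal CGI-style handler (sketch). One process per request is
        # exactly why the approach struggles under heavy or simultaneous load.
        import os
        import sqlite3
        from urllib.parse import parse_qs

        # An in-memory table stands in for the corporate database.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE parts (part_no TEXT, description TEXT, price TEXT)")
        conn.execute("INSERT INTO parts VALUES ('P100', 'Widget', '2.50')")

        params = parse_qs(os.environ.get("QUERY_STRING", "part=P100"))
        part_no = params.get("part", ["unknown"])[0]
        row = conn.execute("SELECT description, price FROM parts WHERE part_no = ?",
                           (part_no,)).fetchone()

        print("Content-Type: text/html\n")
        if row:
            print("<p>%s: %s at %s</p>" % (part_no, row[0], row[1]))
        else:
            print("<p>No such part: %s</p>" % part_no)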

    The alternative solution, becoming increasingly popular as the overall shape of the Internet software landscape begins to emerge from the mists, is to plug directly into the Web server software. This is done through the use of proprietary Application Programming Interfaces, or APIs. The big gain of this approach is speed and scalability. The disadvantage is that it ties you to the particular manufacturer whose API you use in this way (Netscape for NSAPI, Microsoft for ISAPI). Given that between them Netscape and Microsoft are likely to dominate the world of Web servers (however the share of the market works out between them), taking this route is not, in practice, so limiting, provided support for both server APIs is included.

    Products that adopt the CGI approach include Bluestone's Sapphire/Web (see http://www.bluestone.com/products/sapphire/saweb_about.html). Web2SQL takes the API route (see http://www.nutech.com/products/web2sql/howitworks.html), as does Nomad Development Corporation's WebDBC (http://www.ndev.com/ndc2/newsinfo/wdbc25rls.htm). Some, such as Intelligent Environments' Amazon (see http://www.ieinc.com/webnews.htm and http://www.ieinc.com/Amazon.htm - the latter has a particularly full discussion of the various issues in this area) and Spider 1.5 (see http://akavish.w3spider.com/spwhite.html), support both according to the platform being used for the Web server.

    This polyglot approach is also the one being taken by Next with its WebObjects product (see http://www.next.com/WebObjects/). As well as benefiting from the high-profile (and highly vocal) Steve Jobs as Chairman and CEO, Next is also hoping to gain a head-start in this sector by taking a leaf out of Netscape's marketing book and giving away copies of its entry level WebObjects program. You can download copies for Unix and Windows NT platforms from http://wofapps1.next.com/cgi-bin/WebObjects/Betatron.

    Web middleware grows up and gets active

    The list of middleware products designed to link Web servers with back-end database software, held at the main reference point (http://cscsun1.larc.nasa.gov/~beowulf/db/all_products.html), is even longer now, but since that time an important shift has taken place in this sector. When I first discussed this area, the idea was simply to take information from databases and drop it into Web pages that were sent back to a client. These Web pages were essentially pre-defined, and simply kept open slots where the relevant information would appear.

    However, as Web sites have grown in complexity, managing such static sets of Web pages has proved increasingly problematic. The solution is to automate the whole process of page generation. This involves not only retrieving information from databases, but creating the entire Web page on-the-fly. By using server-side scripting and a few very basic templates it is possible for the system to respond to the changing context of the information that needs to be displayed in Web pages, with little detailed intervention from the designer.
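
    The shift can be pictured with a small Python sketch: one basic template plus whatever the database returns at request time, rather than a hand-maintained set of static pages. The template and data here are invented, and string.Template simply stands in for a server-side scripting language.

        # Sketch of active page generation: a single basic template is filled
        # with whatever records are relevant when the request arrives.
        from string import Template

        PAGE = Template("<html><body><h1>$title</h1><ul>$items</ul></body></html>")

        def build_page(title, records):
            items = "".join("<li>%s</li>" % r for r in records)
            return PAGE.substitute(title=title, items=items)

        # The same template serves any query result, with no designer involvement.
        print(build_page("Orders today", ["Order 1001", "Order 1002"]))
        print(build_page("Overdue invoices", ["Invoice 877"]))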

    One of the pioneers of this technique of active page generation was Netscape with its LiveWire tool. As well as the rather unsatisfactory Navigator Gold HTML editor, this includes a Site Manager program for visual site-management, to allow the creation and management of Web sites using drag-and-drop techniques. More importantly, it contains a JavaScript compiler for embedding server-side code in HTML pages. It is this that can respond in various ways and change the Web page sent back to the client. The back-end side is handled with a database connectivity library to allow direct connections to databases from Oracle, Sybase, and Informix, as well as more general ODBC connections.

    LiveWire is not particularly easy to use, and the online information is very limited. There was a clear opportunity to do better here, and Microsoft has certainly seized it with its new Visual InterDev Web development environment. In fact it is not that new. It began life as a product code-named 'Blackbird', designed as a high-end publishing tool for the ill-fated proprietary first version of the Microsoft Network. More recently it has been distributed in beta as Internet Studio, and has now been rebaptised with the rather ungainly name of Visual InterDev - doubtless to avoid confusion with Microsoft's new Visual Studio 97 integrated development environment, which includes Visual InterDev as a component.

    Unlike Netscape's rather feeble offering, Microsoft has put together an exemplary site for Visual InterDev which should be required reading for anyone interested in this whole area of active page generation. This is true even if you are not interested in following the usual Microsoft proprietary route: InterDev is strongly biased in favour of products like Internet Information Server 3.0 and SQL Server 6.5. Good places to start are the Overview, Q&A and particularly the White Paper. You can download a free beta of Visual InterDev - but be warned that the program is a massive 55 Mbytes (also available as 11 files of 5 Mbytes each).

    Alongside the previously-announced Active Server Pages (which can be spotted by their .asp file extension), another important new concept introduced with Visual InterDev is that of design-time ActiveX controls. Unlike ordinary ActiveX controls, these produce pure HTML as output, which can then be sent to any Web browser, even those that have no ActiveX capabilities.

    Moreover, within this HTML, design-time controls can generate scripting - in a variety of languages - to offer the ability to create HTML designs dynamically. Visual InterDev is a prime example of how Microsoft can build on its experience in other areas - programming tools and general visual development environments - to come up with a product that is markedly easier to use than rival offerings such as Netscape's. The price you pay, at least currently, is that the product is tightly tied to the Windows 95/NT platform.

    Millicent

    Digital called its approach to micro-transactions Millicent because it will have a basic transaction cost of just one thousandth of a cent. Items can be sold for as little as one-tenth of a cent without relative transaction costs being significant.

    To ensure the validity of electronic payments, which consist just of easily duplicated 0s and 1s, current systems refer electronic cash to a server that checks for fraud and double-spending. But these controls typically cost a few pence to carry out, setting the lower limit for viable transactions at the dollar/pound level.

    Millicent avoids using a central register by enabling the vendor to carry out the checks more quickly and cheaply. Moreover, since only micro-payments are involved, the security can be correspondingly weaker than that used for transferring credit card details, where the potential losses may run to thousands of pounds.

    The lower-level security - effectively that provided by a simple digital signature - is easier to crack, but the negligible benefit will act as its own deterrent. After all, there is little point spending several pounds of computer processing time to generate fake E-cash worth a fraction of a penny.

    Because Millicent's E-cash - called scrip - is checked by the vendor using relatively simple techniques, it is valid only for that supplier. The overhead in checking scrip meant for other vendors would raise the basic cost too much. In a way, scrip is like a telephone card or a manufacturer's coupon: valid only for one kind of use.
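
    This is not the Millicent protocol itself, but the reason vendor-specific scrip can be checked so cheaply can be illustrated with a keyed hash: the vendor signs each piece of scrip with a secret only it holds, so validation is a quick local calculation rather than a round trip to a central payment server. The key, serial numbers and values below are invented.

        # Illustration only - not the real Millicent scheme.
        import hashlib
        import hmac

        VENDOR_SECRET = b"vendor-only-secret"   # hypothetical key held by the vendor

        def issue_scrip(serial, value_tenths_of_a_penny):
            body = "%s:%d" % (serial, value_tenths_of_a_penny)
            sig = hmac.new(VENDOR_SECRET, body.encode(), hashlib.sha1).hexdigest()
            return body + ":" + sig

        def check_scrip(scrip):
            body, _, sig = scrip.rpartition(":")
            expected = hmac.new(VENDOR_SECRET, body.encode(), hashlib.sha1).hexdigest()
            return hmac.compare_digest(sig, expected)   # cheap, entirely local

        s = issue_scrip("A1093", 5)                          # scrip worth half a penny
        print(check_scrip(s))                                # True
        print(check_scrip("A1093:500:" + s.split(":")[2]))   # tampered value: False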

    This scheme alone would oblige users to set up accounts with different vendors, and it does not solve the problem of somebody wishing to spend only a few pence with a supplier on a single occasion. Before scrip can be used it must be bought from the vendor with real money, a purchase that can only be carried out using conventional encryption techniques. This in turn means that a user would be forced to buy a fairly large amount of a supplier's scrip.

    To get round this difficulty, another player is introduced: the broker. Users buy scrip for a particular vendor from a broker authorised by that vendor to sell scrip on its behalf. Because a broker can sell scrip for many suppliers, users can open one account with a broker, spending a few pounds, say, and then buy scrip worth pennies for each of the different vendors.

    In this way the suppliers are saved the trouble of dealing with the initial purchase of scrip for a few pennies. Instead, they receive aggregated monies from the broker on the basis of the total amount of scrip the broker has sold to various users for that supplier. The broker charges a small commission - probably a few per cent - for this service.

    Millicent works at the user's end through a browser plug-in that functions as a kind of electronic wallet. It can be used to buy scrip from a broker for a particular vendor. This scrip can then be sent to the supplier for a micro-transaction, and change sent back to the wallet in the form of vendor scrip, which can be used for further transactions with that vendor.

    The wallet can also receive monies from a supplier, for example as payment for viewing ads or taking part in market research, which can then be converted back to real money by the corresponding broker. The wallet can be set up to carry out automatically micro-transactions below some configurable threshold level. For the vendor to deal with scrips, all that is required is some extra software that works alongside the Web server holding the pages paid for by micro-transactions.

    From Millicent's home site there are links to an FAQ, glossary, a short demonstration of how the system works in practice, details of the protocols involved and more information about when Millicent will be available commercially.

    Multicast Backbone

    Generally, the Internet works by transmitting information from one node to another, with various packets of information taking whatever route they can over the global network.

    An interesting variant of this is multicasting. Here a single node (the transmitter) sends out packets that pass over the Internet, not to one recipient, but to many. Clearly this is just the online version of broadcasting, but translating the familiar world of radio and television to the Internet poses some special challenges.

    For example, in the traditional model, such multicasting would require the transmitter to send out many - potentially millions - of identical packets so that they could be routed to their various (different) destinations. This is extremely inefficient close to the transmitter, and is not viable for more than a few recipients.

    Instead a technique has been developed that requires the transmitter to send out only one copy of its online broadcast. This is transmitted not to a recipient's individual Internet address, but to a special group address, one found among those that make up Class D (see the section on Address Classes).

    In effect, this is like a radio or TV channel: the transmitter broadcasts on this cyberwavelength, and recipients tune in to it.

    Along the way, special routers - the points at which the constituent parts of the Internet join together - know how to pass on the transmissions to one or more other special routers. This collection of Internet routers, able to pass on these multimedia transmissions, goes to form the Multicast Backbone on the Internet: the MBone.
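
    Tuning in to such a channel can be sketched with Python's standard socket module: the receiver joins a Class D group address and then hears whatever the single transmitted copy carries. The group address and port below are examples only, not a real MBone session.

        # Sketch of a multicast receiver joining a Class D group address.
        import socket
        import struct

        GROUP, PORT = "239.1.1.1", 5004    # example Class D address and port

        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))

        # Ask the network to add this host to the group ("tune in").
        membership = struct.pack("4s4s", socket.inet_aton(GROUP),
                                 socket.inet_aton("0.0.0.0"))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, membership)

        data, sender = sock.recvfrom(1500)
        print("Received %d bytes from %s" % (len(data), sender[0]))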

    Net Brand

    Web search engines have diverse extras, but they all have one thing in common: they aim to keep visitors at a site as long as possible. This brings the double advantage of increasing the effectiveness of the adverts they carry, and allowing them to complete online shopping's trinity of content, community and commerce.

    All the main sites have announced multimillion dollar deals whereby particular online retailers will be promoted within the expanded search sites as the preferred vendor for a particular domain. Although search engines have extended the range of their activities in this respect, particularly Excite which has emerged as the clear leader in this sector, none of them can compete with Yahoo.

    Extension
    According to analysis by Relevant Knowledge, Yahoo is by far the most visited site on the Web.

    This has been achieved largely through an extraordinary extension of its activities. As well as 11 overseas sites (Australia, Canada, Denmark, France, Germany, Japan, Korea, Norway, South-east Asia, Sweden and the UK) and the extra services now offered by search engines - news, yellow pages, white pages, free E-mail, chat, instant messaging - it has a dedicated site for young people and one for senior citizens.

    Shopping
    There are also online shopping areas: the Visa Shopping Guide, and specialised sites for cars, computers and travel. Yahoo has created its own magazine, a Visa credit card and merchandising.

    In short, Yahoo is no longer just an Internet catalogue, but the first Net brand. Netscape may have been the first Internet company, but it is Yahoo that has succeeded in creating a brand that transcends any particular product. As well as reinforcing this brand by adding new services, Yahoo has another goal: to turn itself into a new kind of complete, Web-based online service, not just a Web site.

    Its recent equity stake in the huge Geocities online community is an attempt to bolster its position in this respect. One early fruit of this move is a new joint Net service with Internet service provider MCI, called "Yahoo! Online powered by MCI Internet". One obvious competitor in this domain is America Online, the leading traditional online service. It has more than 11 million subscribers, and has made the transition to the Internet well.

    Tellingly, its Web site looks almost identical to Yahoo and several search engines that offer content-based channels. Another firm hoping to turn its site into a Web-based online service is Netscape. It has added discussion groups.

    Portal site
    An interesting contender is Snap. This is a site that has been created by the widely-praised Cnet as a standalone Web-based online service. Finally, Microsoft is working on a portal site to compete with Netscape's new online incarnation. Its appearance will signal a further nail in the coffin of the Microsoft Network which was supposed to provide just such a starting point for Internet users. The current shift towards purely Web-based services can only hasten its ultimate demise, at least as currently constituted.

    Netcraft

    Netcraft is a UK site that is now recognised internationally as offering a key Internet information service: mapping out which software the world's Web servers are running.

    Its Web server survey (available free) represents perhaps the most complete, and certainly most objective, snapshot of the server side of the World Wide Web today. It does this the slow but sure way: by contacting in turn every server that it knows about.

    When I first mentioned Netcraft some 18 months ago, it was boasting that its survey covered 20,000 sites; recently that figure had swelled to over a million - a clear indication of both the breadth and reliability of the survey, and of the unabated growth of the Web. A history of the survey's first year is available as are details of the methodology employed.

    The survey page presents the top-line results in a graph as well as tables. As I have mentioned before, one of the amazing facts to emerge from this survey is that the most popular Web server by far is the freeware Apache program with over 43% - and rising. This is against a background where Microsoft is pushing heavily its (free) Internet Information Server - which just recently crept into the number two slot with 16% - and Netscape has been selling its popular products for several years (and has 12% of the market).

    One other interesting overall statistic mentioned on this page is that at least 21% of Web servers are now running under Windows NT, a figure that has risen rapidly from zero.

    Netcraft is very generous with its information, and supplies the data in many different ways. For example, there are more graphs, and detailed reports for each Internet domain (.com, .uk, etc.). A fascinating page shows which systems the software companies run themselves.

    In many ways, one of the best features of the Netcraft service is the ability to check what software a given site is running by just entering the URL of the machine you wish to inspect.
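
    The basic technique is simple enough to sketch in Python: ask a Web server for its response headers and read the Server field it announces. Netcraft's survey is, of course, far more thorough than this.

        # Rough sketch: report the server software a site announces about itself.
        import http.client

        def server_software(host):
            conn = http.client.HTTPConnection(host, 80, timeout=10)
            conn.request("HEAD", "/")
            return conn.getresponse().getheader("Server", "not announced")

        print(server_software("www.netcraft.com"))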

    Netiquette

    The Internet has a culture and general etiquette known as Netiquette. Perhaps the fundamental rule for Internet access is to think globally but act locally. That is, even when seeking information that may relate to locations thousands of kilometres away, try to view it or download it from a site nearby. This will avoid unnecessary traffic on the limited intercontinental connections. Such local storage is common and reflects the Internet community's natural tendency to promote the capillary spread of important information from site to site.

    As well as casual holdings there are many major collections - called mirror sites - which purposely set out to hold exact and updated copies of key international resources. An example is Imperial College's mirror at ftp://src.doc.ic.ac.uk/computing/systems/ibmpc/windows3 of the huge CICA Windows archives at ftp://ftp.cica.indiana.edu/pub/pc/win3/, which is almost impossible to get into, so great is the demand.

    Another important factor that should affect your online behaviour is the time at which you access the resource in question. Many Internet sites are run on machines used for other purposes, notably at universities. If you log in during peak periods this may well divert resources from their main role, making such activity unpopular with both the site users and their system managers. If such access reaches excessive levels, it is possible that the facility will be withdrawn. Try, then, to connect to other systems outside normal working hours. This means that GMT morning is a good time to access US sites and GMT evening is better for those in the Far East. Ideally European sites should be accessed as far as possible outside the main working hours, though this may be difficult for business users. Unfortunately some sites are only accessible during their normal office hours.

    FTP sites usually ask you to leave your E-mail address as a password: this is mainly for statistical purposes, and as a courtesy you should answer truthfully. Similarly, an increasing number of WWW servers request users to register before accessing them (there is rarely a charge for this), and once again, it is only fair that you comply if you are going to be using their resources.

    Although the World Wide Web remains without doubt the most glamorous and visible manifestation of the Internet, it is E-mail that is both the most widely diffused and probably the most directly useful for businesses. Indeed, even sceptical IT managers who regard external Web activity by their users as little more than a waste of employees' time and corporate computing resources would never think of cutting off the corporate E-mail connection, so central has this become to many companies' way of working.

    Technological changes have introduced important new issues that need to be addressed by all E-mail users. For example, in the early days of business E-mail, most messages were simple text. But gradually techniques for sending more complex documents have been introduced, notably through the use of encoding and attachments.

    Encoding is simply a technique of converting files employing the full eight bits into those able to pass through E-mail's basic seven-bit format; examples include Mime and UUencode. Attachments are encoded files that are added to the end of an ordinary ASCII (text) message. Encoding and attachments are both useful technologies, if used appropriately. But unfortunately many people sending E-mail use them without realising it: for example, some programs allow you to create a document in Microsoft Word, say, and simply E-mail it without any further action being necessary. But to send what looks like a fairly innocuous letter written in Word means encoding it.

    Similarly, the almost trivial operation of attaching a file to a letter - a spreadsheet or graphic, for example - results in an attached file being added, albeit transparently to the sender. However easy this operation may be for the sender, it can be problematic for the network and the recipient. For example, even short Word documents are large compared with the same message written as ASCII text. Moreover, the mechanics of encoding mean that the file size is increased by a third.
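
    The one-third figure follows from the way Base64 (the encoding used by Mime) turns every three bytes of the original file into four characters of seven-bit-safe text, as a quick check in Python shows; the 300 Kbyte size is simply an example.

        # Why an encoded attachment grows by about a third.
        import base64
        import os

        original = os.urandom(300000)          # stands in for a 300 Kbyte attachment
        encoded = base64.b64encode(original)
        print(len(original), len(encoded))     # 300000 400000
        print(len(encoded) / len(original))    # 1.333...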

    Attachments therefore represent a considerable and unnecessary drain on the Internet. The same is even more true where graphics or sound files are concerned: although it seems a nice idea attaching one of these to your message, you may in fact be adding several hundred thousand bytes to its size. As well as wasting the Internet's (and your company's) resources, it also means that the recipient must download correspondingly larger files, and hence pay more for the privilege of receiving your message.

    But even assuming that they are willing to do this, there is another major problem. Because there are several alternative means of encoding messages and attachments there is no guarantee that recipients will have software able to decode them. Using encoding and attachments is therefore bad Internet manners. Neither should be employed when writing to strangers. They might, however, seem reasonable options when sending to regular correspondents who are happy to receive attachments, and are able to decode them, but in fact there is a very good reason why even in this situation they should not be used.

    Formatted documents such as those written in Word can contain macros, and it has become painfully clear that macro viruses are not only possible but extremely common. Since you can never be sure that your system is not infected with one of these (for example with a new strain undetected by current anti-virus programs) you do not have the right to jeopardise the computer - and possibly the entire network - of your recipient.

    Encodings and attachments should therefore be avoided completely when E-mailing. After all, the key element of E-mail is its content, and this can be conveyed just as easily - and far more efficiently and safely - with basic text. Files should be attached only if absolutely necessary, with the agreement of the recipient (who must have suitable decoding software), and if they are unable to carry conventional or macro-type viruses.

    Net Navigation

    Gophers have proved a very popular way of navigating the Internet's vast resources. Their menu-driven approach means you can see exactly what is on offer at any point. This, however, means that you must know - roughly, at least - where it is located in the menu structure. For this reason the search feature called Veronica has been developed. Retrospectively the name has been explained as Very Easy Rodent-Oriented Netwide Index to Computerised Archives, but Veronica seems to have been chosen as a companion to Archie, since both names appear in a North American cartoon strip. Veronica searches through its own menu entries and those of other Gopher sites. It is simple to use and has been around the longest, meaning its links through Gopherspace are often very rich.

    Since Veronica is an option on a Gopher menu, to access it you will need either a Gopher client running on your desktop machine or elsewhere in your company, or you can use the telnet function to run a Gopher client on a machine such as gopher.ic.ac.uk.

    Veronica searches can turn up hundreds or even thousands of possible menu items. The Jughead tool circumvents this by allowing you to carry out a Veronica-type search at a single Gopher only. Like Veronica, Jughead - where it is available - is normally found as a Gopher menu item.

    Net PC

    The network computer (NC), an idea intimately bound up with those of intranets and Java, is one of the most interesting developments in the online world, even if it is only just passing from vapourware to hardware. If nothing else, it has provoked a great deal of discussion within computer and end-user companies about the future direction of corporate computing, and ways of reducing the current high support costs for business PCs.

    The success of the NC concept can be clearly seen from its effect on companies threatened by it. Initially, Microsoft dismissed the whole idea as wishful thinking and/or irrelevant. But as the reality grew ever closer - culminating in the recent launch of Sun's JavaStation - Microsoft and its main hardware partner Intel have shifted ground considerably.

    As this so-called Wintel duopoly sees its dominant position potentially challenged for the first time in years, they have fought back in classic fashion. Noting that some of the ideas of the NC have met with considerable support among end-users, the two companies have come up with the cleverly-named NetPC concept.

    In effect, this is the basic NC idea hijacked and tricked out in PC guise. The argument of Microsoft and Intel is that the NC throws out the baby with the bathwater, sacrificing too much of the valuable PC functionality for the sake of otherwise laudable gains such as ease of support and lower cost of ownership. The NetPC, based around a minimum configuration of a Pentium chip with 16 Mbytes of RAM plus a hard disc, aims to offer all of these, together with the traditional advantages of the PC.

    The NC specification, alongside basic hardware features such as VGA screen resolution, a pointing device, text-input capability and audio output, includes a number of software requirements. The most important of these is that a Java Virtual Machine and runtime environment must be offered, as well as the Java class libraries.

    Other key elements for an NC device are support for HTML and HTTP, since Web browsing is seen as one of the central uses of such a system. Also required are SMTP (for sending e-mail), and POP3/IMAP4 for retrieving it. Network administrators will also be interested to see that two other elements must be included: DHCP and bootp, the latter to enable an NC to boot over a network.
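
    The mail requirements amount to nothing more exotic than the standard Internet protocols, which Python's standard library can exercise directly; the host names and credentials below are placeholders rather than real servers.

        # Sketch only: SMTP out, POP3 back - the mail protocols an NC must speak.
        import poplib
        import smtplib
        from email.message import EmailMessage

        msg = EmailMessage()
        msg["From"] = "nc-user@example.com"
        msg["To"] = "colleague@example.com"
        msg["Subject"] = "Sent from a network computer"
        msg.set_content("No local mail store needed - just SMTP out, POP3 back.")

        with smtplib.SMTP("mail.example.com") as smtp:   # SMTP for sending
            smtp.send_message(msg)

        pop = poplib.POP3("mail.example.com")            # POP3 for retrieval
        pop.user("nc-user")
        pop.pass_("secret")
        print("%d messages waiting" % pop.stat()[0])
        pop.quit()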

    There are no references to processor type at all in the NC specifications. Interestingly, the main implementation of the Network Computer is being put together by the UK company Acorn using the ARM7500 processor. The basic price will be around the £300 mark.

    Other major companies which have announced various incarnations of the NC are Apple with its Pippin product and IBM with its 'thin client' intelligent network terminal, though it is likely that both of these will cost more than the Acorn unit.

    One company conspicuous by its absence from the roster of NC supporters is Microsoft. This is hardly surprising since a major reason for the whole project was to fight back against the increasingly dominant Microsoft/Intel 'Wintel' platform. The NC approach makes the hardware irrelevant, and shifts the emphasis from the desktop to the server, using Java to deliver functionality over the network.

    At first, Microsoft simply dismissed the Network Computer as patently absurd, rather as IBM had done with minicomputers, and Digital with microcomputers, when the ideas were first mooted. But just as those companies had eventually to recognise the value of those new approaches, so Microsoft seems to have shifted position of late. In its usual style, it has come up with its own variant of the NC idea, dubbed Simply Interactive PC.

    SIPC is aimed primarily at the consumer market. Indeed many see these new 'fourth generation' computer products as a way of capturing the relatively large sector of the population that still does not own a PC, put off by high costs and the know-how required. NCs for the home aim to solve both problems, for example by offering units that plug into ordinary TVs, and which log on to the Internet automatically. One of the most interesting companies in this area is Diba (http://www.diba.com/), a start-up which is introducing a range of low-cost, dedicated devices that build on the basic NC idea.

    The other main arena where NCs are expected to be employed is within companies, particularly where intranets already exist. Applications and operating system updates can be provided centrally, and the running costs of these devices should be far lower than those for desktop computers, where software maintenance is a major cost.

    Other applications of the NC include production systems and stock control in factories, and myriad uses as intelligent terminals within retail outlets. Another interesting possibility is that national telecom companies might distribute NCs to their customers to offer an Internet-based service along the lines of the French Minitel system.

    The main point to bear in mind about the Network Computer is that it is fundamentally complementary to the PC. It will never replace the latter for complex tasks where desktop power is required - for example, in the basic creation of information. But for accessing that information, especially in situations where a fully-fledged PC cannot be justified either in terms of cost or power, the NC may offer the perfect solution.

    Netscape

    It is extraordinary that a company and a product that seemed to dominate the world of the Internet in 1996 first entered the public awareness only in 1994, and had largely lost out to Microsoft's Internet Explorer by 1998.

    Mosaic Communications Corporation (as it was called until the University of Illinois took exception to the use of the name of its then most popular browser, Mosaic) was founded in April 1994 by Jim Clark (ex-Silicon Graphics) and Marc Andreessen (the originator of Mosaic), but it was not until October 1994 that Netscape Navigator 0.9 appeared on the Internet.

    In a marketing move whose impact has not been sufficiently appreciated, Netscape Communications chose to make this beta version freely available. It is true that in doing so the company was following a time-honoured tradition in the Internet world - but not one that any commercial company until then had been brave enough to adopt. At a stroke it gained the sympathy of the Internet community, free beta-testing and a huge installed base. In the process, it has also revolutionised the way software is sold. For now it is quite a common practice among even hard-headed software companies to release free versions of programs on the Internet, and even to provide freeware alongside more advanced paid-for applications.

    As a result of this move (and the very real virtues of Netscape Navigator), the browser in its various incarnations has something like 70% of the market. This has enabled Netscape Communications to sell its server products to the business world by offering the prospect of tapping into the advanced features of this huge community, notably secure financial transactions (though carelessness in the implementation of this meant that things were not quite as watertight as they should have been).

    Other servers available include those without encryption support (for marketing purposes) and those functioning as a proxy server. There is also a news server product that adds encryption and password protection features to the standard Usenet structure (making it ideal for use within companies for collaborative work).

    The company has also been busily extending its client-side products. As well as the new Netscape 2.0, already released as a beta, and offering advanced features such as support for Java, there is a new HTML editing tool called Netscape Navigator Gold and a Web site management product called LiveWire. Netscape Communications has also encouraged the development of add-ons like SmartMarks and Chat (see the URL http://home.mcom.com/comprod/netscape_products.html for more information on all Netscape products).

    The reason for all this activity is not only to justify the $2 billion valuation placed on the company in its recent share offering (that saw Marc Andreessen alone worth $50 million) and to generate a profit (which so far in its short life it has failed to do): Netscape Communications knows that it is living on borrowed time. The belated entrance of Microsoft into the Internet field (with TCP/IP support in Windows 95, the freely-available Internet Explorer browser and a new Web server code-named Gibraltar under development) means that Netscape now has to fight not just against the old Mosaic browser and all of the new commercial Internet programs from other companies of varying strength and seriousness, but against the software world's undisputed Goliath.

    It was for this reason that Netscape Communications rushed out the final Windows 95 version of Netscape Navigator 1.2 two days before the launch of Windows 95 itself (and Microsoft's own browser). And it is for this reason that Microsoft in turn has already rushed out version 2.0 of Internet Explorer, attempting to catch up with and even leapfrog over the young upstart.

    To bolster its position Netscape formed alliances with many of Microsoft's rivals: Novell, IBM, Sun, Adobe, AOL and CompuServe. At stake is nothing less than mastery of the Internet.

    Plugging in to Netscape's new Internet world

    One of the most important features of Netscape Navigator 2.0 and later releases is the ability to use software plug-ins. These can be thought of as a natural development of standard Web browser helper applications. Whereas before, you set up external programs that could be called up by the Web client when various file types were encountered, the new plug-ins display their results within the main Netscape window.

    Many of these plug-ins are simple viewers for graphics and sound files. Among the companies that have brought out plug-ins (all available free via links at http://home.netscape.com/comprod/products/navigator/version_2.0/plugins/index.html) are Corel (a .cmx viewer), InterCap (.cgm files) and FigLeaf (with a program that can view most graphics formats).

    There are also various streaming video viewers (from PreVu and VDOLive) and a Midi player from Crescendo. One plug-in with an impressive specification comes from FTP Software. Its commercial KeyView program plus plug-in will display around 200 different text and video formats within the main Netscape window (see http://www.ftp.com/mkt_info/evals/kv_dl.html).

    Alongside these fairly conventional applications of inline viewers, there are some more interesting uses. For example, Infinet Op and Summus offer new graphic compression technologies in their software, while SoftSource's SVF plug-in lets you view Cad files, and MDL Information system has created one for the Chime chemical structure standard.

    With so many rival programs to choose from it is hard to know which to download. However, there are a few leaders. Adobe, for example, has released a useful plug-in version of its Acrobat Portable Document File reader. RealAudio also has a plug-in, and Macromedia's Shockwave for Director has established itself as the main multimedia enhancement technique on Web sites. But more interesting even than these are various programs that take the Netscape plug-in as a starting point for something much broader. For example, plug-ins from NCompass and Business@Web allow you to use OLE controls from Netscape. Similarly, software from Wayfarer Communications lets you create general client/server applications using Netscape as the client. These represent an important new trend: regarding Netscape less as a program and more as a complete platform (one plug-in, from Philippe Kahn's new Starfish Software company, even dispenses with the need for a connection to the Internet altogether, and uses Netscape as a software environment).

    Just as the hardware-independent Windows represented the next stage beyond Dos, so Netscape is attempting to define itself as the next step beyond graphical desktop environments by becoming a platform-independent graphical Internet environment. This is the real importance of the Netscape plug-in: it represents a frontal attack on Microsoft's near-total dominance in this sphere.

    This aspect could prove crucial to the survival of Netscape. When the plug-in facility was first planned, it was probably designed as an attractive extension to the Netscape Navigator product. But at that time the idea was still to make money out of selling the Netscape servers, while building the user base through low-cost sales of the browser.

    But now that Microsoft is giving away its own Web server, Netscape is finding that its revenues from the server side are less than expected; recent dramatic cuts in server prices (see http://home.netscape.com/comprod/server_central/product/pricing/index.html) will only exacerbate this shortfall.

    On the other hand, corporate sales of the Netscape browsers are generating far more than was planned. So it may well be through strengthening this - by turning it into the central Internet platform - that Netscape the company will survive and thrive.

    Intranet wars

    Now that intranets are moving increasingly to the centre of the stage as far as business computing is concerned, the earlier skirmishes between Netscape and Microsoft over browsers have escalated into a full-scale intranet war. The result of this tussle matters considerably, since it will determine the corporate IT landscape for the foreseeable future. For this reason alone, it is well worth studying the various documents that both parties have produced outlining their respective cases.

    Surprisingly, perhaps, it is Netscape's strategy paper that is the more polished. An extremely long and well-written document, it offers one of the best introductions to the general area of intranets, and should be read by anyone interested in this field. The eleven case studies of Netscape-based intranets are particularly valuable since they help to flesh out what is sometimes a rather abstract idea.

    For example, there are links to case studies of intranets being used for company-wide database access, project-oriented communication, world-wide distribution of sales materials, sales cycle support, corporate information and internal procurement.

    There are also some interesting external uses of intranets: customer support and supplier information.

    Netscape Communicator adds the long-awaited intranet collaboration tools such as local discussion groups to the client side. This is in addition to the CoolTalk Internet phone utility already available in Netscape Navigator 3, and the shared whiteboard and chat functions also found there. On the e-mail front, there is support for IMAP (a more capable alternative to the popular POP3 protocol) and e-mail filtering. The emerging White Pages directory standard, LDAP, will be supported, and new HTML features will include three-dimensional layered frames, improved font control and multiple columns.

    Corporate IT managers will particularly appreciate the enhanced administration tools. These were introduced with version 3 of Navigator, and in version 4 they allow IT departments centrally to control and update Navigator components such as plug-ins. It will also be possible to distribute semi-customised versions of the browser, with things like proxies, home pages and so on already configured.

    Orion servers will come with an integrated LDAP directory, new server agents that will allow intranet administrators to automate certain tasks, and native support for certificates - secure ways of establishing identity and access privileges. Replication services are also being introduced in an attempt to bolster Netscape's weaknesses in this area and to go head-to-head with Lotus Notes. Ultimately, full multi-master replication of directory services, content and discussion groups will all be offered.

    However, the company to beat in the intranet world is not Lotus (even though it is now aiming to marry Notes with the Internet in its Domino product) but Microsoft.

    Microsoft's intranet strategy

    In contrast with Netscape's impressive intranet strategy document, Microsoft's plans were initially sketched out far less coherently. Most of the information is contained in transcripts of some rather rambling speeches by Gates and his senior managers during a recent intranet day. Perhaps shamed by Netscape's polished prose, the company did later pull together a more formal document.

    Like Netscape Navigator 4, Internet Explorer 4 is due out at the end of this year, and will offer the Active Desktop capability, previously known by the code-name Nashville. This will integrate the Windows Explorer tools with the basic metaphor of the Web. Another re-christening is that of the Java development environment code-named Jakarta, which is now called Visual J++.

    Much of the intranet strategy day was devoted to ActiveX, Microsoft's answer to Java applets (though the latter are also supported). Support for ActiveX is already built into Internet Explorer 3, and development tools like the ActiveX Control Pad have started to appear.

    The other main theme of the day - and of Microsoft's intranet strategy - was Office 97. As with Windows Explorer, the various components of Office have had Internet functionality built in to them in what is designed to make the Internet (and intranets) a seamless extension of a PC's hard disc.

    All Office applications will gain something called the Office Web toolbar, with Back, Forward and Home buttons. Web FindFast is another common feature, designed to provide Altavista-like search capabilities across one or more intranet servers. Collaborative working on Office documents over networks is now possible, with revisions, conflict histories, comments and version information all supported.

    Windows NT products are increasingly important in Microsoft's intranet strategy. It is remarkable how an operating system dismissed by many as too lightweight for serious corporate computing has managed to ease itself into a starring role, largely thanks to the new interest in intranets, where its ease of use shines through. In addition to the Internet Information Server, Windows NT will gain a directory server (supporting the new LDAP standard), a search engine and certificate facilities that will integrate with NT's administration tools.

    Windows NT will also support the new Point-to-Point Tunnelling Protocol which will allow virtual private networks to be set up across the Internet and for intranet islands to be joined together cheaply but securely. Also on the security front, there is a proxy server code-named Catapult, while further out are Media Server (previously code-named Tiger) which will allow audio and video streaming for real-time multimedia applications, Merchant Server for online commerce and a product code-named Normandy.

    Interestingly, this is the software that Microsoft uses for its own MSN network, now Internet-based. It is designed for commercial online services - CompuServe has announced that it will adopt it - but could equally be used within a very large organisation that wished to set up a full-service intranet online system.

    As is clear from the above, Microsoft is approaching intranets from the desktop, with NT acting as its bridgehead into higher-end systems. The company's strengths are the tight integration of its products with Windows and Office. Its weakness is the limited cross-platform support, particularly on the server side. Netscape is stronger here, but weaker on the desktop. Fortunately the Internet's ability to support heterogeneous environments means that you can choose the best elements from each strategy for your particular needs.

    Netscape rises to the challenge from Seattle

    Just before the end of 1996 Netscape came out with the first public beta of its Netscape Communicator, effectively Netscape Navigator 4 plus significant groupware elements. In many ways this is the most important new version to date. Microsoft has already made considerable inroads into Netscape's share of the browser market, at least as far as end-users are concerned. Netscape needs to show that it can fight back on its own terms, and to do this more than a minor upgrade was required.

    Judging by the first release (http://home.netscape.com/comprod/mirror/client_download.html) Netscape has probably succeeded. The new Communicator suite is aimed straight at the business sector, especially corporate intranets, and Netscape has not tried to fight Microsoft in areas where it would lose, for example on pricing, or integration with Windows. This does not mean Netscape has not cheekily stolen some of Microsoft's thunder. For example, one striking feature of Internet Explorer 3 (and Microsoft Office 97) is the adoption of a sleek new interface. Netscape has shrewdly copied this for Communicator, since the older face of Navigator looked clunky in comparison.

    At the same time, it has re-arranged the drop-down menus, placing the important Options menu as the Preferences sub-menu of Edit (as many programs now do). It has also brought Bookmarks to the foreground of the interface, and adopted a rather neat roll-up form for icon bars. Links on the Help menu have been expanded, and the Help function itself - based on Netscape's new HTML help technology - is comprehensive.

    Netscape Navigator acts as the foundation of the suite: most other functions are accessed from a detachable menu bar in the bottom right-hand corner (just where mail and news were found in earlier versions). Accessible from here is the Message Centre, which handles E-mail and includes a module called Composition that lets you create E-mail and newsgroup postings.

    As well as a far more sophisticated interface, new features include the ability to embed HTML elements directly, and wide-ranging security features. These include encryption of messages, attaching digital signatures, and digital certificates. All will soon be indispensable for sending sensitive information across the Internet and intranets. Another welcome feature is the possibility of filtering mail: based on the content, sender or recipient it is possible to set the software to redirect or delete incoming messages. Multiple rules can be applied.
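
    The filtering idea is straightforward enough to sketch (this is not Communicator's own rule engine): each rule tests the sender, recipient or content and names an action, and several rules can be applied in turn. The rules below are invented examples.

        # Sketch of rule-based mail filtering.
        RULES = [
            {"field": "from",    "contains": "newsletter@", "action": "delete"},
            {"field": "subject", "contains": "urgent",      "action": "redirect:manager"},
        ]

        def filter_message(message):
            for rule in RULES:
                if rule["contains"] in message.get(rule["field"], "").lower():
                    return rule["action"]
            return "deliver"

        print(filter_message({"from": "newsletter@example.com", "subject": "Offers"}))
        print(filter_message({"from": "boss@example.com", "subject": "Urgent: report"}))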

    One advanced feature is the ability to consult online E-mail directories (currently Four11 and Big Foot) using the new LDAP protocol. This is a big step towards the creation of a global E-mail directory, and is also being supported by Microsoft (http://www.microsoft.com/ie/imn/ldap.htm) through an update for its mail client. A separate window is used to display news messages, which may be from an Internet newsfeed, or from intranet NNTP servers used to create local discussion groups within a company or department. This uses the Collabra technology acquired by Netscape in 1995.

    The rather confusingly named Composer program (separate from Composition, which is used for mail and news postings) is a replacement for the unloved and feeble HTML editor that was included in Netscape Gold. However, it still lacks many of the sophisticated features found in dedicated HTML editors. Also limited, but less damagingly so, is Netscape Conference. This allows you to set up voice calls over the Internet/intranets, or use a whiteboard jointly. There is also a feature called "Collaborative Surfing" which allows one person to lead another around the Web.

    Not in the beta I looked at were the calendaring features that will form an important part of the Communicator suite. But it should be clear from the above that Netscape has moved beyond a simple browser, although at the cost of size: the full download is nearly 10 Mbytes.

    Netscape climbs onto the push bandwagon

    One of the consequences of push technology's high profile is that the market has become very crowded. In fact there are already signs that a consolidation is taking place: IFusion filed for protection under US bankruptcy laws, and BackWeb has purchased Lanacom, whose Headliner push product was praised last week.

    This shake-out will undoubtedly be accelerated by the imminent arrival of the undisputed giants in the Internet software sector: Microsoft and Netscape. Both have announced that they will be offering push technology as standard in version 4 of their respective browsers.

    Just as Netscape has pulled ahead of Microsoft in terms of releases with its Communicator product, which incorporates the Navigator browser as one element of a larger software collection, so the second beta of Netcaster, its push component, is out well before Microsoft's product.

    Netcaster is available through the new and innovative Smart Update service offered by Netscape. Once you have installed version 4 of its browser and connect to the URL http://home.netscape.com/download/smart_update.html, the site will help you update automatically, using Java applets to control the installation process. Although the screens explaining what is going on during this are a little confusing, I found that the update itself went ahead without a hitch.

    Netcaster uses the standard channel metaphor and, once it has loaded, places a band down the right-hand side of the screen where some of the available channels are listed. When you click on one of the channels displayed, you are connected to Netscape, which acts as a kind of central reference point, and allows you to subscribe to the channel.

    Part of the subscription process is choosing how frequently the channel is updated. Netcaster works by pulling down Web pages (and as such is not a true push technology) which represent the channel. The advantage of this approach is that you can turn any Web site into a channel. Unfortunately, the Netcaster channel publishers seem to have repeated all the mistakes of the early push providers: I found most sites extremely intrusive and difficult to navigate.
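
    In other words, the 'push' is really a scheduled pull, which can be sketched in a few lines of Python; the channel URL and update interval here are examples chosen for illustration.

        # Netcaster-style channel updating: fetch the channel's pages at the
        # interval chosen when subscribing, caching them for offline viewing.
        import time
        import urllib.request

        CHANNEL_URL = "http://www.example.com/channel/index.html"   # example only
        UPDATE_EVERY = 60 * 60                                       # once an hour

        def update_channel():
            with urllib.request.urlopen(CHANNEL_URL) as page:
                cached = page.read()
            print("Cached %d bytes for offline viewing" % len(cached))

        for _ in range(3):          # a real client would loop indefinitely
            update_channel()
            time.sleep(UPDATE_EVERY)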

    This is even more the case when such sites are displayed as what Netscape terms Webtops. That is, they take over the whole of the computer screen and become in effect the graphical interface. Netscape calls this a tremendous boon for information suppliers, since it allows them to gain the full attention of their users, and to act as a gateway to further information. This may be true, but I felt I had lost control of my computer completely, which had been turned into an expensive television. Since operating system functionality is hidden completely, I found it very hard to navigate around the Webtop.

    Netscape has always been the most genuinely innovative firm on the Internet, and its Netcaster software is no exception. By using a combination of JavaScript, Java and HTML, it has managed to recreate a traditional push channel, and even to add to it: Netcaster supports Castanet channels natively.

    But the company's attempt to take on Microsoft's Active Desktop with its Webtop is misguided. Netscape should not be trying to undermine Microsoft's operating system business (something that is plainly impossible), but should concentrate on producing the best browser (with extended functions of the kind offered by the Communicator suite).

    Netscape has released the final version of Netcaster (c. September 1997), but the impression is that Microsoft's Webcasting channels are more fleet of foot: partly, no doubt, because they can make calls direct to the operating system.

    This double disadvantage might seem to spell the death knell for Netscape's approach, but there are other important considerations to bear in mind. The first is that, as ever, Microsoft's approach is rooted very firmly in the Windows world, even if there are versions for a few other platforms. However, this will change: the use of the Extensible Markup Language (XML) means that eventually any browser supporting this new standard will be able to work with Channel Definition Format files.

    Much more serious is the question of ActiveX controls. Microsoft has still not fully addressed the issue of security as far as these components are concerned. Unlike Java, which has a well thought-out security model (even if some implementations of it have proved less than water-tight) to prevent rogue Java applets from destroying or stealing information on systems they are downloaded to, ActiveX controls have no such constraints. They have full access to the operating system and hence the corporate network.

    Microsoft's attempt to provide a security framework for ActiveX controls, Authenticode, merely ensures that the code is not tampered with in transit, and that its owner can be established with certainty. Whether that owner is trustworthy now or in the future - one dangerous option is to accept all ActiveX controls from a given certified publisher without question - is something that is hard for anyone to judge. It is certainly too much to ask of corporate users who have better things to do with their time than ponder the validity of digital certificates.

    For this reason, Internet Explorer 4 and its Webcasting push technology will prove something of a conundrum for IT managers. If it is deployed within a company and ActiveX controls are permitted, there is the risk of machines being attacked and data stolen. If the ActiveX controls are blocked at the corporate firewall, Internet Explorer 4 loses much of its power (for example, the Pointcast channel will not function at all).

    Against this background, Netcaster begins to look more attractive in a corporate context. To avoid the confusing mish-mash of Webtops created by external channel publishers, a single company front-end can be deployed across all platforms. And by minimising unnecessary multimedia effects, speed problems can be avoided.

    Netscape's case has been helped enormously by the announcement that it would be unbundling Navigator and Netcaster from the bloated Communicator suite. The latter was turning into something of an own goal, alienating both customers (who didn't necessarily want the extra functions at the price of many more megabytes of code) and partners (who didn't want Communicator competing with their groupware products).

    Although it is early days, there is some indication that the publishing industry is moving behind Netcaster in a significant way. Not only has Netscape already announced more than 700 channels (see http://www.netscape.com/newsref/pr/newsrelease458.html for details), but it is noticeable that these have a very definite business feel to them.

    Microsoft has listed its own roster of supporters. It may well be that Webcasting - like Internet Explorer itself - becomes dominant among end-users, while Netcaster (and Navigator) wins out within companies.

    Netscape gets back to its Internet roots

    A steadily shrinking browser share has been followed by catastrophic financial results. And, to add insult to injury, the joint venture with Novell called Novonyx - which has been porting Netscape products to the Netware platform - has recently been taken over completely by Novell. Against this dismal background, Netscape has taken some sensible remedial action, no doubt largely due to the astute leadership of its chief executive Jim Barksdale.

    Dramatic move
    The most dramatic move was the decision to make the entry-level Communicator suite available free (users still have to pay for the more advanced version). This is more symbolic than anything, since most private users had probably downloaded the program for nothing, and for companies the bulk prices were sufficiently small as to make little difference.

    Far more profound, though, is Netscape's announcement that it would be making not just the next version of Communicator freely available, but its source code too.

    Free for all
    Details are a little hazy, but the idea is that anyone will be able to take that code and adapt it or add to it freely; Netscape retains the right to use any elements of new versions that may be developed. Although this is more or less unprecedented in the commercial world, it is well-established on the Internet. Perhaps the best-known examples of a similar approach are Linux and Apache.

    Netscape's announcement could help to legitimise these products further in the eyes of firms sceptical about their genesis. For Netscape, this is a clever move for several reasons. First, it goes beyond Microsoft's own licensing of Internet Explorer to third parties. Second, it potentially mobilises thousands of programmers on the Internet; a virtual software team that surpasses anything Microsoft can muster in-house.

    And third, it represents a return to Netscape's roots as a true Internet firm. When the company first released version 0.9 of Navigator in October 1994, it was perceived very much as an Internet event. It is important to remember that the way the beta code was released and comments invited from all Internet users was novel then in the commercial world.

    Gradually, though, Netscape moved further away from its origins, culminating in the bloated and expensive Communicator suite that aspired to take on groupware products such as Notes. Making Communicator's source code available may help the firm to win back the goodwill it once had among the online community.

    Climbdown
    Another wise move in this context - even though it is rather a climbdown - is the decision to halt work on producing Netscape's own Java Virtual Machine.

    Partly because it had so many platforms to support, Netscape had fallen behind in offering the latest Java technologies, and had become something of a liability for the Java movement. It will now employ an open architecture for its browser that will allow other Java Virtual Machines to be used; for example, Sun's or Microsoft's.

    Although painful, these decisions should allow the company to concentrate on basic browser and server technology, and to build on its expensive electronic commerce acquisitions, Actra and Kiva.

    Netscape Communicator 4.5

    In contrast to Microsoft with Internet Explorer, Netscape has chosen to concentrate on the user interface of Communicator

    The changes to Microsoft's Internet Explorer 5.0 are mostly under the bonnet, particularly in the areas of Dynamic HTML and XML, with not much new on the surface. Netscape's Communicator 4.5 (home page at http://home.netscape.com/communicator/v4.5/index.html) takes almost exactly the opposite approach. It is also notable for its integration with the company's evolving Netcenter portal.

    So far as I can tell from the preview version (download from http://home.netscape.com/download/prev.html), Netscape has done little to update its rather retrogressive support for emerging Internet standards such as DHTML or XML, though Java and Javascript have both been improved, and support for Macromedia's Flash technology included.

    Browsing
    Instead, it has added a number of high-profile and innovative features to the Navigator browser component (see http://home.netscape.com/communicator/navigator/v4.5/index.html) such as Smart Browsing (details at http://home.netscape.com/communicator/navigator/smart.html), which aims to make Internet navigation easier.

    It does this in two ways. First, using the Alexa system, a new What's Related menu offers suggestions of other sites that may be relevant. The second approach is using keywords. This lets you enter a word or a phrase instead of a URL. Navigator then refers to a database held at Netcenter that maps these words and phrases into addresses. According to the company, this database will "initially contain generic terms and some specific URLs, eventually growing to include such things as trademarks, 800 numbers, and stock symbols."

    This is an interesting idea, but not without its pitfalls. Most worryingly, it means that Netscape's portal in effect becomes a gatekeeper too, since it gets to decide whose site will be returned for key words. Even assuming that this is not determined by monetary considerations, it still implies a value judgement on Netscape's part. Moreover, the tight dependence of this feature on Netscape's Web site is something for which Microsoft would be roundly criticised if it dared do the same.

    Other innovations are less contentious. For example, Navigator's offline capabilities have been improved: there is a detailed history list, which allows previously visited pages to be viewed offline.

    Netscape's Smart Update has been refined, making the addition and removal of software easier. Alongside this there is another innovation: the Netscape Quality Feedback Agent, which enables the browser to send reports of software problems to Netscape automatically (see http://home.netscape.com/browsers/qfs1.html). This is an approach that could be usefully adopted by other software manufacturers to help improve their products.

    The other component of Communicator that has been radically overhauled is the Messenger E-mail and news reader - see http://home.netscape.com/communicator/messenger/v4.5/index.html. It now employs a three-pane interface with some novel window controls.

    There is extended support for lightweight directory access protocol-based directories, and synchronisation facilities have been improved. The latter is designed to allow users to read and write E-mail and Usenet postings offline, possibly using several machines (for example a desktop in the office and a portable on the road).

    A related feature is roaming: a user's profile (preferences, bookmarks, address book, security files, etc) can be stored centrally on a server that supports lightweight directory access protocol or HTTP. A user can then log in to another computer and access his or her own personalised Communicator environment.

    These features and the new calendaring software (see http://home.netscape.com/communicator/calendar/v4.5/index.html) will undoubtedly be a boon to corporate users. But other elements are aimed more at the consumer sector, for example Net Watch (http://home.netscape.com/communicator/netwatch/), which controls what content can be viewed.

    Netscape is to be praised for its continuing innovations in the browser sphere, but also cautioned against trying to become all things to all people and jeopardising its browser's overall efficacy in order to bolster its Netcenter portal.

    Net Telephony

    The idea behind Net telephony - the transmission of ordinary voice messages over the Internet - is extremely simple. Since TCP/IP transmissions consist simply of data packets, these can just as easily be digitised voice messages as program files or Web pages. But the consequences of this are dramatic. Because Internet pricing takes no account of distance, the cost of making a Net phone call to Australia is the same as one to Acton - a penny or two a minute at most.

    Software like VocalTec's Internet Phone - probably the market leader in terms of usage - along with a host of other freeware and shareware programs and more polished commercial programs like Quarterdeck's WebTalk, has been around for over a year.

    In addition to these pioneers, there have been many new entrants such as Netscape (with its CoolTalk, available as part of Netscape Navigator 3) and IBM with its Internet Connection Phone as well as innovative programs like PGPfone which allows real-time encryption of conversations for complete security. But so far, Net telephony has been something of a fringe activity, conducted by a few enthusiasts rather than ordinary business users.

    The reasons are not hard to find. Quality and reliability both leave something to be desired. The way in which the Internet transmits data - split into packets, possibly sent by completely different routes - means that messages can break up and become garbled. As technology progresses and the Internet infrastructure improves, this is likely to become less of an issue. More serious is the question of standards. Most Net telephone programs will only work with themselves: there is no interoperability, so you need to make sure that the intended recipient of your call is similarly equipped.

    A joint Intel-Microsoft initiative may at last bring some order to this market. Built around the H.323 audio and video conferencing standard, this new generation of Internet phones should, in theory, allow full interoperability between products from different manufacturers, starting with Intel's Internet Phone and Microsoft's NetMeeting. Microsoft will also be building the technology into future releases of Windows.

    If proof were needed that a new era in Internet telephony is upon us, you need look no further than the America's Carriers Telecommunication Association. This group of telephone companies was set up with the explicit aim of fighting Internet telephony, which they recognise as representing a serious threat to revenues.

    Happily for business users, their case seems pretty forlorn. The VON coalition, formed to promote the wider use of Net telephones among other things, counts not only Microsoft among its members, but the major US telecoms company Sprint. Moreover, the chairman of the FCC, the regulatory body for telecoms in the US, has stated: "we shouldn't be looking for ways to subject new technologies to old rules."

    In the UK, both Oftel and the DTI are "monitoring" the situation - which given the rapid rate of change in the Internet world is tantamount to tacit approval. The only brake on Net telephony in this country has come from some ISPs like UUnet Pipex which have banned its use for fear of being deluged with the traffic it generates. However, Pipex is likely to join other major ISPs like Demon who have always allowed its use, as technology advances and user demand increases.

    Moreover, the one remaining objection to Net telephony - that it is only any use for those with Internet connections - is about to be overcome. VocalTec is launching its Internet Phone Gateway Server that will allow ordinary company phone systems to be patched into the Internet. And IDT offers the Net2Phone service whereby Internet users can call any phone anywhere in the world.

    Network Computing Framework

    The network computer (NC) idea was one of the most exciting approaches to business computing to evolve out of the Internet environment. Unfortunately so far it has turned out to be little more than a high-profile marketing concept, with much promise but little substance.

    One reason why the NC has failed to materialise is that it was so vague in terms of its implementation, and in particular how it would fit into the larger picture of corporate computing. Thin clients on their own are useless without appropriate infrastructure elsewhere within an organisation. So one approach to rectifying this flaw of NCs has been to construct a detailed framework based on the original insight.

    For example, the Network Computing Framework does not just talk airily about thin clients, but specifies all elements of the overall architecture. To be sure, thin clients are in there, but only as a part of the client side, which is built around Web browsers with Java applets adding extra functions "just in time".

    Javabeans lie at the heart of the framework, both on the client and server side. Standard protocols such as HTTP are used to communicate among them. In addition, there are some connector services to provide access to existing data, applications and other external elements.

    One of the main proponents of the framework, IBM, has other plans in terms of adding extra features. But whether these are ever deployed, or remain as yet more monuments to marketing hype, remains to be seen.

    Newsfeeds

    Of all the various component parts that go to make up the Internet and its services, Usenet is perhaps the most mysterious to newcomers. How exactly this great flood of messages is passed around the world, gaining new postings and comments as it goes, can be difficult to understand at first.

    The fact that a Usenet feed is required quite separately from an Internet connection only increases the confusion: is Usenet part of the Internet or not? Strictly speaking, no: Usenet newsgroups can in fact pass outside the Internet altogether, using the UUCP protocols for exchanging data between Unix systems.

    However, whatever its historical importance (and its local use in areas where the TCP/IP infrastructure is less well-developed), UUCP transmission has been largely replaced by the Network News Transfer Protocol (NNTP) using the Internet as the wiring.

    Typically each Internet provider (that is, the company that links you to the Internet when you connect to them) receives a newsfeed from their own Internet supplier (if they have one) or from an Internet peer elsewhere. This feed is then made available to their subscribers, allowing them to download parts of the Usenet flow to their machines and to view them with a Usenet client.

    The same client can also be used for adding messages or comments to messages; these are then sent to the Internet provider which adds them to the stream of new Usenet messages that are passed on to other major Internet nodes. To stop the entire Internet system from becoming clogged up with Usenet messages (and to preserve the finite storage space of Internet providers), postings are normally only kept for a few days before being discarded.
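
    For the technically curious, the sketch below shows how a newsreader might talk to such a server over NNTP using Perl's Net::NNTP module. The server name is a placeholder for whichever news host your provider supplies, and the newsgroup is merely an example.

    #!/usr/bin/perl
    # A minimal sketch of an NNTP client: select a newsgroup and print the
    # Subject lines of the five most recent postings. "news.example.net"
    # stands in for whatever news server your Internet provider runs.
    use strict;
    use warnings;
    use Net::NNTP;

    my $nntp = Net::NNTP->new('news.example.net')
        or die "Cannot reach the news server\n";

    # Selecting a group returns the article count and first/last article numbers.
    my ($count, $first, $last) = $nntp->group('uk.net');
    die "Group not carried by this server\n" unless defined $last;

    for my $article ($last - 4 .. $last) {
        my $head = $nntp->head($article) or next;    # header lines as an array ref
        my ($subject) = grep { /^Subject:/i } @$head;
        if (defined $subject) {
            $subject =~ s/\s+$//;
            print "$subject\n";
        }
    }

    $nntp->quit;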

    Newsgroups

    Looking through the list of Usenet newsgroups, you may wonder why, alongside serious areas such as alt.winsock, there are others with names like alt.adjective.noun.noun.noun.verb and alt.swedish.chef.bork.bork.bork.

    In the case of the alt hierarchy there is no mechanism controlling the naming of such groups. But elsewhere, for example in the comp groups about computer-related topics, there is a strict procedure that must be followed before the machinery for the creation of a new newsgroup can be set in motion.

    The first stage is a Call For Discussion (CFD), where everyone is given the opportunity to air their thoughts on the advisability or otherwise of a new newsgroup, what it should be called and where it should be placed in the Usenet hierarchy.

    After a discussion period, if there is a general consensus that the idea of a new newsgroup is worth pursuing, a more formal Call For Votes (CFV) is instituted.

    Sometimes the call goes out twice to give as many people as possible the opportunity to participate. If 100 more votes for than against are received, and at least two-thirds of those voting are in favour of the newsgroup, it is created; otherwise the proposal is dropped, and cannot be brought up again for at least six months.
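
    Expressed as code, the rule is simple enough. The Perl sketch below merely restates the two conditions described above; the vote totals are invented for illustration.

    #!/usr/bin/perl
    # The newsgroup creation rule restated as code: a proposal passes only if
    # it attracts at least 100 more "yes" votes than "no" votes, and at least
    # two-thirds of all votes cast are in favour. The vote totals are invented.
    use strict;
    use warnings;

    sub newsgroup_passes {
        my ($yes, $no) = @_;
        return 0 unless $yes + $no;                               # nobody voted
        return ($yes - $no >= 100) && ($yes >= 2 * ($yes + $no) / 3);
    }

    printf "230 for, 110 against: %s\n",
        newsgroup_passes(230, 110) ? "created" : "dropped";       # created (120 ahead, 68% in favour)
    printf "300 for, 160 against: %s\n",
        newsgroup_passes(300, 160) ? "created" : "dropped";       # dropped (only 65% in favour)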

    The Usenet Info Center Launch Pad contains references covering practical, technical and even sociological aspects of Usenet. The main site is at http://sunsite.unc.edu/usenet-i/home.html. From here are links to the Newsgroup Info Centre that can be found at http://sunsite.unc.edu/usenet-i/home-ngi.html. Here you can get a listing of Usenet newsgroups, most with short descriptions. You can download text versions from ftp://sunsite.unc.edu/pub/docs/about-the-net/usenet-info-center/report/txt/.

    The search engine at http://sunsite.unc.edu/usenet-i/search.html lets you look for keywords in the newsgroup names. The Usenet hierarchical organisation is laid bare at http://sunsite.unc.edu/usenet-i/hier-s/top.html.

    Links to all known archives of Usenet newsgroups can be found at http://www.pitt.edu/~grouprev/Usenet/Archive-List/newsgroup_archives.html. The coverage is rather spotty - e.g. among the UK newsgroup hierarchy the only one of interest that seems to be archived is uk.telecom at ftp://src.doc.ic.ac.uk/usenet/uk.telecom/.

    There is a news filtering service at http://woodstock.standford.edu:2000/ and the last five days' newsfeed at http://ibd.ar.com/News/About.html.

    A Usenet pot-pourri can be found at http://www.duke.edu/~mg/usenet/.

    The Netscan site at http://netscan.sscnet.ucla.edu/, funded by Microsoft, is devoted to the study and understanding of the Usenet phenomenon. It offers a FAQ, a glossary, a collection of papers about the sociology of cyberspace and the ability to call up analyses of particular newsgroups.

    Drop into your local on the Internet

    Other discussions of the Usenet newsgroups have tended to treat the area as if there were just one Usenet, all of whose messages circulated around the world. Although this is true of the bulk of it, there is another very important element that is purely local. Because of the way Usenet newsgroups are created and passed on, it is quite possible to have postings that are only seen by a limited subset of Internet users. In particular, there are a number of groups that are only circulated within the UK, and even then not all service providers offer all of them.

    In fact there is an entire Usenet hierarchy called uk, which takes its place alongside ones like comp, biz, alt etc (if you use an advanced browser like Netscape you can see this structure explicitly by entering the URL news:uk.*). A good place to start is in either uk.announce or uk.events: these are useful for keeping up with things generally, but are rather unfocused in their subject-matter.

    Better, perhaps, are the newsgroups devoted to the Internet, but with a UK tinge. You might begin with uk.net, which has plenty of information about what's happening online (most of it relevant). The newsgroup uk.net.news tends to be rather noisy (that is, the signal-to-noise ratio is low, with a good deal of pointless posting) but with nuggets of important news hidden away within it. On the other hand you will probably want to give uk.net.maps a miss - it seems to be full of spams (the Usenet's equivalent of junk mail) and wibble (yet more noise).

    The fact that many UK newsgroups are full of the latter - and therefore hardly worth bothering with - has not escaped people's notice. There is now a concerted effort to clear up the illogical mess that the uk.* Usenet areas have become. One manifestation of this is a renaming of newsgroups to bring them more into line with the main Usenet hierarchies - well-established and functioning reasonably efficiently.

    For example, the singleton Usenet area uk.sun is becoming uk.comp.sys.sun as part of a move to create a more formal uk.comp.* along the lines of the comp.* area found globally. Other newsgroups that can be found here are uk.comp.lang.lisp (quiet), uk.comp.os.win95 (very lively, with postings that are particularly to the point) and uk.comp.training. Many others will doubtless follow as more and more UK Internet users discover these local Usenet areas and start to press for their own interests to be covered.

    Another product of this new order is the rebaptised uk.org.women-in-comp (discussing women in computing), while other computer areas include a dizzying array pertaining to the British Computer Society, including such specialist newsgroups as uk.bcs.internet, uk.bcs.courses, uk.bcs.events, and even a clump of regional ones under the umbrella of uk.bcs.local.* for Croydon, East Anglia, Scarborough, Skye, Wolverhampton, Swindon and Yorkshire - although in all of these a deathly cyber-silence seems to reign most of the time.

    Much more lively are the areas where jobs wanted and offered are posted: uk.jobs.wanted, uk.jobs.offered and uk.jobs.contract. The high-technology field predominates, and the same is true in the uk.consultants newsgroup. Also worth looking at are some of the areas devoted to particular subjects, such as uk.gov.local, uk.legal and uk.telecom.

    Finally, it is worth noting that there are even more specialised local Usenet groups around, the most important of which are those in the demon.* hierarchy. As its name suggests, these are designed principally for users of Demon's Internet service. As well as general areas such as demon.announce, there are some excellent technical resources to be found in the demon.ip.support.* area, which includes newsgroups for Macintosh, OS/2, PCs, Windows 95 and Winsock.

    How to access Usenet newsgroups using Internet E-mail

    An interesting E-mail service that seems to be quite reliable allows you to access Usenet newsgroups, albeit indirectly. You can use a kind of news filtering service that will automatically search through most Usenet newsgroups for the occurrence of keywords set by you. Although this does not let you browse the huge range of messages that pour through Usenet every day, it enables you to track particular subjects very easily - something that in many ways is much more convenient for business use.

    Setting up the service is simple. For example, suppose you wanted to be kept informed about the new Web browser technology from Sun called Hot Java. By sending the message

    subscribe hot java expire 7

    to the address netnews@reference.com you would be requesting this service (now run by Inreference, formerly the NetNews service at Stanford University, netnews@db.stanford.edu) to filter its news feed for the words 'hot java' for seven days (the expire 7 command). First, you will receive confirmation that your request has been accepted, and then for the next seven days an E-mail message will be sent to you containing short excerpts from Usenet postings where the keywords have been found. These excerpts take the following form:

    Article: comp.os.linux.setup.12474
    Message-ID: <3pl770$60c@pandora.enet.net>
    From: eric@pandora.enet.net (Eric J. Schwertfeger)
    Subject: Re: Java from Sun. Will anybody port it to Linux ?
    Score: 100
    First 20 lines:
     swampler@noao.edu wrote:
     : |> Users will need to run a Java-based browser. But they won't need to
     : |> have viewers and players to run different media formats. Instead,
     : |> Java will dynamically download whatever software the user
     :              ^^^^^^^^^^^^^^^^^^^^
     : |> needs to view a page.

    Clearly, if the keywords are not found you will not receive any excerpts. Less obvious may be the fact that the news filtering is quite indiscriminate in the places it looks for your keywords. For example, in the case above, references to Javan coffee will also be flagged as meeting (at least partially) the selection criteria.

    Once you have found a posting that seems relevant you can request it. To do this you need to take the article reference that precedes each excerpt: in the above example, it is comp.os.linux.setup.12474. You would then send the following message

    get comp.os.linux.setup.12474

    to netnews@reference.com as before. Provided you send this message soon enough - remember that newsfeeds are not permanently archived, and so are held in temporary storage for only a few days - you should receive the full-text version of the posting in question.
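
    Such requests can equally be sent from a script rather than a mail client. The sketch below uses Perl's Net::SMTP module; the outgoing mail server and the sender's address are placeholders for your own.

    #!/usr/bin/perl
    # A minimal sketch of sending the "get" request above from a script. The
    # mail server and sender address are placeholders for your own account.
    use strict;
    use warnings;
    use Net::SMTP;

    my $smtp = Net::SMTP->new('mail.example.co.uk')
        or die "Cannot reach the mail server\n";

    $smtp->mail('you@example.co.uk');
    $smtp->to('netnews@reference.com');
    $smtp->data();
    $smtp->datasend("To: netnews\@reference.com\n");
    $smtp->datasend("Subject: article request\n");
    $smtp->datasend("\n");
    $smtp->datasend("get comp.os.linux.setup.12474\n");    # the command goes in the message body
    $smtp->dataend();
    $smtp->quit;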

    To find out more about this service, and in particular about some of its more advanced options, send the message

    help

    to netnews@reference.com.

    Novell

    By Q3 1998, leading computer software companies had been forced to re-invent themselves to survive in the new Internet landscape. As a company whose very raison d'etre was networking, but based on its own protocols, Novell was probably more affected than anybody by the rise of TCP/IP as the lingua franca of the computer world.

    In the early days of the Internet's appearance in business, Novell not unreasonably stuck to its own standards based on the SPX/IPX protocols. Unfortunately, TCP/IP was not some conventional rival that would eventually go away, but represented a fundamental shift in the way computers - and businesses - worked together.

    As a result, the former undisputed networking leader Novell became increasingly marginalised in a world that was flocking to TCP/IP. In particular, Microsoft's initial, almost accidental support for these protocols in Windows NT proved an important weapon in its efforts to displace Netware from companies.

    Advantage

    While Novell made only half-hearted efforts to reach an accommodation with the Internet through its Intranetware, Microsoft pressed home its advantage by adding an impressive roster of TCP/IP-based capabilities to Windows NT 4.

    Clearly, if Novell was even to survive as a company, it needed to adopt a radically different approach. Under the guidance of a new chief executive officer, the former Sun executive Eric Schmidt, Novell took the once unthinkable step of offering IP natively (that is, not encapsulated in SPX/IPX), and of re-fashioning both the company and its main product in the light of the Internet.

    The result is Netware 5, released slightly ahead of schedule in Q3 1998. It was undoubtedly the most important launch in Novell's history, and would effectively make or break the company.

    Magic

    After the preliminaries have been carried out from the DOS prompt, the screen bursts into a full-colour graphics mode reminiscent of Microsoft's Windows installations. This magic is possible thanks to Novell's other Damascene conversion, to Java. Helped no doubt by the previous work experience of Schmidt, Novell has got the Java religion, and has stated that it intends to provide the best Java server platform.

    While Novell's adoption of TCP/IP was inevitable, its fervour for Java is something of a surprise. But it provides Novell with a user-friendly front-end for controlling certain aspects of the Netware server through a Java application called Console One, though one that is very slow in its first incarnation, even on a fast PC. More importantly, Java establishes its networking product as a viable server platform that has some hope of competing against Windows NT.

    It does this in an extremely efficient way, by exploiting Java's cross-platform capabilities. Now, whenever Java applications are created, they will run on Novell's Java Virtual Machine, regardless of which platform they were originally written for.

    In effect, Novell can tap into the growing pool of Java programming talent for its server applications. Developers gain too, since they need no longer gamble on Netware's success. Written in Java, their programs will run on other platforms too.

    Of course, in many respects Novell's Java strategy is just icing on the cake. Key to its success or failure is Novell Directory Services (NDS). This is an area where Novell has always led, but Microsoft's future Active Directory could be a strong challenger. Netscape also offers its Lightweight Directory Access Protocol-based Directory Server.

    Novell's advantage is that its updated NDS is available now, whereas Active Directory will not arrive until Windows 2000, and the future of Netscape's servers is rather uncertain following the company's acquisition by AOL.

    IT directors will hardly want to run their entire company on an untested directory system while coping with year 2000 and implementing EMU systems, so Windows 2000 and Active Directory will not be real contenders until about 2001.

    This gives Novell a unique opportunity to establish itself as a major player in the intranet and extranet worlds.

    Offline Readers

    Offline readers in themselves are not a new idea. Members of the UK online service CIX (Compulink Information eXchange) have been using programs such as Ameol and Wigwam for years to download their E-mail and the conference postings.

    In the Internet world, the Usenet newsreader Free Agent was rightly hailed as a breakthrough in ease-of-use and convenience. Details of the latest version can be found at http://www.forteinc.com/agent/agent.htm. The excellent Free Agent is still available free of charge from http://www.forteinc.com/agent/freagent.htm.

    The latest incarnations of Ameol (version 2) and Wigwam (now called Virtual Access), marry CIX access with offline Usenet and Internet E-mail readers. Virtual Access further adds offline capabilities for CompuServe. Internet users will also be pleased to learn that both programs support a telnet option: messages can be downloaded from CIX (and CompuServe in the case of Virtual Access) using the Internet to connect, rather than a direct telephone link.

    Like CIX itself, both Ameol and Virtual Access are UK products and highly recommended: Ameol 2 is particularly impressive in its handling of CIX, while Virtual Access is a natural for those with both CIX and CompuServe accounts.

    Reading E-mail and conference messages offline is fairly natural, but the benefits of using offline Web readers may be less clear, since by its very nature the Web is interactive. One application is for presentations. If during the course of a talk or demonstration you need to show real Web pages (and not just those you have written yourself and hold locally), accessing the Internet to retrieve them during the presentation itself can be nerve-racking to say the least.

    Far better to retrieve them before and then view them offline. Another application is where the information held at sites is updated on a regular basis. In this case it would be helpful to be able to automate the process of retrieval, with the updated version of the relevant pages downloaded to your machine (perhaps during the night) ready for you to peruse at your leisure. Mirroring these two kinds of applications are two classes of offline Web reader.

    The first kind generally makes use of the fact that Netscape (and Internet Explorer, although this is not yet widely supported by these programs) keeps a store of all the pages visited, plus all their images, held in what is called the cache. There are a number of products that use the cache to reconstitute the pages you have visited and allow you to read them offline. In other words, you can browse quickly around a site online, finding pages that look at first glance interesting, and then use a cache manager to recreate those pages and their links offline.

    Interestingly, this area seems dominated by European software. For my money, the best is Unmozify (a play on the earlier use of .moz files to store cached images, themselves deriving from Netscape's beta code-name of Mozilla). This £15 shareware program offers a simple but neat interface that allows you to select which sites in the cache are reconstituted. Another UK program is Secret Agent which costs £21.23, while one by a German author is Netscape Cache Explorer which costs $20 (£13).

    These cache managers are all rather passive in that they simply re-use information already present on your computer. Other offline readers go out to the Internet to retrieve requested Web documents automatically, often at preset times.

    One of the first of these programs was FreeLoader. As its name hints, it is free, but this is not such a good deal as it sounds. The program is paid for by the advertising that it carries, and I found it very restrictive in the way it works (it basically takes over Netscape) and worrying in that it seemed to be connecting to the Internet and downloading (and possibly uploading) in a way outside my control.

    Fortunately there is an alternative. NetAttache Light is both free and excellent, a standalone program offering very fast downloads of sites (including those with frames, JavaScript and Netscape plugins) and the ability to cope with passwords. Missing from this software is any kind of scheduling. This is present in the NetAttache Pro version, which is still in beta. In addition, this gives you data filtering - the ability to spot changes at Web sites or to pull in only those pages that meet certain criteria - and a front-end to simultaneous queries of all the main search engines. The price is likely to be $49.95 (£33.08).

    The Pro version adopts an interesting client-server approach whereby pages are displayed using Netscape or Internet Explorer operating with the loopback address 127.0.0.1. The same approach is adopted by OM Express. OM Express costs $29.95, which is one disadvantage with respect to NetAttache; another is the need to re-boot your PC on installation and the fact that it imposes the loopback approach permanently on Netscape, which means you need to remove it by hand.
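
    To make the loopback idea concrete, the sketch below runs a tiny local Web server on 127.0.0.1 that hands previously saved pages back to a browser. It uses Perl's HTTP::Daemon module and is not how any of the products above are built; the port number and cache directory are arbitrary choices for illustration.

    #!/usr/bin/perl
    # A minimal sketch of the loopback approach: a small local Web server on
    # 127.0.0.1 serves pages saved earlier while online, so a browser pointed
    # at it can be used offline. Port and cache directory are arbitrary.
    use strict;
    use warnings;
    use HTTP::Daemon;
    use HTTP::Status;

    my $cache_dir = './saved-pages';

    my $daemon = HTTP::Daemon->new(LocalAddr => '127.0.0.1', LocalPort => 8080)
        or die "Cannot start the local server\n";
    print 'Point your browser at ', $daemon->url, "\n";

    while (my $conn = $daemon->accept) {
        while (my $request = $conn->get_request) {
            # Map the requested path onto a file saved earlier (crude, for brevity).
            (my $file = $request->uri->path) =~ s{[^\w./-]}{}g;
            $file = '/index.html' if $file eq '/' or $file eq '';
            if (-f "$cache_dir$file") {
                $conn->send_file_response("$cache_dir$file");
            }
            else {
                $conn->send_error(RC_NOT_FOUND);
            }
        }
        $conn->close;
    }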

    Also using loopback is NearSite, produced by the same UK company that wrote the Unmozify program recommended above. Though still in beta-testing, this has a good set of features, including the ability to link several PCs to the Internet using a single dial-up connection, and is worth keeping an eye on.

    Another early program in this area was Milktruck. This has now been bought by Travelling Software and given the more mundane name WebEx. It costs $29.95, and a trial version can be downloaded. Like FreeLoader, it too takes over Netscape, though less intrusively.

    Browser Buddy is a standalone product and at $39 costs more without really offering any extra functionality. The same is true for WebWhacker - another pioneer in this area - costing $69.95. Its main point of interest is that it is one of the few offline readers available in a Macintosh version too.

    Metz Netriever is more noteworthy as a novelty than a serious tool. It allows you to turn downloaded Web pages into desktop wallpaper and screensavers (rather like Pointcast), and it will also generate a sequence of Web pages, following a link at random on each successive page. Still, rather expensive at $39.

    Another oddity is Alpha Connect which allows data retrieved from Web pages to be imported into packages such as Word or Excel. Probably of more use for intranet rather than Internet users, I found it overly complex.

    OpenDoc

    As the battle hots up between Sun/Netscape's Java approach and Microsoft's OCX/ActiveX philosophy, it is all too easy to overlook the outsider in this race.

    CI Labs' OpenDoc is a cross-platform, object-oriented approach to building applications that in many ways is superior to both Java and ActiveX; what it lacks, though, is the clout of either. OpenDoc is mainly being pushed by Apple as a key part of its next generation systems. Unfortunately, Apple's current woes are tending to distract attention from an extremely interesting product that desperately needs to be better known, since mind-share is almost more important than technology in the current turbulent world of IT.

    However, in one respect, the main cause of this turbulence - the Internet - may well prove one of OpenDoc's, and Apple's, best allies. For alongside early versions of OpenDoc itself, there is now a suite of Internet applications written using it that acts as a showcase of its features.

    Called Cyberdog, the suite comprises a Web browser and clients for E-mail, Usenet, Gopher, FTP and telnet, along with utilities for storing and organising Internet information and various security facilities. All these tools employ a common interface.

    And because of the way the component technology that is central to OpenDoc works, you can use Cyberdog "parts" to create other applications with embedded Internet function.

    More information can be found at http://www.cyberdog.apple.com/.

    Open Document Management Application

    More or less every technological industry has been forced to re-invent itself under the impact of the Internet, but few have had their entire raison d'être called into question more than the document management sector. Since documents lie at the heart of the Web, which has become the new standard for information storage and retrieval, those producing classical document management tools have had to reach an accommodation with this enormous cuckoo in their nest.

    The first step was the open document management application programming interface (ODMA). This was designed to allow desktop applications, including Web browsers, to access existing document management systems.

    ODMA is to do with the basic nuts and bolts of linking document management systems with other software. More recently, the same umbrella organisation that defined the interface, the Association for Information & Image Management, has come up with something deeper.

    The Document Management Alliance enables access to, and searches in, documents stored in multi-supplier systems. In other words, it does for classical document retrieval systems what the Web does for HTML documents. However, the alliance's specification adds a few more advanced features that reflect its provenance. These include versioning and check in/out behaviour.

    If for nothing else, the alliance's approach is interesting as an indication of some of the more advanced capabilities that will doubtless be added to the Web in due course; and that will marginalise classical document management software yet further.

    Open Source

    The Internet began as a medium for the free exchange of information between academics, and despite increasing commercialisation it has never lost this idealistic element.

    Nearly half the Internet runs the free Apache Web Server, and the free Linux operating system has a user base numbered in millions. Other key tools are also free. The Perl language (homepage http://www.perl.com/ ) is one of the most popular ways of writing server-side scripts. Sendmail ( http://www.sendmail.org/ ), a popular E-mail server program, and BIND, the main domain name system server software ( see http://www.isc.org/bind.html ), are both free. Added to this list is Netscape Communicator (at http://www.mozilla.org/ ).

    This approach to software has been dubbed Open Source ( http://www.opensource.org/osd.html ) by its leading theorist, Eric Raymond. His analysis of how the Open Source movement works - and why it is so successful - called The Cathedral And The Bazaar (available at http://sagan.earthspace.net/~esr/writings/cathedral-bazaar/ ), apparently played an important part in convincing Netscape to take the unprecedented step of opening up its software development process. This seems to have been a good move. Because US export regulations prevented Netscape from releasing all the cryptographic code in its browser, the Australian-based Mozilla Crypto team ( http://mozilla-crypto.ssleay.org/index.html ) succeeded in writing its own version within 15 hours of Netscape releasing its code. Another group is working on Jazilla, a Java-based version of the browser ( http://mozilla.alsutton.com/jazilla/ ). Netscape has also acquired XML capabilities overnight through James Clark's expat program ( http://www.jclark.com/xml/expat.html ).

    The Open Source movement is gaining other adherents. Corel said it would release all the code for a toolset for the forthcoming Linux-based network computer.

    Open Trading Protocol

    The gradual application of the Internet in commerce has taken place through an increasing formalisation of business practices in terms of computer protocols. First was the basic hypertext transport protocol (HTTP). This allowed Web pages to present information about goods and services for sale, and to receive payment in the form of credit card details, albeit without any kind of security.

    Then came Secure Sockets Layer (SSL), Netscape's encrypted transmission protocol. Through the use of public key cryptography, credit card details and other sensitive information could be encrypted during transmission to ensure that electronic eavesdroppers would be unable to gain access to the content of messages.

    SSL made the transport secure, but did not address the issue of what happened to credit card details at the merchant's end. The idea behind the Secure Electronic Transaction protocol (SET) is that credit card details are shielded from the merchant too. But even with SET, many important and everyday aspects of commercial transactions can only be implemented through ad hoc extensions.

    This has led a heavyweight group of financial and computer companies to draw up the Open Trading Protocol (OTP), based on XML files. Just as SET does not replace SSL, nor does SSL replace HTTP, so OTP is designed to incorporate and extend earlier standards rather than do away with them.

    The idea is to provide a general open framework for conducting commerce on the Internet, and to ensure compatibility between different suppliers' approaches at all stages and for all elements of purchase, payment and delivery.

    Optimal Asymmetric Encryption Padding

    Real-life cryptography is a constant game of cat and mouse better than any fictional spy story. Crackers try to decipher encrypted messages using every trick, notably through the exploitation of vulnerabilities in the encryption techniques themselves.

    Meanwhile, cryptologists fight back by keeping their methods as tight as possible and plugging any holes when they emerge. For example, in the Public-Key Cryptography Standard number 1 (PKCS 1) vulnerability, the trick is to send millions of messages to a server employing this kind of encryption. By observing the errors that are sent back, enough information can be deduced to crack a real message.

    This requires a particular kind of weakness in the encryption technique, one that PKCS 1 possesses. Specifically, it is possible to construct a cryptographic message - known as ciphertext - that is valid according to PKCS 1 rules, without knowing the corresponding hidden cleartext message.

    This apparently arcane defect is enough of a loophole for clever crackers to pick apart a server's security. To get around this, the cleartext message is scrambled in a particular way before it is encrypted using the PKCS 1 technique. This means it is not feasible to construct valid ciphertext without knowing the cleartext.

    This technique is known officially as Optimal Asymmetric Encryption Padding, and will be included in the next version of RSA's PKCS 1 in order to plug a vulnerability that has always been present, but which until now was not identified as maliciously exploitable.
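
    The full details are beyond the scope of this article, but the general shape of the technique can be sketched. The Perl fragment below is a much-simplified illustration of OAEP-style padding, not the complete PKCS 1 version 2 encoding: a random seed and the message are mixed together through hash-based masks before the block is handed to RSA, so identical cleartexts never produce identical padded blocks.

    #!/usr/bin/perl
    # A much-simplified illustration of OAEP-style padding (not the full
    # PKCS 1 v2 encoding): a random seed and the message are combined through
    # hash-based masks, so the same cleartext never pads to the same block
    # twice. Real code would use a cryptographic random source.
    use strict;
    use warnings;
    use Digest::SHA qw(sha1);

    my $HASH_LEN = 20;    # SHA-1 produces 20-byte digests

    # MGF1-style mask generation: stretch a seed into $len bytes of mask.
    sub mgf1 {
        my ($seed, $len) = @_;
        my ($mask, $counter) = ('', 0);
        $mask .= sha1($seed . pack('N', $counter++)) while length($mask) < $len;
        return substr($mask, 0, $len);
    }

    sub oaep_pad {
        my ($message, $block_len) = @_;
        my $pad_len = $block_len - length($message) - 2 * $HASH_LEN - 1;
        die "Message too long for the block\n" if $pad_len < 0;

        my $seed = join '', map { chr int rand 256 } 1 .. $HASH_LEN;   # illustration only
        my $db   = sha1('') . ("\0" x $pad_len) . "\x01" . $message;

        my $masked_db   = $db   ^ mgf1($seed, length $db);       # hide the message
        my $masked_seed = $seed ^ mgf1($masked_db, $HASH_LEN);   # hide the seed
        return $masked_seed . $masked_db;    # this block is what then gets RSA-encrypted
    }

    # The same message padded twice gives two quite different blocks.
    print unpack('H*', oaep_pad('a secret order', 128)), "\n";
    print unpack('H*', oaep_pad('a secret order', 128)), "\n";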

    Palmtops

    The move to ever-smaller computers, from mainframe through mini to micro and on to portables and personal digital assistants (PDAs), has been obvious enough. But an equally important trend has been that of increasingly widespread and equal connectivity.

    Where mainframes sported just dumb terminals, minis often acted as servers to intelligent client workstations. Later, proliferating PCs were hooked together to form local communities of peer-based networks. This movement has culminated in the global Internet, where all nodes are connected together, equal and equally accessible, wherever they are in the world.

    But there is a tension between these two historical trends. As more portable computing devices are deployed in companies, so the need to provide them with Internet connectivity becomes more pressing. But simply hooking them into existing corporate networks with conventional wire-based network devices nullifies their whole purpose.

    Too slow

    One solution has been around for years: the IrDA standard, which uses infrared links between devices. But it is generally too slow for high-speed Internet links, and requires a clear line of sight in order to function. What is needed is something that will give portable devices fast access to networks from any location - from pockets, briefcases or the middle of nowhere.

    Radio-based wireless connectivity offers just that. Within a company's buildings it can be provided locally using internal base stations. Outside, it will require public infrastructure and some of the obvious candidates to provide this are the operators of mobile phone networks.

    Lightweight

    Companies active in this area have been trying to marry lightweight computers with mobile connectivity for some years.

    For example, Nokia has produced its Communicator, a mobile phone/PDA. It has also joined the Symbian consortium, along with Ericsson, Motorola and Psion.

    The aim is to create a platform for wireless devices based around Psion's Epoc operating system, which promises to be the main challenger to Microsoft Windows CE.

    Microsoft is not sitting back, and has formed WirelessKnowledge with the US mobile phone and Internet firm Qualcomm. To confuse the issue, Qualcomm has also said it will be coming out with its own wireless PDA system called the pdQ, based on 3Com's palm organisers.

    This confused state of affairs testifies both to the vitality of this sector and to its immaturity. It also means, of course, that early adopters are faced with a typically difficult choice and the prospect of backing a losing technology.

    Although 1999/2000 will probably see the first serious efforts to develop this market, it will be years before standards and benchmark products are established.

    By then, another key factor will be helping to drive the deployment of wireless data connectivity: the arrival of IPv6.

    As previous Getting Wired articles have explained, the current version of the Internet Protocol, IPv4, is close to exhaustion in terms of available addresses.

    In part, this is a function of the rapid growth of the Internet, but it is also down to the inefficient way addresses have been given out. In any case, it will soon be imperative to move on to an updated version of the Internet Protocol, confusingly called IPv6.

    Alongside many other important additions to the current standard, IPv6 expands the range of Internet addresses enormously - to 340,282,366,920,938,463,463,374,607,431,768,211,456 of them, in fact.
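
    That figure is not plucked from the air: IPv6 addresses are 128 bits long (against IPv4's 32), so the address space is simply two raised to the 128th power, as the one-liner below confirms.

    #!/usr/bin/perl
    # IPv6 addresses are 128 bits long, so the address space is 2 to the power 128.
    use strict;
    use warnings;
    use Math::BigInt;

    print Math::BigInt->new(2)->bpow(128)->bstr, "\n";
    # prints 340282366920938463463374607431768211456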

    As a result, it will be possible to allocate Internet addresses not just to computers, but to more or less every suitably-adapted electrical device on the planet - from lamp bulbs up.

    Hooking billions of these devices into the Internet or local intranets using conventional cabling is clearly not feasible.

    Wireless connectivity, on the other hand, is ideal, and will doubtless become the principal way of accessing this next generation of Internet-enabled devices.

    Wireless technologies deserve better

    The problem, as so often, is one of conflicting standards. For example, in Europe the Digital Enhanced Cordless Telecommunications (DECT) standard is becoming well-established for wireless connectivity in the home, but in the US a new consortium called HomeRF is to the fore.

    Certainly the latter's roster of supporters is impressive, and its technology (based on DECT) looks sensible. But as its name suggests, HomeRF is aimed mainly at the home environment, and at connecting PCs to other intelligent devices.

    A more general standard is under development by the Bluetooth group. This too has a strong list of supporters, many of which are also members of HomeRF. Both HomeRF and Bluetooth are designed to connect devices up to a maximum of about 100 metres: enough to supply Internet connectivity for an office, but hardly sufficient for workers who may wander beyond a company's four walls.

    The likely contenders to meet this need for national or even international wireless Internet connectivity are the current operators of the mobile networks.

    Some, such as Orange in the UK, already offer Internet connections via mobile phones. But there are serious problems with this approach, notably very low basic bandwidth (currently 9.6 kilobits per second) and interface issues.

    Essentially, neither the mobile phone system nor Internet services such as the World-Wide Web were designed with the other in mind. As a result, making them work together is, at the very best, a marriage of convenience.

    What is needed is a fundamental re-working of both wireless connectivity and the relevant network standards in order to make the use of the Internet on the move rather more practical than it is now.

    In the vanguard of this rapprochement is the Wireless Application Protocol (WAP).

    There are two basic ideas at the heart of WAP. First, to use a special proxy server to translate between WAP and the standard HTTP and TCP protocols.

    The second is to use a special application of XML called the Wireless Markup Language (WML) and an associated WMLScript (based on Javascript) to create the Web pages.

    In fact, the basic metaphor is the Web card, with cards viewed one at a time on the mobile's display device using what is termed a microbrowser - a Web browser with a very small footprint.
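
    As an illustration of the card idea, the sketch below is a small Perl CGI script returning a one-card WML deck of the sort a microbrowser would display; the card's contents and link are invented for the example.

    #!/usr/bin/perl
    # A sketch of the card metaphor: a CGI script returning a one-card WML
    # deck for a WAP microbrowser. The card contents and link are invented.
    use strict;
    use warnings;

    print "Content-type: text/vnd.wap.wml\n\n";
    print <<~'END_OF_DECK';
        <?xml version="1.0"?>
        <!DOCTYPE wml PUBLIC "-//WAPFORUM//DTD WML 1.1//EN"
          "http://www.wapforum.org/DTD/wml_1.1.xml">
        <wml>
          <card id="headlines" title="Headlines">
            <p>Orange to trial WAP services</p>
            <p><a href="story1.wml">More...</a></p>
          </card>
        </wml>
        END_OF_DECK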

    This approach owes much to an earlier mark-up language called the Handheld Device Markup Language (HDML) devised by the Unwired Planet company.

    The latter has put together an excellent introduction to WAP and wireless Internet use.

    Unwired Planet has announced that Orange is to start trials with WAP using its products and has licensed its microbrowser to many of the key mobile phone manufacturers.

    WAP works with any of the many competing "air interfaces" - the different ways of transmitting wireless signals. This is a sensible move given the current incompatible standards such as GSM, dominant in Europe, and CDMA, widely used in the US.

    Fortunately, the third generation wireless standard currently under development (officially IMT-2000) is being developed on the basis of a "family of systems" concept, which should at least help bring a little order to this chaos and drive the uptake of high-speed global wireless Internet connectivity.

    Passwords

    One of the central problems for those running sites on the Internet is making sure that users are who they say they are. The solution almost universally adopted, independently of the kind of service involved, computer platform used or site location, is to use a password.

    Traditionally a password consists of a single, special word chosen by the user, agreed with the manager of the system it is to work with, and thereafter recognised by the system as the identifying sign of the user. But it is now accepted that using any kind of 'normal' word - that is, one found in the dictionary - is foolhardy. Hackers often attempt to break into Internet systems by trying thousands of dictionary words as passwords - with frequent success.

    However, forcing users to choose a completely meaningless password is almost certainly doomed to failure: few people can remember such concatenations, and hence resort to jotting down the string of characters on a piece of paper, often kept all too conveniently close to the computer in question.

    A better solution is to use either a combination of words, or some kind of mnemonic. For example, combining two simple words with punctuation (tip:top) is one way to block dictionary searches. Another is to take the initial letters of an easily remembered phrase, preferably with upper and lower case letters: for example 'iIwyIwdt' derived from 'if I were you, I wouldn't do that'. Although the concept of a password might seem trivial, the task of choosing one certainly isn't.
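
    The mnemonic trick is easy enough to automate. The Perl fragment below simply takes the first letter of each word in a phrase, preserving upper and lower case.

    #!/usr/bin/perl
    # The mnemonic approach above as a few lines of Perl: take the initial
    # letter of each word in a memorable phrase, keeping its case intact.
    use strict;
    use warnings;

    sub mnemonic_password {
        my ($phrase) = @_;
        return join '', map { substr($_, 0, 1) } split ' ', $phrase;
    }

    print mnemonic_password("if I were you, I wouldn't do that"), "\n";    # iIwyIwdt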

    Performance Monitoring Software

    There is an irony at the heart of the rise of the Internet and intranets in business. Before, a host of separate and often incompatible systems were used to connect a company to the outside world (dedicated links, modems, fax) and to provide key business functions internally (terminal access to mainframes, client/server connections to minis, local area networks etc). Today, those have become increasingly concentrated in just two: the external Internet link, and the intranet.

    The benefits are clear and widely accepted. But there is also a corresponding downside. The single link both within and outside a company becomes a point of extreme vulnerability. If the main Internet connection is down or running slowly, the entire company is hamstrung - e-mails get delayed, access to key Web sites becomes slower. And without a fully-operational intranet, a company would probably cease functioning altogether.

    Against this background it is surprising how little information network managers have about these key elements. The problem is that traditional monitoring software has failed to keep up with the fundamental shift in the way networks are used.

    What is needed is a new generation of networking tools that are predicated on the use of TCP/IP as the foundation of business computing, and which measure how well Internet and intranet applications perform across the network. One pioneer in this area is the company Insoft, formerly VitalSigns.

    Its first product was Net Medic, a browser companion that provides information about the state of a user's Internet connection. But what IT managers require is something that gives a sense of the larger picture.

    Performance monitor

    This is precisely what Vitalanalysis is designed to do. By running agents on users' machines around an intranet that report to a central server, Vitalanalysis monitors the performance of all TCP/IP applications in terms of response times, delays, bottlenecks and failures.
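
    Insoft does not publish its agents' internals, but the kind of measurement involved is easy to picture. The sketch below times a single HTTP request using Perl's LWP module and flags anything slower than a notional service-level threshold; the URL and the two-second limit are examples, not values taken from the product.

    #!/usr/bin/perl
    # Not Insoft's agent - just a sketch of the kind of measurement such
    # agents make: time one HTTP request and flag any response slower than a
    # notional service-level threshold.
    use strict;
    use warnings;
    use LWP::UserAgent;
    use Time::HiRes qw(time);

    my $url       = 'http://intranet.example.co.uk/';
    my $sla_limit = 2.0;    # seconds

    my $agent = LWP::UserAgent->new(timeout => 30);
    my $start = time;
    my $reply = $agent->get($url);
    my $taken = time - $start;

    printf "%s: %.2f seconds (%s)%s\n",
        $url, $taken, $reply->status_line,
        $taken > $sla_limit ? ' - outside SLA' : '';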

    Through a Web-based console, managers can then view a "heat chart" - colour-based indications of where problems are occurring. The use of a browser as the interface to the information means that it can be viewed from anywhere on the Internet. Moreover, clicking on hot-spots in the heat chart - areas that are experiencing network problems - brings up an array of extremely detailed information that allows the IT department to understand what precisely is happening.

    One valuable function is the ability to enter service level agreement (SLA) figures (see Netspeak) as a benchmark against which performance is measured. As well as flagging up quickly areas where SLAs are not being met, this provides network managers with useful ammunition when seeking compensation from Internet service providers.

    Another interesting option is to use the agents resident on users' machine to simulate patterns of network usage. This could be useful for testing configurations or technologies before they are rolled out across a company, for example.

    Where Vitalanalysis provides the information needed to monitor the health of a company's Internet connection, the companion program Vitalhelp is designed to take the pulse of a corporate intranet. In particular, it helps locate users' problems by watching and analysing the data flow from their machines (using the same agent software as Vitalanalysis).

    Vitalhelp not only flags up problems, but offers solutions too. And if more information is required, the IT helpdesk can run further diagnostic tests across the network, again using the agent software.

    Insoft's products are some of the first of a new class of tools designed specifically to monitor the performance of TCP/IP-based applications.

    Once IT departments begin to realise just how vulnerable they are to the failure or under-performance of Internet and intranet connections, such monitoring software will become indispensable.

    Perl

    Perl stands for Practical Extraction and Report Language, and this describes its original function well: that of manipulating text and files and producing reports. It was created by Larry Wall for his own use; he then released it to the Usenet community. Since Perl was originally intimately bound up with Unix, the de facto operating system of the Internet, it soon took root and became one of the most popular tools there. It was aided in this by its portability: it is now available on most platforms.

    As well as its general use by Unix hackers, Perl is often met in the context of the World Wide Web. As a previous Netspeak explained, the basic capabilities of the HTML language were significantly extended when it became possible to run ancillary programs offering extra functionality. This was achieved using the Common Gateway Interface, or CGI.

    The CGI standard allows programs to be called in response to actions by the Web client user. More or less any programming language can be used that works with the CGI, but far and away the most popular in this context is Perl. Generally, the Perl scripts act as mediators between the Web server and other, specialised software such as databases.
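
    To make the mechanism concrete, here is a minimal sketch of a CGI script - written in Python purely for illustration, though as noted above Perl is the usual choice. The Web server hands the request details to the script in environment variables, and the script writes an HTTP response (headers, a blank line, then the body) to standard output; the 'name' parameter is an invented example.

        #!/usr/bin/env python3
        import os
        from urllib.parse import parse_qs

        # The Web server passes the query string in an environment variable.
        query = parse_qs(os.environ.get("QUERY_STRING", ""))
        name = query.get("name", ["world"])[0]      # hypothetical parameter

        # A CGI script replies by printing headers, a blank line, then the body.
        print("Content-Type: text/html")
        print()
        print(f"<html><body><p>Hello, {name}.</p></body></html>")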

    Although Perl is both powerful and popular (and free), it is being supplanted to some extent by a new approach, that of using APIs. These allow direct communication between the Web server and other applications, and are generally much faster and more efficient than using CGI. They are, however, platform-specific, whereas Perl scripts are completely portable.

    Perl occupies a special position as one of the oldest and most widely-used scripting tools. Its home page is a good place to start exploring. For example, there is a short explanation of what Perl is from Larry Wall, the man who invented the language. There is an explanation of how to get it - as an open source program it is completely free. Macintosh users may want to look at the MacPerl page.

    The main Perl documentation can be found online, and a list of FAQs is also available. The latest Perl news and information about Perl user groups can also be found online.

    Archive

    The key archive for Perl is CPAN (there are numerous local mirrors). If you use this URL you will be taken automatically to your nearest geographical site.

    The CPAN front-end gives you access to the extensive collection of programs and scripts. It is also worth noting that although Perl is an interpreted language, work is under way to create a compiler.

    Other useful resources include the Perl Clinic, which offers paid-for support for Perl, and the established Perl Journal.

    Publications

    One publisher closely associated with Perl is O'Reilly. Alongside its Perl Centre there is a comprehensive range of books.

    Among these it is worth singling out the famous Learning Perl (£21.95, ISBN 1-56592-284-0) and Programming Perl (£29.50, ISBN 1-56592-149-6) titles (known respectively as the Llama book and the Camel book because of the cover illustrations).

    More recently O'Reilly has come out with the Perl Resource Kit (for both Unix and Win32 - £109.95, ISBN 1-56592-370-7 and ISBN 1-56592-409-6 respectively). This includes four books and a CD-Rom containing the basic Perl distribution, Perl modules and (for the Win32 version) the Perl Debugger from ActiveState, a visual debugging environment for Perl.

    ActiveState is a software house specialising in Perl-related software. As well as the Perl Debugger it has come out with PerlEx, a plug-in for NT Web servers that improves the performance of Perl CGI scripts, and Perlscript, an ActiveX scripting engine that allows you to use Perl with any ActiveX scripting host.

    An interesting Perl product is Solutionsoft's Perl Builder. This is a full visual development environment for Windows 95/NT, and includes a CGI wizard and simulator. A 30-day demo version can be downloaded. Evoscript, from Evolution Online Systems, is a free, database-enabled Web application framework written in Perl.

    Another company - rather better-known than these - supporting Perl is Netscape. It has released PerLDap, a set of Perl modules for managing lightweight directory access protocol (LDap)-based directories.

    Finally, it is worth watching out for a possible rapprochement between Perl and XML. Perl's string-handling capabilities could be a perfect complement to XML's use of structured text. Work is currently under way to extend Perl for this purpose.

    Persistent URLs

    It would be an understatement to say that the Internet changes rapidly. But an unfortunate manifestation of this is that links to Web pages and other resources are frequently out of date. The problem is that the Uniform Resource Locator (URL) that identifies them is pointing to a place where the information is no longer found.

    The solution is to replace URLs with Uniform Resource Names (URNs), which are unchanging names for things, rather than mutable addresses. Work is under way to define a viable URN architecture, but until then a temporary solution has been put together.

    This involves the creation of Persistent URLs, or Purls. The idea is that unlike a URL, a Purl will never change, and so can be given out and used in the knowledge that it will always be valid.

    Unlike a URN, though, a Purl is just a normal URL. However, it does not refer directly to the location of the desired resource. Instead, it points to a reference held at an independent site, and the reference contains the real URL.

    This is then sent back to the machine attempting to retrieve the resource with a given Purl, which can use the URL to access the information directly. In other words, the Purl service acts as a re-director, enabling requests sent to the Purl address to be re-routed to the real location.

    The advantage of this system, of course, is that the underlying URL of the resource can be changed, but the Purl will remain the same. All that needs to be updated is the reference to the resource that is attached to the Purl. Updates to Purl data take place behind the scenes, and are something that users need never concern themselves with.
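
    The redirection mechanism can be sketched in a few lines of Python, assuming a purely hypothetical lookup table; a real Purl resolver naturally keeps its references in a maintained registry rather than a hard-coded dictionary.

        from http.server import BaseHTTPRequestHandler, HTTPServer

        # Hypothetical mapping from a persistent path to the current "real" URL.
        PURLS = {
            "/net/netspeak": "http://www.example.com/current/location/netspeak.html",
        }

        class PurlHandler(BaseHTTPRequestHandler):
            def do_GET(self):
                target = PURLS.get(self.path)
                if target is None:
                    self.send_error(404, "Unknown Purl")
                    return
                # The resolver does not serve the resource itself; it simply
                # tells the client where the resource currently lives.
                self.send_response(302)
                self.send_header("Location", target)
                self.end_headers()

        if __name__ == "__main__":
            HTTPServer(("localhost", 8000), PurlHandler).serve_forever()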

    Ping

    There are a number of tools that allow you to visualise the hops between your computer and the site you are visiting; these are generally known as Traceroute programs. Both Windows for Workgroups and Windows 95 come with such a tool if the full TCP/IP functionality has been installed: a DOS program called Tracert, found in the main Windows directory.

    Tracert is not the only Internet program hidden away in Windows, and certainly not the only low-level TCP/IP tool available. Another is Ping, mentioned in this week's main feature. Essentially Ping is a bare-bones version of Tracert: instead of mapping out the individual links in the Internet chain - the hops - Ping is concerned only with the end-point.

    Its output reports whether the host name or IP address under consideration is reachable from the machine initiating the Ping. It represents the simplest way of checking whether a machine is accessible over the Internet, without worrying about details that may obscure this basic issue.

    As such, Ping is an invaluable tool, both for advanced exploration of Internet connectivity - as in the research projects discussed in the feature above - and for simpler purposes. For example, if you are having problems connecting to a Web site it is often helpful to check whether it can be reached using a simple Ping.

    If it can, then the site is probably just busy. If it cannot, you know that there is a more fundamental problem - either with the site itself, or with some element of the Internet path to it.
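
    Such a check is easily scripted. The Python sketch below simply calls the operating system's own ping command; the "-c 1" flag (one packet) is the Unix form - Windows uses "-n 1" instead - and www.example.com stands in for whatever site you are testing.

        import subprocess
        import sys

        def is_reachable(host: str) -> bool:
            # Ask the system ping to send a single packet and report success.
            result = subprocess.run(["ping", "-c", "1", host],
                                    stdout=subprocess.DEVNULL,
                                    stderr=subprocess.DEVNULL)
            return result.returncode == 0

        if __name__ == "__main__":
            host = sys.argv[1] if len(sys.argv) > 1 else "www.example.com"
            print(host, "is reachable" if is_reachable(host) else "is not reachable")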

    Platform for Internet Content Selection

    Access control, for those worried about material available on the Internet (be it parents concerned about children, or IT managers about their users), basically hinges on the idea of special blocking software that effectively contains a list of banned sites or subjects.

    Although these applications work, they have a number of disadvantages. First, they require constant updates of banned materials - for which you may have to pay a subscription. Second, exactly what is blocked is down to the judgement of the company supplying the software.

    An interesting alternative is offered by the Platform for Internet Content Selection (Pics). This is being developed through the World Wide Web Consortium, and is backed by Netscape and Microsoft.

    The idea is that each site rates its content according to an extensible set of criteria, including things like sexual references, violence and profanity. Ordinary browser software such as Netscape Navigator or Microsoft's Internet Explorer would be set up to examine this rating and deny access to any sites that fail to meet the criteria chosen by the IT manager or parent.

    Of course, this system depends on sites rating themselves honestly, but those who abuse the system would easily be spotted and could be added to a blacklist. Such a list would be shorter than those currently used by commercial programs, and could be managed at each site.
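
    The precise Pics label syntax is not reproduced here; the Python sketch below, with invented ratings and thresholds, simply illustrates the underlying logic - a site declares values for each criterion, and the browser compares them against the limits chosen by the parent or IT manager before allowing access.

        # Hypothetical ratings declared by a site (higher means stronger content).
        site_rating = {"sexual references": 0, "violence": 2, "profanity": 1}

        # Thresholds chosen locally; anything above them is blocked.
        limits = {"sexual references": 0, "violence": 1, "profanity": 2}

        def allowed(rating, limits):
            return all(rating.get(category, 0) <= limit
                       for category, limit in limits.items())

        print("access granted" if allowed(site_rating, limits) else "access denied")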

    PNG

    During the most recent phase of the World Wide Web, two graphics formats have dominated: .gif and .jpg. Both have particular virtues and vices: .gif is lossless and good at line art, but limited in the number of colours, while .jpg is good at photos but lossy. The usefulness of .gif has been complicated by the tussle over the legal status of an algorithm it uses, making that format far less attractive.

    To get round the legal issues, and to address some of the shortcomings of both .gif and .jpg files, a new format has been devised. Called Portable Network Graphics (PNG, pronounced "ping"), it is now an official World Wide Web Consortium recommendation, which means that it is likely to be widely adopted. PNG has better compression than .gif files, and yet is lossless. It retains the ability of .gif to offer a progressive display, passing from low-resolution to high as more information arrives.

    New features include a full alpha channel, allowing general transparency masks, and gamma correction for brightness adjustment. PNG also offers a colour depth of 48 bits for colour, and 16 bits for grey-scale: a resolution that surpasses that of the human eye. Moreover, as its name suggests, it is also completely platform-independent, something that makes designing Web pages far simpler than before.

    One thing that is lost is the ability to create the kind of animated .gif effects that have become so popular recently. However, there are a number of alternative approaches possible, including yet another format called MNG, which builds on PNG and is designed specifically for animations.
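
    The format is also refreshingly easy to inspect. The Python sketch below checks a file for the eight-byte PNG signature and reads the image dimensions from the IHDR chunk, which by definition comes first in every PNG file; the filename is supplied on the command line.

        import struct
        import sys

        PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

        def png_dimensions(path: str):
            with open(path, "rb") as f:
                if f.read(8) != PNG_SIGNATURE:
                    raise ValueError("not a PNG file")
                f.read(4)                      # chunk length (13 for IHDR)
                if f.read(4) != b"IHDR":
                    raise ValueError("IHDR chunk not found")
                width, height = struct.unpack(">II", f.read(8))
                return width, height

        if __name__ == "__main__":
            print(png_dimensions(sys.argv[1]))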

    Point-to-Point Tunnelling Protocol

    Besides TCP/IP, Windows NT comes with the Internet Information Server free (alongside the FTP server that was always there). NT has also offered for some time RAS: Remote Access Service. This allows NT users to dial out, and other users with access privileges to dial in and reach any machine connected to it over the network.

    Microsoft extended this capability with Point-to-Point Tunnelling Protocol (PPTP) to allow the connection to take place via the Internet. That is, you would connect to your local POP and then access the Windows NT machine across the Internet. Once connected, it would then be possible to log on to a corporate network of which the Windows NT server machine forms part.

    Moreover, using this technique you are not restricted to just TCP/IP networks: other protocols such as Novell's IPX could also be used.

    This proposed standard, supported by 3Com, Ascend and US Robotics, will allow companies to create multi-protocol virtual private networks among users and networks wherever they may be in the world, without the need for highly expensive fixed communications links. The authentication and encryption features built in to RAS handle the security aspects, and the Internet handles the transport.

    PointCast

    Although the Internet is notable for the constant stream of new sites and new content, genuinely innovative ideas are rather thinner on the ground.

    It is perhaps not so surprising, then, that the arrival of the PointCast system has been greeted with such applause. In many respects it represents a genuinely new way of using the Internet, and adopts an approach quite different from the current World Wide Web model, though in no sense replacing it.

    PointCast is basically a news and information service that is delivered over the Internet to your desktop machine (currently Windows PCs, but a Macintosh client is promised later, as is a Netscape plug-in version). The service is free and supported by advertising - the parallels with commercial TV news are clear.

    But PointCast differs in that it is customisable - you can select exactly what you receive in each of the various Internet "channels" on offer - and also operates in the background. That is, if you are connected permanently to the Internet through a corporate network, the client software will unobtrusively update its news and information. You can then either view this from time to time, or, perhaps more interestingly, use the built-in screen-saver capability of PointCast to allow it to pop up on your PC during the time when you are not actively working with it.

    There are six channels, covering news, business (with almost real-time share information on user-selectable companies), industry information, weather, sports and lifestyle. The beta version (available from http://www.pointcast.com/) currently offers only the US stock market and weather, but in principle, local versions could easily be set up.

    The on-screen appearance is interesting, with the share information and indices displayed in a rolling ticker-tape along the bottom of the screen. You can double-click on a news headline to read the full story. In the screen-saver mode, various graphics effects and slide shows (slightly distracting, it has to be said) present headlines and share information.

    PointCast offers its information actively, rather than passively as news Web sites do.

    Of course, as with any other Internet technology, PointCast can also work just as well on an intranet as the Internet. To support this kind of use, the company is bringing out PointCast I-Server. This allows businesses to create private PointCast transmissions alongside the public Internet-delivered ones (the latter are downloaded to the PointCast server and cached there for more efficient internal redistribution).

    A company could use the PointCast server to transmit important or time-sensitive information across its corporate intranet to all of its online staff. This could be a useful way of overcoming the increasingly serious problem of E-mail overload, where people now receive so many E-mail messages that it is hard to pick out the really important ones.

    By allowing IT managers to control this private Net broadcasting medium, it will be possible to alert users to key messages and issues that they might otherwise have missed.

    More information about the PointCast I-server can be found at http://www.pointcast.com/beta/i_fact_sheet.html.

    Hitting Headliner puts users back in control

    In the year since its introduction in the summer of 1996, push has become one of the hottest Internet technologies, with many competing systems now available (see the listing at http://www.innergy.com/pushers.html for the main players). However, once the novelty wore off, it became clear that the first generation of push products suffered from numerous drawbacks.

    One of the most serious of these is the bandwidth they consume. With hundreds, possibly thousands, of corporate PCs constantly receiving information from the various Pointcast channel updates, a significant proportion of a company's connectivity can be eaten up. Pointcast clearly recognised this as a problem, and produced a proxy server that would cache channels and serve them up across an intranet.

    In an obvious attempt to blunt further the continuing criticism, the company has decided to make the proxy server freely available for version 2 of its software, now in beta. In fact there will be three server products.

    With version 2, Pointcast has also made various improvements to the client software, such as a separate ticker-tape display of information, more customisation and better integration with the Web (it comes with Internet Explorer built in). More importantly, it has adopted Microsoft's channel definition format - to be discussed in greater detail in a future column - to allow any Web page to be turned into a channel using the supplied Connections Builder software. A directory of such channels is maintained by Excite.

    This is a significant development, because it moves away from the previous very restrictive model whereby only a few high-powered publishers could afford to become channel providers. Now anyone with a Web site can use the Pointcast system to broadcast.

    However, this approach still has serious faults. Because the firm's main source of revenue is now advertising (since the servers will be free), on-screen ads are necessarily prominent, and therefore highly distracting. And although Microsoft's approach allows any number of channels to be created, this can only be done by the Web site owner. The balance of power is still with the channel provider, rather than the viewer, as it is with ordinary Web browsing.

    In this light, the Headliner product from Lanacom is particularly interesting. Superficially it is very like Pointcast. There are the usual channels (hundreds of them) which can be selected, customised and updated at preset intervals. There is even a ticker-tape display (though rather more sophisticated than Pointcast's).

    But where Headliner differs fundamentally is in what data it retrieves and how. The data is all derived from pre-existing Web pages, rather like the Pointcast channels using Microsoft's format. But no action is required on the part of the Web site owner. In other words, the user of Headliner is in control, rather than dependent on the action of Pointcast or information providers.

    Headliner employs a clever system of extracting the main elements from Web pages and then turning them into channels. And because it presents only the most succinct of summaries (along with a hyperlink) but no images it is extremely gentle on bandwidth. I was able to update a dozen Web page summaries in just a few seconds.
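
    Lanacom's extraction technology is its own, but the general idea of boiling an existing Web page down to headline-style summaries can be sketched quite simply: pull out the page's hyperlinks and their anchor text, and ignore everything else. In the Python sketch below, www.example.com is just a stand-in address.

        from html.parser import HTMLParser
        from urllib.request import urlopen

        class LinkExtractor(HTMLParser):
            def __init__(self):
                super().__init__()
                self.links = []          # (text, href) pairs
                self._href = None
                self._text = []

            def handle_starttag(self, tag, attrs):
                if tag == "a":
                    self._href = dict(attrs).get("href")
                    self._text = []

            def handle_data(self, data):
                if self._href is not None:
                    self._text.append(data)

            def handle_endtag(self, tag):
                if tag == "a" and self._href is not None:
                    text = "".join(self._text).strip()
                    if text:
                        self.links.append((text, self._href))
                    self._href = None

        if __name__ == "__main__":
            page = urlopen("http://www.example.com/").read().decode("utf-8", "replace")
            parser = LinkExtractor()
            parser.feed(page)
            for text, href in parser.links[:12]:
                print(text, "->", href)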

    Headliner redresses three fundamental shortcomings of the Pointcast approach. It places control firmly in the hands of the user; it has very little impact on bandwidth use; and there are no distracting multimedia advertisements, something that makes Pointcast almost unusable at times. The basic Headliner is free.

    Polite Agent protocol

    One of the most serious problems with the first-generation push products was their consumption of bandwidth: companies whose employees used Pointcast and its rivals found a significant percentage of their connectivity to the Internet was being eaten up by the constant updates to push clients.

    Ultimately, the solution will be to employ the extremely efficient multicasting protocols. However, to use multicasting across the Internet requires a massive updating of routers around the world, and although this is likely to happen eventually, it is not something that can be depended on in the short term.

    Of course, within corporate intranets, which are under the control of a single company, multicasting can be employed now, provided the software and hardware on the network have been upgraded to support it.

    Because multicasting remains only a distant hope for the Internet, other solutions to the bandwidth problem have been proposed. One of these comes from the push company BackWeb. It has drawn up what it calls the Polite Agent protocol, designed to meet the needs of push delivery.

    One of its key features is that it supports bandwidth management. That is, it can respond intelligently to the current download requirements of other programs across the corporate connection, and ease up when bandwidth becomes scarce. When demand drops back, it can take up the slack to carry out the downloads needed by the BackWeb clients.

    The Polite Agent protocol also supports interruptible data transfer, allowing large files to be downloaded over multiple sessions during moments when connectivity is not needed for time-critical purposes.
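
    The Polite Agent protocol itself is BackWeb's own, but the general idea of an interruptible transfer can be illustrated with the standard HTTP Range header, as in the Python sketch below: each session appends to whatever was fetched previously, so a large file can be collected across several quiet moments. The URL is hypothetical, and a fuller version would also check that the server replied with 206 Partial Content before appending.

        import os
        from urllib.request import Request, urlopen

        def resume_download(url: str, path: str, chunk: int = 65536) -> None:
            # Ask the server for the bytes we do not yet have.
            start = os.path.getsize(path) if os.path.exists(path) else 0
            request = Request(url, headers={"Range": f"bytes={start}-"})
            with urlopen(request) as response, open(path, "ab") as out:
                while True:
                    data = response.read(chunk)
                    if not data:
                        break
                    out.write(data)

        if __name__ == "__main__":
            # Hypothetical URL, purely for illustration.
            resume_download("http://www.example.com/large-update.zip", "large-update.zip")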

    POP3 v. SMTP

    If your e-mail service uses POP3 - which stands for Post Office Protocol 3 - then you are able to pick up your e-mail whenever you like, and from wherever you are. This would be particularly useful if you were travelling around, perhaps in different countries: you could access the Internet using local access points (saving on telephone charges in the process) and yet still retrieve your e-mail from a central point, rather than setting up an e-mail account in every country (and facing the problem of telling people which one to use when).

    Contrast this with an e-mail service that uses only the Simple Mail Transfer Protocol (SMTP). This is the basic means for transporting e-mail across the Internet to its final destination, the host computer whose name appears in the e-mail address (as in bloggs@acme.co.uk). The difference is that if the host computer in the e-mail address (acme.co.uk) is not using POP3, this mail is delivered automatically when the host computer connects to the Internet (for example when one of its users logs on), rather than being held until requested as with POP3. In this case, it is not possible to retrieve from elsewhere on the Internet e-mail sent to that computer.

    Effectively SMTP is about delivery to a particular computer, while POP3 is based around retrieving e-mail from a particular account. As a consequence, multiple e-mail addresses can be used on a host receiving mail via SMTP (for example bloggs@acme.co.uk, smith@acme.co.uk). For this reason, an SMTP mailbox may be more suitable for a company with several users who always log on to the same machine.
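
    Collecting mail in the POP3 manner is straightforward to script. The sketch below uses Python's standard POP3 support; the server name, user and password are invented, and stand for whatever details your provider issues.

        import poplib

        server = poplib.POP3("pop.example.co.uk")     # hypothetical POP3 host
        server.user("bloggs")
        server.pass_("secret")

        count, size = server.stat()
        print(f"{count} messages waiting ({size} bytes)")

        for i in range(1, count + 1):
            response, lines, octets = server.retr(i)
            print(b"\r\n".join(lines).decode("utf-8", "replace"))

        server.quit()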

    Portable documents

    One of the aspects of the World Wide Web which newcomers find slightly curious is the fact that in designing Web pages you specify the overall structure, not the overall form. Although this might seem a limitation - you never know exactly what a given page will look like on users' screens - it arises from two great strengths of the underlying HTML language: its platform-independence (vital for a global network like the Internet linking the widest possible range of machines imaginable) and the small size of the HTML files (also important when bandwidth is always at a premium).

    Despite these very good reasons for using HTML, commercial publishers in particular find its constraints frustrating, since they are accustomed to controlling appearance as well as content. The fact that you do not even know which typeface will appear in a Web page makes a mockery of conventional page design. The obvious solution is to employ a standard file format such as Adobe Pagemaker or Microsoft Word to create a file with all design elements specified exactly. In doing so, though, you hit against precisely the problems that HTML avoids: the lack of suitable viewing programs on certain platforms and the size of the files.

    This has led to the development of portable document formats that offer a kind of halfway-house between HTML and standard file-types. Appearance can be controlled more than with HTML: for example the overall layout is respected, though individual typefaces may not be. However, the same portable document can be viewed on several platforms using a simple viewer, and the file-sizes are generally smaller than equivalent files from word-processor or page make-up programs.

    With the rise of the Internet, the idea of the portable document has assumed ever-greater importance, and there are now no less than four competing products (and standards): Acrobat from Adobe, Envoy from Novell, Replica from Farallon and Common Ground from the company of the same name.

    After an initial period when manufacturers tried to charge for the use of the portable document readers as well as the programs that generated the files, thus ensuring very limited use, readers are now available freely on the Internet. They can be found at the following locations: Acrobat (http://www.adobe.com/Software/Acrobat/); Envoy (http://www.twcorp.com/envoy.htm); Replica (ftp://ftp.sunet.se/pub/Internet-documents/doc/netpages/replica_netpages/); and Common Ground (http://www.commonground.com/miniviewer/index.html).

    There are also some pages with links to sites holding portable documents in various formats. These are as follows: Acrobat (http://www.adobe.com/Acrobat/PDFsites.html - a huge list); Envoy (http://www.twcorp.com/evysites.htm); and Common Ground (http://www.commonground.com/otherwebs/index.html).

    Although all these programs work in the same way by printing to a special software driver that builds a file containing the text and layout details, there are a number of differences between the competing products. Some portable documents contain information about the typefaces employed, while others just substitute fonts that are similar or roughly the same size; some programs allow you to send a mini-viewer along with the document itself, thus avoiding the need for separate programs altogether. Platform support also varies.

    However, perhaps more important than these technical details are the marketing ones. For a portable document format to be useful to publishers, there has to be a good installed base of potential users. Here Adobe seems to be well ahead, having signed agreements with companies such as IBM and Netscape to increase Acrobat's diffusion into the marketplace. It is also incorporating Acrobat support in widely-used products like Pagemaker. Novell, too, is obviously in a strong position to evangelise to its huge network user base and to incorporate Envoy support into its other desktop products. Clearly, though, the portable document battle has only just begun.

    Adobe: (0181-606 4000); Novell: (01344 724100); Farallon: (0171-731 7930); Common Ground: (0171-490 5390)

    Ports

    One of the ways a firewall system keeps intruders out of corporate networks attached to the Internet is by monitoring the TCP/IP packets which constitute the basic building blocks of the data. In particular, firewalls may block access on the basis of something called the port number, either of the source machine or its destination.

    Despite its rather confusing name, this is not a reference to anything physical like a serial or parallel port: it is a purely software concept, but one that is central to the way that the Internet works.

    Perhaps the best way of thinking of a port on a machine is as a kind of television channel: just as a TV can be tuned to listen on a variety of different channels in order to pick up and display different programmes, so a machine on the Internet has various TCP/IP channels that it can listen out for. However, a computer is able to deal with many ports simultaneously.

    Some of these are allocated to particular services, and are more-or-less fixed. Examples include port 21 for FTP, port 70 for Gopher and port 80 for the World Wide Web. These port numbers are quite often visible in URLs: so a Gopher address may be given as gopher://gopher.microsoft.com:70 where the figure after the colon specifies the port to be used.

    In fact in this case the port can be omitted since it takes the default value, but where an unusual or non-standard port is used, the port must be given. An example would be the famous Subway Navigator service: this has the URL telnet://metro.jussieu.fr:10000 where the special port 10000 is used.
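
    Because a port is nothing more than a number the client asks to be connected on, probing one takes only a few lines. The Python sketch below tries the standard FTP, Gopher and Web ports on a host (www.example.com is simply a placeholder) and reports which of them accept a connection.

        import socket

        def port_open(host: str, port: int, timeout: float = 5.0) -> bool:
            try:
                with socket.create_connection((host, port), timeout=timeout):
                    return True
            except OSError:
                return False

        if __name__ == "__main__":
            for port in (21, 70, 80):          # FTP, Gopher, World Wide Web
                state = "open" if port_open("www.example.com", port) else "closed"
                print(f"port {port}: {state}")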

    Portals

    The success of the portals concept is due in part to its genuine utility - the portal structure often helps users to find what they want more quickly. But it is probably true to say that the portal's usefulness is also partly down to the fact that it is deployed so widely. That is, what might be called the grammar of portals - the elements that are found within them, and the way these are placed in the Web page - is now sufficiently familiar to users that new portals can take this for granted, allowing visitors to navigate around completely unknown sites with relative facility, which contributes to the success of the model yet further.

    Portal applications

    Given the existence of this portal model that can be applied to more or less any Web site - be it general ones like Yahoo and Netcenter, or ones serving vertical markets like the computer magazine holdings of Ziff-Davis and CMP - it was perhaps inevitable that the same approach should be applied to intranets.

    After all, intranets are nothing but private, corporate versions of the larger, public Internet. They use the same technologies (by definition), and confront users with the same central problem of locating information. If the portal idea has any real value, it should translate easily to this other domain. The basic idea is that starting from an intranet portal home page, employees can access corporate information as quickly and as easily as they now do on the open Internet thanks to generic portals.

    Drilling down

    Common features include channels, for categorising information, and allowing users to drill down in an efficient way wherever and however the underlying data may be held (across multiple databases, for example). There may be news headlines of relevant external events, or about happenings and announcements within the company. As with some Internet portals, it may be possible to carry out transactions from the intranet version - purchasing office stationery, for example.

    Accessing corporate data from multiple sources in this way is hardly new. A whole range of fashionable technologies such as data mining and executive information systems have all attempted to provide similar capabilities. But the crucial element of the new intranet portal is the interface - everything is accessible at the click of a mouse, using the by-now familiar metaphor of the Web page and its hypertext links.

    Intranet portals - also called corporate portals and enterprise portals among other variations - are now a hot area, with dozens of companies, both start-ups and established players, contesting the space. In fact just as almost every Web site seems to have transmogrified itself into a portal, so every software vendor with an eye to the corporate intranet market has started adding the magic word 'portal' to its product range.

    In this respect, intranet portals risk turning into this year's push technology. Like push, intranet portals are plausible in the benefits they claim to offer, but essentially untested in terms of the real business advantages they confer.

    For this reason, the technology is probably one to watch rather than to implement at the moment. There are clearly too many companies chasing this market, and most will fall by the wayside (as happened with push). The danger is that by jumping in early, an enterprise may invest much time and money in setting up a corporate portal only to find half-way through the project that the software supplier concerned goes out of business.

    Durable worth

    However, the push model never really answered a pressing user need, whereas employees certainly do require help in navigating through the increasingly daunting quantities of information routinely placed on intranets. This suggests that intranet portals - and their close cousins, extranet portals - may well prove of durable worth once the current phase of excitement and confusion has passed.

    Intranet Portals

    The extreme immaturity of the emerging class of intranet portals, also known as enterprise or corporate portals, can be judged from the fact that there are so many players, some of which have not even launched their products yet. One product that is shipping is MyEureka from Information Advantage. It is notable for the clear parallels with public portals like Yahoo.

    As the company's business intelligence portal (one of the other terms used to describe this new phenomenon) shows, there are some striking similarities to conventional portal design, including the presence of channels (for organising material by customisable categories) and news stories (showing personalised headlines, competitive updates, corporate information etc).

    Another early player is Plumtree Software, which already has two products, its Corporate Portal Server and Field Support Portal, a vertical application of the basic technologies. Plumtree's approach is to catalogue content, however it is held, using a hierarchical structure.

    Two other companies that have products available are Viador and Sqribe. Viador offers its E-Portal Suite - what it calls an e-business information portal; there are various demos. Sqribe has ReportMart, an Enterprise Information Portal.

    Unusual

    The cleverly-named Portalware from Glyphica is unusual in the role it assigns to Adobe's PDF format for presenting documents; this seems to be rather a retrogressive move given that HTML with Cascading Style Sheets (CSS) is capable of accomplishing more or less everything that PDF offers. PortalWare has two parts: InfoPortal and CollectionPortal. There is an independent white paper available.

    Other contenders are further behind. For example, Epicentric is on the brink of launching its Java-based Portal Server, but at the time of writing, its product information page was singularly uninformative.

    Another product that has so far only been announced, rather than delivered, is Portal-In-A-Box from the British company Autonomy. Although it is hard to tell from the rather limited information available on the Web site, it does seem that Portal-In-A-Box addresses one of the fundamental problems with most intranet portal solutions: the enormous amount of work that is often required to create them in the first place. For example, material may need to be categorised, metadata created, and the overall structure of the portal defined and implemented. Portal-In-A-Box draws on Autonomy's expertise in the agent software market, where it is one of the leading players, to automate the portal creation process. The company claims that this approach "eliminates the need for costly manual labour in the creation and maintenance of portal sites".

    Two other small companies, InfoRay and Portera use the portal vocabulary, but their products seem to be rather far from the conventional portal model. Similarly, the offerings from Tibco subsidiary Tibco.net and DataChannel represent fairly idiosyncratic approaches.

    Also worth noting is the appearance in the intranet portal area of major players such as Netscape and Peoplesoft.

    Protocols

    A protocol is just a jargon term for a set of rules. In the context of the Internet, it is generally applied to the way in which communications occur between two computers, which normally adopt a client-server model for their interaction.

    At the general level, the underlying protocol of the Internet, TCP/IP (which stands for Transmission Control Protocol/Internet Protocol, but which is actually a family of related protocols not restricted to these two), defines the way in which the raw binary data is sent and routed over the global network.

    But there are many other protocols that describe how this basic functionality is applied in particular cases.

    The most obvious examples of these are FTP - File Transfer Protocol - and HTTP - HyperText Transfer Protocol.

    As their names suggest, the former describes the set of rules for transferring computer files across the Internet, while the latter spells out how hypertext documents are sent from the WWW server to a Web client.
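
    A protocol really is nothing more than an agreed conversation. The Python sketch below speaks just enough HTTP over a raw connection to port 80 to fetch the response headers from www.example.com (a stand-in address); any server that follows the same rules will understand the request.

        import socket

        with socket.create_connection(("www.example.com", 80)) as conn:
            # The request itself follows the rules laid down by HTTP.
            conn.sendall(b"GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
            reply = b""
            while True:
                data = conn.recv(4096)
                if not data:
                    break
                reply += data

        # Print just the response headers, which precede a blank line.
        print(reply.decode("latin-1").split("\r\n\r\n", 1)[0])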

    What may not be so obvious is that other Internet services such as Telnet, E-mail and Usenet are also described by their own protocols. Necessarily, in fact, otherwise there would be no way for two computers to agree on how to establish the requested service.

    The protocols that define E-mail functionality are SMTP (Simple Mail Transfer Protocol) and POP3 (Post Office Protocol 3), while for Usenet delivery there is NNTP (Network News Transfer Protocol).

    Push/pull

    As Web users become more sophisticated, Web pages need to be more engaging. One way to keep users' attention is with elements that change over time. The new animated Gifs are a way of achieving this. These Gifs are simply files containing multiple images that are displayed in turn. A more interesting approach involves retrieving possibly new information from the Web server in question.

    Client pull is where the change is instigated from the Web client end. After a certain period the browser requests a page again, during which time elements on it may have changed, causing the displayed Web page to evolve.

    Alternatively, server push can be used. With this it is the server that determines when things change, and simply sends down the new data.

    With client pull, the connection is closed after each transfer in the normal way. But with server push, the server leaves open a connection to the client.

    Neither method produces smooth animations. For simple animations, the new Gifs are probably better.
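
    Client pull is particularly easy to demonstrate. In the Python sketch below, a tiny Web server adds a Refresh header - widely supported by browsers, though never formally standardised - asking the client to request the page again after five seconds, so the displayed time keeps changing without any action from the user.

        import time
        from http.server import BaseHTTPRequestHandler, HTTPServer

        class PullHandler(BaseHTTPRequestHandler):
            def do_GET(self):
                body = f"<html><body><p>The time is {time.ctime()}</p></body></html>"
                self.send_response(200)
                self.send_header("Content-Type", "text/html")
                self.send_header("Refresh", "5")   # ask the browser to re-request in 5 seconds
                self.end_headers()
                self.wfile.write(body.encode("utf-8"))

        if __name__ == "__main__":
            HTTPServer(("localhost", 8000), PullHandler).serve_forever()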

    Push technology is really trendy with Internet and intranet developers. Whereas the classical pull model requires users to seek out material actively from Web servers, the push approach is designed to provide users with information on the basis of previous requests. According to the push school, the benefits for the users are that they do not need to waste time looking for information, while Web site providers gain a loyal, not to say captive, audience.

    Leaving aside the older generation push technologies - e-mail and mailing lists - the first example of this new approach was Pointcast. Since then, the interface has been spruced up, and there is at last some European content. Otherwise the model remains the same: you subscribe to certain channels, for example news or business, and these are updated periodically (or on demand).

    The service is free, and paid for by on-screen advertisements. After a while, these become quite annoying, but a more serious problem for companies is the fact that a corporate Internet connection can soon be swamped by the cumulative effect of many updates. To ease this problem, Pointcast has come up with the I-server which allows content to be cached as well as local channels to be created.

    A number of rival services to Pointcast have since appeared. These include IFusion's Arrive, which uses a similar metaphor of channels. In this case, the initial downloads are often several Mbytes in size (in order to provide advanced multimedia capabilities), and the results look more like television than anything to do with the Internet.

    BackWeb's product works slightly differently. Although once more you sign up for channels that are updated over the Internet, these updates consist of newsflashes alerting you to new postings on the channels, which may be sent too. Some of these are very small files, while others are large - Yahoo, for example, adopts an extraordinary multimedia approach that sends banners charging around your entire screen.

    Yahoo's extravaganza highlights one of the biggest problems of much push technology for businesses: it is far too intrusive. In a corporate context the last thing managers want is for their staff to be distracted from their work by an over-busy monitor. In this respect the new TV model of Web publishing - found not only with pure push technologies, but also increasingly on major Internet services such as the Microsoft Network - is fundamentally inappropriate for the office environment.

    This does not mean that push technologies have no place there. Indeed, the fact that Microsoft will be incorporating Pointcast in its forthcoming Active Desktop approach, and that Netscape will be doing the same with its Constellation front-end using technology from Marimba, another important player in this area, indicates that the push approach is here to stay.

    Moreover, there are already alternatives to the TV model. For example Intermind uses what it calls Interconnectors to allow users to obtain very simple updates about selected Web sites. On the basis of the update information they can choose whether or not to visit the main site. One of Intermind's great advantages is that it works entirely within a browser interface. Another push solution aimed more specifically at the corporate sector is Tibco's Tibnet.

    Alongside its main product, BackWeb offers a business-oriented service called SecureCast: McAfee will use this to send anti-virus updates to users. This kind of incremental approach is, of course, one of the most important features of Marimba's Castanet. Since I wrote about this system, new channels have been added, including one from Corel. Using the basic Castanet approach, it is possible to download a version of CorelOffice for Java that can be constantly updated over the Internet. This is an impressive example of just what Castanet can do, and an excellent demonstration of how push technologies, when used properly, dovetail into the Network Computer model.

    Resource Description Framework

    Interoperability is key to the functioning of the Internet. In contrast with the days when the emphasis was on imposing proprietary approaches, today everything depends on following open standards. It is therefore ironic that the most fundamental component of the Internet - the data that flows over it - remains the most incompatible. Whereas different implementations of TCP/IP stacks, Web clients and servers and the rest happily work together (more or less), the basic underlying stuff they carry around, i.e. information, consists of thousands of different formats and structures.

    As a result, even though the Internet/intranets have provided a single path to accessing data, drawing together this information remains a challenge. This has led to more work being done in the area of meta-data. This is information about information, and provides an overall conceptual structure for the data that is poured into it. The idea is that with the right ancillary meta-data, even with completely disparate information sources, it will be possible to mix and meld data seamlessly.

    One meta-data approach is called the Resource Description Framework, which has been proposed by Netscape and others. Like so many other new proposals, it is based on the XML (extensible markup language) standard. Applications of the framework (and other meta-languages) include resource discovery to provide better search engine capabilities; cataloguing of Web sites and other information stores; content ratings; and describing intellectual property rights that subsist in particular material.

    Real Audio

    One of the most attractive features of the World Wide Web is its ability to add multimedia elements. However, the way in which these elements are incorporated into Web pages means that viewing them is often a two-stage process. For example, to listen to an embedded audio file, you must normally download all of it - which can take several minutes - before you can use a helper program to play it back. Clearly this robs the feature of much of its immediacy, and also means that such audio files could never be used for live Internet broadcasts.

    It was with precisely these problems in mind that Progressive Networks developed its new audio format called RealAudio. Through a variety of special techniques, RealAudio files can be listened to as they arrive, rather than after the whole file has been transmitted. It is possible to listen in practically real-time, hence the name.

    This opens up a number of exciting possibilities. For example, it is now possible to take live feeds from radio stations and broadcast them across the Internet. Several local US stations do this, enabling them to reach a far wider audience without the need for further transmitters. Indeed, there could be thousands of such netcasters, whether or not they have traditional radio equipment.

    To listen to the new sound format, you need special software that is freely available from the RealAudio home page at http://www.realaudio.com/. Netcasters also need special RealAudio server software, and as with Web servers, it is from this side of the client/server equation that Progressive Networks hopes to make its money, rather than by charging users of the helper application.

    Information on those offering streaming audio can be found at Timecast (http://www.timecast.com/), where there are links to C|net (http://www.news.com/Radio/index.html), Computerworld (http://www.computerworld.com/readaudio/) and PC Week (http://www.pcweek.com/radio/).

    Progressive Networks, probably the leader in streaming audio with its Real Audio product, has produced a streaming video program called Real Player. A beta version can be downloaded from http://www.real.com/products/realvideo/.

    Realname

    Unfortunately, because of the ad hoc way the Internet grew, the DNS is a complete mess, and tinkering with it along the lines of the Magaziner plans will not solve its deep-rooted problems. This is what makes the Real Name system from Centraal (www.centraal.com/) - a US start-up led by UK entrepreneur Keith Teare, co-founder of EasyNet and the Cyberia cyber cafés - attractive.

    Hiding anomalies
    Rather than trying to replace the DNS completely, it simply layers a new address system on top of it, effectively hiding all the anomalies. Through a new registry - run initially by Centraal, but later with mirrors distributed around the world - firms can register not only names such as BT, but also product names and advertising slogans.

    Then, when users of the Real Name system type in a company or product name, or a phrase that has been registered, they will be taken to the corresponding URL - without having to enter a long and often unmemorable string of protocols, domains and directories.

    Unlike the DNS, Centraal aims to have a strongly enforced acceptable-use policy, designed to avoid the current battles over who has the rights to certain names (such as Prince), and prevent speculative registration by third parties of names in the hope of extorting money from their rightful owners. As well as its ability to cope with phrases, this system is Unicode-compliant. This means it can handle almost any of the world's writing systems, whereas the DNS is limited to just ASCII.

    The Real Name system is available now; you can try it out at www.realnames.com/, entering either company names or general terms. In the latter case, the system offers what it considers near-matches. You can register phrases and names (http://company.realnames.com/Subscribe.asp). The fee is $40 (£25) per name, per year - roughly comparable to the DNS costs.

    Although the Real Name approach is attractive in principle, it has several major question marks hanging over it. First, will it scale? For all its faults, the DNS works. Whether the Real Name system will work when 100 million users are querying the central servers simultaneously is another matter, although Centraal claims it will scale without problems.

    More problematic is the catch-22 issue of whether it will be used in sufficient numbers for companies to register their names, and thereby encourage more users to try it out. An important move in this context is Centraal's agreement with Altavista to offer Real Name searches alongside conventional ones on every results page (see http://altavista.digital.com/).

    Rival approach
    What Centraal really needs is for its system to be built directly into the two leading browsers. However, Netscape has already announced it will be employing its own rival approach (http://home.netscape.com/newsref/pr/newsrelease623.html), based in part on the Alexa system.

    Meanwhile, Centraal has produced a free helper program (at http://company.realnames.com/download/personalname.exe) that works with Windows 95/NT versions of Navigator and Internet Explorer, and lets users try out the Real Name approach directly.

    Instead of entering URLs, you just type a word or phrase into your browser. The Real Name software forwards this to the Centraal server, which then sends back the corresponding URL (if there is one) for the browser to retrieve. It also lets you create your own local version of Real Names, allowing you to set up bookmarks identified by words or phrases.

    Real Time Streaming Protocol

    One of the interesting cultural side-effects of the Internet on the computer industry is the craze for drawing up standards. These are real standards, not just de facto ones established by a dominant player. Because the Internet is such a new area, there are no dominant players yet that can impose their will as others once could in other markets. Moreover, there is a general fear of backing the wrong horse and getting left behind in this fast-moving sector, so computer companies have decided that it is better to stick together for safety.

    One of the latest manifestations of this phenomenon is the Real-Time Streaming Protocol (RTSP). This has been endorsed by just about all the major online players (Netscape, Apple, IBM, Silicon Graphics and Sun) - though Microsoft is conspicuous by its absence. The aim of this proposed standard (it has been submitted to the IETF for ratification) is to do for streaming multimedia what HTTP did for text and graphics. That is, to provide a common and robust platform for what is perceived as one of the growth areas for the future, and to allow full interoperability between clients and servers from different manufacturers.

    Streaming multimedia is about the transmission of data in a continuous stream, often with a real-time component - for example audio broadcasts. RTSP aims to work with pre-existing structures such as TCP, UDP, Multicast IP and another protocol confusingly known as the Real-time Transport Protocol (RTP, now RFC 1889). There is already a sample client available, and also a new URL scheme, which takes the form rtsp://rtsp.prognet.com/welcome.wav.

    Real-time collaboration

    E-mail, Usenet newsgroups and especially World Wide Web sites allow information to be exchanged with one or more people, but the response of the latter is inevitably not instantaneous.

    As usual, this deficiency has acted as a spur to various inventive individuals on the Internet who have come up with a variety of collaborative tools that let you interact with people across the Internet in real-time.

    The simplest of these are the chat programs. With them, you simply open up a connection to another node on the Internet and then converse with the person at the other end by typing in your message; the recipient receives this on-screen and can type back in reply more or less instantaneously. A variant is the whiteboard where you draw on-screen and the resulting image is transmitted across the Internet and then displayed at the other end where the recipient can then add his or her own marks.
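
    The mechanics of such a chat program amount to very little, as the Python sketch below suggests: one party listens on an agreed (and here arbitrary) port, the other connects to it, and whatever either side types is sent across the link and printed at the far end.

        # Run as "python chat.py listen" on one machine and
        # "python chat.py <other-machine>" on the other.
        import socket
        import sys
        import threading

        PORT = 5000                               # an arbitrary, agreed port

        def receive(conn):
            """Print everything the other party sends."""
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                print(data.decode("utf-8", "replace"), end="")

        def chat(conn):
            threading.Thread(target=receive, args=(conn,), daemon=True).start()
            for line in sys.stdin:                # everything typed is sent across
                conn.sendall(line.encode("utf-8"))

        if __name__ == "__main__":
            if sys.argv[1:] == ["listen"]:
                with socket.create_server(("", PORT)) as server:
                    conn, _ = server.accept()
                    chat(conn)
            else:
                chat(socket.create_connection((sys.argv[1], PORT)))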

    Obviously this is rather lacking in the human touch, and it has been largely superseded by the Internet phone programs, of which there are now several. The idea is simple: voice input is digitised by the transmitting computer and the resulting file sent across the Internet to another node. There the file is converted back to analogue form and heard by the recipient. The latter can then reply in the same way. If the soundcards employed in this process are full-duplex it is possible to receive and transmit voice messages at the same time.

    Although the sound quality of Internet phone software leaves something to be desired, the possibilities for saving on international calls through the use of local Internet PoPs are great, and more than compensate.

    Rendezvous Protocol

    One of the anomalies of the Internet is that although there may well be tens of millions of people connected at any one moment, there is very little sense of this when you are online. In particular, it is not generally possible to discover who among peers and colleagues is online, and to chat with them in some way.

    Of course, Internet Relay Chat allows precisely this kind of interaction, but suffers from a number of disadvantages. First, it requires special software - an Internet Relay Chat client - and for this to be configured appropriately. Second, the large number of distinct relay chat channels makes it hard to find people.

    And finally, such chat systems generally have a rather dubious reputation in business circles. None the less, the fact that Internet Relay Chat continues to thrive indicates that there is a clear need for this kind of service. This has led companies to come up with what are known as "buddy list" applications, whereby it is possible to both discover who is online at a given time, and to communicate with them by typing short messages.

    These are routed across the Internet and appear on the recipient's screen, provided they too are running the relevant "buddy" software.

    The leading system is America Online's, thanks to its 10 million-strong user base, many of whom employ it. It was probably for this reason that Netscape decided to adopt this standard in the most recent release of its Communicator program. Unfortunately Microsoft has put together a rival specification, called the Rendezvous Protocol, and managed to rally most other players behind it, leading to yet another potentially damaging Internet schism.

    Request For Comments

    It is a mystery that the Internet works at all, let alone without central control. Besides a curiously co-operative spirit that seems to pervade the Internet, the answer lies in the Request For Comments (RFC) series, some 1,700 documents that largely define exactly how the Internet works.

    These documents have no prescriptive force; however, part of the Internet alchemy is that these polite requests for comments turn into the accepted rules for particular aspects of the Internet. Although there are no Internet police to enforce them, if you want to connect to the system and exchange information with other Internet users you have to follow them, otherwise your connection is effectively useless.

    Many RFCs cover deeply technical or specialised subjects - for example Multiprotocol Encapsulation over Asynchronous Transfer Mode Adaptation Layer 5 (RFC 1483) or Conventions for Encoding the Vietnamese Language (RFC 1456). Others are aimed at beginners - such as RFC 1594: Answers to Commonly Asked New Internet User Questions - and some (like RFC 1607: A view from the 21st Century) are even entertaining.

    RFCs can be obtained by FTP from the directory ds.internic.net/rfc/; alternatively you can send a message of the form document-by-name rfcxxx to mailserv@ds.internic.net, replacing xxx with the relevant RFC document number. To find out what is available, send the message file /ftp/rfc/rfc-index.txt to the same address. This file is more than 200Kb.
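
    Fetching an RFC by anonymous FTP can equally well be scripted. The Python sketch below collects RFC 1594 from the archive mentioned above; the host and directory are those given in this column, though mirrors come and go, so treat them as illustrative.

        from ftplib import FTP

        with FTP("ds.internic.net") as ftp:
            ftp.login()                       # anonymous login
            ftp.cwd("rfc")
            with open("rfc1594.txt", "wb") as out:
                ftp.retrbinary("RETR rfc1594.txt", out.write)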

    Robot Guidance

    Given the indispensable nature of Web search engines, it is surprising how ad hoc they are in how they operate. For example, one of the key elements of a search engine is the robot that scours the Web for pages and content that will be indexed by the main engine.

    As anyone with a Web site knows, a robot will access sites indefatigably. For robots' continual presence to be acceptable to those creating and hosting Web sites, it is important for robots to observe a certain minimum etiquette.

    The requirements of the Web site owners are typically held in a document called simply robots.txt, and known as the robot exclusion file. One of its main functions is to forbid robots from indexing pages or areas on the site that are confidential or simply not appropriate for indexing, for example, temporary files.
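
    A robot exclusion file is nothing more than plain text. A minimal sketch - with purely illustrative directory names - asking all robots to stay out of two areas of a site might read:

        # robots.txt - applies to every robot that honours the convention
        User-agent: *
        Disallow: /tmp/
        Disallow: /private/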

    And yet this key robots.txt file - the only standard that exists in the area of Web indexing - is completely informal, with only a now-expired Internet draft to define its main elements. In an attempt to provide a little more rigour in this key area, an independent group has been formed to create what it calls a Standard for Robot Guidance. This aims to build on the basic technique employed by the robots.txt file, and add more sophisticated features.

    New ideas under consideration include the ability to refer to more than one robots file (allowing a finer-grained control of exclusion), the provision of a priority URL for robots that do not have the resources to visit all URLs on a site, and the ability to index non-HTML items such as images.

    Routers

    The Internet derives its name from the Internet protocol, a set of rules about how the basic packets of information used on the network are transmitted among the various constituent sections.

    The Internet protocol defines the theory behind this parcelling out of the packets, but the practice clearly requires some hardware, generally called a router.

    Sometimes the term "gateway" is met in this context, although this is also used to mean other kinds of interconnections, and router is now the preferred term.

    A router works by sitting on two separately defined sub-networks. Packets that can be delivered within the sub-network simply travel to their destination directly, but when a packet has an address that is not found there it is directed to the router.

    This then passes it on to the other sub-network. The process can then be repeated until the packet arrives at its destination.
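
    The decision a router takes for each packet can be caricatured in a few lines of Python; the addresses and sub-network below are invented for illustration, and real routers of course work from far larger forwarding tables:

        import ipaddress

        LOCAL_SUBNET = ipaddress.ip_network("192.168.1.0/24")
        ROUTER = "192.168.1.1"   # hypothetical gateway attached to this sub-network

        def next_hop(destination):
            # Deliver directly if the address lies on our own sub-network,
            # otherwise hand the packet to the router for the next leg.
            if ipaddress.ip_address(destination) in LOCAL_SUBNET:
                return destination
            return ROUTER

        print(next_hop("192.168.1.57"))    # stays on the local sub-network
        print(next_hop("194.72.6.226"))    # passed on to the router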

    In effect, routers are the glue that binds the separate strands of the Internet together. Routers are used to connect corporate networks to the Internet and to bolt together the main backbones of the Internet itself by connecting service providers.

    As the Internet has assumed ever-greater importance, so the hitherto rather specialised world of routers has moved steadily towards the centre of the Internet stage.

    Riding on the back of this success is the company Cisco, which dominates this sector much as Microsoft does the desktop.

    Satan

    Back in spring 1995 a ripple of excitement passed through the Internet world concerning a devilish program called, provocatively enough, Satan. The acronym stands for Security Administrator Tool for Analysing Networks, which makes it sound much duller, and also puts it in context. For Satan is not a tool to help people break into systems (though it can be misused in this way, just as any computer can be misused as the instrument of such attacks) but one that lets network administrators take preventative action before such break-ins are attempted.

    It does this by probing known weaknesses in widely-used systems in a systematic way. Any such loopholes that have been overlooked (as happens only too frequently) are noted for the administrator to close. It is not a tool for gaining access, but for reporting and analysing only - though there is nothing to stop someone modifying it sufficiently that it could effect access too. It is also important to note that Satan can only be used on computers that are visible on the Internet: systems hidden behind protective firewalls (blocking computers) cannot be scanned.

    In many ways the hysteria that greeted the release of Satan shows only how ill-prepared most system administrators are as far as security is concerned. To say that Satan should never have been created, never mind released for general use, is like saying that you should never check that doors are locked in case thieves think of doing the same. The fact is that all of the techniques employed by Satan are already well known; the only people who stand to gain from its widespread availability are those who know less about the deficiencies in their defences than they should, and whose system security as a result is weaker than they like to pretend.

    Satan can be downloaded from ftp://ftp.cert.org/pub/cert_advisories/CA-95:07a.REVISED.satan.vul.

    Seal of approval

    Trust is always a key issue in business, but on the Internet it assumes a special importance. In the usual realms of commerce there are many indicators of a firm's trustworthiness, but online most of these are absent. Companies operating on the Internet might be new; they will often be distant, so local methods of checking them out are unavailable. Often they will consist of little more than some HTML code at a URL. Clearly, potential purchasers at E-commerce sites are doubly cautious about buying. One proposal for fostering a little trust is to use Seals of Approval, or SOAPs.

    SOAPs are now being employed to lend some substance to online firms that are otherwise all too evanescent. A variety of organisations will - for a fee - carry out some kind of audit of an aspiring E-commerce outfit and its operations. If these meet the auditor's criteria, the seal will be issued for display on the site. Of course, this raises the question of whether the SOAPs signifying trustworthiness can themselves be trusted.

    Search Engines

    In 1995, when the mere existence of Lycos (at http://www.lycos.com/) and its siblings such as Web Crawler (at http://www.webcrawler.com/) and the World Wide Web Worm (at http://www.cs.colorado.edu/home/mcbryan/wwww.html) seemed pretty miraculous, the idea that somebody might dare to index the entire World Wide Web - which in any case was far smaller then - was preposterous. But now, as well as Alta Vista (at http://altavista.digital.com/), there are Open Text (at http://www.opentext.com/) and Inktomi (at http://inktomi.berkeley.edu/), all of which propose to carry out this feat.

    Alongside this extraordinary growth in indexing capability, driven partly by the application of raw computing power and partly through clever programming techniques for the indexing and searching, there has been another highly significant trend. Whereas Lycos, Web Crawler and WWW Worm all began life in academe, the vast majority of search engines are now commercial operations (Inktomi is the latest to join the club).

    In a sense this is just a reflection of the fact that the Internet has moved from being a largely academic environment to one where business is the dominant mode. But more significantly, this shift has taken place because search engines have proved to be among the most popular sites on the Internet (Infoseek claims to be the second most popular site after Netscape, with 25 million hits a day against Netscape's 45 million) and hence the most marketable. In the process they have become pivotal to the way people use the Internet.

    For example, it is now often far quicker to use one of them to find a site that you wish to access than to type in the long and often unmemorable URL. In effect, search sites have become the portals of the Internet.

    Such is the reach of the top search engines, they have effectively become the gazetteer of the Internet, at once its index and guide-book. They turn the Net into a huge encyclopaedia whose strange organisation defies human thought. Because these search engines offer an increasingly common way into the Internet, it is vitally important for corporate Web sites to be added to the indexes so that potential clients can find them when they look for keywords and concepts that occur at those servers.

    Indeed, in the future there will be an increasing tendency to design Web sites so as to optimise their chances of appearing high among the ranked results of top engines like Alta Vista, Infoseek and Open Text. It is all very well being included in an index, but you might as well not be there if your site only turns up several hundred entries from the top in a search (how many people have the patience to look through all the hits that their query generates?).

    For this reason, those designing Web sites would be well-advised to make sure that they have a clear idea of the message they wish to convey with their pages. Many systems of relevance ranking look at section headings for hints on how important a page is likely to be, while others simply count the number of times the search term appears.

    The more sophisticated designers might even want to start investigating how the major search engines carry out their ranking in order to maximise the chance that their site will appear near the top of any given list.

    Searching for a way round your intranet

    Search engines are central to the way most people use the Internet. In fact they have become so sophisticated that it is now generally easier to find something on the external Web than it is on a corporate intranet.

    As a result, many suppliers are creating products that are designed to enable you to search through more or less everything held on an intranet just as you would on the Web. Typically they will index all HTML, text, Microsoft Office and Adobe PDF files, and allow you to carry out advanced searches such as those using Query By Example.

    Building on their high-profile success, most of the major Internet search engines now offer intranet versions. Altavista has produced its Search Extensions, Lycos has teamed up with Inmagic to offer the Lycos Intranet Spider, and Infoseek has the Ultraseek Server. Excite offers Excite for Web servers - this only indexes HTML and text files, but is free - while Open Text sells Livelink Spider, part of its Livelink Intranet suite.

    Microsoft and Netscape are also players. Microsoft has its Index Server product, which works only with Windows NT and the Internet Information Server, but is free. Version 2 is currently being beta tested. Netscape too is moving forward from its first search engine, called Catalog Server, to a new product known as Compass. This uses technology from Grapevine, and there is also an add-on that allows a Yahoo-like taxonomy scheme to be generated for an intranet.

    The final group of players in the intranet search engine market are those who have traditionally offered specialised - and often outrageously expensive - search solutions for niche markets such as government departments and large corporations. The main names here are Verity, with its Search 97 family, Excalibur's Retrieval Ware and the interesting British company Muscat.

    There are also new entrants such as Semio offering novel technologies that attempt to solve the perennial problem of finding the exact information you're after, and displaying it in a comprehensible format.

    As the above whirlwind tour of the intranet search scene indicates, this is a very crowded market. In the next year or so there is bound to be a shake-out of players as the technologies mature. One company that inevitably will still be there is Microsoft. Version 2 of its Index Server indicates its seriousness, and, as history has shown, once Microsoft enters a market in this way, it rarely gives up.

    The advantage of Index Server 2 is that it is freely available to Windows NT and Internet Information Server users, which means it is bound to be widely deployed. It is powerful, but very hard to customise. There is already a book devoted to the subject (Index Server, £36.50, ISBN 1-57521-212-9), as well as a third-party product to help tame its complexities. The other player almost certain to survive is Verity. Although not a household name, the company is probably the market leader in search engines.

    Earlier offerings were hard to install and run, but Search 97 sensibly uses a Web front-end to make running queries trivial. Its display capabilities are rather rough, and it is not cheap, but Verity's recent bout of acquisitions and continuing investment in research should ensure that it rides out the coming tumult in the intranet search market.

    Finding answers has never been so tough

    Search engines have long been central to the way most people use the Internet. Without them, it would be impossible to find most of the unparalleled wealth of content now available online. In particular, it would be hard to find companies and their services, which would have stunted the growth of E-commerce considerably.

    As a result of this pivotal importance, the world of Internet search engines has become one of the most fiercely contested online businesses. Even though there are already five top-class engines - Altavista, Excite, Lycos, Hotbot and Infoseek - this has not prevented others from being launched. For example, a specialised search engine aimed at Internet professionals called Netsearcher has been created, while Livelink Pinstripe is aimed at business users.

    An interesting alternative tack is taken by Goto.com. This attempts to address one of the central problems of search engines, the ranking of the often many hits, by employing a novel and rather controversial technique. Instead of trying to analyse the sites to match the user's needs, Goto.com simply allows advertisers to buy the top places in listings. To be fair, this blatant subversion of search engine technology is clearly signalled and even the price paid by the advertiser is given, if you follow its link.

    Rule Britannica
    Another approach is exemplified by the Britannica Internet Guide, which aims to build on its reputation as the leading encyclopaedia, by offering information on a selected 65,000 Web sites chosen by Britannica editors. The established engines have not stood still, though little in the way of technical innovations has been introduced. One of the few new ideas comes from AltaVista, which offers an automated translation service to help users view pages in other languages.

    One move made by all of the major engines has been the addition of what are generally called channels. These are simply groupings of Web sites by categories, rather like Yahoo. The idea is to provide users with another way through the huge holdings. An advantage for the search site is that they can sell banner advertising specific to each category.

    Several of the main engines have added local versions of their services. Excite has sites in Australia, France, Germany, Japan, the Netherlands, Sweden and the UK; Lycos has them in Belgium, France, Germany, Italy, Japan, the Netherlands, Spain, Sweden, Switzerland and the UK; and local Altavistas are currently available in Australia, Brazil, Canada, Malaysia, Sweden and Spain (also covering South America).

    Although local search engines might seem to negate the whole point of a global Internet, it is generally true that users often only want information about sites that are physically near to them, or at least in their own language. The development of these local sites is symptomatic of a larger shift in Internet search engines. Now the emphasis is on providing services that will ensure users come back on a regular basis, and that will keep them on the site for as long as possible.

    Up-to-the-minute
    This has meant that, alongside the traditional search facilities - augmented by channels - a wide range of completely unrelated services are available at the leading sites. These typically include up-to-the-minute news; yellow pages for finding information about companies; white pages for people searches; chat capabilities; free E-mail; and instant messaging (employing the so-called 'buddy list' technology).

    The whole point of adding all these apparently peripheral services to the core search business is to sell more advertising - the key revenue source for these engines - and to allow them to strike deals in the burgeoning area of Internet commerce. In doing so, they are following in the footsteps of the undisputed leader of what have come to be called hub or portal sites: Yahoo.

    Six months ago, when I last reviewed the state of the search engine sector, I noted how the focus then was on bolstering the main search service with ancillaries - subject-based channels, news feeds and online communities - rather than on advancing the core technology.

    Although both Excite and Lycos have continued in this vein since then, with numerous acquisitions designed to bulk them up and keep them within striking distance of the portal leader Yahoo, most other search engines have taken a different tack.

    In a sense, they have gone back to basics, and sought to address the fundamental question of how to make searching on the Internet easier. As the quantity of information available online continues to swell, current techniques yield ever more hits, ironically making search engines less rather than more useful. Users are no longer grateful to find any result quickly but want to obtain precisely the few results they would have chosen had they time to sift through the many thousand hits that are now typically generated.

    New approaches like the Alexa and RealNames systems discussed previously in Getting Wired are symptomatic of this new mood. Both try to second-guess what users really want when they search for something. Alexa (http://www.alexa.com/) does this by using fellow users' paths through the Web to suggest relevant sites; RealNames (http://realnames.com/) allows key words - company names, trademarks etc - to be used as shortcuts to some "obvious" corresponding Web sites.

    Altavista (at its new URL of http://www.altavista.com/) was the first search engine to support the RealNames initiative. Now when you search for a word you are given the possibility of going to the site that has paid to be associated with that word, or one similar. Altavista also now allows you to submit natural language-type queries, and offers answers to other related questions spontaneously. And in another effort to make searching at its site easier, Altavista has created a free helper program called Discovery - see http://discovery.altavista.com/ - that runs on the user's machine alongside Web browsers.

    To aid users, Infoseek (at http://www.infoseek.com/) introduced a little while back the idea of clustered results, whereby hits from the same site are grouped together. This allows more distinct sites to be displayed in each page of results. It has now added to this its reviewed Web topics - essentially hand-picked related sites along the lines of Yahoo - and a related idea which it calls Extra Search Precision (see http://www.infoseek.com/Help?pg=ESP_feature.html).

    This feature is employed when searching for very common words or short phrases that typically generate huge numbers of hits, and works by extracting from the raw search results those sites which Infoseek staff consider the most relevant.

    Like Altavista, Infoseek has come out with free software. Called Express, it is a general metasearch program that enables you to search several engines at once. See http://express.infoseek.com/ for more information.

    Although Lycos itself has done little that is innovative in terms of search technology, one of its recent acquisitions, HotWired, has. The Hotbot search engine (at http://www.hotbot.com/) has added a service from a company called DirectHit (home page at http://www.directhit.com/). Starting from the assumption that the most-visited sites dealing with a given subject are also likely to be the most useful, DirectHit uses information about which hits search engine users have actually visited in the past to create a weighted list of results.

    A similar kind of thinking lies behind two new search engine technologies. Google (home page at http://google.stanford.edu/) ranks search result pages on the basis of which pages link to them. In particular, links from more important sites - like Yahoo, say - are given proportionately more weight.
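
    The flavour of this link-based ranking can be conveyed with a much-simplified Python sketch - it is not Google's actual algorithm, and the link graph is invented - in which each page repeatedly shares its weight among the pages it links to, so that a link from a heavily linked-to page counts for more:

        # Invented link graph: page -> pages it links to
        links = {
            "bigdirectory": ["widgets", "news"],
            "news":         ["widgets"],
            "smallblog":    ["widgets", "news"],
            "widgets":      [],
        }

        rank = {page: 1.0 for page in links}
        for _ in range(20):
            new_rank = {page: 0.15 for page in links}        # small base weight for every page
            for page, outgoing in links.items():
                for target in outgoing:
                    new_rank[target] += 0.85 * rank[page] / len(outgoing)
            rank = new_rank

        for page, score in sorted(rank.items(), key=lambda item: -item[1]):
            print(round(score, 2), page)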

    Unlike Google, which can be freely accessed by anyone, an equivalent system being developed by IBM, called Clever (home page at http://www.almaden.ibm.com/cs/k53/clever.html), is currently only an internal project. But given the hunger of search sites for new technologies that may give them an edge, and of users for improved search engines, it will surely surface soon.

    Secure Sockets Layer (SSL)

    At a time when it is becoming increasingly fashionable to view Netscape's chances against Microsoft as slim and diminishing, it is important to remember some of the major achievements of this plucky start-up. Ironically its greatest legacy may not lie in the creation of the Netscape browser.

    After all, Navigator was only one of several programs at the time that built on the original Mosaic graphical software (though Navigator did possess the crucial advantage of having behind it Marc Andreessen, one of the main developers of Mosaic), which in turn drew on the ideas of Tim Berners-Lee when he devised the World Wide Web in the first place.

    Far less apparent than the high-profile Navigator, but in fact far more pervasive, is a protocol called Secure Sockets Layer (SSL). In contrast to its fierce fight against Netscape's rival software, Microsoft has joined the rest of the industry in accepting SSL as a standard. An early competitor, called S-HTTP, has more or less sunk without trace.

    SSL, as its name suggests, is designed to provide a channel for secure communications between Internet clients and servers. It does this using public key encryption at a very low level. This means that it can be applied to all kinds of Internet activities - not just the World Wide Web - including e-mail, FTP, telnet and Usenet.
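
    From a programmer's point of view, using SSL amounts to little more than wrapping an ordinary network connection in an encrypted one. The following minimal sketch uses the ssl module supplied with current versions of Python; the host name is illustrative only:

        import socket
        import ssl

        context = ssl.create_default_context()      # sensible defaults, certificates checked

        with socket.create_connection(("www.example.com", 443)) as raw_sock:
            with context.wrap_socket(raw_sock, server_hostname="www.example.com") as secure_sock:
                # Everything sent over secure_sock is encrypted in transit
                secure_sock.sendall(b"GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
                print(secure_sock.recv(1024).decode(errors="replace"))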

    SSL is important because it marks the first step towards secure commercial transactions across the Internet. Indeed, it is currently the main way of doing this. As such, SSL has played an important if unremarked role in convincing sceptical businesses that the Internet is a suitable - and safe - medium for financial activities.

    Secure Electronic Transactions (SET)

    One of the most surprising switches in the continuing story of business on the Internet has been the shift from pessimism to optimism as far as online commerce is concerned. For many years the accepted view was that the sale of general goods and services was some way off, but almost overnight analysts and market research companies have started predicting fabulous market sizes in the not-too-distant future.

    In part this probably flows from an acceptance that the key issue of security has been largely resolved for business transactions. Nobody seriously doubts that the Secure Sockets Layer protocols are fundamentally sound (even if there have been occasional slips in implementation). Similarly, the more complex Secure Electronic Transaction (SET) protocols are now being recognised as appropriate for at least the first stages of online commerce built on credit card transactions.

    SET is something of an Internet miracle, since it was born out of two rival standards, one supported by Netscape and Mastercard, the other by Microsoft and Visa. Aware of how damaging a standards war would be at this stage and in this field, the various parties came together and agreed on the protocols.

    The protocols not only guarantee the security of credit card details as they pass from customer to supplier, but also ensure that the supplier cannot obtain those details while still being able to check that the payment will be honoured by the relevant credit card company. They are therefore even better from the customer's viewpoint than traditional credit card transactions, which is yet another reason why electronic commerce employing them should take off rapidly.

    Now that the basic problem of secure transactions across the Internet has been resolved - first through SSL and then with the agreement of SET - another issue has started to assume greater importance. This is the need to impart personal information to suppliers when buying goods online or signing up for a customisable service.

    There are two main problems. The first is that typically the same information - name, address, job title, credit-card details etc - must be entered for each site visited. As well as being a waste of time, this duplication of effort can be a serious disincentive to buying online or signing up for services, or at least an incentive to enter short and fictitious details where possible.

    More serious is the issue of trust: once the information has been entered, users have little idea what will happen to it. In particular, there is always a concern that personal details will be passed on to third parties for further and unauthorised commercial use. Indeed, in a survey carried out by the independent E-trust organisation, 70% of Internet users cited fears about privacy as a major reason for not registering this kind of demographic information online.

    A growing recognition that this could act as a significant brake on the development of online commerce has led to the drawing up of the Open Profiling Standard. This initiative, begun by Netscape and Firefly, is now supported by many players in the online commerce sector, including Microsoft. With this kind of support, OPS looks likely to take off, though how it will mesh with a similar project, the Platform for Privacy Preferences proposed by the World Wide Web Consortium, is not yet clear.

    The basic idea behind OPS is that a standard format for personal information, called a personal profile, will allow such data to be sent automatically from a customer to an online supplier without the customer needing to fill in forms. Users will be able to control which information is transmitted, and if any of this information can be passed on to third parties. Other elements of the OPS approach are that the information requested should be appropriate to the application - no questions about a person's marital status when they buy a CD, for example - and that information may only be collected if something of value is offered in exchange.

    An OPS profile contains what are called well-known sections: elements that all OPS-compliant clients and servers must recognise. These include items such as anonymous demographic information; more personal details such as name, address, telephone number, title and profession; currency choice; and information about the software used (for example, whether a browser is frames-capable). Note that profiles do not necessarily replace the cookies that are commonly used to store information about visitors to sites. For example, shopping-trolley details may still require a cookie to hold transient details about a visitor. However, cookies will no longer be needed just to store basic demographics.

    Much of the structure for the profile derives from an existing standard called vCard. This was originally drawn up by the Versit consortium, established by Apple, AT&T, IBM and Siemens. Versit's work has now been taken over by the independent Internet Mail Consortium. Alongside vCard, another key Internet technology employed by OPS is the digital certificate. This is used to establish the credentials of online vendors who will receive the profile information.
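
    A vCard is itself just a small text file, so the personal details carried in a profile might be recorded along the following lines (all the details here are invented for illustration):

        BEGIN:VCARD
        VERSION:2.1
        N:Smith;Jane
        FN:Jane Smith
        ORG:Acme plc
        TITLE:Purchasing Manager
        TEL;WORK:+44 171 000 0000
        EMAIL;INTERNET:jane.smith@acme.example
        END:VCARD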

    Servlets

    One of the major shifts that is taking place in the world of HTML is that from static pages to those that are dynamic. This does not mean just that they have moving images: these are pages that are generated on-the-fly, in response to a request from a Web client. In particular, the page that is generated may well change according to the circumstances of the request: who is asking for it, what their previous history at the server is, the context in which the request is made etc.

    The applications in business are clear: when a customer accesses a commercial Web server supporting such active page generation, it is possible to create a completely customised HTML page. In this way, so the theory goes, it will be possible to draw visitors into the Web site and hold them there longer since psychologically they will feel in some sense more at home.

    One of the most obvious manifestations of this approach can be found in many of Microsoft's Web pages. Instead of the usual .htm file extension, many Web pages have started sporting .asp - for example http://www.msn.com/default.asp. This refers to Microsoft's new Active Server Pages, and means that the pages are being generated on-the-fly as described above.

    Not to be left behind in this technology race, the Java world has added this facility, and even goes one better, at least in terminology. In the context of Java-based servers, such active pages can be generated by something that Sun has dubbed servlets. These are similar to basic Java applets in that they are platform-independent objects, but instead of being sent down to the Java-enabled Web browser, they exist and are run purely at the server end.

    Service Providers - UK

    At the beginning of 1996 there were only something like 50 resellers, and only six with international connectivity. The excellent list of UK and Irish ISPs provides details of no fewer than 264 companies (March 1997) offering national, regional or purely local Internet services. Information about how they connect to other parts of the Internet can be found at the Cyper Roads Map.

    It is hard to estimate just how many users these ISPs have, and which therefore are the leading players. Some estimates based on a consensus of industry sources are available, and these at least provide a starting point. For example, the figures suggest, probably quite rightly, that the overall UK Internet market is dominated by very few players, including three major companies: CompuServe, AOL and the Microsoft Network. These are all aimed at single users, and are not really suitable for business use.

    This leaves two companies with relatively large installed bases (more than 50,000 users) and with an established record of serving the corporate sector: Demon Internet and UUnet Pipex (Unipalm Pipex was acquired by the US company UUnet in November 1995). The quality of their Internet connections - both in terms of availability of dial-up lines and overall speed - has improved. This can be attributed to the fact that both have been investing heavily in their infrastructures.

    Demon Internet began this process through a tie-up with Energis to provide full national coverage at local rates in September 1995, and followed it in April 1996 with the installation of a full 45 Mbit/s link to the US that has dramatically improved download times. Demon Internet also offers 64 Kbit/s ISDN access for its basic £10/month flat rate.

    Not to be left behind, UUnet Pipex also put a 45 Mbit/s link to the US in place in September 1996, and has upgraded this to 90 Mbit/s in the first quarter of 1997. Moreover, it has announced a new pan-European backbone with hubs in ten cities, and plans to put a 10 Gigabit/s link across the Atlantic in 1998. On the dial-up side it already offers 33.6 Kbit/s, and has promised support for US Robotics 56Kbit/s modems too.

    Following these two leaders are a couple of companies that form a kind of second-tier ahead of the mass of 200+ general ISPs: Easynet and Global Internet. Both are notable for offering very low-cost ISDN links (though the wholly-owned subsidiary UK Online in Easynet's case) as well as leased line options. The fact that major players are investing so heavily in upgrading their networks is good news for corporate users, and one of the main reasons why for businesses there is little need to look elsewhere. But this does not mean that there is no place for the hundreds of local ISPs.

    Looking at the US experience, it would seem, for the moment at least, that they serve a valuable function for users who require more personal or local support - see the fascinating analysis from Boardwatch magazine on this and related issues commenting on its comprehensive listing of over 3000 US ISPs.

    A tool that could prove valuable for choosing and working with ISPs has been developed by Quza. Called Nimby, it employs the Traceroute program to measure how long it takes packets to traverse each part of a journey to any other Internet node. This allows you to check the response times of potential ISPs as well as helping to locate and monitor performance problems with current suppliers. A beta version of Nimby may be downloaded.
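
    Traceroute itself is available on most systems, so the underlying measurement is easy to try by hand. The following Python fragment simply calls the standard tool (tracert on Windows) and prints the per-hop timings it reports; the host name is chosen purely for illustration and the tool must already be installed:

        import subprocess

        # Run the system's traceroute and show its per-hop round-trip times
        result = subprocess.run(["traceroute", "www.example.com"],
                                capture_output=True, text=True)
        print(result.stdout)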

    Server-Side Include (SSI)

    HTML is so simple to write that it can easily be produced with the most basic of text editors. But even with fairly modest sites, there may well be good reasons why something a little more sophisticated might be advantageous. For example, Web pages often contain certain fixed elements - names, e-mail addresses, telephone numbers etc. - that recur throughout.

    Clearly it is inefficient to alter all these by hand if a detail needs to be changed. A more sensible approach would be to create a small, separate file, and then merely include that at the relevant point for each page. Changes would only need to be made once, and would be propagated automatically.

    Server-Side Include (SSI) offers just such a facility. With a Web server that supports SSI it is possible to modularise the HTML code. This is achieved using extra directives within the HTML pages held on the server, which are processed to create the final HTML page before it is sent across the Internet. SSI has many other capabilities: it can be used to include the current time and date within a Web page, or to call external programs. In this way SSI can create Web pages that are pure HTML - no JavaScript or Java - but which are generated dynamically according to the particular situation.
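
    In practice an SSI-enabled page is ordinary HTML sprinkled with special comments that the server expands before sending the page out. A sketch, with an invented include file, might look like this:

        <!--#include virtual="/includes/contact-details.html" -->
        <p>This page was generated on <!--#echo var="DATE_LOCAL" -->.</p>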

    Clearly, this is essentially the same as Microsoft's Active Server approach, which also uses embedded instructions within a Web page held on the server to generate on the fly the final HTML code sent to the user. As is so often the case, Microsoft's exciting new technology turns out to be more a case of old ideas updated, but marketed very astutely.

    Shared Wireless Access Protocol

    An interesting development in the networking world is the move towards wireless connections. The Teledesic satellite project, designed to allow fibre-optic connection speeds to the Internet from any point in the world, is perhaps the most ambitious manifestation of this. But another scheme, employing the new Shared Wireless Access Protocol, could be more important.

    Whereas Teledesic is likely to appeal principally to business users, the protocol is designed for use in the home too. It aims to become the digital nervous system of a new class of interconnected devices. Since Internet connectivity is also a major part of the standard, the protocol offers the possibility of linking various business and domestic devices to the Net without the need for intrusive wiring.

    It is derived from extensions of existing cordless telephone and wireless local area network technology. Up to 127 devices are supported, with an overall throughput of up to 2 megabits per second. Basic security is provided using a 40-bit encryption scheme.

    For data-only use, all stations are equal. But when voice communications are present, a central Connection Point is required. This consists of an ordinary PC, which also acts as the gateway to conventional telephone services. One reason why the protocol has a good chance of establishing itself is that firms such as Compaq, Ericsson, IBM, Intel and Microsoft are behind it.

    Shareware

    Many FTP sites store huge numbers of programs available as shareware. That is, you can download them freely from sites on the Internet to try them out. If you decide to use the program, you may be asked to send a small sum of money to the author.

    Some programs are distributed freely with no pre-conditions: that is, you may use them in any way, alter them, give them away or perhaps even sell them.

    Normally, though, even freeware (as such programs are often called) requests at least that you pass it on free of charge, and specifies that the copyright remains with the author. Public domain software often comes complete with source code, allowing you to enhance and modify the program as you like. Variants of shareware include postcardware (where you are asked to send a postcard to the author if you like the product) and even beerware (you buy him or her a drink - though quite how is not usually specified). Nagware carries annoying reminders to register the program if you continue to use it. Crippleware has important features disabled (typically saving or printing) to encourage you to register. Demoware lets you see what most of the program does, but not actually use it.

    Although freeware is common, most shareware requests a fee and registration. Payment enables the author to provide some kind of support - usually completely absent with freeware - and to produce upgrades.

    The shareware system has functioned well on the Internet, with users generally paying for what they use to encourage the continued flow and development of these programs.

    It is important that newcomers to the Internet do the same, or else the supply of generally excellent software could well dry up, leaving everyone the poorer.

    A note of caution: several small and useful programs with registration fees in the region of $5-$25 each may be matched by commercial products offering better support at much less than the total cost of these non-integrated utilities. Many may offer interesting alternative approaches to working, but adequate applets may already be included in the operating system or with integrated Office packages.

    Current leading shareware applications include Netscape, Paint Shop Pro, Pkzip, WinZip and several free of charge programs from Microsoft such as the power toys for Windows 95, Internet Explorer and HTML assistants for Word.

    These packages and other shareware can be obtained from ftp://src.doc.ic.ac.uk/computing.systems/ibmpc/. The CICA Windows {3, 95 and NT}, SIMTEL (DOS and Windows) and Microsoft mirror directories can be found here. The UK Microsoft Windows shareware publisher Springsoft has a Web site at http://www.springsoft.com. Available from here are free programs and links to other Windows software archives.

    Imperial College is frequently very busy; an alternative can be found at the University of Edinburgh, which hosts the European Microsoft Windows Academic Centre (EMWAC). The opening pages at http://emwac.ed.ac.uk/ point to mirrors of Microsoft's Web site and also its Gopher server. In addition, EMWAC offers a mirror of the CICA archive (ftp://ftp.cica.indiana.edu/) at http://emwac.ed.ac.uk/html/indiana/top.html and of the Winsock archive (ftp://sunsite.unc.edu/) at http://emwac.ed.ac.uk/html/sunsite/top.html.

    Windows NT software can be found at http://emwac.ed.ac.uk/html/internet_toolchest/top.html, which comprises various Internet programs written by Edinburgh University with the sponsorship of Microsoft, Digital and Sequent. They include servers for the World Wide Web, Gopher, Finger and WAIS. These are free and can be used by anyone who wishes to start providing information on the Internet using NT as a platform.

    Benchin' Software Review at http://www.benchin.com/ is a database of information on about 70,000 programs including end-user reviews of products.

    A good source of programs to download can be found at the University of Kent at Canterbury (http://www.hensa.ac.uk), which is open between 8pm and 8am. A good place to start is http://www.hensa.ac.uk/WhatsNew/; a weekly list of what has been added can be found at http://www.hensa.ac.uk/WhatsNew/WhatsNew.html.

    The site can be searched using http://www.hensa.ac.uk/search/, or you can browse through the archive at http://www.hensa.ac.uk/browse/, which is organised by mirrored archives (Netscape, Hot Java, Linux, FreeBSD, etc.).

    The micro side is held at the University of Lancaster (http://micros.hensa.ac.uk/), and contains files for most of the popular desktop platforms including DOS, Windows, Windows 95, OS/2, Apple Mac and Windows NT. Also available is a list of the top 50 files downloaded over the past month (for example at http://micros.hensa.ac.uk/micros/ibmpc-win.top50.html).

    Hensa is extremely British in its focus, so you may also like to look at http://www.jumbo.com/. This holds over 23,000 programs broken down by categories such as business, programming and utilities. Each category is split by platform and then by headings such as "Internet". Jumbo also allows you to read accompanying documents for a file before you download it.

    Consolidated details of shareware and freeware can be found at http://www.shareware.com/ with over 140,000 files.

    The Consummate Winsock Apps List at http://www.cwsapps.com/cwsa.html, or its mirror at http://www.star.co.uk/cwsapps/cwsa.html, presents information in tables divided into categories such as virus scanners, utilities and browsers. The basic information includes name, version, rating, size and a short description. A longer description is also available.

    The Ultimate Collection of Winsock Software (Tucows) at http://www.tucows.com/, with UK mirror sites at http://www.idiscover.co.uk/tucows/ and http://www.ukonline.co.uk/tucows/ is heavily into graphics, but the layout is good and contains information and ratings on the programs. A CD-ROM is also available, details at http://www.ukonline.co.uk/tucows/cdrom.html.

    The Nethead site at http://www.tiac.net/users/sfinnie/ has much smaller listings than either TUCOWS or Stroud but has a very personal element to the comments. The basic listing of Internet software leads to one long list of direct links to the sites.

    Alongside this listing is Software Sonar (http://www.tiac.net/users/sfinnie/sw_sonar.html), which gives informed comments about new programs. The rest of the site is perhaps of less interest to business users. News Vues offers fairly compact musings on the latest happenings in the online world, as well as links to main news sites. Surfari is a themed guide to sites, while Burns is an editorial rant whose value will depend on its subject and your interests. Sitings is a collection of fairly trivial Web sites.

    This is how much it costs to download software

    File size   9,600bit/s time    Cost     14.4Kbit/s time    Cost    28.8Kbit/s time    Cost
    10K         14 secs            0.2p     8 secs             0.1p    4 secs             0.06p
    50K         1 min 10 secs      1.2p     42 secs            0.7p    20 secs            0.3p
    100K        2 mins 25 secs     2.4p     1 min 20 secs      1.4p    40 secs            0.7p
    500K        12 mins            12p      6 mins 55 secs     7p      3 mins 20 secs     3.3p
    1Mb         23 mins 50 secs    24p      14 mins            14p     6 mins 40 secs     6.7p
    2Mb         47 mins 40 secs    48p      27 mins 45 secs    28p     13 mins 20 secs    13p
    3Mb         1 hr 11 mins       71p      42 mins            42p     20 mins            20p
    4Mb         1 hr 35 mins       95p      55 mins            56p     26 mins 40 secs    27p
    5Mb         2 hrs              £1.20    1 hr 9 mins        69p     33 mins 20 secs    33p

    Prices are calculated for a local call at cheap rate.

    Downloading software from the Internet is one of the cheapest ways of obtaining it. You can print out the table above as a ready source of reference. Note that when downloading software from busy sites it may be a bit slower than indicated.
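
    The figures in the table are easily approximated. Assuming roughly ten transmitted bits per byte of file (to allow for start and stop bits and protocol overhead) and a cheap-rate local call at about 1p per minute, a few lines of Python reproduce them to within a minute or so:

        def download(file_kbytes, modem_bps, pence_per_minute=1.0):
            # Time in seconds and call cost in pence for a given file and modem speed
            seconds = file_kbytes * 1024 * 10 / modem_bps
            cost = pence_per_minute * seconds / 60
            return seconds, cost

        for size_k in (100, 1024, 5120):               # 100K, 1Mb and 5Mb
            secs, pence = download(size_k, 28800)      # a 28.8Kbit/s modem
            print("%5dK  %4.1f mins  %4.1fp" % (size_k, secs / 60, pence))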

    Finally, a techie bit. If you have a choice between downloading from a Web page (using HTTP) or from an FTP site, choose HTTP. It will usually be faster and more reliable.

    Simple Workflow Access Protocol

    By its very nature, the Internet is about communication. It is already the perfect medium for sending E-mail or accessing Web pages. More complex forms of communication, however, where it turns into full-blown collaboration, remain largely underdeveloped. And yet the demand for such capabilities is great. The rapid rise of the intranet is proof that firms want to use Internet technologies to allow workers to collaborate more closely.

    Outside the Internet, one of the most important classes of collaborative programs is generally called workflow software, which is designed to manage long-running, process-oriented applications. The workflow engine uses a model of such processes to assign tasks to participants in a workflow, and move processes forward.

    The addition of the Internet dimension to workflow software clearly adds to the complexity, particularly with respect to handling distributed tasks. But equally, the application of workflow techniques to Internet operations could help draw together far-flung processes that are currently disconnected.

    To facilitate the growth of the Net workflow market, a group - led by Hewlett-Packard, Netscape and Sun - has drawn up the Simple Workflow Access Protocol. This is an Internet-based standard, designed to allow workflow products from multiple suppliers to interoperate. It enables workflow software to start, monitor and exchange workflow data, and to control workflow processes in conjunction with other workflow systems.

    Standard Generalised Markup Language (SGML)

    The HyperText Mark-up Language describes only the overall structure of a document, not its appearance (though the structure is often made visible through the onscreen layouts created by particular Web browsers). In this respect HTML shows its origins as an application of a more general approach called the Standard Generalised Markup Language (SGML).

    SGML provides a formal - not to say highly rigorous - framework for describing the structure of documents; HTML is an example of how the SGML approach is used (in fact an extremely simple example, which is why it is so easy to create HTML documents). Like HTML, SGML applications use tags, contained within the characteristic angled brackets found in Web source documents, which define certain structures. In general these tags exist in pairs, with a 'cancelling' tag containing the slash character (also used in HTML). For an excellent introduction see Readme.1st, SGML for Writers and Editors, £30.47, ISBN 0-13-432717-9

    SGML applications come with full definitions of what the tags are and precise instructions about how they can be used, contained within a Document Type Definition, or DTD. For example, often certain tags must be used just once, or only in certain contexts. HTML, by contrast, has few restrictions on what can be used where (although it too possesses a complete DTD).

    SGML is designed to help writers to structure their documents, and more particularly to aid their use across many different platforms and in many different incarnations (for example printed, as a CD-ROM, online or even converted into Braille). HTML also possesses the former useful ability - use across many different platforms, critically important in a mixed environment like the Internet - though the latter is less relevant.

    First, HTML is not a language for describing the appearance of a page, however much it might seem to be. Instead, it describes the underlying structure of that page. In this it differs radically from the more familiar desktop publishing programs. These explicitly define the size and position of all the components of a page.

    Because HTML describes the structure, not the appearance, when creating a Web page you do not know how exactly it will appear on the screen of the person viewing it. To understand why this is so, it is worth examining what happens when a Web page is requested and retrieved.

    When you use a browser such as Netscape to access a Web page, say the one at http://www.gm.com/index.htm, a request is sent across the Internet to the Web server at that address. Assuming that the requested page exists and is freely available (some require passwords before they are sent), the server returns that page to the browser over the Internet.

    The page itself consists of nothing but a text file, written according to the rules of HTML. Within this document there may be references to multimedia elements (graphics, sounds etc), in which case these are sent separately. When the main HTML document arrives at the browser, it is processed in a simple but important way. The HTML structural markers are located (following SGML conventions, they are all written between angled brackets <>), and then converted to an on-screen representation. However - and this is crucial to remember - it is the browser that chooses what form this will take.

    For example, some of the commonest structural elements within an HTML document are various levels of headings. When creating the HTML file, you simply specify that certain phrases are first or second level headings, etc. You are not able to specify the size of headings or a certain typeface, since these are determined by the settings of the browser that processes your HTML document (and which can often be altered by the user).
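
    For example, two levels of heading are marked up purely structurally, along the following lines (the wording is invented):

        <h1>Annual Report</h1>
        <h2>Financial Highlights</h2>
        <p>Each browser decides for itself which typeface and size to use
        when it displays the two headings above.</p>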

    The fact that HTML is an SGML application therefore changes the way you must think about the Web pages you create. Remember that what you see on your screen is not what other users will necessarily get, and that special and "clever" effects will almost certainly be lost on some browsers. For HTML documents, logical structure, not dramatic layout, is paramount, and simple but effective design is the order of the day.

    Signatures

    Internet signatures are used in two contexts: at the end of E-mail messages, and for Usenet postings to newsgroups. As the name implies, a signature is the online equivalent of the scrawl we append to pen-and-paper communications, and is a chance to add a personal element to a cold assemblage of ones and zeros.

    As well as your name, in whatever form you choose for online communications, there are a number of other elements that can be added.

    Your E-mail address is not one of them, since it will be displayed automatically elsewhere. Alternative E-mail addresses might be useful if you wish to redirect correspondence to another mailbox. Physical addresses and telephone numbers are also common, and are useful for others, though bear in mind that if these are personal addresses you might be handing them to someone you have never met (through E-mail) and might never want to meet (in some Usenet newsgroups). Women in particular might want to consider this before adding them.

    Many people use some favourite quotation to give a personal touch. Although this can be amusing the first few times, it can become rather tiresome. Similarly, the extravagant designs some people concoct for their signatures using letters and numbers soon become a waste of transmission time. More useful in a business context is a discreet advertisement: something along the lines of "Acme plc - serving software to the nation".

    Provided these are kept very short and moderate in tone, they are generally acceptable, though for Usenet postings, this does vary according to the newsgroup.

    Here is a very nice example - from a keen cyclist - Don Judge, BT Labs.

           __o 
          -\<,
    ..... O/ O

    Sitemaps

    Frames within Web pages are wasteful: often they are little more than browser technology for technology's sake. But there is one use that is more justified: this is to employ a small frame as an overall guide to the Web site.

    One of the biggest problems when navigating around collections of Web pages that may run into thousands is keeping track of exactly where you are on the site. Using a separate frame with some kind of hierarchical view can, if done properly, provide a much-needed sense of context. There are many possible implementations of this idea, but clearly a widely-adopted standard that addresses this increasingly important problem would be preferable.

    Never one to be backward in making suggestions, Microsoft has come up with something it calls sitemaps that provide just these features. Although developed as part of the forthcoming Internet Explorer 4 release, the sitemap idea has also been submitted to the World Wide Web Consortium for consideration as a more general standard.

    Basically the sitemap is an HTML file that contains hierarchical information about a Web site's structure. It is downloaded with HTML pages from that site, and can be used to provide an overall map. No new HTML tags are used, a strong point in the sitemap idea's favour. The information contained within the sitemap file can be used for other purposes. For example, it would allow Web search engines to index sites in new ways by retaining some sense of the overall structure. It could also be useful when employed in conjunction with offline Web readers.

    Sound Files

    Just as there are many graphical file formats in use on the Internet, so sounds are stored in a variety of ways. Most of them are sampled sounds: that is, the smoothly varying pressure that defines the frequencies and amplitudes we hear is converted into a voltage by a microphone, and is then measured many thousands of times a second and stored as a series of numbers. Expressed in binary, these numbers are the result of the initial analogue-to-digital conversion.

    When a digitised sound is played back, the reverse process takes place, with the numbers being used to define the pressure of a sound wave, recreated with a loudspeaker or equivalent.

    How the numbers are obtained and stored is somewhat arbitrary (provided the same technique is used for digitisation and for recreating the analogue form), and the various sound file formats encountered reflect slightly different approaches. The most common is probably the format with the file extension .wav.

    The other main format has the extension .au, employed by Sun and NeXT computers. Others include .aif, .voc and .snd. All of these sampled formats suffer from the serious drawback that files of acceptable sound quality are large.

    More important than some of these rarer formats are MIDI sound files. Unlike all of those described above, which are sampled representations of actual sounds, MIDI defines music in terms of collections of notes played by specific instruments. In many ways, it can be regarded as the audio equivalent of a PostScript file, whereas sampled sounds are like bitmaps. MIDI files are compact, but the sounds produced depend on the MIDI synthesiser used to realise the music they store.

    A one-minute .wav of CD-quality audio requires 10 Mbytes. Clearly this rules out downloading such files across the Internet, which in turn makes them unsuitable for routine use in Web pages. These audio files are produced by sampling real sounds; Midi files, by contrast, are simply sets of instructions for creating sounds. As a result they are extremely compact - typically just a few tens of kilobytes. However, the downside is that some kind of synthesiser is required to turn the Midi file's instructions into notes (Midi is only useful in the context of music).
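
    The 10 Mbyte figure is straightforward arithmetic - CD-quality sound means 44,100 two-byte samples a second in stereo - as a few lines of Python confirm:

        samples_per_second = 44100
        bytes_per_sample = 2
        channels = 2
        seconds = 60

        size_in_bytes = samples_per_second * bytes_per_sample * channels * seconds
        print(round(size_in_bytes / (1024 * 1024), 1), "Mbytes")   # about 10.1 Mbytes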

    Fortunately, today's personal computers have so much processing power that these synthesisers can be purely software-based (though a basic soundcard is still required for the final output). Yamaha is one of the leaders here, and offers a number of freely-downloadable soft synthesisers - see http://www.yamaha.co.jp/english/xg/html/midhm.html. Such synthesisers can also be supplied as browser plug-ins - as with the Yamaha Midplug (at http://www.cyber-bp.or.jp/yamaha/midplug/index_e.html).

    This software-based approach obviates the need for users to have a decent synthesiser to play Midi files, but does mean that they will have to download software or plug-ins first. In which case they might as well download a dedicated sound program such as Beatnik (home page at http://www.headspace.com/beatnik/index.html) - which also comes with a built-in software synthesiser. Beatnik can play most audio files, but is really designed to work with its own enhanced format which uses the extension .rmf. There is a converter that turns .wav, .au, .aiff and Midi files to .rmf (see http://www.headspace.com/beatnik/converter/index.html). Although inventing yet another sound format might seem quixotic, the parent company Headspace has signed up some big names to support it. Beatnik is now included in the new Netscape Communicator 4.5, and will be in Java 1.2 (http://www.headspace.com/beatnik/index.html?developers/news.html).

    The MPeg audio layer 3 format - better known as the increasingly notorious MP3 (see http://www.mpeg.org/pointers/mp3.html) - is likely to have most impact outside mainstream business. It offers CD-like quality in about a tenth of the space an equivalent .wav file would require. This makes the download of CD music tracks, and even entire albums, viable, much to the horror of the music industry. But even these compressed file-sizes are too large to be routinely useful within Web pages. A far better solution is to adopt one of the streaming technologies. These allow audio to be heard before the entire file has arrived, which means even high-quality sounds can be listened to without waiting more than a second or two.

    It is possible to stream both Midi (using software called Crescendo Streamsite - see http://www.liveupdate.com/css.html) and MP3 (using Xingtech's Streamworks MP3 Server - http://www.xingtech.com/products/mp3server/), but most sites use dedicated streaming technologies.

    The big names here are RealNetworks' RealAudio (http://www.real.com/) and Microsoft's NetShow (http://www.microsoft.com/ntserver/nts/mediaserv/default.asp). Other specialised approaches include Emblaze (at http://www.emblaze.com/) - which employs Java, so no plug-in is needed - and Macromedia's Shockwave for audio.

    Spamdexing

    Almost certainly, anyone who has an external e-mail account will have received by now several pieces of junk e-mail. Often this is called spamming, although spamming refers to the practice of posting irrelevant messages to many Usenet newsgroups rather than sending unsolicited messages to individuals.

    Still, the word "spam" is undeniably attractive, so it is perhaps no surprise to see it appear in new forms to describe related anti-social or unsporting online activities. For example, the term spamdexing is used to describe the practice of trying to trick Web search engines into placing an HTML page higher in their search results.

    This is clearly an attractive thing to do, since Web search results typically run to hundreds if not thousands of hits; being number 435 in the list is not much use, since few people will view more than the first few dozen. The idea behind spamdexing is to add keywords repeatedly to a page. The search engine may then give a greater weighting to this page, and thus place it higher in the overall search results.

    Since such spamdexing could result in unreadable pages if carried out in the visible document, the usual practice is to hide the keywords in HTML code that is not seen by the reader (for example using the <META> tag, which is designed to give information in precisely this way - though not to be abused quite like this). In the usual online cat-and-mouse fashion, Web search engines are now developing strategies to filter out spamdexes. The InfoSeek service even threatens to remove an entry for a page completely if there are too many repeated keywords.
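
    As a rough sketch of the kind of filter a search engine might apply, the following Python fragment counts how often each keyword is repeated in a page's keywords meta tag and flags suspicious repetition; the HTML snippet and the threshold are invented for illustration.

    import re
    from collections import Counter

    # An invented page fragment with a keyword stuffed into the meta tag
    html = '<meta name="keywords" content="cheap, cheap, cheap, cheap, holidays">'

    match = re.search(r'<meta name="keywords" content="([^"]*)"', html, re.I)
    if match:
        counts = Counter(k.strip().lower() for k in match.group(1).split(","))
        suspicious = [word for word, n in counts.items() if n > 3]   # arbitrary threshold
        if suspicious:
            print("Possible spamdexing:", suspicious)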

    Spamming

    In March 1994, two US lawyers, Laurence Canter and Martha Siegel, sent some messages to the Usenet newsgroups. These messages were doubly notable: first, for being advertisements - they were about a scheme connected with the US government's Green Card lottery - and secondly, for being posted to every Usenet newsgroup then in existence.

    In doing so, the husband-and-wife team not only over-stepped the mark as far as business on the Internet is concerned - the inappropriate posting of advertisements is still frowned upon - but also broke all records for wasteful cross-posting. They also entered the Internet history books and gave rise to a new word: spamming.

    As a noun, spamming or spam refers to this kind of mass, indiscriminate posting across hundreds or even thousands of newsgroups; as a verb, to spam is to post in this way. The name seems to originate from a well-known sketch from the BBC TV comedy series Monty Python's Flying Circus, in which the meat product Spam was similarly ubiquitous on a canteen menu. Today it is applied to any kind of large-scale posting, particularly of a blatantly commercial nature. A considerable culture has grown up around spam, including sites treating it seriously (for example the advertisers blacklist at http://math-www.uni-paderborn.de/~axel/BL/blacklist.html) and not so seriously (see the excellent satirical page http://www.suck.com/dynasuck/95/11/15/ for a comprehensive collection).

    Although tiresome, spam is not difficult to deal with. It is easy to skip over spams if you use a newsreading program that downloads headers first; some will let you set up filters to ignore spams automatically. There are even public-spirited individuals who dedicate themselves to removing spams from newsfeeds through the use of cancelbots.

    S/MIME

    As the Internet becomes an everyday tool in business, so the need for completely secure operations across it - and across intranets - becomes crucial. Indeed, a lack of security - or a perceived lack of it - has been one of the main brakes on the Internet's uptake. In fact, there exists a panoply of Internet security mechanisms, ranging from low-level protocols such as SSL and SHTTP to complete security schemes such as SET, designed to allow credit card transactions.

    To these can be added S/MIME (see http://www.rsa.com/rsa/S-MIME/). This stands for Secure/Multipurpose Internet Mail Extensions, and as its name suggests, it builds on the MIME standard used for Internet mail. Although the standard is relatively new, none of its constituents is: rather, it is a packaging of several well-established techniques in order to provide a completely secure e-mail transmission method that is transparent to the user.

    E-mail messages are encrypted using a standard symmetric encryption technique; the key for this is then further encrypted using the asymmetric public key technique that lies at the heart of just about every Internet security system. S/MIME also supports digital certificates so that recipients can be sure not only that the message has not been read by third parties, but that it actually comes from the person who claims to be the sender (it is otherwise quite easy to forge e-mail messages).
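
    The principle - encrypt the message with a fast symmetric key, then encrypt that key with the recipient's public key - can be sketched in a few lines of Python using the third-party cryptography library. This is only an illustration of the hybrid technique, not of the S/MIME message format itself.

    from cryptography.fernet import Fernet
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    # The recipient's key pair (in practice taken from a digital certificate)
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # 1. Encrypt the message body with a one-off symmetric key
    session_key = Fernet.generate_key()
    ciphertext = Fernet(session_key).encrypt(b"Confidential e-mail body")

    # 2. Encrypt ("wrap") the symmetric key with the recipient's public key
    wrapped_key = public_key.encrypt(session_key, oaep)

    # The recipient reverses the process with the private key
    recovered_key = private_key.decrypt(wrapped_key, oaep)
    print(Fernet(recovered_key).decrypt(ciphertext))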

    One of the most important aspects of S/MIME is interoperability. The idea is that any software supporting this standard will work with any other - a necessary precondition if it is to be deployed on the Internet where many different packages are in use.

    Smileys

    The Internet is essentially a written medium, which means that the body language and inflections of voice you would use face-to-face or on the telephone are completely lost. All the recipient has is your bare words - with no chance for you to qualify or explain what you really mean. To compensate, Internet users often add what are called 'smileys' to messages. These are tiny pictograms that are used to represent - very crudely - the intent of the person writing the message.

    The basic smiley is :) which, when rotated through 90°, is indeed a smiling face. It is often employed to indicate that a comment was meant jokingly or ironically, and is used to divert any sense of grievance on the reader's part. Variants include the wink ;) to indicate that things have been left unsaid, and the grimace :( which is used to show that you are unhappy with the situation. Smileys are sometimes taken to extremes, like C={:*{) which smiley aficionados would claim represents a drunken chef with a moustache and a toupee. Too many smileys, or smileys that are too obscure, negate the point of using them. If you wish to use a smiley to convey something to your reader - and many people don't, finding them rather twee - then bear in mind that here as elsewhere on the Internet, less is usually more. A long list of smileys is also available.

    SOCKS

    As more and more companies link existing internal networks with the Internet, so the need for a powerful defensive system to keep unwanted visitors out becomes increasingly important. Firewall is the generic name given to this area, but there are many different approaches.

    One technique is known as SOCKS (not an acronym, despite its appearance). This is a networking proxy mechanism that allows hosts on one side of a SOCKS server to access those on the other; typically this server will act as a firewall between an internal corporate network and the Internet.

    Unlike other firewalls, SOCKS is transparent to the user, and allows any TCP/IP application to be used; it is not limited to HTTP or FTP requests as some firewalls are. Moreover, the latest version of SOCKS can also cope with UDP packets as well as TCP, allowing the use of Internet phones and streaming audio techniques that require the former.

    Other advantages of the latest SOCKS 5 are its support for various kinds of user authentication, monitoring and control. It is also much easier to implement than the previous SOCKS 4. Where SOCKS 4 often required modifications to be made to software in order to use the service (or support to be built in, as with Netscape Navigator or Internet Explorer, for example), SOCKS 5 capabilities can be added more easily.

    Hitherto, SOCKS has been rather an obscure and specialised area, but the latest incarnation, which incorporates many new features added with typical business users in mind, means that it could well become a popular approach in the firewall world.

    The creases in the Socks protocol, which protects corporate networks, have been ironed out in its latest version. 

    For companies with connections between their corporate intranets and the open Internet, it is essential to secure the points where they meet - and to make sure they do so only at a few such controlled locations. This is nearly always effected using what are called firewalls - dedicated servers or software whose task is to monitor and if necessary block the passage of Internet Protocol packets from outside to within the corporate network. There are three basic types of firewalls: packet filters, application-level proxies and circuit-level proxies.

    These operate at different levels of the communication process. Packet filters work at the transport level by examining where packets come from and blocking those that are from unauthorised sites. Application proxies work at the top-most layer of data transmission, and act as a relay between specific clients and their servers.

    Packet filters tend to be rather crude, while application proxies suffer from the drawback that they have to be created for each application. Circuit-level proxies offer a kind of compromise since they operate at the session level and can work with many different kinds of client/server applications.

    One of the most interesting examples of the circuit-level proxy technology uses a protocol known as Socks - an internal development name derived from SOCKetS. It has a home page, and an introduction to Socks and a FAQ are also available.

    Although an earlier version, Socks 4, had many advantages, notably a precise control over who could access what over the network, it also suffered from some major difficulties.

    Prime among these was the need to modify clients so that they could work with a Socks proxy server - a process known as Socksification. Other deficiencies were a lack of authentication, encryption and support for User Datagram Protocol (UDP).

    While the transmission control protocol (TCP) is generally used for reliable transmission of data across the Internet, the unreliable UDP (unreliable in the sense that no checks are made for lost packets) has become increasingly important for streaming media like RealAudio.

    The inability of Socks 4 to cope with UDP was an obstacle to its deployment for many companies.

    The current release, Socks 5, also known as authenticated firewall traversal, addresses this and other issues. Major advantages of version 5 are that clients can be Socksified without being rewritten, and that both encryption and user authentication are offered. Socks 5 is an IETF standard (RFC1928).
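
    As a rough illustration of how transparent this is for the client, the following Python sketch uses the third-party PySocks library to open an ordinary TCP connection through a Socks 5 server with username/password authentication; the proxy address and credentials are invented.

    import socks   # the third-party PySocks library

    s = socks.socksocket()                  # a drop-in replacement for an ordinary socket
    s.set_proxy(socks.SOCKS5, "socks.example.com", 1080,
                username="alice", password="secret")   # hypothetical proxy and account
    s.connect(("www.example.com", 80))      # from here on the application code is unchanged
    s.sendall(b"HEAD / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
    print(s.recv(200).decode("latin-1"))
    s.close()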

    As the above URLs indicate, the driving force behind Socks development has been the US arm of NEC. As well as free copies of the basic Socks 5 reference implementation, the company makes available evaluation copies of its commercial software.

    A number of companies now produce Socks software. On the client side, Socks 4 capabilities are built into both Microsoft's Internet Explorer and Netscape's Navigator browsers.

    Two other specialists in Socks technology are Aventail with its Autosocks product and an excellent white paper and Hummingbird.

    The latter offers a free Socks 5 client library for Socksifying any Winsock-compliant software.

    Momentum seems to be gathering behind the Socks initiative. The first summit took place in Q4 1998 and support for the standard among software suppliers is growing.

    Speaker Verification API (SVAPI)

    The Internet is fast becoming the universal interface. With it, potentially, you can control any software or hardware, either locally or globally, and soon you will be able to carry out more or less any commercial transaction.

    But as ever, with increased power comes increased risk. Access to these extended abilities (and extranets) is built on the assumption that the Web user is somebody with the authority to carry out those operations.

    Various authorisation schemes are employed to try to ensure the person using an Internet node is who they say they are. Passwords are the classic approach to this problem of resolving identity, and yet it is well known that most security breaches occur through compromised passwords - largely because fallible humans tend to make extremely bad choices when it comes to selecting them.

    As well as easily guessable passwords involving names of family, pets and friends, there are a range of common words - including items such as "password" and swear words - that regularly crop up in password lists. An alternative approach to authentication is to use biometric systems. These measure biological attributes of the user and compare them with the known characteristics of authorised personnel.

    One obvious attribute to use is the voice, and there is a consortium putting together a software standard called the Speaker Verification API (SVAPI) that is designed to allow this kind of authentication to be used with general Internet and non-Internet software. A Java-based implementation will be released first, with versions for Windows and Unix promised for the future.

    Spoofing

    Spoofing attacks work by misleading people about whom they are dealing with, or where they are on the Internet. The best known are the e-mail spoofs, whereby a message appears to come from one person when in fact it derives from a third party who has doctored the headers, and DNS spoofs, which fake Internet addresses. More recently, it has become apparent that the World Wide Web is vulnerable to spoofing on an even larger scale. Moreover, this Web spoofing, as it has been dubbed, requires far less technical expertise to accomplish.

    The basic idea is to divert someone's exploration of Web space into another, seemingly identical world of WWW servers. When the spoofer is skilful, there is almost no way to tell that the sites you are visiting are in fact adulterated copies (about the only certain way is to look at the underlying HTML code, where tell-tale anomalies in URLs will give the game away).

    For Web spoofing to take place, all that is needed is a URL that seems to point to one site but in fact takes you to a falsified copy. Then, using JavaScript for example, it is possible for all the URLs it contains to appear correct when they are displayed in the Web browser, but for the real links always to point within this shadow world. Only if you choose to employ one of your own bookmarks can you be sure to exit back to the real Web.
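
    The mechanics of the rewriting are simple. As a minimal sketch (with invented host names), the following Python fragment shows how every link in a copied page can be altered so that it points back through the spoofer's own server:

    import re

    SPOOF = "http://spoof.example.net/fetch?url="   # hypothetical spoofing server

    def rewrite_links(html):
        # Rewrite each absolute link so the browser stays inside the shadow copy
        return re.sub(r'href="(http://[^"]+)"',
                      lambda m: 'href="%s%s"' % (SPOOF, m.group(1)),
                      html)

    page = '<a href="http://www.realbank.example.com/login">Log in</a>'
    print(rewrite_links(page))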

    One particularly worrying aspect of Web spoofing is that it is possible to be diverted in this way even when accessing a secure site: the presence of the familiar key or padlock in the browser window merely indicates that you have entered a secure site - not that it is the one you were expecting.

    Steganography

    For businesses, one of the most disconcerting aspects of working with the World Wide Web is the absence of control. Because of the way the Web works, where everything that visitors view is downloaded first to their machine, it is hard to stop people keeping copies of what they find there, be it text, images, sounds or whatever. One approach, developed by IBM, is to use a complete rights management system. IBM's product, called Cryptolope, essentially employs special software to allow users to gain access to restricted material, usually for a payment.

    This is fine where particularly valuable materials are concerned, but is clearly overkill for the vast majority of cases. What is required for most Web sites is something that does not prevent visitors from copying material temporarily but which provides a fairly foolproof way of tracking when this is re-used illegally elsewhere. It is fairly easy to check whether texts have been pirated, using Web search engines. But with images, or sounds, this is much harder, especially for third parties who wish to know whether material they come across is copyright-free or not.

    This has led to the creation of techniques for marking visual and audio content with provenance and copyright information, but in a completely transparent way. A good example of such digital watermarks (as they are often called in the context of graphics) is offered by the latest version of Adobe's Photoshop. Long regarded as the top graphics manipulation package for Web sites (and elsewhere), in its latest incarnation it has added a new filter (that is, a process for manipulating a graphics file) licensed from Digimarc.

    This lets you embed invisible digital watermarks within any kind of graphics file. The watermark will typically include the creator's ID (obtained from Digimarc, which maintains a directory of such IDs), as well as information about whether the image is royalty-free or not. During the creation of the digital watermark, you can select how visible and how durable it is: the more effect the digital watermark is allowed to have, the more resistant it will be. That is, even if the image is altered by someone who hopes to use it on another page, say, the digital watermark will still be readable by Photoshop or similar software.

    On the other hand, if image quality is paramount, it is possible to embed the watermark very lightly so that it has no visible effect; but in this case even small changes to the file may result in the hidden information becoming scrambled to the point of being unreadable. The appearance of the facility within Photoshop is probably the most significant example of this technique of hiding information within other files, simply because of Photoshop's large installed base. But it is by no means the only software that allows you to do this. There is a whole class of programs in this field, technically known as steganography. A very good introduction to the whole subject is available, as is a list of steganographic tools. Most of these are very low-cost shareware: for the Windows 95/NT platform there is S-tools, and for the Macintosh there is Stego.

    Steganography has an importance beyond that of simply embedding rights and trademark details. Since it allows you to place any information within more or less any multimedia file without it being apparent that there is anything there, it represents a very powerful form of encryption: through concealment, rather than through hard-to-crack, but obvious, schemes such as PGP. In particular, steganography is one reason why the crude forms of censorship that have been proposed for some newsgroups simply will not work: anyone who wishes to post an image - of whatever kind - would simply conceal it in a completely innocuous form and post that to another, neutral newsgroup like alt.test where others could download it and retrieve the hidden data.

    The big advantage of the steganography approach over conventional cryptography is that before standard decryption techniques can be brought to bear, the right multimedia file must first be discovered; not necessarily easy if the carrier is hidden among hundreds or even thousands of similar files.

    At first sight it might seem impossible to place hidden data within an image, say. But steganography relies on the fact that the level of detail contained within an image is often far greater than that noticed by the human eye. So, for example, if a colour picture is represented in the usual way as pixels chosen from among 256 colours, each stored as an eight-bit number, it is possible to change the last (least significant) bit of that number without appreciably altering the colour.

    So digital information can be placed within an image by changing the colour of a certain small subset of the pixels. Unless the quantity of information so hidden is relatively large, or the image ill-chosen (large areas of one colour, for example), the fact that it bears hidden content will not be apparent. Similar techniques can be used with digitised sounds. The data can be retrieved by another copy of the steganographic program employing the same algorithms and password (which is used to ensure that the distribution of altered pixels is unique).
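
    The following Python sketch illustrates the least-significant-bit technique on a flat list of eight-bit "pixel" values. Real tools work on proper image files, encrypt the payload and scatter the altered pixels according to a password, but the principle is the same.

    def embed(pixels, message):
        # Hide the message, bit by bit, in the least significant bit of each pixel value
        bits = [(byte >> i) & 1 for byte in message for i in range(8)]
        out = list(pixels)
        for i, bit in enumerate(bits):
            out[i] = (out[i] & 0xFE) | bit   # clear the lowest bit, then set it to the data bit
        return out

    def extract(pixels, length):
        # Recover 'length' bytes from the least significant bits of the pixel values
        data = bytearray()
        for b in range(length):
            byte = 0
            for i in range(8):
                byte |= (pixels[b * 8 + i] & 1) << i
            data.append(byte)
        return bytes(data)

    cover = [200, 201, 202, 203] * 50                     # a toy "image"
    stego = embed(cover, b"secret")
    print(extract(stego, 6))                              # b'secret'
    print(max(abs(a - b) for a, b in zip(cover, stego)))  # at most 1: visually indistinguishable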

    Structured Query Language (SQL)

    Application Programming Interfaces, or APIs, are increasingly being used to provide ad hoc but useful extensions to Internet standards. Much the same can be said about SQL, or Structured Query Language.

    An important issue in companies today is how to hook up to the various databases that are in use via the external Internet or internal corporate intranet.

    Since each software manufacturer has a different approach (which naturally it claims is superior to that of its rivals), this could mean having to cope with many different systems. Pulling information together from diverse stores of information - one of the goals of the kind of Web-database integration discussed above - then becomes very complex.

    Fortunately there is a very widely used standard that brings some degree of order to this potential confusion. SQL is a way of making requests for information to compliant databases using a standard set of commands and syntax. Although there are some divergences in the various implementations of SQL, it does offer a basic starting-point for cross-platform database queries. This means that requests for information from the Internet can be converted into SQL queries that are passed on to one or more databases without the middleware (or IT department) needing to worry about the details.
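
    As a minimal sketch, using Python's built-in SQLite driver and an invented table, the query below uses only standard SQL, which any compliant database would understand; only the connection step is supplier-specific.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, country TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)",
                     [("Acme Ltd", "UK"), ("Widget Co", "US")])

    # The same SELECT statement would work against any SQL-compliant database
    for (name,) in conn.execute("SELECT name FROM customers WHERE country = 'UK'"):
        print(name)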

    Just as APIs can simplify greatly the task of interfacing between Web servers and other programs, so SQL is emerging as probably the most important aid in drawing in databases too.

    Synchronised Multimedia Integration Language (SMIL)

    The World Wide Web has grown up in an ad hoc way, starting with text, then adding images, sounds and video. So the simultaneous use of these multimedia elements has never been addressed properly. In particular, the only way to create a constantly changing flux of text, sounds and images is to create a video stream. This approach is both inflexible and inefficient, as video tends to require high bandwidth. To remedy this omission, the World Wide Web Consortium has come up with the Synchronised Multimedia Integration Language (SMIL). This is a mark-up language, an application of XML, that allows multimedia elements to be specified through simple instructions contained in a text file. SMIL permits multimedia streams to be played sequentially or in parallel, and for different elements to be placed in absolute positions on the screen. In this way, it would be possible to display, for example, two video streams with associated audio, text captions and background images, all changing in time according to the designer's requirements.

    Also, hotlinks can be embedded in video multimedia elements, as well as in text and images. This would allow SMIL presentations to offer full interactivity.

    Because a SMIL presentation is a simple text file that allows each multimedia element to be specified separately and then superimposed, it promises a simple development process and far smaller files for even complex presentations using many different and changing elements.
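
    As a rough sketch of what such a text file looks like (the file names and region sizes are invented), the following Python fragment writes out a minimal SMIL presentation in which a video, its soundtrack and a text caption play in parallel, each positioned in its own region:

    # The <par> element plays its children in parallel; <seq> would play them in sequence
    smil = """<smil>
      <head>
        <layout>
          <region id="main" left="0" top="0" width="320" height="240"/>
          <region id="caption" left="0" top="240" width="320" height="40"/>
        </layout>
      </head>
      <body>
        <par>
          <video src="talk.rm" region="main"/>
          <audio src="talk.ra"/>
          <text src="captions.txt" region="caption"/>
        </par>
      </body>
    </smil>
    """

    with open("presentation.smil", "w") as f:
        f.write(smil)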

    Tables

    In the growing battle of the Web browsers the issue of support for "HTML 3 elements" is often encountered. This is rather ironic, since HTML 3 - that is, the third official implementation of the HyperText Markup Language that underlies all World Wide Web pages - does not even exist.

    Although the current standard is still HTML 2, the overall shape of HTML 3 is clear, and so first Netscape, and then Microsoft and other software manufacturers, are leaping in with browsers that comply with their own assumptions about how the new elements of HTML 3 will be implemented.

    One of the most important of these is the table. Under HTML 2, the only way to construct a table is to space out the individual elements by hand. HTML 3, however, will allow you to specify all kinds of details, from the number of rows and columns and the size of individual elements to the thickness of the borders and the spacing between an entry and the border that contains it. Moreover, you can use more or less any HTML element within a table: these might include text of different sizes and weights, images and even further tables.

    Obviously the ability to structure information in this way is attractive for designers, but the inclusion of support for tables in HTML 3 has an even bigger if more subtle benefit. Because of the way HTML works, under HTML 2 it is not simple to create side-by-side layouts with text flowing from one column to the next, as in a newspaper. Tables will allow precisely this effect, which is one reason why they are being so eagerly adopted even before the standard has been formally approved.

    Tcl/Tk

    One of the stalwarts of the scripting world is Tcl, which stands for tool command language and is pronounced "tickle", created single-handedly by Professor John Ousterhout of the University of California, Berkeley.

    A Netscape plug-in is available that lets you use Tcl as a scripting tool within Web pages.

    Tcl is both the scripting language and an interpreter for that language. As the latter it is designed to be easily embedded in other applications, which allows Tcl to act as a scripting language in a wide variety of situations, including Web pages.

    Tcl is also readily extensible, and perhaps the best-known extension is Tk, a windows toolkit that adds important graphical abilities. One of the other major advantages of Tcl is that it is now becoming available for a wide variety of operating systems.

    TCP/IP

    TCP/IP stands for Transmission Control Protocol/Internet Protocol and so it might appear to be only those two things. It does refer to those, but the term also covers ARP (Address Resolution Protocol), UDP (User Datagram Protocol), telnet, SMTP (Simple Mail Transfer Protocol) and DNS (Domain Name System).

    It is therefore a large family of protocols, or rules - not software (though products may implement them). These rules specify in detail how information is sent across the Internet, and how various services such as telnet are implemented.

    They are intimately bound up with the Internet because they represent the fundamental basis for the interconnectivity and interoperability of all of the constituent networks and subsystems it comprises.

    TCP/IP derives its name from two of the most important elements of this family of protocols. The Internet Protocol handles fundamental issues such as addresses and sending the basic data packets over the Internet.

    The Transmission Control Protocol deals with higher-level functions: it ensures the delivery of data where possible, and reports when it cannot be delivered. It splits data up into IP packets, checks for lost packets and retransmits them where necessary, and re-assembles them at the receiving end.
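
    As a minimal sketch of this division of labour, the following Python fragment asks the operating system's TCP/IP stack to open a reliable connection and then speaks an application protocol (here HTTP) over it; addressing, packet splitting and retransmission are all handled underneath.

    import socket

    # TCP provides the reliable byte stream, IP the addressing; HTTP simply rides on top
    with socket.create_connection(("www.example.com", 80), timeout=10) as s:
        s.sendall(b"HEAD / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
        print(s.recv(1024).decode("latin-1"))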

    Teleworking

    Along with the never-to-be-realised 'paperless office', the other fable in computing is that of teleworking. Paradoxically, the widespread uptake of the intranet idea in companies may finally lead to the long-predicted shift to teleworking, that is, work outside the office building.

    The paradox is resolved by the Internet's ability to act as a bridge between intranets, and between individuals and intranets.

    A site for information on this area can be found at http://www.mtanet.co.uk/, put together by Management Technology Associates (MTA). Web pages on telework can be found at http://www.mtanet.co.uk/mta_oen/tw_intro.htm. From here there are links to pages about the kinds of telework and various recommendations to managers for its implementation.

    A complementary area is teletrade (broadly speaking online commerce); more about this can be found at http://www.mtanet.co.uk/mta_oen/ttintro.htm.

    MTA subsumes both these into the more general idea of Open Electronic Networking (OEN), discussed at http://www.mtanet.co.uk/mta_oen/oendef.htm.

    From here there are links to a survey on the subject undertaken for the department of Trade and Industry (at http://www.mtanet.co.uk/mta_oen/oendef.htm), as well as more about the application of OEN (http://www.mtanet.co.uk/mta_oen/oentools.htm), its tools (http://www.mtanet.co.uk/mta_oen/oentools.htm), how to implement it (http://www.mtanet.co.uk/mta_how/how_oen1.htm) and its benefits (http://www.mtanet.co.uk/mta_oen/oenbenft.htm).

    What could well be the third wave of the Internet in business are extranets. Like intranets, extranets are systems built using standard Internet technology. But where intranets are turned inwards to embrace employees within a company, extranets extend this network outwards to offer similar functionality to those working closely with it but separate from it - for example, designers, consultants, suppliers, customers etc.

    An important component of intranets and extranets in the future will be search engines. As more and more information is placed on intranets, and they become central to the way companies work, so the difficulty of finding something, and need to do so, will increase. Already the latest generation of Internet search tools - AltaVista, Excite etc. - have versions aimed specifically at the intranet market.

    A corollary to the rise of intranets will be the spread of the personal intranet Web server. Just as today people use basic voice-mail systems to route requests for information, particularly when they are away from their desk, so in the future sophisticated Web pages held on PCs will allow a vast range of information to be on tap within a company continuously, as well as providing a permanent and accessible store of employees' personal knowledge and skills.

    Get a little bit extra with your extranet

    One barrier to the uptake of extranets is the difficulty people have in visualising what exactly an extranet might be. Perhaps the best way of understanding extranets and the particular benefits they offer is to contrast them with the Internet and intranets.

    The Internet is the open, global network based around TCP/IP; it is essentially public (even though various security layers may be added). An intranet, by contrast, is a closed network based on the TCP/IP protocols; it is private, usually restricted to employees of a particular company.

    The Internet offers the possibility of communicating with huge numbers of people, but because little or nothing is known about these visitors, the information provided is carefully selected, and the external Internet servers are rigorously isolated from other corporate systems which may well contain sensitive material.

    It is precisely this kind of information that is generally available over intranets. The idea is to provide a means of access to all the key data that is generated and held by a company, but only to those authorised to view it. This is enforced using various security systems such as passwords and digital certificates.

    Intranets have been a tremendous success, not least because the benefits they bring are so clear: in effect, they realise most of the benefits long-promised as the fruits of integrated IT systems. But they suffer from a limitation that is becoming increasingly obvious. For businesses do not operate in isolation; every day they work closely with suppliers of raw materials, finished goods and services, and, in various ways, with customers. Any IT system that aims to mirror faithfully the whole business process must also encompass these partners too.

    An extranet aims to do this by extending the power of the intranet beyond the company's walls. In particular, it aims to allow trading partners to access various elements of the online corporate resources. It differs from Internet services in that it is not open to absolutely everybody, and because it enforces very strict security to ensure that users of the extranet are able to access only those areas for which they have authorisation. One reason why extranets are likely to surpass even intranets in importance is simply a question of mathematics: a business will generally possess only one intranet, but may well be part of many other companies' extranets.

    Examples of extranet applications include tightening Just-In-Time tolerances even further, with suppliers' IT systems linked directly into a company's production facilities. Extranets will allow marketing, PR and general business consultants to keep up-to-date with the projects they are involved in, without the need for formal meetings or even their physical presence in a company. Customers, too, can be drawn into an extranet, with particularly interesting implications.

    For example, customers might be involved in continuous market research projects whereby they gain access to experimental projects and provide instant feedback. By granting them access to certain parts of a company's IT system, customers can also carry out many time-consuming tasks - the placing of orders, checking the state of their account etc. - themselves, enabling mundane clerical work to be eliminated.

    The possibility of integrating customers with a business in this way has important implications in terms of loyalty. The more that users feel that a company is serving them directly, the more likely they are to have a sense of ownership, and regard its products and services as something in which they have a personal stake.

    Readers may recall a description of the FedEx system where customers could book pickups and track deliveries by making their PCs part of the main FedEx IT system. In retrospect, this can be seen as one of the earliest extranet systems - and a highly successful example at that.

    Netscape looks at the future of extranets

    Extranets promise to be even more important than intranets, creating a complex web of inter-company links around the world, like the Internet, but differentiated from it because of the security mechanisms that are an integral part of this approach.

    As well as their general business implications, extranets will have a profound effect on the whole of corporate IT. Intranets and public Internet sites are essentially under the control of companies: they can choose freely which technologies to employ, and which suppliers to buy from. But extranets introduce new participants that are outside the control of a single IT department. Partners that join a business's extranet will generally have their own IT solution in place, and only rarely will they be willing - or even able - to change it.

    As a consequence, those implementing extranets will be forced to opt for platform- and supplier-neutral solutions - otherwise current or future members may be locked out through the use of proprietary techniques. In other words, extranets must employ only the purest of open standards, based around core Internet technologies.

    This, of course, is precisely Netscape's strength, so it is no surprise that it has based much of its future strategy around extranets, or "The Networked Enterprise" as it terms it in one of its characteristically well-written white papers. As well as mapping out in some detail its product line for the coming year, this document also offers one of the best introductions to the theory and practice of extranets.

    Netscape uses the slightly redundant term of Crossware to describe the platform-independent software that will be needed to make extranets a reality. In truth this is more a case of dressing up old ideas with a fancy new name, and explicit mentions of Crossware figure rarely in the document, though the underlying idea is of course omnipresent. One exception is in the context of the product code-named Palomar, which is to be a "visual Crossware development tool", whatever that is.

    More interesting are the details of the next generation client and server products. Beyond Communicator 4, currently nearing the end of its beta-testing, there is a client code-named Mercury. Alongside the usual promises of improved Java and JavaScript performance, one real innovation is the ability to script all aspects of the browser environment. This is achieved with JavaScript, which can be signed for security, as can Java applets. Mercury will also come with a built-in personal Web server (clearly a response to Microsoft's free Personal Web Server).

    New networking features include agents that will help sift incoming information (a promise made by many companies, and still nowhere near realised), a full-screen "Internet desktop" (rather like Microsoft's Active Desktop, presumably) and a Hypertree representation of local and network resources, including those offered using the new WebNFS file system.

    Offline browsing will be a possibility, as will roaming - the ability to log in to an extranet from any point as if you were sitting in front of your primary machine. On the server side, the current SuiteSpot 3 will be followed by a range of products code-named Apollo. Server-side scripting will be enhanced, though whether Apollo will be able to match Microsoft's powerful new Active Server platform in this respect remains to be seen.

    Another area where Netscape is beginning to lag seriously is in transaction processing. Apollo will offer support for "advanced transaction processing and TP monitor capabilities", but Microsoft already has a product out, and is moving ahead fast.

    This potential server-side deficit is rather ironic given Netscape's current lead in this field. It suggests that in its battle against Microsoft's Internet Explorer, Netscape may be becoming too focussed on the client end of the equation. Still, there is no denying that Netscape's overall story is stronger than Microsoft's: if extranets do indeed take off then the latter's more proprietary approach will certainly have to be modified.

    Telnet

    Using telnet you can potentially control a computer anywhere on the Internet as if you were connected directly to it. This transparency is also telnet's biggest problem: since you are almost invariably using telnet to control some mainframe or mini-computer located elsewhere, you need to know how to use a mainframe or minicomputer, right down to entering obscure commands at an on-screen prompt. There is no friendly graphical interface, but access is often faster and more direct.

    It is one of the simplest, most powerful - and most neglected - aspects of using the Internet. Its neglect stems largely from the fact that the various public telnet connections available have little in common with each other: since you are controlling a mainframe or minicomputer somewhere else on the Internet it is necessary to know how those specific systems work. Unfortunately, there are no general rules about which commands you should use to do what; knowing how to use one telnet system does not necessarily help you manipulate any other. As a result, telnet tends to be regarded as too difficult or specialised by most users.

    Although it is true that telnet presents challenges which FTP with its well-established structure does not, you are not in fact completely on your own when grappling with this tool. Although relatively unknown, the Hytelnet program put together by Peter Scott tells you practically everything you need to know about publicly-accessible telnet systems, and makes them as easy to use as they can be.

    The idea behind Hytelnet is very simple. Information about most of the telnettable systems has been gathered together into a database of Internet addresses along with information about how to control the various systems you will find at these. This in itself would be useful enough, but what makes Hytelnet particularly powerful is that this information is presented in the form of a mini-hypertext system - hence the name Hytelnet, which derives from hypertext-telnet.

    So, the initial menu offers you broad categories of telnet systems, for example library catalogues. Selecting the first catalogue link takes you to another menu broken down by geography - The Americas, Europe and Asia. Under Europe you find a list of 22 countries, and from each of these there is a hypertext link that takes you to a list of the library resources available in the respective country. The last level of information contains the specific details. For example, the entry for the Guildhall's telnettable library catalogue reads as follows:

    TELNET SUN.NSF.AC.UK or 128.86.8.7 
    Login: janet
    Password: Hit RETURN
    Hostname: uk.ac.lgu.lib
    Username: LIBRARY
    OPAC=LIBERTAS
    To exit, type EXIT

    which, as you can see, includes the address, login, password and username information - as well as the all-important exit command.

    Hytelnet's abilities do not stop there: by selecting the Internet address (itself a hypertext link) the telnet function is started (assuming it is available on your system) and the connection made automatically to the distant computer specified by that address.

    Hytelnet is available for a number of platforms including Unix, VMS, MS-DOS and the Apple Macintosh. The main repository of files is at ftp://ftp.usask.ca/pub/hytelnet/. The program is shareware, and the author requests a payment of $20 if you find it useful. There is also a very good free Microsoft Windows front-end to the DOS version; it is obtainable from http://www.connix.com/~clouette/hywin.htm.

    You can access Hytelnet using the World Wide Web (at http://www.usask.ca/cgi-bin/hytelnet) and Gopher (at gopher://liberty.uc.wlu.edu/11/internet/hytelnet). There is also (appropriately enough) a telnet version: the address is telnet://rsl.ox.ac.uk logging-in as hytelnet.

    The database that lies at the heart of the program is updated every few months (it is currently at version 6.9); information about upgrades and all other aspects of the program can be found at the URL http://www.lights.com/hytelnet/.

    Templates

    The hypertext markup language (HTML) that underlies all Web pages is inherently so simple that even the most basic text file can be viewed online: in a sense, it is almost impossible to produce something that does not work on the Web.

    As a result, in the early days of the Web nobody thought twice about knocking together pages using HTML tags. Naturally, though, as the market has matured and less-technically adept users have moved online, there has been a growing demand for Web creation tools that obviate this need to code. This has led to HTML editors that come with pre-packaged Web page templates: all the user needs to do is slot in text and images at the appropriate points.
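
    As a minimal sketch of the idea (the page skeleton and its slot names are invented), the following Python fragment fills a generic template with the details supplied by the user:

    from string import Template

    # A skeleton page with named slots; only the details change from shop to shop
    page = Template("""<html>
    <head><title>$shop</title></head>
    <body><h1>$shop</h1><p>$product - $price</p></body>
    </html>""")

    print(page.substitute(shop="Example Books", product="NetSpeak Annual", price="9.99"))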

    Paradoxically, this rather trivial approach has found its most successful application in the distinctly non-trivial world of electronic commerce. By far the commonest way of creating a business site is to choose from among the shop templates supplied with an electronic commerce package, and then to change the details as necessary.

    In fact this is eminently sensible. Electronic commerce sites are almost always built around back-end databases, and work by drawing on their data to create Web pages dynamically. Templates are not just convenient, they are almost inevitable, given this clear separation between form and content.

    Moreover, whether you are selling books, bottles or boots online, the steps that a customer goes through will typically be very similar. This underlying commonality also makes generic templates, suitably tweaked for particular situations, the obvious route to take when creating electronic commerce software.

    Three-Letter Acronym

    Alongside the FAQs and RFCs, there is another class of documents known by a TLA, or Three-Letter Acronym. These are the FYIs, which stand for For Your Information. In fact FYIs form a sub-class of RFCs rather than a completely separate group. And whereas there are now getting on for 2,000 RFCs (with what seems to be an accelerating rate of production), there are only just over 20 FYIs.

    The FYIs tend to be more general and aimed at a less technical readership than the RFCs that embody the essence of the Internet expressed in its purest, most abstract form. As FYI1 puts it: "The FYIs are intended for a wide audience. Some FYIs will cater to beginners, while others will discuss more advanced topics." This is reflected in the titles: so, for example, alongside 'The Tao of IETF', 'There's Gold in them thar Networks' and 'Who's who in the Internet' are others entitled 'How to Use Anonymous FTP' and 'What is the Internet' plus reports with irresistible names like 'A Survey of Advanced Usages of X.500'.

    Because of their provenance, FYIs also possess an RFC number, but slightly confusingly, the FYI number does not change even if the RFC does. For example, FYI4 is currently the same as RFC1594, but was RFC1325 (which RFC1594 replaces).

    Obviously if you know which RFC number the FYI corresponds to, you can retrieve the RFC directly. Alternatively, you can generally find FYIs stored alongside RFCs in all the main Internet document stores.

    Time zones

    A site half the globe away tends to reply more slowly to a request for a file or a Web page than one around the corner. This comes down to a simple matter of wiring: local machines are generally separated by fewer intervening networks than those that are physically distant from each other, with fewer opportunities for bottlenecks.

    However, it can also be true that your connection is faster to a distant machine than to one down the road if the Internet address on the other continent happens to be located "near" to the particular link you use (in terms of the connections that lie between you).

    Much more important than the relative physical distance is the absolute time zone of the site you are trying to access. If you are requesting files or pages from a site during the middle of the night (local time), the response will be much faster than during the day there.

    This is particularly true when contacting the US. Early in the morning (in the UK) is late at night in the US, and download speeds tend to be close to their theoretical maximum (generally dictated by the capacity of the connection you have).

    But come early afternoon, as the East coast of the US begins to access Internet sites, the response drops noticeably. And by the time the West coast joins in, transatlantic links are often saturated, and it becomes almost impossible to access popular US sites.

    For this reason it is important to bear in mind the time zone of the site you are accessing, and, where possible, to choose an hour when most local users will be offline.

    TLS

    The tribulations of Netscape (c. 1998) are tending to obscure the very real contributions the company has made to the current flowering of the Internet in business.

    First among these, of course, was the development of the original Navigator browser: this introduced both new features and unheard-of levels of performance. Both helped to garner millions of new and enthusiastic users for the Internet world. These original achievements were later built on with the introduction of novelties such as support for frames, tables, JavaScript and Java.

    But perhaps just as important as these milestones in browser technology are those on the server side. The first Netscape servers - particularly the Netsite Commerce Server - helped to convince business that Web sites could be more than just online advertising hoardings. Central to this move from marketing to money-making was the introduction of secure transactions.

    Using Netscape's proprietary Secure Sockets Layer (SSL) it was possible to encrypt all the transmissions between Netscape clients and servers. This applied to all protocols, not just HTTP. The arrival of SSL made secure financial transactions possible, for example those using credit cards, and effectively created electronic commerce as we know it.

    Although Netscape has made SSL freely available, the Internet community decided to develop a truly open version called Transport Layer Security, based almost completely on SSL 3.0, but not interoperable with it. It is likely that Transport Layer Security will gradually supersede SSL, which will remain, none the less, one of Netscape's most important legacies to the Internet world.
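
    As a minimal sketch of the layering involved, Python's standard ssl module shows how the secure layer simply wraps an ordinary TCP connection, leaving the application protocol on top unchanged - the property that made SSL useful for more than just the Web. (The negotiated protocol version depends on the server; www.example.com is used purely as a public test host.)

    import socket, ssl

    context = ssl.create_default_context()
    with socket.create_connection(("www.example.com", 443), timeout=10) as raw:
        # The TLS handshake, certificate check and encryption all happen inside this wrapper
        with context.wrap_socket(raw, server_hostname="www.example.com") as tls:
            print(tls.version())      # e.g. 'TLSv1.3'
            tls.sendall(b"HEAD / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
            print(tls.recv(200).decode("latin-1"))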

    Transaction processing

    As transactions on the Internet become more complex, particularly those in the financial sphere, so the perspective is changing. Where before they could be considered as relatively simple events in themselves, now they need to be treated as collections of more complex elements, related in various and possibly complicated ways. In particular, there is an increasing concern about the overall success of online transactions.

    Why this is important can best be understood by an example. When money is transferred from one bank account to another during a financial transaction, there are two separate processes: the money being debited from one account and credited to another. It would not be acceptable for one to take place without the other to balance it. In other words, there is a requirement for the overall transaction to be successful, not just parts of it. If it is not, all its separate elements must be rolled back - cancelled - to give a total failure.
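
    As a minimal sketch of this all-or-nothing behaviour, using Python's built-in SQLite driver (the account names, amounts and simulated failure are invented): because the debit and the credit are wrapped in a single transaction, a failure part-way through leaves both balances untouched.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [("alice", 100), ("bob", 50)])
    conn.commit()

    try:
        with conn:   # the connection commits only if the whole block succeeds
            conn.execute("UPDATE accounts SET balance = balance - 70 WHERE name = 'alice'")
            raise RuntimeError("simulated failure on the credit side")
            conn.execute("UPDATE accounts SET balance = balance + 70 WHERE name = 'bob'")
    except RuntimeError:
        print("Transfer failed - the debit was rolled back")

    print(dict(conn.execute("SELECT name, balance FROM accounts")))   # {'alice': 100, 'bob': 50}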

    This all-or-nothing approach is now starting to appear on the Internet. Here the task is potentially even more complex than for classical transaction processing, where you will typically have close control of most of the constituent processes. On the Internet, though, these may be taking place around the world.

    This means that the transaction processing tools must ultimately be able to cope with a distributed, component-based architecture. Given that the component architectures such as Corba and DCom are themselves highly immature, this indicates the scale of the task facing software companies seeking to create robust transactional processing tools that can operate across the Internet.

    Trolleys and carts

    Selecting goods online is simple enough, but the next step - storing a possible list of goods to be purchased in electronic shopping trolleys, carts or baskets - is more tricky. Because a Web session has no memory, each transfer of information to and from a server occurs as if it were the first.

    Holding lists on the server is inefficient. In part, this is because connections may be dropped during the visit of a potential purchaser to a shopping Web site, leading to unclaimed temporary files left lying around on the server. Or the visitor may simply disconnect without warning, for other reasons. Even if the session is ended in an orderly fashion, there may be problems because of the way people use the Web.

    For example, after looking through a site and selecting a few items for possible purchase, users may well log off from the Internet but leave the browser open on the computer. The next day, they can reconnect to the site and expect to continue where they left off. To allow for such truncated or incomplete sessions, a supplier's server would need to store huge amounts of data, most of it never used again.

    Instead, the items in electronic shopping trolleys are generally stored on the buyer's machine in the form of browser cookies: short text entries containing information about visitors and what they did at a particular site. In this way, trolley contents can be carried over from session to session without wasting space on the supplier's server. Information is only saved there once a transaction is complete and the order is placed.
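
    As a minimal sketch of the mechanism (the product codes are invented), Python's standard http.cookies module shows how a trolley's contents can travel back and forth in a cookie rather than being stored on the server:

    from http.cookies import SimpleCookie

    # Server side: record the trolley contents in a cookie sent with the response
    cookie = SimpleCookie()
    cookie["basket"] = "item42:2+item7:1"          # hypothetical product ids and quantities
    cookie["basket"]["path"] = "/"
    cookie["basket"]["max-age"] = 7 * 24 * 3600    # keep the trolley for a week
    print(cookie.output())                         # the Set-Cookie header sent to the browser

    # On a later visit the browser returns the cookie, and the server reads the trolley back
    returned = SimpleCookie()
    returned.load("basket=item42:2+item7:1")
    print(returned["basket"].value)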

    Unicode

    Much is made of the Internet's global reach and how it brings together most nations of this planet. But in one respect it is extremely parochial. Not only is English the dominant language of Internet communication, but the very limited character set known as ASCII is used.

    Of course, this means that marginally exotic elements such as letters with accents are impossible with most types of Internet communication. Instead, forms such as "ae" for æ need to be used.

    In this respect, the World Wide Web is a little more civilised: it is possible, using special groups of letters, to display most of the main Western European characters within a Web document.

    However, this still discriminates against the rest of the world's alphabets and scripts.

    The problem lies with the 8 bit nature of the representation of characters (seven in the case of ASCII). Clearly, to encompass the Semitic and Indian scripts, to say nothing of the thousands of Chinese, Japanese and Korean ideograms, a larger space is needed.

    The most promising solution to this problem is called Unicode, which uses 2 bytes (16 bits) per character to offer a theoretical 65,536 possibilities. This is large enough to embrace all current scripts (including all the various oriental systems), and many ancient ones, as well as the most important collections of symbols.
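
    As a quick illustration in Python (where strings are handled as Unicode), each character below - Latin, Greek or Chinese - is a single code point that fits comfortably in 16 bits:

    for ch in "A", "æ", "Ω", "水":
        # Print the character, its Unicode code point and its two-byte UTF-16 encoding
        print(ch, "U+%04X" % ord(ch), ch.encode("utf-16-be").hex())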

    Support for Unicode is still limited, but one area where it has been fully implemented is in Java. This means that Java offers the novel possibility of writing parts of the code in the user's own alphabet.

    Interestingly enough NT's version of notepad uses Unicode as the default file format, not ASCII; something to watch when using some 16-bit applications.

    The background to the standard can be found at http://www.unicode.org/unicode/standard/standard.html. This offers a technical introduction at http://www.unicode.org/unicode/standard/principles.html, which explains just what Unicode does. Included are discussions on the difference between characters and glyphs and why Unicode deals only with the former.

    Other resources available include a table of country codes including the DNS system at http://www.unicode.org/unicode/onlinedat/countries.html.

    Uniform Resource Locators (URLs)

    The Uniform Resource Locator (URL) presents two pieces of information very compactly; it gives the address of an Internet site, and it tells you what kind of service the site offers. For example, for a World Wide Web server the URL would be:

    http://www.ibm.com/

    The first part (http) stands for HyperText Transport Protocol, and indicates that it is a WWW site. The next three characters (://) are used as markers before the Internet address www.ibm.com, and the final slash (/) is again used as a marker. When a particular hypertext document is specified the URL is of the following form:

    The additional part after the main Internet address (www.webnet.ie) gives the directory in which the file index.html can be found. The file extension .html indicates that it is a document written in the HyperText Mark-up Language (HTML).

    For FTP sites the URL takes the form:

    which describes a file in the directory /Web/Mosaic/Windows at the FTP server whose Internet address is ftp.ncsa.uiuc.edu.

    Similarly, to access a Gopher site the URL would be:

    gopher://gopher.isoc.org/

    which simply specifies that the site is a Gopher server with the Internet address gopher.isoc.org.
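
    As a minimal sketch, Python's standard library will pull these pieces apart for you: the scheme identifies the type of service, and the rest gives the address and path.

    from urllib.parse import urlparse

    for url in ("http://www.ibm.com/",
                "ftp://ftp.ncsa.uiuc.edu/Web/Mosaic/Windows/",
                "gopher://gopher.isoc.org/"):
        parts = urlparse(url)
        print(parts.scheme, parts.netloc, parts.path)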

    Unix

    Although Windows now dominates the desktop more or less completely, its place in the Internet world is far more circumscribed. Indeed, whatever strides it is making on the client side, its impact in the server area has so far been limited. (Windows NT, the most suitable candidate for this area, has still only a small percentage share.)

    For most "serious" Internet applications Unix reigns supreme. In part, this goes back to the Internet's second era when it became a largely academic environment. Unix was (and still is) widely used in universities, and so it was natural for it to spread to the Internet.

    However, there is a problem with Unix: it generally costs money; often quite a lot. This led to some tension between a natural desire by aficionados to use the Internet standard and the equally established principle that ideas and software are free on the Internet.

    The solution to this conundrum was provided in 1991 by a young Finn by the name of Linus Torvalds. After cutting his programming teeth on a Sinclair QL, Torvalds started putting together an operating system for a newly-acquired 386 PC. As time went on, this personal project turned into a full-blown free Unix clone running on a PC, and has been taken up by a like-minded community of enthusiasts around the world, linked by the Internet.

    Today the Linux project continues to develop and grow in ambition to the point where it is now a viable operating system for serious business use. (Some claim it is even more stable than commercial varieties of Unix.) It is available free from numerous Internet sites.

    Usenet newsgroups

    The Usenet newsgroups form a kind of electronic notice board that is passed around the Internet each day automatically. As it does so, Internet users everywhere post comments to previous messages and add others starting new threads of discussion. The notice board is divided up into separate areas of interest, called newsgroups, even though only a small minority are today used for passing on true news.

    There are now approaching 10,000 such newsgroups, and the total daily Usenet traffic is around 200Mbytes. Within this vast quantity of messages there is an enormous variety of subjects.

    There are highly technical groups such as comp.unix.programmer, comp.os.ms-windows.nt.misc and comp.lang.basic.visual.misc, as well as more general ones such as misc.entrepreneurs. There are cultural areas such as soc.culture.indian (soc.culture.* includes most countries e.g. british, china, hongkong, pakistan, singapore, taiwan as well as related groups intercultural and punjab) and many unclassifiable groups (often prefixed with 'alt') devoted to just about every aspect of human activity: alt.lefthanders, alt.babylon5.uk, alt.html, alt.drwho.creative and so on. Recreation is covered in rec.* (e.g. rec.audio.high-end, rec.photo.digital, rec.video, rec.video.production), and UK interest largely in uk.* (e.g. uk.education.misc, uk.jobs, uk.misc, uk.adverts.computers, uk.media.tv.sf.babylon5).

    To read Usenet newsgroups you will need an Internet supplier that has a 'feed'; that is, one that receives this enormous quantity of daily information from another Usenet site. It is not uncommon for such feeds to be censored, with some of the more way-out groups removed. You will also need a suitable newsreader, of which there are many. These let you select which newsgroups you receive, then whether to save or delete them, and also let you post your own comments. Netscape includes an excellent newsreader if you have a permanent connection to the Internet. Offline readers such as Free Agent and Netcetera allow you to download all the news headers and make your selections offline, or to download all updates to your selected newsgroups. As long as no one posts binaries or uuencoded binaries this is an effective way of browsing the newsgroups.

    Although many of the Usenet postings are pretty vacuous, especially in some of the more transient newsgroups, there are others of real value. Particularly useful for businesses are the areas devoted to highly specialised fields within computing; it is common to see the most abstruse technical problem resolved in a few hours when one of the subject's gurus logs on somewhere in the world.

    As so often on the Internet, no payment is required for this virtual consultancy.

    The best place for a newcomer to start is a newsgroup called news.announce.newusers. The average user cannot post messages here; instead, the same set of messages is posted periodically. Together they form a basic catechism for Usenet acolytes. Issues covered include whether you should post anything at all (would electronic mail be more appropriate?); which newsgroup you post to; whether you post to other groups as well; the length of your posting; and how much you quote of the message you are commenting on.

    Complementing this is an Internet classic called 'Emily Postnews Answers Your Questions on Netiquette', which satirises some of the worst mistakes you can make with Usenet. Other documents include 'How to Work with the Usenet Community', 'Hints On Writing Style', and 'A History of Usenet And Its Main Software'.

    A number of informative documents are found in news.lists. Among these are lists of all Usenet newsgroups, answers to frequently asked questions for most of them, and statistics on the traffic in various areas. If you have any further questions, you could ask in news.newusers.questions before embarking on your first 'real' posting.

    Minding your Internet manners in the Usenet newsgroups

    The section on E-mail looked at how writing effective e-mail messages required a certain sensitivity to the particular characteristics of this new medium. Whatever the potential pitfalls of E-mail, they are nothing compared to those you face in the Usenet newsgroups. With E-mail you are generally writing to one or two people about whom you know at least something; with Usenet postings, your words may be read by tens of thousands of Internet users about whom you generally know nothing. The problems and challenges discussed in E-mail are therefore magnified enormously.

    For example, the possibility that someone might misunderstand your words - particularly any attempts at humour or irony - is greatly increased for a Usenet posting. Although English is the unchallenged lingua franca of the Internet, an increasing number of Internet users speak it as their second or third language. This places an even greater premium on clarity and directness of expression.

    But even things that are expressed with perfect clarity may be problematic in Usenet newsgroups. So much of what we write is laden with subtle cultural references that are lost on anyone not belonging to our particular group. Things that are clear to you and your peers may be inscrutable mysteries to anyone outside the various circles you frequent (be they social, professional, local or national). Almost any cultural assumption you make in your comments will be false for at least some of your Usenet readers.

    Of course this is less of a problem when you are posting to a newsgroup such as comp.databases.oracle or comp.unix.sys5.r4. Here you can make certain assumptions about the common (technical) background of the participants. Effectively, each newsgroup comes with its own frame of reference that defines the common currency of discussions and the context in which they are conducted.

    There are two complementary ways of finding out what exactly that context is. The first is to read the Frequently Asked Questions or FAQ for that group. Almost all Usenet newsgroups have a FAQ, and you should be able to find it in the directory ftp://rtfm.mit.edu/pub/usenet-by-hierarchy/, from the WWW page at the URL http://www.cis.ohio-state.edu/hypertext/faq/usenet/ or the UK mirror at http://sunsite.doc.ic.ac.uk/usenet/usenet-by-hierarchy/ or ftp://sunsite.doc.ic.ac.uk/usenet/usenet-by-hierarchy/.

    As well as studying the FAQ, you should also spend some time lurking in the newsgroup. That is, reading the postings there for a few weeks in order to pick up the general tenor of the discussions. This will enable you to develop a feel for what is and is not done in this particular part of the Usenet alongside the few rules applying to all of it.

    As far as the latter are concerned, it is generally accepted that cardinal Usenet sins include the following: flaming (rabid and unbalanced attacks on others posting messages to a newsgroup); the quotation of entire messages only to add 'me too' at the end (a complete waste of the valuable Internet bandwidth - unfortunately the original postings may have 'disappeared' so including the previous comments becomes essential); the posting of a message to the entire newsgroup when your words would be better directed via E-mail to one person only; lazy cross-posting (where you send the same message to several Usenet newsgroups in the desperate hope that someone somewhere will care - or worse send the same message separately to several newsgroups so that they appear to be different); grossly inflated signatures at the end of your message (four lines is considered the absolute maximum); inappropriate use of the newsgroup for advertising purposes (though what is appropriate will vary from area to area); and posting pyramid letters or 'Get Rich Quick' schemes (both of which will result in many junk E-mail messages being sent to you as a fitting punishment).

    How to set up a newsgroup for corporate use

    Usenet's dubious reputation has discouraged IT managers in the past from exploiting its undoubtedly rich potential.

    Usenet newsgroups are on the periphery of business use of the Internet. In part this reflects the dominance of the World Wide Web. But it is also a consequence of the slightly dubious reputation Usenet has acquired. While it is true that many of the 15,000 or so newsgroups are worthless, the vast majority exist as useful forums for discussing highly-specialised topics of interest.

    In particular, there are many newsgroups in the computing field which are invaluable, free resources. Moreover, new blocking software enables managers to regulate what exactly users can access.

    If people think about newsgroups at all, it is as a huge, uncontrolled public noticeboard that circulates the Net, gathering postings and comments as it goes.

    This is a fair representation of the ordinary Usenet newsgroups, but it is by no means the whole story.

    For there is nothing in the underlying Network News Transfer Protocol (NNTP) that says such postings have to be in the main Usenet hierarchy. Indeed, there are important local variations to the Usenet structure, where national newsgroups are created and distributed to a small group of users. An extension of this idea is to create a Usenet newsgroup that pertains not just to a geographic area, but to an individual company. This can be easily done by running your own news server.

    Various kinds of product exist - freeware, shareware and, increasingly, commercial applications.

    Through a news server a firm can create its own personal newsgroups for its particular market, customers and products. It is then possible to allow people to use news clients - either standalone or built into Web browsers - to access these quite separately from the public Usenet groups.

    To do so, all that is needed is the Internet address of the NNTP server. The usual form of a news URL is news:alt.winsock, where alt.winsock is the name of the newsgroup. However, this is a shortened form of the full URL news://news.acme.co.uk/alt.winsock, assuming the NNTP server has the address news.acme.co.uk.

    By making the address of the company news server known either to all, or just a private group of users, you can create your own public or private discussion groups.

    Perhaps the best example of these has been created by Microsoft. If you enter the URL news://msnews.microsoft.com/ in a browser that supports this syntax you will obtain a full listing of the many hundreds of newsgroups that Microsoft has created to support its products (alternatively you can enter msnews.microsoft.com as the default NNTP server in your newsreader software).
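    For the curious, the conversation a newsreader holds with such a server is simple enough to conduct by hand. The Python sketch below speaks a few NNTP commands directly over a socket; the server name is the one mentioned above and may no longer answer, so treat it purely as an illustration - any NNTP server address could be substituted.

        import socket

        # A minimal NNTP exchange: connect to the news server on port 119,
        # read its greeting, ask for the list of groups it carries and show
        # the first few. A real newsreader would read the whole list, down
        # to the terminating "." line.
        HOST, PORT = "msnews.microsoft.com", 119

        with socket.create_connection((HOST, PORT), timeout=10) as sock:
            reader = sock.makefile("rb")
            print(reader.readline().decode("latin-1").strip())   # greeting, e.g. "200 ..."

            sock.sendall(b"LIST\r\n")                             # ask for the group list
            print(reader.readline().decode("latin-1").strip())    # "215 list follows"
            for _ in range(5):                                     # show just a handful
                print(reader.readline().decode("latin-1").strip())

            sock.sendall(b"QUIT\r\n")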

    As you will see, the structure is exactly the same as Usenet's; the only difference is that these groups are controlled by Microsoft and dedicated to its products.

    Another example is offered by Apple, which has set up newsgroups to support its new Cyberdog software. The address is identical to that of the Web server at http://cyberdog.apple.com/, though obviously the protocol used is different (and with it the port number).

    Lest this gives the impression that such useful marketing adjuncts are the exclusive province of multi-billion-dollar firms, take a look at a very small-scale newsgroup area set up by Byte magazine as part of its online activities.

    The same techniques could be adopted by any company with permanent Internet access or, perhaps even more easily, with an intranet. To save on hardware costs you can run the news server on the same machine as the Web server. Newsgroups on an intranet would allow all kinds of internal forums to be created, ranging from company-wide electronic noticeboards to tightly-focused discussion areas for small teams or particular projects.

    Inreference

    The free site at http://www.reference.com/ claims to offer the ability to find and participate in 150,000 Usenet newsgroups, mailing lists and Web conferences.

    The site is run by Inreference, and according to the information at http://www.reference.com/pr_nasa.htm, the company employs supercomputers and 2 terabytes of disc storage owned by Nasa for the service. It has also recruited various luminaries, including Brent Chapman, author of the popular Majordomo mailing list software.

    Reference.com (see http://www.reference.com/pn/help_1.0/newuser.html) lets you search for Usenet groups and mailing lists by name, as well as search through their archives (though not for all mailing lists). There is an advanced search option which allows you to refine the search by specifying the author of a posting, when it was posted, etc.

    If you register with this site (at http://www.reference.com/cgi-bin/pn/go?choice=register1) - there is no charge - you will be able to carry out two more operations. First, you can post to Usenet groups (useful perhaps if your service provider carries no newsfeed, has a restricted newsfeed or you are travelling and only have Web access), and secondly you can set up a personal profile. This consists of searches that are run automatically and regularly by Reference.com; the results are then E-mailed to you.

    The site is a useful addition to the search tools available on the Internet, particularly because Usenet and mailing lists have far fewer resources than, say, the Web. However, it is worth bearing in mind that there are some good alternatives.

    For mailing lists you may want to try the site at http://www.liszt.com/, while for Usenet searches the fast site at http://www.dejanews.com/ is excellent. The latter is also interesting because its 120Gbyte database holding 80 million news postings is stored not on a mainframe but on a PC running under Linux (see http://www.dejanews.com/dnabout.html#tech), and therefore acts as a showcase of just what this amazing free operating system can do.

    Usenet Death Penalty

    Just as E-mail has become an indispensable part of modern business life, so junk E-mail seems to be its regrettable counterpart. Junk E-mail is often called spam, but strictly speaking this refers to the flooding of Usenet newsgroups with annoying and irrelevant postings.

    For E-mail users, there are various ways of preventing junk mail, including the use of filters in E-mail clients. But for those who visit Usenet newsgroups, it is much harder to take this kind of defensive action. This has led Usenet militants to adopt a different tactic. Instead of trying to block the flood of spam in newsgroups directly, they have started to target the Internet service providers of those individuals and companies responsible for the deluge.

    Once spammers have been tracked down, the next step is to issue what is dramatically known as a Usenet Death Penalty (UDP) against their service provider if it refuses to take sufficiently strong measures against the culprits.

    Executing a UDP involves technically savvy users causing the cancellation of all Usenet postings from the spammer's service provider. Clearly this is a considerable inconvenience to legitimate users of newsgroups, and something that providers want to avoid. Threats of such Usenet Death Penalties have already caused several major providers to pursue spammers more vigorously, and for all its overtones of vigilante action, the penalty seems to be one of the most effective moves against this kind of Usenet abuse.

    User Datagram Protocol

    Internet phones often suffer from poor sound quality. This comes down to throughput: the more data you can push over the Internet, the greater the fidelity of the transmitted message. Aside from the obvious expedient of using a faster connection, the other way to improve quality is to increase the speed at which information is pushed through any given data pipe. Compression is one important way of doing this.

    Another popular method is to employ the User Datagram Protocol (UDP) to deliver the data packets. Most Internet applications use the TCP protocol; this is what is known as a reliable, connection-oriented service. That is, it sets up a connection with the recipient computer first, and then checks that everything arrives. If it doesn't, it will re-transmit.

    UDP, by contrast, is an unreliable connectionless service. It does not first set up a connection, but simply begins sending out the packets that have been consigned to it. Nor does it check whether all the packets arrived safely.
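    Seen from the programming side, the contrast with TCP is stark. The minimal Python sketch below sends a single datagram between two sockets on the same machine; the port number and payload are arbitrary.

        import socket

        # UDP needs no connection set-up: each datagram is simply handed to
        # the network, addressed individually, with no acknowledgement and no
        # retransmission if it goes missing.
        receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        receiver.bind(("127.0.0.1", 5005))          # port chosen arbitrarily

        sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sender.sendto(b"a few milliseconds of voice", ("127.0.0.1", 5005))

        data, addr = receiver.recvfrom(2048)        # whatever arrives, arrives
        print(data, "from", addr)

        # Contrast with TCP, where connect()/accept() establish a connection
        # first and lost segments are retransmitted automatically.
        sender.close()
        receiver.close()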

    The use of UDP is in part why Internet phones are not totally reliable, and why conversations can break up as packets get lost. UDP is employed because its looser approach means less overhead for the data transmission, and so a faster throughput overall.

    Issues of reliability are not crucial when voice data (rather than a program file, say) is being sent. But using UDP has one other important side-effect. As a fairly reckless protocol it will quite happily saturate all the bandwidth that is available to it. This selfishness is the main reason why some ISPs ban the use of Internet phones.

    Vector Graphics

    When the World Wide Web was invented in 1989, it was a text-only affair. The addition of graphics support in the Mosaic browser was without doubt a key factor in the rapid uptake of the Web in business from 1993 onwards.

    Since then, the two main graphics formats have been Gifs and JPEGs. These have managed to hold their ground against one of the World Wide Web Consortium's less successful proposed standards, Portable Network Graphics (see www.w3c.org/TR/REC-png-multi.html).

    Drawback
    But even the more advanced Portable Network Graphics, like Gifs and JPEGs, suffers from a basic drawback. As a bit-mapped graphics format, it leads to relatively large file sizes, with a consequent slow download time for all but the fastest connections.

    As more multimedia elements are incorporated in the design of Web pages, this bottleneck becomes increasingly problematic. One solution is to move away from bit-mapped, or raster, graphics to vector graphics. Instead of building up an image pixel by pixel, it is constructed in terms of lines and other elements that can be expressed as higher-level descriptions. This means that file sizes are generally far smaller, since often quite complex images can be described very simply in terms of their elements.
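    The size difference is easy to estimate. The rough Python sketch below compares an uncompressed raster version of a simple image with a vector-style description of the same thing; the figures are purely illustrative and ignore compression.

        # Compare the space needed to describe a single red circle as a
        # raster image and as a vector description. The gap illustrates why
        # vector formats download so much faster.
        width, height = 400, 400
        bytes_per_pixel = 3                          # 24-bit colour

        raster_bytes = width * height * bytes_per_pixel
        vector_description = 'circle cx="200" cy="200" r="150" fill="red"'

        print("raster:", raster_bytes, "bytes")       # 480,000 bytes uncompressed
        print("vector:", len(vector_description), "bytes")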

    Such files are also readily scalable, an invaluable feature when Web pages are turning up on many more platforms, including those offered through set-top boxes on TVs, and via wireless devices such as mobile phones. Vector graphics can be re-scaled to cope with the needs of such environments in a way that bit-mapped files cannot.

    In addition, vector graphics lend themselves to animations, since changing images can be described as transformations of the constituent vectors. A good white paper on the whole area of vector graphics and the Web is at http://www.macromedia.com/software/flash/open/whitepaper/. It comes from Macromedia, a company with a particular interest in this field.

    Its Flash technology - home page at http://www.macromedia.com/software/flash/home_nowrap.html - is probably the de facto standard for vector graphics and animation. For example, Netscape has announced that it will be providing built-in support in future browsers. Moreover, Macromedia has recently placed full details of the file format in the public domain (see, for example, www.macromedia.com/software/flash/open/). This was not entirely an act of philanthropy: there is now a widespread recognition of the need for a general, open format for vector graphics, and several other groups have put forward suggestions.

    Advantage
    Although Flash has the advantage of an installed base numbering millions, and is well-known among developers, it is not perfect. For example, it offers a binary format for downloading, rather than a transparent one that makes the vector components explicit. The latter approach, whereby vector graphics are sent as text files, is the one favoured by Macromedia's rivals in this area.

    These other systems all use XML as their basic format. This offers the ability to manipulate and exchange such graphics files easily. The Schematic Graphics Markup Language ( http://www.w3c.org/TR/1998/NOTE-WebSchematics/ ) is rather a specialised approach, while Precision Graphics Markup Language ( http://www.w3c.org/TR/1998/NOTE-PGML ), proposed by Adobe (with support from IBM, Sun and Netscape), and based on the Postscript and PDF formats, is more general. Its main rival is the Vector Markup Language proposed by Microsoft ( http://www.w3c.org/TR/NOTE-VML ).

    Slideshow
    Microsoft is working on another XML-based graphics format, which is code-named Chrome.

    One thing worth noting is Chrome's hardware requirements: a 350MHz Pentium with 64Mbytes Ram, as a minimum. This is the downside of the vector graphics approach, which requires far more horsepower than bitmaps do.

    Virtual Private Networks

    Intranets have been such a tremendous success in part because they offer clear and immediate benefits and are so controllable.

    Setting up an intranet within a single site is a relatively straightforward and low-cost project - in theory at least. All you need is a network capable of running TCP/IP and suitable software on the machines that will be hooked up. Security is no more of an issue than it is when running any kind of corporate network.

    Once more distant sites are involved, this becomes more complex. It is crucial that inter-site connections are secure.

    Until recently this has meant expensive leased lines. By adding extra sophistication to the Internet's wiring, the benefits of using the Internet - its low cost and ubiquity - can be obtained without compromising in other areas. The connection that is created using such means is called a virtual private network. It is virtual because it does not exist as a physical reality; in fact, its path through the Internet may change at any moment as different routings are employed.

    But for practical purposes, it acts like a fixed network. It is private because encryption techniques are used to protect the transmission's contents as it passes across the open Internet. This technology is also called tunnelling, since the encrypted path creates a kind of invisible, underground channel.
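    The principle behind tunnelling can be sketched in a few lines. The Python example below is purely conceptual - real virtual private networks rely on standard protocols such as IPSec rather than anything hand-rolled - and it assumes the third-party 'cryptography' package is installed.

        # Conceptual sketch only: data is encrypted with a key shared by the
        # two sites before it is handed to the open Internet, and decrypted
        # on arrival. Requires: pip install cryptography
        from cryptography.fernet import Fernet

        shared_key = Fernet.generate_key()       # agreed in advance by both sites
        tunnel = Fernet(shared_key)

        packet = b"internal memo: Q3 figures attached"
        protected = tunnel.encrypt(packet)       # what actually crosses the Internet
        print(protected)                         # unreadable to anyone in between

        print(tunnel.decrypt(protected))         # recovered at the far end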

    VRML

    Virtual Reality (VR) is the name given to the creation of artificial worlds that exist exclusively within a computer's memory and which allow the user to explore and interact with them in various ways and with varying degrees of verisimilitude. Although VR has now become something of a cliché, particularly in the popular imagination, its true impact is only now beginning to be felt as usable systems for general computers (rather than specialised ones for flight simulation or games) start to appear.

    This is particularly the case for the Internet. At the heart of all standard VR systems is something called VRML, or Virtual Reality Modelling Language.

    Just as HTML - HyperText Markup Language - provides a description of a Web page that is then created by the Web browser software resident on your computer, so VRML (named analogously) contains a text description of a three-dimensional world. As with HTML this approach is necessary in order to allow information to be stored in a compact form that can be transferred quickly over the Internet. If a three-dimensional scene were described using conventional graphical techniques it would require a huge file that would take many minutes to download.

    Using VRML it is possible to model - to create, that is - to any degree of detail three-dimensional worlds that can be navigated. It is not a marking up of elements that are abstract (as HTML) but the explicit definition of elements that are geometrical solids (albeit in a virtual world that exists only inside a computer). VRML works by describing the form of those elements (spheres, cubes, cones etc), their nature (notably in terms of their colour, how much light they reflect and other surface qualities) and their position. And working with VRML atoms - points in the virtual world - it is possible to create an arbitrarily complex shape by modelling it from the polygonal elements that result from joining these points together.
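    To give a flavour of what such a description looks like, the short Python sketch below writes a one-object VRML 2.0 world to disc: a single red sphere, defined by its geometry and surface colour rather than pixel by pixel. The file name is arbitrary.

        # Write a tiny VRML 2.0 world containing one red sphere. The text
        # form shows how a three-dimensional object is described by its
        # shape and surface properties, not by a bitmap.
        world = """#VRML V2.0 utf8
        Shape {
          appearance Appearance {
            material Material { diffuseColor 1 0 0 }   # red surface
          }
          geometry Sphere { radius 1.0 }
        }
        """

        with open("hello_world.wrl", "w") as f:
            f.write(world)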

    The impetus for all this rather intricate sculpting in virtual spaces comes from a recognition that the current model for navigating the Internet is inadequate. Although the World Wide Web offers a rich hypermedia surface, it remains just that: a skin whose two-dimensionality is a barrier to more natural three-dimensional movements. The hope is that by employing VRML worlds it will be possible to make Internet navigation - something that becomes more challenging by the day as more content and more links are added - easier and hence to enfranchise a larger percentage of computer users, including those who find the current models too abstract.

    As by-products of this more theoretical goal there are a number of immediate practical benefits. For example, it will be possible to create virtual offices that teleworkers can inhabit and use for virtual meetings; cybermalls will become places that you can wander through rather more realistically than at present; businesses working with physical objects - whether on a large scale, such as roads or buildings, or down to the tiniest sub-components - will be able to examine, manipulate and even demonstrate them before they are created; and various ideas involving action at a distance (such as telemedicine) clearly move closer to (virtual) reality.

    As ever, the best place to start when exploring this new world is the VRML FAQ: it can be found at http://www.oki.com/vrml/VRML_FAQ.html. There are also a few key sites that are well worth exploring. The San Diego Supercomputer Center has probably the main repository of all things VRML (http://www.sdsc.edu/vrml/); there is a VRML organisation at http://www.vrml.org/, and Wired, perhaps not surprisingly, played an important early role in the development of this area: its site at http://vrml.wired.com/ has many interesting documents. Wired also runs the main VRML mailing list, which is very lively. To join, send the message

    subscribe www-vrml YourEmailAddress

    to majordomo@wired.com.

    Another good source of information is the book VRML: Browsing and Building Cyberspace (£37.49; ISBN 1-56205-498-8). This is notable not just as one of the very first such titles, but because its author is Mark Pesce, co-founder of the whole area and still one of the key VRML personalities (his very idiosyncratic home page can be found at http://hyperreal.com/~mpesce/). The book offers a good introduction to the history of the subject and, even more importantly, a basic VRML primer that describes in considerable detail how VRML works, with practical examples. These can be tried out using some of the software found on the CD-ROM that comes with the book. These programs - the VRML browsers and associated tools - will be the subject of next week's feature.

    Internet tools for browsing through the real cyberspace

    Like more or less everything else on the Internet, VRML employs the client-server model. That is, VRML documents - which, exactly like HTML, are nothing but simple ASCII files - are held on a server; from here they are requested by the VRML client application, which receives them in due course across the Internet.

    VRML browsers are of two main types: standalone and helper applications. The standalone programs are able to handle the negotiation with servers holding VRML documents on their own, whereas helper applications require an ordinary World Wide Web browser to effect this. They then process the VRML file when it arrives in the standard way of a helper application. These helper applications are called up (from within Netscape or Mosaic, for example) when the MIME type x-world/x-vrml (see this week's Net Speak) is encountered.

    Among the standalone VRML browsers one in particular stands out: WebSpace from Silicon Graphics (SGI) at http://webspace.sgi.com/. Many of the more technical details of VRML (for example, in terms of generating the geometry from the ASCII code) are closely based on the Open Inventor programming library from Silicon Graphics. In fact SGI placed a subset of Open Inventor in the public domain so that it could become the basis of the first official VRML specification. More recently, SGI has announced Cosmo, an advanced 3-D development system (see http://www.sgi.com/Products/cosmo/index.html).

    Silicon Graphics machines remain one of the two main platforms for which VRML browsers exist; the other is Windows 95 (and Windows NT), which has rapidly established itself in this area. This is due in part, no doubt, to the availability of advanced 3-D rendering technology for Windows from the British company RenderMorphics, now owned by Microsoft (see http://www.gold.net:80/oneday/render/press/mrmsoft.html for more). The latter has just announced ActiveVRML, an attempt to pre-empt future developments in the area of 3-D animations (see http://www.microsoft.com/internet/vrmlpr.htm).

    As the list of VRML browsers at http://www.sdsc.edu/SDSC/Partners/vrml/software/browsers.html shows, in addition to the two platforms mentioned above, there are also programs for Sun, HP, Digital, IBM, Linux and Macintosh.

    The use of VRML helper programs is perhaps the commoner approach, not least because people are reluctant to forgo their favourite Web browsers. Alongside conventional helper programs such as WorldView (produced by the company Intervista set up by one of the co-inventors of VRML, Tony Parisi - see http://www.intervista.com/), which you configure in the usual way, associating it with the MIME type x-world/x-vrml, another interesting approach is the inline plug-in. This is one of the new features of Netscape version 2.0, allowing external viewers to be integrated into the Web browser so that their results are displayed within the main Netscape window, rather than separately. An example of this in the context of VRML is WebFX, available from http://www.paperinc.com/. Microsoft too has adopted this approach with the Virtual Explorer add-in for its Internet Explorer browser (see http://www.microsoft.com/windows/ie/vrml.htm).

    One striking aspect of these VRML browsers is the variety of approaches taken in creating the navigational interface. The problem that has to be addressed is how three-dimensional movements can be controlled on a two-dimensional computer screen. The technique employed is generally to use left and right mouse buttons (and sometimes both simultaneously). No method seems particularly intuitive at the moment, which means that for every VRML browser you need to learn a completely new way of working. Another problem is that many of these programs are still in beta testing and can be unstable.

    Alongside VRML browsers there are a number of other programs worth investigating. For example Fountain (at http://www.caligari.com/) is an extremely powerful VRML authoring tool that lets you create VRML worlds. Equally impressive is VRServer (from http://www.tenet.net/html/products/vrserver.html): this takes ordinary data held on a Web server and converts it into a three-dimensional VRML world that can be navigated - surely a portent of things to come in this area.

    VRML's fall from Grace

    The Virtual Reality Modelling Language, once so full of promise, now seems to have been sidelined by the major players (September 1998). What went wrong?

    Whatever happened to the Virtual Reality Modelling Language (VRML)? This was supposed to be a step beyond common-or-garden HTML, and to present users with a rich pseudo-three-dimensional world of data that could be explored 'immersively' - as if you were really there in the thick of it.

    After the initial excitement, and some important support from leading players, things seem to have fizzled out. For example, both Netscape's and Microsoft's latest Web browsers offer built-in VRML support, but it's likely that few users are aware of this.

    Even on the respective companies' Web sites it is hard to find anything about this VRML capability - almost as if they were ashamed of their earlier enthusiasm.

    Out of fashion
    The truth is that VRML has passed out of fashion, and now it is not even worth mentioning. In fact it is easy to see what the next victim of such browser faddishness will be: the once-hot push technologies are already moving out of the marketing limelight in favour of the latest bandwagons, Dynamic HTML, XML and portal integration.

    A visit to the heart of the VRML empire, the VRML Consortium, shows how deep the malaise is. Alongside a VRML FAQ and information about the official ISO VRML 97 standard, which replaced the similar VRML 2 specification in December 1997, there is also a telling press release.

    This explains how the VRML consortium is expanding "to embrace other areas" to provide "an open forum among members to develop optimal 3D graphics solutions for the Internet and Intranet." There is even a half-hearted move to adopt the open-source approach to try to enfranchise a little of the excitement that is driving that area.

    It might be argued that consortia are hardly the best way for pushing forward such standards anyway, and that support from software companies is what counts. But the news here is just as bad. The main proponent of VRML - Silicon Graphics - is in turmoil, and there seems to be no place for the Cosmo VRML tools or unit in its new boss's "vision" (at www.sgi.com/vision/). At the time of writing, even the Cosmo URLs had mysteriously disappeared.

    One area where VRML seemed to be thriving was in the field of immersive worlds, avatars and 3D communities - see the Java-based example.

    But even here many of the main players are retrenching: Electric Communities is joining with The Palace and Onlive in an attempt to create new revenues from selling enterprise solutions.

    Against a background where even the smallest of Internet start-ups manages to achieve outrageous stock valuations of hundreds of millions or even billions of dollars, the struggles of these comparatively well-established VRML companies seem cruelly ironic.

    In retrospect, perhaps VRML arrived too early. Arguments about standards, which were resolved only slowly, meant industry progress was harder than it needed to be. And now that a standard is in place XML has arrived, offering a far better way of handling structured information of the kind represented by VRML.

    Symbolic
    In this context, the appearance of Microsoft's imminent Chromeffects looks heavily symbolic.

    Although details are still sketchy, this XML-based technology works closely with the Windows operating system to provide advanced 3D special effects, without using VRML at all.

    Some idea of its capabilities can be gleaned from two firms that are already offering tools that will work with Chromeffects: Zapa Digital Arts and SquishyFX.

    It is not hard to imagine a similar approach applied to Web advertising banners, for example.

    This is perhaps something of a come-down after those high aspirations of creating virtual worlds, but one whose very practical focus may well see Chromeffects succeed, unlike the increasingly marginalised VRML.

    Virus

    Computer viruses and their various software relatives have long played a complex part in the Internet's history and mythology. For example, nine years ago there was the Internet worm, and in more recent times hoaxes such as the "Good Times" virus, which was spread supposedly just by reading an infected E-mail message, have wasted the Internet's resources and people's time.

    Today, too, there are widespread concerns - and misapprehensions - about what viruses and similarly destructive programs might do to corporate networks and computers that are connected to the Internet.

    The concern is justified in the sense that the Internet does potentially offer a perfect medium for the propagation of malevolent programs. After all, most users spend time downloading something; be it Web pages, applets, ActiveX controls, files from FTP sites, Usenet postings or E-mail. IT managers therefore have a responsibility to ensure that these commonplace operations do not jeopardise the critical computing infrastructure of a company.

    But the truth of the matter is that the Internet is actually far better at propagating rumours and fears than the threats themselves. You are more likely to catch some garbled half-truth or intentional misinformation about viruses - what are fashionably known as memes - than a virus. The most likely source of viruses is probably your software supplier.

    The reason that the Internet is less of a threat than it might seem comes down to technology. Many viruses are passed around in the boot sector of infected floppy discs, and clearly pose no direct threat to Internet users.

    Other kinds of virus can infect programs rather than media, so these could be downloaded over the Internet. However, those running FTP sites are acutely aware of this danger and check their files for such infections. Moreover, if programs escape detection initially the information can be passed back to the site in question, and the offending files removed very quickly.

    Perhaps the most contagious kinds of virus are those that attach themselves to word processing documents. These use the powerful macro languages found within programs such as Microsoft Word and Excel to create destructive macro viruses.

    Such forms are particularly dangerous because the exchange of documents is an essential business activity, and because documents with macros can be sent across the Internet as E-mail attachments. Partly no doubt because there have been so many hoaxes regarding E-mail viruses, users tend to be lax when it comes to firing off messages with attachments, and even more reckless about viewing them.

    The other development in this area is the increasing availability of Java applets and ActiveX controls on the Internet. The former come with a security model specifically designed to prevent malicious applications. However, some implementations of Java have been found to have security holes.

    The case against ActiveX is more clear-cut. Considerably more powerful than applets, ActiveX controls are therefore potentially much more dangerous. Microsoft's security approach hinges on the use of digital certificates to ensure that controls are not changed in transit and that their authors can be identified. This is all very well, but places the onus on the user to make decisions about who is trustworthy. Supplementary software protection is therefore highly advisable.

    Moreover, even if the role of the Internet in passing on infection is less than scaremongers might have us believe, the broader risks for companies are real. Rogue floppies brought in by employees can easily spark off epidemics across intranets and extranets. Macro viruses are not just a threat but are fast becoming some of the most widespread infections encountered. Suitable anti-virus precautions are therefore a must for every company.

    Build up your immunity to viruses over the Net

    The Internet might be responsible for many panic stories about viruses, but it's also a great way to get hold of cures for them.

    Two things can help in the battle against viruses: information and anti-virus software, and many of the leading suppliers in this field make both available on the Internet to varying degrees.

    As far as products are concerned, there are two broad classes: those that aim to detect and remove viruses on the user's machine, and those that try to block their arrival over networks.
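    The second, gateway-style class can be caricatured in a few lines of Python. The sketch below merely flags message attachments whose file names suggest Word or Excel documents - real products inspect the contents, not just the name - and the extension list is far from complete.

        from email import message_from_string

        SUSPECT_EXTENSIONS = (".doc", ".dot", ".xls")    # macro-capable document types

        def suspect_attachments(raw_message):
            # Walk through the parts of a message and note any attachment
            # whose file name matches one of the suspect types.
            msg = message_from_string(raw_message)
            flagged = []
            for part in msg.walk():
                name = part.get_filename() or ""
                if name.lower().endswith(SUSPECT_EXTENSIONS):
                    flagged.append(name)
            return flagged

    A real gateway would of course pass anything flagged in this way to a proper scanner rather than trusting the file name alone.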

    For example, the UK company Dr Solomon's offers its anti-virus toolkits for various platforms, and Mailguard, which spots viruses sent as attachments to E-mail. There is also a management edition that allows you to install and control anti-virus software on client machines from a central Windows NT server.

    Dr Solomon's is also notable for its product manuals. Some of this information is available online as part of the public virus resources, notably the virus encyclopaedia.

    IBM also offers an enterprise edition for managing its anti-virus products centrally. IBM's virus resources are among the most attractively-presented, and there are some papers aimed at corporate users.

    McAfee is one of the best-known names in the anti-virus world, and offers a wide range of products and general virus information. It is also interesting for its use of Backweb's push technology to provide virus updates.

    The anti-virus products from Symantec also make use of an automatic online update feature. Its range includes basic desktop software as well as an E-mail scanner that uses an attractive Web interface for management. There are additional anti-virus resources.

    Eliashim provides desktop products, including a free anti-virus plug-in for Netscape Navigator and Microsoft Internet Explorer, and Virus Safe, designed to work with firewalls. Another server product is Mimesweeper which handles file downloads and E-mail by working with other anti-virus products.

    Other anti-virus suppliers in this sector include Cheyenne, part of Computer Associates, Command, Touchstone and Trend.

    Some of these products can handle the scanning of Java applets and ActiveX controls too, while one company, Finjan, specialises in this area.

    Most of these programs are available as trial downloads, and this is obviously a sensible first step in evaluating which product to buy. The suppliers' sites discussed here naturally offer reasons why their software is better than that of their rivals, but obtaining objective information is hard.

    One place to start is the National Computer Security Association virus page. The association offers a virus software certification programme. There is a list of products it has certified, and other virus sites are also listed. Since this security organisation is attempting to set itself up as an impartial reference point for the virus world, its site is rather staid.

    If you prefer something with more bite, try the Computer Virus Myths site. As well as debunking the many virus hoaxes that float round the Internet, it contains an excellent, if barbed, view of the current anti-virus software suppliers.

    Voice buttons

    One of the least-expected uses for the Internet has been voice telephony. It is also a rather ironic one, in that it is likely to transform the telecoms world by turning voice into an adjunct of data, rather than the other way round, as it has been until now.

    But most of the exponents of Internet telephony have tended to regard the provision of voice transmission as merely a way of cutting costs. In doing so, many miss the practical implications of this fusion of what were hitherto distinct networks.

    For if there is essentially no difference between voice and data, not only can voice be sent over the Internet's pipe - which ultimately will abolish distance for voice traffic just as for data - but telephone calls can be regarded as just another streaming element in a Web page (assuming the right hardware is available at each end).

    A logical move is therefore to offer a link providing telephone connectivity from within a Web page. Then, when users visit a Web site, they can talk instantly to a salesperson, or a helpdesk, say, just by clicking the appropriate option on-screen. These voice buttons, as they are sometimes called, will allow both IP telephony (the simplest solution), and connection to ordinary phones (through a gateway at some point).

    All kinds of pricing variations will be available, from costing nothing, to calls charged to customers or to the company with the Web site, depending on what kind of service is being offered. Whatever the pricing model adopted, the business effects of this tight integration of voice and data could prove to be just as important as the more obvious economic effects.

    We-commerce

    In the context of e-commerce, the Internet has gained a certain notoriety for a tendency known by the unlovely word "disintermediation" - whereby customers can go direct to manufacturers instead of through intermediate distributors. Since distribution represents a huge industry, the idea that the Internet might somehow make it disappear is obviously a worrying prospect for those involved.

    But such disintermediation is not the only way of conducting e-business, and one of the interesting trends in the sector today is the exploration of alternative methods of buying and selling. The auction is perhaps the most obvious example of where the Internet has led to the flowering of a non-traditional way of buying, but it is by no means the only one, nor indeed the most exotic.

    For example, the e-commerce outfit Accompany sets out with the fascinating idea of harnessing the Internet's reach to bring together groups of purchasers for a given item. By creating ad hoc buyers' co-operatives in this way, it can obtain better prices from manufacturers using their collective purchasing power.

    To prevent potential purchasers waiting indefinitely for the best price, each product is offered for a limited time. At the end of that time, those who have committed will receive the benefit of the total discount obtained according to the final size of the group. Understandably, it is not possible to withdraw a bid once it has been made. See the FAQ.

    As the list of products shows, the goods on offer are mainly computer and electronics, but obviously the idea is easily extensible to other fields.

    However, whether Accompany will be able to do this seems to be doubtful.

    As the home page's scrolling window of offers that are about to expire shows only too painfully, buyers are very thin on the ground. Which means that savings are low, which, in turn, hardly encourages more buyers to join - a classic Catch 22 situation.

    Publicity

    Although Accompany has an interesting idea for using its customers to spread the word, this may well not be enough. What is needed is a very extensive publicity campaign to achieve a critical mass of buyers. This will bring the prices down, and make the system more attractive to yet more users.

    Accompany's main rival in this area, a start-up called Mercata, may fare rather better in this regard. In part, this is because its site is already more ambitious and richer than Accompany's. For example, its categories of goods range from consumer electronics, hand and power tools to home and kitchen, lawn and garden, and gifts. The company says it "already has key agreements with many suppliers in categories that are not listed today".

    It includes buyer's guides - pseudo-editorial areas describing the general features of products in a category. There is a 30-day money back guarantee, a handy facility for comparing the characteristics of similar products, and even a kind of reward points system called Mercata$. These suggest a rather more thought-through retailing strategy than Accompany's dependence on a single, albeit novel idea.

    But Mercata's approach - which it rather cleverly terms "we-commerce" - has one other major advantage that could well prove decisive in making it work. For Mercata is part of the increasingly rich stable of Internet companies owned wholly or in part by Paul Allen, the other founder of Microsoft. Details about some of these can be found as part of the quaintly-named "Paul Allen's Wired World".

    His immense wealth means that he can afford to pump considerable sums of money into his new ventures. In particular, it will allow the company to engage in a fairly massive marketing campaign in order to attract large numbers of users.

    Web Based Services

    One of the most surprising developments in the rise of the Web as a business medium has been its use as an application interface. Instead of learning many different software approaches, users can employ a Web browser as the universal front-end to practically any program.

    Parallel to this, the Internet's client/server architecture has allowed applications to be used in a completely distributed way, with the data and the main software stored on a server, and a lighter client - typically the Web browser - on the user's PC. Moreover, the "world-wide" element of the World-Wide Web means that both server and client can be anywhere there is an Internet connection.

    This complete location-independence has led on the one hand to intranet administration tools, such as that employed by Netscape with all its servers, which can be used from any desktop. On the other hand, on the open Internet, it has driven the rise of Web-based services.

    E-mail services

    The first of these were the Web-based e-mail services. These possess several advantages over and above the fact that they are offered free on most portals. Now, the next wave of such services is breaking: Web-based calendars that let you access calendaring, contact lists and related information from anywhere on the Internet.

    Typical among the many new Web-based calendaring services is Jump.com (press release here). It is particularly rich - alongside shared calendars, it offers e-mail, task lists, reminders, message boards, access to weather and other information - and well-designed.

    A notable feature is the ability to check up to three external mail accounts from the service - a clear indication that Jump.com wishes to be a one-stop solution.

    Jump.com is one of the best of this new breed of Web service sites, but it is by no means the only one. Others include AnyDay.com (press release), Magicaldesk (press release), Appoint.net and Dataferret. The last of these indicates how basic some of these new sites are.

    Of course, the problem is how to stand out amidst this crowd. Scheduleonline.com (press release) is notable for using a secure connection - something strangely absent from most of the others' offerings.

    Visto also has secure login and secure session options, and like the others is free, but is unusual in that it lets you store key files encrypted on its server and offers a paid-for premium service. It is also of note for being used by McAfee at its user site.

    When.com offers extras in the form of information about external events - 500,000 of them - for various categories, rather like filtered news information in portal channels. When.com has also been taken up elsewhere, in this case by the Netcenter portal (see press release).

    Integrated

    Yahoo has gone further, and purchased its own Web calendaring outfit, now fully integrated into its Web services. If you have a Yahoo.com e-mail account, you also have access to the free calendaring service.

    Perhaps the most interesting move in the Web calendaring sphere is the acquisition of PlanetAll by no less a player than Amazon.com. This is a clear indication not only of the arrival of Web calendaring as the next key ancillary Web service - which means an imminent spate of acquisitions in this area - but also of Amazon.com's growing portal pretensions.

    Web cams

    Much of the power of the Internet comes from the fact that it transports information as raw bits: there are no assumptions about the structure of the data that is being sent beyond that imposed by the TCP/IP protocols. This means that as well as E-mail text and basic Web pages, multimedia files can also be transmitted.

    One interesting possibility is therefore to send images over the Internet by using a video camera as a data source. As a quick calculation will show, the bandwidth required for real-time, full-colour images is immense, and even using compression techniques fails to bring the quantity into anything manageable for dial-up connections, say.

    The solution is quite brutal - to limit the number of frames sent, often to one every few seconds - but surprisingly successful. For it turns out that the kind of views provided by video sources placed on the Internet in this way - usually known as Web cameras, or Web cams - rarely require anything approaching real-time updates. Instead, they are used to provide a view of a city, a road intersection for example, or of an office. The general idea is to show every now and then what is happening rather than to provide a detailed, second-by-second record of events.

    One of the most famous Web cams is also one of the earliest, even predating the Web itself through its first, purely internal implementation. The Trojan Room Coffee Pot shows the state of a coffee pot in one of the rooms at the Computer Lab at Cambridge University. To update the image displayed on one of their Web pages, you click on a link that invokes a small CGI script to send the latest image back.
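    The script behind such a page need do very little. Below is a minimal Python CGI sketch along the same lines; the image path is invented for the example, and a real installation would point it at wherever the camera writes its latest frame.

        #!/usr/bin/env python3
        # Minimal CGI script in the spirit of a Web cam page: each request
        # simply returns whatever image the camera last wrote to disc.
        import sys

        IMAGE_PATH = "/var/webcam/latest.jpg"   # hypothetical location of the latest frame

        with open(IMAGE_PATH, "rb") as f:
            image = f.read()

        sys.stdout.write("Content-Type: image/jpeg\r\n")
        sys.stdout.write("Content-Length: %d\r\n\r\n" % len(image))
        sys.stdout.flush()
        sys.stdout.buffer.write(image)          # send the raw image bytes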

    A collection of Web cams can be found at http://www.phoenix.net/~bdaily/mbcams.htm. The breakdown is by country and US State.

    Webcasting

    Microsoft's Internet Explorer 4 had an extremely slow and cautious beta testing programme with good cause: the integration of a Web browser into the very fabric of Windows is fraught with danger, and clearly Microsoft was determined to iron out as many bugs as possible before it released version 4 officially (which was soon followed by several bug fixes!).

    The most significant change since it was first announced was the addition of Webcasting, which offers not one but three kinds of push. As Microsoft's excellent Web pages explain in great detail, the first two of these use not true push but automated pull. That means the browser pulls down data from a server at predetermined intervals. Because the user is not involved in the process (other than subscribing initially), this gives the impression that information is being pushed to the client.

    This pseudo-push approach is the one generally employed even by Pointcast, the push pioneer. It is why push technologies are such terrible bandwidth hogs: if there are many such push clients active, then at these predetermined times the same data is being pulled repeatedly from distant servers to the corporate desktops, filling up much of a company's connection to the Internet in the process.
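    Automated pull is easy to picture in code. The Python sketch below fetches the same page at fixed intervals and notes when its contents have changed; the URL and the interval are arbitrary examples rather than anything a real channel uses.

        import hashlib
        import time
        import urllib.request

        # Automated pull, the mechanism behind most "push" channels: fetch
        # the same page at predetermined intervals and compare it with the
        # previous copy.
        URL = "http://www.example.com/"
        INTERVAL = 60 * 60                      # check once an hour

        last_digest = None
        while True:
            with urllib.request.urlopen(URL) as response:
                digest = hashlib.md5(response.read()).hexdigest()
            if digest != last_digest:
                print("page has changed - pull down the new content")
                last_digest = digest
            time.sleep(INTERVAL)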

    Microsoft's first push offering is little more than an offline Web browser. Any Web site can be turned into a channel using the Webcaster wizard to define which pages are pulled down and when. You can choose whether to pull down all the pages that have changed or just to be notified of the new content.

    The second option is rather richer. Using Web pages as its basic data type once more, Microsoft has come up with a more intelligent offline Web-browsing technique. Alongside the HTML pages held on a Web server, a new kind of file, holding the Channel Definition Format, is created. This contains details of which pages on the site change and when.

    Using this format, Explorer 4 can pull down just those pages that represent new information. There is also an option to retrieve only the channel file to allow the user to choose directly from among the altered pages. These are simple text files, like HTML, and are some of the first examples of Extensible Markup Language (XML).

    This has been widely touted as the natural successor to HTML since it allows arbitrary extensions (in the form of new tags) to be added to that language without the need for constant changes to browsers. These must simply be XML-aware, and so able to deal with the new tags as they encounter them.

    Microsoft's move is shrewd in a number of ways. First, it creates a platform-independent method of specifying the pages: any browser supporting XML will be able to read the channel file. Secondly, creating these files requires little more than a text editor.

    And thirdly, Microsoft is once again trying to co-opt a standards body (in this case the World Wide Web Consortium) to bolster its approach. This seems to be working, with many companies (including Pointcast) agreeing to support the format in their push-related products.

    The final option offered by Explorer 4 remains more a hope than reality, but is important because eventually it will allow true push. Through system hooks it will be possible to use advanced network technologies such as multicasting to provide efficient one-to-many delivery, rather than the extremely wasteful one-to-one currently employed.

    I found Webcaster easy to use, though the channels on offer are still thin on the ground. Those that do exist are flashy, with plenty of fairly pointless multimedia effects wasting bandwidth and screen space. Most are free, but some require paid subscriptions.

    One of the test channels available with the release I looked at - Pointcast's - requires ActiveX controls to be downloaded first. As readers of this page will know, ActiveX controls are extremely powerful - and extremely risky. As such, they may well play a key role in deciding which push technology wins out within companies.

    Web conferencing

    The essence of the Internet is communications, and in some sense all of its services can be regarded as variations on this theme. Broadly speaking, they can be broken up into three groups: one-to-one, one-to-many and many-to-many. The one-to-one uses of the Internet include e-mail, Net phones and video phones. The main one-to-many use is of course the World Wide Web, but other options include real-time sound or video broadcasts and Gophers.

    The most basic implementation of many-to-many interactions is through Usenet. The reach of Usenet is immense, but the interaction does not occur in real-time. Moreover, many people do not participate in the Usenet world - perhaps because NNTP feeds are not always offered by their ISP, or because the newsreading functions of programs like Navigator and Internet Explorer have lagged somewhat, or maybe even because of Usenet's rather dubious if largely unjustified reputation.

    Internet Relay Chat (IRC) offers a near real-time interaction between small groups of people, but otherwise suffers even more from Usenet's problems: the software available for taking part in IRC discussions is less well-developed than newsreaders, and IRC's reputation is even worse than Usenet's.

    A third possibility is offered by Web conferencing. In many ways this is like Usenet in that people view messages and add their comments, but it possesses the great advantage that it is done over the Web, and does not require any extra software or skills. For this reason, Web conferencing is a fast-growing area, and is often used to try to create and sustain a community of visitors around a particular Web site.

    WebDAV

    Even though the Internet is fundamentally about communication, many of its key activities remain remarkably lonely occupations. For example, the Web consists of millions of pages that are created by individuals on their own, even if they sit in teams. These sites may be modified by others, but each time the work is done essentially in isolation. Ideally, Web page writing should allow for a more collaborative genesis, in which individuals located anywhere could add to and refine sites simultaneously, enabling a more interactive and creative process of Web site construction.

    Of course, such an approach has dangers, notably of simultaneous changes being made to a page and the consequent loss of work by one or more users. For such a system to work, extra structures need to be in place over and above the simple Net protocols that are currently in use. One proposal to address both the collaborative urge and the problem it poses is called Distributed Authoring and Versioning on the Web - WebDAV.

    It comes from an Internet Engineering Task Force working group, and is an extension of the current hypertext transfer protocols that are used to communicate between Web servers and clients.

    It uses XML as the way of encoding the information that is passed between a user wishing to change pages and the Web server holding those pages. XML is employed both to structure the exchange and to define various properties of the file in question (for example, authorship, copyright, last update). By using further XML encoded information about whether a particular page is locked, different parts of a Web site can be updated by several people simultaneously.
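
    As a rough illustration, the sketch below uses Python's standard http.client module to issue one of the new WebDAV methods, PROPFIND, asking a server for a page's creation and last-update properties; the server name is hypothetical, and the example simply assumes it supports the extension.

        # A sketch of the WebDAV PROPFIND method. webdav.example.com is a
        # hypothetical server that is assumed to support the extension.
        import http.client

        propfind_body = """<?xml version="1.0" encoding="utf-8"?>
        <D:propfind xmlns:D="DAV:">
          <D:prop>
            <D:creationdate/>
            <D:getlastmodified/>
          </D:prop>
        </D:propfind>"""

        conn = http.client.HTTPConnection("webdav.example.com")
        conn.request("PROPFIND", "/site/index.html", body=propfind_body,
                     headers={"Content-Type": "text/xml", "Depth": "0"})
        response = conn.getresponse()
        print(response.status, response.reason)   # 207 Multi-Status on success
        print(response.read().decode())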

    Web farms

    There is a general assumption that a Web server is a single machine holding HTML pages that are then served up to browsers as they access the site. This model was fine when Web holdings consisted of perhaps a few hundred or even thousand such pages, but today corporate presence may include hundreds of thousands of pages, and there is no reason why multi-million page Web sites should not arrive in due course.

    Clearly, trying to load such massive holdings on single servers, particularly if multimedia objects are involved, is impractical. Fortunately the very distributed nature of the Web means that it is easy to place portions of a site on many different servers, which are then seamlessly linked together.

    These collections of Web servers are often called Web farms, and are becoming increasingly common in larger companies, whether for Internet, intranet or extranet use. As well as allowing the load to be spread across many machines, there are a number of other advantages. For example, this partitioning of information makes local control much easier: individual departments can be responsible for the content found on their respective system without needing to worry about the overall design.

    Similarly, it is much easier to add redundancy using the Web farm approach. Extra back-up systems can be added at multiple points and incrementally as the size and popularity of the site grows. Of course, all of this presumes that connectivity both within and outside the site is good: it must be possible to move data quickly from one component of the Web farm to another, and then to serve it up to the outside world.
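
    The load-spreading idea itself can be sketched very simply: in the toy Python example below, a round-robin chooser hands each incoming request to the next server in the pool. Real farms rely on DNS tricks or dedicated load balancers, but the principle is the same; the server names are placeholders.

        # Toy round-robin distribution of requests across a Web farm.
        # Server names are placeholders, not a real deployment.
        import itertools

        servers = ["www1.example.com", "www2.example.com", "www3.example.com"]
        next_server = itertools.cycle(servers)

        for request in ["/index.html", "/products.html", "/support.html", "/jobs.html"]:
            print(request, "->", next(next_server))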

    WebNFS

    Although the file transfer protocol (FTP) is not as visible as the hypertext transfer protocol used for Web pages, it remains a vital part of the Internet. Whenever files need to be distributed, it is almost always FTP that does the job and is often invoked from a Web page.

    However, FTP has a number of limitations. For example, the connection between the client and server is extremely limited, especially in the context of an increasingly distributed approach to computing, where the ability to act on objects held distantly is important.

    Various solutions have been proposed to get round this. One is Microsoft's Common Internet File System (CIFS) approach, another is WebNFS from Sun. Sun's well-established network file system (NFS) allows file systems to be "mounted" - that is, added - by a distant user so that he or she has transparent access to all the files there as if they were available locally.

    The WebNFS protocol takes this approach and extends it for the Internet. For example, originally NFS used UDP as the transport protocol, but WebNFS supports transmission control protocol, which is more reliable and often necessary for corporate sites with firewalls.

    The benefits of WebNFS include the ability to resume broken connections without the need to download the whole file again. Along with these features comes a new URL. This takes the form of nfs://server:port/path, where server is the server name, and port has a default of 2049. Netscape's support for WebNFS means that these URLs may start turning up in the near future.
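
    As a small illustration of the new address form, the Python sketch below takes such a URL apart using the standard urllib library; the server name is invented, and port 2049 is the default mentioned above.

        # Break an nfs:// URL into its parts; the host name is illustrative.
        from urllib.parse import urlparse

        url = "nfs://fileserver.example.com:2049/export/docs/report.txt"
        parts = urlparse(url)

        print("Scheme:", parts.scheme)        # nfs
        print("Server:", parts.hostname)      # fileserver.example.com
        print("Port:", parts.port or 2049)    # falls back to the default, 2049
        print("Path:", parts.path)            # /export/docs/report.txt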

    Web Page Engineering

    Much of the power of the Web page derives from its simplicity. HTML is just text, and very easy to write, which means anyone can create a page. Inevitably, though, as Web technology has become more sophisticated, so the demands made on those who craft Web pages have grown.

    Content has always been paramount: if there is nothing worth reading on a Web page, then all other considerations are irrelevant. But once Web pages were more than simple screens of text, design skills too became necessary.

    Thus a division began to emerge between the content creator - someone skilled at writing text that would work well online - and the Web page designer. The latter would be able to work within the new constraints of the medium, coping with limited screen sizes, colour palettes and the basic fact that the end-appearance of a page was ultimately determined by the viewer's browser settings.

    More recently, a third personage has joined the Web team. Alongside the content and the design, another element is increasingly important: the client-side scripting code that lies hidden in the page, but which often adds important functions.

    This scripting might be JavaScript, or employ advanced features of Dynamic HTML. Design elements are also involved through the use of cascading style sheets. These all require considerable programming skills, and are creating an entirely new class of IT job. For want of a better term, this final member of the Web page production team is sometimes called a Web page engineer.

    Webring

    Webrings are collections of Web sites with some theme in common. They are set up by interested parties, and managed from the central Webring site at http://www.webring.org/.

    Each member of a Webring links to two others, so as to form a conceptual ring. This could obviously be done simply by embedding hyperlinks within pages.

    The beauty of the Webring approach is that everything is handled centrally by the main Webring site.

    In this way, new sites can be added to the ring without changing the code in any page. Random links are also possible. For more information see http://www.webring.org/what.html.
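
    The central bookkeeping involved can be sketched in a few lines of Python: sites are held in a simple list, and each member's neighbours are computed from it, so adding a site never requires changes to pages already in the ring. The sketch is purely illustrative and is not based on the real Webring implementation.

        # Toy model of a centrally managed ring of sites: neighbours and a
        # random link are worked out from one central list, so member pages
        # never need editing when the ring changes. Site names are invented.
        import random

        class Ring:
            def __init__(self, name):
                self.name = name
                self.sites = []

            def add(self, url):
                self.sites.append(url)

            def neighbours(self, url):
                i = self.sites.index(url)
                prev = self.sites[(i - 1) % len(self.sites)]
                nxt = self.sites[(i + 1) % len(self.sites)]
                return prev, nxt

            def random_site(self):
                return random.choice(self.sites)

        ring = Ring("model-railways")
        for site in ["http://a.example.com", "http://b.example.com", "http://c.example.com"]:
            ring.add(site)
        print(ring.neighbours("http://b.example.com"))
        print(ring.random_site())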

    You can join existing rings or create new ones, details at http://www.webring.org/cgi-bin/wrnewring. You then manage the details of your participation at http://www.webring.org/manage.html. There are over 22,000 separate rings linking over 300,000 sites. A subject listing can be found at http://www.webring.org/ringworld/.

    Of interest to business are the Webring for computers at http://webring.org/ringworld/comp.html and the business and economy listings at http://www.webring.org/ringworld/bus.html.

    Web-safe colours

    Part of the Internet's great promise is platform-independence. The use of TCP/IP protocols for wiring and HTML for presentation seemed to offer, at last, the possibility of getting away from sterile arguments about which computer system was best.

    Of course, things haven't quite worked out that way. On the software side, the continuing battles between Microsoft and Netscape have still not been resolved, with each introducing 'improvements' - or even just wilful idiosyncrasies - that deviate from official standards.

    And even on the hardware side, there remain problems. For example, you might think that a Web graphic is a Web graphic. But a key issue for images not in black and white has to do with the colour palette employed.

    Many computer systems display a maximum of only 256 colours (eight-bit mode). They therefore have what is known as a palette of 256 colours with which an approximate representation of any of the potentially many millions of perceivable colours is created.

    That approximation is achieved using dithering: the placing together of two different pixels to simulate the effect of a third colour not available directly from the palette.

    However, the core 256 colours available on a Windows-based PC and a Macintosh system are not quite the same, leaving Web designers in something of a quandary: which to adopt? The best solution is to employ a slightly smaller palette of 216 colours, common to both Windows and Macintosh platforms, which can therefore be displayed on both without dithering.

    This is known as the Web-safe or browser-safe palette, since it is the one employed by the leading browsers in eight-bit mode.
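
    The palette itself is easy to generate: it consists of every combination of six evenly spaced values (hexadecimal 00, 33, 66, 99, CC and FF) for red, green and blue, giving 6 x 6 x 6 = 216 entries, as the short Python sketch below shows.

        # Generate the 216 Web-safe colours: every combination of six evenly
        # spaced red, green and blue levels.
        levels = [0x00, 0x33, 0x66, 0x99, 0xCC, 0xFF]

        palette = ["#%02X%02X%02X" % (r, g, b)
                   for r in levels for g in levels for b in levels]

        print(len(palette))    # 216
        print(palette[:5])     # ['#000000', '#000033', '#000066', '#000099', '#0000CC']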

    Web TV

    One of the concerns about the future development of the Internet is that its consumer base is proportionately quite small - certainly compared to that of television, say. With this in mind a number of companies have put together what are now generally called Web TVs with the aim of connecting the general population to the Internet - and so creating a huge online audience - using the two commonest pieces of high-technology found in the home: the TV and the telephone.

    The idea is to complement the basic display and sound capabilities of the television with additional processing power plus built-in Web browsing and email software, and then connect the ensemble to the telephone line using a modem. All of these extras are packaged in a small set-top box which is truly plug and play - no set-up is required.

    Although the approach is sound in theory, in practice there are a number of obstacles to be overcome. Technical challenges include producing a good enough image on the TV screen - one that can be viewed from several metres - and solving the problem of input. Currently the mouse is the main way of navigating Web pages, but clearly this is not an option for the Web TV.

    The other issue is price: the Web TV will only succeed in enfranchising huge new swathes of the population if the set-top unit represents a relatively small incremental cost, and if the subscription charges for the Internet (usually bundled with the hardware) are low enough. It is not yet clear whether the pricing of the Web TVs currently available is low enough to provide the hoped-for breakthrough.

    Over the last year, much has been heard about the idea of a slimmed-down Internet unit designed for the home, variously called a WebTV or information appliance. As part of this trend, Microsoft has bought the WebTV company for $425 million, while Netscape set up a new company called Navio, now merged with Oracle's Network Computer efforts. The idea is to broaden the appeal of the Internet to the many millions who do not have a PC or modem, and who are unlikely ever to purchase them.

    One of the leaders in this area is Diba. The company was formed by Farzad Dibachi, an engineer from Oracle who believed in the new platform sufficiently to start his own company to produce Internet desktop units for televisions, as well as other Internet-enabled devices. Diba's "vision white paper" - Appliances for the Information Age - gives a good introduction to the company's aims, and more technical information is also available on its site.

    Perhaps most fascinating is the occasional diary Dibachi kept as he went about creating his company. Similarly, the press releases give a good résumé of the company's growth. One of the most important announcements concerns the availability of Diba Internet Set Top Starter Units - the first real products, in other words - with further details available on the site, along with background information on the general Diba Information Appliance Suite architecture and details of the end-user software.

    One of these programs is the Diba browser. This is HTML-compatible, but optimised for use on information appliances such as televisions, and incorporates anti-flicker and anti-alias features. Because these new appliances are limited in their display capabilities, it will be necessary either to redesign Web sites accordingly, or at least to produce special versions for these new visitors. Diba is offering some guidelines on what kinds of changes need to be made.

    Webtop

    Back in the days when Netscape was still determining much of how the Internet would develop - back in 1997, say - one of its more impressive-sounding ideas was the webtop. As a press release from that year explained, "A webtop is a channel that is anchored to the desktop, providing users with a full-screen immersive environment where information flows freely and content helps define the user experience."

    Adding that sting in the tail - the fact that the webtop would help define the user experience - proved an act of hubris for Netscape. For it was an implicit challenge to Microsoft's ownership of the desktop - which the webtop was designed quite literally to efface. Seeing its worst fears confirmed by this bold intent, Microsoft intensified its efforts to beat Netscape in the browser market - a goal it has now achieved in most possible ways.

    Irony

    Given this unhappy saga, it is deeply ironic that the webtop idea - if not Netscape's personal vision of it - is now in the ascendant, and looks likely to assume an increasingly important role for businesses and general users in the world of Internets, intranets and extranets.

    The first hint that webtops were on the way can be seen in the sudden rise of the free e-mail services such as Hotmail (now owned by Microsoft as part of its MSN stable of sites). The service's success was driven by the fact that it offered an important alternative to conventional SMTP/POP3 systems. In particular, by employing the Web browser as the way to retrieve mail held on central servers, it allowed users to access their post from anywhere on the Web, even away from their own PCs - at cybercafés, for example.

    Next to adopt the Web browser as the interface to an Internet-based service were the online calendars such as Jump. Again, the real boon of such services - aside from their undeniably attractive zero cost to users - was the way in which they could be accessed from anywhere and from any Internet-enabled system. As the Internet became an increasingly integral part of the way people worked, these new Web-based services allowed them not just to stay in touch but also to work across the Internet irrespective of their location.

    More recent additions to this family of Web-based applications include Webex, from the company Active Touch. This free service allows users to conduct meetings online, share documents, give presentations, move around the Web together and even work with any application in real-time, using just a browser. That is, the Web browser becomes the interface to all these separate applications. A webtop, no less.

    Further proof that the webtop approach can address all of a user's needs is provided by Punch Networks' service, which offers secure file upload, access and sharing. Here the basic Web browser even assumes fundamental systems functions.

    Similarities

    Those with an eye for similarities may have spotted that these applications - e-mail, calendaring and so on - are the kind that were offered with the original Network Computers, using Java-based software.

    The Network Computer signally failed to take off, but as the current development of the webtop shows, Sun and its allies were perhaps trying to do the right thing, albeit in the wrong way.

    Perhaps in recognition of this, Sun has now come out with its i-Planet software. This is a suite of Web-based applications including e-mail, calendaring, HTML front ends to general applications, and file management (see the screenshot) - precisely those offered by the various free Internet services discussed above.

    Java plays a central role in the working of i-Planet, so in a sense Sun was right about everything - except the new hardware platform it mistakenly tried to foist on to users, and which ultimately led to the demise of the original Network Computer idea.

    White Pages

    If there is one thing that surprises newcomers to the Internet more than anything else it is that there is no overall index. A related source of astonishment is that there is not even a central registry for people: even though something like 30 million users are effortlessly accessible using Internet e-mail, there is no one place where you can look to find out what their online address is.

    In the face of this need, there have been several attempts to create just this kind of database. There are various systems, called by a variety of names - CSO, whois, netfind, X.500 - while the generic term used for describing this kind of information is White Pages.

    The analogy is with conventional telephone books (which do indeed have white-ish pages). Moreover, there is an intentional contrast with Yellow Pages, which also exist on the Internet. These offer information not about people, but about services and companies. Like the White Pages, they too are fragmented and incomplete.

    In a sense, both White Pages and Yellow Pages are doomed to failure at the current time. By its very nature, the Internet is decentralised and unpoliced, and so there is little hope that an obligation to post entries to an all-encompassing directory could ever be imposed.

    What is more likely to happen is that some kind of commercial alternative will evolve. By drawing together and collating information about e-mail addresses from thousands of disparate sources, and then charging tiny sums for online directory enquiries - or perhaps using sponsorship - the Holy Grail of a viable White Pages service may one day be attained.

    The site at Global Mega - People Finder allows you to carry out searches at online white pages directories from a single page. As well as the main US directories there are also European entries.

    Other sites include Four11 (at http://www.four11.com/), WhoWhere (at http://www.whowhere.com/) and Switchboard (at http://www.switchboard.com/). Also useful are major E-mail services such as Hotmail (http://www.hotmail.com/) - but only for finding people's free E-mail addresses, and even then they represent only a fraction of the total user universe. The same is true for Mirabilis (http://www.mirabilis.com/), now owned by America Online.

    You can search through users' homepages at http://www.dreamscape.com/frankvad/search.homepages.html, http://www.directoryguide.com/system/directoryguide.cfm?category=1450000, http://www.bltg.com/people/ and http://ahoy.cs.washington.edu:6060/.

    Online communities are another source of names and information: the most famous is Geocities, with a search page at http://www.geocities.com/search/, and Tripod offers one at http://www.tripod.com/tripod/search/.

    Will Ldap be the Internet's White Pages standard?

    Progress is being made in the search for a comprehensive directory for the Net. But how far off is the complete article?

    Whereas search services designed to locate information are now very powerful and essentially complete (tools like Altavista hold the prospect of a full-text index to the World-Wide Web, for example), White Pages are fragmented among many small, local databases, making them almost useless.

    A universal White Pages directory does exist, in the shape of the X.500 service. However, despite the backing of governments and international standards organisations, this has signally failed to take off. The reason is simple. In contrast to the transparent syntax used for SMTP (Internet) addresses, X.500 forms are unmemorable and awkward to use. Moreover, X.500 software, both client and server, tends to be large, and the protocol used to communicate between the two - Directory Access Protocol (Dap) - is also overly complex.

    However, X.500 does have some virtues, and this has led to work that tries to reduce its requirements. Researchers at the University of Michigan have come up with a lighter protocol based on Dap called Lightweight Dap (Ldap). This runs over TCP/IP and offers most of the benefits of Dap in a simpler implementation.

    However, Ldap was originally designed to work with the full X.500 service, and required an X.500 server to hold the White Pages information. More recently, the Michigan team has come up with a stand-alone Ldap server, called slapd; this runs under Unix and is freely available - read the Michigan team's introduction to Ldap.

    This might have remained an academic curiosity were it not for the fact that Netscape has now taken up the idea, and has proposed the new Ldap protocol and server as the basis of the long-sought universal Internet White Pages. The idea has a good chance of becoming accepted, not just because Netscape has considerable clout in the market, but because other key players such as Novell, Sun, Hewlett-Packard, IBM and Silicon Graphics have all announced that they will support Ldap access. Even more significantly, Microsoft too has said that it will join in.

    Netscape has put together an exemplary document on what Ldap is and how the company will support it.

    The basic ideas behind Ldap directories are very similar to those of X.500. That is, the directory consists of entries, each of which is a collection of attributes with an unambiguous overall name, called a distinguished name.

    Although the syntax of these distinguished names is more or less identical to that of X.500, the important difference is that users should never need to employ it. Instead, they will be able to use common names, which will return a standard SMTP E-mail address such as joe.bloggs@acme.co.uk. All of the X.500-like mechanics are hidden below the surface.

    Ldap directory entries are arranged in a hierarchical tree-like structure, with geographical and organisational distinctions at the top, and smaller-scale differences lower down. Ldap works using the standard client/server model: an Ldap client connects to an Ldap server across a TCP/IP connection and makes a query. The server responds or makes a referral to another server.
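
    The exchange can be sketched using the third-party Python ldap3 package (a present-day library, not something mentioned by Netscape); the server name and directory base below are hypothetical.

        # Sketch of an Ldap client query using the third-party ldap3 package.
        # The server name and directory base are hypothetical.
        from ldap3 import Server, Connection, ALL

        server = Server("ldap.acme.co.uk", port=389, get_info=ALL)
        conn = Connection(server, auto_bind=True)   # anonymous bind

        # Ask for the e-mail address of a common name; the X.500-style
        # distinguished names stay hidden below the surface.
        conn.search("dc=acme,dc=co,dc=uk", "(cn=Joe Bloggs)", attributes=["mail"])
        for entry in conn.entries:
            print(entry.mail)

        conn.unbind()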

    Ldap offers advanced features described in Netscape's document and in the official RFC 1777. There are sophisticated procedures for ensuring changes to Ldap directories are co-ordinated for consistency. Access control is also implemented, a crucial requirement for secure business use.

    One interesting by-product of this proposal is a new URL. With future versions of Netscape Navigator, for example, you can access Ldap-compliant directories using the syntax ldap://ldap.netscape.com. Although this looks a little strange, one day it might be as familiar as http://home.netscape.com/.

    Ldap World at http://www.critical-angle.com/ldapworld/Welcome.html includes a FAQ at http://www.critical-angle.com/ldapworld/ldapfaq.html that answers most of the questions on the subject.

    Windows NT and the World-Wide Web

    For most of its lifetime the Internet has been synonymous with Unix; but for Web servers, at least, there is now an interesting competitor: Windows NT. One signal of this shift is the growing number of URLs ending in .htm rather than the full .html preferred by Unix - though, of course, this in itself is no proof that Windows NT is taking over. But allied to this suggestive change has been a very marked increase in the availability of commercial Web server software running on Windows NT. Many of these programs also run under Windows 95, but the more robust and secure NT is to be preferred as a server platform.

    The advantage of these new products is twofold: they are much easier to set up than most current Unix systems, and they tend to be much cheaper. This makes them ideal for businesses who wish to dip a toe in the Web's cyberwaters without committing themselves too deeply either financially or in terms of extra specialised staff.

    The NT Web revolution was started by Microsoft itself as part of its Damascene conversion to the Internet last year. Very shrewdly, it commissioned the University of Edinburgh to write a simple but effective NT Web server, HTTPS, which was then made available free over the Internet (you can download it from http://emwac.ed.ac.uk/html/internet_toolchest/https/software.htm). To underline the robustness of the system, Microsoft runs its own hugely popular Web site using this software; it is also working on a new Web server product code-named 'Gibraltar', due out early next year.

    HTTPS may be free, but its functionality is limited. A commercial version called Purveyor costs £395 (see http://www.integralis.co.uk/process/white-paper/purveyor/index.html for more details) and adds some much-needed security features, including the ability to function as a proxy.

    Once the viability of NT as a Web server platform was established by HTTPS, other commercial companies were not slow to join the fray. What is particularly noticeable is how quickly these programs have grown in sophistication.

    For example, WebSite from the book publishers O'Reilly & Associates (£345, available from 0171-497 1422 ext.211; information and a trial version at http://204.148.40.6/) uses a highly-graphical approach for site management, including a tool called WebView that presents all HTML documents as a tree structure very similar to Windows 95's Explorer. It also comes with an extremely full manual that will be invaluable for newcomers.

    Netscape Communications' NT server even uses a Web browser to lead you through the configuration of the server and to modify it thereafter - a good example of how flexible such browsers can be, and how manufacturers are trying to make the setup of servers as easy as possible in order to broaden their appeal to general business users. A free trial version of Netscape's server, which costs $795 from (01753) 622061, can be downloaded from http://home.netscape.com/comprod/server_central/test_drive.html.

    Another commercial NT Web server is NaviServer. This is produced by NaviSoft, the Internet software arm of America Online which is one of the leading online service providers in the US. NaviServer is also notable for the highly-graphical approach it uses for Web management: its NaviPress tool presents a complete image of the HTML documents on a site together with all the links between them - a true web. It costs $1495 but is also available in a free trial version: see the URL http://www.navisoft.com/products/server/server.htm for more details. An NT version of Quarterdeck's £130 WebServer is due early next year. See http://www.qdeck.com/qdeck/products/WebServr/ for details.

    Alongside these major players, there are a number of products from smaller companies hoping to establish themselves in this fast-growing sector. These include Alibaba (http://www.csm.co.at/csm/), Commerce Builder (http://www.aristosoft.com/ifact/inet.htm), FolkWeb (http://www.ilar.com/folkweb.htm) and SAICHTTP (http://wwwserver.itl.saic.com/).

    Winsock

    To simplify enormously, a program written to the Winsock standard acts as a kind of bridge between the data that is sent over the Internet according to the basic Transmission Control Protocol/Internet Protocol (TCP/IP) rules, and the end-user Windows programs that are run on a desktop PC.

    The big advantage for Windows software developers is that they can write to a standard: theoretically, any Winsock-compliant program will work with any Winsock. Winsock is also a boon for users, since it means that once a Winsock is installed, you can then (theoretically, at least) run any Winsock-compliant Windows Internet utility on top of it - and more than one at the same time.
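
    Winsock mirrors the Berkeley sockets model long used on Unix, so the shape of the calls an application makes can be suggested with Python's standard socket module: the program asks the sockets layer for a TCP/IP connection and reads from it, without caring how the packets reach the wire. The host name below is a placeholder.

        # Sketch of the sockets-style service a Winsock provides to programs:
        # open a TCP/IP connection, send a request, read the reply.
        # The host is a placeholder.
        import socket

        with socket.create_connection(("www.example.com", 80)) as sock:
            sock.sendall(b"GET / HTTP/1.0\r\nHost: www.example.com\r\n\r\n")
            reply = sock.recv(1024)

        print(reply.decode("latin-1").splitlines()[0])   # e.g. HTTP/1.0 200 OK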

    Microsoft includes Winsocks for NT and Windows 95 as standard, and one for Windows for Workgroups is available from its FTP site.

    Trumpet Software (at http://www.trumpet.com.au/) is the firm founded by Peter Tattam to sell and support his Trumpet Winsock, the program that probably did more than any other to drive the uptake of TCP/IP - and with it, the Internet - on PCs. It is very easy to set up, but cannot be employed in conjunction with a TCP/IP network.

    An obvious thing to try is to connect to the Internet using a modem attached to one of the networked machines, and then use the network to pass the TCP/IP packets to other computers. This can be achieved using NT, functioning as a router. For Windows 95 (and NT) users there is a product called WinGate. WinGate acts as a proxy; it receives a request for Web pages, say, over the TCP/IP network, fetches them from the Internet, and then serves them up to the PC that requested them. WinGate is available from http://nz.com/NZ/Commerce/creative-cgi/special/qbik/wingate.htm and at the European mirror at http://www.common.dk/qbik/wingate.htm. The author will give you the registration key free if you are connecting only two PCs, and the prices for more than two are very reasonable. There is also some extremely good information on how WinGate works.

    The success of Winsock has led the Winsock Standard Group - an independent forum of several hundred developers - to come up with Winsock 2. The idea is to take the elements of the current Winsock 1.1 that have led to its widespread deployment, and to build on them.

    For example, the common TCP/IP standard offered by Winsock to application programmers is to be extended to other communication protocols. Initially these will be Novell's IPX/SPX, Apple's Appletalk and OSI, but the modular nature of Winsock 2 will allow others to be added in the future. In a similar way, new media are being embraced, including ISDN, ATM and wireless.

    Another innovation is the support for the Quality of Service (QoS) concept. This is necessary in order to allow real-time multimedia applications across the Internet and other networks, where certain time-sensitive data streams must be given priority over others - E-mail for example - for which delays are not so crucial.

    Wide Area Information Servers

    Wide Area Information Servers (Wais) - pronounced ways - are yet another attempt to provide a way of searching for material across the Internet's rich holdings, complementing Archie and Veronica.

    With Archie you need to know more or less exactly the name of the file you are looking for. Veronica searches are carried out through the collection of menus that go to make up the various Gophers around the world. Wais scores because it provides a full-text search tool that lets you look for words or phrases in the documents themselves. However, it only applies to collections that have been pre-indexed for use with Wais.

    There are several hundred such indexes, but they tend to be narrow in their scope. Only if your search topic happens to be well indexed will your search pinpoint relevant documents.

    Using E-mail to carry out a WAIS search

    Undertaking a WAIS search involves choosing where you will submit your enquiry. A list of sources, or databases, that support WAIS interrogations can be obtained by sending the message search xxxx xxxx to the address waismail@sunsite.unc.edu, where the E-mail WAIS service is run. The message is a kind of dummy command, and more or less anything could be substituted for the xxxx above. The response is something like:

        xxxx is not available for searching. You may search the following sources:
        AAS_jobs.src
        AAS_meeting.src
        ANU-ACT-Stat-L.src
        ANU-Aboriginal-EconPolicies.src
        ANU-Aboriginal-Studies.src
        ANU-Ancient-DNA-L.src

    where a list of several hundred sources (as indicated by the .src extension) then follows. The secret of effective WAIS investigations is choosing the right database to search.

    For example, suppose you were interested in finding out about blue whales. Looking through the list of databases you see one called biosci.src, which you might guess would be suitable. To find out, you would send the message

        setmax 20
        search biosci.src blue whale

    to the same waismail@sunsite.unc.edu address. The first line sets the maximum number of possible replies, while the second specifies that you want to find out about the blue whale in the database biosci.src. After a while you will get a list of references, each with a field DocID:... To retrieve one of these hits you send the section starting DocID: to waismail@sunsite.unc.edu.
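
    For those who prefer to automate the process, the sketch below constructs and sends exactly this two-line message using Python's standard smtplib; the sender address and outgoing mail server are placeholders.

        # Compose and send the two-line WAIS search message by E-mail.
        # The From address and the outgoing mail server are placeholders.
        import smtplib
        from email.message import EmailMessage

        msg = EmailMessage()
        msg["From"] = "you@example.co.uk"
        msg["To"] = "waismail@sunsite.unc.edu"
        msg["Subject"] = "WAIS search"
        # First line caps the number of replies, second names the source to search.
        msg.set_content("setmax 20\nsearch biosci.src blue whale\n")

        with smtplib.SMTP("mail.example.co.uk") as smtp:   # your own mail server
            smtp.send_message(msg)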

    WorldCom

    The sale of CompuServe, in 1997, came as no surprise: what had once been the clear leader in online services had seen its membership levels stagnate while rival America OnLine (AOL) grew at a tremendous pace. After years of denying that the Internet would ever represent a serious alternative to its proprietary approach, CompuServe had been struggling to find a profitable - or even a plausible - online niche.

    But, although its recent sale was long-expected, the buyer was not. WorldCom is not a company that has hitherto had much of a profile in the Internet world, but this deal, combined with others that it has made recently, has propelled it to the first rank of online players, and turned it into a company that will have considerable importance for every business user of the Internet.

    WorldCom was born not in a garage but in a coffee shop in 1983. It started offering telephone services in the US after the break-up of AT&T and the liberalisation of the telecoms sector. It grew steadily by acquisition and natural expansion, and soon offered local, long-distance and international telephone services as well as more specialised high-speed connections (see the fascinating timeline of its evolution on its Web site).

    It made one of its most significant acquisitions on 31 December last year, when it bought MFS Communications, another huge if relatively unknown telecoms company. MFS plays a crucial role in providing large-scale connectivity to ISPs and companies. MFS, in its turn, had become a major Internet player through the purchase of UUnet last August, and UUnet itself had swallowed up one of the UK's major Internet service providers, Pipex, the year before. As a result of all this activity, in 1996 WorldCom's turnover was $5.64bn (£3.5bn), according to its annual report.

    In other words, even before the acquisition of CompuServe, WorldCom was important for the Internet. In this context, the acquisition of an ailing online company might seem to imply a loss of focus: after all, millions of people accessing online forums have little to do with global connectivity. But WorldCom is not keeping all of CompuServe: if the deal is approved by the authorities, it will sell the membership to AOL, the leading online service.

    WorldCom will keep Compu-Serve's huge global networking infrastructure. Moreover, it will buy AOL's networking company ANS, and has entered into a five-year contract with AOL to provide Internet connectivity.

    AOL gets more than two million extra members (many in Europe, where it is weak), taking it to more than 11 million, and making it effectively untouchable even by Microsoft (MSN has 2.3 million members). AOL also gets out of the Internet connectivity business, and solves a long-standing problem - lack of modems - by adding WorldCom's huge network of 700,000 dial-up points.

    But the real winner is WorldCom. At a stroke, it has become the equal of telecom giants such as AT&T, Sprint and the new BT-MCI company Concert. Its acquisition of CompuServe signals its definitive transformation from a patchwork of US telephone companies into perhaps the leading supplier of Internet connectivity at every level (an interesting map of its network can be found on its site).

    This is good news for corporates because WorldCom has traditionally been very aggressive in its pricing and in smashing open the cosy cartels found in the telecoms industry. It has also shown itself to be ready to supply the very latest technology: for example, its subsidiary MFS is already offering the new Digital Subscriber Line (DSL) which allows megabit download speeds over ordinary telephone lines.

    Since it seems clear that companies like BT are happier to make profits out of ancient technologies such as ISDN rather than take a chance on new ones such as DSL, WorldCom may well provide the impetus required to shake up these fusty telecoms monoliths and to force them to accelerate the introduction of new Internet technologies and more competitive pricing.

    The Internet Worm

    Virus researchers usually distinguish between those programs that are parasitic, requiring some kind of software host to be run before they can spread, and those that are entirely standalone. The vast majority of malicious infective programs are indeed viruses.

    However, the most important infective program in the history of the Internet is not a virus. It is one of a class of destructive programs called worms, which simply propagate themselves rather than engaging in any more directly malicious action.

    But as the incident of 2 November 1988 showed, even worms can wreak havoc. In this case, it managed to bring much of the then Internet to a standstill as computers on the network spent more and more of their resources running copies of the infection.

    The Internet worm was a self-replicating program that infected Vax computers and Sun 3 workstations running Berkeley Unix. It exploited security holes in two programs: one used for sending E-mail, and the other employed for the finger service, used to offer simple information about users. It also engaged in a little automatic cracking, trying to guess passwords so as to obtain access to user accounts.

    The Internet worm incident has proved an isolated one: the Internet has not been brought to its knees by an infection since. In part this is no doubt due to the fact that the worm caught the online world napping, and as a result more efforts have been made and structures put in place to combat future outbreaks.

    Just how important the Internet worm was can be gauged from the fact that it even has an entire RFC devoted to it (RFC 1135), published a year after the attack.

    X.509

    The main form of encryption used on the Internet is that employing public keys - necessarily, since the enormous distances spanned by the Internet make the personal exchange of private keys effectively impossible. The fact that keys are made public means that there is scope for impersonation. A public key could be posted, allegedly belonging to one individual, but in fact created by another.

    To get round this problem, digital certificates can be employed to establish that public keys really belong to the person who claims them. This is done using a trusted Certification Authority (CA) which checks the purported owner's claims using some means (physical proof of identification, for example) and then adds a special digital signature to the public key that effectively validates it.

    Of course, an important issue here is compatibility: without standards, a digital certificate issued by one CA might be incompatible with those issued by another, and both might be inaccessible to the software employed by someone who wishes to check and use a certified public key.

    To avoid this situation, a standard form for digital certificates is being drawn up. Called X.509, it originally formed part of the X.500 series of standards, but additional work is going on to extend it to embrace all kinds of Internet services, including e-mail, the World Wide Web, user authentication and electronic commerce. Given the growing need for such a common digital certificate standard it is highly likely that its acceptance and uptake will be very rapid once it is finalised. Indeed, many products employing certificates already support the draft version of X.509.
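
    A certificate check of this kind can be seen in miniature with Python's standard ssl module, which retrieves a Web server's certificate, verifies it against the trusted CA roots and prints who issued it and to whom; the host name below is illustrative.

        # Retrieve and verify a server's X.509 certificate, then show its
        # issuer and subject. The host name is illustrative.
        import socket
        import ssl

        context = ssl.create_default_context()   # uses the trusted CA roots

        with socket.create_connection(("www.example.com", 443)) as sock:
            with context.wrap_socket(sock, server_hostname="www.example.com") as tls:
                cert = tls.getpeercert()

        print("Subject:", cert["subject"])
        print("Issuer: ", cert["issuer"])
        print("Expires:", cert["notAfter"])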

    XML

    The success of the Web is due in no small part to the simplicity of the underlying HTML which is very easy to write and yet addresses most of online publishing's basic needs. But as the Internet matures, the limitations of HTML are becoming apparent.

    Vendors such as Netscape and Microsoft have responded by extending the basic HTML with their own improvements. One obvious problem with this is that it fractures the HTML standard. But there is a deeper issue: bolting on ad hoc additions in this way will always be a partial and temporary solution.

    What is needed is a general method for extending HTML according to each situation. That is, a mechanism is required whereby new tags can be added as needed. Such a mechanism already exists for SGML - Standard Generalised Markup Language.

    HTML is sometimes called a subset of SGML, but this is incorrect: it is an application of it, not a cut-down version. However, a true subset of SGML that retained its power to define new tags but without the considerable complexity of SGML itself could well be the solution to HTML's problems.

    Such, at least, is the hope of the World Wide Web Consortium in sponsoring the creation of something called Extensible Markup Language, or XML. This is a simplified version of SGML, not a one-off application of it. It can therefore reproduce HTML as it currently stands, but can also be extended in any way needed for a particular situation.

    XML on its own is likely to be important as a kind of lightweight version of SGML, but its potential is even greater if XML processors are added to the main browsers, allowing HTML extensions to be generated on a per-site basis.

    The current tools for dynamic HTML, which allow almost desktop publishing-like precision in Web page design, run completely counter to the Web's foundations in HTML. HTML is an application of Standard Generalised Markup Language (SGML), and is a language that signposts the underlying structure of a Web document.

    It says nothing about the appearance as such: it is simply that most Web browsers take the structural information and overlay presentation elements. DHTML is the logical conclusion of this process. But as all businesses know, it is not the surface that really counts but the content.

    One of the central challenges facing companies is how to extract information in different ways from the huge data stores they possess. As a result, parallel to DHTML, there has been much work looking at how the structural side of HTML can be strengthened, giving Web users the power to extract more information from their data, and to be able to exchange information more easily.

    Do it yourself
    The result is called Extensible Markup Language, known as XML. XML is a true subset of SGML: like SGML, it allows arbitrary markup languages to be created. HTML, by contrast, is a fixed markup set. In other words, XML lets you make up your own tags, rather than restricting you to a fixed set such as HTML's heading tags. The aim is to allow data to describe itself, and in the process enrich the possibilities for derived information.

    By using markup tags that are fully descriptive, it will be possible to improve searches through documents: no longer will a search for the word "net" turn up a useless mish-mash of hits referring to the Internet, fishing nets, hair nets, etc. By defining suitable XML tags for a given industry, it would be possible for firms and departments to exchange data using embedded XML tags to provide cross-compatibility, regardless of the various underlying database structures.
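
    As a rough illustration of the point about searching, the sketch below uses Python's standard xml.etree library to pick out only the product entries whose names mention "net", rather than matching the word wherever it happens to appear; the document and its tags are invented for the example.

        # Self-describing tags make targeted searches possible: only <product>
        # names are examined, not every occurrence of the word "net".
        # The catalogue document and its tag names are invented.
        import xml.etree.ElementTree as ET

        catalogue = """
        <catalogue>
          <product><name>Hair net</name><price currency="GBP">2.99</price></product>
          <product><name>Fishing net</name><price currency="GBP">14.50</price></product>
          <article><title>The Internet and business</title></article>
        </catalogue>
        """

        root = ET.fromstring(catalogue)
        for product in root.findall("product"):
            if "net" in product.findtext("name").lower():
                print(product.findtext("name"), product.findtext("price"))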

    XML markup allows documents to be viewed in different ways by different users. For example, a company handbook might contain information about all aspects of a given product. But the information that was displayed could vary according to whether it was an engineer, an accountant or a manager who viewed it.

    Similarly, XML markup would allow agent software to locate and display only information relevant to the user who dispatched it. The standard for XML, however, is only really of use for those fluent in SGML. More approachable is the main XML FAQ.

    Introduction
    Also worth noting are two books: XML (£21.95, ISBN 1-56592-349-9), which is fairly technical, and Presenting XML (£22.95, ISBN 1-57521-334-6), which provides a gentler introduction. Alongside the main XML standard, which defines how you can place arbitrary tags within your documents, there are two other important components currently being drawn up.

    The first is the extensible linking language. One of the powerful new features of XML is that it allows hypertext links to be much richer than those in HTML. For example, a hotspot can have several links flowing from it, allowing the user to choose the destination. A link can refer to sections of text in a linked document, and there is also a syntax for describing which element is linked to.

    The other component is the extensible style sheet language (XSL). Style sheets are applied to XML files to produce the document viewed by the user. Because they accomplish more complex tasks than the style sheets used in HTML, they are considerably more complicated.

    One of the firms driving XSL forward is Microsoft: it currently has the best resources on the subject.

    Experimenting made easy with XML tools

    There are already a number of tools and applications that allow users to experiment with XML. For example, since XML is a subset of the Standard Generalised Markup Language (SGML), any SGML editor can be used to create XML files.

    However, not all XML files can be properly dealt with by SGML editors: this is because there is a special minimalist form of XML that comes without the full document type definition, which essentially defines what mark-up tags can be used in a document.

    SGML itself needs a document type definition, but XML does not: provided an XML document is written correctly, XML software can work out what the underlying rules must be. XML lets users knock together their own markup language and leave the details to the software.

    Although SGML editors can generally be used for XML creation, it is better to employ a tool specifically designed for this new language (or rather meta-language, since XML defines other mark-up languages), and one early example is Grif's Symposia Pro.

    Now for the hard part
    In a sense, XML creation is the easy part: you can even use a basic text editor. The hard part is processing a document usefully. First, it must be parsed: that is, the underlying logic of the document must be extracted (either using the explicit document type definition or implicitly from the document). Once that has been done, the resulting information is passed on to another application.
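
    The parse-then-hand-on pipeline can be sketched with Python's standard xml.sax module: the parser extracts the document's structure and passes each element to an application-supplied handler, which here does nothing more ambitious than count tags. The sample document is invented.

        # Parse an XML document and hand each element on to an application
        # handler; this one simply counts the tags it receives.
        import xml.sax
        from collections import Counter

        class TagCounter(xml.sax.ContentHandler):
            def __init__(self):
                super().__init__()
                self.counts = Counter()

            def startElement(self, name, attrs):
                self.counts[name] += 1        # the "application" receiving the data

        handler = TagCounter()
        xml.sax.parseString(b"<orders><order id='1'/><order id='2'/></orders>", handler)
        print(handler.counts)                 # Counter({'order': 2, 'orders': 1})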

    That receiving application will not always be a simple browser, because the point about XML is that every markup language is defined differently, and with very different purposes in mind. As a result, there will probably be as many specialised XML applications as there are mark-up languages.

    For example, Microsoft's Internet Explorer 4 comes with its own XML parser (Microsoft is well in the vanguard of XML development). But the main use of this is to deal with Channel Definition Format files, which are written in XML.

    In this case the file is not itself displayed, but provides information to the browser about the Web pages that go to make up the Active Channel it defines. Another example of XML in action is MathML. This is the long-awaited mathematical markup language which allows equations to be specified using text rather than symbols. There is already an application able to create and display MathML files: WebEQ is written in Java.

    A more dramatic use of XML is the Chemical Markup Language. This allows molecules to be described using an application of XML. Its author, Peter Murray-Rust, has also created Jumbo, a Java-based tool that displays Chemical Markup Language data in a variety of ways. Jumbo is freely available and can display arbitrary XML files as expandable trees.

    Other early XML applications include the Open Software Description format, the proposed standard from Microsoft and Marimba for incremental software updates.

    A way of describing
    The Resource Definition Framework, supported by Microsoft, Netscape and others, is designed to provide a general way of describing data. Another application aims to allow electronic data interchange to take place between companies using XML rather than dedicated protocols.

    Companies producing software that employs or supports XML include Arbortext, Chrystal, Poet, DataChannel and Webmethods. It is still very early days for XML, but it is clear that it will have a major impact.

    Stylesheets

    One significant XML event is the release of the first working draft of the Extensible Stylesheet Language (XSL) - see www.w3.org/TR/WD-xsl. As well as being a key tool for the practical application of XML - XSL essentially handles things like the output format - it is also, confusingly, an application of XML, probably the most important so far.

    But alongside this key building block there have been many other XML-derived markup languages. Some aim to extend XML itself. For example, the Document Content Description language is a Resource Description Format vocabulary designed to describe constraints to be applied to the structure and content of XML documents. It is notable in part for being a joint submission from IBM and Microsoft.

    Other applications lie outside the rarefied world of XML, and apply its techniques to the creation of tools for more practical purposes. For example, the Extensible Log Format aims to create an open and extensible structure for Web server logs.

    Two other applications involve the Unified Modelling Language. XML Metadata Interchange is another IBM effort, designed to give developers working with object technology the ability to exchange programming data over the Internet in a standardised way.

    Another approach to exchanging UML models is offered by the UML eXchange Format.

    Applications that are likely to be more directly useful in business include the Signed Document Markup Language which is for signing, co-signing, endorsing, co-endorsing and witnessing operations on documents. Also in the area of security is the Authority Public Key Distribution protocol.

    Forms

    One interesting proposal is the Extensible Forms Description Language (XFDL), which aims to define an open protocol for creating, viewing and completing complex business forms on the Internet. VCard V3.0 XML recreates the VCard profile standard in XML, while the Process Interchange Format aims to help exchange process descriptions automatically among a variety of business process modelling and support systems. XML for fax is designed for integrating general applications with fax servers.

    Several applications aim to use XML's ability to structure information exchange. For example, the CASE Data Interchange Format is for transferring information between computer-aided software engineering tools, while the Solution Exchange Standard is for exchanging solutions to customer service problems.

    Others are designed for specific professions or domains. For example, there is a markup language for legal documents and one for Internet-based learning. There is an application designed specifically for wireless communications - the Wireless Markup Language - and even one for weather observations - the Weather Observation Markup Format.

    The last two give a hint of just how specialised XML applications can be, and of the range that can be expected in the future.

    Abstract XML Applications

    The World-Wide Web Consortium (W3C) has come out with XML namespaces which allow different XML tag sets to be combined without ambiguity or confusion.

    Much activity is also taking place outside the W3C. There is a draft version of a standard called XML Query Language (XML-QL), a way of making queries to XML documents.

    Similarly, the XML Metadata Interchange (XMI) is designed to "give developers working with object technology the ability to exchange programming data over the Internet in a standardised way".

    In the area of graphics, the W3C has put out a working draft of its Scalable Vector Graphics standard, designed to bring some order to the currently confused world of vector graphics. Meanwhile, adding somewhat to the confusion, Microsoft and Macromedia have proposed Timed Interactive Multimedia Extensions for HTML (HTML+TIME) which aims to extend the earlier W3C XML application Synchronised Multimedia Integration Language (SMIL).

    There is some rivalry in the area of voice, too. For example, there is Voice Extensible Markup Language (VXML) from AT&T, Motorola and Lucent, which aims to make the resources of the World-Wide Web accessible by telephone, while IBM has created Speech Markup Language.

    In fact, one of the significant developments in this field is the extent to which IBM is now backing XML, as its rich site indicates. Another firm that seems to have woken up to XML is Oracle. More recently, Sun has announced that it is working on XML extensions to Java that will bring these two technologies closer together.

    Business products

    These initiatives from major companies will doubtless help push XML into corporations in due course, but a legitimate question might be: Where are the business products that use XML now? In fact, there is already a wide range of these across the whole gamut of business activity.

    For example, XML can be found in all the main desktop application suites (or in their next iteration). Corel's WordPerfect Office 2000 suite includes nothing less than a full editor for creating XML documents. This signals that Corel understands the growing importance of XML, but it is unlikely to prove a mainstream application since the explicit creation of XML tags and documents is almost certain to remain a specialised task.

    Microsoft Office

    Lotus, on the other hand, with its shipping Smartsuite Millennium edition uses XML not directly, but as a way of allowing Smartsuite documents to be read by its Java-based eSuite. This idea of using XML tags to retain all the rich formatting details of office documents is also employed in Microsoft's Office 2000 suite.

    Here any Word document can be saved as an HTML file with embedded XML, as well as a conventional Office binary file. With the former, Microsoft can finally resolve a problem of its own making that has been a thorn in the side of IT managers for years: the incompatible binary formats for documents used by successive releases of Word. This use of XML on the desktop as an open interchange format is likely to become widespread in the future.
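
    Microsoft's actual HTML-plus-XML format is not reproduced here, but the general round-tripping idea - application-specific details carried as an embedded XML block inside an otherwise ordinary page - can be sketched as follows; the tag set and namespace URI are invented for illustration.

# A generic illustration of the round-tripping idea: application-specific
# details travel as an embedded XML block inside an otherwise ordinary
# (X)HTML page, so another program can recover them later. The tag set and
# namespace URI are invented and are not Microsoft's actual Office format.
import xml.etree.ElementTree as ET

page = """<html xmlns:app="http://example.com/appdata">
  <body>
    <p>Quarterly report</p>
    <app:properties>
      <app:author>J. Smith</app:author>
      <app:revision>7</app:revision>
    </app:properties>
  </body>
</html>"""

root = ET.fromstring(page)
NS = "{http://example.com/appdata}"

# A browser simply renders the <p> text; an XML-aware application can also
# recover the embedded details that plain HTML has no place for.
props = root.find(".//" + NS + "properties")
print(props.findtext(NS + "author"), props.findtext(NS + "revision"))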

    Enterprise Application Integration

    XML on the desktop, where it offers an open, vendor-independent method of exchanging information between personal productivity applications such as word processors and spreadsheets, is a useful way of coping with different binary file formats. It is likely to remain a handy rather than indispensable application of XML.

    This contrasts sharply with the situation on intranets, where XML will prove a major boon. For while the deployment of TCP/IP-based networks may have linked together disparate corporate systems, it has also exacerbated the problems of data exchange. To put it cynically, intranets mean that even more enterprise programs are unable to understand each other.

    The whole issue of getting them to communicate is now dubbed, rather grandly, enterprise application integration, and there is an Enterprise Integration Council which aims to promote work in this area.

    XML, with its ability to act as a flexible mediator between proprietary file formats, is fast emerging as the way of removing this significant stumbling-block to the more efficient use of by-now commonplace intranet systems.

    One approach is to convert content from different sources into XML so that it can all be pooled together and processed without regard to its origin. This is the route taken by DataChannel, which offers various tools for content conversion to enable it to be used within its broader XML framework.
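
    DataChannel's own tools are not described here, but the general technique of normalising heterogeneous content into a common XML pool can be sketched as follows; the source formats and tag names are invented for illustration.

# A sketch of the general "convert everything to XML" approach to pooling
# content from different sources. The source formats and tag names here
# are invented for illustration and do not describe any vendor's product.
import csv
import io
import xml.etree.ElementTree as ET

def csv_to_records(text):
    """Read CSV rows into plain dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def records_to_xml(records, root_tag="pool", record_tag="record"):
    """Wrap each record as an XML element so its origin no longer matters."""
    root = ET.Element(root_tag)
    for rec in records:
        elem = ET.SubElement(root, record_tag)
        for key, value in rec.items():
            ET.SubElement(elem, key).text = str(value)
    return root

# Two sources in different shapes: a CSV export and an in-memory structure.
csv_source = "name,price\nWidget,9.99\nGadget,4.50\n"
app_source = [{"name": "Sprocket", "price": "2.75"}]

pool = records_to_xml(csv_to_records(csv_source) + app_source)
print(ET.tostring(pool, encoding="unicode"))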

    Another method is to use Web browsers to generate XML documents that are then converted and passed on to enterprise applications as native requests.

    One company adopting this approach is Evolve, which offers a useful white paper on this area.

    Alongside these, there is a new generation of products built on what are generally called XML servers. ObjectDesign is one of the first companies to offer such a product, and it explains the difference between its XML server eXcelon and its Objectstore database as follows: "Objectstore is a general-purpose ODBMS that supports three data models: C++, Java, and XML. eXcelon is an XML data server, and . . . the data model for all data is XML. In addition to storing, cacheing, managing and serving XML data, eXcelon also has several XML-specific features such as XQL support."

    XML Query Language (XQL) is an emerging standard that is likely to become one of the key technologies in this area - there is a good introduction here. Formal details are also available.
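
    XQL syntax itself is not reproduced here, but the flavour of such element-path queries can be suggested with the limited XPath-style searching in Python's standard library; this is an analogue of the idea rather than XQL, and the invoice document is invented for illustration.

# A loose analogue of the kind of path-based query XQL is intended to
# standardise, using the limited XPath support in Python's standard
# library. This is not XQL syntax; the invoice document is invented
# purely for illustration.
import xml.etree.ElementTree as ET

doc = """<invoice>
  <item><sku>A100</sku><desc>Widget</desc><qty>3</qty></item>
  <item><sku>B200</sku><desc>Gadget</desc><qty>1</qty></item>
</invoice>"""

root = ET.fromstring(doc)

# Select the description of every item whose <sku> child is "B200".
for desc in root.findall("./item[sku='B200']/desc"):
    print(desc.text)   # -> Gadget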

    Another company that will be offering an XML server is Software AG. Unlike ObjectDesign's site - which uses an appalling mixture of frames and Javascript that makes it impossible to give direct URLs to further information - Software AG's pages on its new product, called Tamino, are easily navigable and very detailed, and are well worth reading.

    Tamino supports a subset of XQL, and will follow the full World-Wide Web Consortium recommendation for the XML Query Language when it is finalised. Tamino itself is scheduled to appear in the fourth quarter of 1999.

    Bluestone has also developed an XML server, which forms part of a larger XML suite. This consists of the Visual-XML development environment; XwingML, a Java GUI generator; and XML-Server.

    XML-Server can dynamically receive, interpret and generate XML documents, and bolts on to Bluestone's Sapphire/Web application server framework for larger installations.
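
    Bluestone's product is, of course, far more elaborate, but what "receive, interpret and generate XML documents" means at its simplest can be sketched as a small HTTP endpoint; the sketch below is an invented illustration, not a description of XML-Server, and the <ping>/<pong> documents are made up.

# A minimal sketch of what an XML server does at its simplest: accept an
# XML document over HTTP, interpret it, and generate an XML reply. This is
# an illustration only and bears no relation to any vendor's product; the
# <ping>/<pong> documents are invented.
import xml.etree.ElementTree as ET
from http.server import BaseHTTPRequestHandler, HTTPServer

class XMLHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        request_doc = ET.fromstring(self.rfile.read(length))

        # Interpret the request and generate an XML response document.
        reply = ET.Element("pong")
        reply.text = request_doc.findtext("message", default="")
        body = ET.tostring(reply, encoding="utf-8")

        self.send_response(200)
        self.send_header("Content-Type", "text/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # POST <ping><message>hello</message></ping> to http://localhost:8080/
    HTTPServer(("localhost", 8080), XMLHandler).serve_forever()

    A real XML server would, of course, add storage, query support and routing to back-end applications on top of this basic receive-and-respond loop.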

    Bluestone has created a good image of how XML servers fit into the overall corporate computing architecture.

    XML for business infrastructures

    XML is likely to prove a key technology on intranets through its ability to integrate quite disparate and otherwise incompatible systems. And of course what is true for intranets is also true for extranets, which are simply applications of the same idea to secure networks that extend beyond a company's walls.

    Typically the approach is to use XML as a mediator across the extranet. In this way, information such as orders and receipts can be passed among companies as XML documents and converted to whatever format is used within each company concerned.
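
    As an entirely invented illustration of this mediation role, the sketch below turns an incoming XML order into the kind of internal record a receiving company's own system might expect; the tag names and record layout are assumptions, not any published interchange schema.

# An invented illustration of XML as an extranet mediator: a partner sends
# an order as XML, and the receiving company converts it into its own
# internal record format. Tag names and the record layout are assumptions,
# not any published interchange schema.
import xml.etree.ElementTree as ET

incoming = """<order number="4711">
  <buyer>Acme Ltd</buyer>
  <line sku="A100" qty="3"/>
  <line sku="B200" qty="1"/>
</order>"""

def order_to_internal(xml_text):
    """Convert a partner's XML order into the in-house record format."""
    order = ET.fromstring(xml_text)
    return {
        "order_no": order.get("number"),
        "customer": order.findtext("buyer"),
        "lines": [(line.get("sku"), int(line.get("qty")))
                  for line in order.findall("line")],
    }

print(order_to_internal(incoming))
# -> {'order_no': '4711', 'customer': 'Acme Ltd',
#     'lines': [('A100', 3), ('B200', 1)]}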

    One early player in this area was webMethods, a pioneer in the whole business XML sector. It has a product called B2B, and there is a good FAQ online.

    OnDisplay is a more recent convert to the XML way. It has now added extensive XML functions to its CenterStage product.

    Alongside these product-oriented approaches that work on a one-to-one basis, there are also more ambitious undertakings that involve the creation of a broader business-to-business infrastructure.

    For example, Ariba has come up with an XML framework it calls Commerce XML - basically a set of XML applications for different kinds of extranet situations. The home site for this will be at http://www.cxml.org/ (at the time of writing this is merely a place-holder), and there is a press release online.

    A similar approach is being taken by an initiative called RosettaNet. As the background information explains, the idea is to construct standard XML tagsets for particular sectors. The roster of supporters is impressive, but unfortunately the site offers few concrete details for non-members.

    In a sense, both Ariba's Commerce XML and RosettaNet address only part of the problem. XML solves most of the issues concerning the format and exchange of information, but it offers nothing in the way of an overall environment where companies can apply the XML applications in a standard way.

    To meet this requirement another industry consortium called CommerceOne is setting up MarketSite.net (see also), a business-to-business portal that employs XML. A good white paper can be found online.

    CommerceOne is strongly committed to XML: it has set up an XML Exchange, and recently bought Veo Systems, whose Common Business Language, another XML application, will further the aims of its extranet strategy.

    Of particular interest is the fact that Veo has produced XML applications that encompass older approaches in this area, notably Open Buying on the Internet (OBI), the Open Trading Protocol (OTP) and Information & Content Exchange (ICE).

    Despite all these rival initiatives, one thing seems clear: XML is the way to go for extranets. One indication of this is the support it is gaining from enterprise software suppliers, notably SAP, Netscape and Microsoft.

    Microsoft's application of XML in the extranet sphere complements its already strong commitment on the desktop (in Internet Explorer 5 and Office 2000), and on intranets with its future plans for XML support in its Backoffice products.

    Zapata

    The frenzy accompanying the introduction of portals in July 1998 meant that the share prices of most of these firms continued to rise vertiginously. For example, in June Yahoo was worth about $6.85bn (£4.3bn); a month later it was worth $8.85bn.

    Companies in other sectors have naturally been looking on enviously at this general Internet share madness.

    Some have decided that the best strategy is to redefine themselves as an Internet company - ideally one connected with the hot area of portals - in the hope that their share prices will be rated accordingly.

    Improbable
    Netscape, with its continued expansion of the Netcenter site, is perhaps the most obvious example. But its efforts to re-invent itself as a portal company are by no means the most improbable. That honour probably belongs to Zapata.

    As a useful summary of the firm's history relates, Zapata was founded by former US president George Bush as an oil exploration company.

    More recently, it has been successful in the field of fish protein.

    Notwithstanding this unlikely background, the company has boldly announced its intention to create a "unique and exciting new Internet portal", with the aim of becoming "one of the largest Internet companies in the world" through the spinning-off of a new subsidiary with the not unsuitable name of Zap.

    Its first move was an attempt to take over the established number two portal Excite. When this was rebuffed, it adopted a different but, once again, novel strategy.

    Zap's portal is now being constructed piecemeal through the purchase of other Internet sites. To this end, the company has placed adverts in US newspapers soliciting such potential acquisitions.

    Its first major tranche consisted of 21 very diverse companies.

    These include sites offering the usual search facilities, share information, very detailed stock reports, a travel site where reservations can be made, a business directory, online computer purchases, two shareware sites, and various online communities, including several with a strong Hispanic flavour.

    Ragbag
    Clearly, this is something of a ragbag at the moment, although the company points out that most of the firms it has acquired are already profitable - something that is by no means common for fledgling Web sites and services.

    It also claims that "Zap will be both the beginning and end destination for Internet users" - although it is hard to see how Zap's sites differ in this respect from its competitors.

    Still, this bold - some would say reckless - move seems to be working: as the graph of the firm's share price shows, since moving into the Internet world, the company's value according to the stock market has almost doubled.

    The big question, of course, is whether Zap has left it too late to join the already-crowded portal market.

    Although Zap might seem to be at a big disadvantage compared with the more famous names, it does have one huge plus: it is backed by a profitable company rooted in the safe, if boring, world of physical products.

    As a result, if - or perhaps when - the Great Internet Shake-out occurs, and the other high-flyers find themselves with huge deficits and no way of financing them, Zap may well slip in and zap them all, so to speak.

    Not by virtue of a superior product, but simply because the company has something as basic as a healthy balance sheet.


    Credits

    The main source and style for this version of NetSpeak is Glyn Moody's excellent Net Speak, which is published on Computer Weekly's Getting Wired page. When this version of NetSpeak was begun, Computer Weekly had not provided the text online. It is now available at http://www.computerweekly.co.uk/. However, you now have to search for the 250 documents by entering 'Getting Wired' in the search engine, as the original index to these valuable features has disappeared.

    With the collection of Getting Wired cuttings, plus others from Personal Computer World, PC Plus and Computer Shopper, I found it hard to search through so many bits of paper - hence this machine-readable version.


    The GFG home page

    Last updated 24th June 1998