Managing a Web of a Mess

by David Strom

(appeared in InfoWorld, 12/11/95)

Managing a new World-Wide Web server is like parenting an adolescent: the act of creation is simple when compared to daily maintenance. Like a difficult teenager, a web site has its growth spurts of new content and changing moods as it becomes more popular.

A variety of web content management tools, as they are known, can help. But most are still very new and some require a great deal of skill to use -- far more skill than is required to write pages of Hypertext Markup Language (HTML) or to install the web server software itself.

Content management covers several areas: searching for text, tracking changes to the site, and managing links to other pages and other web sites. Each is an important function that new webmasters often overlook. Each area often requires its own set of tools, and installing them can be difficult and can require a more-than-modest level of Unix expertise or programming. While the power of the web is its availability on a multitude of server platforms and operating systems, many of these critical tools only work with a few combinations of web server software and operating system versions.

All this means that content management can be a difficult and time-consuming job. Jonathan Young is a creative director at advertising and public relations agency Poppe Tyson in Mountain View, Calif. His company has built web sites for a wide variety of high-profile clients, including working on a portion of the White House site. "Several of our clients actively manage their sites and it is a full-time job for several people," he says.

The problem is that websites grow like kudzu, and keeping track of newly-minted content can be difficult. "Sites are now expected to have navigation aids, graphics, help sections, pages that look good under a variety of different browsers, and so forth. All of this leads to a chaotic mess of badly linked pages, and a visitor might easily get lost," says Bryan Taylor, director of NetPress Co., an independent website developer based in New York City.

Searching for the right kind of search tools

One way to make navigation easier is to install what is called a "search engine" that indexes the text on a site and presents a visitor with a fill-in-the-blanks form to use. There are a variety of both public domain and commercial search tools that can fill this need, and finding the right tool will take some careful searching by web developers themselves. There are several issues to examine, including the actual features of the search tool, whether public domain or commercial search software is more appropriate, and the overall performance and integration of the search tool with the web server software.
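To make the idea of indexing concrete, here is a minimal sketch in Python of the kind of word index a simple search tool might build. It is an illustration only, not how PL Server or any other product mentioned here works; the names build_index and search, and the sample page paths, are hypothetical.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of page paths that contain it.

    `pages` is a dict of page path -> plain text (a hypothetical layout).
    """
    index = defaultdict(set)
    for path, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(path)
    return index


def search(index, query):
    """Return the sorted paths of pages that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # A page matches only if it appears in the posting set of every word.
    hits = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(hits)


if __name__ == "__main__":
    pages = {
        "/flowers.html": "Toll-free directory of flower shops",
        "/hardware.html": "Directory of hardware stores",
    }
    index = build_index(pages)
    print(search(index, "flower directory"))
```

A sketch like this matches exact words only. The commercial products discussed in this article add features such as concept searching, relevance ranking, and faster indexing on top of this basic structure.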

Bob Stewart, a district manager responsible for AT&T's Internet toll-free directory (Bridgewater, NJ), has one of the more popular web sites around: the directory gets over 10,000 hits daily from people searching for particular 800-exchange phone numbers. With such a resource, having flexible text searching was critical: users can search for a particular number, a company name, or category of service or product. Stewart used the PL Server from Personal Library Software (Bethesda, Md.) for these searching tasks. "We had our application running in a matter of days, although we did have some minor incompatibility problems with different versions of the software."

Others pick a search tool for speed. Dave Hollander, an Internet technology program manager at Hewlett-Packard in Fort Collins, Colo., has been working with a variety of search tools for many years and has developed an extensive content management system on his site by using his own custom programming. "I need a high performance and reliable searching product," he says. Much of his programming has helped increase the speed of both indexing and searching operations.

Not all search tools offer similar features, however. One of the best features of the PL Server product is what the company calls a "concept search." According to Hollander, "this gives me the ability to try to answer the question the reader wants answered, which may not necessarily be the question they asked originally. This lexical analysis is a key differentiation among various search engines." The feature of concept searching also came in handy at AT&T's on-line directory: Stewart decided to use PL Server rather than an internally developed search engine or a public domain product because it had the flexibility to perform the kinds of searches that end-users normally do when they browse the printed "yellow pages" directories.

Another issue is the very relationship between the web server and its content: "their respective jobs are almost totally unrelated," says Taylor. "What is needed is better integration between the two, so that servers can be much more aware of their own content especially as this content becomes more dynamic."

Hollander agrees: "We could use a lot more professional-quality products for content management. Many of the current generation of tools are interesting, but too invasive of our own processes to be practically deployed. We want to be able to integrate tools into our site, not have them take it over."

Finally, there is the issue of whether to use commercial or public domain search software. Ed Hastings, chief technology officer for Online Computer Market, Inc., a third-party website developer in Southborough, Mass., is bullish on the latter variety: "Our experience with the free search engines has been quite good, once they are customized with the necessary programming." Other webmasters, particularly those at the more popular web sites, don't feel that the public domain software has the performance or the flexibility offered by the commercial products.

"Most webmasters want something to plug in to their website and complement it well. Most search engines are therefore inhibiting because of their high cost, low performance, or complexity," says Taylor.

All told, finding the right search tool and integrating it into one's website will take a great deal of experience, testing, and skill. This market is still very immature, and new products and new versions appear almost weekly.

What about link management?

But search tools are just one aspect of content management: another piece to the puzzle is being able to keep track of links between pages and others' websites. This is especially critical as content on the site is changed, or as external sites change the locations of their content at will, making obsolete one's own links to these places. Indeed, for many webmasters, this is their top priority: "Content management is the heart of our web site. Next in importance is design," says Stewart.

Jeff Moskow of Ready-to-Run Software, Inc., a small software developer in Forge Village, Mass., has built his own and others' websites. "Once a web server is installed, there really isn't that much to do or to change. Content, on the other hand, is a constantly changing piece. I suspect that most companies will be continually seeking to revise and refine their message, if they are at all serious about using the Web to promote their businesses." One tool he found was Front Page from Vermeer Technologies, Inc. of Cambridge, Mass. "Vermeer makes changing and adding to our content very easy," he said. "Front Page is very easy to use and it gave us the flexibility to do everything that we wanted to do. It let us do our job much, much faster, especially for getting the visual look of our web pages the way we wanted and to eliminate much of our script programming."

Hastings agrees: "Vermeer is going to be helpful when it comes to testing a web site and ensuring that all the links work -- which can otherwise be a very labor-intensive process. Content management is the key to providing ongoing quality to websites. Front Page will allow us to deliver higher quality with less effort -- broken links will no longer exist. It moves us from working with single files to working on entire sites as a single object. This makes updating a site much easier and allows working copies to be created in a flash."

Other tools are available to manage document links, such as HTML Transit from InfoAccess Corporation (Bellevue, Wash.) and several new tools from Novell (Provo, Utah).
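The labor-intensive link testing Hastings describes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Front Page or any other product named here works: the site is assumed to be held in memory as a mapping of page path to HTML text, the names LinkCollector and find_broken_links are hypothetical, and links to other hosts are skipped because checking them would require network access.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def find_broken_links(site):
    """Return sorted (page, href) pairs whose target is missing from the site.

    `site` maps a page path such as "/index.html" to its HTML text
    (a hypothetical in-memory layout chosen for this sketch).
    """
    broken = []
    for page, html in site.items():
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            # Resolve relative links against the page that contains them.
            target = urlparse(urljoin(page, href))
            if target.scheme or target.netloc:
                continue  # external link (http:, mailto:, ...), not checked here
            if target.path not in site:
                broken.append((page, href))
    return sorted(broken)


if __name__ == "__main__":
    site = {
        "/index.html": '<a href="products.html">Products</a> <a href="old.html">Old</a>',
        "/products.html": '<a href="/index.html">Home</a>',
    }
    for page, href in find_broken_links(site):
        print(page, "links to missing page", href)
```

Running a check like this after every content change catches internal links that were broken by a renamed or deleted page. Links to other people's sites still need a separate, periodic check over the network.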

To Unix or not to Unix: that is the question

Both the search tools and the link management products share many of the same issues, including the limitation that not every tool works with every web server and operating system version. Front Page, for example, will work on Windows NT and Solaris 2.4 servers only.

The problem is that most of the commercial content management tools are still focused on the Unix web marketplace, and only recently have begun to expand into non-Unix web servers such as Windows NT and Macintosh, and even Novell NetWare. Novell itself only recently announced a web server product for its operating system, and has a limited number of third party servers available, especially when compared to Unix versions.

The Unix bias is something of a self-fulfilling situation: many of the original web developers at corporations came from Unix backgrounds, and there is still a strong preference for Unix web servers as a result. "You gotta have Unix experience -- we didn't, so we ended up hiring people. If you know HTML, you can limp along," said Young. "Our staff has over 100 person-years of Unix experience," says Stewart, something one might expect given Unix's origins at Ma Bell. Nevertheless, there is hope for those corporations that don't wish to run Unix, and perhaps the biggest efforts at supporting a non-Unix operating system have to do with Windows NT. CompuServe, O'Reilly, and others have released commercial versions of NT web servers, and the number of third-party tools is quickly increasing. "One of the best things an NT Web site developer can do is to install the NT versions of the various Unix tools," says Hastings, who has experience with both operating systems.

After looking over the complete set of tools, though, it may still make sense to run Unix. "Unix is the only operating system I would recommend for running a website," says Taylor. "The tools available are far more powerful, common, and easy to change than those available on Windows or the Macintosh."

The content management marketplace is still very new, and still has plenty of room for improvement. Many of the issues mentioned above have to do with developers still finding their way around the various technical issues of bringing up this new form of electronic publishing. AT&T's Stewart says that "we are coming from the printed directory world, an environment with a six-month publishing cycle where data is moved back and forth from offshore locations." His web-based directory is updated twice monthly (making factual corrections immediately) but Stewart feels that the processes have room for improvement: "it still is cumbersome. While we can make changes to the index, the actual data must be updated separately."

"The web is a rich, loamy fertile ground for new products, particularly in the area of development software," says Poppe Tyson's Young. Hopefully, the ground will soon bear the fruits of these efforts.


Sidebar -- Picking the right content management tools