CMPnet www.PlanetIT.com
The Network For IT Professionals

Unknown Search Gems
by Greg R. Notess (01/11/99; 9:00 a.m. ET)
URL: http://www.PlanetIT.com/docs/PIT19990114S0003

The search buttons on Internet Explorer and Netscape Navigator make it easy to begin searching the Web. However, not all of the best search engines have paid for prominence on the Netscape and Microsoft search pages. And with Netscape's purchase of the rights to the Excite database under its own brand name and Microsoft offering its own version of an Inktomi database, those pages are likely to point to even fewer external search engines. In this environment, bookmarking some of the lesser-known search engines can make for more powerful Web searching.

Two of the more important up-and-coming search engines are Northern Light and Google! Northern Light features advanced searching capabilities and material beyond the traditional Web. Google! takes a very different approach for improved relevance ranking.

Northern Light
Rather than taking the portal approach, Northern Light has more appeal for the professional searcher and for those wanting to search beyond the Web. Northern Light, available at http://www.northernlight.com or http://www.nlsearch.com, offers a very large database of Web pages as well as access to online versions of published articles.

On the Web side, Northern Light has a database of over 110 million Web pages, making it one of the three largest general Web search engines. In addition to this very large, although far from comprehensive, database of Web pages, Northern Light features its Special Collection: over four million full-text articles from newspapers, magazines, trade journals, and other sources. Northern Light provides the full citations and often even abstracts for free. Seeing the full text of an article, however, costs $1 to $4.

The coverage includes publications such as HP Chronicle, SunServer, InternetWeek, IBM Systems Journal, and InformationWeek. While some of these magazines, like InformationWeek, have well-developed websites with many of the articles from the print version available online, others may have no content available free on the Web or only selections from the print publication. Others may only serve up full-text articles online to subscribers.

Using the Special Collection on Northern Light makes it easier to search numerous publications and to search the past several years' worth of issues. For those who prefer not to pay Northern Light for an article, the service can still be quite useful for tracking down articles of interest: find the citation there, then check the publication's own website to see if the article is available for free.

Custom Search Folders
Beyond the Special Collection, Northern Light has some other unique features that make it well worth the bookmark. Instead of just sorting the results into an attempt at relevance ranking, Northern Light also sorts the complete search results into specialized folders. These Custom Search Folders can take a large set of hits from searching the Web and make them manageable.

The folders break down the search results by subject grouping, document type, source, and even language. For example, a search on "cable modems" finds thousands of Web pages and documents. The first two Custom Search Folders are typically "Search Current News" and "Special Collection documents." Northern Light's current news area includes free access to over 30 news wires. The second folder isolates the Special Collection documents from the general Web hits. After that come several subject folders: "Cable modems," "Modems," and "Integrated circuits." Some source folders are general, like "Personal pages," while others point to specific hosts such as www.ispcheck.com and www.catv.org. A document type folder of "Questions & Answers" also is available, pointing primarily to FAQs.
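
As a rough illustration of the folder idea, the sketch below groups a handful of hits by host and by document type. It is only an assumption about how such grouping could work, not Northern Light's actual method; the page paths, the document type labels attached to each hit, and the function name group_into_folders are all invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical hits as (url, document type). The hosts are the ones named
# in the article; the paths and type labels are invented for illustration.
hits = [
    ("http://www.catv.org/modem/faq.html", "Questions & Answers"),
    ("http://www.catv.org/modem/index.html", "Web page"),
    ("http://www.ispcheck.com/cable.html", "Web page"),
]

def group_into_folders(hits):
    """Return {folder name: [urls]}, one folder per host and per document type."""
    folders = defaultdict(list)
    for url, doc_type in hits:
        folders[urlparse(url).netloc].append(url)  # source folder
        folders[doc_type].append(url)              # document type folder
    return dict(folders)

folders = group_into_folders(hits)
for name, urls in folders.items():
    print(name, len(urls))
```

Each hit lands in more than one folder, which is what lets a searcher narrow a large result set from several directions.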

In addition to the Custom Search Folders, Northern Light also makes available a whole collection of more advanced searching features. Its Power Search includes the ability to limit by date, language, URL, and country. But those are not all of the advanced search features. Northern Light supports full nested Boolean searching as well as the + and - symbols. The * can be used for stemming, and quotes delimit a phrase. The Power Search page also includes options for sorting by date, finding words in the title, and limiting by some of the subjects and document types used in the Custom Search Folders.
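
To make the query symbols concrete, here is a minimal sketch of a matcher that understands +, -, quoted phrases, and a trailing *. It is a toy written for this article, not Northern Light's parser: nested Boolean operators are left out, and the function names parse_query, term_matches, and matches are hypothetical.

```python
import re

def parse_query(query):
    """Split a query into (required, excluded, optional) terms.

    A leading + marks a term as required, a leading - excludes it,
    and double quotes keep a phrase together.
    """
    required, excluded, optional = [], [], []
    for token in re.findall(r'[+-]?"[^"]*"|\S+', query):
        sign = token[0] if token[0] in "+-" else ""
        body = token[len(sign):].strip('"')
        if not body:
            continue
        if sign == "+":
            required.append(body)
        elif sign == "-":
            excluded.append(body)
        else:
            optional.append(body)
    return required, excluded, optional

def term_matches(term, text):
    """True if the term or phrase occurs in text; * matches any word ending."""
    pattern = re.escape(term.lower()).replace(r"\*", r"\w*")
    return re.search(r"\b" + pattern + r"\b", text.lower()) is not None

def matches(query, text):
    required, excluded, optional = parse_query(query)
    if any(term_matches(t, text) for t in excluded):
        return False
    if not all(term_matches(t, text) for t in required):
        return False
    # With no required terms, at least one optional term must match.
    return bool(required) or any(term_matches(t, text) for t in optional)

print(matches('+"cable modem" -dsl speed*', "Cable modem speeds compared"))
print(matches('+"cable modem" -dsl speed*', "Cable modem versus DSL"))
```

The first call matches because the phrase is present and the excluded word is not; the second fails on the excluded word.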

All in all, Northern Light offers a different searching experience with many advanced features. If you are tired of the consumer and chat emphasis of some of the portal sites, try Northern Light.

Google!
Another search alternative comes from a research project at Stanford. Google! is solely a Web search engine with a significantly different approach. Rather than sorting results based on a relevancy ranking algorithm that simply counts word frequency and position, Google! ranks results based on link analysis. Rather than just searching for the occurrence of a term, Google! evaluates links from other Web pages, and in particular the anchor text that is linked.

A search for 'cable modems' on Google! brings back results like http://www.cablemodem.com, a page that is frequently a hypertext link from those words. However, Google! does not just look at the anchor text. It also evaluates how many other pages link to the same site with the same anchor words, and it weights more heavily linkages from well-established websites.
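
The idea of scoring a page by the anchor text that points to it, with heavier weight for links from well-established sites, can be sketched in a few lines. This is only an illustration of the general approach described above, not Google!'s actual algorithm. Apart from http://www.cablemodem.com, every URL here is a placeholder, and the weights and the function name rank_by_anchor_text are assumptions.

```python
from collections import defaultdict

# (linking page, anchor text, target page). Only www.cablemodem.com comes
# from the article; the other URLs are placeholders.
links = [
    ("http://established.example/modems", "cable modems", "http://www.cablemodem.com"),
    ("http://hobby.example/links", "cable modems", "http://www.cablemodem.com"),
    ("http://hobby.example/links", "cable modems", "http://other.example/cable"),
    ("http://established.example/modems", "dial-up modems", "http://other.example/dialup"),
]

# Assumed weights: a link from a well-established site counts for more.
source_weight = {
    "http://established.example/modems": 3.0,
    "http://hobby.example/links": 1.0,
}

def rank_by_anchor_text(query, links, source_weight):
    """Score each target by the summed weight of links whose anchor text contains the query."""
    scores = defaultdict(float)
    wanted = query.lower()
    for source, anchor, target in links:
        if wanted in anchor.lower():
            scores[target] += source_weight.get(source, 1.0)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

ranking = rank_by_anchor_text("cable modems", links, source_weight)
for target, score in ranking:
    print(target, score)
```

Note that the query words never have to appear on the target page itself; the ranking depends entirely on what other pages say when they link to it.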

The link analysis approach can yield highly relevant search results for many searches. However, although Google! has moved from its original home at Stanford to its own domain name at http://www.google.com, its watermark background still proclaims it as an Alpha Test, and its increasing popularity has caused numerous slowdowns in its response time.

Google! does not support advanced searching techniques as Northern Light does. Instead, it relies on its link analysis approach to give relevant results. As such, it can be an excellent search tool for general topics, products, and company names. While the search results identify phrase matches, Google! does not allow phrase searching.

Google! does keep cached copies of the pages it indexed and makes the archived pages accessible. These can be served quite quickly and offer a way to find information that has moved since it was last indexed. The original Web page may have been removed, but at least you can see a cached version of what it had looked like when Google! indexed it.

Several other features make Google! rather unique. In addition to the regular search button, Google! sports an "I'm feeling lucky" button. Click that button to be taken directly to the top hit and skip the intervening search results screen.

Aeneid
While Northern Light and Google! offer some different approaches to searching large chunks of the Web, Aeneid provides a more targeted approach for the IT professional. First and foremost, the Aeneid Web site at http://www.aeneid.com serves as an advertisement for the Aeneid Aggregation Platform, which can be used to integrate Internet content with proprietary data. However, as a demonstration of how it works, Aeneid offers some industry-specific search examples at the bottom of its page.

The underlying database comes from Inktomi, which also supplies databases to HotBot, MSN Web Search, Yahoo!, Snap!, and GoTo. The full database is quite large, but Aeneid only uses targeted portions of the full Inktomi offering. Featured subject areas on its demo include High-Tech Industry, Business Press, Hardware Leaders, High-Tech News Analysis, Internet, Networking, Peripherals, Press Releases, Product Reviews, Trade Press, and Software Leaders. In each of these topics, the Aeneid demo chooses websites known to be relevant and of good quality to crawl. In addition, because the collection of sites is smaller, they can be crawled every day rather than once every few weeks.

Only one of the subject areas can be searched at a time. And while Inktomi offers many advanced search features, the Aeneid demo only shows a single search box with no options beyond choosing the topic area. Complex searches do not work well here, but for general IT topics and single keywords, the demo can find relevant and very current hits.

Given the "demonstration" nature of the Aeneid search at the moment, there is no guarantee about how long it will be freely available. It may also move from the bottom of the page to a deeper location on its website or to a more prominent one. In either case, it is a search approach that bears watching.

Northern Light, Google!, and Aeneid all offer expanded search opportunities on the Web, but only for those who can find them. Until they either generate enough revenue to afford spots on the Netscape and Microsoft search pages or become popular enough in their own right, bookmarks and intranet links will be the easiest way to reach these lesser-known search engines. In any case, these less-visited search engines can often find information that is otherwise buried too deep in other search engines.

CMPnet    www.cmpnet.com
The Technology Network

Copyright 1998 CMP Media Inc.