March 2009 Newsletter

March 31, 2009


CUSTOMER QUOTE OF THE MONTH

 

"We use the Thunderstone Search Appliance to crawl, index and search Word files, PDFs and other content in our law firm's internal document management system. The Appliance gives us a lot of customization options in the way it operates, with excellent control over precisely what we want to make searchable and what we don't want included. It does everything we need it to do. You can just plug it in and forget about it. It works great. After years of trouble-free performance, when we finally did have a hardware failure — Thunderstone had us quickly up and running again on the same day we received our replacement unit. Their level of customer support is almost unheard of in the I.T. industry."

 

Michael E. Salopek
I.T. Manager
Janik, Dorman & Winter, L.L.P.
http://www.janiklaw.com


TECH TIPS: CONTROLLING YOUR CRAWL WITH WEBINATOR OR THUNDERSTONE SEARCH APPLIANCES — EXCLUDE BY FIELD

Last time we discussed exclusions and requirements for managing which pages your crawler gets, but one setting deserves a Tech Tips all its own: Exclude by Field. It gives you extra power over how you exclude and exactly what is excluded.

    • "Metamorph query" matching

      Rather than a prefix or substring match, Exclude by Field uses a "Metamorph query", which is the full-text matching engine used for our normal searches. You can simply type in words to match, or if you begin with a slash (/) then it is treated as a REX expression (our RegEx-like pattern matching language; see the "REX" section in the Vortex documentation on our website for more details).

    • Multiple fields for exclusion

      All previously discussed exclusion & requirement options operate only on the URL itself. Exclude by Field allows you to exclude based on several other areas:

      • HTML — Matches against the raw HTML of the page. Useful if there's something in an HTML comment that you'd like to base the match on.
      • Text — The formatted text of the URL. This is the same text you'd see if you looked at the list/edit info of a page or at the "Match Info" in the search results. Useful if you want to match text but ignore any HTML markup that may or may not be present.
      • All Meta — The contents of all available meta fields are put together and then matched against.
      • Meta Field -> — Matches against the contents of a specific meta field, which you specify in the next column "From Meta Field".
      • Keywords, Description, & Mime Type — Matches against the text of these common meta fields.
      • URL — Matches against the URL, just like Exclusion REX. You may want to use this to get the extra Exclude options, listed below.
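
To make the field choices concrete, here is a minimal Python sketch of how a rule might pick the text it is matched against. It is an illustration only, not Thunderstone's implementation: the Page class, the field_text and contains_word functions and the lowercase field names are all hypothetical, and a real Metamorph query is far richer than the simple word check used here.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    """Hypothetical stand-in for a fetched page."""
    url: str
    html: str = ""
    text: str = ""
    meta: dict = field(default_factory=dict)


def field_text(page, field_name, meta_name=None):
    """Return the text an Exclude by Field rule would be matched against."""
    if field_name == "html":
        return page.html
    if field_name == "text":
        return page.text
    if field_name == "all meta":
        # All meta values are put together and matched as one string.
        return " ".join(page.meta.values())
    if field_name == "meta field":
        return page.meta.get(meta_name, "")
    if field_name == "url":
        return page.url
    raise ValueError(f"unknown field: {field_name}")


def contains_word(text, word):
    """Crude stand-in for a query match: case-insensitive word lookup."""
    return word.lower() in text.lower().split()


page = Page(
    url="http://www.example.com/drafts/memo.html",
    html="<!-- status: draft --><p>Quarterly memo</p>",
    text="Quarterly memo",
    meta={"keywords": "memo finance", "description": "Internal memo"},
)

print(contains_word(field_text(page, "html"), "draft"))   # True
print(contains_word(field_text(page, "text"), "draft"))   # False
print(contains_word(field_text(page, "meta field", "keywords"), "finance"))  # True
```

The word "draft" appears only inside an HTML comment, so it matches on the HTML field but not on the Text field, which is the situation the HTML option is described as being useful for.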

 

    • What to exclude

      Beyond more power in specifying what to match, Exclude by Field also gives you more control over what to do when you get a match.

      • Pages and Links — This acts like any other exclusion rule. The page and its links are kept completely out of the walk data.
      • Pages only — The content of the page is not included in the walk, but the links from the page ARE followed.
      • Links only — The page is included in the walk, but the links from the page are not followed.
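
The three options boil down to two independent decisions for a matching page: whether its content is kept, and whether its links are followed. The Python sketch below models that as a small table. The Exclude enum and apply_rule function are hypothetical names used for illustration, not part of the product.

```python
from enum import Enum


class Exclude(Enum):
    """The three Exclude options, as (keep page content, follow links)."""
    PAGES_AND_LINKS = (False, False)
    PAGES_ONLY = (False, True)
    LINKS_ONLY = (True, False)


def apply_rule(rule_matched, exclude):
    """Return (index_page, follow_links) for one page."""
    if not rule_matched:
        # No match: the page is indexed and its links are followed as usual.
        return (True, True)
    return exclude.value


for option in Exclude:
    index_page, follow_links = apply_rule(True, option)
    print(f"{option.name}: index page={index_page}, follow links={follow_links}")
```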

 

    • A word on efficiency

      One disadvantage of Exclude by Field, when using any Field except URL, is that the page must be fully fetched before the rule can be applied.

      With all other exclusion rules (and Exclude by Field on URL), the URL can be thrown out before the page is fetched and processed.

      When performing Exclude by Field on the content of the page, though, the page must be downloaded and fully processed before we can know if it has HTML or a Body that matches the rules specified.

      When possible, it's better to use other exclusion rules or the URL target for Exclude by Field, as this will allow you to prune URLs before they are fetched. Still, there are many things that Exclude by Field can do that the other settings simply can't (as the example below shows).
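
The cost difference can be seen in a toy crawl loop. This Python sketch is an assumption-laden illustration rather than the crawler's actual logic: the crawl function, the SITE dictionary and both rule callbacks are hypothetical. URL rules run before the fetch, content rules only after it.

```python
def crawl(urls, url_excluded, content_excluded, fetch):
    """Apply URL rules before fetching and content rules after.

    Returns (indexed_urls, number_of_fetches).
    """
    indexed, fetches = [], 0
    for url in urls:
        if url_excluded(url):
            continue            # pruned without any network traffic
        body = fetch(url)
        fetches += 1            # content rules always pay for the fetch
        if content_excluded(body):
            continue
        indexed.append(url)
    return indexed, fetches


# Hypothetical site: three pages, two of which we do not want indexed.
SITE = {
    "http://www.example.com/index.html": "Welcome",
    "http://www.example.com/archive/old.html": "Old news",
    "http://www.example.com/draft.html": "<!-- draft --> Not ready",
}

indexed, fetches = crawl(
    SITE,
    url_excluded=lambda url: "/archive/" in url,
    content_excluded=lambda body: "<!-- draft -->" in body,
    fetch=SITE.__getitem__,
)
print(indexed)   # ['http://www.example.com/index.html']
print(fetches)   # 2
```

Both unwanted pages end up excluded, but only the archive page is skipped for free; the draft page still costs a fetch before the content rule can reject it.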

    • Example — Excluding directories from a file crawl

      Directories in a file crawl are a perfect use for Exclude by Field. We can't fully exclude directories, because they link to all the files and without them we'd have nothing to crawl. Still, we might not want them to show up in the search. Exclude by Field makes this possible:

      • Metamorph Query "//=>>=" (without the quotes) — This is a REX expression for "match anything that ends in a slash". Please see the REX section of the Vortex documentation if you'd like more details on REX syntax.
      • Field - URL
      • Exclude - Pages only — This will keep the contents of the directory "pages" out of the crawl but will still follow the links to get the actual files and use them in the search.
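
The effect of this rule can be simulated in a few lines of Python. The sketch is hypothetical throughout: the LINKS file tree, the is_directory check and the walk function are made up for illustration, and a plain endswith("/") test stands in for the REX expression on the URL field.

```python
# Hypothetical file tree: each directory "page" links to its entries.
LINKS = {
    "file:///docs/": ["file:///docs/a.pdf", "file:///docs/sub/"],
    "file:///docs/sub/": ["file:///docs/sub/b.doc"],
}


def is_directory(url):
    # Plain Python stand-in for the rule's "ends in a slash" match on URL.
    return url.endswith("/")


def walk(start):
    """Breadth-first walk with a 'Pages only' exclusion on directories."""
    indexed, queue, seen = [], [start], {start}
    while queue:
        url = queue.pop(0)
        if not is_directory(url):
            indexed.append(url)          # ordinary file: keep it
        # Links are followed whether or not the page itself was excluded.
        for link in LINKS.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return indexed


print(walk("file:///docs/"))
# ['file:///docs/a.pdf', 'file:///docs/sub/b.doc']
```

Neither directory appears in the result, yet the file inside the nested directory is still reached because the directory links were followed.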

If you have any questions about how to use Exclude by Field, please feel free to contact Thunderstone Support — and we'll discuss it.


HAPPENINGS

The February 2009 issue of CRN, a publication of Everything Channel and ChannelWeb.com, recognized the "top Channel Chiefs in the industry based upon their record of business innovation and dedication to the partner community." This annual list, which CRN calls "Our definitive guide to the movers and shakers of I.T. channel management," included Frederick A. Harmon (Thunderstone's Channel Director & CSO).

You can visit the CRN website (http://www.crn.com/crn/chiefs/2009cc.jhtml?chief=136) to view pertinent information about Fred Harmon in the 2009 Channel Chiefs list.

 

UPCOMING

Thunderstone's John Turnbull (President and CEO) will present a workshop session entitled "The Next Generation in Search: Today's Best Practices" on Friday, April 17, 2009 (2:00 p.m. - 3:30 p.m.), during the DigitalNow 2009 Conference at Disney's Yacht and Beach Club Resorts in Lake Buena Vista, Florida.

Session Description
Search has progressed from a complex tool used by librarians through simple tools that let users perform a keyword search, to today's information access tools that can still provide users a simple interface but make use of much of an association's collective knowledge. In this workshop participants will learn what sorts of information can be behind a search engine and how to make it more valuable to users. The session includes a case study from IEEE, the world's largest technical membership association that significantly improved their business by focusing on their customers and helping them access content in new ways.

DigitalNow (http://www.fusionproductions.com/digitalnow/) is an annual conference that brings together senior-level executives and volunteer leaders from some of the most influential professional and trade associations in America. Produced by Fusion Productions and Disney Institute, two of the foremost authorities in adult educational design, with input from registered attendees and a conference advisory board, DigitalNow addresses the critical issues facing association leaders in the digital age.


GET YOUR FREE FLOOR PASS (A $75 VALUE) TO THE MARCH 30 - APRIL 2, 2009 AIIM INTERNATIONAL EXPOSITION + CONFERENCE IN PHILADELPHIA

The AIIM International Exposition + Conference, the yearly gathering for information management professionals across industries and lines of business, will take place Monday, March 30, through Thursday, April 2, 2009, at the Pennsylvania Convention Center in Philadelphia, PA. With 19 tracks, more than 135 conference sessions featuring more than 100 real-world case studies, and an Expo floor showcasing 200+ information management technology solution providers, the event aims to provide attendees with actionable insight they can use.

REGISTER TODAY FOR YOUR FREE EXPO FLOOR PASS
and get access to all keynotes, general sessions,
Expo floor education and the co-located ON DEMAND Expo!

To receive your free pass, use Registration Code: 615M
when you register at WWW.AIIMEXPO.COM
or call +1 888 824 3004.

Your FREE pass comes to you compliments of Thunderstone Software. Please stop by and visit Fred Harmon (Channel Director & CSO) and Peter Thusat (Communication Director & CMO) at Booth 1045.


 

Thunderstone Software's Frederick A. Harmon Named as 2009 Channel Chief by Everything Channel's CRN Magazine

March 1, 2009

Search Pioneer's Top Sales Executive Honored for Successful Launch of New Thunderstone Reseller/Channel Partner Program

CLEVELAND, OH — Thunderstone Software LLC, the R & D leader that pioneered simultaneous searching of both structured and unstructured data with its TEXIS relational database optimized for full-text search, today announced that Frederick A. Harmon, Thunderstone's Channel Director & CSO, has been named a 2009 Channel Chief by Everything Channel's Computer Reseller News (CRN) magazine. This annual listing recognizes influential I.T. channel executives who defend, promote and execute effective channel partner programs and strategies.

According to Robert C. DeMarzo, SVP/Editorial Director of Everything Channel, "Effective channel executives consistently ensure that the Channel's voice is heard when strategic decisions are being made and continually nurture mutually profitable relationships. This year's Channel Chiefs are strong channel advocates, and we applaud them for their successful partner programs and strategies."

Frederick A. Harmon joined Thunderstone Software as Channel Director & CSO in January 2008. He worked closely with CEO John Turnbull and CMO Peter Thusat to plan, develop and launch a new Thunderstone Reseller/Channel Partner Program that already has five Resellers in North America (with 40+ locations) and four Resellers/Channel Partners in Europe.

"Thunderstone has many proven competitive advantages for efficiently bringing a wide range of enterprise search solutions to large, medium-sized and small businesses — as well as to government entities, NGOs and educational institutions," said Harmon.

He continued, "In addition to Thunderstone Search Appliances, we also provide highly customizable search software products and hard-to-find services in the areas of search analysis and design, prototyping, application development, training, application optimization and application hosting — on either dedicated servers or on virtual servers in our data center.

"And we don't force our sales partners to become enterprise search experts. Unless you choose to have some of your employees trained as Thunderstone-Certified Representatives, you can simply rely on Thunderstone to handle all the product demos, the evaluation process and technical support of your customers."

"Resellers want a sound, practical approach to realistic revenue growth with an experienced enterprise search partner who makes things as easy as possible for them. They prefer a Reseller Program that doesn't impose any fees, minimum volume requirements or sales quotas. They insist on world-class tech support delivered by real engineers who actually solve problems."

"We get it. And it shows."

Resellers/Channel Partners can easily profit from an amazing product that will 'wow' their customers, while Thunderstone provides:

  • Account protection with quick email/fax registration of targeted sales opportunities
  • Free and personalized online product demos tailored to each customer's particular needs, using the customer's own data and matching the customer's desired "look and feel"
  • 30-day eval units shipped to customers and pre-configured to their requirements
  • Superior tech support by phone, email and message board
  • One-time, perpetual licenses offering 40-60% upfront savings and even more dramatic year-to-year savings for customers
  • Product Investment Protection that makes upgrading easy, desirable and affordable
  • 28+ years of real-world success as a search industry pioneer, which means Thunderstone understands better than most what works and what doesn't
  • Discounts for Resellers/Channel Partners and exciting SPIF programs available for salespeople

Harmon concluded, "Top industry analysts expect annual enterprise search sales to exceed $1.1 billion in 2009. We've made it easy for Resellers, Integrators and Solution Providers to profitably collaborate with Thunderstone to capitalize on this rapidly growing enterprise search market."

Thunderstone's enterprise search products include:

 

 

    • Thunderstone's TEXIS
      TEXIS, the innovative development platform behind Thunderstone's entire line of enterprise search products, is the only fully-integrated SQL RDBMS that intelligently queries and manages databases containing natural language text, standard data types, geographic information, images, video, audio and other payload data. Texis powers many diverse, real-time and integrated applications such as message profiling & handling, image library management, help-desk support, online news retrieval, business intelligence, research libraries, litigation support and eCommerce search engines for online catalogs.

 

 

    • The Thunderstone Parametric Search Appliance
      The Parametric Search Appliance delivers the flexibility and power of TEXIS plus the ease of use of an appliance. It provides an easy way to create applications that combine full-text and structured data without programming. The Thunderstone Parametric Search Appliance enables standard SQL data retrieval and full-text keyword searches combined with a user-selected filter on up to 50 data fields.

 

 

    • The Webinator
      The Webinator is a sophisticated Web indexing and retrieval package that allows Website administrators to create and customize a high-quality retrieval interface to collections of web documents no matter where they reside. The Webinator serves as an example of the type of applications that can be built around Thunderstone's Texis RDBMS and Web Script (Vortex.)

 

 

    • The Thunderstone Search Appliance
      The Thunderstone Search Appliance is a plug-and-play device combining the simplicity of a hosted service with the security and performance of a local solution. The Appliance can handle more than 1,000 typical queries a minute — providing excellent value without adding administrative overhead. It delivers an easy-to-use feature set that especially appeals to the mission-focused requirements of non-technical executives and staff.

 

About Everything Channel
Everything Channel (http://www.everythingchannel.com) is the one-stop shop for accessing, enabling, managing and accelerating technology sales channels. From branding and recruiting to marketing and sales, Everything Channel offers technology marketers the unmatched breadth and depth of global brands and market intelligence combined with unparalleled audience loyalty and credibility serving all technology sales channels through an extensive database. Everything Channel provides innovative sales and marketing solutions to arm the sellers of technology with the resources they need to achieve measurable and significant results.

You can visit the CRN website (http://www.crn.com/crn/chiefs/2009cc.jhtml?chief=136) to view pertinent information about Fred Harmon in the list of 2009 CRN Channel Chiefs.

About Thunderstone
As a true industry pioneer — providing some of the world's most powerful, flexible and scalable search solutions since 1981 — Thunderstone Software LLC (http://www.thunderstone.com) has developed hard-to-match expertise in creating high-performance products with tremendous value for governments, NGOs, educational institutions and businesses of all sizes.

Sales contact: Fred Harmon

+1 216 820 2200 ext.105

Media contact: Peter Thusat

+1 216 820 2200 ext.118

Thunderstone Software to Showcase Its Enterprise Search Solutions at the 2009 AIIM International Exposition + Conference

March 1, 2009

PHILADELPHIA, PA — Thunderstone Software LLC, the R & D leader that pioneered simultaneous searching of both structured and unstructured data with its Texis relational database optimized for full-text search, will participate as an exhibitor (Booth 1045) during the 2009 AIIM International Exposition + Conference at the Pennsylvania Convention Center in Philadelphia.

WHO: Thunderstone Software's Fred Harmon (Channel Director & CSO) and Peter Thusat (Communication Director & CMO)
WHAT: AIIM International Exposition + Conference
WHEN: March 30 - April 2, 2009
WHERE: Pennsylvania Convention Center, 1101 Arch St, Philadelphia, PA 19107

AIIM exhibition visitors to Booth 1045 will have an opportunity to learn more about Thunderstone's enterprise search products, including:

 

 

    • Thunderstone's TEXIS
      TEXIS, the innovative development platform behind Thunderstone's entire line of enterprise search products, is the only fully-integrated SQL RDBMS that intelligently queries and manages databases containing natural language text, standard data types, geographic information, images, video, audio and other payload data. Texis powers many diverse, real-time and integrated applications such as message profiling & handling, image library management, help-desk support, online news retrieval, business intelligence, research libraries, litigation support and eCommerce search engines for online catalogs.

 

 

    • The Thunderstone Parametric Search Appliance
      The Parametric Search Appliance delivers the flexibility and power of TEXIS plus the ease of use of an appliance. It provides an easy way to create applications that combine full-text and structured data without programming. The Thunderstone Parametric Search Appliance enables standard SQL data retrieval and full-text keyword searches combined with a user-selected filter on up to 50 data fields.

 

 

    • The Webinator
      The Webinator is a sophisticated Web indexing and retrieval package that allows Website administrators to create and customize a high-quality retrieval interface to collections of web documents no matter where they reside. The Webinator serves as an example of the type of applications that can be built around Thunderstone's Texis RDBMS and Web Script (Vortex.)

 

 

    • The Thunderstone Search Appliance
      The Thunderstone Search Appliance is a plug-and-play device combining the simplicity of a hosted service with the security and performance of a local solution. The Appliance can handle more than 1,000 typical queries a minute — providing excellent value without adding administrative overhead. It delivers an easy-to-use feature set that especially appeals to the mission-focused requirements of non-technical executives and staff.

 

About AIIM
The AIIM International Exposition + Conference (http://www.aiimexpo.com) is the definitive industry gathering for information management professionals across industries and lines of business. It's produced and managed by Questex Media Group, Inc., a global, diversified business-to-business integrated media and information provider, headquartered in Newton, MA.

 

About Thunderstone
As a true industry pioneer -- providing some of the world's most powerful, flexible and scalable search solutions since 1981 -- Thunderstone Software LLC (http://www.thunderstone.com) has developed hard-to-match expertise in creating high-performance products with tremendous value for governments, NGOs, educational institutions and businesses of all sizes.

Sales contact: Fred Harmon

+1 216 820 2200 ext.105

Media contact: Peter Thusat

+1 216 820 2200 ext.118

February 2009 Newsletter

February 28, 2009



TECH TIPS: CONTROLLING YOUR CRAWL WITH WEBINATOR OR THUNDERSTONE SEARCH APPLIANCES — REQUIREMENTS & EXCLUSIONS

The crawler provides many ways of controlling what you do and don't crawl. Note that URLs manually specified by you (Base URLs, URL URLs, Single Pages, etc.) are exempt from all inclusion/exclusion rules — they will always be used.

  • Exclusions

    The shotgun approach: any URL that contains the text listed in an exclusion line, anywhere in the URL, will not be included in the walk. It doesn't need to be a full path or filename; sub-matches are okay.

    If you specify "archive" as an exclusion, then "http://www.example.com/archive/index.htm" will be excluded and "http://www.example.com/site/newsarchivefrom2004" will also be excluded.

  • Exclusion prefix

    Like Exclusion, except the text must match starting from the beginning of the URL. This gives a bit more control over what exactly matches.

    If you specify "http://www.example.com/archive" as an exclusion prefix, then "http://www.example.com/archive/index.htm" will be excluded and "http://www.example.com/archivePages.htm" will be excluded, but "http://www.example.com/site/newsarchivefrom2004" will be allowed.

  • Required prefix

    The opposite of Exclusion prefix: instead of rejecting URLs that DO match the prefix, it rejects URLs that DON'T match it. Both settings are used for weeding out URLs; the difference is which URLs are kept and which are rejected. Multiple Required prefixes can be specified, and URLs are allowed if they match at least one.

    If you specify "http://www.example.com/archive" as a required prefix, then "http://www.example.com/archive/index.htm" will be used and "http://www.example.com/archivePages.htm" will be used, but "http://www.example.com/site/newsarchivefrom2004" will be excluded.

  • Exclusion REX & Required REX

    Similar ideas to Exclusion prefix & Required prefix, except you use our powerful REX pattern matcher to specify what should match instead of just a prefix. It's similar to regular expressions but much faster. Please see the REX pages in the Vortex manual on our website (http://www.thunderstone.com/site/vortexman/rex_split.html) for more details on the exact syntax.

If you specify both requirements and exclusions, then URLs must satisfy both to be used — they must not match any Exclusion Prefix, AND they must match at least one Required Prefix (if specified).
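
The way these rules combine can be sketched in Python using the example URLs from above. This is an illustration of the described behavior only, not the crawler's code. The url_allowed function and its parameter names are hypothetical, and REX rules are left out.

```python
def url_allowed(url, exclusions=(), exclusion_prefixes=(), required_prefixes=()):
    """Return True if the URL survives all three kinds of rule."""
    if any(text in url for text in exclusions):
        return False                      # substring match anywhere
    if any(url.startswith(p) for p in exclusion_prefixes):
        return False                      # match from the beginning
    if required_prefixes and not any(url.startswith(p) for p in required_prefixes):
        return False                      # must match at least one
    return True


URLS = [
    "http://www.example.com/archive/index.htm",
    "http://www.example.com/archivePages.htm",
    "http://www.example.com/site/newsarchivefrom2004",
]
PREFIX = "http://www.example.com/archive"

# "archive" as an Exclusion rejects all three URLs.
print([url_allowed(u, exclusions=["archive"]) for u in URLS])
# [False, False, False]

# As an Exclusion prefix, only the first two are rejected.
print([url_allowed(u, exclusion_prefixes=[PREFIX]) for u in URLS])
# [False, False, True]

# As a Required prefix, only the first two are kept.
print([url_allowed(u, required_prefixes=[PREFIX]) for u in URLS])
# [True, True, False]
```

Remember that manually specified URLs are exempt from these rules, so a real walk would bypass a check like this for them.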

There's an even more powerful way to exclude pages with Exclude by Field, but that's for another Tech Tips article. (Watch for it in next month's newsletter.)

If you have questions about how any of these operate, feel free to contact Thunderstone Support.


HAPPENINGS

Steve Kolowich, a reporter for THE CHRONICLE OF HIGHER EDUCATION, noted what he referred to as Thunderstone's determined efforts at "Out-Googling Google" in his article entitled "In Search of a Better Search Engine" (http://chronicle.com/free/v55/i24/24a01501.htm) for the February 20, 2009 issue of The Chronicle. He wrote, in part:

The Virginia Bioinformatics Institute at Virginia Tech, facing a thickening swamp of digital documents, opted for Thunderstone's search appliance, which starts at $13,000, about six months ago. The institute uses the device to index reams of unpublished data and notes stored on its intranet. James E. Stoll, who leads Internet projects at the institute, said the appliance allowed research collaborators and other authorized users to retrieve items from across the institute's network of repositories without exposing those documents to the public Web, as basic site-search software would require. Researchers "don't want to be scooped," Mr. Stoll said. "This is their livelihood."

 

UPCOMING

Thunderstone's Fred Harmon (Channel Director and CSO) and Peter Thusat (Communication Director and CMO) will participate as exhibitors (Thunderstone Software Booth: 1045) during the AIIM International Exposition and Conference March 30 - April 2, 2009 at the Pennsylvania Convention Center in Philadelphia, PA.

Conference: March 30 - April 2, 2009
Exhibits: March 31 - April 2, 2009


GET YOUR FREE FLOOR PASS (A $75 VALUE) TO THE MARCH 31 - APRIL 2, 2009 AIIM INTERNATIONAL EXPOSITION IN PHILADELPHIA

REGISTER TODAY FOR YOUR FREE EXPO FLOOR PASS
and get access to all keynotes, general sessions,
Expo floor education and the ON DEMAND Expo!

To receive your free pass, use Registration Code: 615M
when you register at WWW.AIIMEXPO.COM
or call +1 888 824 3004.

Your FREE pass comes to you compliments of Thunderstone
Software. Please stop by and visit us at Booth 1045.


CUSTOMER SUCCESS STORY: USING WEBINATOR TO SEARCH ONLINE COLLECTIONS OF EURASIAN AND EAST EUROPEAN RESEARCH

The Center for Russian & East European Studies, a sub-unit of the larger University Center for International Studies (UCIS) at the University of Pittsburgh, won a competition a number of years ago to create the Vladimir I. Toumanoff Virtual Library — a collection that includes searchable online documents from many top U.S. researchers and analysts who write about politics, history, sociology, economics and foreign policy related to the states of the former Soviet Union and Central and Eastern Europe. Thunderstone's Webinator indexing and retrieval software enabled the responsible Informatics team to accomplish this goal in an efficient and affordable manner.

The University Center for International Studies (UCIS) provides the organizational framework that supports the University of Pittsburgh's mission to integrate and reinforce all its strands of international scholarship in research, teaching and public service. UCIS includes — in addition to many other highly-acclaimed programs and component units — a Center for Russian & East European Studies, an Asian Studies Center, a Center for Latin American Studies, a European Studies Center, an International Business Center (jointly sponsored with the Katz School of Business) and a European Union Center of Excellence (funded by the European Union.)

As a thin layer on top of the whole UCIS structure, Central Administration handles all business-related core functions and technology issues. When individuals in any of the sub-units need advice or consulting related to I.T. Services, Knowledge Management, database planning, upgrading of their websites or anything else that would fall into technology-mediated information, they call upon Mark J. Weixel, Director of Informatics at UCIS.

Discovering Webinator and Getting Started With Using It as an Easily Customizable Development Tool

Weixel recalled, "Back in I guess it was '98, I found out about Webinator from a friend of mine who was at Princeton at the time. We had a particular niche here in International Studies, and we wanted to create mini search engines for web content that was specific to certain world regions. We were hoping to create search engines like AltaVista, since Google wasn't even around then, that would allow people to do full-text searching of those websites. But, because we were vetting the list of sites, we thought we could increase the probability that searchers would come across something really relevant to the part of the world we were focusing on.

"We used Webinator to index and search collections of websites that were in and dealt with Russia and Eastern Europe.

"So, that was my original introduction to Webinator. We bought the entry-level product to begin with, and we currently have the Enterprise version. What I really like about it, still, is the fact that it's relatively easy to configure. It's much easier to configure than it was back when we bought the original product, when everything was run through command lines. I like the notion of relevance in terms of returned hits. It seems to make a lot more sense to me than, for example, Google page ranking — which places a much higher priority on popularity than it does on the actual content of the pages where text matches.

"Another thing that has been nice is the fact there is support for synonym matching within the server. And I think Vortex as a scripting language is very powerful. Even though I haven't used it to its fullest ability, it's proven to be quite flexible when we've needed to make modifications."


Feedback, suggestions and questions are welcome. Send your email to

Customer Success Story: Using Webinator To Search Online Collections Of Eurasian And East European Research

February 24, 2009

The Center for Russian & East European Studies, a sub-unit of the larger University Center for International Studies (UCIS) at the University of Pittsburgh, won a competition a number of years ago to create the Vladimir I. Toumanoff Virtual Library — a collection that includes searchable online documents from many top U.S. researchers and analysts who write about politics, history, sociology, economics and foreign policy related to the states of the former Soviet Union and Central and Eastern Europe. Thunderstone's Webinator indexing and retrieval software enabled the responsible Informatics team to accomplish this goal in an efficient and affordable manner.

The University Center for International Studies (UCIS) provides the organizational framework that supports the University of Pittsburgh's mission to integrate and reinforce all its strands of international scholarship in research, teaching and public service. UCIS includes — in addition to many other highly-acclaimed programs and component units — a Center for Russian & East European Studies, an Asian Studies Center, a Center for Latin American Studies, a European Studies Center, an International Business Center (jointly sponsored with the Katz School of Business) and a European Union Center of Excellence (funded by the European Union.)

As a thin layer on top of the whole UCIS structure, Central Administration handles all business-related core functions and technology issues. When individuals in any of the sub-units need advice or consulting related to I.T. Services, Knowledge Management, database planning, upgrading of their websites or anything else that would fall into technology-mediated information, they call upon Mark J. Weixel, Director of Informatics at UCIS. 

Discovering Webinator and Getting Started With Using It as an Easily Customizable Development Tool

Weixel recalled, "Back in I guess it was '98, I found out about Webinator from a friend of mine who was at Princeton at the time. We had a particular niche here in International Studies, and we wanted to create mini search engines for web content that was specific to certain world regions. We were hoping to create search engines like AltaVista, since Google wasn't even around then, that would allow people to do full-text searching of those websites. But, because we were vetting the list of sites, we thought we could increase the probability that searchers would come across something really relevant to the part of the world we were focusing on.

"We used Webinator to index and search collections of websites that were in and dealt with Russia and Eastern Europe.

"So, that was my original introduction to Webinator. We bought the entry-level product to begin with, and we currently have the Enterprise version. What I really like about it, still, is the fact that it's relatively easy to configure. It's much easier to configure than it was back when we bought the original product, when everything was run through command lines. I like the notion of relevance in terms of returned hits. It seems to make a lot more sense to me than, for example, Google page ranking — which places a much higher priority on popularity than it does on the actual content of the pages where text matches.

"Another thing that has been nice is the fact there is support for synonym matching within the server. And I think Vortex as a scripting language is very powerful. Even though I haven't used it to its fullest ability, it's proven to be quite flexible when we've needed to make modifications."

Implementing a Sophisticated Indexing and Retrieval Package with an Attractive ROI Track Record

Did they look at any competing products? According to Weixel, no, they didn't — for a couple of reasons. One, they're a small shop and they have to ask, "How much is this going to cost?" And, he said, the ROI for a one-time investment in a perpetual Webinator license was always pretty clear. It was a known quantity to them. Plus, Weixel strongly believed, as the person in charge of actually setting up and administering it, Webinator provided an affordable and high-quality solution for his specific application requirements. The business manager trusted Weixel's judgment, and by all accounts Webinator has delivered excellent results.

As to future expansion beyond the Center for Russian & East European Studies, discussions have begun with several of the other sub-units within UCIS. The Center for Latin American Studies and the European Studies Center also seem interested in putting more and more of their materials online — newsletters, conference reports, etc.

Webinator offers UCIS sub-units the possibility of acquiring a well-proven search engine that they could customize as desired and manage on their own.

Digitizing, Capturing and Making Searchable the Publications that Comprise the Vladimir I. Toumanoff Virtual Library

Weixel said their Webinator-powered search implementation getting the heaviest use right now is a project that the University of Pittsburgh's Center for Russian and East European Studies (REES) has done in conjunction with The National Council for Eurasian & East European Research (NCEEER, frequently pronounced 'Nickser') — a federally funded organization charged with supporting research, typically in social sciences, focusing on the former Soviet Union and Eastern Europe.

REES won a competition a number of years ago to create the Vladimir I. Toumanoff Virtual Library, comprising research reports and working papers submitted to NCEEER by scholars under their grants over the last two decades. This collection includes searchable online documents from many top U.S. researchers and analysts who write about politics, history, sociology, economics and foreign policy related to the states of the former Soviet Union and Central and Eastern Europe. NCEEER continues adding to the collection as its funded researchers prepare new papers.

"We proposed scanning and digitizing more than 20 years' worth of reports and then taking it and essentially pointing Webinator at it and, using the documents plug-in, doing a full-text index of the entire corpus. And I think one of the reasons that we won the competition is because, once we had done the really hard work of creating PDFs out of all the printed documents, we were going to be able to put it in one place and, overnight, have a full-text search index. It's my understanding that that was not a component of the other proposals," said Weixel.

He continued, "We successfully contended for that particular project, got it, spent the better part of nine months digitizing the materials and, I kid you not, it took, I think, less than 24 hours, and we had a fully searchable index of the entire corpus of research products. And it worked out well. We have this nice, targeted archive of material. We've got it set to re-index on a regular schedule, so anytime NCEEER gets a new batch of project reports — they upload them, they get caught in the next cycle of indexing, and it makes us very happy.

"The search interface for the archive materials of NCEEER is available through the Vladimir I. Toumanoff Virtual Library at the website of The National Council for Eurasian & East European Research. You kick off the search there, and then you're transported to Pittsburgh for the actual results set.

"Recently we put the server housing Webinator behind the firewall as part of our new increased security policy at the University of Pittsburgh. The fact that the folks at Thunderstone — John, in particular, in the Support Group — were able to work with me in coming up with a way to take a search query and pipe it through a back door into Webinator and then take the result set and present that to users in an accessible front-end, was just fantastic. It took me about two weeks once I had access to the beta version of the code, and that worked out really well. It was satisfying for me on a number of levels, not just because the product did what it was supposed to, but because I had support from people who could actually help me efficiently accomplish what I needed to do. That worked out very, very well."

Weixel added, "Our audience is interesting. Of course, we're housed within a major research university. So, we do have a number of our projects where we're trying to target our students and our faculty. But the area studies centers, these sub-units underneath the University Center for International Studies, most of them have federal funding that mandates what they call 'outreach', trying to bring the message of international studies to a larger community, whether it's a local business community or whether it's local educators at the kindergarten through high school level. Most of them probably have some kind of academic interest in one of the regions of focus. However you look at it, it's a pretty large and diverse audience.

"Being in an international studies environment, one thing that is important to us is foreign language support. I will admit to not having tried this yet with any of the CJK languages. But, in terms of the European and Cyrillic-based languages that we've indexed, Webinator has been a really good performer. And we've been quite happy with that."

For more information about UCIS or any of its area studies centers, you may contact UCIS by mail or email at:

University of Pittsburgh
University Center for International Studies
4400 Wesley W. Posvar Hall
Pittsburgh, PA 15260
