
More and more Web content is being stored in databases and generated dynamically. But storing Web content in databases should not obscure the fundamental nature of most Web content: It is primarily text. Because textual Web content is less structured than transactional or tabular data, the familiar products and techniques used to manage relational data may not be ideal, even if you can force-fit text into relational tables. We must expand and integrate our repertoire of tools and techniques as text becomes an increasingly large part of our data landscape. We should try to cross-fertilize RDBMSs with free-form search capabilities borrowed from text databases and the Web.
The way we search for data is one of the most glaring differences between relational and text or document databases, at least from an end user's perspective. Searching for data in a relational database is not a casual endeavor. Most end-user query tools hide the gory details of SQL, but they still expect users to know the database name, the table name, the field to look in, and even the field type. (Metadata layers may gloss over some of these details, but only if an administrator takes the time to create the metadata.) Users may also need to specify a "begins with" or "contains" search in order to avoid missing reasonable matches. You can't just say "Find Charles" without directing the search interface where to look. Is "Charles" a first name, a last name, a company name, or even a street or city name? Yet a free-form or unguided search is quite natural in a text or document database, especially on the Web. You enter a word or phrase, hit the search button, and get a list of matching documents. The result set will even be ranked and sorted based on relevance. Yes, you can define fancy search criteria if you want to, but that's optional.
Why can't it be this easy to search for data buried in relational databases? Although text databases can have structure, searching a text database is far more content-oriented. Relational databases impose too much structure on the search process. However, there are some examples that hint at textbase-like flexibility for searching relational databases, or at least tables. When a Microsoft FoxPro table is open in browse view, the Find command on the Edit menu lets you enter a string that FoxPro searches for in any field in any record. It even ignores field type distinctions by treating all displayed data as characters. Essentially, this interface treats a table like a document. The Find dialog in Microsoft Access lets you choose to search the current field or all fields, and you can limit matches to the whole field, any part of the field, or the start of the field. I assume other DBMS products have similar table-level free-form search features.
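To make the idea of treating a table like a document concrete, here is a minimal sketch in Python using SQLite as a stand-in. It is not how FoxPro or Access implements Find; the function name `find_in_table`, the `contacts` table, and its columns are all hypothetical, chosen only for illustration. The sketch converts every field to characters and offers the three match modes described above (whole field, any part of the field, start of the field).

```python
import sqlite3

def find_in_table(conn, table, text, match="any"):
    """Return (rowid, column, value) for each field whose text form matches.

    Hypothetical helper: every value is compared as characters, so numeric
    fields are searched the same way as text fields.
    """
    needle = text.lower()
    tests = {
        "whole": lambda cell: cell == needle,            # match the whole field
        "any": lambda cell: needle in cell,              # match any part of the field
        "start": lambda cell: cell.startswith(needle),   # match the start of the field
    }
    test = tests[match]
    # The table name is interpolated directly; acceptable for a sketch,
    # but not safe for untrusted input.
    cur = conn.execute(f'SELECT rowid, * FROM "{table}" ORDER BY rowid')
    columns = [d[0] for d in cur.description][1:]  # skip the leading rowid
    hits = []
    for row in cur:
        rowid, values = row[0], row[1:]
        for column, value in zip(columns, values):
            if value is not None and test(str(value).lower()):
                hits.append((rowid, column, value))
    return hits

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts "
    "(first_name TEXT, last_name TEXT, company TEXT, city TEXT, since INTEGER)"
)
conn.executemany(
    "INSERT INTO contacts VALUES (?, ?, ?, ?, ?)",
    [
        ("Charles", "Smith", "Acme", "Boston", 2001),
        ("Ann", "Charles", "Globex", "Austin", 1998),
        ("Bob", "Jones", "Charles Schwab", "Charleston", 2005),
    ],
)

# "Find Charles" without saying where to look.
for hit in find_in_table(conn, "contacts", "Charles"):
    print(hit)
```

With this sample data, the unguided search finds "Charles" as a first name, a last name, part of a company name, and part of a city name, which is exactly the ambiguity a field-by-field query forces the user to resolve up front.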
Unfortunately, these examples don't go far enough. They stop at the level of a table, but to be really useful they must operate across the boundaries of tables within a database, databases on a server, and even individual database servers. After all, Web search services span tens of thousands of separate domain names, and some even provide a single search interface for Web sites, newsgroups, and other Internet formats. We take this ignorance of boundaries or structure for granted when we search for what we want.
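As a rough illustration of crossing those boundaries, the sketch below searches every table in a database, and then several databases at once, without the user naming a table or field. It assumes SQLite (tables are discovered through the `sqlite_master` catalog) and ordinary rowid tables; the helper names `search_database` and `search_everywhere` and the sample `crm` and `hr` databases are hypothetical. A real cross-server search would also need indexing and relevance ranking, which this brute-force scan does not attempt.

```python
import sqlite3

def search_database(conn, text):
    """Return (table, rowid, column, value) for every matching field in any table."""
    needle = text.lower()
    names = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    hits = []
    for table in names:
        if table.startswith("sqlite_"):  # skip SQLite's internal tables
            continue
        cur = conn.execute(f'SELECT rowid, * FROM "{table}" ORDER BY rowid')
        columns = [d[0] for d in cur.description][1:]  # skip the leading rowid
        for row in cur:
            for column, value in zip(columns, row[1:]):
                # Treat every value as characters, whatever its declared type.
                if value is not None and needle in str(value).lower():
                    hits.append((table, row[0], column, value))
    return hits

def search_everywhere(databases, text):
    """Run the same search over several databases; `databases` maps a label to a connection."""
    return [
        (label,) + hit
        for label, conn in databases.items()
        for hit in search_database(conn, text)
    ]

# Illustrative data only: two separate in-memory databases.
crm = sqlite3.connect(":memory:")
crm.execute("CREATE TABLE customers (name TEXT, city TEXT)")
crm.execute("CREATE TABLE orders (customer TEXT, item TEXT)")
crm.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Charles Schwab", "San Francisco"), ("Acme", "Charleston")],
)
crm.execute("INSERT INTO orders VALUES (?, ?)", ("Acme", "Charles Dickens box set"))

hr = sqlite3.connect(":memory:")
hr.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT)")
hr.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Charles", "Nguyen"), ("Ann", "Lee")],
)

for hit in search_everywhere({"crm": crm, "hr": hr}, "charles"):
    print(hit)
```

Each hit carries the database label, table, row, and field where the match was found, so the structure is reported in the results rather than demanded in the query.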
As corporate intranets blend access to various information formats, we must let end users seamlessly search across various flavors of relational databases, text and document databases, and even operating system files. Users don't care about technical distinctions that hinder their ability to find the information they need.