
Web Sites Match Companies and Job Seekers in Ways Previously Impossible.
The Internet is a remarkable leveling tool that brings together people who might otherwise never interact. It works especially well in the employment arena, allowing job seekers and employers to network in ways that were previously impossible. In just a short period of time, job-related sites have popped up all over the Net. Some simply provide job listings; others let job hunters create and store an online resume that employers can browse at will. These sites will also notify you when a job matching your specifications has been posted.
In this issue I look at three employment-related sites and explore how people are using client/server, Internet/Intranet technology to recruit and obtain employment. I show how each database-driven Web site has solved the puzzle of providing large-scale information retrieval, storage, and delivery over the Internet, from initial development to the site you can call up on your Web browser today.
As more people discovered the value of the Internet for finding jobs, the issues became much more complex. Job sites not only had to maintain large databases of information, but they had to grapple with competitive demands and start generating revenues. This meant drawing more employers and employees to the site, making information as appealing and easy to navigate as possible, and offering more ways to search and personalize their services.
Some companies had a head start in that they already offered client/server versions of their employment products and merely had to port them to the Web. One such site is IntelliMatch Inc.'s IntelliMatch (San Jose, Calif., www.intellimatch.com). The company already had a client/server model of its service built on Oracle with PowerBuilder as its front end, which it sold to large companies.
Four machines maintain the IntelliMatch Web site. Two Sun Enterprise 5000 database servers run Oracle7, the Oracle Web Agent, and Netscape Commerce Server. The site also has two Sun SPARC 20 servers dedicated to the resumes and connected directly to the databases that hold the files of employers and seekers; these servers also run Netscape Commerce Server. At any given time the site holds approximately 50,000 resumes.
Between 10 and 12 people were involved in developing the site at any given time. Because the company already offered a client/server version of the product on which IntelliMatch is based, it decided in mid-1995 that Internet technology had advanced enough to support the product on the Web. The first Web application was up and running in January 1996, with free text search of resumes; in March 1996, the company introduced the Precision Matching engine.
The development team used Oracle Designer/2000 for the architectural database design work. According to Director of Software Engineering Edwin Westlake, the team used Oracle because they inherited it from the earlier client/server application and they had licenses for it. Now that the site and application development environment are open on the Web, Westlake hopes to expand the hardware and database platform support.
The team uses Microsoft FrontPage to prototype the HTML pages before putting them on the site. HTML is dynamically generated via PL/SQL calls to the Oracle Web Agent. Although the site is currently monitored by Oracle's own tools, Westlake is evaluating monitoring tools from Platinum and may switch over sometime next year.
The main differentiator for IntelliMatch is its Precision Matching technology, which identifies specific candidate characteristics, matches them to job requirements, and determines whether a candidate is qualified. The technology is based on a structured format in which text resumes are broken down into component parts. It also relies on a common lexicon -- standard names, degrees, qualifications, experience, and so on -- which is stored in the database. Both employer and employee are using the same language to define job postings, resumes, and search criteria.
Employers use this lexicon when specifying what they are seeking in a candidate. After all the criteria have been entered, a dynamic SQL statement is generated that queries the database, and information is returned in the form of an Oracle table. This information is partitioned and appears to the employer as a list of job seekers sorted by how recently the resume has been updated.
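The query step described above can be illustrated with a minimal sketch. It is not IntelliMatch's code: it uses Python and SQLite rather than PL/SQL and Oracle, and the table names, column names, and sample data are all hypothetical. It shows only the general idea of turning a set of lexicon-based criteria into one generated SQL statement whose results are sorted by how recently each resume was updated.

```python
import sqlite3

# Hypothetical schema (an assumption, not IntelliMatch's): one row per
# resume, plus one row per (resume, lexicon term) pair.
SCHEMA = """
CREATE TABLE resumes (id INTEGER PRIMARY KEY, name TEXT, updated TEXT);
CREATE TABLE resume_terms (resume_id INTEGER, category TEXT, term TEXT);
"""


def build_query(criteria):
    """Build a parameterized SELECT from {category: term} criteria.

    Each criterion becomes one EXISTS clause, so a resume must satisfy
    all of them. Results are ordered most recently updated first.
    """
    clauses, params = [], []
    for category, term in sorted(criteria.items()):
        clauses.append(
            "EXISTS (SELECT 1 FROM resume_terms t "
            "WHERE t.resume_id = r.id AND t.category = ? AND t.term = ?)"
        )
        params.extend([category, term])
    where = " AND ".join(clauses) if clauses else "1=1"
    sql = (
        "SELECT r.id, r.name, r.updated FROM resumes r "
        f"WHERE {where} ORDER BY r.updated DESC"
    )
    return sql, params


def find_candidates(conn, criteria):
    """Run the generated statement and return the matching rows."""
    sql, params = build_query(criteria)
    return conn.execute(sql, params).fetchall()


def demo():
    """Load invented sample data and run one two-criterion search."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO resumes VALUES (?, ?, ?)", [
        (1, "A. Rivera", "1996-11-02"),
        (2, "B. Chen", "1997-01-15"),
        (3, "C. Okafor", "1996-12-20"),
    ])
    conn.executemany("INSERT INTO resume_terms VALUES (?, ?, ?)", [
        (1, "skill", "Oracle"), (1, "degree", "BS"),
        (2, "skill", "Oracle"), (2, "degree", "BS"),
        (3, "skill", "Oracle"), (3, "degree", "MS"),
    ])
    return find_candidates(conn, {"skill": "Oracle", "degree": "BS"})
```

Because both sides draw on the same lexicon, the generated statement can use exact term matches rather than fuzzy text comparison. The sketch passes values as bound parameters instead of pasting them into the SQL string, which is a design choice of the example and not something the source describes.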
The site receives between 200,000 and 220,000 hits daily, with approximately 10,000 unique user IDs per day. Many of these hits are the result of links from partner sites such as Knight-Ridder, InfoSeek, Jobs Across America, and Internet University. Westlake uses WebTrends from e.g. Software Inc. (Portland, Ore.) to measure activity and see how people are navigating the site. A new version is released every three weeks, alternating between the employee side of the site and the employer side. Westlake tries to introduce some type of site architecture change every three months.
On the Intranet side, the IntraViewer product went into beta last September. It lets employers capture and access all of the skills in their workforce, helping them put project teams together simply by entering a query. It gives managers a dynamic repository of their employees' skills and experience so that they can target specific people for specific projects or new positions. Meanwhile, employees can use IntraViewer to promote themselves and their achievements and find out about advancement opportunities within the company.
SmartMatch is based on the notion that you can view terms in a document as vectors in a high-dimensional space. Each dimension in the vector represents how strongly a particular term is present in a given document. Thus, in a typical domain of jobs or resumes, a document vector might have 20,000 dimensions. Similarly, a query is also represented as a vector in this space. To retrieve documents, you find document vectors that are sufficiently close to the query vector. SmartMatch is capable of searching and scoring 50,000 documents per second, regardless of the number of terms used in the query.
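To make the vector idea concrete, here is a minimal sketch of vector-space retrieval. It is not SmartMatch itself: the article does not say how SmartMatch weights terms or measures closeness, so raw term counts, cosine similarity, and the 0.2 threshold are assumptions chosen for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Sparse term vector: each distinct term is one dimension,
    weighted here by its raw count (an assumed weighting)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine of the angle between two sparse vectors (1.0 = same direction)."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, threshold=0.2):
    """Return (score, document) pairs at or above the threshold, best first."""
    q = vectorize(query)
    scored = [(cosine(q, vectorize(doc)), doc) for doc in documents]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)
```

For example, `retrieve("oracle developer", ["oracle database administrator", "perl cgi web developer", "oracle pl/sql developer"])` ranks the third document first, because it shares both query terms. Only the terms that actually occur are stored, so a 20,000-dimension vector stays small in practice.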
CareerSite content is automatically coded by means of a knowledge base of 40,000 employment concepts (similar in idea to IntelliMatch's lexicon). Concepts are arranged hierarchically and support an unlimited number of synonyms. Concepts consist of words or phrases up to 10 words long that describe various aspects of employee recruitment and job search. Through this common reference to the knowledge base, SmartMatch can find appropriate content regardless of the words or phrases used in the query request.
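A toy version of this kind of concept coding might look like the sketch below. The knowledge base shown is invented for illustration (five synonyms instead of 40,000 concepts), and the greedy longest-phrase matching is an assumed strategy; the article states only that concepts are hierarchical, have synonyms, and are phrases of up to 10 words.

```python
MAX_PHRASE_WORDS = 10  # the article's stated upper bound on concept length

# Hypothetical miniature knowledge base: synonym phrase -> concept ID.
SYNONYMS = {
    "database administrator": "DBA",
    "dba": "DBA",
    "db admin": "DBA",
    "software engineer": "SOFTWARE_ENGINEER",
    "programmer": "SOFTWARE_ENGINEER",
}

# Hypothetical hierarchy: concept ID -> parent concept ID.
PARENT = {
    "DBA": "IT_OCCUPATION",
    "SOFTWARE_ENGINEER": "IT_OCCUPATION",
}


def extract_concepts(text):
    """Scan left to right, matching the longest known phrase at each position."""
    words = text.lower().split()
    found = []
    i = 0
    while i < len(words):
        longest = min(MAX_PHRASE_WORDS, len(words) - i)
        for n in range(longest, 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in SYNONYMS:
                found.append(SYNONYMS[phrase])
                i += n  # skip past the matched phrase
                break
        else:
            i += 1  # no concept starts at this word
    return found


def ancestors(concept):
    """Walk up the hierarchy from a concept to its root."""
    chain = []
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain
```

With this table, a posting that says "database administrator" and a query that says "db admin" both reduce to the same concept ID, which is the effect the common knowledge base is meant to achieve.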
Work on the site began in February 1995, and it went live in August of the same year. A six-member, full-time staff worked with several outside consultants to develop the site, and the developers continue to add new functionality and create new interfaces. Hosted on a dual-processor Silicon Graphics Challenge DM server, the site runs a combination of Microsoft SQL Server and proprietary content management software. Microsoft SQL Server handles the job subscriber and service administration tasks, and SmartMatch handles the content. The team developed the site using Perl, C, CVS, the Apache Web server (from Community ConneXion Inc., Berkeley, Calif.), and CGI.pm. A Perl-based CGI interface links the Web server to the database server and serves the HTML. The team plans to use the Microsoft SQL Server authentication module for the Apache server in the future. Approximately 78 templates had to be created with a C program preprocessor to serve up the HTML; for most template work, however, the team uses Perl tools such as CGI.pm.
Aside from a few static pages on the site, the pages are dynamically generated. One example of a dynamic page is job search query input, which generates concept validation pages that let users decide on a variable number of concepts to be used in the query. Responses to a job search query are generated dynamically, as is the job-seeker desktop and the employer virtual office desktop.
The staff members chose to develop their own database manager because they thought that relational databases did not provide the performance, accuracy, or flexibility they needed for data files and keyword retrieval. At that time, they could not find a search engine that delivered acceptable response times for document retrievals using a large number of concepts in searching and matching operations, and none let them develop domain-specific scoring. They are currently considering a switch to Oracle or Sybase.
Hotjobs.com lets you search for jobs by keywords or browse a complete listing of jobs, enter or edit an online resume, search for a candidate, or access information on member companies. You can create and save an online resume that you can edit at any time, and you can arrange to have your resume automatically emailed in response to job openings you want to pursue.
The site took a mere four weeks of initial development by a team of two technical engineers and one graphics/layout and design engineer. Running on Oracle7 and developed with Netscape Commerce Server, the site was created in C using the GNU GCC compiler, version 7.2.2. All applications compile and run under SunOS 4.x, Solaris 2.x, and IRIX 6.2. The linking between the Web server and database servers is made with Netscape Commerce Server's CGI, which also generates the HTML.
Companies enter job information into HTML form fields through their browser; this data is passed to Netscape Commerce Server's CGI on the back end to verify the validity of the data presented. The data is then moved into appropriate tables within the database system's back end. Applicants submit their resumes online to specific jobs through a CGI interface that pulls the job information from the job database and sends the member company the applicant's resume, along with information about the job(s) to which he or she is applying.
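The validate-then-store flow for a job posting can be sketched as follows. This is an illustration only: the site itself was written in C against Oracle7, whereas the sketch uses Python with SQLite, and the field names, required-field rule, and `jobs` table are all hypothetical.

```python
import sqlite3
from urllib.parse import parse_qs

# Hypothetical required form fields; the article does not list the real ones.
REQUIRED = ("company", "title", "location")


def open_db():
    """Create an in-memory database with a hypothetical jobs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE jobs (id INTEGER PRIMARY KEY, "
        "company TEXT, title TEXT, location TEXT)"
    )
    return conn


def validate_job(form_body):
    """Parse a URL-encoded form body and return (fields, errors)."""
    parsed = parse_qs(form_body, keep_blank_values=True)
    fields = {key: values[0].strip() for key, values in parsed.items()}
    errors = [f"missing {name}" for name in REQUIRED if not fields.get(name)]
    return fields, errors


def store_job(conn, fields):
    """Insert a validated posting and return its new row ID."""
    cur = conn.execute(
        "INSERT INTO jobs (company, title, location) VALUES (?, ?, ?)",
        (fields["company"], fields["title"], fields["location"]),
    )
    return cur.lastrowid


def handle_submission(conn, form_body):
    """Validate first; touch the database only if the data is acceptable."""
    fields, errors = validate_job(form_body)
    if errors:
        return None, errors
    return store_job(conn, fields), []
```

Calling `handle_submission(conn, "company=Acme&title=DBA&location=New+York")` stores the posting, while a body with a blank or absent required field comes back with a list of errors and nothing is written.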
The site also offers a service to member companies whereby they can gain on-the-fly, realtime statistics regarding how many times each of their jobs came up in an applicant's search, how many times an applicant looked at a particular job, and how many resumes they have received for a particular job. Also, user host information, browser type, and other information are stored for version statistics and tracking purposes. All of this information is available online through CGI-generated HTML forms to member companies.
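The three per-job figures described above amount to simple event tallies. The following sketch shows one way to keep them; the class, the event names, and the in-memory storage are assumptions made for illustration, since the article does not describe how the site records these counts.

```python
from collections import Counter, defaultdict

# Hypothetical event names for the three statistics the article lists.
EVENT_TYPES = ("search_hit", "view", "resume")


class JobStats:
    """Tally per-job events as they happen, so reports are always current."""

    def __init__(self):
        self._counts = defaultdict(Counter)

    def record(self, job_id, event):
        """Count one occurrence of an event for a job."""
        if event not in EVENT_TYPES:
            raise ValueError(f"unknown event: {event}")
        self._counts[job_id][event] += 1

    def report(self, job_id):
        """Return all three counters for a job, with zeros for unseen events."""
        counts = self._counts.get(job_id, Counter())
        return {event: counts[event] for event in EVENT_TYPES}
```

Each CGI program would call `record` at the moment the event occurs (a job appearing in search results, a job page being viewed, a resume being submitted), and the member-company report page would call `report`.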
Earle Ady, vice president of OTEC, was unable to give me specific usage numbers for the site, but he estimates that it receives some 50,000 hits per day, of which approximately 1,000 are unique users. These numbers are slightly skewed by the fact that large services such as AOL, which route their users' Web traffic through proxy servers, count as "one site" because all of their connections originate from the proxy machine. To monitor the site, OTEC uses tracking facilities within all CGI applications on the back end. Netscape Commerce Server's logging facilities provide hit statistics.
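The proxy effect is easy to see in a small sketch that counts hits and unique hosts from server log lines. It assumes Common Log Format, in which the first whitespace-separated field is the client host; the host names and log lines are invented, and the function is illustrative, not OTEC's tracking code.

```python
def summarize(log_lines):
    """Return (hits, unique_hosts) for Common Log Format lines,
    where the first field of each line is the client host."""
    hosts = [line.split()[0] for line in log_lines if line.strip()]
    return len(hosts), len(set(hosts))


# Invented sample: three requests arrive through one proxy host,
# one request comes directly from an individual machine.
SAMPLE_LOG = [
    'proxy.example.net - - [10/Jan/1997:09:00:01 -0500] "GET /jobs HTTP/1.0" 200 512',
    'proxy.example.net - - [10/Jan/1997:09:00:05 -0500] "GET /jobs HTTP/1.0" 200 512',
    'proxy.example.net - - [10/Jan/1997:09:00:09 -0500] "GET /resume HTTP/1.0" 200 734',
    'ws12.example.edu - - [10/Jan/1997:09:01:30 -0500] "GET /jobs HTTP/1.0" 200 512',
]
```

Here `summarize(SAMPLE_LOG)` reports four hits but only two unique hosts, even if the three proxied requests came from three different people. Host-based counts therefore understate the number of real visitors.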
Each solution has its pros and cons, and each site is sure to undergo significant changes over the next six months -- such is the nature of Web applications. The solutions these three sites have found are good representations of how companies are exploiting the "World Wide" nature of the Web and also the storage and searching capabilities of today's database and client/server technology. Along with the growth in this area comes innovation; more sophisticated sites are bound to evolve from the ones reviewed here. I'll keep you posted.