














Copyright © 2004
by Polaris Computing.
All rights reserved.

Extracting Useful Information from the Web

Web Pan is a program that extracts information from Web pages.
Manually copying the information you need from Web pages is often
impractical. There are a few programs that can help you do this, but
none of them is particularly easy to use. Web Pan is designed to make
the extraction job as easy as possible.

Here is an example of how Web Pan works. The user enters the URL and
specifies the target labels for the information he wants. Then he
imports a list of item IDs from a text file. The program extracts all
the wanted information, displays it on the screen, and writes it to a
tab-delimited text file. The user can then easily import this output
file into other programs.
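
As a rough illustration of the workflow above, the Python sketch below pulls the text that follows a visible label out of a page and writes the results as tab-delimited rows. It is not Web Pan's code: the function names (`visible_text`, `extract_by_label`, `write_tsv`) and the rule that a value is whatever follows the label on the same line of visible text are assumptions made for the example.

```python
import csv
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(html):
    """Return the page text with all markup removed."""
    parser = _TextOnly()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


def extract_by_label(html, label):
    """Return the text after `label` on the same line, or "" if absent.

    Assumption: the wanted value sits directly after the label, as in
    "Price: $19.99". Real pages may need a looser rule.
    """
    text = visible_text(html)
    match = re.search(re.escape(label) + r"[ \t]*([^\r\n]*)", text)
    return match.group(1).strip() if match else ""


def write_tsv(rows, labels, out):
    """Write one tab-delimited line per item: the ID, then one column per label.

    `rows` is a list of (item_id, {label: value}) pairs; `out` is any
    writable text file object.
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID"] + list(labels))
    for item_id, values in rows:
        writer.writerow([item_id] + [values.get(label, "") for label in labels])
```

Under these assumptions, a page whose visible text contains `Price: $19.99` yields `$19.99` for the label `"Price:"`, and a label that does not appear on the page yields an empty column in the output file.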
The user simply copies the URL from a Web browser, identifies the part
of the URL that contains the item ID, and replaces that part with "< >"
(see the example). For the target labels of the information he wishes
to extract, he enters what he sees on the Web page (such as "Price:" on
a vendor website). There is no need to examine the HTML source code
itself. He then specifies the filename of the input file (a text file
containing a list of item IDs).

Once the procedure is defined, the user can run it again and again with
one mouse click. He can run it on the same list of item IDs repeatedly,
or on a new list by updating the input file. The program can maintain
32 procedures at the same time. The user can also manually enter a
single item ID as the input.

Web Pan comes in two editions: the personal edition and the corporate
edition. The personal edition limits the list to 10 item IDs, so each
of the 32 procedures can process up to 10 items at a time. The
corporate edition limits the list to 100 item IDs. The limit prevents
the program from wasting Web servers' resources. If you have more items
than the limit allows, simply run the procedure a few times, updating
the input file as necessary.

The personal edition is free and can be freely distributed. You can
also use it for evaluation purposes. Click
[here] to download Web Pan Personal Edition.
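
The "< >" placeholder and the list-length limits described above can be sketched in a few lines of Python. This is only an illustration, not Web Pan's implementation: the helper names (`read_item_ids`, `build_url`, `batches`), the one-ID-per-line input format, the percent-encoding of IDs, and the `vendor.example` address are all assumptions.

```python
from urllib.parse import quote

# Marker that stands in for the item-ID part of the copied URL.
PLACEHOLDER = "< >"


def read_item_ids(text):
    """Parse the input file contents, assuming one item ID per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_url(template, item_id):
    """Substitute one item ID into the URL template."""
    if PLACEHOLDER not in template:
        raise ValueError("template has no '< >' placeholder")
    # Percent-encode the ID so unusual characters stay valid in a URL
    # (an assumption; the page does not say how IDs are inserted).
    return template.replace(PLACEHOLDER, quote(item_id, safe=""))


def batches(item_ids, limit):
    """Split the ID list into consecutive runs of at most `limit` IDs."""
    if limit < 1:
        raise ValueError("limit must be at least 1")
    return [item_ids[i:i + limit] for i in range(0, len(item_ids), limit)]


if __name__ == "__main__":
    template = "http://vendor.example/item?id=< >"  # hypothetical URL
    ids = read_item_ids("A100\nA101\nA102\n")
    for batch in batches(ids, 10):  # 10 = personal edition limit
        for item_id in batch:
            print(build_url(template, item_id))
```

Splitting a long list into runs of 10 (or 100 for the corporate edition) mirrors the advice to run the procedure several times while updating the input file.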

Polaris Computing also provides an unlimited-installation license for
Web Pan, which covers any number of installations within a company.
Click
[here] for pricing information.

Note: Web Pan can access redirected websites. If a website does not
support programs other than Web browsers, Web Pan will not try to
extract data from it.