Telegraph FFF Frequently Asked Questions
Technical Overview Questions
Q: What is the FFF? What is Telegraph?
A: FFF stands for the "Facts and Figures Federation", and is
meant to describe the collection of information stored in databases
available on the Internet. These databases are "federated", in the
sense that independent organizations manage each of them, but they can
be made to work together in a loose affiliation. Note that these
databases have no hyperlinks and are not meant to be browsed, and hence do
not form a "web" in the sense of the WWW. The WWW often points to
databases in the FFF, though, so people sometimes say that a database
is "on the web".
Telegraph is the name of a software research project at UC Berkeley that is
building an adaptive dataflow system. The Telegraph project is named after
Telegraph Avenue, the famous
main street of the Berkeley campus. Like the street, Telegraph
software is designed to be a vital thoroughfare for a volatile,
eclectic mix coming from all over the world.
Q: Is the FFF the same thing as the deep
web, the hidden web, and dark matter on the web?
What is this all about?
A: Yes, these terms all refer to the same data. As you have
probably noticed, there is a lot of data on the web that you can only
get by filling out a form -- this includes data from directories like
yellow pages, data from government databases, data from stock quote
servers, and so on. Text-search engines do not know what values to
fill into forms, and hence do not crawl or index data behind the
forms. As a result, this data has whimsically been called "hidden",
"deep", or "dark matter" -- i.e. data that you know is there, but you
cannot "see" in a traditional search engine.
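To make the form-filling point concrete, here is a minimal Python sketch of what submitting a query form amounts to: choosing field values and encoding them into a request. The endpoint URL and field names are hypothetical, chosen only for illustration, and this is not code from the FFF engine.

```python
from urllib.parse import urlencode

def form_query_url(action_url, fields):
    """Build the GET URL a browser would request when a form is submitted."""
    return action_url + "?" + urlencode(fields)

# Hypothetical form: a directory lookup with "state" and "name" fields.
# A link-following crawler never sees this URL, because nothing links to it;
# it exists only once somebody picks values for the fields.
url = form_query_url("http://example.org/lookup", {"state": "CA", "name": "Smith"})
print(url)
```

A crawler that only follows hyperlinks has no way to guess which field values are meaningful, which is why the pages behind such forms stay out of text-search indexes.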
Q: Is the FFF engine a new kind of search engine?
A: The Telegraph FFF engine does search and more, with a focus on
facts and figures, not full text. The FFF engine searches for facts
and figures from selected sites on the Internet, and allows them to
be combined and analyzed in complex ways.
Q: How does the FFF engine differ from traditional
text search engines like Hotbot, Google, AltaVista, etc.?
A: The FFF engine differs from the traditional text search engines in three main
ways.
- The FFF engine focuses on facts and figures, not on text
documents. Text documents are largely used in a "find and read"
mode. Facts and figures are in some sense richer than text,
since they can be gathered together, correlated, and analyzed.
- The FFF engine can fetch data from the deep web -- that is,
data that you can only get on the Internet by filling out query forms.
Traditional search engines miss this class of data, which is estimated
to be 500 times larger than the data these search engines contain.
Because this data is not explicitly "published" on static web pages,
it's often the most interesting information on the Internet: not marketing
material, but raw facts and figures stored in large databases.
- The FFF engine gives you the power to correlate and analyze facts and
figures. Traditional text-search engines cannot do this -- you cannot
"search" for something that has not been computed yet. For example,
try to use a search engine to find out how much Bush and Gore received
in donations from health insurance companies and their employees in
California. Unless somebody has already posted that total as text on
the web, there is no way for a text-search engine to find it out for
you.
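The donation example in the last point can be sketched as a small computation. The Python below is only an illustration under stated assumptions: the record layout, the candidate labels, and the amounts are all made up, and this is not how the FFF engine itself is implemented.

```python
from collections import defaultdict

# Hypothetical records: labels and amounts are invented for illustration
# and are not real contribution data.
donations = [
    {"candidate": "Candidate A", "donor_state": "CA", "industry": "health insurance", "amount": 500},
    {"candidate": "Candidate B", "donor_state": "CA", "industry": "health insurance", "amount": 250},
    {"candidate": "Candidate A", "donor_state": "NY", "industry": "health insurance", "amount": 1000},
    {"candidate": "Candidate A", "donor_state": "CA", "industry": "software", "amount": 300},
    {"candidate": "Candidate A", "donor_state": "CA", "industry": "health insurance", "amount": 200},
]

def total_by_candidate(records, state, industry):
    """Sum donation amounts per candidate for one state and one industry."""
    totals = defaultdict(int)
    for record in records:
        if record["donor_state"] == state and record["industry"] == industry:
            totals[record["candidate"]] += record["amount"]
    return dict(totals)

totals = total_by_candidate(donations, "CA", "health insurance")
print(totals)
```

The totals do not exist anywhere as text until the filter and the sum are run, which is why a keyword search cannot return them.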
Q: Why does the FFF service require me to run
Java and download a plugin?
A: The current FFF demo provides a high-function,
interactive client that cannot be easily realized in static HTML
pages. So we use Java, with the Swing widget set. We are planning to
release a lower-function, less interactive version of the FFF client
that will run without Java and Swing.
Q: Why does the FFF engine seem slow?
A: When you start a query on the FFF demo, it automatically issues
thousands of web-clicks and form-fillout operations on your behalf,
getting data directly off of other websites.
While the FFF engine uses new technologies
invented at Berkeley to adapt to the unpredictable performance of
these websites, it is ultimately limited by the speeds of the sites
it's querying.
We also provide "quick and dirty" results. If you're not
interested in seeing the latest versions of the data, you can always
view a slightly out-of-date, pre-computed result to the query. Note
that this is the idea that makes search engines fast -- they build
indexes which basically precompute the results of keyword searches.
For this demo over election facts and figures, we felt that live
data access was more interesting. But the choice is yours.
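The remark about indexes precomputing keyword searches can be illustrated with a tiny inverted index. This Python sketch is a textbook simplification with made-up documents, not a description of any particular search engine.

```python
from collections import defaultdict

def build_index(docs):
    """Precompute, for every word, the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, keyword):
    """Answer a keyword query with one lookup instead of rescanning the documents."""
    return sorted(index.get(keyword.lower(), set()))

# Hypothetical documents, for illustration only.
docs = {
    1: "election facts and figures",
    2: "campaign finance figures",
    3: "election results",
}
index = build_index(docs)
print(search(index, "figures"))
```

The work of reading every document happens once, when the index is built. The price is staleness: the index reflects the documents as they were at build time, which is the same trade-off as viewing a pre-computed query result.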
Election Questions
Q: What are you trying to prove with the FFF
Election 2000 demo?
A: Our intent is to illustrate the practicality of our research on adaptive dataflow, which
brings the power of rich queries to the unpredictable performance of
Internetworked databases. We also hope to highlight the importance
and availability of facts and figures on the Internet -- everybody knows
about text on the web, but many people overlook the facts and
figures.
We are not attempting to build a comprehensive site for election facts
and figures, nor are we trying to affect the election in any way.
Q: Where does the FFF engine get its
data? Can you guarantee the accuracy of your results?
A: Like traditional search engines, the FFF engine gets all of its
data from other sources on the Internet, and as such we cannot guarantee the
veracity of any information we use.
We make every effort to ensure that the results we present are
accurate computations over the data we retrieve. However, we note
that the Telegraph FFF engine is a piece of research software, and as
a result there is a chance that computational errors may exist in our
code. We encourage our users to double-check any results before
drawing any conclusions.
Q: How up to date is the data being shown
by the FFF engine?
A: In the current demo, most of the data we show is accessed
off other sites on the Internet, so it is exactly as up to date as those
other sites. However, we cache some of this data for short periods, to improve
responsiveness. More information is provided under the "Our Datasources"
section in each query.
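Caching data "for short periods" is commonly done with a time-to-live (TTL) cache. The Python sketch below shows the general idea only; the class name, the 60-second lifetime, and the stand-in fetch function are assumptions for illustration, not details of the FFF engine.

```python
import time

class TTLCache:
    """Remember fetched values for a limited time, then fetch them again."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # function that retrieves fresh data for a key
        self._ttl = ttl_seconds    # how long a cached value stays valid
        self._clock = clock        # injectable so the expiry logic can be tested
        self._entries = {}         # key -> (value, time it was fetched)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]        # still fresh: skip the remote site
        value = self._fetch(key)   # missing or expired: go back to the source
        self._entries[key] = (value, now)
        return value

# Hypothetical usage: the lambda stands in for a slow request to a remote site.
cache = TTLCache(fetch=lambda key: "result for " + key, ttl_seconds=60)
print(cache.get("some query"))
```

A shorter lifetime keeps results closer to the source sites at the cost of more remote requests; a longer one improves responsiveness but lets the data drift out of date.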
Q: I want to ask a different query than the
ones in the demo. How can I do that?
A: This feature will be available shortly. In the meantime, we
would love to hear suggestions for:
- Internet sites with interesting data
- Queries you'd like to be able to ask
We're especially eager to hear from staff at political and media
organizations about ways that we can make the site or the software more useful.
Email us at fffsuggest@fff.cs.berkeley.edu.
Software Availability
Q: Can I get a copy of your software? Is
it open source?
A: An alpha version of our Telegraph software, including source
code, will be made available to the general public soon, pending
bugfixing and documentation efforts. The software will be available
subject to the standard UC Berkeley license (one of the more flexible
open source licenses).
Please see the Telegraph project
website, and/or email telecode@telegraph.cs.berkeley.edu.
System Requirements
Q: What are the minimum system requirements to view the demonstrations on the web page?
A: You need the Java(TM) Plugin 1.2 or higher and the Java(TM) 2 Runtime Environment. Here is a tutorial on how to run Swing Applets.
A step-by-step guide to downloading and installing the Java(TM) Plugin is available here.
Reaching Us
Q: Can I contact you?
A: Please do! Use one of the following email addresses as
appropriate: