
Thread: How does a web crawler work

  1. #1
    Join Date
    Nov 2009
    Posts
    612

    How does a web crawler work

    Hi,
    I was just wondering how a popular search engine like Google works. All I found is that these are called crawler-type search engines, and that the crawler is also called a web robot. I need some detailed information on this. First, how does a search engine work and find the web page I want? Why are they called crawlers? Also, what is the basic working of this type of search engine, that is, how does a web crawler work? Thanks in advance.

  2. #2
    Join Date
    Apr 2008
    Posts
    2,277

    Re: How does a web crawler work

    Sometimes you want to browse a particular site thoroughly but do not have a high-speed connection. To get around this, you can download the entire site and view it while you are offline. To do this, you can use a web spider (a website copier). A web spider reproduces all or part of a site on your hard drive. This means that the browser's address bar shows a path on your hard drive instead of a web URL. You can navigate the copied site down to the link depth that you set in the website copier when you downloaded it.

  3. #3
    Join Date
    Apr 2008
    Posts
    2,276

    Re: How does a web crawler work

    A web crawler, also known as a web robot, is a program or script that a search engine runs automatically on the internet. It fetches pages across the internet so that the search engine can later match them against queries entered through a web browser. The most basic example is the crawler behind a search engine. The WWW is like a web, and search engine crawlers are like spiders that crawl across it and fetch information.

  4. #4
    Join Date
    May 2008
    Posts
    2,792

    Re: How does a web crawler work

    Search engines mostly crawl sites every day to pick up any new content added to them. The search engine saves a copy of the pages it visits so that it is easy to index them later. It is one of the most basic techniques used on the internet, and a widely successful one.
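
    To make the "save a copy, index it later" idea concrete, here is a minimal sketch of an inverted index built from saved page text. The names `build_index`, `search`, and `saved_pages` and the example URLs are made up for illustration; a real search engine index is far more elaborate than this.

```python
from collections import defaultdict
import re


def build_index(pages):
    """Map each lowercase word to the set of URLs whose saved text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word of the query (simple AND search)."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical saved copies of two crawled pages (URL -> page text).
saved_pages = {
    "http://example.com/a": "Web crawlers fetch pages",
    "http://example.com/b": "Search engines index pages",
}
index = build_index(saved_pages)
```

    Because the index is built ahead of time from the saved copies, answering a query is only a dictionary lookup and a set intersection, not a fresh visit to every site.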

  5. #5
    Join Date
    Apr 2008
    Posts
    2,572

    Re: How does a web crawler work

    Here is how it works from the user's side. First, when you enter a keyword in Google's search box, it looks up the web addresses (URLs) of matching pages, which the crawler collected beforehand. Then, to browse, you access the site via HTTP. This is the protocol used by all of us to access web pages. Once the web server reads the request, the page appears on your computer screen. This happens in just a matter of seconds.

  6. #6
    Join Date
    Oct 2005
    Posts
    2,358

    Re: How does a web crawler work

    It is right to say that we depend almost entirely on a single protocol, HTTP, to access web pages. You can search for text, images, videos, etc. on the internet with the help of these crawlers. It is not easy to develop a search engine. A crawler-based search engine collects information such as the URL, title, meta tags, content, and links of each page. This collected information is then what is returned to you on your desktop when you search.
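
    As a rough illustration of collecting the URL, title, meta tags, and links from one page, here is a sketch using only Python's standard library. The class `PageInfo`, the function `extract`, and the sample HTML are hypothetical names for this example; it shows the idea, not how any particular search engine does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PageInfo(HTMLParser):
    """Collects the title, named meta tags, and absolute link URLs of a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.title = ""
        self.meta = {}
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]
        elif tag == "a" and attrs.get("href"):
            # Resolve relative links against the URL of the page itself.
            self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract(url, html):
    """Return a small record of what a crawler might keep for one page."""
    parser = PageInfo(url)
    parser.feed(html)
    parser.close()
    return {
        "url": url,
        "title": parser.title.strip(),
        "meta": parser.meta,
        "links": parser.links,
    }


# Made-up page content for demonstration.
sample_html = (
    '<html><head><title>Demo</title>'
    '<meta name="description" content="A test page"></head>'
    '<body><a href="/about">About</a>'
    '<a href="http://other.example/x">X</a></body></html>'
)
info = extract("http://example.com/index.html", sample_html)
```

    The links found this way are what the crawler follows next, and the title and meta description are the kind of text a results page can show for each hit.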

  7. #7
    lunalovegod Guest

    Re: How does a web crawler work

    A Web crawler is a computer program that browses the World Wide Web in a methodical, automated manner. Other terms for Web crawlers are ants, automatic indexers, bots, worms, Web spiders, Web robots, or (especially in the FOAF community) Web scutters.

    This process is called Web crawling or spidering. Many sites, in particular search engines, use spidering as a means of providing up-to-date data. Web crawlers are mainly used to create a copy of all the visited pages for later processing by a search engine that will index the downloaded pages to provide fast searches. Crawlers can also be used for automating maintenance tasks on a Web site, such as checking links or validating HTML code. Also, crawlers can be used to gather specific types of information from Web pages, such as harvesting e-mail addresses (usually for spam).

    A Web crawler is one type of bot, or software agent. In general, it starts with a list of URLs to visit, called the seeds. As the crawler visits these URLs, it identifies all the hyperlinks in the page and adds them to the list of URLs to visit, called the crawl frontier. URLs from the frontier are recursively visited according to a set of policies.
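
    The seed list and crawl frontier described above can be sketched roughly as follows. To keep the example runnable without a network connection, `fetch_links` is a hypothetical stand-in for "download the page and extract its hyperlinks", and `FAKE_WEB` is a made-up link graph. The only policies assumed here are first-in-first-out order and a cap on the number of pages; real crawlers add rules such as politeness delays and robots.txt checks.

```python
from collections import deque


def crawl(seeds, fetch_links, max_pages=100):
    """Visit URLs breadth-first, starting from the seeds.

    fetch_links(url) must return the list of hyperlinks found on that page.
    """
    frontier = deque(seeds)   # URLs waiting to be visited (the crawl frontier)
    seen = set(seeds)         # URLs already queued, so nothing is queued twice
    visited = []              # URLs in the order they were actually visited
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        visited.append(url)
        for link in fetch_links(url):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited


# Made-up link graph standing in for the real Web (URL -> links on that page).
FAKE_WEB = {
    "http://a.example/": ["http://a.example/1", "http://b.example/"],
    "http://a.example/1": ["http://a.example/"],
    "http://b.example/": ["http://a.example/1", "http://c.example/"],
}


def fake_fetch_links(url):
    """Stand-in for a real download-and-parse step."""
    return FAKE_WEB.get(url, [])


order = crawl(["http://a.example/"], fake_fetch_links)
```

    Swapping the queue for a priority queue, or changing how links are filtered before they are added, is how different crawl policies would be plugged into the same loop.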
