<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Search Engine Journal &#187; Crawling</title>
	<atom:link href="http://www.searchenginejournal.org/tag/crawling/feed" rel="self" type="application/rss+xml" />
	<link>http://www.searchenginejournal.org</link>
	<description></description>
	<lastBuildDate>Wed, 28 Dec 2011 01:41:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Guidelines for Webmasters</title>
		<link>http://www.searchenginejournal.org/765.html</link>
		<comments>http://www.searchenginejournal.org/765.html#comments</comments>
		<pubDate>Tue, 14 Jul 2009 13:32:17 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[SEO faqs]]></category>
		<category><![CDATA[Crawling]]></category>
		<category><![CDATA[Google crawler]]></category>
		<category><![CDATA[Google Webmaster Tools]]></category>
		<category><![CDATA[search engines]]></category>

		<guid isPermaLink="false">http://searchenginejournal.org/?p=765</guid>
		<description><![CDATA[When your site is ready: Submit a Sitemap using Google Webmaster Tools. Google uses your Sitemap to learn about the structure of your site and to increase our coverage of your webpages. Make sure all the sites that should know about your pages are aware your site is online. Design and content guidelines: Make a site with a clear hierarchy [...]]]></description>
			<content:encoded><![CDATA[<p>When your site is ready:<br />
Submit a Sitemap using Google Webmaster Tools. Google uses your Sitemap to learn about the structure of your site and to increase our coverage of your webpages.</p>
<p>Make sure all the sites that should know about your pages are aware your site is online.<br />
Design and content guidelines:</p>
<p>Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link.</p>
<p>Offer a site map to your users with links that point to the important parts of your site. If the site map is larger than 100 or so links, you may want to break the site map into separate pages.</p>
<p>Create a useful, information-rich site and write pages that clearly and accurately describe your content.</p>
<p>Think about the words users would type to find your pages, and make sure that your site actually includes those words within it.</p>
<p>Try to use text instead of images to display important names, content, or links. The Google crawler does not recognize text contained in images. If you must use images for textual content, consider using the &#8220;ALT&#8221; attribute to include a few words of descriptive text.</p>
<p>Make sure that your &lt;title&gt; elements and ALT attributes are descriptive and accurate.</p>
<p>Check for broken links and correct HTML.</p>
<p>If you decide to use dynamic pages (i.e., the URL contains a &#8220;?&#8221; character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few.</p>
<p>Keep the links on a given page to a reasonable number (fewer than 100).</p>
<p>Review our image guidelines for best practices on publishing images.<br />
Technical guidelines:</p>
<p>Use a text browser such as Lynx to examine your site, because most search engine spiders see your site much as Lynx would. If fancy features such as JavaScript, cookies, session IDs, frames, DHTML, or Flash keep you from seeing all of your site in a text browser, then search engine spiders may have trouble crawling your site.</p>
<p>Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page.</p>
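<p>To illustrate why session IDs create URLs that look different but point to the same page, here is a minimal Python sketch of URL canonicalization. The parameter names in TRACKING_PARAMS and the example.com URLs are assumptions chosen for illustration, not anything a particular crawler is known to use.</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; a real site would list its own.
TRACKING_PARAMS = {"sessionid", "sid", "phpsessid", "ref"}


def canonical_url(url):
    """Return url without tracking parameters, with the remaining ones sorted."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    # The fragment is dropped as well, since it is never sent to the server.
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path, urlencode(kept), "")
    )


# Two visits to the same page with different session IDs.
a = canonical_url("http://example.com/page?id=7&sessionid=abc123")
b = canonical_url("http://EXAMPLE.com/page?sessionid=xyz789&id=7")
print(a)
print(a == b)
```

<p>Both calls reduce to the same string, which is the comparison a bot cannot make reliably when it does not know which parameters are safe to ignore. Serving bots URLs without such parameters avoids the problem entirely.</p>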
<p>Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.</p>
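<p>As a rough sketch of what supporting If-Modified-Since means on the server side, the Python function below decides between a full response (200) and an empty Not Modified response (304). The function name conditional_status and the sample timestamp are assumptions for illustration; real web servers and frameworks usually handle this for you.</p>

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, last_modified):
    """Return 304 when the client's copy is still current, otherwise 200."""
    if if_modified_since is None:
        return 200  # No header: the client has no cached copy.
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # Unparseable header: fall back to a full response.
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so compare whole seconds.
    if since >= last_modified.replace(microsecond=0):
        return 304
    return 200


# Hypothetical modification time of a page.
last_modified = datetime(2009, 7, 14, 13, 32, 17, tzinfo=timezone.utc)
header = format_datetime(last_modified, usegmt=True)

print(header)
print(conditional_status(header, last_modified))  # crawler already has this version
print(conditional_status(None, last_modified))    # first visit, no header sent
```

<p>A 304 response carries no body, which is where the bandwidth saving comes from.</p>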
<p>Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it&#8217;s current for your site so that you do not accidentally block the Googlebot crawler. Visit http://www.robotstxt.org/faq.html to learn how to instruct robots when they visit your site. You can test your robots.txt file to make sure you&#8217;re using it correctly with the robots.txt analysis tool available in Google Webmaster Tools.</p>
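<p>One way to check a robots.txt file locally is Python&#8217;s standard urllib.robotparser module. The rules and paths below are hypothetical, and this parser is only an approximation of how any given crawler interprets the file, so treat it as a quick sanity check alongside the analysis tool mentioned above.</p>

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical paths on an example site.
for path in ("/", "/articles/765.html", "/search?q=crawling", "/private/notes.html"):
    allowed = parser.can_fetch("Googlebot", "http://www.example.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

<p>Running a check like this before deploying a new robots.txt makes it harder to block the pages you want crawled by accident.</p>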
<p>If your company buys a content management system, make sure that the system creates pages and links that search engines can crawl.</p>
<p>Use robots.txt to prevent crawling of search results pages or other auto-generated pages that do not add much value for users coming from search engines.</p>
<p>Test your site to make sure that it appears correctly in different browsers.<br />
Quality guidelines:<br />
These quality guidelines cover the most common forms of deceptive or manipulative behavior, but Google may respond negatively to other misleading practices not listed here (e.g., tricking users by registering misspellings of well-known websites). It&#8217;s not safe to assume that just because a specific deceptive technique is not included on this page, Google approves of it. Webmasters who spend their energies upholding the spirit of the basic principles will provide a much better user experience and subsequently enjoy better ranking than those who spend their time looking for loopholes they can exploit.</p>
<p>If you believe that another site is abusing Google&#8217;s quality guidelines, please report that site at https://www.google.com/webmasters/tools/spamreport. Google prefers developing scalable and automated solutions to problems, so we attempt to minimize hand-to-hand spam fighting. The spam reports we receive are used to create scalable algorithms that recognize and block future spam attempts.</p>
<p>Quality guidelines &#8211; basic principles:<br />
Make pages primarily for users, not for search engines. Do not deceive your users or present different content to search engines than you display to users, which is commonly referred to as &#8220;cloaking.&#8221;</p>
<p>Avoid tricks intended to improve search engine rankings. A good rule of thumb is whether you&#8217;d feel comfortable explaining what you&#8217;ve done to a website that competes with you. Another useful test is to ask, &#8220;Does this help my users? Would I do this if search engines did not exist?&#8221;</p>
<p>Do not participate in link schemes designed to increase your site&#8217;s ranking or PageRank. In particular, avoid links to web spammers or &#8220;bad neighborhoods&#8221; on the web, as your own rankings may be affected adversely by those links.</p>
<p>Do not use unauthorized computer programs to submit pages, check rankings, etc. Such programs consume computing resources and violate our Terms of Service. Google does not recommend the use of products such as WebPosition Gold™ that send automatic or programmatic queries to Google.</p>
<p>Quality guidelines &#8211; specific guidelines:<br />
Avoid hidden text or hidden links.</p>
<p>Do not use cloaking or sneaky redirects.</p>
<p>Do not send automated queries to Google.</p>
<p>Do not load pages with irrelevant keywords.</p>
<p>Do not create multiple pages, subdomains, or domains with substantially duplicate content.</p>
<p>Do not create pages with malicious behavior, such as phishing or installing viruses, trojans, or other badware.</p>
<p>Avoid &#8220;doorway&#8221; pages created just for search engines, or other &#8220;cookie cutter&#8221; approaches such as affiliate programs with little or no original content.</p>
<p>If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.</p>
<p>If you determine that your site does not meet these guidelines, you can modify your site so that it does and then submit your site for reconsideration.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.searchenginejournal.org/765.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Basics About Google</title>
		<link>http://www.searchenginejournal.org/760.html</link>
		<comments>http://www.searchenginejournal.org/760.html#comments</comments>
		<pubDate>Tue, 14 Jul 2009 12:59:58 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Google]]></category>
		<category><![CDATA[Crawling]]></category>
		<category><![CDATA[Google index]]></category>
		<category><![CDATA[Google search]]></category>
		<category><![CDATA[GoogleBot]]></category>
		<category><![CDATA[search results]]></category>
		<category><![CDATA[searching the web]]></category>

		<guid isPermaLink="false">http://searchenginejournal.org/?p=760</guid>
		<description><![CDATA[When you sit down at your computer and do a Google search, you&#8217;re almost instantly presented with a list of results from all over the web. How does Google find web pages matching your query, and determine the order of search results? In the simplest terms, you could think of searching the web as looking in a [...]]]></description>
			<content:encoded><![CDATA[<p>When you sit down at your computer and do a Google search, you&#8217;re almost instantly presented with a list of results from all over the web. How does Google find web pages matching your query, and determine the order of search results?</p>
<p>In the simplest terms, you could think of searching the web as looking in a very large book with an impressive index telling you exactly where everything is located. When you perform a Google search, our programs check our index to determine the most relevant search results to be returned (&#8220;served&#8221;) to you.</p>
<p>Crawling:<br />
Crawling is the process by which Googlebot discovers new and updated pages to be added to the Google index.<br />
We use a huge set of computers to fetch (or &#8220;crawl&#8221;) billions of pages on the web. The program that does the fetching is called Googlebot (also known as a robot, bot, or spider). Googlebot uses an algorithmic process: computer programs determine which sites to crawl, how often, and how many pages to fetch from each site.</p>
<p>Google&#8217;s crawl process begins with a list of web page URLs, generated from previous crawl processes, and augmented with Sitemap data provided by webmasters. As Googlebot visits each of these websites it detects links on each page and adds them to its list of pages to crawl. New sites, changes to existing sites, and dead links are noted and used to update the Google index.</p>
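<p>The loop described above (start from a list of URLs, detect links on each page, add new ones to the list, note dead links) can be sketched in Python as a breadth-first crawl. This is a toy model under stated assumptions, not Googlebot&#8217;s actual implementation: the PAGES dictionary stands in for the web so the example runs without network access, and all URLs are hypothetical.</p>

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "web" so the sketch runs without network access.
PAGES = {
    "http://example.com/": '<a href="/a.html">A</a> <a href="/b.html">B</a>',
    "http://example.com/a.html": '<a href="/b.html">B</a> <a href="/gone.html">Gone</a>',
    "http://example.com/b.html": '<a href="/">Home</a>',
}


class LinkCollector(HTMLParser):
    """Collect href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def crawl(seeds):
    """Breadth-first crawl; returns (pages visited in order, dead links)."""
    frontier = deque(seeds)  # the "list of pages to crawl"
    seen = set(seeds)
    visited, dead = [], []
    while frontier:
        url = frontier.popleft()
        html = PAGES.get(url)
        if html is None:
            dead.append(url)  # noted so an index could drop the page
            continue
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited, dead


visited, dead = crawl(["http://example.com/"])
print("visited:", visited)
print("dead links:", dead)
```

<p>The seen set is what keeps the crawl from revisiting pages that link to each other in a cycle.</p>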
<p>Google does not accept payment to crawl a site more frequently, and we keep the search side of our business separate from our revenue-generating AdWords service.</p>
<p>Indexing:<br />
Googlebot processes each of the pages it crawls in order to compile a massive index of all the words it sees and their location on each page. In addition, we process information included in key content tags and attributes, such as Title tags and ALT attributes. Googlebot can process many, but not all, content types. For example, we cannot process the content of some rich media files or dynamic pages.</p>
<p>Serving results:<br />
When a user enters a query, our machines search the index for matching pages and return the results we believe are the most relevant to the user. Relevancy is determined by over 200 factors, one of which is the PageRank for a given page. PageRank is the measure of the importance of a page based on the incoming links from other pages. In simple terms, each link to a page on your site from another site adds to your site&#8217;s PageRank. Not all links are equal: Google works hard to improve the user experience by identifying spam links and other practices that negatively impact search results. The best types of links are those that are given based on the quality of your content.</p>
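<p>The idea that incoming links add to a page&#8217;s importance can be illustrated with the textbook power-iteration form of PageRank. The Python sketch below is an assumption-laden simplification: the damping factor of 0.85, the iteration count, and the four-page graph are illustrative choices, and nothing here reflects how Google computes rankings in production.</p>

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration over a dict mapping each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page with no outgoing links spreads its rank evenly.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: three pages link to "home".
graph = {
    "home": ["about"],
    "about": ["home"],
    "blog": ["home"],
    "contact": ["home"],
}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

<p>In this toy graph the page with the most incoming links ends up with the highest score, and pages with no incoming links keep only the baseline share.</p>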
<p>In order for your site to rank well in search results pages, it&#8217;s important to make sure that Google can crawl and index your site correctly. Our Webmaster Guidelines outline some best practices that can help you avoid common pitfalls and improve your site&#8217;s ranking.</p>
<p>Google&#8217;s Related Searches, Spelling Suggestions, and Google Suggest features are designed to help users save time by displaying related terms, common misspellings, and people&#8217;s queries. Like our google.com search results, the keywords used by these features are automatically generated by our web crawlers and search algorithms. We only display these suggestions when we think they might save the user time. If a site ranks well for a keyword, it&#8217;s because we&#8217;ve algorithmically determined that its content is more relevant to the user&#8217;s query.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.searchenginejournal.org/760.html/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

