Getting into Google

an article added by: Carlos Torres at 04302007



This article is about getting your site to appear on Google search pages. I’m not talking about the Google Directory, submission to which is a simple matter also covered here. The challenge is to appear in search results based on keywords related to your site. Articles 3 and 4 focus on becoming more prominently placed on those search results pages; this article is more elementary but no less crucial for new sites.

The Three-Step Process

Many of the suggestions, tactics, and concepts discussed in this article and Articles 3 and 4 apply to both getting into Google (the first step) and improving a site’s status in Google (an ongoing project). Understanding the Google crawl (this article), networking your site, and site optimization are important topics for newcomers and veterans alike. There’s no required order in which to tackle these subjects. They are presented here in a certain order, but the topics in these three articles add up to a single process that maximizes your site’s exposure in Google. Here is a summary of the ground covered in these three articles:

-  Getting into Google. Understand how the Google spider crawls the Web and what the spider looks at. Judge whether to submit a new page manually to the index or let the spider find it. Find out how to keep material out of Google.

-  Networking your site. Develop a matrix of incoming links, which is crucial for building a higher status in Google and effective for getting into the index at the start.

-  Optimizing your site for Google. Create content, optimize your page’s meta tags, and introduce keywords as the fundamental building blocks of a highly ranked site.

These are golden topics for the serious Webmaster at all stages of business development, from conception to customer interaction.

First things first. New sites must get into Google and then work to raise their profiles. Getting into Google really means getting into the Google index, which is a database of Web content. Google builds the index by crawling through the Web collecting pages. When a user searches for keywords, Google doesn’t actually search the Web; it searches its index.

If your site already appears in Google search results, you might feel tempted to skip this article and head straight for Article 3. However, the next two sections contain useful information about Google’s behavior and ways for both new and existing sites to leverage its quirks.
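The difference between searching the Web and searching an index can be made concrete with a toy model. The sketch below is only an illustration, not Google's implementation: the page addresses, the page text, and the function names build_index and search are all made up for this example. It builds a tiny keyword index from a snapshot of crawled pages and answers queries from that index alone.

```python
from collections import defaultdict

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every query word, consulting only the index."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical snapshot taken by the spider during its last crawl.
crawled_pages = {
    "www.example.com/": "handmade oak furniture",
    "www.example.com/chairs.html": "oak chairs built by hand",
}
index = build_index(crawled_pages)

# A page published after the crawl exists on the live Web but not in the index.
live_web = dict(crawled_pages)
live_web["www.example.com/tables.html"] = "oak tables"

print(search(index, "oak"))     # both crawled pages
print(search(index, "tables"))  # empty: the new page has not been crawled yet
```

Until the index is rebuilt from a later crawl, the new page cannot appear in results, no matter how well it matches the keywords.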

Meet Google’s Pet Spider

All search engines operate in the same basic way: they crawl the Web with automatic software robots called spiders or crawlers, which create searchable indexes of Web content. Every engine allows visitors to search its index for keywords and groups of keywords. Search results come in a variety of list formats, but most display a bit of information about each Web page in the list and a link to that page. Each engine’s index is unique, thanks to the programming of its spider. The main element of that programming is the engine’s algorithm, which ranks pages in an index. This ranking determines the order in which search results are presented.

Google’s central technology asset is its algorithm: the complex ranking formula that gives people good search results and often seems to be reading people’s minds when they Google something. The results of Google’s algorithm are summarized in a single ranking statistic called PageRank. Google is secretive about the software formula from which PageRank is derived, but the company does promote the importance of PageRank and offers Webmasters broad hints for improving a site’s PageRank. Google displays a general approximation of any page’s rank (on a 0-to-10 scale) in the Google Toolbar. Although the exact formulation of PageRank is a well-protected secret, its basic ingredients are well-known (and discussed in Article 3).
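To give a feel for the basic ingredient behind PageRank, here is a minimal sketch of the simplified, publicly described idea: a page's score is fed by the scores of the pages that link to it. This is not Google's production formula, which remains secret as noted above. The link graph, the page names, the damping value of 0.85, and the iteration count are assumptions chosen for illustration, and the scores are raw fractions rather than the Toolbar's 0-to-10 scale.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated redistribution of scores along links.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its score evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical four-page Web: pages a, b, and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

In this toy graph, page c collects the most incoming links and ends up with the highest score, while b and d, which nobody links to, keep only the baseline.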

Search engine integrity

One reason pre-Google search engines declined in usefulness and popularity as Web-content portals was the emergence of paid listings. Hungry for revenue, some engines sold positions on the search results page to advertisers. This dilution of objectivity polluted search results and undermined the essential democracy of the Web. The distinction blurred between search engines, which supposedly located what you wanted, and browser channels, which sent you to the browser’s business affiliates. Even though many search engines did not accept paid placement, distrust grew among users.

Google started a renaissance of utility and trust. Google’s integrity is symbolized by its gunk-free home page, the spartan design of which lures the user with the promise of search, and nothing but search. To be sure, Google accepts advertising, and Parts II and III of this article are all about Google ads. But Google’s paid content is clearly separated from search listings. Not everyone agrees with the ranking of search results in Google, but nobody thinks that a high rank can be bought.

Timing Google’s crawl

Google crawls the Web at varying depths and on more than one schedule. The so-called deep crawl occurs roughly once a month. This extensive reconnaissance of Web content requires more than a week to complete and an undisclosed length of time after completion to build the results into the index. For this reason, it can take up to six weeks for a new page to appear in Google. Brand new sites at new domain addresses that have never been crawled before might not even be indexed at first, depending on considerations explained later in this article.

If Google relied entirely on the deep crawl, its index would quickly become outdated in the rapidly shifting Web. To stay current, Google launches various supplemental fresh crawls that skim the Web more shallowly and frequently than the deep crawl. These supplementary spiders do not update the entire index, but they freshen it by updating the content of some sites. Google does not divulge its fresh-crawling schedules or targets, but Webmasters can get an indication of the crawl’s frequency through close observation. Google has no obligation to touch any particular URL with a fresh crawl. Sites can increase their chance of being crawled often, however, by changing their content and adding pages frequently.
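The difference in depth between the two kinds of crawl can be pictured with a small model. The sketch below is an assumption-laden illustration, not a description of Google's spider: the site map, the page paths, and the max_depth parameter are invented here, since Google does not disclose its actual crawl depths. It follows links breadth-first from the home page and stops after a given number of clicks.

```python
from collections import deque

def crawl(site_links, start, max_depth=None):
    """Follow links breadth-first from start.

    Returns the set of pages reached within max_depth clicks of start.
    max_depth=None means no limit, which stands in for a deep crawl.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue  # do not follow links any deeper than the limit
        for target in site_links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return seen

# Hypothetical site: the home page links to two pages, one of which leads deeper.
site = {
    "/": ["/news.html", "/products.html"],
    "/products.html": ["/products/widget.html"],
    "/products/widget.html": ["/products/widget-specs.html"],
}

shallow = crawl(site, "/", max_depth=0)  # a dip into the home page only
deep = crawl(site, "/")                  # every page reachable by links

print(sorted(shallow))
print(sorted(deep))
```

A shallow pass picks up changes to the home page quickly, but an inner page three clicks down waits for a crawl that goes all the way.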

Remember the shallowness aspect of the fresh crawl; Google might dip into the home page of your site (the front page, or index page) but not dive into a deep exploration of the site’s inner pages. (More than once I’ve observed a new index page of my site in Google within a day of my updating it, while a new inner page added at the same time was missing.) But Google’s spider can compare previous crawl results with the current crawl, and if it learns from the top navigation page that new content is added regularly, it might start crawling the entire site during its frequent visits.

The deep crawl is more automatic and mindlessly thorough than the fresh crawl. Chances are good that in a deep crawl cycle, any URL already in the main index will be reassessed down to its last page. However, Google does not necessarily include every page of a site. As usual, the reasons and formulas involved in excluding certain pages are not divulged. The main fact to remember is that Google applies PageRank considerations to every single page, not just to domains and top pages. If a specific page is important to you and is not appearing in Google search results, your task is to apply every networking and optimization tactic described in Article 3 to that page. You may also manually submit that specific page to Google (see the next section).

The terms deep crawl and fresh crawl are widely used in the online marketing community to distinguish between the thorough spidering of the Web that Google launches approximately monthly and various intermediate crawls run at Google’s discretion. Google itself acknowledges both levels of spider activity, but is secretive about exact schedules, crawl depths, and formulas by which the company chooses crawl targets. To a large extent, targets are determined by automatic processes built into the spider’s programming, but humans at Google also direct the spider to specific destinations for various reasons, some of which are discussed in this article.

Earlier, I said that the Google index remains static between crawls. Technically, that’s true. Google matches keywords against the index, not against live Web content, so any pages put online (or modified) between visits from Google’s spider remain excluded from (or out of date in) the search results until they are crawled again. But two factors work against the index remaining unchanged for long. First, the frequency of fresh crawls keeps the index evolving in a state that Google-watchers call everflux. Second, some time is required to put crawl results into the index on Google’s thousands of servers. The irregular heaving and churning of the index that results from these two factors is called the Google dance.

To submit or not to submit

You can get your site into the Google index in two simple ways:

-  Submit the site manually

-  Let the crawl find it

Neither method offers a guarantee. Google accepts URL submissions, but it doesn’t respond to them, nor does it assure Webmasters that their submissions will be added to the index. When Google decides to add a submitted site, it does so by sending the spider crawling to the submitted URL to take stock of the site’s various pages. Characteristically, Google doesn’t inform the Webmaster that the site has been accepted, and it doesn’t provide a schedule for crawling accepted sites.

Google’s hands-off operation

Google is a reasonably communicative company in certain departments, such as AdWords, AdSense, and enterprise solutions. And Google accepts URL submissions for the index, though it doesn’t acknowledge them. But asking humans at Google to interfere with the construction of its index is an exercise in futility. Google builds its index through robotic interaction, for the most part, and prides itself on these sophisticated automated processes. Google does not correct a Webmaster’s outdated listings or make any custom change to the index. The company counts on time and thorough crawling to solve problems. Google doesn’t want to hear from you about your index issues.

The key to attracting Google’s spider is getting your page linked on other sites. Google finds your content by following links to your pages. With no incoming links (also called backlinks), you are an unreachable island as far as the Google crawl is concerned. This isolated condition is the natural state of any new site. Of course, anybody can reach you directly by entering the URL, but you won’t pluck the spider’s web until you get some other sites to link to you.

Submitting a site might not be a ticket to instant success, but at least it’s easy. Submit your URL at this address:

www.google.com/addurl.html

Fill in the form and click the Add URL button, keeping in mind that the button is misnamed. You are not adding the URL; you are submitting it. Only the spider can add your site, and only a Google human can tell it to.

If you add a page to a site already in the Google index, there’s no need to submit the new page. Under most circumstances, Google will find the new page the next time your site is crawled in its entirety.

You don’t have to choose between submitting and not submitting; do both if you’re impatient. Submitting doesn’t stop the spider from visiting you in the normal course of events, but it doesn’t encourage the spider, either. Conversely, the spider’s failure to find you doesn’t affect the disposition of your submitted request.
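Whether the crawl can find a site on its own comes down to whether any other page links to it. As a rough illustration, the sketch below inverts a small map of outgoing links into a map of backlinks and lists the pages that nothing points to. The site names and the backlinks function are hypothetical, made up for this example; real backlink data would have to come from crawling or from a link-reporting tool.

```python
def backlinks(links):
    """Invert an outgoing-link map into page -> set of pages that link to it."""
    incoming = {page: set() for page in links}
    for source, targets in links.items():
        for target in targets:
            incoming.setdefault(target, set()).add(source)
    return incoming

# Hypothetical corner of the Web. The new site links out, but nobody links in.
web = {
    "directory.example.org": ["shop.example.com"],
    "blog.example.net": ["shop.example.com", "directory.example.org"],
    "shop.example.com": ["blog.example.net"],
    "newsite.example.com": ["shop.example.com"],
}

incoming = backlinks(web)
islands = sorted(page for page, sources in incoming.items() if not sources)

print(sorted(incoming["shop.example.com"]))  # three sites link to the shop
print(islands)                               # pages a link-following spider cannot reach
```

Outgoing links do nothing for discovery; the new site stays an island until at least one crawled page links to it.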

Are you getting the idea that gaining admission to Google’s index is a crapshoot? Not really. In fact, Google’s spider is so thorough that entering the index is practically inevitable if you follow the networking suggestions in the next article. Submitting a URL manually is a crapshoot, though. My best suggestion is to submit if you must, but don’t only submit. Get to work networking your site and implementing other optimization tactics in Article 3, which will get you inside the index more quickly and push your site to a higher PageRank.

The directory route

If submitting a URL seems too uncertain and networking seems too difficult, you can get into the Web index by getting your site listed in the Google Directory.

The Google Directory is a categorized list of Web sites, built by hand. Google does not build its own directory, a fact that surprises many people. Instead, Google repurposes the large Web directory created by the Open Directory Project. The Open Directory Project (ODP) is a non-profit organization staffed by thousands of volunteer editors who accept URL submissions for their respective subject niches. Google applies PageRank to the Open Directory, thereby reordering the directory listings, and presents the whole thing in familiar Google style. Naturally, the Google spider crawls the directory, so any new directory listing is automatically added to Google’s main Web index. Submit a URL to the Open Directory Project at this address:

www.dmoz.org/add.html

When it comes to accepting submissions, the Open Directory Project does not guarantee your entry any more than Google does. With ODP, you are at the mercy of whichever editor is in charge of your most relevant category, and the chance of developing a companionable dialogue with that person is slim. Furthermore, the ODP URL-submission process is much more complicated than at Google. Finally, you can usually count on a long and indeterminate wait before your site is added. Keep checking by searching for your site in the Google Directory.

Checking your site’s status in Google

During the sometimes-long wait to be included in Google, you naturally want to know when you’ve succeeded. (So you can run through the streets yelling, “Google me! Google me!”) How do you know whether your site is in the Google index? Don’t try searching for it with general keywords; that method is hit-or-miss. You could search for an exact phrase located in your site’s text (by putting quotes around the phrase), but if the phrase is not unique you could get tons of other matches. The best bet is to simply search for your URL. Make it exact, and include the www prefix. If you’re searching for an inner page of the site, precision is likewise necessary, so remember to include the .htm or .html file extension if it exists.

When adding a page to a site already in Google, be prepared for a long wait for it to appear, especially if you don’t change your content often. If Google’s spider checks your site during only its deep crawl and the timing is off, you could tap your fingers for about six weeks before seeing the new page in search results.
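If you check on several pages regularly, the two kinds of status search described above (an exact URL and a quoted phrase) can be prepared as ready-to-open search links. The sketch below is only a convenience, and it builds the links without fetching anything. The page address and the phrase are placeholders, and the helper names are invented for this example; the only real pieces are Google's public search address and Python's standard urlencode function.

```python
from urllib.parse import urlencode

def google_search_url(query):
    """Build a Google search link for the given query text."""
    return "https://www.google.com/search?" + urlencode({"q": query})

def exact_phrase(phrase):
    """Wrap a phrase in double quotes so it is searched as an exact phrase."""
    return '"' + phrase + '"'

# Placeholder inner page: note the www prefix and the .html extension.
page = "www.example.com/products.html"
# Placeholder phrase copied from the page's text.
phrase = "hand-carved oak rocking chairs"

print(google_search_url(page))
print(google_search_url(exact_phrase(phrase)))
```

Open each link in a browser and look for your own page among the results; the URL search is the more reliable of the two.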

Indexing frustrations

Moving is hell, on land and in cyberspace. Moving your site from one URL to another and especially from one domain to another presents a vexing indexing problem. There’s a good chance that Google will continue to list your old site after you move, and even after it begins to list your new site. The Google spider is not dense. It trusts incoming links, many of which probably still point to your old location. From Google’s perspective, you haven’t really moved until you update your entire network of incoming links (which, if you take Article 3 seriously, you worked hard to establish), pointing them to your new location.

Your PageRank will drop considerably, too, until you get those backlinks up to speed. Moving is a serious consideration for any site that depends on stature in Google, and it shouldn’t be undertaken lightly or without planning.

Partial listings can also spark frustration. A partial listing occurs when Google’s spider locates your site and files the address but does not crawl all of its content. Because Google’s descriptions are quoted from the pages themselves, an uncrawled page appears on search pages with no description at all. This situation bodes ill, for descriptions often provide the motivation to click on search results. Your only recourse is to build up your PageRank to the level at which Google sniffs out all your content and provides descriptions of your pages.

