All about webtips: search engine


Friday, May 1, 2009

SEO: Google's Next Big Move archive copy

By David Leonhardt

(Will your website be ready, or will you be playing catch-up six months too late?)

November 2003 might go down in history as the month that Google shook a lot of smug webmasters and search engine optimization (SEO) specialists from the apple tree. But more than likely, it was just a precursor of the BIG shakeup to come.

Google highly touts its secret PageRank algorithm. Although PageRank is just one factor in choosing which sites appear for a specific search, it is the main way that Google determines the "importance" of a website.

In recent months, SEO specialists have become expert at manipulating PageRank, particularly through link exchanges.

There is nothing wrong with links. They make the Web a web rather than a series of isolated islands. However, PageRank relies on the naturally "democratic" nature of the web, whereby webmasters link to sites they feel are important for their visitors. Google rightly sees link exchanges designed to boost PageRank as stuffing the ballot box.

I was not surprised to see Google try to counter all the SEO efforts. In fact, I have been arguing the case with many non-believing SEO specialists over the past couple of months. But I was surprised to see the clumsy way in which Google chose to do it.

Google targeted specific search terms, including many of the most competitive and commercial terms. Many websites lost top positions for five or six terms, but maintained their positions for several others. This had never happened before. Give credit to Barry Lloyd of SearchEngineGuide.com for cleverly uncovering the process.

For Google, this shakeup is just a temporary fix. It will have to make much bigger changes if it is serious about harnessing the "democratic" nature of the Web and neutralizing the artificial results of so many link exchanges.

Here are a few techniques Google might use (remember to think like a search engine):

Google might start valuing inbound links within paragraphs much higher than links that stand on their own. (For all we know, Google is already doing this.) Such links are much less likely to be the product of a link exchange, and therefore more likely to be genuine "democratic" votes.
Google might look at the concentration of inbound links across a website. If most inbound links point to the home page, that is another possible indicator of a link exchange, or at least that the site's content is not important enough to draw inbound links (and it is content that Google wants to deliver to its searchers).
Google might take a sample of inbound links to a domain, and check to see how many are reciprocated back to the linking domains. If a high percentage are reciprocated, Google might reduce the site's PageRank accordingly. Or it might set a cut-point, dropping from its index any website with too many of its inbound links reciprocated.
Google might start valuing outbound links more highly. Two pages with 100 inbound links are, in theory, valued equally, even if one has 20 outbound links and the other has none. But why should Google send its searchers down a dead-end street, when the information highway is paved just as smoothly on a major thoroughfare?
Google might weigh a website's outbound link concentration. A website with most outbound links concentrated on just a few pages is more likely to be a "link-exchanger" than a site with links spread out across its pages.
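To make two of these guesses concrete, here is a minimal Python sketch of how the reciprocated-link check and the home-page concentration check might be computed. Nobody outside Google knows how (or whether) it measures these things, so the function names, the sample URLs and the idea of working from simple lists of URLs are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host part of a URL."""
    return urlparse(url).netloc.lower()


def reciprocation_ratio(site: str, inbound: list[str], outbound: list[str]) -> float:
    """Fraction of inbound linking domains that the site links back to."""
    inbound_domains = {domain_of(u) for u in inbound} - {site}
    outbound_domains = {domain_of(u) for u in outbound}
    if not inbound_domains:
        return 0.0
    return len(inbound_domains & outbound_domains) / len(inbound_domains)


def home_page_share(inbound_targets: list[str]) -> float:
    """Fraction of inbound links that point at the home page (path '' or '/')."""
    if not inbound_targets:
        return 0.0
    home = sum(1 for u in inbound_targets if urlparse(u).path in ("", "/"))
    return home / len(inbound_targets)


# Hypothetical sample data for a site at mysite.example.
inbound_sources = [
    "http://a.example/links.html",
    "http://b.example/post",
    "http://c.example/",
    "http://d.example/resources",
]
outbound_links = [
    "http://a.example/",
    "http://c.example/partners",
    "http://z.example/",
]
inbound_targets = [
    "http://mysite.example/",
    "http://mysite.example",
    "http://mysite.example/articles/seo.html",
    "http://mysite.example/",
]

print(reciprocation_ratio("mysite.example", inbound_sources, outbound_links))
print(home_page_share(inbound_targets))
```

In this made-up sample, half of the linking domains are linked back to and three of the four inbound links point at the home page. Where a search engine would draw the line on either number is anyone's guess.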

Google might use a combination of these techniques and ones not mentioned here. We cannot predict the exact algorithm, nor can we assume that it will remain constant. What we can do is to prepare our websites to look and act like a website would on a "democratic" Web as Google would see it.

For Google to hold its own against upstart search engines, it must deliver on its PageRank promise. Its results must reflect the "democratic" nature of the Web. Its algorithm must prod webmasters to give links on their own merit. That won't be easy or even completely possible. And people will always find ways to turn Google's algorithm to their advantage. But the techniques above can send the Internet a long way back to where Google promises it will be.

The time is now to start preparing your website for the changes to come.

Friday, January 2, 2009

How Search Engines Work

The term "search engine" is often used generically to describe both crawler-based search engines and human-powered directories. These two types of search engines gather their listings in radically different ways.

Crawler-Based Search Engines

Crawler-based search engines, such as Google, create their listings automatically. They "crawl" or "spider" the web, then people search through what they have found.

If you change your web pages, crawler-based search engines eventually find these changes, and that can affect how you are listed. Page titles, body copy and other elements all play a role.

See also: How to Get Listed In Google.

Human-Powered Directories

A human-powered directory, such as the Open Directory (see also how to submit to the Open Directory), depends on humans for its listings. You submit a short description to the directory for your entire site, or editors write one for sites they review. A search looks for matches only in the descriptions submitted.

Changing your web pages has no effect on your listing. Things that are useful for improving a listing with a search engine have nothing to do with improving a listing in a directory. The only exception is that a good site, with good content, might be more likely to get reviewed for free than a poor site.

"Hybrid Search Engines" Or Mixed Results

In the web's early days, a search engine presented either crawler-based results or human-powered listings. Today, it is extremely common for both types of results to be presented. Usually, a hybrid search engine will favor one type of listing over another. For example, MSN Search is more likely to present human-powered listings from LookSmart. However, it also presents crawler-based results (as provided by Inktomi), especially for more obscure queries.

The Parts Of A Crawler-Based Search Engine

Crawler-based search engines have three major elements. First is the spider, also called the crawler. The spider visits a web page, reads it, and then follows links to other pages within the site. This is what it means when someone refers to a site being "spidered" or "crawled." The spider returns to the site on a regular basis, such as every month or two, to look for changes.

Everything the spider finds goes into the second part of the search engine, the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider finds. If a web page changes, then this book is updated with new information.

Sometimes it can take a while for new pages or changes that the spider finds to be added to the index. Thus, a web page may have been "spidered" but not yet "indexed." Until it is indexed -- added to the index -- it is not available to those searching with the search engine.

Search engine software is the third part of a search engine. This is the program that sifts through the millions of pages recorded in the index to find matches to a search and rank them in order of what it believes is most relevant. You can learn more about how search engine software ranks web pages in the aptly named How Search Engines Determine Your Rank.
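The three parts described above can be sketched in a few lines of Python. This is a toy illustration only: the `PAGES` dictionary stands in for the web, the word-to-pages lookup is a common simplification of an index, and ranking by word count is an assumption, not how any real engine ranks pages.

```python
import re
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, links found on the page).
PAGES = {
    "/": ("Welcome to our garden shop", ["/tools", "/seeds"]),
    "/tools": ("Garden tools: spades, rakes and hoes", ["/"]),
    "/seeds": ("Vegetable seeds and flower seeds", ["/tools", "/missing"]),
}


def words_in(text: str) -> list[str]:
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def spider(start: str) -> dict[str, str]:
    """Part 1, the spider: visit a page, read it, and follow its links."""
    seen, queue, found = {start}, deque([start]), {}
    while queue:
        url = queue.popleft()
        if url not in PAGES:  # broken link: nothing to read
            continue
        text, links = PAGES[url]
        found[url] = text  # keep a copy of the page
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found


def build_index(found: dict[str, str]) -> dict[str, set[str]]:
    """Part 2, the index: record which pages contain which words."""
    index: dict[str, set[str]] = {}
    for url, text in found.items():
        for word in words_in(text):
            index.setdefault(word, set()).add(url)
    return index


def search(index: dict[str, set[str]], found: dict[str, str], query: str) -> list[str]:
    """Part 3, the search software: find matches and rank them."""
    terms = words_in(query)
    matches: set[str] = set()
    for term in terms:
        matches |= index.get(term, set())

    def score(url: str) -> int:
        page_words = words_in(found[url])
        return sum(page_words.count(term) for term in terms)

    # Highest score first; ties are broken alphabetically by URL.
    return sorted(matches, key=lambda url: (-score(url), url))


found = spider("/")
index = build_index(found)
print(search(index, found, "garden tools"))
```

Note that a page copied by `spider` cannot be found by `search` until `build_index` has run, which mirrors the gap between being "spidered" and being "indexed."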

Major Search Engines: The Same, But Different

All crawler-based search engines have the basic parts described above, but there are differences in how these parts are tuned. That is why the same search on different search engines often produces different results. Some of the significant differences between the major crawler-based search engines are summarized on the Search Engine Features Page. Information on this page has been drawn from the help pages of each search engine, along with knowledge gained from articles, reviews, books, independent research, tips from others and additional information received directly from the various search engines.

Now let's look more closely at how crawler-based search engines rank the listings that they gather.

Friday, December 26, 2008

Getting into Yahoo and Live (Formerly MSN)

Most websites get into Yahoo and Live by getting spidered from other sites. This means that if you're already linked from another site or web directory, Live's and Yahoo's robots will eventually find you when they update their search indexes.

If you don't want to wait to get spidered, you can submit directly to Live by going here http://search.msn.com/docs/submit.aspx.

Do not keep re-submitting your site. It will not speed up the process of getting listed and may even get you banned.

Yahoo's direct submission site is https://siteexplorer.search.yahoo.com/submit.

The Yahoo Directory

Not to be confused with the regular Yahoo search function, the Yahoo directory is also a place you can add your site.

When you go to Yahoo.com and enter a search into the box, you are using their regular search engine, not the directory. So it is possible to be included in Yahoo's regular search index and not be included in their directory.

The Yahoo directory is actually a categorical listing of sites located here http://dir.yahoo.com and it's not used very much by web surfers.

The real benefit to being listed here is to have a high quality link pointing to your site. Many search engines look at who is linking to you and if you have a link from the Yahoo directory, it may give you some "credibility" points.

Some believe that having a listing in the directory will help boost your rank in Google and other engines. There is no solid proof of this, however. Yahoo also states that being listed in their directory does not have any effect on your position in their regular search results.

If your site is commercial it will cost you $299 per year to be included in Yahoo's directory. So if you can justify/afford the cost, I would still recommend getting into Yahoo just to have the high-quality link pointing to your site - but don't expect a lot of traffic from the actual page you'll be listed on.

It's up to you to decide if it's worth it or not. Many webmasters do quite well in the "Big 3" search engines without a listing in the Yahoo directory.

See also: How to Get Listed In Google.

How to Get Listed In Google

There are three ways to list your site with Google, but I will warn you that using any one of these three methods no longer guarantees your site will be listed.

Google is getting more and more selective about who gets in, and the first step is ensuring that your site is full of useful, unique content.

After that, work on getting quality, relevant sites to link back to you. These days those two steps are the best ways to get your site into the almighty Google.

Having said that, here are some methods that may also get you in...

1) Get your site listed in The Open Directory - www.dmoz.org.

This is a directory that is managed by volunteers that act as "category editors." To list your site, simply go to the most appropriate category for your web site, then drill down to the relevant subcategory and select the "Add URL" link at the top of that category's page.

Wait about a month to see if your site appears. If it does not, I recommend emailing one of the category editors and asking for advice on how to get your site listed. If you're lucky, you'll receive a helpful response, but most of the category managers do not answer emails.

Sometimes it may take up to one year for an editor to review and list your site and other times it may only take a couple of weeks. Be patient and please don't keep submitting!

Unfortunately since DMOZ is run by volunteers, the time it takes to get your site reviewed really depends on the availability of the volunteers. Sometimes they may not check the submissions for weeks, which can be quite frustrating for people trying to get listed.

Lately, it seems to be more and more difficult to get in, but the good news is this is not the only way to get into Google. Years ago this was one of the fastest ways to get listed and ranked. Fortunately there are other options that are just as effective, and I'll discuss those below.

Warning: Don't try to submit to The Open Directory unless you have enough useful information on your site. If your site is only one or two pages long, then you won't likely get listed. They want medium to large sized web sites with useful and unique content. Strive for at least 15 pages.

If you need more content, articlecity.com has some free articles you can post on your site. Simply find the category that closely matches your theme and add them. Be careful, though: don't use too many articles, because the engines may penalize your site if you have too much duplicate content.

2) The second way to get listed in Google is to use their own Add URL form located here http://www.google.com/addurl.html. This method is not as dependable as listing with The Open Directory, but it can get you in. Google admits that they may not add every site, so don't be surprised if this does not work for you.

3) The third way to get listed is to be linked from another web site that is already in Google. That way, when Google's spider goes to visit that site for updates, it will pick up the link to your site and add it. This method does not always work, but many sites do get in this way.

See also: submitting your site to the Open Directory.

How Search Engines Determine Your Rank

Before you try to add your site to the search engines, you should understand what they look for when they decide how to rank your site. Just because you're listed doesn't mean you'll get traffic. You have to make sure your site is search engine ready.

The general rule of thumb is that most engines use a "formula" to determine keyword relevancy. The technical term is "algorithm", and each search engine has its own unique algorithm that it uses to rank pages.

Generally, this magic formula takes into account your page title, your overall body content, the number and quality of links pointing back to your site, how long people stay on your site, and so on.

It's important to note that every engine is different. Some may look at inbound links (the number of people linking to you); others may place more emphasis on your body content. These days, meta tag content is becoming less and less important.
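To picture what such a "formula" could look like, here is a toy Python sketch. The real algorithms are secret, so everything here is invented for illustration: the `Page` fields, the three signals chosen, and especially the weights. The only point is that two engines weighing the same signals differently can rank the same two pages in opposite order.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    body: str
    inbound_links: int


# Invented weights: a real engine's formula is secret and differs per engine.
LINK_HEAVY = {"title": 3.0, "body": 1.0, "links": 2.0}
CONTENT_HEAVY = {"title": 5.0, "body": 5.0, "links": 0.5}


def relevance(page: Page, keyword: str, weights: dict[str, float]) -> float:
    """Combine three simple signals into one score for a single keyword."""
    kw = keyword.lower()
    # Signal 1: does the keyword appear as a word in the page title?
    in_title = 1.0 if kw in page.title.lower().split() else 0.0
    # Signal 2: what share of the body's words is the keyword?
    body_words = page.body.lower().split()
    density = body_words.count(kw) / len(body_words) if body_words else 0.0
    # Signal 3: inbound links, dampened so 50 links is not 25 times better than 2.
    link_signal = math.log1p(page.inbound_links)
    return (weights["title"] * in_title
            + weights["body"] * density
            + weights["links"] * link_signal)


# Two hypothetical pages competing for the keyword "tools".
focused = Page("Cheap garden tools", "garden tools for every garden", 2)
popular = Page("Home", "we sell tools", 50)

for name, weights in (("link-heavy", LINK_HEAVY), ("content-heavy", CONTENT_HEAVY)):
    ranked = sorted((focused, popular),
                    key=lambda p: relevance(p, "tools", weights),
                    reverse=True)
    print(name, [p.title for p in ranked])
```

With the link-heavy weights the well-linked page comes out on top. With the content-heavy weights the page whose title and copy focus on the keyword wins instead.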

In case you don't know, meta tags are hidden descriptors that appear at the beginning of your HTML code, inside your <head> tag. They may be invisible to your visitor's eyes, but search engine spiders can read them.

They usually consist of a title, description, and keyword tag and they look something like this:

<head>
<title>Title of Your Site</title>
<meta name="description" content="Description of your site here.">
<meta name="keywords" content="keywords separated by commas">
</head>

Because of abuse, many search engines no longer use these tags to help rank pages, but you should still include them because they do use them to display information about your site.
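If you are curious how a spider might read these hidden tags, here is a small Python sketch using the standard library's `html.parser` module. The `HeadTagReader` class and the `SAMPLE` page are made up for this example, and real spiders are certainly more elaborate, but it shows that the title, description and keywords are easy for a program to pick out even though visitors never see them.

```python
from html.parser import HTMLParser


class HeadTagReader(HTMLParser):
    """Collect the <title> text and the named <meta> tags from a page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.meta: dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name, content = attr.get("name"), attr.get("content")
            if name and content is not None:
                self.meta[name.lower()] = content

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Only text between <title> and </title> is kept.
        if self._in_title:
            self.title += data


# Hypothetical sample page head, written out in full.
SAMPLE = """<head>
<title>Title of Your Site</title>
<meta name="description" content="Description of your site here.">
<meta name="keywords" content="keywords separated by commas">
</head>"""

reader = HeadTagReader()
reader.feed(SAMPLE)
print(reader.title)
print(reader.meta["description"])
print(reader.meta["keywords"])
```

The description picked up this way is the kind of text an engine may show under your listing, which is why the tags are still worth filling in.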

See also: Keyword Search Techniques.