[CALUG] robots.txt

Tue Jan 11 10:40:35 EST 2011

Hi Walt,

On Tue, Jan 11, 2011 at 07:03:50AM -0800, Walt Smith wrote:
> 
> Hi,
> 
> I have an account over at lonestar.
> I've seen tidbits about a file used apparently
> for search engines called robots.txt.  I don't know
> in what "domain" it's used exactly.
> 
> Just wondering if I could get a short synopsis
> from someone about whether that file is used as 
> a site file, or whether a user can use it to 
> "encourage" searches at his web page?  i.e.
> have their own robots.txt file.   Of course I 

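robots.txt is a plain text file served from the top level of your web
site (e.g. http://www.example.com/robots.txt) that tells web
crawlers/scrapers how you wish them to behave on your site.  Whether they
respect your wishes is up to them, though.  AFAIK crawlers only look for
it at the top level of the site, so a copy in a personal directory is
ignored; the site admin would have to add rules for your pages.  You
can't encourage searches with robots.txt, but you can ask crawlers to
leave you alone.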
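
To make that concrete, here is a rough sketch of how a well-behaved
crawler might interpret such a file, using Python's standard
urllib.robotparser module.  The file contents, the "ExampleBot" name and
the example.com URLs are all made up for illustration; they are not
anything lonestar actually serves.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents: ask every crawler to stay out of /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(text):
    """Parse robots.txt text the way a polite crawler would."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(SAMPLE_ROBOTS_TXT)

# "ExampleBot" is an invented crawler name; the URLs are placeholders.
print(parser.can_fetch("ExampleBot", "http://www.example.com/index.html"))
print(parser.can_fetch("ExampleBot", "http://www.example.com/private/notes.html"))
```

Note that the file only contains exclusion rules; there is nothing in it
that would make a crawler visit a page more eagerly.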

> would assume a site file would have a narrowing precedence
> over a user account... (can I add a robots.txt file
> for a personal web page in a *nix style system?)
> this is all assuming I actually have some idea
> what the file is used for.

AFAIK, the OS doesn't matter; the file is served by the web server like
any other page.

> 
> also, in the html web page file, are keywords put into 
> some html top section <meta something ??> still used and
> is it really useful for search engines ?
> 
> I'm not so much interested in overtly trying to push or advertise,
> but I don't want to discourage a search engine for a particular
> web page; i.e. to let them find something.

The last time I checked, including pertinent information in meta tags
(description, keywords) was still recommended.  I'm not sure it makes a
big difference to search results one way or another, though.
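
For reference, the tags in question sit in the <head> section of the
page.  Below is a small sketch, using Python's standard html.parser
module, that pulls them out of a page roughly the way an indexer could.
The sample page and the MetaCollector class are invented for
illustration; how any real search engine weighs these tags is not
something this shows.

```python
from html.parser import HTMLParser

# Hypothetical page with the two most common meta tags.
PAGE = """\
<html><head>
<title>Example page</title>
<meta name="description" content="Notes on a hypothetical hobby project">
<meta name="keywords" content="example, hobby, notes">
</head><body></body></html>
"""


class MetaCollector(HTMLParser):
    """Collect <meta name=... content=...> pairs into a dict."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Attribute values may be None when written without a value.
            self.meta[name.lower()] = attributes.get("content") or ""


collector = MetaCollector()
collector.feed(PAGE)
print(collector.meta)
```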

John