No robots.txt file in my top level directory

PG always includes a robots.txt in added subdirectories.

But I didn’t have one for my https://mcgraphics.us.

Here is my new https://mcgraphics.us/robots.html. Does the code look OK?
```
User-agent: *
Allow: /

Sitemap: https://www.mcgraphics.us/sitemap.xml
```

Hi @kat,

Your robots.txt should look like this:

User-agent: *
Disallow:
Sitemap: https://www.mcgraphics.us/sitemap.xml

Be sure that the robots file ends with .txt! And place it in the root of your site:
robots.txt

The sitemap.xml should be in the root as well! Subdirectories don’t matter; you only use one sitemap.xml and one robots.txt. A well-made sitemap directs Google to the proper pages anyway.
Notice that the sitemap ends with .xml: sitemap.xml
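In case it helps, a minimal sitemap.xml is just a list of page URLs. Here is a rough sketch of one (the page paths below, like /about.html, are placeholders I made up, not your actual pages):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want Google to crawl -->
  <url>
    <loc>https://www.mcgraphics.us/</loc>
  </url>
  <url>
    <loc>https://www.mcgraphics.us/about.html</loc>
  </url>
</urlset>
```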


This is very helpful. Thanks for taking the time to answer.

Why Disallow: instead of Allow: /?
Interesting that only one sitemap and one robots file are needed, and only in the root.

It basically says:

User-agent: * (all sniffing bots, crawlers, etc.)
Disallow: (allow all files, because nothing is stated after it)
Sitemap: https://www.mcgraphics.us/sitemap.xml (from this sitemap.xml file)

Google sees the links to the pages in the sitemap.xml; that’s why it’s there, no matter what directory the pages are in!

In my country we have websites in the country’s three languages. I make each language in its own folder with its own index, and one index in the root as the landing page. Only one sitemap.xml is needed, and one robots.txt pointing to it.
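To sketch the idea (the domain and the folder names /en/, /fr/, /nl/ below are hypothetical, just for illustration): the single sitemap.xml in the root simply lists the pages from every language folder, for example:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- Landing page in the root -->
  <url><loc>https://www.example.com/</loc></url>
  <!-- Each language has its own folder and index, all listed in this one sitemap -->
  <url><loc>https://www.example.com/en/index.html</loc></url>
  <url><loc>https://www.example.com/fr/index.html</loc></url>
  <url><loc>https://www.example.com/nl/index.html</loc></url>
</urlset>
```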

You can test both files for your website with https://search.google.com/search-console, where you get your site indexed for Google search.

I had a look at your site, and most importantly I would take care of all the HTML errors on the pages.

Disallow: means allow all, funny.
https://search.google.com/search-console won’t fetch my sitemap.xml. This is why I started checking my robots.txt.
A Google FAQ says a sitemap isn’t needed for small sites like mine.

Any suggestions on how to find my HTML errors?

Thanks for your help and the info on how you set up your multi-language site.

To exclude all robots from the entire server:

User-agent: *
Disallow: /

To allow all robots complete access:

User-agent: *
Disallow:
Sitemap: https://www.mcgraphics.us/sitemap.xml

Of course you can use Allow, but my example does the same! It’s robots.txt syntax, not normal English :wink:

For HTML testing in Pinegrow you can go to Page > Check for HTML errors with a web page opened.

Or install this plugin in the Chrome browser, click the small silver wheel at the top right of the browser, and choose Validate HTML: Web Developer - Chrome Web Store


[quote=“AllMediaLab, post:6, topic:4854”]
For HTML testing in Pinegrow you can go to Page > Check for HTML errors with a web page opened.

Or install this plugin in the Chrome browser, click the small silver wheel at the top right of the browser, and choose Validate HTML: Web Developer - Chrome
[/quote]

This is great. I’m on it. Thanks again for your help.
Kat

This is new to me. As far as I know PG does not create or add a robots.txt itself?

True, Pinegrow does not create any robots.txt

But I think that would be a great new feature. Anyone who wants to create this (and the sitemap) themselves should still be able to do so. But it would save a lot of time if you could just say which parts of the site you want to block and let PG do the rest.

Great!!
Thanks, I had no idea that the FF dev tools were kicking about.
Fab!

I now use the Brave browser and have added it to that.
This browser is pretty cool.
A sort of tightened-up, security-focused version of Chrome.
I’ve been using it for months now. I recommend it, and this is my first extension :slight_smile: Cheers @AllMediaLab