Answer

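Baidu generally relies on a site's sitemap.xml to discover the pages it crawls and indexes, so you should generate a sitemap.xml file in the site's root directory to tell Baidu what has been updated. The sitemap's URL can be listed in robots.txt.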

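The robots.txt in the root directory looks like this. User-agent names the crawler that a group of rules applies to: * means all crawlers, Baidu's crawler is called Baiduspider, and Google's is called Googlebot. Disallow lists the pages and directories that must not be crawled. Note on formats: Google expects an XML sitemap. Baidu is often said to prefer an HTML sitemap page, but it also accepts XML sitemaps, which is the format used in the example further down.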

User-agent: *
Disallow: /admin/
Sitemap: http://domain.com/sitemap.xml
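
To sanity-check rules like these before deploying them, Python's standard urllib.robotparser module can parse the file contents and report which URLs a given crawler may fetch. This is only a minimal sketch: the robots.txt text is inlined instead of fetched, the two test URLs are made up for illustration, and Python's parser is an approximation of how Baiduspider or Googlebot actually read the file.

```python
from urllib.robotparser import RobotFileParser

# Inlined copy of the rules; a real check would read the deployed file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Sitemap: http://domain.com/sitemap.xml
"""


def load_rules(text):
    """Parse robots.txt content given as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Hypothetical URLs, used only to exercise the rules above.
blog_ok = rules.can_fetch("Baiduspider", "http://domain.com/blog/1.html")
admin_ok = rules.can_fetch("Baiduspider", "http://domain.com/admin/login")

# site_maps() is available in Python 3.8 and later.
sitemaps = rules.site_maps()

print("blog allowed:", blog_ok)
print("admin allowed:", admin_ok)
print("sitemaps:", sitemaps)
```

With these rules the blog URL is reported as fetchable, the /admin/ URL as blocked, and the Sitemap line is returned as a one-element list.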

So how do you write the sitemap?

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
    xmlns:mobile="http://www.baidu.com/schemas/sitemap-mobile/1/">
    <url>
        <loc>https://www.domain.com/</loc>
        <mobile:mobile type="pc,mobile" />
        <priority>0.8</priority>
        <lastmod>2021-05-14</lastmod>
        <changefreq>daily</changefreq>
    </url>
    <url>
        <loc>https://www.domain.com/blog/1.html</loc>
        <mobile:mobile type="pc,mobile" />
        <priority>0.8</priority>
        <lastmod>2021-05-14</lastmod>
        <changefreq>daily</changefreq>
    </url>
    <url>
        <loc>https://www.domain.com/blog/</loc>
        <mobile:mobile type="pc,mobile" />
        <priority>0.8</priority>
        <lastmod>2021-05-14</lastmod>
        <changefreq>weekly</changefreq>
    </url>
</urlset>
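
Writing this file by hand gets tedious once the site has more than a few pages, so it is usually generated. The sketch below builds the same structure with Python's standard xml.etree.ElementTree. The build_sitemap function and the PAGES list are hypothetical names introduced here for illustration, every page is assumed to be responsive (type="pc,mobile"), and the xml_declaration argument needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MOBILE_NS = "http://www.baidu.com/schemas/sitemap-mobile/1/"

# Emit xmlns="..." and xmlns:mobile="..." instead of auto-generated ns0/ns1.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("mobile", MOBILE_NS)


def build_sitemap(pages):
    """Return the sitemap as UTF-8 bytes for a list of page dicts."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page["loc"]
        # Assumption: all pages are responsive, as in the example above.
        ET.SubElement(url, f"{{{MOBILE_NS}}}mobile", {"type": "pc,mobile"})
        ET.SubElement(url, f"{{{SITEMAP_NS}}}priority").text = str(page["priority"])
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = page["lastmod"]
        ET.SubElement(url, f"{{{SITEMAP_NS}}}changefreq").text = page["changefreq"]
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical page list mirroring the hand-written example.
PAGES = [
    {"loc": "https://www.domain.com/", "priority": 0.8,
     "lastmod": "2021-05-14", "changefreq": "daily"},
    {"loc": "https://www.domain.com/blog/1.html", "priority": 0.8,
     "lastmod": "2021-05-14", "changefreq": "daily"},
    {"loc": "https://www.domain.com/blog/", "priority": 0.8,
     "lastmod": "2021-05-14", "changefreq": "weekly"},
]

xml_bytes = build_sitemap(PAGES)
print(xml_bytes.decode("utf-8"))
```

The bytes can then be written to sitemap.xml in the site root, for example with open("sitemap.xml", "wb").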

loc is the URL of the page you want Baidu to index;

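mobile is a Baidu extension that declares how the page is adapted for mobile devices; type="pc,mobile" marks a responsive page that serves both mobile and PC visitors;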

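priority sets the priority of this link relative to the other links on the same site, in the range 0.0 to 1.0. The larger the value, the higher the link's priority;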

lastmod is the date the page was last modified, written as YYYY-MM-DD;

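changefreq is a hint for how often the page changes. Valid values are "always" (changes on every access), "hourly", "daily", "weekly", "monthly", "yearly", and "never" (archived pages). A home page can use "always"; very old links, or links whose content is no longer updated, can use "yearly".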
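
The field rules above are easy to get wrong when entries come from a database or CMS, so a small check before writing the file can help. This is a rough sketch, not an official validator: validate_entry and the dict layout are hypothetical, and the list of changefreq values is the one defined by the sitemaps.org protocol, which also includes "never".

```python
from datetime import date

# changefreq values defined by the sitemaps.org protocol.
ALLOWED_CHANGEFREQ = {
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
}


def validate_entry(entry):
    """Return a list of problems found in one sitemap entry (empty if none)."""
    problems = []

    # loc must be an absolute URL.
    if not entry.get("loc", "").startswith(("http://", "https://")):
        problems.append("loc must be an absolute http(s) URL")

    # priority is optional but must lie in 0.0..1.0 when present.
    priority = entry.get("priority")
    if priority is not None and not 0.0 <= float(priority) <= 1.0:
        problems.append("priority must be between 0.0 and 1.0")

    # lastmod is optional; this sketch only accepts the date-only form.
    lastmod = entry.get("lastmod")
    if lastmod is not None:
        try:
            date.fromisoformat(lastmod)
        except ValueError:
            problems.append("lastmod must be a date like 2021-05-14")

    # changefreq is optional but must be one of the allowed keywords.
    changefreq = entry.get("changefreq")
    if changefreq is not None and changefreq not in ALLOWED_CHANGEFREQ:
        problems.append("changefreq must be one of the allowed keywords")

    return problems


good = {"loc": "https://www.domain.com/blog/", "priority": 0.8,
        "lastmod": "2021-05-14", "changefreq": "weekly"}
bad = {"loc": "/blog/", "priority": 1.5,
       "lastmod": "14/05/2021", "changefreq": "often"}

print(validate_entry(good))
print(validate_entry(bad))
```

The first entry passes with no problems, while the second trips all four checks.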
