Data types
🗺️ Sitemap
Add all web pages listed in an XML sitemap. Non-text files are filtered out. Set data_type to 'sitemap'. For example:
from embedchain import App
app = App()
app.add('https://example.com/sitemap.xml', data_type='sitemap')
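To make "add all web pages" and "filters non-text files" more concrete, here is a minimal sketch of how page URLs could be pulled from a sitemap and filtered. It is not embedchain's actual loader: the function name `extract_text_page_urls`, the `NON_TEXT_EXTENSIONS` list, and the sample sitemap are all assumptions made for illustration, and the real filtering rules may differ.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical extension list; the real loader may filter differently.
NON_TEXT_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".pdf", ".mp4", ".zip")


def extract_text_page_urls(sitemap_xml):
    """Return the <loc> URLs from a sitemap, skipping non-text files."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Tags are namespaced, e.g. "{http://...}loc"; compare the local name only.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name != "loc" or not element.text:
            continue
        url = element.text.strip()
        # Decide by the file extension of the URL path (case-insensitive).
        path = urlparse(url).path.lower()
        if not path.endswith(NON_TEXT_EXTENSIONS):
            urls.append(url)
    return urls


# Illustrative sitemap content, not fetched from a real site.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs/intro</loc></url>
  <url><loc>https://example.com/logo.png</loc></url>
</urlset>"""

if __name__ == "__main__":
    for page_url in extract_text_page_urls(SAMPLE_SITEMAP):
        print(page_url)
```

In this sketch the image URL is dropped and the two page URLs are kept. With embedchain itself none of this is needed, since passing data_type='sitemap' to app.add handles the sitemap for you.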