Unstructured Technologies

San Francisco, CA, USA
Founded 2022

Unstructured provides an open-source ETL platform and commercial API for ingesting and preprocessing unstructured documents—such as PDFs, HTML, Word files, and images—into formats ready for use with large language models. The platform supports over 64 file types and offers connectors to enterprise data sources, enabling organizations to build retrieval-augmented generation (RAG) pipelines and agentic AI workflows at scale.

Last Updated: March 22, 2026

Current Valuation

$230M

as of March 14, 2024

Funding Summary

$65M

Total reported funding

Key People

Brian Raymond
Founder & CEO

Core OSS Projects

Open-source ETL library for converting complex documents into clean, structured formats for language models

License: Apache-2.0

Business Information

Industries

Data & Analytics

Technologies

LLMs
RAG
ETL
Unstructured Data
Data Pipelines
Document Processing

Sectors

Enterprise

Licenses

Apache-2.0
