FlowiseAI
English
English
  • Introduction
  • Get Started
  • Contribution Guide
    • Building Node
  • API Reference
    • Assistants
    • Attachments
    • Chat Message
    • Chatflows
    • Document Store
    • Feedback
    • Leads
    • Ping
    • Prediction
    • Tools
    • Upsert History
    • Variables
    • Vector Upsert
  • CLI Reference
    • User
  • Using Flowise
    • Agentflow V2
    • Agentflow V1 (Deprecating)
      • Multi-Agents
      • Sequential Agents
        • Video Tutorials
    • API
    • Analytic
      • Arize
      • Langfuse
      • Lunary
      • Opik
      • Phoenix
    • Document Stores
    • Embed
    • Monitoring
    • Streaming
    • Uploads
    • Variables
    • Workspaces
    • Evaluations
  • Configuration
    • Auth
      • Application
      • Flows
    • Databases
    • Deployment
      • AWS
      • Azure
      • Alibaba Cloud
      • Digital Ocean
      • Elestio
      • GCP
      • Hugging Face
      • Kubernetes using Helm
      • Railway
      • Render
      • Replit
      • RepoCloud
      • Sealos
      • Zeabur
    • Environment Variables
    • Rate Limit
    • Running Flowise behind company proxy
    • SSO
    • Running Flowise using Queue
    • Running in Production
  • Integrations
    • LangChain
      • Agents
        • Airtable Agent
        • AutoGPT
        • BabyAGI
        • CSV Agent
        • Conversational Agent
        • Conversational Retrieval Agent
        • MistralAI Tool Agent
        • OpenAI Assistant
          • Threads
        • OpenAI Function Agent
        • OpenAI Tool Agent
        • ReAct Agent Chat
        • ReAct Agent LLM
        • Tool Agent
        • XML Agent
      • Cache
        • InMemory Cache
        • InMemory Embedding Cache
        • Momento Cache
        • Redis Cache
        • Redis Embeddings Cache
        • Upstash Redis Cache
      • Chains
        • GET API Chain
        • OpenAPI Chain
        • POST API Chain
        • Conversation Chain
        • Conversational Retrieval QA Chain
        • LLM Chain
        • Multi Prompt Chain
        • Multi Retrieval QA Chain
        • Retrieval QA Chain
        • Sql Database Chain
        • Vectara QA Chain
        • VectorDB QA Chain
      • Chat Models
        • AWS ChatBedrock
        • Azure ChatOpenAI
        • NVIDIA NIM
        • ChatAnthropic
        • ChatCohere
        • Chat Fireworks
        • ChatGoogleGenerativeAI
        • Google VertexAI
        • ChatHuggingFace
        • ChatLocalAI
        • ChatMistralAI
        • IBM Watsonx
        • ChatOllama
        • ChatOpenAI
        • ChatTogetherAI
        • GroqChat
      • Document Loaders
        • Airtable
        • API Loader
        • Apify Website Content Crawler
        • BraveSearch Loader
        • Cheerio Web Scraper
        • Confluence
        • Csv File
        • Custom Document Loader
        • Document Store
        • Docx File
        • Epub File
        • Figma
        • File
        • FireCrawl
        • Folder
        • GitBook
        • Github
        • Google Drive
        • Google Sheets
        • Jira
        • Json File
        • Json Lines File
        • Microsoft Excel
        • Microsoft Powerpoint
        • Microsoft Word
        • Notion
        • PDF Files
        • Plain Text
        • Playwright Web Scraper
        • Puppeteer Web Scraper
        • S3 File Loader
        • SearchApi For Web Search
        • SerpApi For Web Search
        • Spider - web search & crawler
        • Text File
        • Unstructured File Loader
        • Unstructured Folder Loader
      • Embeddings
        • AWS Bedrock Embeddings
        • Azure OpenAI Embeddings
        • Cohere Embeddings
        • Google GenerativeAI Embeddings
        • Google VertexAI Embeddings
        • HuggingFace Inference Embeddings
        • LocalAI Embeddings
        • MistralAI Embeddings
        • Ollama Embeddings
        • OpenAI Embeddings
        • OpenAI Embeddings Custom
        • TogetherAI Embedding
        • VoyageAI Embeddings
      • LLMs
        • AWS Bedrock
        • Azure OpenAI
        • Cohere
        • GoogleVertex AI
        • HuggingFace Inference
        • Ollama
        • OpenAI
        • Replicate
      • Memory
        • Buffer Memory
        • Buffer Window Memory
        • Conversation Summary Memory
        • Conversation Summary Buffer Memory
        • DynamoDB Chat Memory
        • MongoDB Atlas Chat Memory
        • Redis-Backed Chat Memory
        • Upstash Redis-Backed Chat Memory
        • Zep Memory
      • Moderation
        • OpenAI Moderation
        • Simple Prompt Moderation
      • Output Parsers
        • CSV Output Parser
        • Custom List Output Parser
        • Structured Output Parser
        • Advanced Structured Output Parser
      • Prompts
        • Chat Prompt Template
        • Few Shot Prompt Template
        • Prompt Template
      • Record Managers
      • Retrievers
        • Extract Metadata Retriever
        • Custom Retriever
        • Cohere Rerank Retriever
        • Embeddings Filter Retriever
        • HyDE Retriever
        • LLM Filter Retriever
        • Multi Query Retriever
        • Prompt Retriever
        • Reciprocal Rank Fusion Retriever
        • Similarity Score Threshold Retriever
        • Vector Store Retriever
        • Voyage AI Rerank Retriever
      • Text Splitters
        • Character Text Splitter
        • Code Text Splitter
        • Html-To-Markdown Text Splitter
        • Markdown Text Splitter
        • Recursive Character Text Splitter
        • Token Text Splitter
      • Tools
        • BraveSearch API
        • Calculator
        • Chain Tool
        • Chatflow Tool
        • Custom Tool
        • Exa Search
        • Gmail
        • Google Calendar
        • Google Custom Search
        • Google Drive
        • Google Sheets
        • Microsoft Outlook
        • Microsoft Teams
        • OpenAPI Toolkit
        • Code Interpreter by E2B
        • Read File
        • Request Get
        • Request Post
        • Retriever Tool
        • SearchApi
        • SearXNG
        • Serp API
        • Serper
        • Tavily
        • Web Browser
        • Write File
      • Vector Stores
        • AstraDB
        • Chroma
        • Couchbase
        • Elastic
        • Faiss
        • In-Memory Vector Store
        • Milvus
        • MongoDB Atlas
        • OpenSearch
        • Pinecone
        • Postgres
        • Qdrant
        • Redis
        • SingleStore
        • Supabase
        • Upstash Vector
        • Vectara
        • Weaviate
        • Zep Collection - Open Source
        • Zep Collection - Cloud
    • LiteLLM Proxy
    • LlamaIndex
      • Agents
        • OpenAI Tool Agent
        • Anthropic Tool Agent
      • Chat Models
        • AzureChatOpenAI
        • ChatAnthropic
        • ChatMistral
        • ChatOllama
        • ChatOpenAI
        • ChatTogetherAI
        • ChatGroq
      • Embeddings
        • Azure OpenAI Embeddings
        • OpenAI Embedding
      • Engine
        • Query Engine
        • Simple Chat Engine
        • Context Chat Engine
        • Sub-Question Query Engine
      • Response Synthesizer
        • Refine
        • Compact And Refine
        • Simple Response Builder
        • Tree Summarize
      • Tools
        • Query Engine Tool
      • Vector Stores
        • Pinecone
        • SimpleStore
    • Utilities
      • Custom JS Function
      • Set/Get Variable
      • If Else
      • Sticky Note
    • External Integrations
      • Zapier Zaps
  • Migration Guide
    • Cloud Migration
    • v1.3.0 Migration Guide
    • v1.4.3 Migration Guide
    • v2.1.4 Migration Guide
  • Tutorials
    • RAG
    • Agentic RAG
    • SQL Agent
    • Agent as Tool
    • Interacting with API
  • Use Cases
    • Calling Children Flows
    • Calling Webhook
    • Interacting with API
    • Multiple Documents QnA
    • SQL QnA
    • Upserting Data
    • Web Scrape QnA
  • Flowise
    • Flowise GitHub
    • Flowise Cloud
Powered by GitBook
On this page
  • Features
  • How It Works
  • Parameters
  • Required Parameters
  • Outputs
  • Document Output
  • Text Output
  • Database Integration
  • Document Structure
  • Usage Examples
  • Basic Store Selection
  • Accessing Document Content
  • Best Practices
  • Notes
Edit on GitHub
  1. Integrations
  2. LangChain
  3. Document Loaders

Document Store

Load data from pre-configured document stores.

PreviousCustom Document LoaderNextDocx File

Last updated 7 days ago

The Document Store loader enables you to load data from pre-configured document stores in your database. This loader provides a convenient way to access and utilize previously processed and stored documents in your workflows.

Features

  • Load documents from synchronized stores

  • Automatic metadata handling

  • Multiple output formats

  • Asynchronous store selection

  • Database integration

  • Chunk-based document retrieval

  • JSON metadata support

How It Works

  1. Store Selection:

    • Lists all available document stores that are in 'SYNC' status

    • Provides store information including name and description

    • Allows selection from synchronized stores only

  2. Document Retrieval:

    • Fetches document chunks from the selected store

    • Reconstructs documents with original metadata

    • Maintains document structure and relationships

Parameters

Required Parameters

  • Select Store: Choose from available synchronized document stores

    • Displays store name and description

    • Only shows stores in 'SYNC' status

    • Dynamically updated based on database content

Outputs

The loader provides two output formats:

Document Output

Returns an array of document objects, each containing:

  • pageContent: The actual content of the document chunk

  • metadata: Original document metadata in JSON format

Text Output

Returns a concatenated string containing:

  • All document chunks' content

  • Separated by newlines

  • Properly escaped characters

Database Integration

The loader integrates with your database through:

  • TypeORM data source connection

  • Document store entity management

  • Chunk-based storage and retrieval

  • Metadata preservation

Document Structure

Each loaded document contains:

{
  pageContent: string,    // The actual content
  metadata: {            // Parsed JSON metadata
    // Original document metadata
    // Store-specific information
    // Custom metadata fields
  }
}

Usage Examples

Basic Store Selection

{
  "selectedStore": "store-id-123"
}

Accessing Document Content

// Document output format
[
  {
    "pageContent": "Document content here...",
    "metadata": {
      "source": "original-file.pdf",
      "page": 1,
      "category": "reports"
    }
  }
]

// Text output format
"Document content here...\nNext document content here...\n"

Best Practices

  1. Ensure stores are synchronized before access

  2. Choose appropriate output format for your use case

  3. Handle metadata appropriately in your workflow

  4. Consider chunk size when processing large documents

  5. Monitor database performance with large stores

Notes

  • Only synchronized stores are available for selection

  • Metadata is automatically parsed from JSON

  • Documents are reconstructed from chunks

  • Supports both document and text output formats

  • Integrates with TypeORM for database access

  • Handles escape characters in text output

  • Maintains original document structure

This section is a work in progress. We appreciate any help you can provide in completing this section. Please check our to get started.

Contribution Guide
Document Store Node