RAG_DATA_PREP
v1.0.0
[SYS] Advanced data preparation workflow for RAG (Retrieval Augmented Generation) systems
CORE_FEATURES:
Multi-Source Processing
> URL content extraction
> YouTube transcript retrieval
> PDF, CSV, audio, text processing
Intelligent Tagging
> Content categorization
> Automated metadata generation
> Semantic classification
FAQ Generation
> Automatic question creation
> Content-based answers
> Query optimization
Vector DB Ready
> Optimal chunk sizing
> JSON structure formatting
> Enhanced retrieval preparation
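As a rough illustration of the "semantic paragraphs" chunking idea, the sketch below splits text on blank lines and packs whole paragraphs into chunks under a size limit. The function name `chunk_paragraphs` and the `max_chars` limit are assumptions made for this example; the template's actual chunking logic is not shown in this listing and may differ.

```python
import re


def chunk_paragraphs(text: str, max_chars: int = 1000) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars characters.

    Hypothetical helper: the size limit and the blank-line paragraph rule
    are illustrative assumptions, not the template's real settings.
    """
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for para in paragraphs:
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # An oversized paragraph is kept whole rather than cut mid-sentence.
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraphs intact means a chunk can exceed `max_chars` when a single paragraph is longer than the limit; a stricter splitter would need a sentence-level fallback.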
EXAMPLE_OUTPUT:
Workflow Input:
"Source URL": "https://docs.n8n.io/integrations/builtin/cluster-nodes/root-nodes/n8n-nodes-langchain.aifunctions-ai-agent/",
"Content Format": "website",
"Chunking Strategy": "semantic paragraphs",
"Generate FAQs": "yes"
AI Agents Documentation (RAG-Ready)
Document Metadata
- Source: n8n Documentation
- Chunks: 12 semantic paragraphs
- FAQs Generated: 13
- Embedded Tags: #AIAgent, #PlanExecuteAgent, #ReActAgent, #SQLAgent, #VectorDatabase, #Prompting
Content Chunks
Chunk #1: Plan and Execute Agent
Tags: #AIAgent #PlanExecuteAgent
Plan and Execute Agent node
The Plan and Execute Agent is like the ReAct agent but with a focus on planning. It first creates a high-level plan to solve the given task and then executes the plan step by step. This agent is most useful for tasks that require a structured approach and careful planning.
Chunk #2: ReAct AI Agent
Tags: #AIAgent #ReActAgent
ReAct AI Agent node
The ReAct Agent node implements ReAct logic. ReAct (reasoning and acting) brings together the reasoning powers of chain-of-thought prompting and action plan generation.
The ReAct Agent reasons about a given task, determines the necessary actions, and then executes them. It follows the cycle of reasoning and acting until it completes the task. The ReAct agent can break down complex tasks into smaller sub-tasks, prioritise them, and execute them one after the other.
10 more chunks available (hidden for brevity)
Generated FAQs
Q: What is a vector database?
A: A vector database stores mathematical representations of information. Use with embeddings and retrievers to create a database that your AI can access when answering questions.
Q: What is the Plan and Execute Agent?
A: The Plan and Execute Agent is like the ReAct agent but with a focus on planning. It first creates a high-level plan to solve the given task and then executes the plan step by step. This agent is most useful for tasks that require a structured approach and careful planning.
Q: What is the SQL AI Agent?
A: The SQL Agent uses a SQL database as a data source. It can understand natural language questions, convert them into SQL queries, execute the queries, and present the results in a user-friendly format. This agent is valuable for building natural language interfaces to databases.
10 more FAQs available (hidden for brevity)
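If the FAQs are exported as plain "Q:" / "A:" lines like the ones above, they could be turned into structured pairs before indexing. This is a minimal sketch under that assumption; `parse_faqs` and the `question` / `answer` keys are hypothetical names, not part of the template.

```python
def parse_faqs(lines: list[str]) -> list[dict[str, str]]:
    """Pair each 'Q:' line with the next 'A:' line that follows it."""
    faqs: list[dict[str, str]] = []
    question = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("Q:"):
            question = line[2:].strip()
        elif line.startswith("A:") and question is not None:
            faqs.append({"question": question, "answer": line[2:].strip()})
            # Reset so a stray 'A:' line is not attached to an old question.
            question = None
    return faqs
```

Lines that are neither questions nor answers, such as the "hidden for brevity" note, are simply skipped.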
Vector Database Ready Format
{
  "id": "chunk_001",
  "text": "Plan and Execute Agent node\nThe Plan and Execute Agent is like the ReAct agent but with a focus on planning...",
  "metadata": {
    "source": "n8n-docs",
    "url": "https://docs.n8n.io/integrations/builtin/cluster-nodes/root-nodes/n8n-nodes-langchain.aifunctions-ai-agent/",
    "tags": ["AIAgent", "PlanExecuteAgent"],
    "chunk_type": "semantic_paragraph"
  },
  "embedding": [0.023, -0.112, 0.043, ...] // Vector representation (768 dimensions)
}
This is an example of RAG-prepared content created with our template.
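To show how a chunk and its hashtag line might map onto the record layout in the example, here is a small sketch. The helper name `build_record` and its parameters are assumptions for illustration; the embedding field is left out because it would come from whichever embedding model is used.

```python
import json


def build_record(index: int, text: str, tag_line: str, url: str,
                 source: str = "n8n-docs") -> dict:
    """Assemble one record in the layout of the example (hypothetical helper)."""
    # Accept both "#A #B" and "#A, #B" tag lines, then drop the leading '#'.
    tokens = tag_line.replace(",", " ").split()
    tags = [t.lstrip("#") for t in tokens if t.startswith("#")]
    return {
        "id": f"chunk_{index:03d}",
        "text": text,
        "metadata": {
            "source": source,
            "url": url,
            "tags": tags,
            "chunk_type": "semantic_paragraph",
        },
        # "embedding" is intentionally omitted; add it after embedding the text.
    }


record = build_record(
    1,
    "Plan and Execute Agent node",
    "#AIAgent #PlanExecuteAgent",
    "https://example.com/docs",  # placeholder URL for the sketch
)
print(json.dumps(record, indent=2))
```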
$ system_requirements
MODELS: gemini 2.5, openai whisper
STORAGE: google drive, google sheets
SERVICES: rapidapi youtube v2 (optional, for youtube transcripts)
OUTPUT: google sheet rows of file links
PRICING: gemini - per token,
whisper - per minute,
google drive - free,
google sheets - free,
rapidapi youtube v2 api - free tier
EST. PER RUN COST: €0.01
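Since pricing is split between per-token and per-minute billing, a per-run estimate is just the sum of the two parts. The sketch below shows that arithmetic only; the function name and every rate passed to it are placeholders chosen for illustration, not actual provider prices, so check current Gemini and Whisper pricing before relying on any number.

```python
def estimate_run_cost(tokens: int, price_per_million_tokens: float,
                      audio_minutes: float = 0.0,
                      price_per_audio_minute: float = 0.0) -> float:
    """Return token cost plus audio transcription cost for one run.

    All rates are supplied by the caller; none are real published prices.
    """
    token_cost = tokens / 1_000_000 * price_per_million_tokens
    audio_cost = audio_minutes * price_per_audio_minute
    return token_cost + audio_cost


# Placeholder figures: 20,000 tokens at a made-up rate, no audio.
text_only = estimate_run_cost(20_000, price_per_million_tokens=0.50)

# Placeholder figures: same tokens plus 10 minutes of audio at a made-up rate.
with_audio = estimate_run_cost(20_000, 0.50, audio_minutes=10,
                               price_per_audio_minute=0.005)
print(f"text only: {text_only:.4f}, with audio: {with_audio:.4f}")
```

Runs that include audio will usually cost more than text-only runs, because transcription is billed per minute on top of the token cost.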
PROCESS_FLOW:
AUTOMATION_BENEFITS:
> Process diverse content types with one workflow
> Improve RAG retrieval accuracy with tagged, consistently chunked content
> Generate FAQs to enhance knowledge coverage
> Eliminate manual data preparation tasks
> Apply consistent formatting across all content sources
* Compatible with all n8n installations v1.0.0+
*Superflowz is a subsidiary of CARDUME ESBELTO UNIP. LDA. Your purchase will be from, and your receipt will list, CARDUME ESBELTO