Build a RAG System with Node.js and OpenAI - No Database Required

By Hamza Rahman · 7 min read

What is RAG and Why Use It?

Ever wished your documentation could answer questions like a human expert? That's exactly what RAG (Retrieval-Augmented Generation) does. Instead of relying on a model's outdated training data, it reads your actual documentation to give accurate, up-to-date answers.

Think of it like having a smart assistant who reads your docs before answering questions. The best part? With OpenAI's latest models, you can feed in entire documentation sets at once - no complex database needed!

Table of Contents

  1. Setting Up Your Project
  2. Building the Search System
  3. Creating the API
  4. Testing Your System
  5. Best Use Cases

Setting Up Your Project

Let's build something cool! We'll create a system that automatically picks the right documentation file and answers questions about it.

  1. Create a new project:

mkdir my-rag-project
cd my-rag-project
npm init -y

  2. Install the needed packages:

npm install openai dotenv

  3. Create a .env file:

OPENAI_API_KEY=your-api-key-here
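
Your API key is a secret, so it's worth keeping .env out of version control. If you're using git in a Unix-style shell, one quick way to do that:

echo ".env" >> .gitignore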

Building the Search System

Our search system does two clever things:

  1. Uses AI to pick the most relevant documentation file
  2. Reads that file and answers questions about it

Here's a simple example that answers questions about your documentation. Since GPT-4o mini has a large 128K token context window, we can include more context than previous models, but we should still be mindful of very large files:

import { OpenAI } from 'openai';
import fs from 'fs/promises';
import dotenv from 'dotenv';

// Load environment variables
dotenv.config();

// Initialize OpenAI
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

async function answerQuestion(question) {
  try {
    // 1. Read the documentation file
    const docContent = await fs.readFile('./docs/documentation.md', 'utf-8');

    // Warning: for very large files, you might want to split them.
    // GPT-4o mini supports 128K tokens, but that's roughly 400K characters.
    if (docContent.length > 400_000) {
      console.warn('Document might be too large for the context window');
    }

    // 2. Ask the AI with the documentation as context
    const response = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        {
          role: "system",
          content: "You are a helpful assistant that answers questions based on the provided documentation."
        },
        {
          role: "user",
          content: `
Documentation: ${docContent}

Question: ${question}

Please answer the question using information from the documentation. If the answer isn't in the documentation, say so politely.
`
        }
      ],
      // You can adjust max tokens based on your needs
      max_tokens: 16000 // GPT-4o mini supports up to 16K output tokens
    });

    return response.choices[0].message.content;
  } catch (error) {
    console.error('Error:', error);
    return "Sorry, I couldn't process your question right now.";
  }
}

// Example usage (top-level await works because this project is an ES module)
const answer = await answerQuestion("How do I handle errors?");
console.log(answer);
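
The 400,000-character check above comes from a rough rule of thumb: one token is about four characters of English text. If you'd rather make that estimate explicit instead of relying on a magic number, a minimal sketch you could drop into answerQuestion in place of the length check (the 4-characters-per-token ratio is an approximation; use a real tokenizer such as tiktoken for exact counts):

// Rough estimate: ~4 characters per token for English text (approximation only)
const approxTokens = Math.ceil(docContent.length / 4);
if (approxTokens > 120_000) { // leave some headroom under the 128K limit
  console.warn(`Document is roughly ${approxTokens} tokens, close to the context limit`);
}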

Creating the API

Now let's wrap our search system in a simple API that anyone can use.

You already created the project and the .env file in the setup step, so the only new dependency here is Express:

npm install express

Create these folders and files:

mkdir src docs
touch src/ragService.js src/server.js

Add some sample documentation files in the docs folder. The echo commands below work in most shells and give each file a heading (an empty file created with touch would give the model nothing to answer from):

echo "# API Reference" > docs/api-reference.md
echo "# Getting Started" > docs/getting-started.md
echo "# Troubleshooting" > docs/troubleshooting.md
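
For the test question later ("How do I handle errors?") to return a real answer, the files need actual content. As an illustration only, you might put something like this in docs/troubleshooting.md:

# Troubleshooting

## Handling errors

If a request fails, check the error code in the response body.
Retry 5xx errors with exponential backoff; 4xx errors usually mean
the request itself needs to be corrected before retrying.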

Now create the RAG service:

src/ragService.js

import { OpenAI } from 'openai';
import fs from 'fs/promises';
import dotenv from 'dotenv';

dotenv.config();

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY
});

async function selectRelevantFile(question) {
  // List the available documentation files
  const files = await fs.readdir('./docs');
  const fileList = files
    .filter(f => f.endsWith('.md') || f.endsWith('.txt'))
    .map(f => ({ filename: f }));

  const response = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "You are a helpful assistant that selects the most relevant documentation file based on a user's question. Respond in JSON format with 'filename' and 'reason' fields."
      },
      {
        role: "user",
        content: `
Available files: ${JSON.stringify(fileList)}

User question: "${question}"

Select the most relevant file and explain why. Respond in JSON format.
`
      }
    ],
    response_format: { type: "json_object" }
  });

  return JSON.parse(response.choices[0].message.content);
}

export async function smartRAG(question) {
  try {
    // 1. Select the relevant file
    const fileSelection = await selectRelevantFile(question);
    console.log(`Selected ${fileSelection.filename} because: ${fileSelection.reason}`);

    // 2. Read the selected file
    const docContent = await fs.readFile(`./docs/${fileSelection.filename}`, 'utf-8');

    // 3. Get the AI response with the documentation as context
    const response = await openai.chat.completions.create({
      model: "gpt-4o-mini",
      messages: [
        {
          role: "system",
          content: "You are a helpful assistant that answers questions based on the provided documentation."
        },
        {
          role: "user",
          content: `
Documentation from ${fileSelection.filename}:

${docContent}

Question: ${question}

Please answer based on this documentation. If the answer isn't in the documentation, say so politely.
`
        }
      ]
    });

    return {
      fileSelection,
      answer: response.choices[0].message.content
    };
  } catch (error) {
    console.error('Error:', error);
    throw error;
  }
}
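
One caveat: even with JSON mode, the model can occasionally return a filename that isn't actually in your docs folder, and passing it straight to fs.readFile would then throw (or, with a crafted question, point at a path you didn't intend). A minimal guard, sketched here as something you could add inside smartRAG before the readFile call:

// Sketch: reject any filename the model invented, so a bad selection
// fails loudly instead of reading an unintended path.
const validFiles = await fs.readdir('./docs');
if (!validFiles.includes(fileSelection.filename)) {
  throw new Error(`Model selected an unknown file: ${fileSelection.filename}`);
}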

Create your Express server:

src/server.js

import express from 'express';
import { smartRAG } from './ragService.js';

const app = express();
app.use(express.json());

// Expects the docs folder structure created above:
// docs/
//   - api-reference.md
//   - getting-started.md
//   - troubleshooting.md
app.post('/ask', async (req, res) => {
  const { question } = req.body;
  if (!question) {
    return res.status(400).json({ error: "Missing 'question' in request body" });
  }
  try {
    const result = await smartRAG(question);
    res.json(result);
  } catch (error) {
    res.status(500).json({
      error: "Couldn't process your question"
    });
  }
});

app.listen(3000, () => {
  console.log('RAG API running on port 3000');
});

Update your package.json (the dev script uses node --watch, which requires Node 18.11 or newer):

package.json

{
  "name": "my-rag-project",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "node src/server.js",
    "dev": "node --watch src/server.js"
  },
  "dependencies": {
    "dotenv": "^16.4.5",
    "express": "^4.21.1",
    "openai": "^4.72.0"
  }
}

Your project structure should look like this:

my-rag-project/
├── .env
├── package.json
├── src/
│   ├── ragService.js
│   └── server.js
└── docs/
    ├── api-reference.md
    ├── getting-started.md
    └── troubleshooting.md

Start the server:

npm run dev
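
If everything is wired up, the startup log from server.js should appear:

RAG API running on port 3000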

Testing Your System

The moment of truth! Let's see our system in action:

Testing with Postman

  1. Open Postman
  2. Create a new POST request to http://localhost:3000/ask
  3. Set the header: Content-Type: application/json
  4. Set the body (raw/JSON):
{
  "question": "How do I handle errors?"
}
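
If you'd rather test from the command line, the same request with curl:

curl -X POST http://localhost:3000/ask \
  -H "Content-Type: application/json" \
  -d '{"question": "How do I handle errors?"}'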

You'll get a response like:

{
  "fileSelection": {
    "filename": "troubleshooting.md",
    "reason": "The question is about error handling, which would be covered in the troubleshooting documentation"
  },
  "answer": "Based on the documentation..."
}

Best Use Cases

You might be surprised how powerful this simple approach can be. Here's where it really shines:

Documentation That Changes Often

Perfect for API docs, user guides, and technical specs that need frequent updates. Just edit your markdown files and the system instantly uses the new content, because the files are read fresh on every request; no rebuilding or reindexing needed.

Internal Tools and Support

Ideal for customer support teams and internal documentation. Your team gets instant, accurate answers based on your actual documentation, not outdated training data.

Quick Prototypes and MVPs

Need to prove AI-powered search can work for your docs? This approach gets you there in hours, not weeks. No complex infrastructure required.

The best part? With models like GPT-4o mini offering a 128K-token context window (roughly 400K characters of text), you can feed an entire documentation file in at once. This means you can handle surprisingly large documentation bases without any extra complexity.

Remember: Sometimes the simplest solution is the best one. Don't jump to complex vector databases until you've outgrown this approach!