Ground ChatGPT or Claude in Your Own Sources with RAG Prompting
Learn RAG prompting to ground AI in your documents. Stop hallucinations, force citations, and get accurate answers from ChatGPT using your own sources.
Picture this: you ask an AI to summarize your company’s Q3 performance based on an internal report. The AI produces a beautiful summary, confident sentences, specific numbers. You check the report. Half the facts are wrong. The AI filled gaps with plausible-sounding information it made up.
Now imagine a different approach.
You provide the Q3 report directly in your prompt and say “Answer ONLY using information from this document. If the answer isn’t in the document, say so.” The AI reads your actual data, quotes the relevant sections, and admits when information is missing. The summary is accurate because it’s grounded in your sources, not the model’s training data.
This is Retrieval-Augmented Generation, or RAG. Instead of relying on what the model learned during training (which might be outdated, generic, or just wrong for your situation), you provide the exact documents you want it to use. The model retrieves relevant parts and generates answers based only on that context.
What’s in this lesson?
In this lesson, you’ll learn:
how RAG works in plain English,
how to chunk documents so meaning doesn’t get lost,
how to write prompts that demand citations, and when to make the model refuse to answer.
These techniques transform AI from a creative writer into a reliable research assistant.
Prompt Engineering Full Course
Ground AI in Your Own Sources with RAG ← You are here
Big Idea
AI models trained on the internet give generic answers that might be wrong. When you ground them with your own documents and demand citations, they become reliable research tools that work with your specific facts, not guesses.
Think - The Research Assistant
Imagine you hire two research assistants to answer questions about your business.
Assistant A has read thousands of business books and articles. When you ask “What were our Q3 sales?”, they confidently say “Based on typical Q3 patterns, I’d estimate around $500K.” It sounds reasonable, but it’s completely made up. They’re guessing based on general knowledge, not your actual data.
Assistant B doesn’t memorize anything. Instead, when you ask a question, they go to your filing cabinet, pull out the relevant documents, read them carefully, and answer based only on what they found. They say “According to the Q3 Sales Report (page 3), sales were $387K.” If you ask something not in the documents, they say “I don’t see that information in the files you gave me.”
AI models are naturally like Assistant A, trained on massive datasets but prone to hallucination. RAG turns them into Assistant B, grounded in your specific documents with citations you can verify.
How RAG Works in Plain English
RAG has three simple steps:
Step 1: Retrieve Relevant Context
When you have a question and a collection of documents, you first need to find which parts of which documents are relevant.
Simple version (for small documents): Just paste the entire document into your ChatGPT or Claude chat window along with your prompt. This works fine for documents under 2,000 words.
Advanced version (for large collections):
Break documents into chunks (more on this below)
Find chunks most relevant to the question
Include only those chunks in the prompt
For this lesson, we’ll focus on the simple version where you control what context to include.
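For larger collections, the "find relevant chunks" step can be as simple as keyword overlap. Here's a minimal sketch; the function names `score_chunk` and `retrieve` are illustrative, not from any library:

```python
import re

def score_chunk(chunk: str, question: str) -> int:
    """Count distinct question keywords (longer than 3 chars) found in the chunk."""
    keywords = {w.lower() for w in re.findall(r"\w+", question) if len(w) > 3}
    text = chunk.lower()
    return sum(1 for w in keywords if w in text)

def retrieve(chunks: list[str], question: str, top_k: int = 3) -> list[str]:
    """Return the top_k chunks with the highest keyword overlap, best first."""
    return sorted(chunks, key=lambda c: score_chunk(c, question), reverse=True)[:top_k]
```

Real systems typically use embeddings instead of keyword matching, but the shape is the same: score every chunk against the question, keep the best few.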
Step 2: Augment the Prompt with Context
Add the retrieved content to your prompt in a clearly marked section. Make it obvious to the model what’s “source material” versus your instructions.
Structure:
[Instructions about task and citation rules]
Context documents:
---
[Document 1 content]
---
[Document 2 content]
---
Question: [your question]
Answer based ONLY on the context above. Cite sources.
Step 3: Generate Answer with Citations
The model reads the context, finds relevant information, and generates an answer that references specific parts of your documents.
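The prompt structure from Step 2 can be assembled programmatically. A minimal sketch, assuming `build_rag_prompt` is a helper you define yourself:

```python
def build_rag_prompt(documents: list[str], question: str) -> str:
    """Assemble an augmented prompt: citation rules, delimited context, question."""
    context = "\n---\n".join(documents)
    return (
        "Answer the question using ONLY the context below. Cite your sources. "
        "If the answer is not in the context, say "
        '"I don\'t have that information in the provided documents."\n\n'
        f"Context documents:\n---\n{context}\n---\n\n"
        f"Question: {question}"
    )
```

Keeping the assembly in one function means every query you send uses the same delimiters and the same citation rules.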
The magic happens when you add strict rules:
“Answer ONLY from this context” and “Cite your sources.”
Chunking: Breaking Documents Without Breaking Meaning
When documents are too long to fit in a single prompt, you need to break them into chunks.
Bad chunking loses context and makes answers worse.
Bad Chunking (Arbitrary Splits)
Don’t do this:
Chunk 1: “...the product launch exceeded expectations. Sales in Q3 were”
Chunk 2: “$450K, up 23% from Q2. The marketing campaign...”
The sales figure got split across chunks! If you only retrieve Chunk 1, you get incomplete information.
Good Chunking (Semantic Boundaries)
Do this instead:
Split at section headers
Keep complete paragraphs together
Include a sentence of overlap between chunks
Keep related information (like “Q3 sales: $450K”) in the same chunk
Better example:
Chunk 1: “...the product launch exceeded expectations. Sales in Q3 were $450K, up 23% from Q2.”
Chunk 2: “Sales in Q3 were $450K, up 23% from Q2. The marketing campaign drove...”
Notice the overlap? Both chunks contain the key fact, ensuring it won’t be lost.
RAG Chunking Best Practices
Chunk size: 200-500 words per chunk (fits comfortably in prompts while preserving context)
Overlap: Include 1-2 sentences from the previous chunk at the start of each new chunk
Boundaries: Split at:
Section headers
Paragraph breaks
Complete sentences
Natural topic shifts
Don’t split:
Mid-sentence
Lists or tables
Code blocks
Related facts (dates + numbers, names + roles)
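The rules above can be sketched in code. This is an illustrative helper (not a library function) that splits at paragraph breaks, caps chunks at a word budget, and carries trailing sentences forward as overlap:

```python
import re

def chunk_document(text: str, max_words: int = 300, overlap_sentences: int = 1) -> list[str]:
    """Split text at paragraph boundaries into ~max_words chunks, repeating
    the last sentence(s) of each chunk at the start of the next."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current, count = [], [], 0
    for para in paragraphs:
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            # carry the trailing sentences of the finished chunk forward as overlap
            sentences = re.split(r"(?<=[.!?])\s+", chunks[-1])
            current = [" ".join(sentences[-overlap_sentences:])]
            count = len(current[0].split())
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because the split happens only between paragraphs, no sentence is ever cut in half, and the overlap means a fact near a boundary appears in both neighboring chunks.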
Citation Prompts: Making AI Show Its Work
The key to reliable RAG is forcing the model to cite sources. Without citations, you can’t verify answers.
Basic Citation Prompt
You are a research assistant. Answer questions using ONLY the provided context.
Context:
---
[paste documents here]
---
Question: [question]
Rules:
- Answer ONLY using information from the context above
- For each claim, cite the specific document and section
- If the answer is not in the context, say “I don’t have that information in the provided documents”
- Do not make assumptions or use external knowledge
Format: [Your answer with citations like (Document 1, Section 2)]
Advanced Citation Prompt with Quotes
You are a research assistant. Answer questions with direct quotes from provided documents.
Context:
---
Document 1: Q3 Sales Report
[content]
---
Document 2: Marketing Analysis
[content]
---
Question: [question]
Rules:
- Quote exact phrases from documents to support each claim
- Format quotes as: “exact text” (Document name, page/section if available)
- If information is not in documents, state: “Not found in provided documents”
- Never paraphrase when a direct quote is available
- If documents conflict, mention both and cite each
Answer:
Why This Works
Explicit boundaries: “ONLY the provided context” prevents hallucination
Citation requirement: Forces model to reference sources
Refusal rule: Gives permission to say “I don’t know”
Format specification: Makes verification easy
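Verification itself can be partly automated. Here's a minimal sketch that flags quoted phrases in an answer that don't appear verbatim in the context (it assumes straight double quotes; curly quotes would need normalizing first, and `verify_quotes` is an illustrative name):

```python
import re

def verify_quotes(answer: str, context: str) -> list[str]:
    """Return quoted phrases from the answer that do NOT appear verbatim
    in the source context -- candidates to inspect as hallucinations."""
    quotes = re.findall(r'"([^"]+)"', answer)
    return [q for q in quotes if q not in context]
```

An empty result doesn't prove the answer is correct, but a non-empty one tells you exactly which "quotes" to check by hand.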
Refusal Rules: When to Say “I Don’t Know”
One of the most important RAG techniques is teaching the model when NOT to answer.
Refusal Prompt Template
You are a fact-checker. Answer questions using provided documents.
Context:
[documents]
Question: [question]
Confidence rules:
- If the answer is clearly stated in documents: Provide answer with citation
- If answer requires connecting multiple facts: State this and cite each fact
- If answer is ambiguous or unclear: Say “The documents don’t clearly state this”
- If answer is not in documents at all: Say “This information is not in the provided documents”
Never guess or use external knowledge to fill gaps.
Examples of Good Refusals
Question: “What was our Q4 revenue?”
Documents contain: Only Q3 revenue data
Good answer: “The provided documents only contain Q3 revenue ($450K from Q3 Sales Report, page 2). Q4 revenue is not mentioned.”
Bad answer (hallucination): “Following the Q3 trend, Q4 revenue was likely around $480K.”
Setting Confidence Thresholds
You can make the model more or less strict about answering:
Strict (high confidence only):
Only answer if the exact information is explicitly stated in the documents. When in doubt, refuse to answer.
Moderate (reasonable inference allowed):
Answer if information is clearly stated or can be directly inferred from multiple facts in the documents. Cite all facts used.
Lenient (not recommended):
Answer based on documents, using reasonable assumptions where needed.
For most use cases, stick with strict or moderate. Lenient defeats the purpose of RAG.
Real-World RAG Examples
Example 1: Company Knowledge Base QA
You are an internal knowledge assistant. Answer questions about company policies using the employee handbook.
Handbook excerpt:
---
Section 4.2: Remote Work Policy
Employees may work remotely up to 3 days per week with manager approval. Equipment requests must be submitted via IT portal. VPN access is required for all remote work.
Section 4.3: Time Off
Employees accrue 2 days PTO per month, up to 24 days annually. Unused PTO expires at year end. Sick leave is separate and does not expire.
---
Question: How many days can I work from home?
Rules:
- Answer using only the handbook excerpt
- Cite section numbers
- If policy details are missing, say so
Answer:
Good response: “According to Section 4.2, you may work remotely up to 3 days per week with manager approval. You’ll also need to submit equipment requests through the IT portal and use VPN for all remote work.”
Example 2: Technical Documentation
You are a technical support assistant. Answer questions using product documentation.
Documentation:
---
Setup Guide (Page 5):
To configure the API, set these environment variables:
- API_KEY: Your authentication key
- BASE_URL: https://api.example.com
- TIMEOUT: Request timeout in seconds (default: 30)
Troubleshooting (Page 12):
Error 401: Invalid API key. Check that API_KEY environment variable is set correctly.
Error 429: Rate limit exceeded. Wait 60 seconds before retrying.
---
Question: How do I fix a 401 error?
Requirements:
- Use exact instructions from documentation
- Quote error messages verbatim
- Cite page numbers
Answer:
Good response: “According to the Troubleshooting section (Page 12), Error 401 means ‘Invalid API key. Check that API_KEY environment variable is set correctly.’ Make sure you’ve set the API_KEY environment variable as described in the Setup Guide (Page 5).”
Example 3: Research Synthesis
You are a research synthesizer. Summarize findings from multiple studies.
Study 1 (Smith et al., 2024):
“Participants using Method A showed 15% improvement in retention scores (n=120, p<0.05).”
Study 2 (Johnson et al., 2024):
“Method A group demonstrated no significant improvement in retention (n=45, p=0.12).”
Study 3 (Lee et al., 2023):
“Combined approach of Method A + B yielded 22% improvement (n=200, p<0.01).”
Question: Does Method A improve retention?
Instructions:
- Present all relevant findings
- Note where studies agree or conflict
- Cite author and year for each claim
- Do not draw conclusions beyond what studies state
Answer:
Good response: “The research shows mixed results for Method A. Smith et al. (2024) found a 15% improvement in retention scores (statistically significant, n=120), while Johnson et al. (2024) found no significant improvement (n=45). Lee et al. (2023) found that combining Method A with Method B yielded 22% improvement (n=200). The conflicting results may be due to different sample sizes or implementation details.”
RAG Context Ranking and Selection
When you have many documents, you need to decide which ones to include. This is context ranking.
1. Simple Manual Ranking
Read the question
Identify which documents are most relevant
Include only those documents
Start with most relevant
2. Keyword-Based Selection
Question: “What was Q3 marketing spend?”
Available documents:
- Q3 Sales Report (mentions “marketing campaign”)
- Q3 Budget Overview (has section “Marketing Expenses”)
- Q2 Sales Report (no Q3 data)
- HR Policies (not relevant)
Selected for context:
1. Q3 Budget Overview (most relevant)
2. Q3 Sales Report (somewhat relevant)
3. Context Window Budget
Most models have token limits. Use your context budget wisely:
Reserve 500-1000 tokens for instructions and question
Reserve 500-1000 tokens for the answer
Use remaining space for context documents
Prioritize most relevant documents first
Example with 4K token limit:
Instructions: 500 tokens
Answer: 500 tokens
Available for context: 3,000 tokens (~2,000 words)
If you have 5 relevant documents of 600 words each, include the top 3 most relevant.
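The budgeting arithmetic above can be sketched as a greedy packer. Both helpers are illustrative, and the words-per-token ratio is a rough rule of thumb, not an exact tokenizer:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: English prose runs about 0.75 words per token."""
    return int(len(text.split()) / 0.75)

def fit_context(ranked_docs: list[str], limit: int = 4000, reserved: int = 1000) -> list[str]:
    """Pack documents (most relevant first) into the token budget left after
    reserving space for instructions, the question, and the answer."""
    available = limit - reserved
    selected, used = [], 0
    for doc in ranked_docs:
        cost = estimate_tokens(doc)
        if used + cost <= available:  # skip docs that would overflow the budget
            selected.append(doc)
            used += cost
    return selected
```

Because the input is sorted most-relevant-first, the documents that get dropped when space runs out are always the least relevant ones.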
Why RAG Matters
RAG transforms AI from creative guessing to reliable research. When you ground answers in your documents:
Accuracy improves dramatically: No more hallucinations about your data
Answers are verifiable: Every claim can be traced to a source
Outdated knowledge isn’t a problem: Use your latest documents, not old training data
Domain-specific knowledge works: Internal docs, specialized fields, proprietary information
You stay in control: The model only knows what you give it
The difference between asking AI “What’s our Q3 revenue?” (answer: made up) and providing the Q3 report with strict citation rules (answer: accurate with sources) is the difference between a creative writer and a reliable research assistant.
Key Takeaways
RAG provides your own documents as context so AI answers from facts, not training data
The three steps: retrieve relevant content, augment prompt with context, generate with citations
Good chunking preserves meaning by splitting at natural boundaries with overlap
Chunk size should be 200-500 words with 1-2 sentences of overlap between chunks
Citation prompts must explicitly require sources and format for verification
Refusal rules teach the model when to say “I don’t have that information”
Use strict confidence thresholds to prevent hallucination and guessing
Context ranking selects which documents to include when space is limited
Budget your context window: instructions + documents + answer must fit token limit
RAG works best when you say “Answer ONLY from these documents” and “Cite your sources”
Mini Exercises
Exercise 1: Basic RAG prompt
Take a real document from your work (meeting notes, report, policy). Write a RAG prompt that asks a question answerable from that document. Include: context section, question, citation rules, refusal rule.
Exercise 2: Test refusal
Using your prompt from Exercise 1, ask a question that the document CANNOT answer. Did the model correctly refuse? If it guessed, strengthen your refusal rules.
Exercise 3: Chunking practice
Take a 1,000-word document and break it into 3 chunks of ~300 words each. Follow the chunking rules: split at natural boundaries, include overlap, keep related facts together.
Exercise 4: Citation verification
Run a RAG prompt with a document and question. For each claim in the answer, verify it actually appears in your source document. Did the model cite correctly? Did it invent anything?
Exercise 5: Multi-document synthesis
Provide 2-3 short documents (200 words each) on the same topic. Ask a question that requires information from multiple documents. Check if the model cites all relevant sources.
Success check: Your RAG prompts should produce answers that are verifiable against source documents, include proper citations, and refuse to answer when information is missing. If the model hallucinates or guesses, revise your instructions to be more explicit about using ONLY provided context.
Next Lesson
You now know how to ground AI in your own documents for reliable, fact-based answers with citations. In the next lesson, we’ll start bringing together the prompt engineering patterns you learned in this course to address real-world problems.
You’ll learn how to apply the patterns from Lessons 1-7 to real work by stacking them into workflows that handle complexity, stay accurate, and produce consistent results.
For now, practice RAG with your own documents. Start simple with short documents that fit entirely in your prompt. Notice how much more reliable answers become when you provide the source material and demand citations. Save your RAG prompt templates because they become your research assistant workflow.
The difference between an AI that makes things up and one you can trust? Grounding it in your sources and making it show its work.
Your PluggedIn assets for this lesson
What’s inside the Prompt Engineering Mastery Bundle:
Complete 9-lesson ebook (PDF)
7 niche-specific prompt packs (55+ prompts):
Customer support automation
Content creation on a budget
Client proposals & SOWs
Research & analysis
Email & communication
Sales & lead nurture
Operations & SOPs


