# RAG and LLM layer

The RAG layer starts after retrieval.

Its job is to turn retrieved sources into a grounded answer.

## Relevant files

```text
src/githelp/rag/prompting.py
src/githelp/rag/extractive_answerer.py
src/githelp/rag/retrieval_query.py
src/githelp/rag/answering.py
src/githelp/rag/llm_provider.py
src/githelp/rag/llm_factory.py
src/githelp/rag/qwen_provider.py
```

## Prompting

File:

```text
src/githelp/rag/prompting.py
```

This module formats retrieved sources into a prompt.

The prompt instructs the LLM to:

- answer only from the provided sources;
- cite sources inline with `[Source 1]`, `[Source 2]`, etc.;
- avoid inventing commands, paths, APIs, modules, or configuration keys;
- avoid interpreting configuration values unless the sources explain them;
- begin with a brief, direct context sentence;
- use practical numbered steps for how-to questions;
- group parameters by role instead of repeating one description pattern;
- clearly separate supported facts, safe inferences, and missing evidence;
- say when the sources are incomplete or insufficient.

Debug command:

```bash
python scripts/debug_prompting.py \
  "How do I configure indexing?" \
  --backend simple \
  --corpus-path data/projects/mmore/corpus.jsonl \
  --config-path configs/app_config.yaml
```

## LLM providers

GitHelp uses a provider interface:

```text
src/githelp/rag/llm_provider.py
```

The active provider is selected from:

```text
configs/app_config.yaml
```

Example:

```yaml
llm:
  provider: qwen
  model_name: Qwen/Qwen3-4B
  max_new_tokens: 512
  temperature: 0.0
  enable_thinking: false
```

The Qwen provider uses Hugging Face Transformers. A `dummy` provider is also
implemented for tests and pipeline debugging. No external or hosted LLM
provider is currently included.

## High-level answering helpers

Conversation-aware retrieval query detection and rewriting live in:

```text
src/githelp/rag/retrieval_query.py
```

This module keeps follow-up detection, ambiguity handling, and LLM-assisted
query rewriting separate from retrieval and answer generation.

File:

```text
src/githelp/rag/answering.py
```

This module exposes:

```python
prepare_answer_prompt(...)
answer_question(...)
answer_question_with_llm(...)
answer_question_with_provider(...)
```

Current flow:

```text
current question + recent chat
→ keep standalone questions unchanged, or rewrite a clear follow-up
→ ask for clarification when a follow-up has no single clear referent
→ project profile query expansion
→ retrieval
→ project profile filtering/reranking
→ optional project profile direct answer
→ prompt construction with the original question and lightweight recent context
→ LLM generation
```

The rewritten query is used only for retrieval. Recent chat is not appended to
the retrieval query, and standalone questions are not forced into the previous
topic. The final answer prompt receives at most six recent messages to resolve
references, but instructs the model not to repeat earlier answers unless the
user explicitly asks for a summary or rephrasing.

## Direct answers from project profiles

Some structured questions are better answered deterministically than by an LLM.

For example, the MMORE profile can answer Milvus parameter questions directly.
It scans the retrieved records for a fixed allowlist of known Milvus keys. This
avoids returning unrelated fields such as `model_name`, `top_k`, or
`max_workers`, but it is not a general YAML schema parser.

## Temporary extractive answerer

File:

```text
src/githelp/rag/extractive_answerer.py
```

This remains available when LLM generation is disabled.

It:

- takes the top retrieved source;
- returns its content;
- has a small special case for signature questions.

Command:

```bash
python scripts/answer_question.py \
  "How do I configure indexing?" \
  --backend simple
```

## LLM answer generation

LLM generation can be enabled with:

```bash
python scripts/answer_question.py \
  "How do I configure indexing?" \
  --llm \
  --backend simple \
  --corpus-path data/projects/mmore/corpus.jsonl \
  --config-path configs/app_config.yaml
```

The expected answer includes inline source citations.