Configuration
GitHelp uses YAML configuration files stored in configs/, plus project-specific configuration files generated under data/projects/.
There are two levels of configuration:
configs/ → global GitHelp settings
data/projects/<project_name>/ → generated configuration and corpus for a selected target project
app_config.yaml
This file stores application-level settings.
Example:
app_title: GitHelp
app_subtitle: Ask questions about a project's documentation
show_sources: true
default_top_k: 5
project_profile: mmore
llm:
  provider: qwen
  model_name: Qwen/Qwen3-4B
  max_new_tokens: 512
  temperature: 0.0
  enable_thinking: false
project:
  config_path: configs/project_config.yaml
MMORE sparse indexing currently requires Transformers 4.x. GitHelp pins:
transformers>=4.51.0,<5
If MMORE index building fails with a tokenizer error such as
BertTokenizer has no attribute batch_encode_plus, reinstall the compatible
version:
python -m pip install "transformers>=4.51.0,<5"
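Before rebuilding an index, it can help to confirm that the installed version actually falls inside the pinned range. The following is a minimal sketch, not part of GitHelp: the helper names `parse_release`, `satisfies_pin`, and `installed_transformers_ok` are hypothetical, and the comparison ignores pre-release suffixes such as `rc1`.

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple[int, int, int]:
    """Return the leading numeric release padded to three parts, e.g. '4.52' -> (4, 52, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(part) for part in match.group().split(".")]
    # Pad short versions so that tuple comparison behaves as expected.
    padded = (parts + [0, 0, 0])[:3]
    return (padded[0], padded[1], padded[2])


def satisfies_pin(version: str) -> bool:
    """True when the version lies in the pinned range >=4.51.0,<5."""
    return (4, 51, 0) <= parse_release(version) < (5, 0, 0)


def installed_transformers_ok() -> bool:
    """Check the installed transformers distribution; False when it is missing."""
    try:
        return satisfies_pin(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```

Calling `installed_transformers_ok()` before an index build gives an early, readable failure instead of a tokenizer error deep inside the indexing step.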
Main fields
| Field | Meaning |
|---|---|
| app_title | Parsed application title. The current Streamlit header still renders its title directly. |
| app_subtitle | Parsed application subtitle. The current Streamlit header still renders its caption directly. |
| show_sources | Parsed display default; persisted Streamlit state takes precedence in the current UI. |
| default_top_k | Parsed retrieval default; persisted Streamlit state currently initializes the sidebar value. |
| project_profile | Project-specific behavior profile, for example mmore. |
| project.config_path | Project config used to identify the indexed project when preparing prompts. This is not changed automatically when Streamlit builds another project. |
| llm.provider | LLM provider used by GitHelp. |
| llm.model_name | Model name used by the provider. |
| llm.max_new_tokens | Maximum number of generated tokens. |
| llm.temperature | Generation temperature. |
| llm.enable_thinking | Whether to enable Qwen thinking mode. |
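To illustrate how the nested llm and project keys relate to the flat fields, here is a minimal sketch that maps an already-parsed configuration dictionary (for example the result of a YAML loader) onto typed settings. The class names `LLMSettings` and `AppSettings`, the function `parse_app_config`, and the default values (copied from the example file) are assumptions for illustration, not GitHelp's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class LLMSettings:
    # Defaults mirror the example file; they are assumptions, not GitHelp's real defaults.
    provider: str = "qwen"
    model_name: str = "Qwen/Qwen3-4B"
    max_new_tokens: int = 512
    temperature: float = 0.0
    enable_thinking: bool = False


@dataclass
class AppSettings:
    app_title: str = "GitHelp"
    app_subtitle: str = ""
    show_sources: bool = True
    default_top_k: int = 5
    project_profile: str = "mmore"
    project_config_path: str = "configs/project_config.yaml"
    llm: LLMSettings = field(default_factory=LLMSettings)


def parse_app_config(raw: dict[str, Any]) -> AppSettings:
    """Map a parsed app_config.yaml dictionary onto typed settings."""
    defaults = AppSettings()
    # Unknown keys under llm raise TypeError, which surfaces typos early.
    llm = LLMSettings(**(raw.get("llm") or {}))
    project = raw.get("project") or {}
    return AppSettings(
        app_title=raw.get("app_title", defaults.app_title),
        app_subtitle=raw.get("app_subtitle", defaults.app_subtitle),
        show_sources=bool(raw.get("show_sources", defaults.show_sources)),
        default_top_k=int(raw.get("default_top_k", defaults.default_top_k)),
        project_profile=raw.get("project_profile", defaults.project_profile),
        project_config_path=project.get("config_path", defaults.project_config_path),
        llm=llm,
    )
```

Missing keys fall back to the defaults, so a partial file such as one that only overrides llm.temperature still yields a complete settings object.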
project_config.yaml
This file describes the target project being indexed.
The default configuration is:
configs/project_config.yaml
When using the Streamlit interface, GitHelp also generates a project-specific config:
data/projects/<project_name>/project_config.yaml
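The generated path follows directly from the project name. As a small sketch of that layout (the helper names `project_dir` and `project_config_path` are hypothetical, and the `data` root is assumed to be relative to the GitHelp checkout):

```python
from pathlib import Path


def project_dir(project_name: str, data_root: Path = Path("data")) -> Path:
    """Folder that holds the generated files for one target project."""
    if not project_name or "/" in project_name or "\\" in project_name:
        raise ValueError(f"invalid project name: {project_name!r}")
    return data_root / "projects" / project_name


def project_config_path(project_name: str, data_root: Path = Path("data")) -> Path:
    """Location of the generated project-specific config file."""
    return project_dir(project_name, data_root) / "project_config.yaml"
```

For the project name `mmore`, `project_config_path("mmore")` resolves to `data/projects/mmore/project_config.yaml`.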
Example for MMORE:
project_name: mmore
package_name: mmore
repo_path: /absolute/path/to/mmore
docs_path: /absolute/path/to/mmore/docs/source
code_path: /absolute/path/to/mmore/src/mmore
include_yaml_configs: true
yaml_config_paths:
- /absolute/path/to/mmore/examples
- /absolute/path/to/mmore/production-config
include_repo_structure: true
repo_structure_max_depth: 4
Main fields
| Field | Meaning |
|---|---|
| project_name | Project name stored in metadata and used for project folders. |
| package_name | Python package prefix used to build full module names. |
| repo_path | Root folder of the target repository. |
| docs_path | Folder containing Markdown or reStructuredText docs. |
| code_path | Python source folder used for docstring extraction. |
| include_yaml_configs | Whether to include YAML files in the corpus. |
| yaml_config_paths | Folders scanned for YAML configuration files. |
| include_repo_structure | Whether to add a synthetic repository tree document. |
| repo_structure_max_depth | Maximum depth of the generated repository tree. |
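The interaction between package_name and code_path is easiest to see with an example. The sketch below shows one plausible way to derive a full module name from a source file; the function `module_name` is hypothetical and GitHelp's own extraction code may differ in detail.

```python
from pathlib import Path


def module_name(package_name: str, code_path: str, file_path: str) -> str:
    """Build a dotted module name for a file located under code_path."""
    # relative_to raises ValueError when file_path is outside code_path.
    relative = Path(file_path).relative_to(code_path).with_suffix("")
    parts = [package_name, *relative.parts]
    # A package's __init__.py represents the package itself.
    if parts[-1] == "__init__":
        parts.pop()
    return ".".join(parts)
```

With `package_name` set to `mmore` and `code_path` set to `/absolute/path/to/mmore/src/mmore`, the file `/absolute/path/to/mmore/src/mmore/rag/retriever.py` maps to `mmore.rag.retriever`.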
Generated project configs also contain include_extensions and
exclude_patterns. These values are parsed and retained for future filtering,
but the current corpus builder selects sources through the configured paths,
source-specific extensions, and loader-level exclusions instead.
indexing_config.yaml
This file contains prototype corpus/retrieval settings and the defaults used by
scripts/build_index.py.
include_markdown: true
include_code_docstrings: true
include_signatures: true
include_code_snippets: false
chunk_size: 1200
chunk_overlap: 150
top_k: 5
retrieval_backend: simple
collection_name: mmore_docs
mmore_index_config_path: configs/mmore_index_config.yaml
The current index command reads collection_name and
mmore_index_config_path. Fields describing source inclusion, chunking,
top_k, and retrieval_backend are currently reserved configuration: the
corpus builder and Streamlit sidebar do not consume them. Corpus construction
instead relies on Markdown sections and extracted code documentation records.
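Because only two fields are consumed today, a quick way to audit a configuration is to separate the keys that take effect from those that are merely reserved. This is an illustrative sketch based on the description above; `CONSUMED_KEYS` and `split_indexing_config` are hypothetical names, not GitHelp code.

```python
from typing import Any

# Keys the current index command reads, according to the description above.
CONSUMED_KEYS = frozenset({"collection_name", "mmore_index_config_path"})


def split_indexing_config(raw: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Split a parsed indexing config into (consumed, reserved) settings."""
    consumed = {key: value for key, value in raw.items() if key in CONSUMED_KEYS}
    reserved = {key: value for key, value in raw.items() if key not in CONSUMED_KEYS}
    return consumed, reserved
```

Changing a value that lands in the reserved dictionary, such as chunk_size, is not expected to alter the built index in the current version.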
mmore_index_config.yaml
This file is passed to MMORE when building the index.
indexer:
  dense_model:
    model_name: sentence-transformers/all-MiniLM-L6-v2
    is_multimodal: false
  sparse_model:
    model_name: splade
    is_multimodal: false
  db:
    uri: ./data/indexes/mmore/githelp.db
    name: my_db
collection_name: mmore_docs
documents_path: data/processed/mmore_corpus.jsonl
mmore_retriever_config.yaml
This file configures MMORE retrieval.
db:
  uri: ./data/indexes/mmore/githelp.db
  name: my_db
hybrid_search_weight: 0.5
k: 5
collection_name: mmore_docs
use_web: false
reranker_model_name: null
Both MMORE config files currently point to one local Milvus Lite database and
the shared mmore_docs collection. Project-specific corpora and exports are
isolated under data/projects/, but native MMORE indexes are not yet isolated
per project.
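Since retrieval only works when both files target the same database and collection, a consistency check can catch drift between them. The sketch below operates on already-parsed dictionaries and assumes that the index config nests db under indexer while the retriever config keeps db at the top level; the function name `index_target_mismatches` is hypothetical.

```python
from typing import Any


def index_target_mismatches(
    index_cfg: dict[str, Any], retriever_cfg: dict[str, Any]
) -> list[str]:
    """Return the settings that differ between the index and retriever configs."""
    # Assumed layout: indexer.db in the index config, top-level db in the retriever config.
    index_db = (index_cfg.get("indexer") or {}).get("db") or {}
    retriever_db = retriever_cfg.get("db") or {}
    pairs = {
        "db.uri": (index_db.get("uri"), retriever_db.get("uri")),
        "db.name": (index_db.get("name"), retriever_db.get("name")),
        "collection_name": (
            index_cfg.get("collection_name"),
            retriever_cfg.get("collection_name"),
        ),
    }
    return [key for key, (left, right) in pairs.items() if left != right]
```

An empty list means both configs point at the same target; any returned key names the setting to reconcile.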
data/app_state.json
The Streamlit app persists the latest local UI state in:
data/app_state.json
It can contain:
{
  "project_name": "mmore",
  "project_path": "/path/to/mmore",
  "corpus_path": "/path/to/githelp/data/projects/mmore/corpus.jsonl",
  "project_config_path": "/path/to/githelp/data/projects/mmore/project_config.yaml",
  "mmore_corpus_path": "/path/to/githelp/data/projects/mmore/mmore_corpus.jsonl",
  "collection_name": "mmore_docs",
  "indexing_mode": "mmore",
  "backend": "simple",
  "top_k": 5,
  "use_llm": true,
  "show_sources": true,
  "show_full_sources": false,
  "show_debug": false
}
This file is machine-specific and should normally be ignored by Git.
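Because the file is local and may be absent on a fresh checkout, code that reads it should tolerate a missing or corrupt file. The following is a minimal sketch of such a loader, not the app's actual code: `DEFAULT_STATE` and `load_app_state` are hypothetical names, and the default values are taken from the example above.

```python
import json
from pathlib import Path
from typing import Any

# Fallback UI values, copied from the example state; assumed, not authoritative.
DEFAULT_STATE: dict[str, Any] = {
    "backend": "simple",
    "top_k": 5,
    "use_llm": True,
    "show_sources": True,
    "show_full_sources": False,
    "show_debug": False,
}


def load_app_state(path: Path = Path("data/app_state.json")) -> dict[str, Any]:
    """Read the persisted UI state, falling back to defaults when unavailable."""
    try:
        saved = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        saved = {}
    if not isinstance(saved, dict):
        saved = {}
    # Saved values override defaults; unknown keys are kept as-is.
    return {**DEFAULT_STATE, **saved}
```

Merging defaults with saved values means a state file written by an older version, which may lack newer keys, still produces a complete state.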