5.3.2.1. gemini_application.chatpopup.chatpopup

Interactive chat pop-up application backed by Azure OpenAI or a local Ollama instance.

Supports document ingestion into ChromaDB and retrieval-augmented generation.

Classes

ChatPopup()

Retrieval-augmented chat application built on ChromaDB + Ollama/Azure OpenAI.

ChunkRecord(chunk_id, text, metadata)

A single text chunk with id and metadata for storage and citation.

class gemini_application.chatpopup.chatpopup.ChatPopup[source]

Bases: ApplicationAbstract

Retrieval-augmented chat application built on ChromaDB + Ollama/Azure OpenAI.

Initialize configuration fields; actual clients are created in initialize_model().

build_prompt(user_message, selected)[source]

Build RAG prompt and return structured citation items for UI display.

Return type:

Tuple[str, List[Dict[str, Any]]]
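
A minimal sketch of how a RAG prompt and its citation list could be assembled, matching the documented return type. The function name build_rag_prompt, the prompt wording, and the keys "text", "metadata", "source" and "page" are assumptions made for illustration; the actual structure of the selected items is not shown in this reference.

```python
from typing import Any, Dict, List, Tuple


def build_rag_prompt(
    user_message: str, selected: List[Dict[str, Any]]
) -> Tuple[str, List[Dict[str, Any]]]:
    """Number each selected chunk and return (prompt, citation items)."""
    context_lines: List[str] = []
    citations: List[Dict[str, Any]] = []
    for number, item in enumerate(selected, start=1):
        meta = item.get("metadata", {})  # assumed key, see lead-in
        context_lines.append(f"[{number}] {item['text']}")
        citations.append(
            {"number": number, "source": meta.get("source"), "page": meta.get("page")}
        )
    prompt = (
        "Answer the question using only the numbered context below. "
        "Cite passages by their number.\n\n"
        "Context:\n" + "\n".join(context_lines) + "\n\n"
        f"Question: {user_message}\nAnswer:"
    )
    return prompt, citations
```

Returning the citations as a separate list keeps the UI independent of the prompt text: the interface can render sources without parsing the prompt.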

calculate()[source]

Compatibility method for the parent framework.

Return type:

str

chunk_text_with_metadata(source, page, text, file_sig, lang='unknown', translated=False)[source]

Normalize and chunk page text, then attach metadata for citations and filtering.

Return type:

List[ChunkRecord]

chunksplitter_for_embeddings(text, max_words, overlap_words=0)[source]

Split text into overlapping word chunks suitable for embedding models.

Return type:

List[str]
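
As an illustration of overlapping word chunks, the sketch below splits on whitespace and advances by max_words - overlap_words each step. The function name split_words_with_overlap and the argument validation are assumptions; the real method may tokenize or handle edge cases differently.

```python
from typing import List


def split_words_with_overlap(
    text: str, max_words: int, overlap_words: int = 0
) -> List[str]:
    """Split text into chunks of at most max_words words, with
    overlap_words words shared between consecutive chunks."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap_words < max_words:
        raise ValueError("overlap_words must be in [0, max_words)")
    words = text.split()
    step = max_words - overlap_words
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once the current window reaches the end of the text,
        # so no trailing chunk consists of overlap only.
        if start + max_words >= len(words):
            break
    return chunks
```

For example, split_words_with_overlap("a b c d e", 3, 1) yields ["a b c", "c d e"].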

static cosine_sim(a, b)[source]

Compute cosine similarity without numpy.

Return type:

float
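
Cosine similarity is the dot product of two vectors divided by the product of their Euclidean norms. A pure-Python sketch follows; the function name and the choice to return 0.0 for a zero vector are assumptions, not confirmed behavior of cosine_sim.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors, using only the stdlib."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention for zero vectors
    return dot / (norm_a * norm_b)
```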

delete_collection()[source]

Delete the current Chroma collection.

Return type:

None

detect_language(text)[source]

Detect the language of a text sample and return a short language code such as 'en' or 'nl'.

Return type:

str

embed_one_ollama(text)[source]

Embed one text, shrinking it iteratively if Ollama rejects it as too long.

Return type:

List[float]
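
The shrink-on-failure idea can be sketched independently of Ollama: try to embed, and if the backend reports that the input is too long, truncate and retry. Everything named here is hypothetical: ContextTooLongError stands in for whatever error the backend raises, embed_fn is an injected embedding callable, and the 20% shrink step and min_chars floor are arbitrary choices.

```python
from typing import Callable, List


class ContextTooLongError(Exception):
    """Hypothetical stand-in for a backend's context-length error."""


def embed_with_shrink(
    text: str,
    embed_fn: Callable[[str], List[float]],
    min_chars: int = 16,
) -> List[float]:
    """Embed text, truncating it by 20% per retry while it is rejected as too long."""
    current = text
    while True:
        try:
            return embed_fn(current)
        except ContextTooLongError:
            new_len = (len(current) * 4) // 5  # keep 80% of the characters
            if new_len < min_chars:
                raise  # give up rather than embed a meaningless fragment
            current = current[:new_len]
```

Truncating from the end keeps the opening of the chunk, which is a simple but lossy choice; a real implementation might instead re-split the chunk.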

file_signature(file_path)[source]

Return a lightweight signature used to detect file changes.

Return type:

Dict[str, Any]
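
One common way to build a lightweight change signature is to combine file size and modification time, which avoids hashing the full contents. The sketch below assumes that approach; the function name and the keys "size" and "mtime_ns" are illustrative, and the fields the actual method records are not documented here.

```python
import os
from typing import Any, Dict


def file_signature_sketch(file_path: str) -> Dict[str, Any]:
    """Return size and modification time as a cheap change indicator."""
    stat = os.stat(file_path)
    return {"size": stat.st_size, "mtime_ns": stat.st_mtime_ns}
```

Two signatures can then be compared with ==; a mismatch suggests the file should be re-ingested.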

filter_context(context)[source]

Extract and filter Chroma query results by similarity threshold.

Return type:

Dict[str, Any]
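
Chroma query results are dictionaries whose "ids", "documents", "metadatas" and "distances" entries each hold one list per query. The sketch below filters the first query's results by a similarity threshold. It assumes the collection uses cosine distance, so that similarity = 1 - distance; the function name, the threshold parameter, and the output layout are also assumptions.

```python
from typing import Any, Dict


def filter_by_similarity(results: Dict[str, Any], min_similarity: float) -> Dict[str, Any]:
    """Keep only hits of the first query whose similarity meets the threshold."""
    kept: Dict[str, Any] = {
        "ids": [], "documents": [], "metadatas": [], "similarities": []
    }
    rows = zip(
        results["ids"][0],
        results["documents"][0],
        results["metadatas"][0],
        results["distances"][0],
    )
    for chunk_id, document, metadata, distance in rows:
        similarity = 1.0 - distance  # valid for cosine distance only (assumption)
        if similarity >= min_similarity:
            kept["ids"].append(chunk_id)
            kept["documents"].append(document)
            kept["metadatas"].append(metadata)
            kept["similarities"].append(similarity)
    return kept
```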

get_embedding(user_message)[source]

Embed a user query string for retrieval.

Return type:

List[float]

get_embedding_list(chunks)[source]

Embed a list of chunks using batching.

Return type:

List[List[float]]

get_response(prompt)[source]

Generate an answer from the selected LLM.

Return type:

str

init_parameters(parameters)[source]

Apply parameters from a dict and initialize models and database clients.

Return type:

None

initialize_model()[source]

Create LLM/embedding clients and open the Chroma collection.

Return type:

None

is_context_error(e)[source]

Return True if an exception indicates an embedding context-length overflow.

Return type:

bool

load_manifest()[source]

Load the manifest containing file signatures for incremental ingestion.

Return type:

Dict[str, Any]

load_pdf_pages(file_path)[source]

Load a PDF and return a list of (page_index, page_text) tuples, one per page.

Return type:

List[Tuple[int, str]]

load_txt(file_path)[source]

Load a UTF-8 text file.

Return type:

str

manifest_path()[source]

Return the absolute path to the ingestion manifest file.

Return type:

str

maybe_translate_text(text)[source]

Detect language and translate to English if needed.

Return type:

tuple[str, str, bool]

mmr_rerank(query_emb, candidates, top_k, lam)[source]

Select a diverse set of relevant chunks using Maximal Marginal Relevance (MMR).

Return type:

List[Dict[str, Any]]
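
MMR picks items greedily: at each step it chooses the candidate maximizing lam * relevance - (1 - lam) * redundancy, where relevance is similarity to the query and redundancy is the highest similarity to anything already selected. The sketch below assumes each candidate dict carries its vector under an "embedding" key; that key, the function name, and the use of cosine similarity are illustrative assumptions.

```python
import math
from typing import Any, Dict, List, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_emb: Sequence[float],
    candidates: List[Dict[str, Any]],
    top_k: int,
    lam: float,
) -> List[Dict[str, Any]]:
    """Greedy MMR: lam=1.0 ranks by relevance only, lower values favor diversity."""
    relevance = [_cosine(query_emb, c["embedding"]) for c in candidates]
    remaining = list(range(len(candidates)))
    selected: List[int] = []
    while remaining and len(selected) < top_k:
        best_idx, best_score = remaining[0], float("-inf")
        for idx in remaining:
            # Redundancy: closest match among the chunks already chosen.
            redundancy = max(
                (
                    _cosine(candidates[idx]["embedding"], candidates[s]["embedding"])
                    for s in selected
                ),
                default=0.0,
            )
            score = lam * relevance[idx] - (1.0 - lam) * redundancy
            if score > best_score:
                best_idx, best_score = idx, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return [candidates[i] for i in selected]
```

With lam = 1.0 the result is a plain relevance ranking; lowering lam penalizes near-duplicates of chunks that were already picked.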

process_prompt(user_message)[source]

Answer a question by retrieving relevant chunks and generating a grounded response.

Return type:

Dict[str, Any]

safe_ollama_embed_batch(texts)[source]

Embed a batch; if the batch fails, embed items individually with shrink-on-failure.

Return type:

List[List[float]]
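
The batch-then-fallback pattern can be sketched with two injected callables. Both embed_batch_fn and embed_one_fn are hypothetical parameters introduced for this example; in the class, the per-item path would presumably be embed_one_ollama.

```python
from typing import Callable, List


def embed_batch_with_fallback(
    texts: List[str],
    embed_batch_fn: Callable[[List[str]], List[List[float]]],
    embed_one_fn: Callable[[str], List[float]],
) -> List[List[float]]:
    """Try one batched call; on any failure, embed each text on its own."""
    try:
        return embed_batch_fn(texts)
    except Exception:
        # A single oversized item can fail the whole batch, so retry per item
        # and let the per-item function apply its own recovery logic.
        return [embed_one_fn(text) for text in texts]
```

The output order matches the input order in both paths, which matters when embeddings are later paired with chunk ids.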

save_manifest(manifest)[source]

Write manifest to disk atomically.

Return type:

None
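
A typical way to write a file atomically is to write to a temporary file in the same directory and then rename it over the target with os.replace. The sketch below assumes the manifest is stored as JSON, which this reference does not state; the function name is illustrative.

```python
import json
import os
import tempfile
from typing import Any, Dict


def save_json_atomically(path: str, data: Dict[str, Any]) -> None:
    """Write data to path so readers never observe a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is interrupted before os.replace, the previous manifest remains intact.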

translate_to_english(text)[source]

Translate text to English while preserving numbers and table-like structure.

Return type:

str

update_data()[source]

Ingest new or changed files into Chroma and remove entries for files that no longer exist, using a local manifest to track file signatures.

Return type:

None
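
Manifest-based incremental ingestion comes down to comparing the stored signatures with the current ones. The sketch below assumes the manifest maps file paths to signature dicts; that layout and the function name diff_manifest are assumptions for illustration.

```python
from typing import Any, Dict, List, Tuple


def diff_manifest(
    old: Dict[str, Any], current: Dict[str, Any]
) -> Tuple[List[str], List[str]]:
    """Return (paths to ingest, paths to delete) from two path -> signature maps."""
    # New files are missing from the old manifest; changed files differ in signature.
    to_ingest = sorted(path for path, sig in current.items() if old.get(path) != sig)
    # Removed files appear in the old manifest but are no longer present.
    to_delete = sorted(path for path in old if path not in current)
    return to_ingest, to_delete
```

Unchanged files appear in neither list, so they are not embedded again.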

class gemini_application.chatpopup.chatpopup.ChunkRecord(chunk_id, text, metadata)[source]

Bases: object

A single text chunk with id and metadata for storage and citation.

chunk_id: str
metadata: Dict[str, Any]
text: str
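
The constructor signature and typed fields are consistent with a dataclass, although this reference does not confirm how ChunkRecord is declared. An equivalent stand-in, named ChunkRecordSketch to avoid implying it is the real class, could look like this; the id format and metadata keys in the usage lines are made up for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class ChunkRecordSketch:
    """Stand-in mirroring the documented fields of ChunkRecord."""
    chunk_id: str
    text: str
    metadata: Dict[str, Any]


# Hypothetical usage: the id scheme and metadata keys are illustrative only.
record = ChunkRecordSketch(
    chunk_id="manual.pdf:p3:c0",
    text="Pump A runs at 5 l/s.",
    metadata={"source": "manual.pdf", "page": 3},
)
```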