BepsBot | AI & LLM Engineering Portfolio

The Mission

Online mental health communities (OMHCs) are vital spaces for peer support, but users often struggle to write comments that are both empathetic and informative. Many comments lack either emotional support (ES) or informational support (IS), and users may feel uncertain about the quality or helpfulness of their responses. Traditional writing assistance tools are not tailored to the sensitive, high-stakes context of mental health support, and can frustrate users or lead to insincere, repetitive, or even risky content. There is a clear need for intelligent, context-aware tools that guide users in composing high-quality, safe, and supportive comments in real time.

Data & Annotation

Data Collection and Annotation

BepsBot’s data pipeline begins with large-scale extraction of mental health discussions from Reddit, covering six years of posts and comments. The system collects and aggregates this data through the Pushshift API, then cleans, normalizes, and formats it. To balance data quality and computational cost, only comments that meet strict length and relevance criteria are retained in the raw dataset.
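The cleaning and filtering step can be sketched as follows. This is a minimal illustration, not the project's actual code: the word-count thresholds, the `id`/`body` field names, and the function names are assumptions, and only the length criterion is shown because the relevance criteria are not specified here.

```python
import re

# Hypothetical thresholds: the actual length criteria are not stated in the text.
MIN_WORDS = 10
MAX_WORDS = 300
# Placeholder bodies Reddit shows for deleted or moderator-removed comments.
PLACEHOLDER_BODIES = {"[deleted]", "[removed]"}


def normalize(text):
    """Remove URLs and collapse runs of whitespace into single spaces."""
    without_urls = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", without_urls).strip()


def clean_comments(comments):
    """Return normalized copies of the comments that pass the length filter."""
    kept = []
    for comment in comments:
        body = normalize(comment["body"])
        if body in PLACEHOLDER_BODIES:
            continue  # nothing left to analyze
        if MIN_WORDS <= len(body.split()) <= MAX_WORDS:
            kept.append({"id": comment["id"], "body": body})
    return kept
```

Normalizing before counting words keeps the length check from being inflated by links or stray whitespace.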

Domain experts use an annotation interface to label a representative sample of comments for Informational Support (IS) and Emotional Support (ES) on a three-point ordinal scale (low, medium, high). IS is operationalized as the presence of advice, referrals, or knowledge, while ES is defined by evidence of understanding, encouragement, affirmation, sympathy, or caring. Annotators label the data independently, and quality control and validation procedures keep inter-rater reliability high (Cohen’s κ > 0.85). Discrepancies are adjudicated by a third expert, resulting in a gold-standard dataset for supervised learning.
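For reference, Cohen's κ compares the observed agreement between two raters with the agreement expected by chance. The sketch below computes the unweighted statistic for two label lists; the function name and the example labels are illustrative and do not come from the project's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Fraction of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


labels_a = ["low", "low", "medium", "high"]
labels_b = ["low", "medium", "medium", "high"]
kappa = cohens_kappa(labels_a, labels_b)
```

The unweighted form treats every disagreement equally. For an ordinal scale such as low/medium/high, a weighted variant that penalizes low-versus-high disagreements more heavily is a common alternative; the text does not say which variant was used.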

Model Training

Model Training and Active Learning

The expert-annotated dataset is used to train and fine-tune transformer-based models, such as RoBERTa, for regression-based IS/ES scoring. The modeling pipeline adds custom regression heads and tunes the loss function, and model performance is evaluated with standard metrics (e.g., F1 score, R²). BepsBot also runs an active learning loop: new user interactions are pseudo-labeled, filtered for quality, and incorporated into iterative retraining cycles. This supports continuous model refinement, domain adaptation, and deployment of updated models.
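As an illustration of the regression metrics mentioned above, the sketch below computes mean squared error and R² for predicted support scores. The function names and sample values are hypothetical; how F1 is applied to regression outputs (for example, after discretizing scores) is not specified here, so it is left out.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between gold and predicted scores."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all gold scores are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical gold scores (0 = low, 1 = medium, 2 = high) and model outputs.
gold = [0.0, 1.0, 2.0, 2.0]
predicted = [0.2, 0.9, 1.7, 2.1]
mse = mean_squared_error(gold, predicted)
r2 = r_squared(gold, predicted)
```

An R² of 1.0 means the predictions match the gold scores exactly, while 0.0 means they do no better than always predicting the mean gold score.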

System Design

Decoupled Inference Architecture

BepsBot is a modular AI system leveraging large language models (LLMs), vector databases, and advanced NLP pipelines to assist users in composing high-quality, supportive comments in online mental health communities. The system integrates Retrieval-Augmented Generation (RAG), prompt engineering, and fine-tuned models for real-time feedback and recommendations.

Figure 1: Hybrid architecture combining symbolic NLP (LIWC) and Neural Transformers.

RAG Implementation

Semantic Search & Vector Retrieval

To reduce hallucinations, BepsBot uses a Retrieval-Augmented Generation (RAG) pipeline with ChromaDB to ground the LLM in relevant, expert-labeled context.

Vector Embeddings

Drafts are converted to high-dimensional vectors using all-MiniLM-L6-v2 for semantic matching.
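Semantic matching over embeddings usually comes down to ranking stored vectors by cosine similarity to the query vector. The sketch below shows that ranking with toy 3-dimensional vectors standing in for the 384-dimensional embeddings that all-MiniLM-L6-v2 produces. In the real system a vector database handles this search, so the function names and data here are purely illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_u * norm_v)


def top_k(query_vec, corpus, k=3):
    """Return the k (text, score) pairs most similar to the query vector."""
    scored = [(text, cosine_similarity(query_vec, vec)) for text, vec in corpus]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors; real entries would be embeddings of expert-labeled comments.
corpus = [
    ("comment A", [1.0, 0.0, 0.0]),
    ("comment B", [0.0, 1.0, 0.0]),
    ("comment C", [1.0, 1.0, 0.0]),
]
matches = top_k([1.0, 0.0, 0.0], corpus, k=2)
```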

Contextual Prompting

Top-ranked examples are injected into the DeepSeek-V3 prompt to align outputs with clinical best practices.
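One way to inject retrieved examples into a prompt is plain string assembly, sketched below. The prompt wording, the `build_prompt` name, and the `text`/`is_score`/`es_score` fields are assumptions made for illustration. The actual prompt template is not shown in this document.

```python
def build_prompt(post, draft, examples, max_examples=3):
    """Assemble a feedback prompt from the post, the draft, and retrieved examples."""
    lines = [
        "You are helping a peer supporter improve a comment in a mental health community.",
        "",
        "Original post:",
        post.strip(),
        "",
        "Examples of highly rated supportive comments:",
    ]
    # Keep only the top-ranked examples so the prompt stays short.
    for i, example in enumerate(examples[:max_examples], start=1):
        lines.append(
            f"{i}. (IS={example['is_score']:.2f}, ES={example['es_score']:.2f}) "
            f"{example['text'].strip()}"
        )
    lines += [
        "",
        "User draft:",
        draft.strip(),
        "",
        "Suggest specific edits that increase informational and emotional support.",
    ]
    return "\n".join(lines)
```

Capping the number of examples bounds prompt length and keeps the highest-ranked context closest to the instruction.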

Core Engines

Machine Learning Lifecycle

BepsBot integrates predictive assessment, generative recommendations, and a self-evolving data loop to maintain high support standards.

Assessment Algorithm

Real-time quantification of empathy using Transformer-based Regression. The model analyzes linguistic markers to provide instant feedback on support quality.

Informational Support (IS)

Measures the degree of advice, suggestions, and helpful knowledge provided in a response.

Emotional Support (ES)

Evaluates expressions of empathy, validation, and emotional presence using sentiment-aware attention layers.

Recommendation Algorithm

A hybrid generative approach that suggests improvements based on high-performing historical peer interactions.

Candidate Generation: Retrieval of top-k similar high-quality comments from the vector database.

Safety Filtering: Multi-layer content moderation to ensure non-toxic, clinically appropriate suggestions.
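The two stages above can be pictured as a retrieval result passed through a chain of independent checks. The sketch below is an assumption-laden illustration: the layer types (a phrase blocklist and a score threshold), the function names, and the stub scorer are hypothetical, and the actual moderation layers are not described in this document.

```python
def blocklist_check(blocked_phrases):
    """Build a check that rejects text containing any blocked phrase."""
    lowered = [phrase.lower() for phrase in blocked_phrases]

    def check(text):
        haystack = text.lower()
        return not any(phrase in haystack for phrase in lowered)

    return check


def max_score_check(scorer, threshold):
    """Build a check that rejects text whose risk score reaches the threshold.

    `scorer` is any callable mapping text to a float, for example a
    toxicity classifier; it is a hypothetical dependency here.
    """

    def check(text):
        return scorer(text) < threshold

    return check


def filter_candidates(candidates, checks):
    """Keep only the candidates that pass every moderation layer."""
    return [c for c in candidates if all(check(c) for check in checks)]


def stub_toxicity(text):
    """Stand-in scorer used only for this example."""
    return 0.9 if "stupid" in text.lower() else 0.1


checks = [blocklist_check(["just get over it"]), max_score_check(stub_toxicity, 0.5)]
safe = filter_candidates(
    ["That sounds really hard.", "Just get over it.", "That is a stupid worry."],
    checks,
)
```

Because each layer is an independent callable, a new check can be added without touching the others.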

Automated Active Learning Loop

I developed a continuous improvement pipeline that addresses the "cold start" problem in support scoring.

Data Ingestion: Captures real-world user interactions in structured JSON logs.

Pseudo-Labeling: Automated scoring using current RoBERTa-Base predictors (regression heads).

Fine-Tuning: High-confidence samples are integrated into the training set for iterative model updates.
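The selection of high-confidence samples can be sketched as below. How confidence is measured is not stated, so this example assumes one common choice: several predictors (or repeated stochastic passes of one model) score each sample, and a sample is kept only when their scores agree closely. The `max_std` threshold, the `text` field, and the stub predictors are all hypothetical.

```python
import json
from statistics import mean, pstdev


def select_pseudo_labels(log_lines, predictors, max_std=0.1):
    """Return (text, pseudo_label) pairs whose predictions agree closely.

    log_lines: iterable of JSON strings, each with a "text" field (assumed schema).
    predictors: callables mapping text to a float support score.
    """
    selected = []
    for line in log_lines:
        text = json.loads(line)["text"]
        scores = [predict(text) for predict in predictors]
        # Low spread across predictors is used here as a proxy for confidence.
        if pstdev(scores) <= max_std:
            selected.append((text, mean(scores)))
    return selected


# Stub predictors standing in for fine-tuned regression models.
def predictor_a(text):
    return 0.80 if "sorry" in text else 0.20


def predictor_b(text):
    return 0.82 if "sorry" in text else 0.90


logs = [
    '{"text": "I am so sorry you are going through this."}',
    '{"text": "Try the campus counseling center."}',
]
new_training_pairs = select_pseudo_labels(logs, [predictor_a, predictor_b])
```

Only the first log entry is selected here, because the two stub predictors disagree sharply on the second.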

AI Engineering Tech Stack

Python, FastAPI, PyTorch, Hugging Face, ChromaDB, LangChain, DeepSeek-V3, scikit-learn, LIWC