AI Car



I built an AI car based on a Raspberry Pi 4.
This is my first project in embedded systems.
I used two models to drive the car:
  1. A tiny neural network that predicts the rotation angle of the wheels
  2. YOLOv8 for detecting obstacles, crosswalks, etc.
Preview
Pipeline

This document provides a detailed technical overview of the AI pipeline used in the Hospital Recommendation System.

1. Project Overview

The system is designed to provide personalized hospital recommendations based on user-described symptoms. It leverages a hybrid approach, combining a Large Language Model (LLM) for semantic understanding with sentence embeddings for efficient retrieval.


2. Technical Architecture

Phase A: Semantic Subject Extraction (Gemini)

When a user enters symptoms, the system uses Google Gemini to bridge the gap between layman's terms and medical terminology.

  1. Initial Analysis: The raw symptom text is sent to Gemini to understand the medical context.
  2. Subject Refinement: A second, targeted prompt extracts the specific expected medical department (e.g., "내과" (internal medicine), "안과" (ophthalmology)).
  3. Prompt Engineering: The system explicitly requests "plain text" without markdown formatting or special characters to ensure the output is compatible with the downstream embedding model.
  4. Localization: The term "성북구" (Seongbuk-gu) is appended to the extracted department to prioritize local results during the similarity search. (The district is configurable; I used "성북구" because I live in Seongbuk-gu.)
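The Phase A steps above can be sketched as follows. This is a minimal illustration, not the project's actual code: the prompts are paraphrased, and `call_gemini` is a stub standing in for a real SDK call (e.g., `google.generativeai`'s `GenerativeModel(...).generate_content(prompt).text`).

```python
def call_gemini(prompt: str) -> str:
    """Stub for a real Gemini API call; returns a fixed department
    so the sketch runs without an API key."""
    return "내과"

def extract_query(symptoms: str, district: str = "성북구") -> str:
    # 1. Initial analysis: ask Gemini to interpret the raw symptom text medically.
    analysis = call_gemini(f"Analyze these symptoms in a medical context: {symptoms}")

    # 2-3. Subject refinement: a second prompt extracts only the expected
    # department, explicitly requesting plain text with no markdown or
    # special characters so the output feeds cleanly into the embedding model.
    department = call_gemini(
        "From this analysis, answer with only the expected medical department, "
        "in plain text, no markdown or special characters: " + analysis
    ).strip()

    # 4. Localization: append the district to bias retrieval toward local results.
    return f"{department} {district}"

print(extract_query("기침이 나고 열이 있어요"))  # "내과 성북구" with the stub
```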

Phase B: Advanced Hospital Embedding Strategy

Unlike simple text search, this system uses a Weighted Multi-Field Embedding approach to represent hospitals in a high-dimensional vector space.

  • Base Model: jhgan/ko-sroberta-sts
    • Type: SBERT (Sentence-BERT) based on the RoBERTa architecture.
    • Specialization: Fine-tuned specifically for Korean Semantic Textual Similarity (STS).
  • Weighted Combination:
    For every hospital in the database (hospital_info.csv), a single representative vector is created by embedding four fields separately and combining them using specific weights:
    Field            Weight  Description
    medical_subject  0.4     The primary factor for matching symptoms.
    address          0.4     Ensures geographic relevance.
    hospital_name    0.1     Provides identity context.
    opening_hours    0.1     Adds temporal context.
  • Caching: To ensure low-latency responses, these embeddings are pre-computed and stored as .pt (PyTorch) files in the cache/ directory.
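A minimal sketch of the weighted multi-field embedding and the `.pt` caching step. The `encode` function here is a deterministic stand-in so the sketch runs offline; a real run would call `SentenceTransformer("jhgan/ko-sroberta-sts").encode(...)`, and the file path is illustrative.

```python
import hashlib
import torch

DIM = 768  # output dimension of jhgan/ko-sroberta-sts
WEIGHTS = {"medical_subject": 0.4, "address": 0.4,
           "hospital_name": 0.1, "opening_hours": 0.1}

def encode(text: str) -> torch.Tensor:
    """Stand-in for the SBERT encoder: a deterministic pseudo-embedding
    seeded from the text, so the example runs without downloading the model."""
    seed = int.from_bytes(hashlib.sha256(text.encode()).digest()[:8], "big")
    gen = torch.Generator().manual_seed(seed)
    return torch.randn(DIM, generator=gen)

def hospital_vector(row: dict) -> torch.Tensor:
    # Embed each of the four fields separately, then combine with fixed weights
    # into a single representative vector per hospital.
    vec = torch.zeros(DIM)
    for field, weight in WEIGHTS.items():
        vec += weight * encode(row[field])
    return vec

rows = [{"medical_subject": "내과", "address": "서울 성북구",
         "hospital_name": "A의원", "opening_hours": "09:00-18:00"}]
matrix = torch.stack([hospital_vector(r) for r in rows])

# Caching: pre-compute once and serialize, as the pipeline does with .pt files.
torch.save(matrix, "hospital_embeddings.pt")
```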

Phase C: Semantic Ranking & Retrieval

The retrieval process finds the "mathematical nearest neighbors" to the user's query.

  1. Query Vectorization: The query (Extracted Subject + "성북구") is converted into a vector using the same ko-sroberta-sts model.
  2. Similarity Calculation: The system computes the Cosine Similarity between the query vector and the pre-computed hospital embedding matrix.
  3. Top-K Retrieval: The k hospitals (default: 10) with the highest similarity scores are selected.
  4. Formatting: Results are mapped back to the original CSV data and displayed to the user with details including Name, Subject, and Address.
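The ranking step above amounts to a cosine-similarity top-k lookup, sketched below. Function and variable names are assumptions for illustration; the toy 4-dimensional vectors stand in for the real 768-dimensional embeddings.

```python
import torch
import torch.nn.functional as F

def top_k_hospitals(query_vec: torch.Tensor,
                    hospital_matrix: torch.Tensor,
                    k: int = 10):
    """Rank hospitals by cosine similarity to the query vector.
    query_vec: (dim,) tensor; hospital_matrix: (n_hospitals, dim) tensor."""
    sims = F.cosine_similarity(query_vec.unsqueeze(0), hospital_matrix, dim=1)
    k = min(k, hospital_matrix.size(0))
    scores, indices = torch.topk(sims, k)  # highest-similarity rows first
    return indices.tolist(), scores.tolist()

# Tiny demonstration: three fake hospital vectors, query matches row 0 exactly.
hospitals = torch.tensor([[1.0, 0.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0, 0.0],
                          [0.9, 0.1, 0.0, 0.0]])
query = torch.tensor([1.0, 0.0, 0.0, 0.0])
idx, scores = top_k_hospitals(query, hospitals, k=2)
print(idx)  # [0, 2]: the exact match first, then the near match
```

The returned indices are then mapped back to rows of hospital_info.csv for display.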

3. Key Specifications

Component            Specification
LLM Model            Gemini
Embedding Model      jhgan/ko-sroberta-sts (HuggingFace)
Embedding Dimension  768
Similarity Metric    Cosine Similarity
Data Format          CSV (Pandas)
Vector Storage       PyTorch serialization (.pt)

4. Pipeline Flow Summary

User symptom input → Gemini (Subject Extraction) → SBERT (Query Encoding) → Cosine Similarity vs. Weighted Hospital Matrix → Top-10 Ranking → UI Display