AI Smart Assistant

Face Recognition | Voice Activation | GenAI-powered Responses

Developed by Jon Spahiu (a student passionate about AI)

Scroll down to explore the project details.

Project Overview

Welcome, Hack Club visitors!

This AI Smart Assistant combines face recognition, speech-to-text, retrieval-augmented generation (RAG), and OpenAI’s GPT models to create a secure, intelligent assistant experience.

--------- Core Features ---------

Face Detection and Authentication
Voice Activation and Speech Recognition
Dynamic Answer Generation with RAG and ChatGPT
Real-time Text-to-Speech Response

Keep scrolling for:

- Full Project Explanation

- Demonstration Video

- Presentation Slides

Vote!

If you are part of Hack Club, please follow and/or vote for my project. I am still finishing and publishing it, and I am hoping to earn some shells to support it!

The link to my project and the iframe are below if you would like to follow and/or vote for it!

Hack Club Summer Project: https://summer.hackclub.com/projects/7286

GitHub

Check out my GitHub README file below!

Technical Breakdown

--------- Development Language ---------

100% built in Python, one of the most widely used languages for AI and machine learning projects.

--------- Why Python? ---------

Beginner-friendly and versatile
Industry-standard for AI development
Massive ecosystem of AI libraries and frameworks

--------- Project Workflow ---------

Step	Description
1	Detect known faces and activate the microphone securely
2	Convert user speech into text input
3	Query the FAISS vector database to retrieve relevant information (the retrieval step of RAG)
4	Generate accurate responses using OpenAI's GPT models
5	Convert the response back into speech for the user
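
The five steps above can be sketched as a small pipeline. This is only an illustration, not the project's actual code: the class name AssistantPipeline and its five injected callables are hypothetical, and simple stand-in functions take the place of the real camera, microphone, FAISS, and OpenAI calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AssistantPipeline:
    """Wires the five workflow steps together; every stage is injected."""
    recognize_face: Callable[[], Optional[str]]  # step 1: known user's name, or None
    listen: Callable[[], str]                    # step 2: speech -> text
    retrieve: Callable[[str], List[str]]         # step 3: question -> relevant passages
    generate: Callable[[str, List[str]], str]    # step 4: question + passages -> answer
    speak: Callable[[str], None]                 # step 5: text -> speech

    def run_once(self) -> Optional[str]:
        user = self.recognize_face()
        if user is None:
            # Unknown face: the microphone is never activated.
            return None
        question = self.listen()
        passages = self.retrieve(question)
        answer = self.generate(question, passages)
        self.speak(answer)
        return answer


if __name__ == "__main__":
    # Stand-in stages (assumptions for the demo, not real hardware or API calls).
    demo = AssistantPipeline(
        recognize_face=lambda: "Jon",
        listen=lambda: "What is FAISS?",
        retrieve=lambda q: ["FAISS is used here as the vector database."],
        generate=lambda q, ctx: f"Based on {len(ctx)} passage(s): {ctx[0]}",
        speak=print,
    )
    demo.run_once()
```

Passing the stages in as functions keeps each step replaceable, so a real face recognizer or speech engine could be swapped in without changing the flow.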

--------- Challenges & Solutions ---------

Component	Solution	Libraries/Tools
Face Recognition	Leveraged facial recognition libraries and integrated with a Pygame visualization window.	face_recognition, Pygame
Speech Recognition	Used Google's Speech Recognition API for speech-to-text and pyttsx3 for text-to-speech output.	SpeechRecognition, pyttsx3
Vector Database	Used embeddings to convert text into numeric vectors, which are stored in a vector database and compared by similarity to find the most relevant passages.	LangChain (embeddings, vector store)
FAISS	Utilized FAISS for vector database queries to enhance response accuracy.	Meta FAISS, LangChain
Query and Response	Integrated OpenAI GPT models with RAG to produce context-aware answers.	OpenAI API, LangChain (LLM + RAG)
Multithreading	Used multithreading so that the assistant's components can run concurrently.	Python Threading
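
As a rough sketch of the multithreading row, the example below runs a face watcher and a voice listener in separate threads and uses a threading.Event so the listener only starts after a known face is seen. The function names, the string "frames", and the timeout value are all assumptions made for illustration; the real project works with camera frames and microphone audio.

```python
import queue
import threading


def face_watcher(frames, known_faces, authenticated):
    """Stand-in for the camera loop: sets the event once a known face appears."""
    for face in frames:
        if face in known_faces:
            authenticated.set()
            return


def voice_listener(utterances, authenticated, heard, timeout):
    """Stand-in for the microphone loop: waits for authentication before listening."""
    if not authenticated.wait(timeout):
        return  # no known face showed up in time, so nothing is recorded
    for text in utterances:
        heard.put(text)


def run_demo(frames, utterances, known_faces=("jon",), timeout=1.0):
    """Run both loops concurrently and return what the listener captured."""
    authenticated = threading.Event()
    heard = queue.Queue()
    workers = [
        threading.Thread(target=face_watcher, args=(frames, set(known_faces), authenticated)),
        threading.Thread(target=voice_listener, args=(utterances, authenticated, heard, timeout)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    results = []
    while not heard.empty():
        results.append(heard.get())
    return results


if __name__ == "__main__":
    print(run_demo(["stranger", "jon"], ["hello assistant"]))
```

The event keeps the two threads loosely coupled: the listener does not need to know how faces are detected, only whether someone has been authenticated.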

--------- Key Concepts ---------

LLM (Large Language Model): An AI model trained on massive datasets to generate human-like text (e.g., the GPT models behind ChatGPT).

RAG (Retrieval-Augmented Generation): Combines an LLM with an external knowledge source. Relevant passages are retrieved first and passed to the model as context, so its answers are grounded in that data.
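
To make the retrieval idea concrete, here is a minimal sketch in plain Python. It is a deliberately simplified stand-in: word counts play the role of embeddings and a sorted list plays the role of the FAISS index, whereas the project itself uses LangChain with FAISS and OpenAI models. The helper names embed, cosine, retrieve, and build_prompt are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a word-count vector (a real system uses a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=2):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=2):
    """Put the retrieved passages in front of the question for the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "The assistant activates the microphone after detecting a known face.",
        "FAISS stores embedding vectors for similarity search.",
        "The response is converted back into speech for the user.",
    ]
    print(build_prompt("How does FAISS search vectors?", docs, k=1))
```

The prompt returned by build_prompt is what would be sent to the language model, which then answers from the retrieved context instead of from memory alone.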

--------- Project Snapshots ---------