yuuyle/pico-chan

日本語 | English

Pico-chan v2 - AI Companion Robot for Children (Web Version)

A real-time AI conversation robot designed for 4-year-old children, delivered as a web service that runs in a tablet browser.

Features

  • Real-time Voice Conversation - Low-latency speech through the Gemini Live API (native audio)
  • Camera Recognition - "Look at this!" - recognizes objects that children hold up to the camera
  • Play Modes - Rock-paper-scissors, Chase game
  • Memory - Remembers names, favorites, and past events
  • Session Time Limit - Auto-ends after 30 minutes (child safety)
  • ESP32 WiFi Integration (Optional) - Physical robot control
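
The session time limit listed above can be illustrated with a small timer. This is a minimal sketch, not the project's code: the `SessionTimer` class and its `clock` parameter are hypothetical names, and the actual logic lives in session_manager.py and useSessionTimer.ts, which are not shown here. Only the 30-minute default comes from this README.

```python
import time
from typing import Callable


class SessionTimer:
    """Tracks how long a session has run against a fixed limit (hypothetical helper)."""

    def __init__(self, max_minutes: float = 30,
                 clock: Callable[[], float] = time.monotonic):
        # A monotonic clock is used so wall-clock adjustments cannot extend a session.
        self._limit_seconds = max_minutes * 60
        self._clock = clock
        self._started_at = clock()

    def remaining_seconds(self) -> float:
        """Seconds left before the limit, never below zero."""
        elapsed = self._clock() - self._started_at
        return max(0.0, self._limit_seconds - elapsed)

    def expired(self) -> bool:
        """True once the full limit has elapsed."""
        return self.remaining_seconds() == 0.0
```

A server loop could poll `expired()` and close the connection when it returns True. Injecting the clock keeps the timer testable without waiting 30 minutes.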

Architecture

┌────────────────────┐         ┌─────────────────────────────────┐
│   Tablet           │  WSS    │     Google Cloud Run            │
│   (Browser)        │◄───────►│  FastAPI + ADK + Gemini Live   │
│                    │         └─────────────────────────────────┘
│ - React App        │
│ - Camera/Mic       │
│ - Speaker          │
│ - Pico Animation   │
│                    │
│    ┌───────────┐   │
│    │ WebSocket │   │ (Optional)
│    └─────┬─────┘   │
└──────────┼─────────┘
           │ WiFi (192.168.4.1:81)
           ▼
┌────────────────────┐
│ ESP32-S3-WROOM     │
│ AP Mode: pico-chan │
│ - Motor/LED/Buzzer │
└────────────────────┘

Quick Start (Local Development)

1. Start Backend

cd backend
cp .env.example .env    # Edit Google Cloud settings
pip install -r requirements.txt
uvicorn app.main:app --reload --port 8080    # uvicorn defaults to 8000; the frontend's VITE_SERVER_URL expects 8080

2. Start Frontend

cd frontend
npm install
npm run dev

Open http://localhost:5173 in your browser.

3. Basic Usage

  1. You will see "Enter your 4-digit PIN". Type 1234 (default) to start.
  2. Once "Connected" is shown, click the microphone, camera, and WiFi (optional) buttons to enable them.
  3. Talk with Pico-chan!
  4. Say "Draw me a picture of <something>" and Pico-chan will generate an image for you.
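
The PIN prompt in step 1 implies a comparison against a configured value. The README does not say whether that check runs in the browser, on the server, or both, so the following is only a sketch of what a server-side comparison could look like. The function name `pin_matches` is hypothetical.

```python
import hmac


def pin_matches(submitted: str, expected: str) -> bool:
    """Return True when the submitted PIN equals the configured one (hypothetical helper)."""
    # compare_digest takes time independent of how many leading characters match,
    # which a plain == comparison does not guarantee.
    return hmac.compare_digest(submitted.encode("utf-8"), expected.encode("utf-8"))
```

For example, `pin_matches("1234", "1234")` is True, while `pin_matches("123", "1234")` is False.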

Environment Variables

Backend (backend/.env)

# Google Cloud (Vertex AI)
GOOGLE_GENAI_USE_VERTEXAI=TRUE
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1

# Gemini Live API
GEMINI_MODEL=gemini-live-2.5-flash-native-audio
GEMINI_VOICE=Aoede

# App Settings
APP_LANGUAGE=en          # "ja" or "en"
ROBOT_NAME=Pico          # Robot's name
MAX_SESSION_MINUTES=30
APP_PIN=1234

# CORS
ALLOWED_ORIGINS=["http://localhost:5173"]
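
To show how the app settings above might be read and validated, here is a minimal standard-library sketch. It is not the project's core/config.py, which may well use a different mechanism. The `Settings` dataclass and `load_settings` function are hypothetical, the fallback values simply mirror the sample values above, and the sketch assumes `ALLOWED_ORIGINS` is a JSON array, as its bracketed form suggests.

```python
import json
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    app_language: str
    robot_name: str
    max_session_minutes: int
    app_pin: str
    allowed_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment-like mapping (hypothetical helper)."""
    language = env.get("APP_LANGUAGE", "en")
    if language not in ("ja", "en"):
        raise ValueError(f"APP_LANGUAGE must be 'ja' or 'en', got {language!r}")
    return Settings(
        app_language=language,
        robot_name=env.get("ROBOT_NAME", "Pico"),
        # Environment values are strings, so the minute limit is converted here.
        max_session_minutes=int(env.get("MAX_SESSION_MINUTES", "30")),
        app_pin=env.get("APP_PIN", "1234"),
        # Assumption: the value is a JSON array of origin strings.
        allowed_origins=json.loads(
            env.get("ALLOWED_ORIGINS", '["http://localhost:5173"]')
        ),
    )
```

Note that `os.environ` does not read a `.env` file by itself; the variables must already be exported, or the file must be loaded by whatever tooling the backend uses.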

Frontend (frontend/.env)

VITE_SERVER_URL=ws://localhost:8080
VITE_APP_PIN=1234
VITE_APP_LANGUAGE=en     # "ja" or "en"

Production Deployment

Backend (Cloud Run)

cd backend
./deploy.sh

Frontend (Firebase Hosting)

cd frontend
# Set VITE_SERVER_URL in .env.production to the Cloud Run URL
firebase login
./deploy.sh

Project Structure

pico-chan-v2/
├── backend/                # FastAPI Backend
│   ├── app/
│   │   ├── main.py         # Entry point
│   │   ├── core/
│   │   │   ├── agent.py    # ADK Agent definition
│   │   │   └── config.py   # Configuration
│   │   ├── api/
│   │   │   └── websocket.py # WebSocket endpoint
│   │   └── services/
│   │       ├── session_manager.py  # ADK Session
│   │       └── tool_handler.py     # Tools/Animation
│   ├── Dockerfile
│   ├── deploy.sh
│   └── requirements.txt
│
├── frontend/               # React Frontend
│   ├── src/
│   │   ├── App.tsx
│   │   ├── components/
│   │   │   ├── PicoCharacter/   # Animated character
│   │   │   ├── SessionControls/ # Control buttons
│   │   │   ├── SafetyDialog/    # Time limit dialog
│   │   │   └── ...
│   │   ├── hooks/
│   │   │   ├── useLiveAPI.ts    # WebSocket + Audio
│   │   │   └── useSessionTimer.ts
│   │   ├── i18n/                # Internationalization
│   │   │   ├── translations.ts
│   │   │   └── useTranslation.ts
│   │   └── services/
│   │       ├── AudioInput.ts    # Microphone
│   │       ├── AudioOutput.ts   # Speaker
│   │       ├── CameraCapture.ts # Camera
│   │       └── ESP32Controller.ts # WiFi control
│   ├── firebase.json
│   ├── deploy.sh
│   └── package.json
│
└── arduino/                # ESP32 Sketch
    └── pico_chan_esp32/
        └── pico_chan_esp32.ino  # WiFi WebSocket version

Tech Stack

  • Backend: FastAPI, Google ADK, Gemini Live API
  • Frontend: React, TypeScript, Vite
  • Infrastructure: Google Cloud Run, Firebase Hosting
  • Hardware: ESP32-S3-WROOM (WiFi, optional)

Localization

This app supports both Japanese and English:

Setting                      Japanese     English
Backend APP_LANGUAGE         ja           en
Backend ROBOT_NAME           ぴこちゃん   Pico
Frontend VITE_APP_LANGUAGE   ja           en

ESP32 Robot Integration (Optional)

WiFi Connection

  1. Flash the sketch in arduino/pico_chan_esp32/ to the ESP32
  2. Connect the tablet to the "pico-chan" WiFi network
    • Password: picochan123
  3. Tap the "WiFi" button in the app
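
Once the tablet has joined the access point, the app talks to the ESP32 over a WebSocket at 192.168.4.1:81 (per the architecture diagram). The message format is not documented in this README. Because ArduinoJson is a required library, JSON text frames are a plausible guess, and the sketch below builds such a frame. The `action` field, the action names, and `build_command` are all hypothetical; check pico_chan_esp32.ino for the real command set.

```python
import json

# Address and port taken from the architecture diagram in this README.
ESP32_WS_URL = "ws://192.168.4.1:81"

# Hypothetical action names; the real sketch defines its own commands.
KNOWN_ACTIONS = {"forward", "backward", "stop", "led", "buzzer"}


def build_command(action: str, **params) -> str:
    """Serialize one robot command as a compact JSON text frame (assumed format)."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    # Compact separators keep the payload small for the microcontroller to parse.
    return json.dumps({"action": action, **params}, separators=(",", ":"))
```

The resulting string would be sent as a text message over a WebSocket connection opened to `ESP32_WS_URL`. Rejecting unknown actions before sending keeps malformed commands off the robot.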

ESP32 Pin Configuration

Function                Pin
Motor A EN              GPIO 5
Motor A IN1/IN2         GPIO 6, 7
Motor B EN              GPIO 15
Motor B IN1/IN2         GPIO 16, 17
Ultrasonic TRIG/ECHO    GPIO 18, 8
Buzzer                  GPIO 9
Built-in LED            GPIO 48

Required Libraries

  • ArduinoJson
  • WebSockets (by Markus Sattler)

License

MIT
