Pico-chan is a real-time AI conversation robot designed for 4-year-old children. It runs as a web service accessible from tablet browsers.
- Real-time Voice Conversation - Ultra-low latency with Gemini Live API (Native Audio)
- Camera Recognition - "Look at this!" - recognizes objects children show
- Play Modes - Rock-paper-scissors, Chase game
- Memory - Remembers names, favorites, and past events
- Session Time Limit - Auto-ends after 30 minutes (child safety)
- ESP32 WiFi Integration (Optional) - Physical robot control
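
To illustrate the session time limit, here is a minimal Python sketch of how a 30-minute cutoff could be tracked. The `SessionTimer` class and its method names are hypothetical and are not taken from the project's code; only the 30-minute default comes from the feature list above.

```python
import time
from typing import Optional

MAX_SESSION_MINUTES = 30  # documented default for the session limit


class SessionTimer:
    """Tracks elapsed time for one conversation session (hypothetical helper)."""

    def __init__(self, max_minutes: int = MAX_SESSION_MINUTES,
                 now: Optional[float] = None) -> None:
        self.limit_seconds = max_minutes * 60
        # A monotonic clock is unaffected by wall-clock adjustments.
        self.started_at = time.monotonic() if now is None else now

    def remaining_seconds(self, now: Optional[float] = None) -> float:
        """Return the seconds left in the session, never below zero."""
        current = time.monotonic() if now is None else now
        return max(0.0, self.limit_seconds - (current - self.started_at))

    def is_expired(self, now: Optional[float] = None) -> bool:
        """True once the full session length has elapsed."""
        return self.remaining_seconds(now) == 0.0
```

The optional `now` argument exists only so the timer can be exercised without waiting; a server would normally call the methods without it.
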
    ┌────────────────────┐         ┌─────────────────────────────────┐
    │  Tablet            │   WSS   │  Google Cloud Run               │
    │  (Browser)         │◄───────►│  FastAPI + ADK + Gemini Live    │
    │                    │         └─────────────────────────────────┘
    │  - React App       │
    │  - Camera/Mic      │
    │  - Speaker         │
    │  - Pico Animation  │
    │                    │
    │    ┌───────────┐   │
    │    │ WebSocket │   │  (Optional)
    │    └─────┬─────┘   │
    └──────────┼─────────┘
               │ WiFi (192.168.4.1:81)
               ▼
    ┌────────────────────┐
    │ ESP32-S3-WROOM     │
    │(AP Mode: pico-chan)│
    │ - Motor/LED/Buzzer │
    └────────────────────┘
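
As a rough illustration of the optional tablet-to-ESP32 link in the diagram, the Python sketch below builds the WebSocket URL and a JSON command string. The address `192.168.4.1:81` comes from the diagram; the `{"action": ...}` message schema and the function names are assumptions made for illustration, since the actual protocol is defined by the Arduino sketch.

```python
import json

ESP32_HOST = "192.168.4.1"  # AP-mode address shown in the diagram
ESP32_PORT = 81             # WebSocket port shown in the diagram


def esp32_url(host: str = ESP32_HOST, port: int = ESP32_PORT) -> str:
    """Return the WebSocket URL of the ESP32 access point."""
    return f"ws://{host}:{port}"


def build_command(action: str, **params) -> str:
    """Serialize a command as compact JSON.

    The {"action": ...} schema is hypothetical, not the project's protocol.
    """
    if not action:
        raise ValueError("action must be a non-empty string")
    payload = {"action": action, **params}
    return json.dumps(payload, separators=(",", ":"), sort_keys=True)
```

A client would send the string returned by `build_command` over a WebSocket connection opened at `esp32_url()`.
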
Backend:

    cd backend
    cp .env.example .env  # Edit Google Cloud settings
    pip install -r requirements.txt
    uvicorn app.main:app --reload

Frontend:

    cd frontend
    npm install
    npm run dev

Open http://localhost:5173 in your browser.

- You will see "Enter your 4-digit PIN". Type `1234` (default) to start.
- Once "Connected" is shown, click the microphone, camera, and WiFi (optional) buttons to enable them.
- Talk with Pico-chan!
- Say "Draw me a picture of ..." and Pico-chan will generate an image for you.

Backend environment variables (`.env`):

    # Google Cloud (VertexAI)
    GOOGLE_GENAI_USE_VERTEXAI=TRUE
    GOOGLE_CLOUD_PROJECT=your-project-id
    GOOGLE_CLOUD_LOCATION=us-central1
    # Gemini Live API
    GEMINI_MODEL=gemini-live-2.5-flash-native-audio
    GEMINI_VOICE=Aoede
    # App Settings
    APP_LANGUAGE=en  # "ja" or "en"
    ROBOT_NAME=Pico  # Robot's name
    MAX_SESSION_MINUTES=30
    APP_PIN=1234
    # CORS
    ALLOWED_ORIGINS=["http://localhost:5173"]

Frontend environment variables:

    VITE_SERVER_URL=ws://localhost:8080
    VITE_APP_PIN=1234
    VITE_APP_LANGUAGE=en  # "ja" or "en"

Deploy the backend:

    cd backend
    ./deploy.sh

Deploy the frontend:

    cd frontend
    # Set Cloud Run URL in .env.production
    firebase login
    ./deploy.sh

Project structure:

    pico-chan-v2/
    ├── backend/  # FastAPI Backend
    │   ├── app/
    │   │   ├── main.py  # Entry point
    │   │   ├── core/
    │   │   │   ├── agent.py  # ADK Agent definition
    │   │   │   └── config.py  # Configuration
    │   │   ├── api/
    │   │   │   └── websocket.py  # WebSocket endpoint
    │   │   └── services/
    │   │       ├── session_manager.py  # ADK Session
    │   │       └── tool_handler.py  # Tools/Animation
    │   ├── Dockerfile
    │   ├── deploy.sh
    │   └── requirements.txt
    │
    ├── frontend/  # React Frontend
    │   ├── src/
    │   │   ├── App.tsx
    │   │   ├── components/
    │   │   │   ├── PicoCharacter/  # Animated character
    │   │   │   ├── SessionControls/  # Control buttons
    │   │   │   ├── SafetyDialog/  # Time limit dialog
    │   │   │   └── ...
    │   │   ├── hooks/
    │   │   │   ├── useLiveAPI.ts  # WebSocket + Audio
    │   │   │   └── useSessionTimer.ts
    │   │   ├── i18n/  # Internationalization
    │   │   │   ├── translations.ts
    │   │   │   └── useTranslation.ts
    │   │   └── services/
    │   │       ├── AudioInput.ts  # Microphone
    │   │       ├── AudioOutput.ts  # Speaker
    │   │       ├── CameraCapture.ts  # Camera
    │   │       └── ESP32Controller.ts  # WiFi control
    │   ├── firebase.json
    │   ├── deploy.sh
    │   └── package.json
    │
    └── arduino/  # ESP32 Sketch
        └── pico_chan_esp32/
            └── pico_chan_esp32.ino  # WiFi WebSocket version

- Backend: FastAPI, Google ADK, Gemini Live API
- Frontend: React, TypeScript, Vite
- Infrastructure: Google Cloud Run, Firebase Hosting
- Hardware: ESP32-S3-WROOM (WiFi, optional)
This app supports both Japanese and English:
| Setting | Japanese | English |
|---|---|---|
| Backend `APP_LANGUAGE` | `ja` | `en` |
| Backend `ROBOT_NAME` | `ぴこちゃん` | `Pico` |
| Frontend `VITE_APP_LANGUAGE` | `ja` | `en` |
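
The backend half of this table could be resolved at startup roughly as follows. This is a sketch under stated assumptions: `LanguageSettings`, `load_language_settings`, the fallback to `en` when `APP_LANGUAGE` is unset, and the per-language default robot name are all hypothetical and may differ from what `config.py` actually does.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Default robot names per language, taken from the table above.
DEFAULT_ROBOT_NAMES = {"ja": "ぴこちゃん", "en": "Pico"}


@dataclass(frozen=True)
class LanguageSettings:
    language: str
    robot_name: str


def load_language_settings(env: Mapping[str, str] = os.environ) -> LanguageSettings:
    """Read APP_LANGUAGE and ROBOT_NAME, validating the language code."""
    # Assumption: fall back to English when APP_LANGUAGE is not set.
    language = env.get("APP_LANGUAGE", "en").strip().lower()
    if language not in DEFAULT_ROBOT_NAMES:
        raise ValueError(f"APP_LANGUAGE must be 'ja' or 'en', got {language!r}")
    # An explicit ROBOT_NAME wins; otherwise use the language default.
    robot_name = env.get("ROBOT_NAME") or DEFAULT_ROBOT_NAMES[language]
    return LanguageSettings(language=language, robot_name=robot_name)
```

Rejecting unknown language codes early keeps a typo in `.env` from silently producing a mixed-language session.
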
- Flash the sketch (`arduino/pico_chan_esp32/`) to the ESP32
- Connect the tablet to the WiFi network "pico-chan"
  - Password: `picochan123`
- Tap the "WiFi" button in the app

| Function | Pin |
|---|---|
| Motor A EN | GPIO 5 |
| Motor A IN1/IN2 | GPIO 6, 7 |
| Motor B EN | GPIO 15 |
| Motor B IN1/IN2 | GPIO 16, 17 |
| Ultrasonic TRIG/ECHO | GPIO 18, 8 |
| Buzzer | GPIO 9 |
| Built-in LED | GPIO 48 |
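
When rewiring, it can help to keep the pin table in one place and check that no GPIO is assigned twice. The Python sketch below does that; the constant names are hypothetical, and it assumes the paired entries map in order (IN1 to the first GPIO, IN2 to the second, TRIG to 18, ECHO to 8), which the table implies but does not state outright.

```python
from collections import Counter
from typing import Dict, List

# GPIO assignments from the table above (names are illustrative).
PIN_MAP: Dict[str, int] = {
    "MOTOR_A_EN": 5,
    "MOTOR_A_IN1": 6,
    "MOTOR_A_IN2": 7,
    "MOTOR_B_EN": 15,
    "MOTOR_B_IN1": 16,
    "MOTOR_B_IN2": 17,
    "ULTRASONIC_TRIG": 18,
    "ULTRASONIC_ECHO": 8,
    "BUZZER": 9,
    "BUILTIN_LED": 48,
}


def find_pin_conflicts(pin_map: Dict[str, int]) -> Dict[int, List[str]]:
    """Return {gpio: [function names]} for every GPIO used more than once."""
    counts = Counter(pin_map.values())
    return {
        gpio: sorted(name for name, pin in pin_map.items() if pin == gpio)
        for gpio, count in counts.items()
        if count > 1
    }
```

An empty result from `find_pin_conflicts(PIN_MAP)` means every function has its own GPIO.
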
- ArduinoJson
- WebSockets (by Markus Sattler)
MIT