Build a Chat Assistant with AI
Remember in Star Trek when the crew would casually chat with the ship's computer, asking it complex questions and getting thoughtful responses? What seemed like pure science fiction in the 1960s is now something you can build using web technologies you already know. In this lesson, we'll create an AI chat assistant using HTML, CSS, JavaScript, and some backend integration.

You'll discover how the same skills you've been learning can connect to powerful AI services that can understand context and generate meaningful responses. Think of AI like having access to a vast library that can not only find information but also synthesize it into coherent answers tailored to your specific questions. Instead of searching through thousands of pages, you get direct, contextual responses.

The integration happens through familiar web technologies working together. HTML creates the chat interface, CSS handles the visual design, JavaScript manages user interactions, and a backend API connects everything to AI services. It's similar to how different sections of an orchestra work together to create a symphony.

We're essentially building a bridge between natural human communication and machine processing. You'll learn both the technical implementation of AI service integration and the design patterns that make interactions feel intuitive. By the end of this lesson, AI integration will feel less like a mysterious process and more like another API you can work with. You'll understand the foundational patterns that power applications like ChatGPT and Claude, using the same web development principles you've been learning.
⚡ What You Can Do in the Next 5 Minutes
Quick Start Pathway for Busy Developers
- Minute 1: Visit GitHub Models Playground and create a personal access token
- Minute 2: Test AI interactions directly in the playground interface
- Minute 3: Click "Code" tab and copy the Python snippet
- Minute 4: Run the code locally with your token: GITHUB_TOKEN=your_token python test.py
- Minute 5: Watch your first AI response generate from your own code

Why This Matters: In 5 minutes, you'll experience the magic of programmatic AI interaction. This represents the fundamental building block that powers every AI application you use.
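Here's a sketch of that quick test using only the Python standard library. The endpoint path and payload shape follow the OpenAI-compatible convention that GitHub Models exposes; if the playground's generated snippet differs, prefer that one. The helper names (`build_payload`, `quick_test`) are introduced here for illustration:

```python
import json
import os
import urllib.request

def build_payload(message: str) -> dict:
    """Assemble an OpenAI-style chat completion request body."""
    return {
        "model": "openai/gpt-4o-mini",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": message},
        ],
    }

def quick_test(message: str) -> str:
    """POST one message to GitHub Models and return the reply text."""
    request = urllib.request.Request(
        # Assumed OpenAI-compatible path under the GitHub Models base URL
        "https://models.github.ai/inference/chat/completions",
        data=json.dumps(build_payload(message)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        },
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__" and os.environ.get("GITHUB_TOKEN"):
    print(quick_test("Say hello in five words."))
```

Run it with `GITHUB_TOKEN=your_token python test.py`; without a token set, the script does nothing rather than failing.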
🗺️ Your Learning Journey Through AI Application Development
Your Journey Destination: By the end of this lesson, you'll have built a complete AI-powered application using the same technologies and patterns that power modern AI assistants like ChatGPT, Claude, and Google Bard.
Understanding AI: From Mystery to Mastery
Before diving into the code, let's understand what we're working with. If you've used APIs before, you know the basic pattern: send a request, receive a response. AI APIs follow a similar structure, but instead of retrieving pre-stored data from a database, they generate new responses based on patterns learned from vast amounts of text. Think of it like the difference between a library catalog system and a knowledgeable librarian who can synthesize information from multiple sources.
What is "Generative AI" Really?
Consider how the Rosetta Stone allowed scholars to understand Egyptian hieroglyphics by finding patterns between known and unknown languages. AI models work similarly: they find patterns in vast amounts of text to understand how language works, then use those patterns to generate appropriate responses to new questions. Let me break this down with a simple comparison:
- Traditional database: Like asking for your birth certificate. You get the exact same document every time.
- Search engine: Like asking a librarian to find books about cats. They show you what's available.
- Generative AI: Like asking a knowledgeable friend about cats. They tell you interesting things in their own words, tailored to what you want to know.
How AI Models Learn (The Simple Version)
AI models learn through exposure to enormous datasets containing text from books, articles, and conversations. Through this process, they identify patterns in:
- How thoughts are structured in written communication
- Which words commonly appear together
- How conversations typically flow
- Contextual differences between formal and informal communication

It's similar to how archaeologists decode ancient languages: they analyze thousands of examples to understand grammar, vocabulary, and cultural context, eventually becoming able to interpret new texts using those learned patterns.
Why GitHub Models?
We're using GitHub Models for a pretty practical reason: it gives us access to enterprise-level AI without having to set up our own AI infrastructure (which, trust me, you don't want to do right now!). Think of it like using a weather API instead of trying to predict the weather yourself by setting up weather stations everywhere. It's basically "AI-as-a-Service," and the best part? It's free to get started, so you can experiment without worrying about running up a huge bill.

We'll use GitHub Models for our backend integration, which provides access to professional-grade AI capabilities through a developer-friendly interface. The GitHub Models Playground serves as a testing environment where you can experiment with different AI models and understand their capabilities before implementing them in code.
🧠 AI Application Development Ecosystem
Core Principle: AI application development combines traditional web development skills with AI service integration, creating intelligent applications that feel natural and responsive to users.

Here's what makes the playground so useful:
- Try out different AI models like GPT-4o-mini, Claude, and others (all free!)
- Test your ideas and prompts before you write any code
- Get ready-to-use code snippets in your favorite programming language
- Tweak settings like creativity level and response length to see how they affect the output

Once you've played around a bit, just click the "Code" tab and pick your programming language to get the implementation code you'll need.
Setting Up the Python Backend Integration
Now let's implement the AI integration using Python. Python is excellent for AI applications because of its simple syntax and powerful libraries. We'll start with the code from GitHub Models playground and then refactor it into a reusable, production-ready function.
Understanding the Base Implementation
When you grab the Python code from the playground, you'll get something that looks like this. Don't worry if it seems like a lot at first; let's walk through it piece by piece.

Here's what's happening in this code:
- We import the tools we need: os for reading environment variables and OpenAI for talking to the AI
- We set up the OpenAI client to point to GitHub's AI servers instead of OpenAI directly
- We authenticate using a special GitHub token (more on that in a minute!)
- We structure our conversation with different "roles": think of it like setting the scene for a play
- We send our request to the AI with some fine-tuning parameters
- We extract the actual response text from all the data that comes back
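A sketch matching those six bullets might look like the following. The model name and parameter values mirror the defaults used later in this lesson, and `build_messages` is a helper name introduced here for clarity; treat this as an approximation of the playground's generated code, not an exact copy:

```python
import os

def build_messages(prompt: str, system_message: str) -> list:
    """Structure the conversation with 'system' and 'user' roles."""
    return [
        {"role": "system", "content": system_message},
        {"role": "user", "content": prompt},
    ]

def ask_model(prompt: str, system_message: str = "You are a helpful assistant.") -> str:
    """Send one prompt to GitHub Models and return the response text."""
    from openai import OpenAI  # requires `pip install openai`

    # Point the OpenAI client at GitHub's AI servers, authenticated with a GitHub token
    client = OpenAI(
        base_url="https://models.github.ai/inference",
        api_key=os.environ["GITHUB_TOKEN"],
    )
    response = client.chat.completions.create(
        messages=build_messages(prompt, system_message),
        model="openai/gpt-4o-mini",
        temperature=1.0,  # fine-tuning parameters, covered in the next section
        max_tokens=4096,
        top_p=1.0,
    )
    # Extract the actual response text from all the data that comes back
    return response.choices[0].message.content
```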
Understanding Message Roles: The AI Conversation Framework
AI conversations use a specific structure with different "roles" that serve distinct purposes. Think of it like directing a play:
- System role: Like stage directions for an actor. It tells the AI how to behave, what personality to have, and how to respond
- User role: The actual question or message from the person using your application
- Assistant role: The AI's response (you don't send this, but it appears in conversation history)

Real-world analogy: Imagine you're introducing a friend to someone at a party:
- System message: "This is my friend Sarah, she's a doctor who's great at explaining medical concepts in simple terms"
- User message: "Can you explain how vaccines work?"
- Assistant response: Sarah responds as a friendly doctor, not as a lawyer or a chef
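In code, this role structure is just a list of dictionaries. Here's a hypothetical multi-turn history using the doctor persona from the party analogy, showing all three roles (the `last_user_message` helper is introduced here for illustration):

```python
# A conversation history as the API sees it: each turn is a role-tagged dict.
conversation = [
    # System role: stage directions that set the AI's persona
    {"role": "system", "content": "You are Sarah, a doctor who explains medical concepts in simple terms."},
    # User role: the person's actual question
    {"role": "user", "content": "Can you explain how vaccines work?"},
    # Assistant role: the AI's earlier reply, kept so follow-ups have context
    {"role": "assistant", "content": "Think of a vaccine as a wanted poster for your immune system..."},
    # A follow-up question that only makes sense with the history above
    {"role": "user", "content": "Why do some vaccines need boosters?"},
]

def last_user_message(history: list) -> str:
    """Return the most recent user turn from a conversation history."""
    return next(m["content"] for m in reversed(history) if m["role"] == "user")
```

Sending the whole list (not just the latest question) is what lets the model answer "Why do some vaccines need boosters?" in context.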
Understanding AI Parameters: Fine-Tuning Response Behavior
The numerical parameters in AI API calls control how the model generates responses. These settings allow you to adjust the AI's behavior for different use cases:
Temperature (0.0 to 2.0): The Creativity Dial
What it does: Controls how creative or predictable the AI's responses will be. Think of it like a jazz musician's improvisation level:
- Temperature = 0.1: Playing the exact same melody every time (highly predictable)
- Temperature = 0.7: Adding some tasteful variations while staying recognizable (balanced creativity)
- Temperature = 1.5: Full experimental jazz with unexpected turns (highly unpredictable)
Max Tokens (1 to 4096+): The Response Length Controller
What it does: Sets a limit on how long the AI's response can be. Think of tokens as roughly equivalent to words (about 1 token = 0.75 words in English):
- max_tokens=50: Short and sweet (like a text message)
- max_tokens=500: A nice paragraph or two
- max_tokens=2000: A detailed explanation with examples
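That rule of thumb (1 token is roughly 0.75 English words) is easy to encode as a rough estimator when choosing a max_tokens budget; these are back-of-envelope approximations, not a real tokenizer:

```python
def approx_words(max_tokens: int) -> int:
    """Rough word budget for a token limit (1 token ~ 0.75 English words)."""
    return int(max_tokens * 0.75)

def approx_tokens(word_count: int) -> int:
    """Rough token cost for a word count; handy when picking max_tokens."""
    return int(word_count / 0.75)
```

So a `max_tokens=500` budget allows roughly 375 words, about "a nice paragraph or two" as described above.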
Top_p (0.0 to 1.0): The Focus Parameter
What it does: Controls how focused the AI stays on the most likely responses. Picture the AI having a huge vocabulary, ranked by how likely each word is:
- top_p=0.1: Only considers the top 10% most likely words (very focused)
- top_p=0.9: Considers 90% of possible words (more creative)
- top_p=1.0: Considers everything (maximum variety)

For example: If you ask "The sky is usually..."
- Low top_p: Almost definitely says "blue"
- High top_p: Might say "blue", "cloudy", "vast", "changing", "beautiful", etc.
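Putting numbers to those dials, here is an illustrative set of presets for different kinds of applications. The exact values are starting points to experiment with in the playground, not official recommendations:

```python
# Illustrative parameter bundles: low temperature for consistency,
# high temperature for variety; max_tokens sized to the expected reply.
PRESETS = {
    "customer_service": {"temperature": 0.2, "max_tokens": 300, "top_p": 0.5},
    "balanced_chat":    {"temperature": 0.7, "max_tokens": 800, "top_p": 0.9},
    "creative_writing": {"temperature": 1.3, "max_tokens": 2000, "top_p": 1.0},
}

def settings_for(use_case: str) -> dict:
    """Look up a preset, falling back to the balanced profile."""
    return PRESETS.get(use_case, PRESETS["balanced_chat"])
```

You would then splat a preset into the API call, e.g. `client.chat.completions.create(..., **settings_for("customer_service"))`.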
Putting It All Together: Parameter Combinations for Different Use Cases
Understanding why these parameters matter: Different applications need different types of responses. A customer service bot should be consistent and factual (low temperature), while a creative writing assistant should be imaginative and varied (high temperature). Understanding these parameters gives you control over your AI's personality and response style.

```python
import asyncio
import logging
import os

from openai import AsyncOpenAI

logger = logging.getLogger(__name__)

# Use AsyncOpenAI for better performance
client = AsyncOpenAI(
    base_url="https://models.github.ai/inference",
    api_key=os.environ["GITHUB_TOKEN"],
)

async def call_llm_async(prompt: str, system_message: str = "You are a helpful assistant."):
    """
    Sends a prompt to the AI model asynchronously and returns the response.

    Args:
        prompt: The user's question or message
        system_message: Instructions that define the AI's behavior and personality

    Returns:
        str: The AI's response to the prompt
    """
    try:
        response = await client.chat.completions.create(
            messages=[
                {"role": "system", "content": system_message},
                {"role": "user", "content": prompt},
            ],
            model="openai/gpt-4o-mini",
            temperature=1,
            max_tokens=4096,
            top_p=1,
        )
        return response.choices[0].message.content
    except Exception as e:
        logger.error(f"AI API error: {str(e)}")
        return "I'm sorry, I'm having trouble processing your request right now."

# Backward compatibility function for synchronous calls
def call_llm(prompt: str, system_message: str = "You are a helpful assistant."):
    """Synchronous wrapper for async AI calls."""
    return asyncio.run(call_llm_async(prompt, system_message))
```
```python
# ❌ Vague system prompt
"You are helpful."

# ✅ Detailed, effective system prompt
"You are Dr. Sarah Chen, a senior software engineer with 15 years of experience at major tech companies. You explain programming concepts using real-world analogies and always provide practical examples. You're patient with beginners and enthusiastic about helping them understand complex topics."
```
```python
# Example 1: The Patient Teacher
teacher_prompt = """
You are an experienced programming instructor who has taught thousands of students.
You break down complex concepts into simple steps, use analogies from everyday life,
and always check if the student understands before moving on. You're encouraging
and never make students feel bad for not knowing something.
"""

# Example 2: The Creative Collaborator
creative_prompt = """
You are a creative writing partner who loves brainstorming wild ideas. You're
enthusiastic, imaginative, and always build on the user's ideas rather than
replacing them. You ask thought-provoking questions to spark creativity and
offer unexpected perspectives that make stories more interesting.
"""

# Example 3: The Strategic Business Advisor
business_prompt = """
You are a strategic business consultant with an MBA and 20 years of experience
helping startups scale. You think in frameworks, provide structured advice, and
always consider both short-term tactics and long-term strategy. You ask probing
questions to understand the full business context before giving advice.
"""
```
```python
# With teacher prompt:
teacher_response = call_llm(
    "How do I handle user authentication in my web app?",
    teacher_prompt,
)
# Typical response: "Great question! Let's break authentication down into simple steps.
# Think of it like a nightclub bouncer checking IDs..."

# With business prompt:
business_response = call_llm(
    "How do I handle user authentication in my web app?",
    business_prompt,
)
# Typical response: "From a strategic perspective, authentication is crucial for user
# trust and regulatory compliance. Let me outline a framework considering security,
# user experience, and scalability..."
```
```python
system_prompt = """
You are helping a junior developer who just started their first job at a startup.
They know basic HTML/CSS/JavaScript but are new to backend development and databases.
Be encouraging and explain things step-by-step without being condescending.
"""

system_prompt = """
You are a technical mentor. Always structure your responses as:
- Quick Answer (1-2 sentences)
- Detailed Explanation
- Code Example
- Common Pitfalls to Avoid
- Next Steps for Learning
"""

system_prompt = """
You are a coding tutor focused on teaching best practices. Never write complete
solutions for the user - instead, guide them with hints and questions so they
learn by doing. Always explain the 'why' behind coding decisions.
"""
```

```mermaid
sequenceDiagram
    participant User as 👤 User
    participant Frontend as 🌐 Frontend
    participant API as 🔧 FastAPI Server
    participant AI as 🤖 AI Service
    User->>Frontend: Types "Hello AI!"
    Frontend->>API: POST /hello {"message": "Hello AI!"}
    Note over API: Validates request<br/>Adds system prompt
    API->>AI: Sends formatted request
    AI->>API: Returns AI response
    Note over API: Processes response<br/>Logs conversation
    API->>Frontend: {"response": "Hello! How can I help?"}
    Frontend->>User: Displays AI message
```

```mermaid
sequenceDiagram
    participant Frontend
    participant FastAPI
    participant AI Function
    participant GitHub Models
    Frontend->>FastAPI: POST /hello {"message": "Hello AI!"}
    FastAPI->>AI Function: call_llm(message, system_prompt)
    AI Function->>GitHub Models: API request
    GitHub Models->>AI Function: AI response
    AI Function->>FastAPI: response text
    FastAPI->>Frontend: {"response": "Hello! How can I help?"}
```

```mermaid
flowchart TD
    A[User Input] --> B[Frontend Validation]
    B --> C[HTTP POST Request]
    C --> D[FastAPI Router]
    D --> E[Pydantic Validation]
    E --> F[AI Function Call]
    F --> G[GitHub Models API]
    G --> H[Response Processing]
    H --> I[JSON Response]
    I --> J[Frontend Update]
    subgraph "Security Layer"
        K[CORS Middleware]
        L[Environment Variables]
        M[Error Handling]
    end
    D --> K
    F --> L
    H --> M
```
```python
# api.py
import logging

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel

from llm import call_llm_async

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Create FastAPI application
app = FastAPI(
    title="AI Chat API",
    description="A high-performance API for AI-powered chat applications",
    version="1.0.0",
)

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],  # Configure appropriately for production
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Pydantic models for request/response validation
class ChatMessage(BaseModel):
    message: str

class ChatResponse(BaseModel):
    response: str

@app.get("/")
async def root():
    """Root endpoint providing API information."""
    return {
        "message": "Welcome to the AI Chat API",
        "docs": "/docs",
        "health": "/health",
    }

@app.get("/health")
async def health_check():
    """Health check endpoint."""
    return {"status": "healthy", "service": "ai-chat-api"}

@app.post("/hello", response_model=ChatResponse)
async def chat_endpoint(chat_message: ChatMessage):
    """Main chat endpoint that processes messages and returns AI responses."""
    try:
        # Extract and validate message
        message = chat_message.message.strip()
        if not message:
            raise HTTPException(status_code=400, detail="Message cannot be empty")

        logger.info(f"Processing message: {message[:50]}...")

        # Call AI service asynchronously for better performance
        ai_response = await call_llm_async(message, "You are a helpful and friendly assistant.")

        logger.info("AI response generated successfully")
        return ChatResponse(response=ai_response)
    except HTTPException:
        raise
    except Exception as e:
        logger.error(f"Error processing chat message: {str(e)}")
        raise HTTPException(status_code=500, detail="Internal server error")

if __name__ == "__main__":
    import uvicorn
    uvicorn.run("api:app", host="0.0.0.0", port=5000, reload=True)
```

```python
from fastapi.middleware.cors import CORSMiddleware

app.add_middleware(CORSMiddleware, allow_origins=["*"])
# This tells browsers: "It's okay for other origins to make requests to this API"
```
```python
# 🚨 Development: Allows ALL origins (convenient but insecure)
app.add_middleware(CORSMiddleware, allow_origins=["*"])

# ✅ Production: Only allow your specific frontend domain
app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://yourdomain.com", "https://www.yourdomain.com"],
)

# 🌍 Advanced: Different origins for different environments
if app.debug:  # Development mode
    app.add_middleware(
        CORSMiddleware,
        allow_origins=["http://localhost:3000", "http://127.0.0.1:3000"],
    )
else:  # Production mode
    app.add_middleware(CORSMiddleware, allow_origins=["https://yourdomain.com"])
```
```python
# Validate that we received a message
if not message:
    raise HTTPException(status_code=400, detail="Message field is required")
```
```bash
# Navigate to your backend directory
cd backend

# Create a virtual environment (like creating a clean room for your project)
python -m venv venv

# Activate it (Linux/Mac)
source ./venv/bin/activate

# On Windows, use:
# venv\Scripts\activate

# Install the good stuff
pip install openai fastapi uvicorn python-dotenv
```
```python
# 🚨 NEVER DO THIS - API key visible to everyone
client = OpenAI(
    api_key="ghp_1234567890abcdef...",  # Anyone can steal this!
    base_url="https://models.github.ai/inference",
)

# ✅ DO THIS - API key stored securely
client = OpenAI(
    api_key=os.environ["GITHUB_TOKEN"],  # Only your app can access this
    base_url="https://models.github.ai/inference",
)
```
```bash
# .env file - This should NEVER be committed to Git
GITHUB_TOKEN=your_github_personal_access_token_here
FASTAPI_DEBUG=True
ENVIRONMENT=development

# Example of what your token looks like (this is fake!)
# GITHUB_TOKEN=ghp_1A2B3C4D5E6F7G8H9I0J1K2L3M4N5O6P7Q8R
```

```python
import os

from dotenv import load_dotenv
from openai import OpenAI

# Load environment variables from .env file
load_dotenv()

# Now you can access them securely
api_key = os.environ.get("GITHUB_TOKEN")
if not api_key:
    raise ValueError("GITHUB_TOKEN not found in environment variables!")

client = OpenAI(
    api_key=api_key,
    base_url="https://models.github.ai/inference",
)
```
```bash
# .gitignore - Add these lines
.env
*.env
.env.local
.env.production
__pycache__/
venv/
.vscode/
```

```bash
# .env.development
GITHUB_TOKEN=your_development_token
DEBUG=True

# .env.production
GITHUB_TOKEN=your_production_token
DEBUG=False
```
```bash
# Method 1: Direct Python execution (includes auto-reload)
python api.py

# Method 2: Using Uvicorn directly (more control)
uvicorn api:app --host 0.0.0.0 --port 5000 --reload
```

You should see output like:

```
$ python api.py
INFO:     Will watch for changes in these directories: ['/your/project/path']
INFO:     Uvicorn running on http://0.0.0.0:5000 (Press CTRL+C to quit)
INFO:     Started reloader process [12345] using WatchFiles
INFO:     Started server process [12346]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
```
```bash
# Test with curl (if available)
curl -X POST http://localhost:5000/hello \
  -H "Content-Type: application/json" \
  -d '{"message": "Hello AI!"}'

# Expected response:
# {"response": "Hello! I'm your AI assistant. How can I help you today?"}
```
```python
# test_api.py - Create this file to test your API
import requests

# Test the API endpoint
url = "http://localhost:5000/hello"
data = {"message": "Tell me a joke about programming"}

response = requests.post(url, json=data)

if response.status_code == 200:
    result = response.json()
    print("AI Response:", result["response"])
else:
    print("Error:", response.status_code, response.text)
```
```python
# Enable hot reloading explicitly
if __name__ == "__main__":
    import uvicorn
    uvicorn.run("api:app", host="0.0.0.0", port=5000, reload=True)  # reload=True enables hot reload
```

```python
import logging

# Set up logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

@app.post("/hello")
async def hello(chat_message: ChatMessage):
    message = chat_message.message
    logger.info(f"Received message: {message}")
    if not message:
        logger.warning("Empty message received")
        raise HTTPException(status_code=400, detail="Message field is required")
    try:
        response = await call_llm_async(message, "You are a helpful and friendly assistant.")
        logger.info("AI response generated successfully")
        return {"response": response}
    except Exception as e:
        logger.error(f"AI API error: {str(e)}")
        raise HTTPException(status_code=500, detail="AI service temporarily unavailable")
```

```bash
cd backend
python api.py
```

Your Codespace forwards the port to a public URL like:

```
https://your-codespace-name-5000.app.github.dev
```

```javascript
// In your frontend app.js, update the BASE_URL:
this.BASE_URL = "https://your-codespace-name-5000.app.github.dev";
```

The forwarded URL follows the pattern `https://[codespace-name]-[port].app.github.dev`. Opening it in a browser should show: "Welcome to the AI Chat API. Send POST requests to /hello with JSON payload containing 'message' field."

```javascript
// Open browser console and test your API
fetch('https://your-codespace-name-5000.app.github.dev/hello', {
  method: 'POST',
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({message: 'Hello from Codespaces!'})
})
  .then(response => response.json())
  .then(data => console.log(data));
```
```bash
# Set environment variable for the current session
export GITHUB_TOKEN="your_token_here"

# Or add to your .bashrc for persistence
echo 'export GITHUB_TOKEN="your_token_here"' >> ~/.bashrc
```

```mermaid
graph TD
    A[User Types Message] --> B[JavaScript Captures Input]
    B --> C[Validate & Format Data]
    C --> D[Send to Backend API]
    D --> E[Display Loading State]
    E --> F[Receive AI Response]
    F --> G[Update Chat Interface]
    G --> H[Ready for Next Message]
```

```mermaid
classDiagram
    class ChatApp {
        +messages: HTMLElement
        +form: HTMLElement
        +input: HTMLElement
        +sendButton: HTMLElement
        +BASE_URL: string
        +API_ENDPOINT: string
        +constructor()
        +initializeEventListeners()
        +handleSubmit(event)
        +callAPI(message)
        +appendMessage(text, role)
        +escapeHtml(text)
        +scrollToBottom()
        +setLoading(isLoading)
    }
    ChatApp --> DOM : manipulates
    ChatApp --> FastAPI : sends requests
```

```
frontend/
├── index.html   # Main HTML structure
├── app.js       # JavaScript functionality
└── styles.css   # Visual styling
```
```mermaid
flowchart LR
    A[⚡ 5 minutes] --> B[Get GitHub token]
    B --> C[Test AI playground]
    C --> D[Copy Python code]
    D --> E[See AI responses]
```
Follow the lesson from Microsoft Web-Dev-For-Beginners course