The Flutter Kit
Tutorial

Flutter ChatGPT Starter Kit — How to Build an AI Flutter App with OpenAI (2026)

A practical, security-first guide to building Flutter apps with OpenAI — streaming chat, images, vision, and where to deploy the proxy.

Ahmed Gagan
17 min read

Every week I see another indie developer embed their OpenAI API key directly in their Flutter app, ship it to the App Store, and get a $2,400 bill three days later. This guide is the antidote. I will walk you through the exact architecture I use to ship AI Flutter apps — including a proxy server that keeps your key safe, streaming chat responses in Dart, DALL-E image generation, GPT-4 Vision, rate limiting, error handling, and where to deploy it all.

By the end you will have a working mental model for building AI Flutter apps and a concrete starter stack. If you want the whole thing pre-wired, skip to the end and grab The Flutter Kit.

Rule #1: Your OpenAI API Key Never Ships in the App

This is the lesson I want tattooed on every Flutter developer's monitor. When you embed sk-proj-... in your Dart code — even in an obfuscated build — attackers can and will extract it within minutes using strings, otool, or a Frida hook. The industry has seen six-figure bills from this exact mistake. The fix is not "obfuscate harder." The fix is a proxy server.
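The extraction is trivial to reproduce. As a sketch of an attacker's first pass, here is a hypothetical scan for the `sk-` prefix over raw binary bytes (the key and bytes below are fabricated for illustration):

```python
import re

# OpenAI-style secret keys share the "sk-" prefix, so a plain regex over the
# app bundle's raw bytes finds them with zero reverse engineering.
KEY_PATTERN = re.compile(rb"sk-[A-Za-z0-9_-]{20,}")

def find_embedded_keys(binary: bytes) -> list[str]:
    """Return any OpenAI-looking keys present in raw binary data."""
    return [match.decode() for match in KEY_PATTERN.findall(binary)]

# Fabricated stand-in for a compiled Flutter bundle with a (fake) key inside.
fake_binary = b"\x7fELF...junk...sk-proj-abc123def456ghi789jkl0...more junk"
print(find_embedded_keys(fake_binary))  # the fake key pops right out
```

This is essentially what `strings` piped through `grep` does; obfuscation only slows it down.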

Here is the architecture. Your Flutter app talks to your server. Your server holds the OpenAI key and talks to OpenAI. Your server can enforce authentication, rate limits, and quotas. This is non-negotiable for any production AI Flutter app.

| Layer | Responsibility | Has OpenAI Key? |
| --- | --- | --- |
| Flutter app | UI, auth token, user input | Never |
| Proxy server | Auth check, rate limit, forward to OpenAI | Yes (env var) |
| OpenAI API | Generate completions, images, embeddings | N/A |

Building the Proxy (Flask Example)

You can use any backend you like — Node.js, Go, Python, Firebase Cloud Functions. I will show Flask because it is dead simple and easy to deploy anywhere. Here is a production-ready proxy with Firebase auth verification, streaming, and rate limiting.

# app.py
import os
from flask import Flask, request, Response, jsonify
from openai import OpenAI
import firebase_admin
from firebase_admin import auth, credentials
from functools import wraps
from collections import defaultdict
from time import time

app = Flask(__name__)
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
firebase_admin.initialize_app(credentials.Certificate("service.json"))

# In-memory rate limiter (use Redis in production)
RATE_LIMITS = defaultdict(list)
MAX_REQ_PER_MIN = 20

def require_user(fn):
    @wraps(fn)
    def wrapped(*args, **kwargs):
        token = request.headers.get("Authorization", "").replace("Bearer ", "")
        try:
            user = auth.verify_id_token(token)
            request.user_id = user["uid"]
        except Exception:
            return jsonify({"error": "unauthorized"}), 401
        now = time()
        RATE_LIMITS[request.user_id] = [
            t for t in RATE_LIMITS[request.user_id] if now - t < 60
        ]
        if len(RATE_LIMITS[request.user_id]) >= MAX_REQ_PER_MIN:
            return jsonify({"error": "rate_limited"}), 429
        RATE_LIMITS[request.user_id].append(now)
        return fn(*args, **kwargs)
    return wrapped

@app.post("/chat")
@require_user
def chat():
    body = request.json
    messages = body["messages"]
    def stream():
        for chunk in client.chat.completions.create(
            model="gpt-4o",
            messages=messages,
            stream=True,
        ):
            delta = chunk.choices[0].delta.content or ""
            if delta:
                # Caveat: raw newlines inside a token would break SSE framing;
                # JSON-encode the payload if model output may contain them.
                yield f"data: {delta}\n\n"
        yield "data: [DONE]\n\n"
    return Response(stream(), mimetype="text/event-stream")

That is the whole proxy in under 50 lines. The key moves: verify the Firebase ID token, rate-limit per user, proxy the request to OpenAI, stream the response back as Server-Sent Events (SSE).
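The rate-limit logic inside `require_user` is worth pulling out into its own class so it can be unit-tested without Flask. A minimal sketch of the same sliding-window approach, with the timestamp passed in explicitly so tests control the clock (the class name is my own):

```python
from collections import defaultdict

class SlidingWindowLimiter:
    """Allow at most max_requests per user within a sliding window_seconds."""

    def __init__(self, max_requests: int = 20, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(list)  # user_id -> recent request timestamps

    def allow(self, user_id: str, now: float) -> bool:
        # Keep only timestamps still inside the window, then check the count.
        recent = [t for t in self._hits[user_id] if now - t < self.window_seconds]
        if len(recent) >= self.max_requests:
            self._hits[user_id] = recent
            return False
        recent.append(now)
        self._hits[user_id] = recent
        return True

limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
print([limiter.allow("alice", t) for t in (0, 1, 2, 3)])  # fourth call denied
print(limiter.allow("alice", 61))  # old hits have aged out, allowed again
```

Backing the hit store with Redis, as recommended in the rate-limiting section below, keeps limits intact across restarts and multiple server instances.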

Streaming Chat Responses in Dart

Now the Flutter side. The single most important feature for an AI chat app is streaming — users expect the typewriter effect, not a ten-second silent wait. Here is a clean Dart service that consumes the SSE stream from our proxy.

// lib/services/ai_service.dart
import 'dart:async';
import 'dart:convert';
import 'package:http/http.dart' as http;
import 'package:firebase_auth/firebase_auth.dart';

class AiService {
  static const _baseUrl = 'https://your-proxy.example.com';

  Stream<String> streamChat(List<Map<String, String>> messages) async* {
    final token = await FirebaseAuth.instance.currentUser?.getIdToken();
    if (token == null) throw Exception('Not signed in');

    final req = http.Request('POST', Uri.parse('$_baseUrl/chat'))
      ..headers['Authorization'] = 'Bearer $token'
      ..headers['Content-Type'] = 'application/json'
      ..body = jsonEncode({'messages': messages});

    final client = http.Client();
    try {
      final res = await client.send(req);
      if (res.statusCode != 200) {
        throw Exception('AI error: ${res.statusCode}');
      }

      final lines =
          res.stream.transform(utf8.decoder).transform(const LineSplitter());
      await for (final chunk in lines) {
        if (!chunk.startsWith('data: ')) continue;
        final data = chunk.substring(6);
        if (data == '[DONE]') break;
        yield data;
      }
    } finally {
      client.close(); // release the socket even if the stream ends early
    }
  }
}

In your UI, wrap this in a StreamBuilder or a BLoC that accumulates chunks. Every yield is a new token arriving — typically 50-200ms between chunks from GPT-4o.

// lib/features/chat/chat_bloc.dart
// ChatEvent, SendMessage, and ChatState are assumed to be defined in this feature.
import 'package:flutter_bloc/flutter_bloc.dart';

import '../../services/ai_service.dart';

class ChatBloc extends Bloc<ChatEvent, ChatState> {
  ChatBloc(this._ai) : super(ChatState.initial()) {
    on<SendMessage>((event, emit) async {
      final messages = [...state.messages, {'role': 'user', 'content': event.text}];
      emit(state.copyWith(messages: messages, isStreaming: true));

      var assistantText = '';
      await for (final token in _ai.streamChat(messages)) {
        assistantText += token;
        emit(state.copyWith(
          messages: [...messages, {'role': 'assistant', 'content': assistantText}],
        ));
      }
      emit(state.copyWith(isStreaming: false));
    });
  }
  final AiService _ai;
}

DALL-E Image Generation

Add a /image endpoint to the proxy, forward to OpenAI's image API, return the URL, display it in Flutter with CachedNetworkImage. The proxy piece looks like this:

@app.post("/image")
@require_user
def image():
    prompt = request.json["prompt"]
    # dall-e-3 returns a hosted image URL; gpt-image-1 returns base64 (b64_json)
    # instead, which would need decoding and re-hosting before returning a URL.
    result = client.images.generate(
        model="dall-e-3",
        prompt=prompt,
        size="1024x1024",
        n=1,
    )
    return jsonify({"url": result.data[0].url})

Flutter side: a simple HTTP call, then render with CachedNetworkImage. Cache the URL to Firestore so users can revisit generated images without regenerating.

GPT-4 Vision (Image Understanding)

Vision lets users upload a photo and ask questions about it — calorie tracking, plant identification, receipt parsing, skincare analysis. Send the image as a base64 data URL or a signed Firebase Storage URL.

@app.post("/vision")
@require_user
def vision():
    image_url = request.json["image_url"]
    question = request.json["question"]
    result = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return jsonify({"answer": result.choices[0].message.content})

In Flutter, pick the image with image_picker, upload to Firebase Storage, get a signed URL, send it with the question to the proxy. Under 50 lines of Dart for the whole flow.
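If you choose the inline base64 route instead of a Storage URL, the data URL is just the encoded bytes behind a MIME prefix. A quick Python sketch of the encoding step; the image bytes below are stand-in data:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL for the image_url content field."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Stand-in bytes; in the real flow these come from image_picker on the device.
url = to_data_url(b"\xff\xd8\xff\xe0fake-jpeg-bytes")
print(url[:30])
```

Keep in mind base64 inflates the payload by about a third, so for large photos the signed Storage URL route is cheaper on mobile bandwidth.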

Rate Limiting and Abuse Prevention

The proxy above has in-memory rate limiting, which resets when the server restarts. For production, use Redis or your database. At minimum, enforce:

  • Per-user request limits — e.g., 20 chat requests / minute, 5 images / minute
  • Per-user daily quota — e.g., free users get 20 AI calls/day, premium unlimited
  • Token accounting — track input + output tokens per user to prevent runaway costs
  • Subscription gate — check RevenueCat entitlement before expensive calls

A single malicious user running a loop can hit $500 in OpenAI cost overnight. These limits are not optional.
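The daily quota from the list above is even simpler than the sliding window: one counter per user per calendar day. A sketch over a plain dict; in production the same shape maps onto a Redis `INCR` with an end-of-day `EXPIRE`, or a Firestore counter (class and limit values are illustrative):

```python
from datetime import date

class DailyQuota:
    """Fixed-window counter: each user gets `limit` calls per calendar day."""

    def __init__(self, limit: int = 20):  # 20/day mirrors the example above
        self.limit = limit
        self._counts: dict[tuple, int] = {}  # (user_id, day) -> calls used

    def try_consume(self, user_id: str, day: str = "") -> bool:
        key = (user_id, day or date.today().isoformat())
        used = self._counts.get(key, 0)
        if used >= self.limit:
            return False
        self._counts[key] = used + 1
        return True

quota = DailyQuota(limit=2)
print([quota.try_consume("bob", "2026-01-01") for _ in range(3)])  # third denied
print(quota.try_consume("bob", "2026-01-02"))  # new day, counter resets
```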

Error Handling Patterns

Real AI apps fail in specific ways. Handle them explicitly in your BLoC or error boundary:

| Error | HTTP Code | User-Facing Message |
| --- | --- | --- |
| Rate limit hit | 429 | "Slow down — try again in a minute" |
| OpenAI outage | 502/503 | "AI is temporarily unavailable. Please retry." |
| Content policy violation | 400 | "That request was blocked by our content policy." |
| Timeout (>60s) | 504 | "The response took too long. Try a shorter prompt." |
| Quota exceeded (user) | 402 | "You have hit your daily free limit. Upgrade for unlimited." |
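Inside the app, this table collapses to a status-to-copy map with a generic fallback. A sketch in Python mirroring the proxy side (the fallback string is my own):

```python
# User-facing copy keyed by HTTP status, mirroring the table above.
ERROR_MESSAGES = {
    429: "Slow down — try again in a minute",
    502: "AI is temporarily unavailable. Please retry.",
    503: "AI is temporarily unavailable. Please retry.",
    400: "That request was blocked by our content policy.",
    504: "The response took too long. Try a shorter prompt.",
    402: "You have hit your daily free limit. Upgrade for unlimited.",
}

def user_message(status_code: int) -> str:
    """Map an HTTP status to copy that is safe to show in the chat UI."""
    return ERROR_MESSAGES.get(status_code, "Something went wrong. Please try again.")

print(user_message(429))
print(user_message(418))  # unmapped codes fall back to the generic message
```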

Where to Deploy the Proxy

I have deployed OpenAI proxies to all of these. Here is the honest pros/cons breakdown for 2026.

| Platform | Cost | Pros | Cons |
| --- | --- | --- | --- |
| Render | $7/mo starter | Dead-simple Git deploy, good logs, built-in TLS | Cold starts on free tier |
| Railway | ~$5-$20/mo | Great DX, instant deploys, good secret handling | Pricing scales quickly |
| Fly.io | Free tier + usage | Edge deploys, Docker-native, fast cold starts | Steeper config learning curve |
| Vercel | Free hobby tier | Fastest deploy loop, serverless functions | 60s function timeout limits long streams |
| Firebase Functions | Pay-per-call | Already in your stack, built-in auth | Cold starts, streaming SSE is awkward |

My default recommendation for a solo Flutter indie: Render or Fly.io for a persistent streaming proxy. If you are already all-in on Firebase, Cloud Functions for Firebase (v2) works — it supports streaming with HTTP callable functions in 2026, and The Flutter Kit ships exactly this setup.

Cost Math: What Will OpenAI Cost You?

Rough 2026 pricing with GPT-4o: ~$2.50 per million input tokens, ~$10 per million output tokens. A typical chat turn is 300 input + 300 output tokens = $0.00375 per turn. 10 turns of chat costs about 4 cents. DALL-E images are ~$0.04 each. Vision adds the image token cost on top of text.

Practical budget: at 20 chat turns/day/user, a free user costs you about $0.075/day, or roughly $2.25/month. If you charge $9.99 premium with a 5% conversion rate, you are net positive. Always run the unit economics before scaling.
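The arithmetic above folds into a small helper you can reuse in your token-accounting code. It assumes the same 2026 GPT-4o rates quoted in this section:

```python
# Assumed 2026 GPT-4o rates quoted above, expressed per token.
INPUT_RATE = 2.50 / 1_000_000    # $ per input token
OUTPUT_RATE = 10.00 / 1_000_000  # $ per output token

def turn_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of a single chat turn."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

print(f"per turn: ${turn_cost(300, 300):.5f}")
print(f"10 turns: ${10 * turn_cost(300, 300):.4f}")  # about 4 cents
```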

What The Flutter Kit Gives You Out of the Box

The Flutter Kit is the shortcut. Everything described above — proxy with Firebase Cloud Functions, streaming chat UI with BLoC, DALL-E image generation, vision support, rate limiting, subscription gating via RevenueCat — ships configured and ready. Add your OpenAI key to the Cloud Functions secrets, set your bundle ID, run flutterfire configure, and you have a working AI Flutter app in under an hour.

It is the Flutter ChatGPT starter kit I wish existed when I was building my first AI app. $69 one-time for unlimited commercial projects. Grab it on the checkout page or explore the features to see every integration.

Final Checklist

  • API key lives on the server — never in the Flutter bundle
  • Auth on every AI endpoint (Firebase ID tokens)
  • Per-user rate limits + daily quotas
  • Streaming responses via SSE, not wait-and-dump
  • Graceful error handling for 429, 502, content policy, timeouts
  • Subscription gate on expensive endpoints (images, vision)
  • Token accounting for unit-economic visibility
  • Cached image URLs in Firestore to avoid regeneration

Ship with all eight boxes checked and you have a production-grade AI Flutter app. Ship without them and you are on the hook for unbounded costs and security incidents. Build it right, or buy a kit that already did.


Ready to ship your Flutter app faster?

The Flutter Kit gives you a production-ready Flutter codebase with onboarding, paywalls, auth, AI integrations, and more. Stop building boilerplate. Start building your product.

Get The Flutter Kit