Generative AI picture display

Planning

The plan is to develop scalable software that lets a small computer (like a Raspberry Pi) run a client that displays AI-generated pictures.
The client should be able to receive voice commands, which are processed to generate a new picture.
To make it as flexible as possible, the client should run in a standard browser.
The server side should be able to run on a separate computer, so that a local AI model can generate the images.
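
To make the client/server split more concrete, here is a minimal sketch of what the server side could look like, using only the Python standard library. Everything named here is an assumption for illustration: the `/generate` path, the JSON body with a `prompt` field, and `fake_model`, which stands in for whatever local AI model ends up generating the images.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Tuple


def fake_model(prompt: str) -> bytes:
    """Hypothetical stand-in for the local AI model: prompt in, image bytes out."""
    return f"<image for: {prompt}>".encode("utf-8")


def handle_generate(body: bytes, model: Callable[[str], bytes]) -> Tuple[int, str, bytes]:
    """Turn a JSON request body into (status, content type, response bytes)."""
    try:
        prompt = json.loads(body)["prompt"]
    except (ValueError, KeyError, TypeError):
        return 400, "text/plain", b"expected a JSON object with a 'prompt' field"
    if not isinstance(prompt, str) or not prompt.strip():
        return 400, "text/plain", b"'prompt' must be a non-empty string"
    # A real model would return encoded image data (for example PNG bytes).
    return 200, "application/octet-stream", model(prompt)


class FrameHandler(BaseHTTPRequestHandler):
    """Exposes handle_generate over HTTP on the assumed /generate path."""

    def do_POST(self):
        if self.path != "/generate":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        status, ctype, payload = handle_generate(self.rfile.read(length), fake_model)
        self.send_response(status)
        self.send_header("Content-Type", ctype)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(host: str = "0.0.0.0", port: int = 8000) -> None:
    """Run the server until interrupted (host and port are placeholders)."""
    HTTPServer((host, port), FrameHandler).serve_forever()
```

Keeping the request handling in a plain function (`handle_generate`) means the model can be swapped without touching the HTTP layer, and the logic can be tried out without starting a server.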

A first draft of the possible architecture:

We concluded that we want to avoid physical buttons and instead use voice commands, processed locally, to control the generation of images. For that, software like Whisper could transcribe the audio to text.
To control when voice control is active, a wake word could be used instead of a physical button (check this discussion on GitHub). Voice Activity Detection (VAD) is also important for knowing when the command ends.
More complex configuration, like setting the desired server provider and changing modes, can be left to an app (or web app).
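
The interplay between the wake word and VAD can be sketched as a small state machine. This is only an illustration under assumptions: the wake-word detector and the VAD are represented by two booleans per audio frame, and `VoiceCommandGate` and `max_silent_frames` are hypothetical names, not part of any library. The buffered frames it returns are what would then be handed to a speech-to-text step such as Whisper.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class State(Enum):
    IDLE = auto()       # waiting for the wake word
    LISTENING = auto()  # wake word heard, buffering the command


@dataclass
class VoiceCommandGate:
    """Buffers audio frames between a wake word and the end of speech."""

    max_silent_frames: int = 3  # assumed threshold: silence that ends a command
    state: State = State.IDLE
    _frames: List[object] = field(default_factory=list, init=False)
    _silent: int = field(default=0, init=False)

    def feed(self, frame: object, wake_word: bool, is_speech: bool) -> Optional[list]:
        """Process one frame; return the buffered command once it ends, else None."""
        if self.state is State.IDLE:
            # Everything is ignored until the wake word is detected.
            if wake_word:
                self.state = State.LISTENING
                self._frames, self._silent = [], 0
            return None
        if is_speech:
            # Speech resets the silence counter, so short pauses are tolerated.
            self._frames.append(frame)
            self._silent = 0
            return None
        self._silent += 1
        if self._silent >= self.max_silent_frames:
            # Enough consecutive silence: the command is over.
            command, self._frames, self._silent = self._frames, [], 0
            self.state = State.IDLE
            return command  # may be empty if nothing followed the wake word
        return None
```

The gate only keeps frames that the VAD marks as speech, so the silent frames inside a pause are dropped rather than sent for transcription.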

The client has two modes: one that requests AI images from voice commands, and a carousel mode that requests a random image.
The carousel mode has a timer that can be configured beforehand, and it requests a new image at that interval.
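
As a rough sketch of how the two modes could drive the client's requests: the function below decides, at a given moment, whether a request should be sent and of which kind. The names (`Mode`, `ClientConfig`, `next_request`), the request dictionaries, and the 300-second default interval are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    VOICE = "voice"        # images are requested from voice commands
    CAROUSEL = "carousel"  # a random image is requested on a timer


@dataclass
class ClientConfig:
    mode: Mode = Mode.CAROUSEL
    carousel_interval_s: float = 300.0  # assumed default, set beforehand


def next_request(
    config: ClientConfig,
    now: float,
    last_request_at: Optional[float],
    prompt: Optional[str] = None,
) -> Optional[dict]:
    """Return the request the client should send now, or None if nothing is due."""
    if config.mode is Mode.VOICE:
        # In voice mode a request is only sent when a command was transcribed.
        return {"type": "generate", "prompt": prompt} if prompt else None
    # In carousel mode a request is due on the first call and then once per interval.
    if last_request_at is None or now - last_request_at >= config.carousel_interval_s:
        return {"type": "random"}
    return None
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the timer logic easy to check by hand.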

The project is divided into 3 codebases: frame_server, frame_client, and frame_app.
