Blog

How-to: Create AI product ads with consistent characters using Google’s new “nano banana” and Veo3 models
Google’s flagship video generating model Veo3 is around for so time and if you are a frequent tiktok or other video social media platform user you for sure stumbled upon the Yeti, Bigfoot or talking baby videos that have been created with Veo3. They feature quite realistic movements and also lipsynch speech and sound effects. In term of AI video generation it is the benchmark at the moment.
However it got problems with keeping character consistent over more then 1 prompt/video generation. So when you look very closely the Yeti or Bigfoot has variations over videos from the same account.
A model named “nano banana” was hyping in the AI see as it made appearance on LmArena and was showing excellent results in terms of realism, consistency and quality. It war rumored that Google is behind this new flagship model and last week we got the confirmation: https://developers.googleblog.com/en/introducing-gemini-2-5-flash-image/ . What sets the model apart from many competitors is the ability to keep character consistency and make super fast context aware edits (for the record: similar workflows would be also possible with flux and flux-kontext models). see example of character consistence images of me with the Past Forward tool:
What AI tools you need:
- AI Studio from Google for the flash 2.5 image generation
- Google Flow for the Veo3 video generation (also possible in the Gemini app)
- (alternatively the google model you need are also available in fal.ai)
Character
random character i created using flux 1.1 ultra. (You can use whatever image model you feel best comfortable with, midjourny has obviously the best result still)
Product
For testing i chose my previously with AI generated Jack Daniels gummy bears:
Step 1: Combining in a product scene using flash image 2.5:
Generate an realistic image like in a advertising campaign of the person in the image provided sitting in a forest in front of a campfire. in the back we can see his tent – he is obviously on a camping trip. the man is eating gummy bears from the bag shown in the other image. the brand and the visual of the gummy bear bag should be clearly recognizable like in a product ad.
This is the outcome (first try):
You can easily adapt the scene more to your needs with simple prompts like remove the whiskey bottle, change the sweater color to green etc.
Step 2: Creating the product video with Veo3
We take now the image as input reference for the Veo3 video. With the prompt we bring the image to life and add voice over to the video – like in a real advertising. For more advance use case you can also use Json prompting and my tool I especially created for this: https://veo3json.moweco.com/
A man eating the jack Daniels gummy bears from his bag sitting in the forest in front of a campfire saying: “enjoy real freedom with the new whiskey flavored gummy bears”
camera: professional like in an advertising campaign, slowly moving towards the man sitting
light: natural light, evening mood
sequence 1:
man eating from the bag and then saying “enjoy real freedom with the new whiskey flavored gummy bears” and smiling. 0-6s
sequence 2: big product shot of the gummy bear bag on white background. on the right top side we see then a big yellow background insert “Available now”
Unfortunately Google flow wouldn’t let my upload and use realistic images of people (cause of country restriction in Europe. So you might want to use a VPN or like me, use the fal.ai Veo3 endpoint for the creation (used to work as well with the Gemini app).
This is my result (1st attempt – I could have created more versions to optimize and also to get rid of the typo in the end frame or to get the product image exactly in the end frame – but just to demonstrate what a first version looks like):
The result is far from being a real advertisement someone would use, but I just wanted to briefly show the process in general. Especially the Veo3 output needs more refinements to get a real descent result. Also here the character consistency unfortunately breaks. But with some more tweaks I am sure you can get advertising quality like result with those tools.
31. August 2025
Why I started to write here again after almost 10 years of a break…
When I was starting to write texts at this place it was called blogging and the term influencer was not even coined (although I was an early tech focused influencer/blogger with mobilepulse.de). In May 2006 the first postings went live and I continued in a good regular pace till around 2014 to inform about project progress and other minor important things. Somehow then in 2015 I completely stopped to post. Why? I don’t really know, maybe I lost interest in sharing news, got bored of the way of sharing or of the writing process. Honestly I cannot recall 100% why I stopped 10 years ago but I it was a wave of blogs that went silent in that time. Social Media especially social networks took over completely online consumption and writing a blog somehow got out of fashion, I guess.
First visual appearance of this blog/website
But there are 2 aspects that made me again write: In 2022/2023 I traveled for a couple of month South East Asia and South America and used newly created pages here to document my travels. It was some sort of travel diary and was great fun to write. From this on I started to write again on a regular basis. The topic I mostly covered was definitely AI and my journey to work with AI. This is very exiting to write as it also documents the progress of AI in the past 2 years. Why I do write this things down? It is a good way for myself to document, reflect and experiment with tools, tasks and approaches. And there is hopefully also value in it for the ones that read the articles.
What’s next? I don’t know, maybe I will again loose motivation for writing here or this will go on for some time. We will see…
22. July 2025
My Vibe coding Test #2: Now it works?
After my first post on this topic (vibe coding as it is called now) some time went by and the tools got a little better. So time for give it another challenge that is beyond create a landingpage for my AI consultancy.
The project I want to create is a multi agent research system that can create deep research and analysis like many AI chats and agents today are capable. I wanted to do the entire project in javascript and research the libary langchain.js before as the most capable for this tasks to fulfill.
Before starting I create a product plan of my actual solution and from that a prompt to feed into the different AI powered coding tools. This time I use again some of the popular web based coding assistants like bolt.new, v0 and lovable and as well local tools like Cline and Cursor.
The prompt
```
# Multi-Agent Research Assistant Development Task

You are an expert AI engineer tasked with building a multi-agent research assistant using LangChain.js, TypeScript, and Next.js. Your guidance is needed to create a complete, production-ready system that coordinates multiple AI agents to answer complex research questions.

## System Architecture

Create a research assistant with these specialized agents:
1. ResearchAgent - Searches the web for relevant information on the query
2. SummarizerAgent - Condenses and structures retrieved information
3. FactCheckerAgent - Verifies information accuracy through secondary searches
4. SynthesizerAgent - Compiles verified information into a comprehensive answer with citations

## Technical Requirements

- Use TypeScript throughout the project
- Implement LangChain.js Runnables API for each agent's logic
- Connect to OpenAI models via LangChain wrappers
- Utilize SerpAPI for web searching capabilities
- Create a shared memory system for inter-agent communication
- Implement a clean, modular architecture with appropriately separated concerns
- Build both backend logic and frontend interface using Next.js
- Use shadcn/UI components with a dark-themed AI aesthetic
- Document code thoroughly with JSDoc comments explaining agent functions

## Development Sequence

1. First create the SharedMemory class for maintaining state between agents
2. Implement each agent in order: Research → Summarizer → FactChecker → Synthesizer
3. Create a coordinator function that orchestrates the agent workflow
4. Build the Next.js API endpoint to expose the research pipeline
5. Develop a responsive frontend that displays the research process and results

## Implementation Details

For the SharedMemory system:
- Create a TypeScript class with appropriate interfaces
- Include methods for reading/writing different data types
- Implement persistent storage if needed

For each agent:
- Define a clear input/output contract
- Handle error cases gracefully
- Add detailed logging for debugging
- Include unit tests for critical functions

For the frontend:
- Create a search input with submission handling
- Display current research progress status
- Show final results with expandable source details
- Implement a clean, responsive design

## API Structure

Implement a `/api/research` endpoint that:
- Accepts POST requests with a query string
- Returns progressive updates if possible
- Provides a final result object with the answer and supporting data

Please start by implementing the SharedMemory system, then each agent in sequence. Show each implementation completely before moving to the next component.
```
bolt.new
bolt.new is recently really pissing me off being super buggy, losing code checkpoints and other fancy tricks, like splitting up the tasks into multiple steps so more tokens get spend. Anyhow this is the solution it came up with:
Looks not super sleek but OK. The AI implemented all the components like shared memory, the agents and the frontend but however did not connect frontend with backend properly and only sent back mockup texts. After telling the AI doing so it updated the code and ended up in an endless loop of errors and fixing them. So the code is not usable. This time and it was also in the prompt the AI didn’t try to expose my API keys to the public website and was using a .env file.
v0
I used to work a lot with v0 lately and really created some good working applications with is so I was quite confident that it will also take this challenge. But very soon I will get disappointed. The first version I completely dumped cause the AI was not able to separate properly frontend from backend code and produced a complete mess. The second attempt did a little better and at least outputted a working frontend:
But also here the same problems that the AI can’t keep an overview of the own code and mixes path, backend code, frontend code that I never got a working solution form this approach as well. I gave it a 3rd try but the solution was again not working at and had again massive problems connecting frontend and backend code bases. This time also the implemented .env could not be readed properly.
So overall I am very disappointed of what v0 was able to (not) deliver this time.
lovable
lovable recently release its version 2 and is very hyped at the moment for being the best among the web based coding tools. I was not using before that often and most of the time for template/frontend tasks. The AI had really hard time bringing a version 1 to the screen and had around 5+ loops where fixing its own mistakes. So I really spent some time until I could see something:
The design looks very nice and follows my prompt inputs. Also there the AI built all the code with agents etc. but did not connect frontend and backend initially. After telling it to do so it tried to do. Here comes a bummer from lovable: editing files in the editor is only possible if you sync the project to a github repository. I cannot understand why this is mandatory, but after doing so it still didn’t let me edit my .env file. So I did this via a commit from github. This is super annoying but works at the end. After some back and force with the serpapi I got to a working solution finally. Here is the working result:
VScode + Cline
New around the corner and also really popular right now is the vscode extension cline. It is an open source solution and can work with any of the popular AI models through their API. So it is more tricky to set up at first then most of the web based tools. Most people now use is with Google’s gemini pro 2.5 version which offers a proper free API call tier (claim it here). The downturn is that the daily limit of 25 calls and the input context window/day of 1 Mio tokens is done quite soon cause cline send API calls non stop. This is really a very annoying fact as it is absolutely not necessary and no other comparable solution does it that way. So it took my several days only using the free tier API contingent to get so something visible. Cline also had problems with packages, using it the right way and had long work back on forth on getting a simple frontend running.
As you can see it return 0 results as it somehow uses the SERPAPI for web research in a wrong way. After some manual fixing in the css, after Cline was not able to bring the css styling to life, I finally managed to let the interface look like this (still not working properly):
Cursor
I tend to use Cursor a lot for day to day smaller tasks but has experience already my problems when things get more complicated. In direct comparison to Cline the AI works a lot more smother and faster just hammering out the needed files in light-speed. It only took around 3-4 minutes until I got a first version. I had some problems then with packages and namings but got from this tool the fastet working solution. Plus it only cost me 2 (!!!!) premium Cursor request to do this (with cline I needed around 8,5Mio outbound tokens!!!). The output is not the most beautiful and some parts of the info are missing (sources), but it is working and was super fast to create:
The results
- v0: failed
- bolt.new: failed
- lovable: success
- Cline: semi failed
- Cursor: success
To summarize this little experiment: I am super disappointed what v0 and bolt.new produced, which was after investing some time and tokens far far away from a working solution. Lovable really leveled up their game and after some quirks it produces the most fancy looking working solution among all tools. Cline is OKisch to use and I would assume if I would have spent more time to fixing the last issues, I would have also get a working solution from that tool. Clear winner for me is Cursor this time as it was super fast, cheap and was able to produce a working solution with the least effort on my side.
7. May 2025
The very worst of AI: scammer shops put together with AI
Overly I am very positive to see what impact AI has on work efficiency and other aspects of our life’s. But as with every technological revolution there are super fast miss-use, fraud and other soft crimes around the corner that are only possible because of this new awesome new technology. This use-case is of course a minority but I find it also important to draw our attention towards it, as in the news and discussions there is mostly only hype and over-exaggeration on what AI will change for good presented.
So I saw an ad on social media, that was clearly an Ai generated image from an elderly man that claims his leather shops is about to close and he is selling now his last hand made pieces. This is a very nice and heartbreaking story – but it is completely made up.
The shop where the grandpa like old man is selling his products is a simply standard theme shopify store that was quickly built together. The images are clearly all created from AI and Hermann is not existing in reality.
Furthermore the product are prized really really high around a couple of 100 euros each. But they somehow don’t look like hand made leather product from Germany but rather like factory stuff from Asia.
Researching a bit I found lots of complaints about the shop and also a official warning from an internet watch no-profit from Austria. But in reality some people seem to really fell for this fake shop and fake narrative which turn out to be just one more of those drop shipping shops selling rubbish from China.
Of course there is no impress, also the domain was acquired using an anonymisier tool, so there are no traces to the real operator of this little scam venture, who will probably get away with it and make some extra money.
15. April 2025
I built a video game – with the help of AI
Since I got my hands on the first programmable calculator I was hooked into creating games. Later in my teenage years when I learned programming I created some basic games with the Windows API and Visual Basic. Since my career took a different path into marketing an media I always had a side eye on the gaming industry and it was for some time my dream to work in the industry. Never happened – now I am not disappointed but I am still very interested in gaming and also in creating my own games.
My programming skill in advanced languages like C were never there to be able to create a “real” game so this never happened. But now with the rise of all sort of AI tools: in code generation, in art work, in music & sound I saw a good foundation to get something into the world with this new tool sets. So here is my experience report of 3 months working as a side project on a little 2d retro shooter game with helping hands from AI.
The basic idea
I am big fan of the original Alien movies and was always kinda disappointed what not so great games were released under the franchise over the years. So I wanted to create something in this scifi sphere. Some years ago I create a simple game using the web/javascript game engine phaser, so this time again I was planning to create a web based game with this libary.
The game should be not to complicated, having a retro touch from the early 90s games and a easy to play top down shooter. I also let elements of the Doom and Quake games inspire me for certain aspects (e.g. the player sprite is based on a Doom guy sprite and the music is very Quake 2 inspired). And the name of it should be: Alien Marines.
Code
For coding the game I heavily relied on the AI driven IDE Cursor. This is at the moment the best product in this category in the market. I also tried competitor tools like Windsurf, Zendcoder, PearAI and others but from my experience Cursor is still the #1. There are some controversies regarding the pricing and regulation of API calls but putting this things aside, it just works best. The start was simply this prompt:
```
create a project overview for a 2d game using the phaser html5 game engine. the game should have the look & feel from vampire survivors but should take place in the alien franchise. the player is a marine soldier, the enemies are xenomorphes and the environment is a fictional space ships with rooms and passages.

create the project outline first and give an overview of the next steps and the necessary files, folder to create
```
The agent mode of Cursor is here really helpful and it can create a small to mid sized code project at once for you. Sounds nice at first, but I really had my problems with the build tools and libary usage so that I decided to start from scratch again with a plain HTML file and the javascript game code. I learned coding in the early web ages so I feel way more comfortable with simple files than a huge tool chain.
Coding larger projects with AI is often running smooth but when the code base grows the problems start to arise. I wrote already some notes on this topic: AI software development – recap from a non dev person. Working with on a bigger project makes that all worse. From an LLm perspective I mostly used Claude Sonet 3.5 and since the release version 3.7. The model from Anthropic are commonly graded as the best for coding and I can just confirm this. For cost reasons I also used DeepSeek R1 but had here serious problems with hallucinations of the model. The code base was around 5k lines of code at that time and the model constantly used variables or functions that weren’t there. This is really frustrating after some time and requires a lot of hands on bug fixing.
Another issue I ran into all the time is rooted in the phaser libary which recently updated their interfaces to use emitter and particles effects. The LLMs state of knowledge was some version before and I always got not functioning code here. Again this was to fix manually. Besides having really speed improvements with AI generated code I also needed to spend hours of fixing bugs or refactor parts manually. Also reviewing the code quality, it is not the best I have to say. There is not always the same principles used in naming or processes that probably would make maintenance in a team setting more difficult.
In summary the speed and efficiency improvement using and AI enabled IDE versus coding fully yourself with help of stackoverflow or other online help resources is incredible. Unfortunately I haven’t tracked the time I spent into building the game but just the fact that I am not game developer and has not deeply worked with the phaser libary before is a proof that AI works here. As a complete developer newbie I am certain something comparable is not possible as it was from time to time required to fix bugs and refactor parts.
Player and enemy sprites
Working on sprites was a bit tricky as I have never done that before. I used Photoshop and later libresprite and for the start I re-used some sprite parts from older alien games and from Doom. For example this is my player sprite sheet:
or one based on a SNES version of a facehugger:
Of course I wanted to also add my own twist in terms of xenomorph creates and also don’t want to be sued by a big movie studio 🙂 So I started to work with different AI image generator to create my own xeno creatures. I tried out flux and ideogram but that was not working at all. So I switched to Dall-E 3 and the results turned out really good. But there were some steps need to get from a creature to a sprite sheet of it:
- Create a xeno creature sideview in front of a white background so that you can easily work with in the further steps
- Chose on image and put it into a image2video model to create a movement sequence. I used minimax and kling-video for this.
- Chose one video and create keyframes of it
- Chose some keyframe and put together a sort of animation and add manually effects (acid in my case)
The process sounds simple but I sometime created 30 to 40 base images from the creature just to have on to work with further. In the animation this was even worth because the models consequently ignored my prompts to make the creature just walk from one end of the screen to the other one. Like this buddy here:
After quite some time and a lot of resources spent on generating images and video the final result is a sprite sheet like this one here ready to be used in the game:
Some more examples of the creatures I created but not used:
Background Images
In my game I am not using a tile based level design but just one big background image. Doing so it was quite simple to get fitting background images out of AI image generators. This time choosing flux as model was the right way and the results were pretty good. The prompting this time war not too complicated but anyhow I needed around 5-10 runs to get a usable image from the model back. Sometime the model just messes up dimension or makes the outline walls too big.
Here is an example prompt:
```
digital art. create a game floor graphic for a top down 2d game in a science fiction xenomorph scenario. 

Dark futuristic alien planet with organic structures like trees and plants . dark colors, top view, place parts of a dead xenomorph skeleton in the middle
```
And some example background images:
Hud and user interface
An area were AI image generation was not that helpful was with creating elements for the user interface. Games usually have nice menus and also the current player stats are displayed normally in a nice way. The easiest but not the most appealing way is to rely on text, which I did in version 1. Later on I wanted to create a graphical HUD for the main game. Creating these things with AI is really hard because there is no sign of any consistency and even with tools like LoRa you can’t do that for user interface elements. So you get a bunch of examples and you need to go back to good old Photoshop and put things like a puzzle together. Some UI examples from the AI:
The first image although made it into the game and is the base for the main player HUD interface at the top of the game.
Mood Images
An area where the AI model really can show their strength is with mood images. Here we are not super strict with positioning, overall layout or specific elements of the image and we can tolerate the creative effusion of the models. Again here I used flux and the only problem I encounter here was that when prompting for scenes with the marine and xenomorph enemies sometimes also the marine gets xeno elements like big teeth or a tail. Overall I was super happy with the AI output here. Some examples that didn’t made it into the game (and you can probably easy spot why):
and some that did:
Music
For the soundtrack I was heavily inspired by the heavy metal soundtrack of Quake2. Also here I used AI tools to create the music for the game, mainly https://www.udio.com/ and https://suno.com. From an output quality perspective I must say that only songs created by suno made it into the game. The prompt input was very limited (only 200 chars) and so it was not super easy to define the style and also the context of the songs. Sometimes the produced lyrics sound a little odd but I think that’s OK for such kinda game. Here is an example song in a classic heavy metal style:
and a more modern one with hard dubstep like beats:
I used for the soundtrack around 15 tracks in a style mixture of classic 80s heavy metal, 2000s cross over, heavy dubstep styles all with the topic of fighting xenomorphs in space.
For the sound/sfx I didn’t use AI tools at all because I simply couldn’t find useful ones, so I relied here heavily on free resources from pixbay.
Progress
As I mentioned before, I cannot really determine how many hours I put into this project but I made a lot of progress in February this year where my business allocation wasn’t that high this time.
First version of the game November/Dezember 2024:
And a version that is playable here now:
I think the progress of the game is clearly visible. A version from early February is also available here.
My recap
My recap of this side project is that I was really hooked into the process most of the time (especially starting with February 2025) and the current output wouldn’t been possible for me to achieve without the use of AI. First of all I am no game developer, I am not graphic designer, musician and game designer but AI combined with a lot of trial & error and playing the game myself for hours made a nice product possible. I hope someone enjoys playing it in the end.
Some people might argue that I am not a real game designer/developer and the game is not “my” game at all – there is some truth behind this as of course I heavily used AI to achieve the output but at any point of time I was controlling and steering all the process and the produced product. I think it is an evident demonstration of what is today possible with AI tools and how the process of crafting something is speeded up tremendously and also opened up to more and more people.
I will publish the game also on itch.io for gaming insider feedback and also will continue to work on it as a side project.
Play Alien Marines
30. March 2025
The sad reality of magical AI powered dev environments
If you involved in AI you probable stumbled upon big promises that everyone can now create its own SaaS business with not even knowing how to write a single line of code. The tools that should make this promise come true are integrated online developments environments (that have been existing before the big generative AI wave) supercharged now with AI code generation powers. The big difference to github copilot, cursor or windsurf is that those tools run a complete node based dev environment on their servers that you can easily access through a web interface and place your code project there.
The currently most popular player in this field are:
- https://v0.dev
- https://lovable.dev/
- https://bolt.new/
- https://replit.com/
How do they work…
The layout and base functions of the tools are very similar: you have a chat window to communicated with the AI agent and another set of windows to monitor the code and the rendered output of it. So far I see it all of them use a node.js tech stack with different frameworks put on top of it. Here are some screen shots of the main interfaces from replit, v0 and bolt.new:
From the first impression v0 and replit seem to be more sophisticated in terms of the UI and also the functionality. But lets bring them all to a simple test…
A simple project to test the capabilities of the tools
For my consulting business I created an openAI assistant that helps my client to deal and analyze marketing trends. Now I wanted a simple web interface that queries the assistant via the openAI API and give the answer in a simple chat interface back. So this is really not a big deal I had thought but I spent hours and (spoiler alert) not one tool was able to complete the task.
The start was indeed quite promising as I got a first app scaffolding back from each tool when prompting my demand:
create a interface for the openai assistant api. the user can send messages to a defined assistant and it answers. the conversation between user and assistant should be visualized like in typical ai applications. the user can create variuous chat threads that are then stored and visualized in the sidebar. also add authentication for the user using superbase functions.
When we speak about UI and the looks, here v0 clearly wins as it creates the best looking interface. All of the tools rely on tailwind.css.
Problems, problems, problems
I connected a simple supabase database for simple user authentication and persistent features and then the problems started: v0 was not able to remember its own proposed database schema and made from that point ongoing mistakes with wrong columns names etc.
All of the tools exposed my openAI API to the frontend in a first version. I needed to explicit tell it to use a server proxy for the API requests and not store the key in frontend code. This is quite annoying and can be also dangerous for a tech newbie. Luckily openAI itself blocks those request by default.
What do I to here? It seems like v0 is not aware what its own capabilities are and asks the user to install some packages via npm.
In the end I spent around 2h and 10+ revisions with the task on each tool and try to get it working. But: none of it even got somewhere close to the end. I repeat: NONE. None of it was able to correctly query the openAI assistant API to even submit the user request. This is really a bummer and to prove the semi talented non dev people can do it better I coded the needed piece myself in a 200line of code PHP script that works just fine.
My recap
So to sum up, my first impression of those magical tools is disillusioned as they just don’t deliver. I don’t know if my test example was so complicated but I think it was definitely not. A solution in just a few lines of old school scripting language brought me further as investing my time in those hyped tools. Although I see already a lot of those micro SaaS popping up that surely have been mostly created by non dev people with those tools.
[Update 26.02.]
I decided to take a 2nd attempt in trying to get my project with one of the AI code generators to work. This time I took my working php script as a base and just asked the AI to rewrite it. Here is the prompt I gave to v0:
i wrote a simple php script to query the openai assistant api. take the code as base and create a new project in node/nextjs doing the same functionality. create a server and a frontend part. do not save the api keys in frontend code but use a server code to query the api. reuse the logic in the process.php for the server code and rewrite in in js/ts
implement a new color scheme based on your standard colors
As you can see I explicit asked to write frontend and backend code separated which was this time also well performed. My openAI were not exposed in public like the first try as well. After 3 revision I got a working solution. This is a big surprise to be honest as with my first approach I just stopped completely pissed at revision 20+.
17. February 2025
AI software development – recap from a non dev person
I am a non deep tech person. But I used to learn programming at university and I wrote code as part of my job in my early years. So I know how to code, how to set up a team based software dev project etc. With the progress of my career I shifted more and more in management roles and of course you don’t do anything code related there anymore. So it must have been somewhen around 2016 when I wrote my last line of code – until 2024.
First steps in 2023…
But playing around with all sort of AI technology let me again dive into the field of software development and I dig into writing code again. Of course with the help of AI, which actually means that AI wrote the code and I checked for functionality and output. My first steps with that approach was late 2023 with chatGPT and the output was not quite satisfying. Basically the code produced by the AI was very ugly and just worked after manual bug-fixing. I was not very impressed about AI in software dev back then.
But as the model improved it really also progressed the application in software generation. Comparing the quality of the model from end 2023 to mid 2024 is a huge step forward. I used now again chatGPT but as well Claude. The generated code was not the best in class, but working and additional conversations with the AI did actual work quite smooth to further evolve the code and create bigger projects. Claude Sonnet 3.5 especially is really the leading model here. I was able to create with it simple tools for my consulting business like a QR code generator, a funnel generator and other small single page concepts.
Going a step further: AI first IDE
For this small tiny projects AI generated code works pretty well. I also started to use cursor – an AI first integrated development environment based on VSCode. This make the entire process of querying the AI and put the results back into the code base super smooth. This is at the moment the approach that works the best. There are additional tools, where you can create prompts based on design inputs, which you can then put into Cursor or an AI interface to generate code.
The downside
Beside being really surprise from the progress of AI based software development, there are also major downsides I also experience myself. First of all, you don’t know the code, as it is not yours but from the AI generated. To change minor things, which would normally a 1 minute tasks take longer or you just ask again the AI. This can be really problematic when second you run into the typical problem of a bug, that the AI is not able to fix. This is a condition that you will sooner or later experience and the AI is not helpful any more. Means you have to dig into a code, that you hardly know, and fix a bug manually. Software dev probably know this situation when there is the need to fix a bug in someone else’s code and there is no or very little focus on code quality or style. So this can be very painful and time consuming.
So it is in my opinion super important that you actually have coding skills to overcome to above described problem situation and work properly with AI generated code.
12. January 2025

Some recent Image AI Prompt Highlights

This is a collection of my so far best result images I used to create using GenAI tools. Most of the examples you will see below have been created using Dall-E or Flux. I also included the prompt, so that you can copy the image if you would like to re-create them.

Create an image of a figure wearing a dark, gothic costume with significant, stylistic elements. The figure's face is painted to resemble a skull, with deep black eye sockets and a white base, while the lips appear smudged. The most striking features are the large, curled ram horns on top of the head, adorned with an elaborate chain headpiece that drapes over the forehead and sides. Chains, varying in size and style, hang around the neck creating layers. The costume includes

A hyper-realistic and detailed 3D render in 8K resolution. The scene depicts a massive moon dominating the left side of the frame, casting a dramatic glow. The ground is covered with thick smoke and raging fires, creating a sense of chaos and destruction. At the center-top of the frame, a futuristic satellite hovers in space, emitting a powerful laser beam aimed at the ground below. The atmosphere is dark and apocalyptic, with intricate textures and dynamic lighting. The composition is in a cinematic 9:16 aspect ratio

A fierce Viking warrior in a battle ready pose, holding his sword aloft with intense focus. His face is twisted in a fierce expression, beads of sweat and spit at the corners of his mouth. He wears rugged fur clothing, with wild, untamed hair and a thick beard. The surrounding environment is harsh, with dark clouds swirling above, crashing waves, and jagged rocks in the distance. Dust and mist fill the air, enhancing the atmosphere of tension and ferocity. His posture is powerful and tense, ready for the coming battle

A menacing figure of Krampus, depicted with long, twisted horns and a grotesque, demonic face. His body is covered in dark, shaggy fur, and he has one human foot and one cloven hoof. Krampus carries chains and birch rods, symbolizing his role in punishing naughty children. The background is a creepy, wintery landscape, with snow-covered trees and a dark, ominous sky. The overall atmosphere is eerie and foreboding, capturing the essence of this mythical creature.

A massive great white shark with its jaws wide open, and hanging from its terrifying teeth is a wooden sign. The sign reads, 'If you can read this, it's too late' in bold, weathered letters, giving a chilling sense of impending doom. The murky water surrounds the shark, with dark, shadowy figures lurking just below the surface, amplifying the tension. The shark’s massive mouth is poised to close, and the sign adds a sense of finality and danger

A comet impact on Earth, viewed from the ground. T-Rex in the immediate foreground and a pterosaur crashing down. More dramatic cinema action. Dramatic sky in deep orange and red. Impact creates an exploding fireball. Massive shockwave. Trees uprooted. Debris flying. Huge plume of smoke and dust. Ground shaking violently. Sparks and glowing debris flying in all directions. Intense and majestic.

A fort gt,  blue with aluminum wheels, Carbon fiber rear wings. Dynamic curves and aggressive stance feature intricate aerodynamic details. The car is positioned in perspective, highlighting sculpted lines and jewel-like headlights. Floating around it, precise sketches show the design process, with annotations and measurements. Against pristine white background, subtle color accent the logo, grille

Create an image of a bas-relief featuring two figures, one demonic and one angelic, symbolizing the contrast between good and evil. The demonic figure is on the left and has striking red and black tones. The skin is dark red and adorned with intricate black markings that suggest sinuous scales or armor. This figure's wings rise dramatically, feathers detailed and dark, matching the horns that curl from his forehead.

RAW hyper-realistic photo, post-apocalyptic urban street at sunset. A large sunset and a slight red cloudy sun below it, above the view. The road was flooded. Abandoned buildings lined the streets, and power lines stretched overhead, adding to the atmosphere of silence. Debris floats on the surface of the water, hinting at the turmoil of the past. Despite the destruction, there is a haunting beauty of how nature reclaims this urban environment under a dramatic sunset. UHD

"A dark, medieval castle set atop a rugged cliff, surrounded by a gloomy and misty atmosphere. The castle has multiple tall spires and intricate gothic architecture. A winding, narrow stone pathway leads up to the entrance, with moss-covered stones and overgrown roots on the sides. A tall waterfall cascades dramatically down the cliff beside the castle, merging with the fog and mist in the background.

Create; "Detailed black and white pencil drawing illustration of a fierce mid-race Spartan warrior, wearing a traditional Greek helmet with a plume, carrying a large round shield and a sword. His muscular build and flowing cape are visible, and his armor includes protective arm and leg gear The background is minimal, with dust and energy trails emphasizing his powerful movement. The style is highly detailed, strongly contrasting the intensity and power of an ancient warrior.

Create a moody, high-contrast black-and-white illustration of a man sitting alone in a dimly lit room. The setting should feature venetian blinds casting sharp, striped shadows across his face and body, highlighting his silhouette against the darkness. Position the man seated in a leather armchair, with a contemplative pose, partially obscured by shadow. The scene should evoke a film noir atmosphere with strong, dramatic lighting that accentuates the blinds and adds depth to the scene

A realistic, detailed, and accurate HDR 3D render of an ancient ocean ecosystem. The main subject of the image is the seafloor, where Dickinsonia is visible. Anomalocaris is seen hunting nearby, while other small primitive marine life forms are visible in the background. The image is slightly hazy, with dim sunlight filtering through, creating an eerie, ancient atmosphere.

4. December 2024

Image 2 Video AI Generator Comparison
A very next step when playing around with generative AI tools, is to make a video from a given image. In my case I wanted to animate the above image of mysself (AI generated with Flux/LoRA) in a natural way to make me speak out.
My first try out where with RunwayML – unfortunately only Model #2, not the newer version 3. The results are not the great: the movement is not natural mostly weird and also the transition in the face (morphing) looks rather spooky. So this first try was failure.
Try #2 with minimax video
A very new image2video model is minimax. You can easy access it via fal.ai. The prompt was very simply, just instruct to make me speak out and show some natural gestures. The output is way better then the one from RunwayML. It looks smoother and more natural. I wouldn’t say it is truely realistic but it must be around 90% acurate.
19. November 2024
Flux/LoRA Prompts for business photos
In the previous post I explained how to train Flux/LoRA to create images of yourself. This is a quite straightforward process and after that we can create via prompts the images we want to. In my case I did my first try outs with some business portraits for myself. The results a good, sometimes a bit to blurry and smoothened out. But there is an issue that the Flux AI tends to add you at least a 2nd time into the picture if you place yourself into a typical scenery with more then just 1 person. I also found a simple solution to overcome this. Here are my example prompts and the outcomes:
Professional business portrait of erich wearing a dark grey suite, sitting confidently at a modern office desk. Background shows a contemporary office with glass windows and city views. Well-lit with soft, natural light, highlighting a friendly, approachable smile, wearing a formal suit.
Professional business portrait of erich, standing in a modern office meeting room with a table and screen in the background. Wearing business attire with a confident, relaxed posture, arms crossed and smiling warmly. Soft, professional lighting enhances a welcoming expression.
Professional portrait of erich, seated behind a modern executive desk, surrounded by minimalistic office decor like a laptop, notebook, and pen holder. Well-lit room with large windows and subtle artwork in the background. Dressed in a formal suit or business casual, with a focused, thoughtful expression.
Professional yet relaxed business portrait of erich, standing in a collaborative office space with colleagues visible in the background, blurred slightly. Wearing business casual attire, arms relaxed, with a warm, approachable expression. Modern office setting with plants and glass walls, lit with natural light.
Here we have the case, that I have been put a 2nd time into the picture.
Professional business portrait of erich, caught in a natural conversation with a colleague in a modern office lounge area. Wearing business attire, seated on a stylish office sofa with hands gesturing slightly as if explaining something. Background features office decor with plants, well-lit with ambient lighting.
How to overcome the multiple images of yourself in scenes
This is actually quite simple, you just need to tell the AI via prompt that only one person is yourself and the others should be given random faces:
erich is sitting in a high class restaurant and having dinner with a lovely woman wearing an elegant black dress. erich is smiling into the camera. he is wearing smart casual clothings. in the background we see the typical scenery of a restaurant with tables, people etc. only the person sitting on the table with the woman is erich. add random faces to all the others
And after the prompt piece added:
6. November 2024

Blog

What AI tools you need:

Character

Product

Step 1: Combining in a product scene using flash image 2.5:

Step 2: Creating the product video with Veo3

The prompt

bolt.new

v0

lovable

VScode + Cline

Cursor

The results

The basic idea

Code

Player and enemy sprites

Background Images

Hud and user interface

Mood Images

Music

Progress

My recap

How do they work…

A simple project to test the capabilities of the tools

Problems, problems, problems

My recap

[Update 26.02.]

First steps in 2023…

Going a step further: AI first IDE

The downside

Try #2 with minimax video

How to overcome the multiple images of yourself in scenes