Battle of the AIs
April 22nd, 2026 • Electronics
There is little doubt that the use of AI in coding is improving developer productivity, both in producing code and in diagnosing issues. AI can bring a large corpus of knowledge into one place.
This leads us to the question: which one should we use?
Cloud or Local
The answer will really depend upon two further questions:
- How much can we afford to pay?
- Is the code private?
This comparison is based upon the following premise: the code is open source and cost needs to be controlled.
Copilot will be used for the cloud service, and qwen3-coder-next served by Ollama will be used for the local AI.
Copilot was selected because it is available free to GitHub users and was also available on a low-cost GitHub account. qwen3-coder-next currently appears as one of the top models for software development.
All of the models were accessed through a Command Line Interface (CLI).
The Task
All four models were asked to complete the same task: implement a modal message box in an existing application running on an M5Stack Tab5. All of the models were provided with the same source code and the same specification. The specification was detailed:
## Message Dialog
A modal message dialog can be displayed over the current screen contents at any time by calling:
```cpp
Display::ShowMessageDialog(const char *title, const char *message);
```

### Appearance
The dialog is a centred 600 × 300 pixel rounded-rectangle popup (corner radius 16 pixels) drawn over the existing screen contents.
| Layer | Description |
|---|---|
| Outer border | White, 2 pixels |
| Inner fill | Black |

From top to bottom the dialog contains three elements:
#### Title
- Drawn in bold white using `fonts::Font4`.
- Centred horizontally within the dialog.
- Vertically centred within a 56-pixel title band at the top of the dialog.
- The bold effect is achieved by rendering the string twice: once at the nominal position and once shifted one pixel to the right.
- A white horizontal rule separates the title band from the message body.
- The `title` parameter is mandatory and must not be `nullptr` or empty.

#### Message body
- Drawn in white using `fonts::Font4`.
- Centred horizontally within the dialog.
- The text block is vertically centred in the area between the horizontal rule and the OK button.
- Multi-line messages are supported using `\n` as the line separator. Each line is drawn individually at 20-pixel line spacing.

#### OK button
- A rounded-rectangle button (160 × 44 pixels, corner radius 8 pixels) centred horizontally at the bottom of the dialog, 20 pixels above the dialog's lower edge.
- White border, black fill, white label text ("OK").

### Behaviour
1. `ShowMessageDialog` draws the dialog immediately, then returns to the caller without blocking.
2. A short-lived FreeRTOS task (`MsgDialogTask`, stack 4096 bytes, priority 5) is spawned to wait for dismissal.
3. The normal panel touch callback (`OnPanelTouch`) is unregistered for the duration; touch events are routed exclusively to the dialog’s own handler (`OnMessageDialogTouch`).
4. `DisplayTask` continues to receive and apply queued `DisplayMessage` state updates while the dialog is visible, but suppresses all draw calls until the dialog is dismissed.
5. When the user taps the OK button, `MsgDialogTask` redraws the full main interface (clearing the dialog) and re-registers `OnPanelTouch`. All 32 storeline dirty flags are forced to `true` before `DrawAllStorelines()` to guarantee a full repaint.
6. `_prevTouched` is set to `true` before re-registering `OnPanelTouch` to prevent the finger-lift from the OK button generating a spurious panel press.

### Thread safety
`ShowMessageDialog` may be called from any task context, including from within a touch callback (e.g. the `LoadCallback` invoked by `TouchTask`). The non-blocking spawn pattern avoids the deadlock that would result from a blocking wait inside `TouchTask`.
All draw calls inside `ShowMessageDialog` and `MsgDialogTask` are serialised by `_displayMutex`.
### Example usage
```cpp
Display::ShowMessageDialog("Load Error", "File not found:\n/sdcard/missing.ssem");
Display::ShowMessageDialog("Information", "Program loaded successfully.");
```
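As a rough sketch of the layout rules above (the helper functions and their names are illustrative, not part of the project's code), the multi-line splitting and vertical centring could be implemented along these lines:

```cpp
#include <string>
#include <vector>

// Split the message on '\n', keeping blank lines (one of the things the
// specification requires and one of the models got wrong).
static std::vector<std::string> SplitLines(const char *message)
{
    std::vector<std::string> lines;
    std::string current;
    for (const char *p = message; *p; ++p)
    {
        if (*p == '\n') { lines.push_back(current); current.clear(); }
        else current += *p;
    }
    lines.push_back(current);
    return lines;
}

// Y coordinate of the first line so that the whole block, at the given
// line spacing, is vertically centred between areaTop and areaBottom.
static int FirstLineY(int areaTop, int areaBottom, int lineCount, int spacing = 20)
{
    int blockHeight = lineCount * spacing;
    return areaTop + ((areaBottom - areaTop) - blockHeight) / 2;
}
```

Each line would then be drawn at `FirstLineY(...) + i * spacing`, horizontally centred within the dialog.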
Most of the work had already been done, so the AIs only had to provide the actual implementation.
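The non-blocking spawn described in the Behaviour section is the heart of the implementation. A minimal desktop sketch of the same pattern, with `std::thread` standing in for the FreeRTOS `xTaskCreate` call and hypothetical flag names, might look like:

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Desktop analogue of the pattern in the specification: the firmware spawns
// a FreeRTOS task (MsgDialogTask); std::thread stands in for it here.
std::atomic<bool> dialogVisible{false};
std::atomic<bool> okTapped{false};

void ShowMessageDialogSketch()
{
    dialogVisible = true;              // the real code draws the dialog here
    std::thread([] {                   // stands in for MsgDialogTask
        while (!okTapped)              // wait for dismissal off the caller's thread
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        dialogVisible = false;         // the real code redraws the main interface
    }).detach();
    // Returns immediately: there is no blocking wait in the caller
    // (e.g. TouchTask), which is what avoids the deadlock the
    // specification warns about.
}
```

The caller never waits, so the dialog can safely be shown from inside a touch callback.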
The Models
The following models were used in this comparison:
- qwen3-coder-next (local)
- gemini-3-flash-preview (cloud)
- GPT-4.1 (cloud)
- Claude 4.6 (cloud)
All of the AIs were given the same prompt:
The Documentation/DisplaySpecification.md file has been updated to include the specification for a modal MessageBox.
Update the project to implement the MessageBox feature.
The local model was run on a laptop with 128 GB of RAM with Ollama consuming around 56 GB when serving qwen3-coder-next.
Results
qwen3-coder-next – 41 minutes
This model needed some supervision and had to be prompted on several occasions. The model made the following errors:
- Introduced two compiler warnings
- Corrupted the display
- Did not handle blank lines correctly
- Caused a runtime exception due to incorrect code
Prompting and providing additional information enabled the AI to resolve these issues.
Gemini-3-flash-preview – 9 minutes
Completed the task correctly with no errors.
GPT-4.1 – 12 minutes
This model completed the task with only two errors:
- Corrupted the display
- Main interface not refreshed after the MessageBox was dismissed
Claude 4.6 – 5 minutes
Claude completed the task with no errors.
Conclusion
The local model was the slowest to complete the task. It consumed a fair amount of memory but surprisingly did not use a large amount of CPU power. The use of local resources maintains privacy and means there are no usage limits.
Gemini completed the task with no errors, but access is restricted to a limited number of tokens per day.
Accessing Claude through the Copilot CLI was certainly the quickest, but this consumes a premium request for each task.
GPT-4.1 is also available through the Copilot CLI. While not perfect and needing some guidance, this model offered a balance between cost and speed, with no additional cost when accessed through the Copilot CLI.
Copilot with GPT-4.1 seems a reasonable balance between speed, accuracy and cost for open-source projects. Claude 4.6 has proven itself a number of times, and this task shows how it can benefit projects where code privacy is not an issue, although there is an associated cost.





