Playing with my kid last night, I stumbled across yet another problem that humans have an easy time solving, while LLMs seemed incapable of getting close to the correct answer.
Prompt: How many colored squares are on this board? Do not count white squares.
We asked 10 LLMs to count the number of colored squares in this image. None of them got close. Below is the list, sorted roughly by how far off they were, from more than 223,000% to about 36%. Granted, these are Large Language Models, not specialized computer vision algorithms, which I have no doubt would solve this problem with ease. Still, it was interesting to see how hard this problem was for them, and how confident some of them were in their wrong answers.
- GPT-4.5 – 781,695
- Perplexity Sonar – 783,360
- Gemini 2.0 Flash – 783,360
- Claude Sonnet 3.7 Thinking – 160
- Deep Research – 166-170
- GPT-4o – 177
- o3-mini – 150-180
- Grok-2 – 196
- Claude 3.7 Sonnet – 150-200
- R1 – 225
- Human – 350 (Correct)
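For reference, "how far off" here means the absolute error as a percentage of the human count of 350. A minimal sketch of that calculation, using only the models that gave a single number (the ranged answers are left out, and the function name is just illustrative):

```python
CORRECT = 350  # the human count reported in this post

# Single-number answers only; ranged answers such as 150-180 are omitted.
answers = {
    "GPT-4.5": 781_695,
    "Perplexity Sonar": 783_360,
    "Gemini 2.0 Flash": 783_360,
    "Claude Sonnet 3.7 Thinking": 160,
    "GPT-4o": 177,
    "Grok-2": 196,
    "R1": 225,
}


def percent_error(answer: int, correct: int = CORRECT) -> float:
    """Absolute error as a percentage of the correct value."""
    return abs(answer - correct) / correct * 100


errors = {model: percent_error(value) for model, value in answers.items()}

# Print the models from furthest off to closest.
for model, err in sorted(errors.items(), key=lambda item: item[1], reverse=True):
    print(f"{model}: {err:,.1f}% off")
```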
GPT-4.5 – 781,695
The board contains 781,695 colored squares. White squares were excluded from the count.
Perplexity Sonar – 783,360
The result of 783,360 likely represents the count of individual colored pixels in the image rather than the number of colored squares on the board. To determine the number of colored squares, we need to account for the size of each square.
Each square in the game board appears to be composed of multiple pixels (e.g., a grid-like design). If you can provide the pixel dimensions of one square, I can help refine this calculation.
Gemini 2.0 Flash – 783,360
There are 783,360 colored squares on the board, not counting the white squares. A Python script using the PIL and NumPy libraries was used to arrive at this answer. The script loaded the image, converted it into an array, defined the RGB values for the white squares, and counted the non-white squares.
Claude Sonnet 3.7 Thinking – 160
Based on the image shown, I can see a game board that appears to be Blokus or a similar tile-placement game. The board contains colored translucent plastic pieces in four colors: red, yellow, blue, and turquoise/teal.
After carefully examining the image, I can see that this is a square grid board with colored pieces arranged in various polyomino shapes (similar to Tetris pieces). The colored squares occupy much of the board, with white spaces showing the empty grid positions.
Counting each colored square individually:
- Red squares: approximately 40
- Yellow squares: approximately 40
- Blue squares: approximately 38
- Turquoise/teal squares: approximately 42
In total, there are approximately 160 colored squares on this board. This count excludes all white/empty squares as requested.
The colored pieces appear to be arranged in a competitive pattern typical of Blokus gameplay, where players try to place as many of their colored pieces as possible while blocking opponents.
Deep Research – 166-170
Analysis of Colored Squares on Game Board
Based on careful examination of the provided image, the board shows a colorful arrangement of translucent square pieces placed on a white grid. The board contains various colored squares in red, blue, yellow, and teal forming different patterns across the playing surface.
Board Structure and Composition
The image displays what appears to be a puzzle or game board with a grid layout containing colored translucent pieces. The board features a white background grid with square pieces in four distinct colors: red, blue, yellow, and teal (which the analysis code attempted to identify as green). These colored squares are arranged in various configurations across the board, creating an intricate pattern with some white spaces remaining empty. The colored pieces appear to be made of translucent plastic material, allowing some light to pass through while maintaining their vibrant coloration.
Color Distribution Analysis
According to the computational analysis of the image, the distribution of colored pixels shows a predominance of red elements, followed by yellow and then blue. The automated pixel counting identified:
- Red elements: 83,758 pixels
- Yellow elements: 67,271 pixels
- Blue elements: 20,814 pixels
- Teal/green elements: The code detected only 8 pixels, which appears to be an error in color detection as teal pieces are clearly visible in the image
The total number of colored pixels detected was 171,851. However, this pixel count does not directly translate to the number of colored square pieces on the board, as each piece contains multiple pixels.
Square Count Methodology
To determine the actual number of colored squares on the board as requested, we need to analyze the grid structure rather than merely counting pixels. Based on visual inspection, the board appears to be a standard square grid, similar to those used in games like Blokus or various puzzle games. Each colored piece occupies exactly one square position on the grid.
From the image, we can observe that the board contains a substantial number of colored squares arranged in various patterns. The grid appears to be approximately 16×16 in dimension, which would provide 256 total positions. Looking at the distribution and density of the colored squares on the board, approximately 65-70% of the grid positions are occupied by colored squares, with the remaining being white spaces.
Total Colored Square Count
Based on the visual analysis of the grid and the density of colored pieces, the total number of colored squares on this board is approximately 166-170 squares. This estimate is derived from analyzing the grid structure and the proportion of the board filled with colored pieces rather than the raw pixel count.
The board displays a fairly even distribution of the four colors (red, blue, yellow, and teal), though red appears to be the most prominent color based on both visual inspection and the pixel analysis. The pieces are arranged in what appears to be a strategic pattern, possibly representing a game in progress or a completed puzzle arrangement.
Conclusion
The board contains approximately 166-170 colored squares in total, not counting the white spaces. This count represents the actual physical square pieces placed on the board rather than pixel measurements. The colored squares create an engaging visual pattern with their arrangement of red, blue, yellow, and teal pieces distributed across the white grid surface.
GPT-4o – 177
To calculate the number of colored squares on this board, we need to count all the non-white squares in the image.
The board is a 14×14 grid, which totals 196 squares. From visual inspection, the white squares are the empty spaces scattered across the board. By manually counting the white squares in the image, there are 19 white squares.
Subtracting the white squares from the total:
196 – 19 = 177
Thus, there are 177 colored squares on this board.
o3-mini – 150-180
The image shows a colorful grid-based puzzle board with transparent colored blocks arranged in a pattern. This appears to be a Blokus-style game or similar strategic board game.
Based on visual inspection, this is a square grid board with colored transparent pieces in red, yellow, blue, and teal/green. The board appears to be approximately a 16×16 grid, with many of the squares filled with colored pieces while others remain white (empty).
From carefully examining the image, I can estimate there are approximately 150-180 colored squares on the board. Each colored square appears as a small transparent block positioned within the grid structure, and they are arranged in various geometric patterns across the board.
The white squares are not counted in this estimate, as specified in your query. The colored squares make up roughly 60-70% of the total board spaces, which suggests around 150-180 colored pieces if the board is indeed a 16×16 grid (which would have 256 total positions).
Grok-2 – 196
To count the colored squares on the board in the image, a visual inspection reveals that the board is filled with red, yellow, green, and blue squares. White squares are not to be counted.
The board appears to be a 14×14 grid, which means there are 14×14 = 196 total squares. Since the board is fully covered by colored squares (with no visible white spaces), the total number of colored squares is 196.
Claude 3.7 Sonnet – 150-200
The image shows a board with colored plastic squares arranged in a grid pattern. The squares are in four colors: red, blue, yellow, and teal/green, placed on a white background grid.
Based on the analysis of the image, there are numerous colored squares on the board. While an exact manual count would be challenging from visual inspection alone, the image processing analysis indicates there are approximately 774,627 colored pixels in the image.
However, this pixel count doesn’t directly translate to the number of colored squares. Looking at the board structure, it appears to be a standard Blokus or similar puzzle game board, which typically has around 225-400 squares total (depending on the specific game variant), with many of those positions filled with colored pieces.
From visual estimation of the grid pattern visible in the image, there appear to be roughly 150-200 colored squares on the board, not counting the white background squares.
R1 – 225
The image shows a game board filled with translucent colored plastic pieces arranged in a grid pattern. Counting each colored square individually:
Red squares: Approximately 57
Blue squares: Approximately 55
Yellow squares: Approximately 54
Green/teal squares: Approximately 59
In total, there are approximately 225 colored squares on this board. This count excludes all white squares as instructed.
The board appears to be a tile-placement puzzle game similar to Blokus or Tetris-style puzzles, where different colored polyomino pieces are arranged on a grid. The colored squares are arranged in various patterns and shapes across the white game board.
Human – 350 (Correct)
It’s a standard Blokus board: a 20×20 grid, or 400 total squares. There are 84 pieces in Blokus, 21 in each of four colors. Counting up the shapes gives us 89 squares per color. 89*4 = 356 colored squares and 400-356 = 44 white ones.
Let’s double check. I count the sides of the grid as 20×20. Going row by row, I count 50 white spaces, so we must be missing a few pieces from the set, but that makes for 350 colored spaces and 50 white spaces on the board in the picture.
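The arithmetic above can be checked in a few lines of Python. This is only a sketch: the per-size breakdown of the 21 pieces (1 monomino, 1 domino, 2 trominoes, 5 tetrominoes, 12 pentominoes) is not spelled out in the text and is filled in here from the standard Blokus set, and the variable names are illustrative.

```python
# Pieces in one player's set, keyed by how many squares each piece covers.
# Assumed breakdown: 1 monomino, 1 domino, 2 trominoes, 5 tetrominoes, 12 pentominoes.
PIECES_PER_SIZE = {1: 1, 2: 1, 3: 2, 4: 5, 5: 12}
COLORS = 4
BOARD_SIDE = 20

pieces_per_color = sum(PIECES_PER_SIZE.values())
squares_per_color = sum(size * count for size, count in PIECES_PER_SIZE.items())
max_colored = squares_per_color * COLORS   # every piece of every color placed
total_cells = BOARD_SIDE ** 2

# Cross-check against the white squares counted row by row in the photo.
white_counted = 50
colored_on_board = total_cells - white_counted
squares_missing = max_colored - colored_on_board

print(f"{pieces_per_color} pieces and {squares_per_color} squares per color")
print(f"At most {max_colored} colored squares on {total_cells} cells")
print(f"Photo: {colored_on_board} colored, {squares_missing} squares' worth of pieces missing")
```

The two methods disagree by 6 squares, which is consistent with a few pieces being missing from the set.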
Notably, I did the opposite of what the prompt literally says: I counted the white squares, because I read “do not count white squares” as meaning they should be left out of the final total, not as an instruction about how to approach the problem. My initial calculation was also wrong, and I only caught the correct number after double-checking my work with a second method, a frequent tactic for humans that LLMs don’t use as much.
What can we learn from this?
This experiment highlights several important insights about the current state of LLMs and their visual perception capabilities:
- Fundamental limitations in visual reasoning: Even advanced LLMs struggle with basic counting tasks that humans find relatively straightforward. The dramatic variance in answers (from 150 to over 780,000) demonstrates how far these models are from reliable visual comprehension.
- Domain knowledge matters: The human solver immediately recognized this as a standard Blokus board with specific dimensions (20×20) and game pieces, which provided context for solving the problem accurately.
- Confidence doesn’t equal accuracy: Several models provided extremely precise but wildly incorrect answers. GPT-4.5 and Gemini 2.0 Flash confidently stated numbers in the 780,000 range without recognizing how implausible they were. Perplexity Sonar reported the same figure as Gemini, though it did note that the number likely reflected colored pixels rather than squares.
- Methodological differences: Models approached the problem differently—some counted pixels rather than squares, others estimated grid dimensions incorrectly, and some made reasonable approximations but still fell short. This reveals how different architectures process and interpret visual information.
- The human advantage: The human solver used domain-specific knowledge, spatial reasoning, and common sense verification (double-checking by counting white spaces) to arrive at the correct answer—cognitive skills that current AI systems don’t fully replicate.
- Practical implications: For applications requiring precise visual counting or object identification, specialized computer vision algorithms remain vastly superior to general-purpose LLMs. This highlights the importance of using the right tool for specific tasks.
This experiment serves as a humbling reminder that despite impressive advances in AI, fundamental visual reasoning tasks that humans master early in development remain challenging for even the most sophisticated language models.
A Better Approach
So how might we approach this problem with AI? Well, if you’re not writing your own code, using LLMs to generate code for a deterministic algorithm would probably be a significantly better approach for this counting problem. The experiment clearly shows that general-purpose LLMs struggle with direct visual counting tasks, but we know that specialized computer vision algorithms are being used for challenges more complicated than this in production environments.
For our case, a more effective approach would include:
Adding Domain Knowledge
The prompt could incorporate domain-specific knowledge about Blokus:
- The standard board dimensions (20×20)
- Typical piece configurations and constraints
- Expected ranges for colored vs. white squares
This would help validate results and catch errors that occurred in the LLM attempts.
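One way to make that validation concrete is a small plausibility check applied to whatever number comes back. The sketch below is an assumption about how such a check might look, not something used in the experiment; `check_count` and its constants are hypothetical names built from the Blokus facts above (400 cells, at most 89 × 4 = 356 colored squares).

```python
BOARD_CELLS = 20 * 20   # standard Blokus board
MAX_COLORED = 89 * 4    # every piece of all four colors placed


def check_count(colored: int) -> list[str]:
    """Return the reasons a claimed colored-square count is implausible.

    An empty list means the count passed these checks, not that it is correct.
    """
    problems = []
    if colored < 0:
        problems.append("count cannot be negative")
    if colored > BOARD_CELLS:
        problems.append(f"count exceeds the {BOARD_CELLS} cells on the board")
    elif colored > MAX_COLORED:
        problems.append(f"count exceeds the {MAX_COLORED} squares in a full set of pieces")
    return problems


for claimed in (783_360, 350, 160):
    print(claimed, check_count(claimed) or "plausible")
```

A check like this would have flagged the pixel-count answers immediately. It would not have caught answers such as 160, which are wrong but still within the possible range.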
Computer Vision + Deterministic Algorithm from scratch
Instead of asking an LLM to interpret the image directly, you could use an LLM to generate code for a specialized computer vision pipeline that:
- Preprocesses the image – Adjusting brightness/contrast and filtering noise
- Identifies the grid structure – Detecting it’s a standard Blokus 20×20 board
- Segments the image into individual squares using edge detection algorithms
- Classifies each square by color (colored vs. white) using color thresholds
- Implements counting logic with verification steps
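The classification and counting steps can be sketched in plain Python. This is a simplified illustration under several assumptions: the image has already been cropped to the 20×20 grid (so preprocessing and grid detection are skipped), it is available as rows of (R, G, B) tuples, and a pixel counts as colored when its HSV saturation is above a guessed threshold of 0.25. A real photo of translucent pieces would likely need tuning, and a library such as OpenCV would normally handle loading and segmentation. All function names here are hypothetical.

```python
import colorsys

GRID = 20  # assumed: standard Blokus board, image already cropped to the grid


def is_colored(rgb, min_saturation=0.25):
    """Treat a pixel as colored if its HSV saturation reaches a threshold.

    White and gray pixels have saturation near 0. The 0.25 cutoff is a guess.
    """
    r, g, b = (channel / 255 for channel in rgb)
    _, saturation, _ = colorsys.rgb_to_hsv(r, g, b)
    return saturation >= min_saturation


def count_colored_cells(pixels, grid=GRID):
    """Count grid cells whose center pixel is colored.

    `pixels` is a list of rows, each row a list of (R, G, B) tuples.
    """
    height, width = len(pixels), len(pixels[0])
    colored = 0
    for row in range(grid):
        for col in range(grid):
            # Sample the middle of each cell to stay clear of the grid lines.
            y = int((row + 0.5) * height / grid)
            x = int((col + 0.5) * width / grid)
            if is_colored(pixels[y][x]):
                colored += 1
    return colored


def count_with_verification(pixels, grid=GRID):
    """Return (colored, white) and check that they add up to the full board."""
    colored = count_colored_cells(pixels, grid)
    white = grid * grid - colored
    assert 0 <= colored <= grid * grid and colored + white == grid * grid
    return colored, white
```

Sampling a single center pixel keeps the sketch short. Averaging a small patch around the center of each cell would be more robust to glare and shadows.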
Assemble from Open Source
You could prompt an LLM to generate Python code using libraries like OpenCV (an open source computer vision library) for this specific task. For example:
Request code that uses OpenCV to:
- Detect the game board grid
- Identify each square’s color
- Count colored squares, excluding white
- Validate results against Blokus game knowledge
Why This Works Better
LLMs are much stronger at code generation than direct visual analysis. By having them generate deterministic algorithms, you leverage:
- The precision of computer vision techniques designed specifically for object counting
- Domain knowledge about Blokus that humans naturally applied
- Deterministic verification steps that can catch errors
These approaches create reproducible, (more) consistent solutions.
But if you only have one board to count with your child, maybe it’s just faster to let the human do it.