Extracting All Colors in Images with Python

Shegocodes
5 min read · Nov 28, 2022

Improving your computer’s ability to see a wider range of colors.

Recently I built some improvements on top of the Pillow library so that a wider range of colors can be detected in a given image. As we all know, there are many colors in the universe and I want to help computers recognize more of them. I highly recommend running my Jupyter Notebook first to visualize the colors detected.

Simply running these lines of code did not do much for me:

from PIL import Image
img = Image.open(image_path)
colors = img.convert('RGB').getcolors(maxcolors=256)

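Here is a reference to Pillow’s documentation, specifically their Image module: https://pillow.readthedocs.io/en/stable/reference/Image.html. Note that getcolors returns None when the image contains more than maxcolors distinct colors, which is easy to hit with the limit of 256 used above.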

Screenshot of source.

To tackle this problem, first I built a web scraper to collect all the different color ranges and their associated color codes. I managed to collect a total of 865 different colors to use for my demo. This data is saved in my colors.csv file. Don’t worry, this part is already done and is accessible in my GitHub. The ranges can be found here for your reference: https://www.colorhexa.com/

Here is what the colors database looks like.

Second, I made a function that computes new dimensions for the original image so that neither its width nor its height is greater than 100, while maintaining its aspect ratio to prevent too much distortion. Smaller dimensions improved runtime significantly.

def resize_image(width, height, threshold):
    """
    Function takes in an image's original dimensions and returns the
    new width and height while maintaining its aspect ratio where
    both are below the threshold. Purpose is to reduce runtime and
    not distort the original image too much.

    Parameters
    ----------
    width : int
        original width of image
    height : int
        original height of image
    threshold : int
        max dimension size for both width and height
    """
    if (width > threshold) or (height > threshold):
        max_dim = max(width, height)
        if height == max_dim:
            new_width = int((width * threshold) / height)
            new_height = threshold
        if width == max_dim:
            new_height = int((height * threshold) / width)
            new_width = threshold
        return new_width, new_height
    else:
        return width, height
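
As a quick check of the arithmetic, here is a minimal, self-contained sketch of the same scaling rule written with integer division. The name fit_within and the sample dimensions are assumptions made up for illustration; they are not taken from the notebook.

```python
def fit_within(width, height, threshold=100):
    """Scale (width, height) so the longer side equals threshold.

    Hypothetical helper for illustration; dimensions already within
    the threshold are returned unchanged.
    """
    longest = max(width, height)
    if longest <= threshold:
        return width, height
    # Integer arithmetic avoids floating-point rounding surprises
    return (width * threshold) // longest, (height * threshold) // longest


print(fit_within(1920, 1080))  # landscape: (100, 56)
print(fit_within(800, 1200))   # portrait: (66, 100)
print(fit_within(80, 60))      # already small: (80, 60)
```

The shorter side is rounded down, so the aspect ratio is only approximately preserved at such small sizes.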

Third, I wrote a function that iterates through every pixel of the resized image and builds a hash map to keep track of all the different colors detected and their total number of occurrences.

def detect_colors(image_path):
    """
    Function returns colors detected in image.

    Parameters
    ----------
    image_path : str
        path to image file for detection

    Return
    ------
    sorted list of tuples (color, total number detections)
    """

    # Read image
    image = Image.open(image_path)

    # Convert image into RGB
    image = image.convert('RGB')

    # Get width and height of image
    width, height = image.size
    print(f'Original dimensions: {width} x {height}')

    # Resize image to improve runtime
    width, height = resize_image(width, height, threshold=100)
    print(f'New dimensions: {width} x {height}')
    image = image.resize((width, height))

    # Iterate through each pixel
    detected_colors = {}  # hash-map
    for x in range(0, width):
        for y in range(0, height):
            # r,g,b value of pixel
            r, g, b = image.getpixel((x, y))
            rgb = f'{r}:{g}:{b}'
            if rgb in detected_colors:
                detected_colors[rgb] += 1
            else:
                detected_colors[rgb] = 1

    # Sort colors from most common to least common
    detected_colors = sorted(detected_colors.items(), key=lambda x: x[1], reverse=True)
    return detected_colors
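
The counting and sorting step can also be expressed with collections.Counter from the standard library. The sketch below is only an illustration of that idea: count_colors is a hypothetical name, the pixel list is made up, and colors are kept as (r, g, b) tuples rather than the 'r:g:b' strings used above.

```python
from collections import Counter


def count_colors(pixels):
    """Return (color, count) pairs sorted from most to least common."""
    return Counter(pixels).most_common()


# Made-up pixel data standing in for a tiny image
sample_pixels = [
    (255, 0, 0), (0, 0, 255), (255, 0, 0),
    (255, 0, 0), (0, 0, 255), (0, 128, 0),
]
print(count_colors(sample_pixels))
# [((255, 0, 0), 3), ((0, 0, 255), 2), ((0, 128, 0), 1)]
```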

Fourth, I calculated the absolute differences between each detected color's (R,G,B) values and the (R,G,B) values of every reference color in colors.csv. I stored the summed differences in a list of dictionaries and picked the entry with the shortest distance as the best match.

def get_color_codes(detected_colors):
    """
    Function finds the best matches between detected color codes
    and source color codes from: https://www.colorhexa.com

    Parameters
    ----------
    detected_colors : list
        list of detected colors in image

    Return
    ------
    color_codes : list
        list of best matches
    """

    # `colors` is the reference data from colors.csv, held in a pandas
    # dataframe with the columns color, code, R, G and B
    color_codes = []
    for detected_color in detected_colors:
        # Split the 'r:g:b' key; detected_color[1] is the pixel count
        detected_color = detected_color[0].split(':')

        # Calculate absolute differences
        color_map = []
        for _, row in colors.iterrows():
            r = abs(int(detected_color[0]) - row['R'])
            g = abs(int(detected_color[1]) - row['G'])
            b = abs(int(detected_color[2]) - row['B'])

            # Query row values
            color = row['color']
            code = row['code'].replace('#', '')

            # Map results
            color_map.append({
                'color': color,
                'code': code,
                'distance': sum([r, g, b])
            })

        # Get best match (shortest distance)
        best_match = min(color_map, key=lambda x: x['distance'])

        # Get color code
        color_code = best_match['code']
        if color_code not in color_codes:
            color_codes.append(color_code)

    return color_codes
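
To make the distance calculation concrete, here is a small standalone sketch of the same idea: sum the absolute per-channel differences and keep the reference color with the smallest total. The helper names and the four-color palette are assumptions for illustration only; the real reference table has 865 rows.

```python
def manhattan(a, b):
    """Sum of absolute per-channel differences between two RGB tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest_color(rgb, palette):
    """Return the palette name whose RGB value is closest to rgb."""
    return min(palette, key=lambda name: manhattan(rgb, palette[name]))


# Tiny made-up palette standing in for colors.csv
palette = {
    'red': (255, 0, 0),
    'green': (0, 128, 0),
    'blue': (0, 0, 255),
    'white': (255, 255, 255),
}

print(manhattan((250, 10, 5), palette['red']))  # 5 + 10 + 5 = 20
print(nearest_color((250, 10, 5), palette))     # red
print(nearest_color((240, 240, 250), palette))  # white
```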

To improve runtime further, I recommend slicing the list of detected colors, since I realized that many of the detected colors are not really different from one another; they are only slight variations. Extracting and analyzing every 10th detected color will do.

color_codes = get_color_codes(detected_colors[0::10]) # list slice
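
For reference, this is what a step of 10 keeps from a list. The data below is made up purely to show the indexing.

```python
# Made-up (color, count) pairs, already sorted from most to least common
detected = [(f'color_{i}', 100 - i) for i in range(25)]

sampled = detected[0::10]  # keep positions 0, 10 and 20
print([name for name, _ in sampled])      # ['color_0', 'color_10', 'color_20']
print(len(detected), '->', len(sampled))  # 25 -> 3
```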

Finally, I returned all the closest matches with their associated color names. Results are stored in a pandas dataframe.

import os
import pandas as pd

def get_association(color_codes):
    """
    Function returns color name associated w/ detected color codes.

    Parameters
    ----------
    color_codes : list
        list of detected color codes in image

    Return
    ------
    res : pandas dataframe
        color names associated with respective color codes
        (an empty list if no color codes were given)
    """

    res = []
    for color_code in color_codes:
        # Path to this color's .png file (not used further in this function)
        colorfile = os.path.join('colors', color_code + '.png')
        # Query color name associated with color code
        colorname = colors[colors['code'] == f'#{color_code}']
        color_name = colorname['color'].values[0]
        # Append results...
        res.append({
            'color name': color_name,
            'color code': f'#{color_code}'
        })

    # Generate pandas dataframe
    if len(res) == 0:
        return []
    return pd.DataFrame(res)

Read color strips from left to right, then top to bottom to get most dominant to least dominant.

I got a total of 89 detected color tones and shades from Andy Warhol’s Marilyn Monroe, but usually the top 10 would work for most use cases. I simply wanted to display them all to show my code’s robustness.

Warning: Lighting and shadows can affect how the algorithm performs and detects colors. Other than that, thanks for reading! My source code and Jupyter Notebook can be accessed via GitHub. Please give me a follow and let me know if you run into any issues with my code. Feel free to contact me here.

Thank you so much for making it to the end of this page. If you found this helpful, please feel free to support me and Shegocodes by giving me a follow here on Medium and/or buying me a cup of coffee so I can continue to contribute to open source work and build. Happy Coding!
