In our digital world, colors are defined by codes. These color codes are essential for web design, digital art, and many other applications. But what if you could predict a color code based on a text description? This was the question I sought to answer with a recent machine learning project.
I developed a neural network that does just that: it predicts an RGB color code from a color name. In this post, I will walk you through my approach.
First things first, I installed the necessary libraries in my environment:
!pip install numpy pandas keras tensorflow scikit-learn
I used Python for this project, and these libraries include tools for numerical operations, data handling, and machine learning.
My dataset for this project was a CSV file with columns for color names and their corresponding RGB values. I scaled the RGB values from 0-255 down to the 0-1 range, which matches the sigmoid output layer used later in the network.
import pandas as pd
data = pd.read_csv('unique-v1.csv')
data[['r', 'g', 'b']] = data[['r', 'g', 'b']] / 255
Then, I used the Keras Tokenizer class to convert the color names into sequences of integers.
from tensorflow.keras.preprocessing.text import Tokenizer

# Build a vocabulary from the color names and map each name to a sequence of integer IDs
tokenizer = Tokenizer()
tokenizer.fit_on_texts(data['color'])
sequences = tokenizer.texts_to_sequences(data['color'])
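One step worth spelling out: color names vary in length, so the sequences need to be padded to a common length before they can go into the network as a single array. A minimal sketch, assuming Keras's pad_sequences and padding to the longest name in the dataset:

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Pad every tokenized name to the same length so they form one fixed-size array
max_length = max(len(seq) for seq in sequences)
padded_sequences = pad_sequences(sequences, maxlen=max_length, padding='post')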
To represent the words, I used pre-trained GloVe embeddings from Stanford. These embeddings transform the tokenized words into high-dimensional vectors that capture their semantic meanings.
!wget http://nlp.stanford.edu/data/glove.6B.zip
!unzip glove.6B.zip
The code for loading the embeddings is fairly straightforward. I ended up with an embedding matrix that I could use in my neural network.
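For reference, a minimal version of that loading code might look like the following; I'm assuming the 100-dimensional vectors here (glove.6B.100d.txt), so embedding_dim would be 100:

import numpy as np

embedding_dim = 100  # must match the GloVe file used

# Parse the GloVe file into a word -> vector lookup
embeddings_index = {}
with open('glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        embeddings_index[values[0]] = np.asarray(values[1:], dtype='float32')

# One row per token in the tokenizer's vocabulary (row 0 stays all zeros for padding)
embedding_matrix = np.zeros((len(tokenizer.word_index) + 1, embedding_dim))
for word, i in tokenizer.word_index.items():
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector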
To evaluate the performance of my model, I split the data into a training set and a validation set using an 80:20 ratio.
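With the padded sequences from earlier, that split is a one-liner with scikit-learn (a sketch; the random_state is just for reproducibility):

from sklearn.model_selection import train_test_split

# Hold out 20% of the examples for validation
x_train, x_val, y_train, y_val = train_test_split(
    padded_sequences, data[['r', 'g', 'b']].values,
    test_size=0.2, random_state=42)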
The architecture of my model was relatively simple. I started with an embedding layer, followed by a global average pooling layer and a dense layer. The final layer was another dense layer with 3 units (for r, g, b) and a sigmoid activation function to ensure the output values were between 0 and 1.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, GlobalAveragePooling1D, Dense

model = Sequential([
    # Frozen GloVe embeddings, averaged into one fixed-size vector per color name
    Embedding(input_dim=len(tokenizer.word_index) + 1, output_dim=embedding_dim,
              weights=[embedding_matrix], trainable=False),
    GlobalAveragePooling1D(),
    Dense(64, activation='relu'),
    Dense(3, activation='sigmoid')  # one unit each for r, g, b, kept in 0-1 by the sigmoid
])
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=500)
Now for the fun part! With my trained model, I could take a color name and predict its RGB value. Here is how I did it for the color "yellow":
from PIL import Image

def get_rgb(color_text):
    # ... (omitted some code for brevity)
    predicted_color_rgb = model.predict(new_color_sequence)
    predicted_color_rgb = (predicted_color_rgb * 255).astype(int)
    img = Image.new('RGB', (300, 300), color=tuple(predicted_color_rgb[0]))
    display(img)
get_rgb("yellow")
Voila! My neural network was able to generate the RGB value for "yellow" and create an image with the predicted color.
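For completeness, the code I trimmed out of get_rgb just prepares the input the same way the training data was prepared; here is a hypothetical reconstruction, assuming the same tokenizer and max_length used during training:

from tensorflow.keras.preprocessing.sequence import pad_sequences

# Hypothetical version of the omitted preprocessing inside get_rgb
new_color_sequence = tokenizer.texts_to_sequences([color_text])
new_color_sequence = pad_sequences(new_color_sequence, maxlen=max_length, padding='post')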
Lastly, I saved both the model and tokenizer for future use.
model.save('color_model.h5')
import pickle
with open('tokenizer.pickle', 'wb') as handle:
    pickle.dump(tokenizer, handle, protocol=pickle.HIGHEST_PROTOCOL)
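Loading them back later is just as straightforward; a quick sketch:

from tensorflow.keras.models import load_model
import pickle

# Restore the trained model and the fitted tokenizer
model = load_model('color_model.h5')
with open('tokenizer.pickle', 'rb') as handle:
    tokenizer = pickle.load(handle)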
I hope you found this exploration into the intersection of machine learning and color prediction intriguing. Let's keep pushing the boundaries of what we can achieve with these powerful tools!