Is a picture worth a thousand words? — A data driven approach

Spoiler alert — No, it’s not!

I get the point, it is just an English-language adage, but come on, isn't it disturbing? A thousand words is a lot.

Word Cloud of captions

I feel that unless you know more about a picture and have some context for it, it is not worth anywhere near 1,000 words.

So, if a picture is not worth a thousand words, how many words is it worth, exactly?

Recently, I found Conceptual Captions, a new dataset with 3.3M images annotated with captions.

Conceptual Caption images and their raw descriptions are harvested from the web, and therefore represent a wider variety of styles. More precisely, the raw descriptions are harvested from the Alt-text HTML attribute associated with web images. — https://ai.google.com/research/ConceptualCaptions

Here are two examples of images with their corresponding captions from the dataset.

american football player looks downfield during the second half of a football game against sports team
young business woman on a bench

I cleaned and processed the Conceptual Captions dataset to count the number of words in every caption. This is how the data looked after processing.

Count of total words
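The counting step could be sketched roughly like this in Python. This is not the code from my repo, just a minimal illustration. It assumes the captions arrive as tab-separated lines with the caption in the first column, and that a "word" is any whitespace-separated token. The function names and the example.com URLs are made up for the example.

```python
def count_words(caption: str) -> int:
    """Count whitespace-separated tokens in a caption."""
    return len(caption.split())


def caption_word_counts(lines):
    """Yield (caption, word_count) for every non-empty line.

    Assumption: each line is tab-separated and the caption is the
    first field; anything after the first tab is ignored.
    """
    for line in lines:
        caption = line.rstrip("\n").split("\t")[0].strip()
        if caption:
            yield caption, count_words(caption)


# Two captions from the examples above, with placeholder URLs.
sample = [
    "american football player looks downfield during the second half "
    "of a football game against sports team\thttps://example.com/a.jpg\n",
    "young business woman on a bench\thttps://example.com/b.jpg\n",
]

for caption, n in caption_word_counts(sample):
    print(n, caption)
```

For the two example captions this gives 16 and 6 words respectively.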

I created a histogram in Tableau. Following are the results.

As we can see, the histogram is positively skewed. Around 89% of the 3.32M captions were between 5 and 15 words long.
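I built the histogram in Tableau, but the same two things (the per-length frequencies and the share of captions in a given range) can be sketched in a few lines of Python. The helper names and the toy list of word counts below are made up for illustration; they are not the real dataset.

```python
from collections import Counter


def length_histogram(word_counts):
    """Map each caption length to the number of captions with that length."""
    return Counter(word_counts)


def share_in_range(word_counts, low, high):
    """Fraction of captions whose length is between low and high, inclusive."""
    counts = list(word_counts)
    if not counts:
        return 0.0
    inside = sum(1 for n in counts if low <= n <= high)
    return inside / len(counts)


# Toy word counts, NOT taken from Conceptual Captions.
toy_counts = [3, 6, 6, 9, 10, 12, 16, 22]

hist = length_histogram(toy_counts)
for length in sorted(hist):
    # Crude text histogram: one '#' per caption of this length.
    print(f"{length:>2} {'#' * hist[length]}")

print(share_in_range(toy_counts, 5, 15))  # 0.625 for the toy data
```

Running `share_in_range` with 5 and 15 over the real word counts is the calculation behind the 89% figure.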

Stats at a glance

Here are the major statistics at a glance. Across the 3.32M images, caption length ranged from 3 to 50 words. The average caption had 9.95 words, or approximately 10.

So, I feel that unless you have more context for an image, it is only worth around 10 words. In other words, to make sense of a random image, you need about 10 words on average. That is a saving of 990 words (several kilobytes of ASCII text, at one byte per character).
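If you would rather compute these summary numbers in code than in Tableau, a minimal sketch with Python's standard `statistics` module could look like this. The `summarize` helper and the toy word counts are assumptions for the example, not the real data.

```python
import statistics


def summarize(word_counts):
    """Return basic summary statistics for a list of caption lengths."""
    counts = list(word_counts)
    return {
        "n": len(counts),
        "min": min(counts),
        "max": max(counts),
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
    }


# Toy word counts, NOT taken from Conceptual Captions.
toy_counts = [3, 6, 6, 9, 10, 12, 16, 22]

stats = summarize(toy_counts)
print(stats)

# A mean above the median is what a positively skewed distribution looks like.
print(stats["mean"] > stats["median"])
```

On the toy data the mean (10.5) sits above the median (9.5), the same pattern you would expect from the long right tail in the real histogram.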

Code/Viz on GitHub: @kanishk307

Kanishk Jain
