Go to any local attraction and you'll see people taking selfies with statues, towers, all sorts of sights. Once, walking past tourists taking pictures of Greyfriars Bobby in Edinburgh, I reckoned that some of those pictures would end up on Twitter. I should be able to find these, and similar pictures, automatically. How difficult could it be?
Problem: porn. If you search the raw firehose for tweets containing "selfie", you'll find that about 60% of the results are porn. I suspect the equivalent live search on Twitter is doing some sort of filtering.
Even apart from the problem of porn, there is no guarantee that people will even use the term "selfie", so we need a different kind of filter.
My hypothesis was that the selfies I was after would have one person in the foreground on the bottom left or right, with the object of interest in the background. I coded this up, and left it running for a bit.
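To make the hypothesis concrete, here is a minimal sketch of the kind of check it implies. This is not the code I ran: it assumes the face boxes come from some face detector run beforehand, and the `Box` type, the function name and all the thresholds (`min_face_frac`, the bottom half, the outer thirds) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical face bounding box in pixels, origin at the top-left."""
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def looks_like_landmark_selfie(faces, img_w, img_h, min_face_frac=0.15):
    """Return True if exactly one large face sits in a bottom corner.

    All thresholds are illustrative assumptions, not tuned values.
    """
    # "One person in the foreground": read here as exactly one detected face.
    if len(faces) != 1:
        return False
    face = faces[0]

    # Foreground: the face should take up a fair share of the image height.
    if face.h / img_h < min_face_frac:
        return False

    # Use the centre of the face to decide where it sits in the frame.
    cx = face.x + face.w / 2
    cy = face.y + face.h / 2

    in_bottom_half = cy > img_h / 2
    in_outer_third = cx < img_w / 3 or cx > img_w * 2 / 3
    return in_bottom_half and in_outer_third
```

A rule this loose will accept any image with a single big face off to one side near the bottom, whatever the rest of the picture contains.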
Umm, ok.
I seem to have written a chat app detector. What else fits my description? TV shows?
Framed quotes, inspirational or otherwise.
With the above, it's easy to see how it matched, but some others are a bit more obscure. For each of the following, have a look at the tweet image for a bit before clicking through to the highlighted face.
You can see more of these examples, and other types, in my GitHub repo.
But did I actually find anything? It took me about an hour of trawling through the thousand or so tweets it had found, but here's a perfect example of what I wanted:
A one-in-a-thousand hit rate is not great, but given this was barely a day's worth of implementation and tweaking, it's not bad. It's also obviously improvable. For example:
This is a work in progress, and I'm quite happy with what I have so far, so I might leave this for a bit. Always other projects on the go!
Finally, I'll end with one example I found which doesn't quite fit my original intent but which I like nonetheless: