Playing with image embeddings

Quite a while ago I worked on image retrieval and ran a little experiment with algebraic operations on image embeddings extracted from a convolutional network. I downloaded a set of publicly available photos, extracted feature vectors using a pretrained ResNet-50, and applied cosine-distance k-NN search using linear combinations of the following query vectors: …
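
To make the idea concrete, here is a minimal sketch of a cosine-distance nearest-neighbour search over a linear combination of query embeddings. It uses NumPy and a tiny hand-made gallery instead of real ResNet-50 features; the function names (`l2_normalize`, `combine_queries`, `cosine_knn`) and the toy vectors are illustrative assumptions, not the code used in the experiment.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to unit L2 norm."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def combine_queries(vectors, weights):
    """Weighted sum of query embeddings, e.g. weights (1, 1, -1) for a + b - c."""
    vectors = np.asarray(vectors, dtype=np.float64)
    weights = np.asarray(weights, dtype=np.float64)
    return weights @ vectors


def cosine_knn(query, gallery, k=5):
    """Return indices and cosine distances of the k gallery rows closest to query."""
    q = l2_normalize(np.asarray(query, dtype=np.float64))
    g = l2_normalize(np.asarray(gallery, dtype=np.float64))
    distances = 1.0 - g @ q  # cosine distance = 1 - cosine similarity
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]


# Toy gallery standing in for extracted image embeddings (assumption).
gallery = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
])

# Query = embedding 0 + embedding 1; the closest gallery row should be row 3.
query = combine_queries([gallery[0], gallery[1]], [1.0, 1.0])
indices, dists = cosine_knn(query, gallery, k=2)
print(indices, dists)
```

With real features, `gallery` would be an array of shape (number of images, feature dimension), and negative weights in `combine_queries` would subtract a concept from the query.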

Semantic clustering of images

Recent studies in computer vision have shown that deep convolutional neural networks can learn high-level image representations. These representations carry rich semantics and can be very handy in various visual tasks. One such task that can make use of high-level features is semantic clustering. Let’s find some pictures in Google Web Search, feed them to a CNN, apply k-means to the extracted features, and take the biggest clusters. Here is what I got for different queries: …
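
The clustering step described above can be sketched roughly as follows. This is a plain NumPy version of Lloyd's k-means run on synthetic 2-D points that stand in for CNN features; the helper names (`kmeans`, `biggest_clusters`), the random initialisation, and the toy data are assumptions for illustration, not the original pipeline.

```python
import numpy as np


def assign(features, centroids):
    """Index of the nearest centroid (squared Euclidean distance) per row."""
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(features, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    features = np.asarray(features, dtype=np.float64)
    # Initialise centroids with k distinct random samples.
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(features, centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            members = features[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(features, centroids), centroids


def biggest_clusters(labels, top=2):
    """Cluster ids sorted by size, largest first."""
    counts = np.bincount(labels)
    return np.argsort(counts)[::-1][:top]


# Synthetic stand-in for extracted features: three blobs of different sizes.
rng = np.random.default_rng(42)
features = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.1, size=(30, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.1, size=(15, 2)),
    rng.normal(loc=[0.0, 5.0], scale=0.1, size=(5, 2)),
])

labels, centroids = kmeans(features, k=3)
print(biggest_clusters(labels, top=2))
```

Random initialisation can land in a poor local optimum, so in practice it is common to run k-means several times with different seeds or to use a library implementation with a smarter initialisation.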

Why top-5 error is a fairer metric than top-1 for ILSVRC models

For most publicly available models trained on the ImageNet dataset, both top-5 and top-1 errors are reported. The best-performing model from ILSVRC 2015 has 6.7% top-5 error and 23% top-1 error when evaluated on a single center crop. It’s pretty interesting that human error turned out to be not much better: top-5 error is around 5.1% according to the results of some experiments described in the article. An ensemble of models shows even better results: six models of different depths lead to 3.57% top-5 error (1st place in ILSVRC 2015). Top-1 error still seems to be very big and...…
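
For reference, top-k error is the fraction of examples whose true label is not among the k highest-scoring predictions. The sketch below computes it with NumPy on a made-up score matrix with three classes, so it uses k = 1 and k = 2 rather than 5; the function name `top_k_error` and the toy scores are assumptions for illustration.

```python
import numpy as np


def top_k_error(scores, true_labels, k):
    """Fraction of rows whose true label is not among the k best-scoring classes."""
    scores = np.asarray(scores, dtype=np.float64)
    true_labels = np.asarray(true_labels)
    # Class indices sorted by descending score, truncated to the first k.
    top_k = np.argsort(-scores, axis=1)[:, :k]
    hit = (top_k == true_labels[:, None]).any(axis=1)
    return 1.0 - hit.mean()


# Toy classifier scores for four images and three classes (assumption).
scores = np.array([
    [0.6, 0.3, 0.1],  # true class 0: correct at top-1
    [0.2, 0.5, 0.3],  # true class 2: wrong at top-1, correct at top-2
    [0.7, 0.2, 0.1],  # true class 2: wrong at both top-1 and top-2
    [0.1, 0.2, 0.7],  # true class 2: correct at top-1
])
true_labels = [0, 2, 2, 2]

top1 = top_k_error(scores, true_labels, k=1)
top2 = top_k_error(scores, true_labels, k=2)
print(top1, top2)
```

Top-k error can only decrease as k grows, which is why the top-5 numbers quoted above are so much lower than the top-1 numbers.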

Preparing a multilabel dataset for training a ConvNet with Caffe

Preparing a multilabel training set for the Caffe framework is a bit nontrivial. If you have multiple, possibly a varying number of, ground truth labels for each training example, here is how you can do it using an LMDB store. With an LMDB data source you need to separate your input data from your labels by creating two LMDB databases (one for the data and one for the labels). You also have to define two data layers in your network definition, set the same batch size for both of them, and disable shuffling so that the two stay aligned. …
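
The alignment requirement can be illustrated with a small sketch. It only prepares the records and does not write to LMDB: each example gets one zero-padded key that would be used in both databases, and its variable-length label list is turned into a fixed-length multi-hot vector. The multi-hot encoding, the helper names (`multi_hot`, `aligned_records`), and the file names are assumptions for illustration; the original post may encode labels differently.

```python
import numpy as np


def multi_hot(label_ids, num_classes):
    """Encode a variable-length list of class ids as a fixed-length 0/1 vector."""
    vec = np.zeros(num_classes, dtype=np.uint8)
    for class_id in label_ids:
        vec[class_id] = 1
    return vec


def aligned_records(samples, num_classes):
    """Yield (key, image_path, label_vector) for each sample.

    The same key is meant to be written to both the data store and the
    label store. LMDB iterates keys in lexicographic order, so zero-padded
    decimal keys preserve the original sample order in both stores.
    """
    for index, (image_path, label_ids) in enumerate(samples):
        key = "{:08d}".format(index).encode("ascii")
        yield key, image_path, multi_hot(label_ids, num_classes)


# Hypothetical training list: image path plus a varying number of labels.
samples = [
    ("cat_dog.jpg", [0, 1]),
    ("bird.jpg", [2]),
    ("everything.jpg", [0, 1, 2, 3]),
]

records = list(aligned_records(samples, num_classes=4))
for key, path, labels in records:
    print(key, path, labels.tolist())
```

Writing the image under `key` in one database and the label vector under the same `key` in the other keeps the two data layers in step, provided both read sequentially with the same batch size.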

Sentiment analysis with CoreNLP

With the rise of social media, customers’ opinions have become extremely valuable for businesses selling their products, for financial markets, and for social research. To extract opinions from customer reviews, comments, or other kinds of text data, you might want to know what sentiment analysis is. Sentiment analysis and opinion mining is the field of study that analyzes people’s opinions, sentiments, evaluations, attitudes, and emotions from written language [1]. So the basic task of sentiment analysis is classifying text into some emotive categories. The most common set of categories is: positive, neutral, and negative. Methods for sentiment analysis There is a number...…