ACM Multimedia 2012 Main Conference Notes

This is a scratchpad … I’ll fill in the rest when I have time.

Papers

I really enjoyed the work by Heng Liu, Tao Mei et al., “Finding Perfect Rendezvous On the Go: Accurate Mobile Visual Localization and Its Applications to Routing”. They combine existing research into a very interesting mixture. They use a visual localization method based on Bundler to detect where in the city a mobile phone user is. The application scenario I liked best was their collaborative localization for rendezvous :)
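For context, this kind of Bundler-based localization typically matches features in the query photo against a 3-D point cloud reconstructed offline and then solves a perspective-n-point problem for the camera pose. Below is a minimal sketch of that generic pipeline in OpenCV; it is my own reconstruction under assumed inputs (db_descriptors, db_points3d, and K are hypothetical), not the paper’s code.

```python
# Generic image-based localization sketch (my reconstruction, not the
# paper's method): match query SIFT features against the descriptors of a
# Bundler-style 3-D point cloud, then recover the camera pose with PnP.
import numpy as np
import cv2

def localize(query_img, db_descriptors, db_points3d, K):
    """db_descriptors: float32 SIFT descriptors of the model points (N x 128),
    db_points3d: their 3-D positions (N x 3), K: 3x3 camera intrinsics."""
    sift = cv2.SIFT_create()
    kp, desc = sift.detectAndCompute(query_img, None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(desc, db_descriptors, k=2)
    # Lowe's ratio test keeps only confident 2-D/3-D correspondences
    good = [pair[0] for pair in matches
            if len(pair) == 2 and pair[0].distance < 0.7 * pair[1].distance]
    img_pts = np.float32([kp[m.queryIdx].pt for m in good])
    obj_pts = np.float32([db_points3d[m.trainIdx] for m in good])
    # RANSAC-robust PnP yields the camera's rotation/translation,
    # i.e. where the phone user is standing
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
    return rvec, tvec
```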

The best paper award went to Zhi Wang, Lifeng Sun, Xiangwen Chen, Wenwu Zhu, Jiangchuan Liu, Minghua Chen and Shiqiang Yang for “Propagation-Based Social-Aware Replication for Social Video Contents”. They use contacts mined from social networks to replicate content for better streaming and content distribution. The presentation was great and the research solid; still, it’s not a topic I’m very interested in. For content providers, however, it seems very useful.

Shih-Yao Lin et al. presented a system that recognizes a user’s motions with the Kinect and imitates them via a marionette in “Action Recognition for Human-Marionette Interaction”. I had hoped to get more information about the interaction between users and marionettes; still, it was a very stylish presentation and an artsy topic.

Hamdi Dibeklioglu et al. showed how to infer a person’s age from the way they smile in “A Smile Can Reveal Your Age: Enabling Facial Dynamics in Age Estimation”. I find it fascinating to hear about small cues that can tell a lot about a person or a situation.

Fascinating work by Victoria Yanulevskaya et al. (“In the Eye of the Beholder: Employing Statistical Analysis and Eye Tracking for Analyzing Abstract Paintings”). They link the emotional impact of a painting to the eye movements of the observer. Very interesting and in line with my current focus. I wonder whether expertise and the like can also be recognized using sensors.

Another very art-focused paper I enjoyed was “Dinner of Luciérnaga-An interactive Play with iPhone App in Theater” by Yu-Chuan Tseng. Theater visitors can interact with the play using their smartphones (and also get feedback on the device …).

Posters, Demos, Competitions

The winner of the Multimedia Grand Challenge was very well deserved. “Analysis of Dance Movements using Gaussian Processes” by Antoine Liutkus et al. decomposes dance movements, using Gaussian processes, into components with slow periodicity, components with high periodicity, and movements that happen just once. Fascinating and applicable to so many fields … :)
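The kernel-based flavour of such a decomposition is easy to play with. Here is a minimal sketch using scikit-learn’s Gaussian process kernels; the toy trajectory, the chosen periodicities, and all parameters are my own assumptions for illustration, not the authors’ setup.

```python
# Toy GP decomposition sketch (illustration only, not the paper's method):
# model a 1-D motion trajectory as a sum of kernels with different time
# scales, then read the learned periodicities off the optimized kernel.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ExpSineSquared, RBF, WhiteKernel

t = np.linspace(0, 10, 400)[:, None]               # time axis in seconds
# toy trajectory: slow sway (5 s period) + fast bounce (0.5 s period)
# + a single one-off gesture + sensor noise
y = (np.sin(2 * np.pi * 0.2 * t)
     + 0.3 * np.sin(2 * np.pi * 2.0 * t)
     + np.exp(-(t - 5.0) ** 2 / 0.05)
     + 0.05 * np.random.randn(*t.shape)).ravel()

kernel = (ExpSineSquared(periodicity=5.0)          # slow periodic part
          + ExpSineSquared(periodicity=0.5)        # fast periodic part
          + RBF(length_scale=0.2)                  # aperiodic, one-off move
          + WhiteKernel())                         # sensor noise
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t, y)
print(gp.kernel_)   # optimized periodicities and length scales
```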

A very neat demo was presented by Wei Zhang et al.: “FashionAsk: Pushing Community Answers to Your Fingertips”.

Other Notes

As expected from any conference in Japan :), the organisation was flawless. In case any of the organisers are reading this: thanks again. Nara is a perfect place for a venue like this (deer, world heritage sites, good food …).

More curiously, although there was a lot of talk about social media and some lively discussions on Twitter, I seemed to be the only participant on ADN, at least the only one posting with the hashtag.

ACM Multimedia 2012 Tutorials and Workshops

I attended the tutorials “Interacting with Image Collections – Visualisation and Browsing of Image Repositories” and “Continuous Analysis of Emotions for Multimedia Applications” on the first day.

On the last day, I went to the “Workshop on Audio and Multimedia Methods for Large Scale Video Analysis” and the “Workshop on Interactive Multimedia on Mobile and Portable Devices”.

This is meant as a scratchpad … I’ll add more later if I have time.

Interacting with Image Collections – Visualisation and Browsing of Image Repositories

Schaefer gave an overview of how to browse large-scale image repositories. Interesting, yet not really related to my research interests. He showed three approaches for retrieval: mapping-based, clustering-based, and graph-based. I would have loved it if he had gone into a bit more detail in the mobile section at the end.

Continuous Analysis of Emotions for Multimedia Applications

Hatice Gunes and Bjoern Schuller presented the state of the art in emotion analysis. Their problems seem very similar to what we have to cope with in activity recognition, especially in terms of segmentation and continuous recognition. Their inference pipeline is comparable to ours in context recognition.

Where Affective Computing seems to have an edge is in standardized data sets. There are already quite a lot (mainly focusing on video and audio). I guess it’s also easier compared to the very multi-modal datasets we deal with in activity recognition.

Hatice Gunes showed two videos of two girls: one fakes a laugh, the other laughs authentically. Interestingly enough, the whole audience picked the wrong video as the authentic laugh. The faking girl was overdoing it and laughed constantly, whereas authentic laughter has a time component, coming in waves: increasing, decreasing, increasing again, and so on.

The tools section contained the obvious candidates (OpenCV, Kinect, Weka …). Sadly, they did not mention the newer set of tools I love to use: check out Pandas and IPython.
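As a small taste, a few lines of Pandas cover the resampling and smoothing that this kind of time-series work needs all the time; the file name and column below are made up for illustration.

```python
# Minimal Pandas example (hypothetical file and column names): load a
# time-stamped sensor log, resample it to 100 ms windows, and smooth it.
import pandas as pd

df = pd.read_csv("accelerometer.csv",
                 parse_dates=["timestamp"], index_col="timestamp")
smoothed = (df["acc_x"]
            .resample("100ms").mean()     # uniform 100 ms windows
            .rolling(window=5).mean())    # simple moving-average smoothing
print(smoothed.describe())
```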

A good overview of the state of the art. I would have loved to get more information about the subjective nature of emotion. For me it’s not as obvious as activity (and even there, there is a lot of room for ambiguity). Also, depending on personal experience and cultural background, the emotional responses to a specific stimulus can be quite diverse.

SEMAINE corpus

MediaEval

EmoVoice audio emotion classifier

Q Sensor

London Eye mood

Workshop on Audio and Multimedia Methods for Large Scale Video Analysis

Workshop on Interactive Multimedia on Mobile and Portable Devices