Ubicomp ISWC Impressions

Usually I’m not a big fan of conference openings, yet Friedemann Mattern provided a great intro, giving an overview of the origins of Pervasive and Ubicomp, mentioning all the important people and showing nice vintage pictures of Hans Gellersen, Alois Ferscha, Marc Langheinrich, Albrecht Schmidt, Kristof Van Laerhoven and others.

I was a bit sceptical before the merger of Pervasive / Ubicomp and the collocation with ISWC, yet my scepticism was completely unfounded: I’m deeply impressed by the organization and by the quality of the social events and the talks in general.

We got some great feedback on Kazuma’s and Shoya’s demos. They both did a great job introducing their work:

We also got a lot of interest in and feedback on Andreas Bulling’s and my work about recognizing document types using only eye gaze. By the way, below are the talk slides and the abstract of the paper.

##ISWC Talk Slides##

##Abstract##

Reading is a ubiquitous activity that many people even perform in transit, such as while on the bus or while walking. Tracking reading enables us to gain more insights about expertise level and potential knowledge of users – towards a reading log tracking and improve knowledge acquisition. As a first step towards this vision, in this work we investigate whether different document types can be automatically detected from visual behaviour recorded using a mobile eye tracker. We present an initial recognition approach that combines special purpose eye movement features as well as machine learning for document type detection. We evaluate our approach in a user study with eight participants and five Japanese document types and achieve a recognition performance of 74% using user-independent training.
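For readers wondering what “user-independent training” means in practice: the classifier is never trained on data from the participant it is tested on. Below is a minimal, hypothetical sketch of such a leave-one-participant-out evaluation with scikit-learn; the placeholder features, the classifier choice and all numbers are illustrative assumptions, not the pipeline from the paper.

```python
# Hypothetical sketch of user-independent (leave-one-participant-out) evaluation.
# Placeholder features and classifier; not the actual pipeline from the paper.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

rng = np.random.default_rng(0)

# One row per reading segment with a few eye movement features
# (e.g. mean fixation duration, mean saccade amplitude, line-break rate);
# replace with features extracted from the mobile eye tracker.
n_participants, n_doc_types, segments_each = 8, 5, 10
n_segments = n_participants * n_doc_types * segments_each
X = rng.normal(size=(n_segments, 4))               # placeholder feature matrix
y = rng.integers(0, n_doc_types, size=n_segments)  # document type labels
groups = np.repeat(np.arange(n_participants), n_segments // n_participants)

# Leave one participant out: test data always comes from an unseen user.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, groups=groups, cv=LeaveOneGroupOut())
print("per-participant accuracy:", np.round(scores, 2))
print("mean accuracy:", scores.mean())
```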

Full paper link: I know what you are reading – Recognition of Document Types Using Mobile Eye Tracking

Excited about Ubicomp and ISWC

This year I’m really looking forward to Ubicomp and ISWC. It’s the first time that Ubicomp and Pervasive have merged into one conference, and the first time the venue has sold out, with 700 participants.

I cannot wait to chat with old friends and experts (most are both :)).

The field is slowly maturing. Wearable research in particular is really pushing towards prime time. Most prominently, Google Glass is getting a lot of attention, including discussions of its impact on privacy. There is also more and more talk about fitness bracelets/trackers and smart watches. I expect we will see more intelligent clothing and activity recognition work in commercial products in the coming years.

By the way, we have 3 poster papers and 2 demos at Ubicomp and a short paper at ISWC.

###Ubicomp Demos and Posters###

###ISWC paper###

Drop by the demo and poster sessions and/or see my talk on Thursday.

On a side note, Ubicomp really picks great locations. This year it’s Zurich, next year Seattle and the year after it will be in Osaka. Seems I might be staying longer in Japan than I originally planned ;)

ICDAR 2013 Talk Slides Online

The slides for my two talks today are online now.

##The Wordometer##

##Reading activity recognition using an off-the-shelf EEG##

For more details, check out the papers:

They are both published at ICDAR 2013.

Recognizing Reading Activities

Just finished my keynote talk at CBDAR (a workshop of ICDAR); I got a lot of questions and have a lot of new research ideas :)

I’m pretty ignorant about Document Analysis (and Computer Vision in general), so it’s great to talk to some experts in the field. Pervasive Computing and Document Analysis are very complementary and as such interesting to combine.

Here are my talk slides, followed by the talk abstract.

##Real-life Activity Recognition - Talk Abstract##

Most applications in intelligent environments so far strongly rely on specific sensor combinations at predefined positions, orientations etc. While this might be acceptable for some application domains (e.g. industry), it hinders the wide adoption of pervasive computing. How can we extract high level information about human actions and complex real world situations from heterogeneous ensembles of simple, often unreliable sensors embedded in commodity devices?

This talk mostly focuses on how to use body-worn devices for activity recognition in general, and how to combine them with infrastructure sensing and computer vision approaches for a specific high level human activity, namely better understanding knowledge acquisition (e.g. recognizing reading activities).

We discuss how placement variations of electronic appliances carried by the user influence the possibility of using sensors integrated in those appliances for human activity recognition. I categorize possible variations into four classes: environmental placements, placement on different body parts (e.g. jacket pocket on the chest, vs. a hip holster vs. the trousers pocket), small displacement within a given coarse location (e.g. device shifting in a pocket), and different orientations. For each of these variations, I give an overview of our efforts to deal with them.

In the second part of the talk, we combine several pervasive sensing approaches (computer vision, motion-based activity recognition etc.) to tackle the problem of recognizing and classifying knowledge acquisition tasks with a special focus on reading. We discuss which sensing modalities can be used for digital and offline reading recognition, as well as how to combine them dynamically.

Wordometer and Document Analysis using Pervasive Sensing


In the last couple of months, I have become more and more interested in learning, especially reading. Loving tech and sports, I easily got hooked on the Quantified Self movement (I own a Zeo Sleeping Coach and several step counters). Measuring myself transformed me: I lost around 4 kg and feel healthier and fitter since I started tracking. So I wonder why we don’t have similar tools for our learning behavior.

So we created a simple Wordometer in our lab, using the SMI mobile eye tracker and document image retrieval (LLAH). We simply detect reading (very distinct horizontal or vertical movements) and afterwards count line breaks. Assuming a fixed number of words per line, voilà, there’s your Wordometer. The document image retrieval is used to keep the error at around 5-7 % (comparable to the pedometers measuring your steps each day).
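As a rough illustration of the counting idea (not our actual implementation), the sketch below counts line breaks in a horizontal gaze trace: while reading, gaze drifts to the right along a line and then jumps sharply back to the left at a line break. The threshold, the sample format and the words-per-line constant are assumptions for the example.

```python
# Minimal Wordometer sketch: count return sweeps (line breaks) in a horizontal
# gaze trace and estimate the number of words read. Threshold and words-per-line
# value are assumptions for illustration, not values from our system.

def estimate_words_read(gaze_x, return_sweep_threshold=0.3, words_per_line=10):
    """gaze_x: horizontal gaze positions (normalized 0..1) sampled over time,
    taken from a segment already detected as reading. A large leftward jump
    between consecutive samples is counted as a return sweep / line break."""
    line_breaks = 0
    for prev, curr in zip(gaze_x, gaze_x[1:]):
        if prev - curr > return_sweep_threshold:  # sharp jump back to line start
            line_breaks += 1
    # Assume a fixed number of words per line, as described above.
    return line_breaks * words_per_line

# Synthetic example: gaze sweeps left-to-right across three lines.
trace = [x / 10 for x in range(10)] * 3
print(estimate_words_read(trace))  # 2 return sweeps detected -> ~20 words
```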

Of course, there are a couple of limitations:

  1. Mobile eye trackers are DAMN expensive. The main reason is that there is little demand and they are manufactured in relatively low numbers. A glasses frame, 2 cameras and 2 infrared sources, that’s it (together with a bit of image processing magic).
  2. Document image retrieval means you need to register all documents with a server before reading them. I won’t go into details, as this limitation is the easiest to get rid of; we are currently working on a method without it. At the beginning it was simply easier to include (and it improves the accuracy).
  3. Not everybody likes to wear glasses. Given the recent mixed reception of Google Glass, it seems that wearing glasses is much more of a fashion statement than carrying a big “smart” phone or similar. So this tech might not be for everybody.

Overall, I’m still very excited about what a cheap, publicly available Wordometer will do to people’s reading habits and their “knowledge life”. We’ll continue working on it ;)

We are also using eye tracking, EEG and other sensors to get more information about what and how a user is reading. Interestingly, it seems that using the Emotiv EEG we can detect reading versus not reading and even some document types (manga versus textbook).
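The post doesn’t describe the EEG pipeline, so purely as a hedged sketch of how such a reading vs. not-reading detector could look: band-power features per EEG window fed into a linear classifier. The sampling rate, channel count, window length, frequency bands and classifier are all assumptions for the example, not our actual setup.

```python
# Hedged sketch: band-power features per EEG window + a linear SVM for
# reading vs. not-reading. All parameters are assumptions for illustration.
import numpy as np
from scipy.signal import welch
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

FS = 128                                        # assumed sampling rate (Hz)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

def band_power_features(window):
    """window: array of shape (n_channels, n_samples) for one EEG segment."""
    freqs, psd = welch(window, fs=FS, nperseg=FS)    # PSD per channel
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.extend(psd[:, mask].mean(axis=1))      # mean band power per channel
    return np.array(feats)

# Placeholder data: 40 windows of 4 s from 14 channels; labels 1 = reading,
# 0 = not reading (real labels would come from annotated recording sessions).
windows = np.random.randn(40, 14, FS * 4)
X = np.array([band_power_features(w) for w in windows])
y = np.random.randint(0, 2, size=len(windows))

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
clf.fit(X, y)
print("training accuracy on placeholder data:", clf.score(X, y))
```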

Disclaimer: this work would not have been possible without two very talented students, Hitoshi Kawaichi and Kazuyo Yoshimura. Thanks for the hard work :D

For more details, check out the papers:

They are both published at ICDAR 2013.

Kai @ CHI

So it’s my first time at CHI. Pretty amazing so far … I will blog more about it later.


I’m in the first poster rotation, starting this afternoon: “Towards inferring language expertise using eye tracking”. Drop by my poster if you’re around (or try to spot me, I’m wearing the white “Kai@CHI” shirt today :)).

Here’s the abstract of our work, as well as the link to the paper.

“We present initial work towards recognizing reading activities. This paper describes our efforts to detect the English skill level of a user and infer which words are difficult for them to understand. We present an initial study of 5 students and show our findings regarding the skill level assessment. We explain a method to spot difficult words. Eye tracking is a promising technology to examine and assess a user’s skill level.”

Activity Recognition Dagstuhl Report Online

If you wonder how we spent German tax money, the summary of the Activity Recognition Dagstuhl seminar is now online.

Human Activity Recognition in Smart Environments (Dagstuhl Seminar 12492)


Here’s the abstract:

This report documents the program and the outcomes of Dagstuhl Seminar 12492 “Human Activity Recognition in Smart Environments”. We established the basis for a scientific community surrounding “activity recognition” by involving researchers from a broad range of related research fields. 30 academic and industry researchers from the US, Europe and Asia participated, coming from diverse fields ranging from pervasive computing over network analysis and computer vision to human computer interaction. The major results of this seminar are the creation of an activity recognition repository to share information, code and publications, and the start of an activity recognition book aimed to serve as a scientific introduction to the field. In the following, we go into more detail about the structure of the seminar, discuss the major outcomes and give an overview of discussions and talks given during the seminar.

Some of my favorites from the 29c3 recordings

Over the last weeks, I finally got around to watching some of the 29c3 recordings. Here are some of my favourites. I will update the list accordingly.

I link to the official recordings available from the CCC domain. The talks are also on YouTube, however; just search for the talk title.

In general, I found most talks focused on security, which is sadly not really my main interest. I missed the research and culture talks that were present in previous years, for example the awesome Data Mining for Hackers talk or one of the Bicyclemark episodes. Bicyclemark, we miss you :)

English

Out of the hacking talks, by far the most entertaining for me was Hacking Cisco Phones. Scary and so cool. Ang Cui and Michael Costello are also quite good presenters, and the hand-drawn slides give the visuals a nice touch. I won’t spoil the contents, just watch it.

So far my favorite talk is Romantic Hackers by Anne Marggraf-Turley and Prof. Richard Marggraf-Turley, about surveillance and hackers in the Romantic period. I was not aware that privacy problems and ideas about pervasive surveillance had been discussed and encountered so early in history. Very insightful and fun.

The Tamagotchi talk was fun. Although the speaker seemed a bit nervous (judging by her voice), she gave some great insights into how Tamagotchis work and how to hack them.

The keynote by Jacob Applebaum, Not my department, is a call to action for the tech community, discussing the responsibilities we have regarding our research and how it might be used. Although Applebaum is a great public speaker and the topic is of utmost importance, for people new to the discussion it might seem a bit out of context and difficult to understand.

German

If you can speak German or want to practice it, check these out … Of course, the usual suspects Fnord News Show and Security Nightmares are always great candidates to watch.

I’m also always looking forward to the yearly Martin Haase talk. Unfortunately, the official release is not online yet. It is especially interesting for language geeks.

The talk Are fair computers possible? explores what needs to change in manufacturing standards etc. to produce computers without child labor and with fair employment conditions for all workers involved.

ACM Multimedia 2012 Main Conference Notes

This is a scratchpad … will fill the rest when I have time.

Papers

I really enjoyed the work by Heng Liu, Tao Mei et al., “Finding Perfect Rendezvous On the Go: Accurate Mobile Visual Localization and Its Applications to Routing”. They combine existing research in a very interesting mixture: a visual localization method based on Bundler detects where in the city a mobile phone user is. The application scenario I liked best was their collaborative localization for rendezvous :)

The best paper award went to Zhi Wang, Lifeng Sun, Xiangwen Chen, Wenwu Zhu, Jiangchuan Liu, Minghua Chen and Shiqiang Yang for “Propagation-Based Social-Aware Replication for Social Video Contents”. They use contacts mined from social networks to replicate content for better streaming and content distribution. The presentation was great and the research solid; still, it’s not a topic I’m very interested in. For content providers, however, it seems very useful.

Shih-Yao Lin et al. presented a system that recognizes the user’s motions using the Kinect and imitates them via a marionette in “Action Recognition for Human-Marionette Interaction”. I had hoped to get more information about the interactions between users and marionettes, but it was still a very stylish presentation and an artsy topic.

Hamdi Dibeklioglu et al. showed how to infer the age of a person while they are smiling in “A Smile Can Reveal Your Age: Enabling Facial Dynamics in Age Estimation”. I find it fascinating to hear about small cues that can tell a lot about a person or a situation.

Fascinating work by Victoria Yanulevskaya et al. (“In the Eye of the Beholder: Employing Statistical Analysis and Eye Tracking for Analyzing Abstract Paintings”). They link the emotional impact of a painting to the eye movements of the observer. Very interesting and in line with my current focus. I wonder whether expertise etc. can also be recognized using sensors.

Another very art-focused paper I enjoyed was “Dinner of Luciérnaga-An interactive Play with iPhone App in Theater” by Yu-Chuan Tseng. Theater visitors can interact with the play using their smartphones (also getting feedback on the device …).

Posters, Demos, Competitions

The winner of the Multimedia Grand Challenge was very well deserved. “Analysis of Dance Movements using Gaussian Processes” by Antoine Liutkus et al. decomposes dance moves using Gaussian processes into movements with slow periodicity, movements with high periodicity, and moves that happen just once. Fascinating and applicable to so many fields … :)

A very neat demo was presented by Wei Zhang et al.: “FashionAsk: Pushing Community Answers to Your Fingertips”.

Other Notes

As expected from any conference in Japan :), the organisation was flawless. In case any of the organisers are reading this: thanks again. Nara is a perfect place for a venue like this (deer, world heritage sites, good food …).

More curiously, although there was a lot of talk about social media and some lively discussions on Twitter, I seemed to be the only participant on ADN, at least the only one posting with the hashtag.

ACM Multimedia 2012 Tutorials and Workshops

I attended the Tutorials “Interacting with Image Collections – Visualisation and Browsing of Image Repositories” and “Continuous Analysis of Emotions for Multimedia Applications” on the first day.

The last day I went to “Workshop on Audio and Multimedia Methods for Large Scale Video Analysis” and to the “Workshop on Interactive Multimedia on Mobile and Portable Devices”.

This is meant as a scratchpad … I’ll add more later if I have time.

Interacting with Image Collections – Visualisation and Browsing of Image Repositories

Schaefer gave an overview of how to browse large-scale image repositories. Interesting, yet not really related to my research interests. He showed 3 approaches for retrieval: mapping-based, clustering-based and graph-based. I would have loved it if he had gone into a bit more detail in the mobile section at the end.

Continuous Analysis of Emotions for Multimedia Applications

Hatice Gunes and Bjoern Schuller introduced the state of the art in emotion analysis. Their problems seem very similar to what we have to cope with in activity recognition, especially in terms of segmentation and continuous recognition. Their inference pipeline is comparable to ours in context recognition.

Where affective computing seems to have an edge is in standardized data sets. There are already quite a lot (mainly focusing on video and audio). I guess it’s also easier compared to the very multi-modal datasets we deal with in activity recognition.

Hatice Gunes showed two videos of two girls, one faking a laugh and the other laughing authentically. Interestingly enough, the whole audience was wrong in picking the authentic laugh. The fake laughing girl was overdoing it and laughed constantly, whereas authentic laughter has a time component (it comes in waves: increasing, decreasing, increasing again etc.).

The tools section contained the obvious candidates (OpenCV, Kinect, Weka …). Sadly, they did not mention the newer set of tools I love to use: check out Pandas and iPython.

A good overview of the state of the art. I would have loved to get more information about the subjective nature of emotion. For me it’s not as obvious as activity (and even there, there is a lot of room for ambiguity). Also, depending on personal experience and cultural background, the emotional response to a specific stimulus can be diverse.

Semaine Corpus

Media Eval

EmoVoice Audio Emotion classifier

qsensor

London eye mood

Workshop on Audio and Multimedia Methods for Large Scale Video Analysis

Workshop on Interactive Multimedia on Mobile and Portable Devices