Hacking Glass

Disclaimer: Rooting and flashing your device voids your warranty and can brick your Glass. You also won’t receive OTA updates afterwards. This is not an instruction manual; I just use it as a scratch pad to keep a record of what I did and what worked for me. The commands below will erase all data on your device. Proceed at your own risk.

##Rooting and Flashing Images##

To get root, follow the instructions from Google. Unfortunately, fastboot did not work for me under Mac OS. I used an Ubuntu virtual machine on my Mac instead to get root and flash images.

adb devices
adb reboot-bootloader
fastboot devices

fastboot oem unlock

I needed to execute the last command twice. The first time, it only asked whether I was sure I wanted to void my warranty.

Next I flashed the boot image from the Glass developer page.

fastboot flash boot boot.img
fastboot reboot
adb root
adb shell

If you have rooted your device and want to update to a new release (in my case XE12), you can download a zip with all the necessary images from Google. It’s cool that they support rooting and flashing (even if it voids your warranty).

fastboot flash boot boot.img
fastboot flash system system.img
fastboot flash recovery recovery.img
fastboot flash userdata userdata.img
fastboot erase cache

##Reading out the Proximity Sensor##

I’m most interested in accessing the proximity sensor facing the eye. Thanks to help from Philip Scholl and Shoya, I was able to do it. The sensor is exposed at the following path:

adb root
adb shell
> cat /sys/bus/i2c/devices/4-0035/proxraw

This returns one raw proximity value from the sensor, unfortunately without a timestamp. If you want to read the proximity data from Android apps, you need to change the access rights.

> chmod 664 /sys/bus/i2c/devices/4-0035/proxraw
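Since the sensor file returns a bare value without a timestamp, one simple workaround is to attach a timestamp at the moment of reading. Here is a minimal Python sketch of that idea. The function names and the polling interval are my own, purely for illustration, and I am assuming the file holds a single integer. Only the path and the `adb shell cat` call come from the commands above.

```python
import subprocess
import time

PROX_PATH = "/sys/bus/i2c/devices/4-0035/proxraw"  # path from the commands above


def read_file(path=PROX_PATH):
    """Read one raw value directly (when the code runs on the device itself)."""
    with open(path) as f:
        return int(f.read().strip())  # assumption: the file holds a single integer


def read_via_adb(path=PROX_PATH):
    """Read one raw value from a host machine through `adb shell cat`."""
    result = subprocess.run(
        ["adb", "shell", "cat", path],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())  # same single-integer assumption


def sample(read_fn, count, interval_s=0.1):
    """Poll read_fn `count` times and pair each value with a timestamp.

    The timestamp is the local clock at read time, not the moment the
    sensor actually measured, so it is only an approximation.
    """
    samples = []
    for i in range(count):
        samples.append((time.time(), read_fn()))
        if i < count - 1:
            time.sleep(interval_s)
    return samples
```

For example, `sample(read_via_adb, 50)` would collect 50 timestamped values from a connected, rooted device. Keep in mind that every adb round trip adds latency, so the achievable sampling rate this way is fairly low.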

##Privacy Enhancement##

As I will be visiting the Chaos Communication Congress next weekend, I wanted to “privacy enhance” Glass for the event. I want to wear Glass but don’t really need the camera functionality.

So I used my 3Doodler to make a simple attachment to block off the camera.


The tricky part is that the light sensor for adjusting screen brightness is directly under the camera. If it’s blocked the screen will be very dark. Here’s the “privacy enhanced” Google Glass version.

Glass Enhanced

And here is a picture taken with it. It’s not completely black due to the issue with the light sensor, yet I think it’s a start ;)

Glass Enhanced

Here is the very basic stencil I used to build the attachment.


Bits and Bytes instead of a Bookshelf

Recently, I gave an interview to the online edition of Spektrum der Wissenschaft (the German edition of Scientific American) for a special on reading habits (in German, behind a paywall).

Being interested in the topic, I often hear that “endless scrolling” is bad because it destroys the mental map we make of books and pages, or that reading from backlit screens strains the eyes and causes headaches. Personally, I cannot really relate to these complaints. I do most of my reading on tablets or computer screens and have never experienced these problems myself.

However, active reading (working with the text through highlights, notes and marks) is still better on paper. In Human-Computer Interaction terms, people talk about affordances, and the affordance of paper is very high for active reading. So I still find myself printing out drafts or papers I review, especially if it’s a close call and I need to concentrate on the contents. In the latter case, I even like to change the reading environment, moving away from my laptop/desktop to a meeting table or a bench outside. I believe this helps me concentrate by deliberately shutting out distractions.

We have only just started to modify the reading experience using electronic devices. So far, most applications and reading devices directly mimic the book: we have “ebooks”, and “e-reading” software uses pages and page turns. I believe there is a lot of room for improvement in reading on screens.

Given the possibility of assessing the user’s mental state with Cognitive Activity Recognition, we can change the content, structure and style of reading materials dynamically. Most straightforwardly, if an application detects that a reader is losing interest, it could prompt her with an interactive challenge, a video or similar. Changing fonts, colors and lettering according to mood and context could also be interesting. There is a fairly new playground opening up for anybody curious about and interested in defining new forms of reading.

Interesting further reading:

Schilit et al. Beyond Paper: Supporting Active Reading with Free Form Digital Ink Annotations

Hartson. Cognitive, physical, sensory, and functional affordances in interaction design

Piper et al. Tabletop Displays for Small Group Study: Affordances of Paper and Digital Materials

Amazing Okinawa - Attending the ASVAI Workshop

The ASVAI workshop gave a good overview of several research efforts that are part of, or related to, the JST CREST and the JSPS Core-to-Core Sanken Program.

Prof. Yasushi Yagi showed how to infer intention from gait analysis. Interestingly, he showed research about the relationship of gaze and gait.

Dr. Alireza Fathi presented cool work on egocentric cameras. He showed how to estimate gaze using egocentric cameras during cooking tasks and psychological studies.

Prof. Hanako Yoshida explores social learning in infants (equipping children with mobile eye trackers … awesome!), inferring developmental stages to gain more insight into the learning process.

Prof. Masahiro Shiomi spoke about his research on adapting robot behavior to fit into public social spaces (videos of people running away from a robot included ;) ). Currently, they focus on service robots and model their behavior on that of successful human service personnel.

Prof. Yoichi Sato presented work on detecting visual attention. They use visual saliency in video to train an appearance-based eye tracker. Really interesting work; I had a chance to talk a bit more with Yusuke Sugano. Cool research :)

Of course, Koichi also gave an overview of our work. If you want to read more, check out the IEEE Computer article.

I’m looking forward to the main conference. Here’s a tag cloud using the abstracts of ACPR and ASVAI papers:

Tag cloud

We are presenting demonstrations and new results on eye tracking with commodity tablets and smartphones, as well as a sharing infrastructure for our document annotation on smartphones.

A Week with Glass

##First Impressions##

The Glass device feels expensive and a little bit futuristic. I’m impressed by the build quality and design. It also has a “Google” feel to it, e.g. “funny” jokes in the manual (“don’t use Glass for scuba diving …”). The display works extremely well. Although Glass is made for micro-interactions (quickly checking an email/SMS, Google Now updates, taking pictures), I could watch videos and read longer emails/documents on it without trouble or any sight problems (I experienced no headaches, as happened with other setups, see below). It would be perfect for boring meetings if other people could not see what you are doing … I assume the Glass design team made the conscious decision to let other people know when you interact with Glass. People can see if the screen is on and can even recognize what’s on the screen if they get close enough.

##Grandparents and Mother with Google Glass##

I have a basic test for technology and research topics in general: I try to explain them to my grandparents and mother to see if they understand them and find them interesting. Various head-mounted displays, tablets and activity recognition algorithms were tested this way … For example, they were not big fans of tablets/slates or smartphones until they played with an iPhone and iPad.

oma opa

Surprisingly, my grandparents did not have the reservations they have towards other computing devices. Usually, they feel that they could break something and are extra careful/hesitant. Yet Google Glass looks like glasses, so it was easy for them to set up and use. The system worked quite well (although so far only English is supported); the speech recognition and touch interface were simple to learn after a quick 5 min. introduction. I was surprised myself …

Sadly, the speech interface does a poor job with German words, e.g. googling for “Apfelkuchen Rezept” (apple cake recipe) did not work as intended.

Yet both of them saw potential in Glass and could imagine wearing it during the day. I was most astounded by the use cases they came up with.

opa pills

My grandfather took a picture of the pills he needs to take after each meal. He told me he always wonders whether he has taken them and sometimes checks 2-3 times after a meal to be certain. By taking a picture and using the touch panel to browse recent pictures (with timestamps), he can easily figure out when he last took them.

My grandmother would love to use Glass for gardening. Sometimes she gets a phone call while working in the garden and then has to change shoes, take off her gloves etc. and hurry to the portable phone. Additionally, she likes to get advice from my mum or friends about where to put which flower seeds etc., so she asked me if it’s possible to show the video stream from Glass to other people over the Internet :)

We also did a practical test: my grandmother and mother wore Glass while shopping in Karlsruhe. Both of them wear glasses, so not too many people noticed or looked at them. I think people assumed it was some kind of medical device or vision aid.

oma ka mum ka

My mother used the timeline in Glass to track when she took the pictures, and traced back when she saw something nice to figure out which store the item was in. She tried taking pictures of price tags. Unfortunately, the resolution of the screen is not high enough to read the price, yet this could be easily fixed with a zoom function for pictures. Interestingly, she also carries a smartphone, yet she never got the idea to use it for shopping the way she used Glass.

##Public Reactions##

As mentioned, my mum and grandmother wore Glass nearly unnoticed. This is quite different from my experience … When I wore it in public, most people in Karlsruhe and Mannheim (the two cities I tried) eyed me warily (you can see the questions in their eyes: “What is he wearing?? Some medical device?? NERD!!”). This was particularly bad when I spoke with a clerk or another person directly, as they kept staring at Glass instead of looking into my eyes ;) Social reception was better when I was with my family. Strangely, people mostly asked my grandmother what I was wearing. Very few approached me directly. Reactions fell into 3 broad categories:

  1. “WOW Cool … Glass! How is it? Can I try??” – Note: Until it’s released to the public, I strongly recommend not wearing it on any campus with a larger IT faculty. I did not account for that and it was quite difficult to get across the Karlsruhe University campus :)
  2. “Stop violating my privacy!” – During the week, only one person approached me directly about privacy concerns. The person was quite angry at first. I believe this is mostly due to misinformation (something Google needs to take seriously), as he believed Glass would automatically stream everything to Google, listen to all conversations etc. After I showed him the functionality of the device, how to use it and how to see if somebody is using it, he was calmer and actually liked it (he could see the potential of a wearable display).
  3. “What’s wrong with this guy?” – Especially when I was traveling alone, people stared at me. I asked 1 or 2 of the most obnoxious people staring at me about it, and they answered that they thought I was wearing a medical device and wondered “what’s wrong with me”, as I looked otherwise “normal”.

##Some Issues##

The 3 biggest issues I had with it:

  1. Weight and placement - You need to get used to its weight. As I don’t wear prescription glasses, it feels strange to me to wear something on my nose. It’s definitely heavier than glasses. After a couple of hours it is OK. It’s also always in your peripheral view, which takes getting used to.
  2. Battery life - OK, I played with it a lot, given I could use Glass only for a week. Towards the end (when I played with it less) I could barely get a day of usage. I expect that’s something they can easily fix. Psst … you can also plug in a portable USB battery to charge it during usage :)
  3. Social acceptance - This is the hardest one to crack. Having used Glass, I don’t understand most of the privacy fears people raise. It’s very obvious if a person is using the device/taking a picture etc. If I want to take covert pictures/videos of people, I believe it’s easier to do with today’s smart phones or spy cameras (available on Amazon for example) …

##Some more Context##

When I unboxed Glass, I remembered how Paul, my PhD advisor, and Thad (Glass project manager) chatted about how in the future everybody would wear some kind of head-mounted display and a computing device always connected to the Internet, helping us with everyday tasks: augmentations of our brain.

In the past, Paul was not a huge enthusiast of wearable displays, and I agreed with him. I attempted to use the MicroOptical (the display used by Thad) several times and always had terrible headaches afterwards … Just not for me.


Between around 2004 and 2010, during my PhD, I played with various wearable setups for everyday use, each for only a week or a couple of days. If you work on wearable computing, you have to at least try. As seen in the picture above, the only setup that worked for me was a prototype HMD from Zeiss with the Qbic, an awesome belt-integrated Linux PC by ETH (the black belt buckle in the picture), and a Twiddler 2. Yet I stopped using it, as the glasses were quite heavy, maintaining/adjusting the software was a hassle (compared to the advantages) and, I have to admit, due to social pressure. Imagine living as a cyborg in a small Bavarian town, mostly populated by law and business students … I found my small, black, analog notebook handier and less intimidating to other people. Today, I’m an avid iPhone user (Things, Clear, Habit List, Textastic, Prompt and Lendromat …).

##To sum up##

All in all, I was quite sceptical at first; the design reminded me too much of the MicroOptical and the headaches I got using it. Completely unfounded! Even given the social acceptance issue, I cannot wait to get Glass for a longer test. However, I really need a good note-taking app. Running Vim on Glass would already be a selling point for me, replacing my black notebook (and maybe my smartphone?). I dusted off my Twiddler 2 with its hacked Bluetooth connection (it took a long time to find it in the cellar), started practicing again and hope I can try it soon with Vim for Glass :D This is definitely not a use case for the mass market … My grandparents told me they believe there is broader demand for such a device among “normal” people too (they actually want to use it!). So let’s see.

Plus, the researcher in me cannot wait to get easily accessible motion sensors onto the heads of a lot of people. Combined with the sensors in your pocket, it’s activity recognition heaven!

Let’s discuss on Hacker News if you want.


Ubicomp ISWC Impressions

Usually, I’m not such a big fan of conference openings, yet Friedemann Mattern provided a great intro, giving an overview of the origins of Pervasive and Ubicomp, mentioning all the important people and showing nice vintage pictures of Hans Gellersen, Alois Ferscha, Marc Langheinrich, Albrecht Schmidt, Kristof Van Laerhoven etc.

I am deeply impressed by the organization, the social events and the general quality of the talks. I was a bit sceptical about the merger of Pervasive and Ubicomp and the co-location with ISWC, yet that was completely unfounded.

We got some great feedback for Kazuma’s and Shoya’s demos. They both did a great job introducing their work about:

We also got a lot of interest in and feedback on Andreas Bulling’s and my work on recognizing document types using only eye gaze. By the way, below are the talk slides and the abstract of the paper.

##ISWC Talk Slides##

##Abstract##

Reading is a ubiquitous activity that many people even perform in transit, such as while on the bus or while walking. Tracking reading enables us to gain more insights about expertise level and potential knowledge of users – towards a reading log tracking and improve knowledge acquisition. As a first step towards this vision, in this work we investigate whether different document types can be automatically detected from visual behaviour recorded using a mobile eye tracker. We present an initial recognition approach that combines special purpose eye movement features as well as machine learning for document type detection. We evaluate our approach in a user study with eight participants and five Japanese document types and achieve a recognition performance of 74% using user-independent training.

Full paper link: I know what you are reading – Recognition of Document Types Using Mobile Eye Tracking

Excited about Ubicomp and ISWC

This year I’m really looking forward to Ubicomp and ISWC. It’s the first time that Ubicomp and Pervasive have merged into one conference, and the first time the venue has sold out, with 700 participants.

I cannot wait to chat with old friends and experts (most are both :)).

The field is slowly maturing. Wearable research in particular is really pushing towards prime time. Most prominently, Google Glass is getting a lot of attention, including discussion of its impact on privacy. There is also more and more talk about fitness bracelets/trackers and smart watches. I expect that we will see more intelligent clothes and activity recognition work in commercial products in the coming years.

By the way, we have 3 poster papers and 2 demos at Ubicomp and a short paper at ISWC.

###Ubicomp Demos and Posters###

###ISWC paper###

Drop by the demo and poster sessions and/or see my talk on Thursday.

On a side note, Ubicomp really picks great locations. This year it’s Zurich, next year Seattle and the year after it will be in Osaka. Seems I might be staying in Japan longer than I originally planned ;)

ICDAR 2013 Talk Slides Online

The slides for my two talks today are online now.

##The Wordometer##

##Reading activity recognition using an off-the-shelf EEG##

For more details, check out the papers:

They are both published at ICDAR 2013.

Recognizing Reading Activities

Just finished my keynote talk at CBDAR (Workshop of ICDAR), got a lot of questions and have a lot of new research ideas :)

I’m pretty ignorant about Document Analysis (and Computer Vision in general), so it’s great to talk to some experts in the field. Pervasive Computing and Document Analysis are very complementary and as such interesting to combine.

Here are my talk slides, followed by the talk abstract.

##Real-life Activity Recognition - Talk Abstract##

Most applications in intelligent environments so far strongly rely on specific sensor combinations at predefined positions, orientations etc. While this might be acceptable for some application domains (e.g. industry), it hinders the wide adoption of pervasive computing. How can we extract high level information about human actions and complex real world situations from heterogeneous ensembles of simple, often unreliable sensors embedded in commodity devices?

This talk mostly focuses on how to use body-worn devices for activity recognition in general, and how to combine them with infrastructure sensing and computer vision approaches for a specific high level human activity, namely better understanding knowledge acquisition (e.g. recognizing reading activities).

We discuss how placement variations of electronic appliances carried by the user influence the possibility of using sensors integrated in those appliances for human activity recognition. I categorize possible variations into four classes: environmental placements, placement on different body parts (e.g. jacket pocket on the chest, vs. a hip holster vs. the trousers pocket), small displacement within a given coarse location (e.g. device shifting in a pocket), and different orientations. For each of these variations, I give an overview of our efforts to deal with them.

In the second part of the talk, we combine several pervasive sensing approaches (computer vision, motion-based activity recognition etc.) to tackle the problem of recognizing and classifying knowledge acquisition tasks with a special focus on reading. We discuss which sensing modalities can be used for digital and offline reading recognition, as well as how to combine them dynamically.

Wordometer and Document Analysis using Pervasive Sensing

In the last couple of months, I got more and more interested in learning, especially reading. Loving tech and sports, I got easily hooked on the Quantified Self movement (I own a Zeo Sleeping Coach and several step counters). Seeing how measuring myself transformed me (I lost around 4 kg and feel healthier/fitter since I started tracking), I wonder why we don’t have similar tools for our learning behavior.

So we created a simple Wordometer in our lab, using the SMI mobile eye tracker and document image retrieval (LLAH). We simply detect reading (very distinct horizontal or vertical eye movements) and afterwards count line breaks. Assuming a fixed number of words per line, voilà, here is your Wordometer. The document image retrieval is used to keep the error at around 5-7 % (comparable to pedometers measuring your steps each day).
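To make the counting step a bit more concrete, here is a minimal Python sketch of the idea. It is not our actual implementation: it assumes left-to-right horizontal text, gaze positions already normalised to the page width, and that reading has already been detected. The function names, the `min_return` threshold and the words-per-line value are made up for illustration, and the document image retrieval part is left out entirely.

```python
def count_line_breaks(gaze_x, min_return=0.5):
    """Count line breaks in a sequence of horizontal gaze positions.

    gaze_x: horizontal positions normalised to [0, 1] across the page width.
    Assumption: a line break shows up as one large right-to-left jump
    (a return sweep) of at least `min_return` of the page width.
    Smaller backward jumps are treated as re-reading within a line.
    """
    breaks = 0
    for prev, cur in zip(gaze_x, gaze_x[1:]):
        if prev - cur >= min_return:
            breaks += 1
    return breaks


def estimate_words(gaze_x, words_per_line=10, min_return=0.5):
    """Estimate words read as line breaks times a fixed words-per-line value."""
    return count_line_breaks(gaze_x, min_return) * words_per_line


# Two return sweeps (0.9 -> 0.1) in this toy trace, so two line breaks.
trace = [0.1, 0.4, 0.7, 0.9, 0.1, 0.5, 0.9, 0.1, 0.5]
print(estimate_words(trace))  # prints 20
```

A fixed threshold like this is easily fooled by glances away from the page, which is one reason additional information about the document helps with accuracy.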

Of course, there are a couple of limitations:

  1. Mobile eye trackers are DAMN expensive. Yet the main reason is that there is no demand and they are manufactured in relatively low numbers. A glasses frame, 2 cameras and 2 infrared sources, that’s it (together with a bit of image processing magic).
  2. Document Image Retrieval means you need to register all documents with a server before reading them. I won’t go into details, as this limitation is the easiest to get rid of. We are currently working on a method without it. At the beginning, it was easier to include it (and it improved the accuracy).
  3. Not everybody likes to wear glasses. With the recent mixed reception of Google Glass, it seems that wearing glasses is way more a fashion statement than wearing a big “smart” phone or similar. So this tech might not be for everybody.

Overall, I’m still very excited about what a cheap, publicly available Wordometer will do to people’s reading habits and their “knowledge life”. We’ll continue working on it ;)

We are also using eyetracking, EEG and other sensors to get more information about what/how a user is reading. Interestingly, it seems using the Emotiv EEG we can detect reading versus not reading and even some document types (manga versus textbook).

Disclaimer: This work would not be possible without two very talented students: Hitoshi Kawaichi and Kazuyo Yoshimura. Thanks for the hard work :D

For more details, check out the papers:

They are both published at ICDAR 2013.

Kai @ CHI

So it’s my first time at CHI. Pretty amazing so far … Will blog more about it later.

Chi PosterKai@CHI

I’m in the first poster rotation, starting this afternoon: “Towards inferring language expertise using eye tracking”. Drop by my poster if you’re around (or try to spot me, I’m wearing the white “Kai@CHI” shirt today :)).

Here’s the abstract of our work, as well as the link to the paper.

“We present initial work towards recognizing reading activities. This paper describes our efforts to detect the English skill level of a user and infer which words are difficult for them to understand. We present an initial study of 5 students and show our findings regarding the skill level assessment. We explain a method to spot difficult words. Eye tracking is a promising technology to examine and assess a user’s skill level.”